<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Technomanagers]]></title><description><![CDATA[We decode complex product strategies and reveal the hidden technologies that power them.
]]></description><link>https://www.technomanagers.com</link><image><url>https://substackcdn.com/image/fetch/$s_!jfG3!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe778cec-f43e-418d-8ca7-155296f5dd1c_1280x1280.png</url><title>Technomanagers</title><link>https://www.technomanagers.com</link></image><generator>Substack</generator><lastBuildDate>Sun, 19 Apr 2026 10:04:33 GMT</lastBuildDate><atom:link href="https://www.technomanagers.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Shailesh Sharma]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[shaileshsharma@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[shaileshsharma@substack.com]]></itunes:email><itunes:name><![CDATA[Shailesh Sharma]]></itunes:name></itunes:owner><itunes:author><![CDATA[Shailesh Sharma]]></itunes:author><googleplay:owner><![CDATA[shaileshsharma@substack.com]]></googleplay:owner><googleplay:email><![CDATA[shaileshsharma@substack.com]]></googleplay:email><googleplay:author><![CDATA[Shailesh Sharma]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Apple AI Strategy]]></title><description><![CDATA[Everyone thinks Apple is losing the AI race.]]></description><link>https://www.technomanagers.com/p/apple-ai-strategy</link><guid isPermaLink="false">https://www.technomanagers.com/p/apple-ai-strategy</guid><dc:creator><![CDATA[The AI Professional]]></dc:creator><pubDate>Sun, 19 Apr 2026 05:43:54 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/194667159/bde46f6563d7cccddcbb469414e6553f.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>Everyone thinks Apple is losing the AI race. </p><p>But here&#8217;s the truth&#8230; they&#8217;re playing a completely different game. </p><p>To dominate AI, companies need four things: </p><ol><li><p>Infrastructure, </p></li><li><p>Models, </p></li><li><p>Data, </p></li><li><p>and Distribution. </p></li></ol><p>Apple may be weak in AI models, but they are insanely strong in distribution. </p><p>They sell 200+ million iPhones every year. </p><p>And think about how we use phones. </p><p>We ask simple things &#8212; summarise emails, find photos, send quick messages. </p><p>So Apple&#8217;s strategy is simple: use our personal data to give hyper-personalised AI, build the best interface on devices, and let others build the giant models. </p><p>For this, Apple is simply partnering with Google. </p><p>So Apple may never build the smartest AI&#8230; but they might build the most useful AI on our phones. Subscribe for more business strategy breakdowns.</p>]]></content:encoded></item><item><title><![CDATA[Why Your Recommender Keeps Forgetting You?]]></title><description><![CDATA[AI Product Management Case Study]]></description><link>https://www.technomanagers.com/p/why-your-recommender-keeps-forgetting</link><guid isPermaLink="false">https://www.technomanagers.com/p/why-your-recommender-keeps-forgetting</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Sat, 18 Apr 2026 12:08:46 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9698b809-65a8-48ac-a245-db55e4260e97_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Imagine this: you buy an iPhone on Amazon. </p><p>Three days later, you buy a case for it. A week after that, you buy AirPods. Normal journey. 
</p><p>Your recommendation feed is doing its job.</p><p>Then life happens. You buy a birthday gift for your niece. A toy. Then a book for your dad. Then a yoga mat. Then some groceries.</p><p>Now you come back looking for a screen protector for that iPhone.</p><p><strong>Here is the problem. Your recommender has forgotten about the iPhone.</strong></p><p>The model remembers what you did most recently. <br>Toy. Book. Yoga mat. Groceries. <br>Based on that, it is now quietly convinced you are a gifting parent with a wellness streak. It is showing you more toys, more books, more yoga equipment.</p><p>Meanwhile, the single most important signal about what you want right now, the iPhone from three weeks ago, has been washed out.</p><p>This is not a theoretical problem. This is happening on most recommendation systems you use today. </p><p>Today, we are going to see how to fix that.</p><p>In our previous piece, we <a href="https://www.technomanagers.com/p/how-session-based-rnns-predict-your">explained how TikTok uses session-based RNNs</a> to predict your next swipe. At the end, we flagged three pitfalls. One of them was Catastrophic Forgetting. This article is a deep dive into the paper that solved it.</p><p>If you are preparing for AI PM interviews, recommendation system design is the most commonly asked system design topic at senior levels. <a href="https://topmate.io/technomanagers/1861184">We teach this in our course.</a></p><h2>The iPhone Problem</h2><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!QZxe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b2f278-6e43-45d9-9497-f59a9c7b26e1_1536x1024.png" alt="" width="1456" height="971"></figure></div>
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3b2f278-6e43-45d9-9497-f59a9c7b26e1_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:298769,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.technomanagers.com/i/194601714?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b2f278-6e43-45d9-9497-f59a9c7b26e1_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QZxe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b2f278-6e43-45d9-9497-f59a9c7b26e1_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!QZxe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b2f278-6e43-45d9-9497-f59a9c7b26e1_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!QZxe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b2f278-6e43-45d9-9497-f59a9c7b26e1_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!QZxe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3b2f278-6e43-45d9-9497-f59a9c7b26e1_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Let us go back to your Amazon story. Why did the model forget the iPhone?</p><p>The reason lies in how most recommenders store your history.</p><p>An RNN-based recommender works like this. Every time you buy something, the model converts that item into a short numerical fingerprint. Then it mixes that fingerprint into a single vector called the hidden state. One vector. 
That is the model&#8217;s entire memory of you.</p><p>Think of the hidden state like a single sticky note. Every time you buy something, the model scribbles on that same note, and whatever was written before gets slightly smudged.</p><p>After your iPhone purchase, the note says &#8220;wants tech accessories.&#8221;</p><p>After the case and AirPods, it still says roughly that.</p><p>Then you buy a toy. The note gets rewritten. Now it says &#8220;tech accessories and a gift.&#8221;</p><p>Then a book. Yoga mat. Groceries. By the time you come back for that screen protector, the sticky note no longer mentions the iPhone at all. It says something like &#8220;parent on a wellness kick with household needs.&#8221;</p><p>The iPhone signal is not lost. It is buried. Smeared under four unrelated purchases.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!VmfF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64659897-9a44-49ad-8b23-b04d5d8f4035_1536x1024.png" alt="" width="1456" height="971"></figure></div>
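<p>To make the smearing concrete, here is a toy numerical sketch of a GRU-style hidden-state update. The item embeddings and the 0.5 update gate are made up for illustration; a real recommender learns both from data.</p><pre><code>import numpy as np

rng = np.random.default_rng(0)
dim = 8  # size of the single "sticky note" vector

# Made-up item fingerprints (embeddings); a real model learns these.
items = ["iphone", "case", "airpods", "toy", "book", "yoga_mat", "groceries"]
emb = {name: rng.normal(size=dim) for name in items}

hidden = np.zeros(dim)   # the model's entire memory of the user
update_gate = 0.5        # how strongly each new purchase overwrites the note

for name in items:
    # GRU-style blend: keep part of the old memory, mix in the new purchase
    hidden = (1 - update_gate) * hidden + update_gate * emb[name]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# After seven purchases, how much iPhone is left on the note?
print("iphone vs memory:   ", round(cosine(emb["iphone"], hidden), 2))
print("groceries vs memory:", round(cosine(emb["groceries"], hidden), 2))
</code></pre><p>The iPhone&#8217;s contribution has been halved six times over by the later purchases, while the groceries dominate the note. That is the smearing in numbers.</p>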
<p>This is called Catastrophic Forgetting. And it is not a bug you can fix by tuning the model. It is a fundamental flaw in the architecture. The sticky note itself is too small to hold what it needs to hold.</p><h2>Why This Breaks Product Experience</h2><p>This has two costs that hit you directly as a PM.</p><ol><li><p>The first cost is performance. Your model misses the highest-signal moments in a user&#8217;s journey because they get washed out by noise. A user who bought an iPhone three weeks ago is an obvious candidate for iPhone accessories. Your model does not see it. Your revenue per user suffers.</p></li><li><p>The second cost is explainability. You cannot tell a user why something was recommended. You cannot tell your leadership why the model did what it did. A single hidden vector is a black box even to the people who built it.</p></li></ol><p>If you have ever been in a meeting where your head of product asks, &#8220;Why is the model recommending this?&#8221; and your ML lead says, &#8220;The embeddings suggest...&#8221;, you have lived this problem.</p><h2>How Humans Actually Remember</h2><p>Here is the interesting part. You do not have this problem.</p><p>If someone asks you what to get for a new baby, you do not scan every memory from your entire life. You pull up the specific episode of buying baby stuff for your niece last year. You focus on that. 
Everything else stays quiet in the background.</p><p>You have episodic memory. You can pull up specific moments on demand.</p><p>Your recommender does not have this. It only has the sticky note.</p><blockquote><p><strong>What if we gave the recommender episodic memory?</strong></p></blockquote><h2>The Fix: A Memory Box, Not a Sticky Note</h2><p>Instead of the single hidden vector, can we give every user a small memory box?</p><p>Think of the box as a row of 20 labelled drawers. Each drawer holds one past purchase. When you buy something new, it goes into a fresh drawer. The oldest drawer gets emptied to make space.</p><p>At any moment, your box has your last 20 purchases, sitting side by side. <br>The iPhone is in drawer 17. <br>The case in drawer 16. <br>The AirPods in drawer 15. <br>The toy in drawer 14. <br>The book in drawer 13. </p><p>And so on.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!n95E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92926d10-a785-4b84-abb4-65f1ecb32c7d_1536x1024.png" alt="" width="1456" height="971"></figure></div>
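<p>In code, the item-level memory box is little more than a fixed-size, first-in-first-out buffer. A minimal sketch, with a made-up 20-slot size:</p><pre><code>from collections import deque

MEMORY_SLOTS = 20  # number of "drawers" per user

# Item-level memory: a first-in, first-out box of the user's latest purchases.
memory = deque(maxlen=MEMORY_SLOTS)

for item in ["iphone", "case", "airpods", "toy", "book", "yoga_mat", "groceries"]:
    # A new purchase takes a fresh drawer; once the box is full,
    # the oldest drawer is emptied automatically.
    memory.append(item)

print(list(memory))  # every recent purchase sits in its own drawer, nothing smudged
</code></pre>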
<p>Nothing is smudged. Nothing is averaged. Each purchase sits cleanly in its own drawer.</p><p>Now, when you come back looking for a screen protector, the model does something clever. It does not read all 20 drawers equally. It asks a question.</p><p>&#8220;Which of these past purchases is most relevant to a screen protector?&#8221;</p><p>It scans each drawer, scores the similarity, and pays attention to the ones that match. The iPhone drawer lights up. The toy drawer stays dim. 
The book drawer stays dim.</p><p>The model pulls out the iPhone signal cleanly and recommends the perfect screen protector.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!MEvq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11d1055-fa8c-40de-a50c-df59436352b9_1536x1024.png" alt="" width="1456" height="971"></figure></div>
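<p>Here is a minimal sketch of that attention read. The embeddings are made up, and the screen protector is deliberately placed close to the iPhone in this toy embedding space; in a real system both the embeddings and the scoring are learned.</p><pre><code>import numpy as np

rng = np.random.default_rng(1)
dim = 8

# Made-up embeddings; a real recommender learns these from interaction data.
items = ["iphone", "case", "airpods", "toy", "book", "yoga_mat", "groceries"]
emb = {name: rng.normal(size=dim) for name in items}
# A related item sits close to the iPhone in this toy embedding space (by construction).
emb["screen_protector"] = emb["iphone"] + 0.1 * rng.normal(size=dim)

memory = items                       # the drawers currently in the user's box
query = emb["screen_protector"]      # what the user seems to want right now

# Attention: score every drawer against the query, then softmax into weights.
scores = np.array([query @ emb[item] for item in memory])
weights = np.exp(scores - scores.max())
weights = weights / weights.sum()

for item, w in sorted(zip(memory, weights), key=lambda pair: -pair[1]):
    print(f"{item:>16}  {w:.2f}")    # the iPhone drawer lights up, the others stay dim

# The weighted sum of drawer contents is the memory read the model combines
# with the rest of its signal before scoring candidate items.
read_vector = (weights[:, None] * np.array([emb[item] for item in memory])).sum(axis=0)
</code></pre>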
class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>This is exactly how attention works in modern AI. The model decides what to focus on based on what it is trying to do right now.</p></blockquote><h2>Two Versions of the Same Idea</h2><p>Here this we can do in two ways, both use the same core idea. They differ in what they store.</p><ol><li><p>The first version is called item-level RUM. <br>Each drawer in the box holds an actual past purchase. iPhone in one drawer. AirPods in another. This is simple. It is also explainable. You can literally tell the user that we showed you this because of that iPhone you bought three weeks ago.</p></li><li><p>The second version is called feature-level RUM. <br>Each drawer does not hold a purchase. It holds a preference. One drawer tracks your brand preference. Another tracks your price sensitivity. Another tracks your style preference. Every time you buy something, the drawers get gently updated. Buy an Apple product, and the brand drawer leans more towards Apple. 
Buy something cheap, and the price drawer leans more budget-friendly.</p></li></ol><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!EF86!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47058f43-cfa9-47d9-9ba2-6468ab5a0a19_1536x1024.png" alt="" width="1456" height="971"></figure></div>
pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The second version tends to perform better. The first is easier to explain.</p><div class="pullquote"><p>If you work in a domain that demands explainability, such as finance or healthcare, <strong>go item-level. </strong></p><p>If you are running a pure engagement product where performance is everything, <strong>go feature-level.</strong></p></div><h2>How The Memory Updates</h2><p>The item version is simple. New purchase comes in, oldest one gets kicked out. First in, first out. A 20-slot box always holds the last 20 purchases.</p><p>The feature version is more interesting.</p><p>When you buy something new, the model does two things. </p><ol><li><p>First, it decides what to forget. If you just bought an Android phone, your brand preference for Apple should fade. The model computes a forget signal and uses it to gently erase the old preference.</p></li><li><p>Then it decides what to reinforce. Your brand preference for Android should go up. The model computes an add signal and writes it to the drawer.</p></li></ol><p>The mental model is simple. Every time you buy something, the relevant drawers in your memory box get a small dusting-off followed by a small update. </p><p>The beautiful thing is that the model learns what to forget and what to reinforce on its own. You do not write rules. You show it millions of user sequences, and it figures out the pattern.</p><h2>Thing which Product Manager needs to decide</h2><h4>Memory size</h4><p>How many drawers per user? More drawers mean richer history, but more computing. 20 might work for e-commerce. For a content platform like TikTok, where users burn through items in seconds, you might want 50 or 100.</p><h4>Item level or feature level</h4><p>Explainability or performance. Pick one. You cannot have both.</p><h4>Memory weighting</h4><p>optimal weight for recent behaviour. Start there. Then an A/B test. Stable domains like books or music can push intrinsic weight higher. Volatile domains like news or short-form video need more memory weight.</p><h4>Write strategy</h4><p>For item level, first-in-first-out is fine. For the feature level, you need the forget-and-reinforce approach. It is more powerful. 
<p>If this article changed how you think about memory, recommendation architectures, and AI system design, you will find much more depth in our AI PM course. </p><blockquote><p><em>We cover these in 40+ Videos and 25+ Case Studies, along with AI PM interview questions from top AI companies.</em></p></blockquote><p><a href="https://topmate.io/technomanagers/1861184">Check our highest-rated AI PM course (Including AI PM Interview Preparation) &#183; 4.9/5 &#183; 600+ enrollments.</a></p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. <strong><a href="https://topmate.io/technomanagers">Weekly Live Webinars/MasterClass ( Here )</a></strong></em></p>]]></content:encoded></item><item><title><![CDATA[90-Day Plan: Become an AI PM (starting from Zero)]]></title><description><![CDATA[If I Had to Start Over in 2026]]></description><link>https://www.technomanagers.com/p/90-day-plan-become-an-ai-pm-starting</link><guid isPermaLink="false">https://www.technomanagers.com/p/90-day-plan-become-an-ai-pm-starting</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Thu, 16 Apr 2026 16:58:07 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/26b237ae-2b48-4c0d-98f3-100b3ae0913d_1920x1080.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If I lost every skill I have tomorrow and had 90 days to get hired as an AI PM, I would not watch a single YouTube video about prompt engineering for the first three weeks.</p><p>I would not open ChatGPT. I would not read a blog post about agents. I would not touch a no-code tool.</p><p>I would do something most PMs skip entirely.</p><p>I would learn how AI systems actually work before I try to build anything with them.</p><p>This sounds obvious. It is not what people do. What people do is jump straight to the shiny layer. They learn prompting. They learn vibe coding. They build a chatbot in an afternoon and update their LinkedIn headline to &#8220;AI Product Manager.&#8221;</p><p>Then they sit in an interview. The interviewer asks how they would design evaluations for a RAG-based support system. They freeze.</p><p>They have no answer because they skipped the foundation that the answer sits on.</p><p>90 days is enough time. But only if you sequence things correctly.</p><p>Here is the exact sequence.</p><h2><strong>Weeks 1 to 3. The Foundation Nobody Wants to Build.</strong></h2><p>Most PMs hear about AI and start with Generative AI. This is backwards.</p><p>Generative AI is a layer that sits on top of machine learning. Machine learning sits on top of data systems. If you do not understand the layers below, you cannot reason about the layer above.</p><h4><strong>Week 1 is about machine learning at a PM level. Not math. Not code. Concepts.</strong></h4><ol><li><p>What is supervised learning, and what is unsupervised learning?</p></li><li><p>What is the difference between a classification problem and a regression problem?</p></li><li><p>When your team says we trained a model, what does that actually mean?</p></li><li><p>What did they feed it? What did they optimise for? What could go wrong?</p></li></ol>
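<p>To make the &#8220;we trained a model&#8221; question concrete, here is a minimal, made-up sketch using scikit-learn. The features and labels are invented; the point is only the shape of the process: labelled examples in, an objective optimised, predictions out.</p><pre><code># What "we trained a model" means in practice: feed the model labelled examples,
# let it optimise for predicting the label, then ask it about data it has not seen.
from sklearn.linear_model import LogisticRegression

# Each row: [sessions_last_week, items_in_cart]; label: did the user buy? (1/0)
X = [[1, 0], [2, 1], [8, 3], [7, 4], [0, 0], [9, 5]]
y = [0, 0, 1, 1, 0, 1]

model = LogisticRegression()
model.fit(X, y)                       # "training": fit parameters to the labelled data

print(model.predict([[6, 2]]))        # classification: will this user buy?
print(model.predict_proba([[6, 2]]))  # and with what probability
</code></pre>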
<blockquote><p><em>You are not learning this to become a data scientist.</em></p></blockquote><p>You are learning this so you can sit in a design review and know whether the team picked the right approach.</p><blockquote><p>If you cannot do that, you are not an AI PM. You are a project manager with a fancy title.</p></blockquote><h2><strong>Week 2 is about the AI Flywheel.</strong></h2><p>This is the concept that separates AI products from regular software.</p><p>In regular software, a feature works the same way on day 1 and day 1000.</p><p>In AI, the product should get smarter the more people use it. Users generate data. Data improves the model. The better model creates better user experiences. Better experiences bring more users.</p><blockquote><p><em>If you cannot design this loop for your product, you do not have an AI strategy.</em></p></blockquote><p>You have a feature with an API call.</p><h2><strong>Week 3 is about data pipelines.</strong></h2><p>This is the week that will feel the most boring and will be the most valuable.</p><p>Your AI is only as good as the data feeding it.</p><blockquote><p>Dirty data. Biased data. Missing data. Poorly labelled data.</p></blockquote><p>These are not engineering problems. These are product problems.</p><p>The PM who understands data pipelines catches issues in the design phase. The PM who does not understand them catches issues in production, after users have already had a bad experience.</p><p>By the end of Week 3, you should be able to whiteboard a basic ML system. Data in. Feature engineering. Model training. Prediction. Feedback loop. If you cannot draw this, you are not ready for what comes next.</p><h2><strong>Weeks 4 and 5. Algorithms and Case Studies.</strong></h2><p>You do not need to derive the math behind logistic regression.</p><p>You need to know when to use it.</p><p>Here is a real scenario.</p><p>Your team is deciding between a decision tree and a linear model for a pricing feature.</p><p>The engineer explains the trade-offs. If you do not understand what either approach does, you are sitting in that meeting as a spectator.</p><p>You are waiting for someone else to make a decision that is yours to make.</p><p>Week 4: learn the three algorithms that cover 80% of PM-relevant AI decisions.</p><ol><li><p>Linear regression for predicting continuous values.</p></li><li><p>Logistic regression for classification.</p></li><li><p>Decision trees for complex, non-linear problems.</p></li></ol><p>For each one, learn what it does, when it works, when it breaks, and what the output looks like.</p><p>Do not learn these from textbooks. Learn them from product case studies. How Uber uses prediction models for pricing.</p><p><em><strong>How Netflix uses collaborative filtering for recommendations.</strong></em></p><p><em><strong>How Amazon designs the data flywheel behind Alexa.</strong></em></p><p>Week 5 is entirely case studies. Read 10 to 15 real company teardowns.</p><p>How does Lyft balance model accuracy against latency in real-time pricing, and how does Amazon show the next best category to users? How does Netflix do creative personalisation on the homepage? 
<em><strong><a href="https://topmate.io/technomanagers/1861184">(You can find Case Studies Here)</a></strong></em></p><p>Every case study you internalise becomes a mental model you can pull out in a conversation, a strategy discussion, or an interview.</p><p>The PMs who sound the sharpest in rooms are the ones with the deepest library of real-world references.</p><h2><strong>Weeks 6 and 7. Generative AI From First Principles.</strong></h2><p>Now you are ready for Gen AI. Not before.</p><p>Week 6 is about understanding generative AI, how it is different from the AI that we learnt in the last few weeks. How does Generative AI work? What are some of the applications that generative AI can solve?</p><p>Then go deeper into the hood of Generative AI</p><ol><li><p>What is a transformer? What is a token? What is a context window?</p></li><li><p>What happens when you increase the temperature from 0.2 to 0.9?</p></li><li><p>Why does the same prompt give different outputs each time?</p></li><li><p>Why does the model hallucinate, and when is hallucination more likely?</p></li></ol><p>These are not academic questions. If your AI feature is hallucinating and you do not know whether the problem is the prompt, the temperature, the model, or the retrieval layer, you cannot diagnose it.</p><p>You are dependent on an engineer to figure it out. That is a loss of ownership.</p><p>Week 7 is prompting. Not &#8220;write me a blog post&#8221; prompting. Structural prompting.</p><ol><li><p>Chain of Thought. Tree of Thought. Few-shot examples.</p></li><li><p>System-level constraints. Prompt chaining for multi-step workflows.</p></li></ol><p>These techniques are the difference between an AI feature that works 60% of the time and one that works 95% of the time.</p><p>If you are building a product used by millions of people, that 35% gap is the difference between a feature users trust and a feature users abandon.</p><p>By the end of Week 7, you should be able to write a multi-step prompt chain that produces consistent, reliable output for a defined product use case. Not a toy demo. A real workflow.</p><h2><strong>Weeks 8 and 9. Prototyping. The Phase That Changes Your Career.</strong></h2><p>Everything before this was understandable. This is where you start building.</p><p>The gap between PMs who understand AI and PMs who build with AI is the largest salary gap in product management right now. The difference is not 10% or 20%. It is 2x to 3x.</p><h4><strong>Week 8: build your first working prototype.</strong></h4><p>Not a mockup. Not a slide deck. A working thing where the AI takes an input, processes it, and returns an output a real user can interact with.</p><p>Use Cursor. Use Replit. Use Claude. The tools are too good now for any PM to say &#8220;I cannot build anything.&#8221; You do not need to write production code. You need to string together an API, a prompt, and a simple interface.</p><p>Pick a real problem you face at work. Build a tool that solves it in an afternoon. When you walk into a meeting and show your team a working prototype instead of a spec, the dynamic changes permanently. You stop being the person who requests things. You become the person who builds things.</p><h4><strong>Week 9: Make the prototype reliable.</strong></h4><p>This is where most vibe coding efforts die.</p><p>The prototype works 70% of the time. The PM calls it done. Post it on LinkedIn. Gets some likes.</p><blockquote><p>But 70% reliability is not a product. It is a demo.</p></blockquote><p>Week 9 is about model control. 
Temperature tuning. Choosing between a fast, cheap model and a slow, expensive one for different parts of the workflow.</p><p>Building a reliability framework. Understanding the cost-per-query math so you can tell leadership &#8220;this feature costs X per user per month&#8221; and mean it.</p><p>These are PM decisions. Not engineering decisions. The PM who owns these trade-offs owns the product. The PM who delegates them owns a spec.</p><h2><strong>Weeks 10 and 11. RAG, Agents, and Evals.</strong></h2><p>If you have done everything above, you are in the top 10% of PMs by AI fluency.</p><blockquote><p><em>The top 1% knows three more things. RAG systems. AI agents. And evals.</em></p></blockquote><h3><strong>RAG</strong></h3><p>Almost every enterprise AI product is a RAG system. The reason is simple. GPT does not know your company data. Claude does not know your Q3 metrics. No off-the-shelf model knows your customer support documentation.</p><p>RAG bridges this gap. It retrieves relevant chunks from your private data and feeds them to the model so it can generate answers grounded in your specific context.</p><p>If you are PMing an enterprise AI product and you cannot explain how RAG works, how chunking affects retrieval quality, or what a vector database does, you cannot debug the most common failure mode: the system returning wrong or irrelevant answers.</p><h3><strong>Agents</strong></h3><p>These are AI systems that do not just respond. They act. They plan a multi-step workflow, use external tools, and execute tasks autonomously. The PM challenge is different here. You need to design guardrails, failure states, and human-in-the-loop checkpoints for a system that makes its own decisions.</p><h4><strong>Now, the most important skill of all three: evals.</strong></h4><p>Evals are how you measure whether your AI system is good.</p><p>This sounds simple. It is the hardest unsolved problem in AI product management. You cannot use traditional metrics. Pass/fail does not work when the output is a paragraph of text. You need deterministic evals for things you can measure objectively. You need probabilistic evals where you use one AI model to judge another.</p><p>The PMs who understand evals ship with confidence. They set measurable quality bars. They can defend their decisions to leadership with data.</p><p><a href="https://topmate.io/technomanagers/1861184">We have covered Advanced Evals here</a></p><h2><strong>Weeks 12 and 13. Interview Preparation &amp; Portfolio</strong></h2><p>You can know all of the above. If you cannot communicate it under pressure in 45 minutes, none of it counts.</p><p>AI PM interviews do not ask you to design an alarm clock for the blind. They ask questions like these:</p><ol><li><p>How would you measure the success of GPT 5.0?</p></li><li><p>Design a reliability framework for an AI shopping assistant.</p></li><li><p>ChatGPT&#8217;s regeneration rate has increased. How would you investigate?</p></li><li><p>How would you price Gemini?</p></li><li><p>Design a RAG system for TikTok content moderation.</p></li><li><p>Imagine Google made its model free, and it is better than paid GPT. You are Sam Altman. What do you do?</p></li><li><p><strong><a href="https://topmate.io/technomanagers/1861184">We have 20 such Real AI PM Interview Questions here</a></strong></p></li></ol><p>If you have never practised these questions, you will fumble.</p><p>Not because you do not know the concepts. 
Because you have not built the muscle of structuring an AI PM answer under time pressure.</p><h4><strong>The structure matters.</strong></h4><ol><li><p>Start by clarifying the AI system architecture.</p></li><li><p>Define success metrics specific to AI products.</p></li><li><p>Address trade-offs unique to probabilistic systems. Show cost-per-query awareness.</p></li><li><p>Show eval thinking. Demonstrate that you can move between product sense and technical depth in the same answer.</p></li></ol><p>Week 12 is practice. Answer questions out loud. Record yourself. Listen back. Find the moments when you hedged, when you went vague, when you lost the technical thread.</p><h3><strong>Week 13 is portfolio.</strong></h3><p>Document the prototype you built in Week 8.</p><ol><li><p>Write up two case study analyses from Week 5.</p></li><li><p>Create a one-page eval framework for an AI feature.</p></li></ol><p>This is your proof of work. It is the difference between &#8220;I learned about AI&#8221; and &#8220;I built with AI.&#8221;</p><p>This plan will make you an absolute beast after 12&#8211;13 weeks. 95% of PMs cannot do these things right now. The ones who can are not smarter. They just did the work in the right order. Most of this plan maps directly to what I teach. <strong>Want to skip the self-study phase?</strong></p><blockquote><p>Check our <strong>highest-rated AI PM course (Including AI PM Interview Preparation) &#183; 4.9/5 &#183; 600+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong></p></blockquote><h2><strong>About Author</strong></h2><p><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers">Live Webinars</a>, <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></p>]]></content:encoded></item><item><title><![CDATA[Advanced Evals - Traces in AI Evals]]></title><description><![CDATA[How to Debug AI Systems That Think in Steps]]></description><link>https://www.technomanagers.com/p/advanced-evals-traces-in-ai-evals</link><guid isPermaLink="false">https://www.technomanagers.com/p/advanced-evals-traces-in-ai-evals</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Tue, 14 Apr 2026 20:28:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3928af50-5666-48dd-a427-0ace77ca3dab_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You are a product manager at Amazon.</p><p>You just shipped Rufus. The AI shopping assistant that lives inside the Amazon app.</p><p>A user types: I am looking for running shoes for flat feet under 5000 rupees with good cushioning.</p><p>Your system does not just call an LLM and return a response. It runs a chain of operations.</p><ol><li><p>First, it classifies the user&#8217;s intent. Is this a product search? A comparison request? A return query?</p></li><li><p>Then, it extracts structured attributes from the query. Category: running shoes. Foot type: flat feet. Budget: under 5000. 
Feature: cushioning.</p></li><li><p>Then, it calls the product search API with those attributes and retrieves 20 candidate products.</p></li><li><p>Then, it applies a reranking model to sort those 20 products by relevance to the original query.</p></li><li><p>Then, it feeds the top 5 products and the original query to an LLM, which generates a conversational response with recommendations.</p></li><li><p>Finally, it applies a safety filter to check for hallucinated claims. Did the LLM say a shoe has orthopaedic certification when the product listing never mentioned it?</p></li></ol><p>Six places where something can go wrong.</p><p>The user sees the final response. It recommends three shoes. One of them is a basketball shoe. The cushioning claim on another is fabricated. The third recommendation is fine, but it costs 7200 rupees, which is above the stated budget.</p><p>Your VP asks what happened.</p><p>You look at the final output. It looks broken. But you have no idea which step broke it.</p><p>&#8212;&gt; Was it the intent classifier? <br>&#8212;&gt; The attribute extractor? <br>&#8212;&gt; The search API? <br>&#8212;&gt; The reranker? <br>&#8212;&gt; The LLM? The safety filter?</p><p>This is where traces come in.</p><h2>Why Traditional Evals Cannot Debug Multi-Step AI Systems</h2><p>In the previous article, <a href="https://www.technomanagers.com/p/advanced-evals-evals-for-rag">we covered RAG evals. Precision, Recall, MRR</a>. Those metrics evaluate one specific component: the retrieval layer. They tell you whether your system pulled the right documents.</p><p>But modern AI systems are not single-component systems. They are pipelines. Chains. Agents. Multiple models calling multiple tools in sequence, where the output of one step becomes the input of the next.</p><blockquote><p>Traditional evals look at the final output and ask: Was this answer good?</p></blockquote><p>Traces solve this problem. A trace is a complete record of everything your AI system did to produce a single response. Every step. Every input. Every output. Every decision, in order.</p><p>If the final answer is wrong, the trace tells you exactly where the pipeline broke.</p><h2>What Is a Trace?</h2><p>The idea of a trace is borrowed from distributed systems. In traditional software engineering, when a user sends a request to a web application, that request might travel through an API gateway, a load balancer, a backend service, a database, and a cache. A distributed trace records each hop, so engineers can see the full journey of a single request.</p><p>AI traces do the same thing, but for AI pipelines.</p><p>A trace represents the full lifecycle of a single user interaction with your AI system. From the moment the user sends a query to the moment the system returns a response.</p><p>A trace is made up of spans.</p><p>A span is one unit of work inside the trace. One step. One operation. One model call. One API request. One tool invocation.</p><h4>Every span records four things.</h4><ol><li><p>What went in. The input to that step.</p></li><li><p>What came out. The output of that step.</p></li><li><p>How long it took. The latency.</p></li><li><p>What type of operation it was. An LLM call, a retrieval step, a tool call, a function execution.</p></li></ol><p>Spans are nested. A parent span can contain child spans. This creates a tree structure that shows exactly how your system executed.</p><p>This is the anatomy of a trace. A tree of spans, each recording the inputs, outputs, and timing of a single step.</p>
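<p>If it helps to see it as a data structure, here is a minimal sketch of a span tree in Python. The field names and values are illustrative, not tied to any particular tracing library.</p><pre><code>from dataclasses import dataclass, field
from typing import List

@dataclass
class Span:
    name: str          # which step: intent classifier, retrieval, LLM call...
    span_type: str     # "llm", "retrieval", "tool", ...
    input: str         # what went in
    output: str        # what came out
    latency_ms: float  # how long it took
    children: List["Span"] = field(default_factory=list)  # nested child spans

@dataclass
class Trace:
    user_query: str
    root_spans: List[Span]

trace = Trace(
    user_query="running shoes for flat feet under 5000 with good cushioning",
    root_spans=[
        Span("intent_classifier", "llm", "user query", "product_search", 45),
        Span("attribute_extractor", "llm", "user query", "category, foot type, budget, feature", 120),
        Span("product_search_api", "tool", "extracted attributes", "20 candidate products", 230),
    ],
)

# Debugging becomes walking the tree span by span instead of staring at the final answer.
for span in trace.root_spans:
    print(f"{span.name}: {span.latency_ms}ms")
</code></pre>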
A tree of spans, each recording the inputs, outputs, and timing of a single step.</p><h2>Walking Through a Real Trace</h2><p>Let us go back to Rufus. The user asked: <br>I am looking for running shoes for flat feet under 5000 rupees with good cushioning.</p><p>Here is the trace your system recorded. Six spans, in order.</p><h3>Span 1: Intent Classifier</h3><blockquote><p>Input: I am looking for running shoes for flat feet under 5000 rupees with good cushioning.<br>Output: intent = product_search<br>Latency: 45ms <br>Model: Internal classifier v3</p></blockquote><p>This span worked correctly. The intent is product search. No issues here.</p><h3>Span 2: Attribute Extractor</h3><blockquote><p>Input: I am looking for running shoes for flat feet under 5000 rupees with good cushioning. <br>Output: {category: &#8220;running_shoes&#8221;, foot_type: &#8220;flat_feet&#8221;, max_price: 5000, feature: &#8220;cushioning&#8221;} <br>Latency: 120ms <br>Model: GPT-4o mini</p></blockquote><p>This span also worked correctly. All four attributes were extracted accurately from the query.</p><h3>Span 3: Product Search API</h3><blockquote><p>Input: {category: &#8220;running_shoes&#8221;, foot_type: &#8220;flat_feet&#8221;, max_price: 5000, feature: &#8220;cushioning&#8221;} <br>Output: 20 products returned. Product IDs: [A1, A2, A3, ... A20] <br>Latency: 230ms</p></blockquote><p>Here is the first problem. When you inspect the 20 products returned, you find that product A7 is a basketball shoe. The search API matched on cushioning but ignored the running_shoes category constraint. The API treated the category as a soft filter instead of a hard filter.</p><p>You also find that product A14 costs 7200 rupees. The max_price filter did not work as a strict cutoff.</p><p>Two bugs. Both in the search API span. Neither is visible in the final output without the trace.</p><h3>Span 4: Reranker</h3><blockquote><p>Input: 20 products from Span 3, original query. <br>Output: Top 5 ranked products: [A3, A7, A12, A14, A1]. <br>Latency: 180ms <br>Model: Cross-encoder reranker v2</p></blockquote><p>The reranker promoted A7 (the basketball shoe) to position 2 and A14 (the overpriced shoe) to position 4. The reranker matched on cushioning and boosted both products because they had strong cushioning scores.</p><p>The reranker did its job given the inputs it received. But those inputs were already contaminated by the search API.</p><p>This is a cascading failure. The search API lets in bad products. The reranker, operating on bad inputs, made the problem worse by promoting them.</p><h3>Span 5: Response Generator (LLM)</h3><blockquote><p>Input: Top 5 products + original query <br>Output: Based on your requirements, here are three great options for flat feet with excellent cushioning: 1. Nike Revolution 6 (Rs 3,499) with orthopaedic-grade cushioning technology... 2. Adidas CourtSmash (Rs 4,299) with premium arch support... 3. ASICS Gel-Kayano (Rs 7,199) with superior gel cushioning...<br>Latency: 1,200ms <br>Model: Claude Sonnet Tokens: 340 input, 180 output</p></blockquote><p>Multiple problems surfaced here. </p><ol><li><p>The LLM included A14 (ASICS at Rs 7,199) despite the user asking for under 5000.</p></li><li><p>The LLM fabricated orthopaedic-grade cushioning technology for the Nike shoe.</p></li><li><p>That phrase does not exist in the product listing. 
</p></li><li><p>And the LLM recommended the Adidas CourtSmash, which is the basketball shoe (A7) that the search API should have filtered out.</p></li></ol><h3>Span 6: Safety Filter</h3><blockquote><p>Input: Generated response. <br>Output: Response passed. No safety violations detected. <br>Latency: 85ms</p></blockquote><p>The safety filter checked for toxicity, PII, and explicit content. It did not check for factual accuracy against product listings. It did not catch the hallucinated orthopaedic-grade claim. It did not catch the budget violation.</p><p>The safety filter passed a response that contained two factual errors.</p><h2>What the Trace Reveals</h2><p>Without the trace, all you know is that the final answer was bad. With the trace, you know exactly what happened.</p><ol><li><p>The search API had two bugs. The category filter was soft instead of hard. The price filter allowed products above the stated maximum.</p></li><li><p>The reranker amplified the problem. It promoted bad products because it optimised for feature match without respecting hard constraints.</p></li><li><p>The LLM hallucinated a product claim. It added orthopaedic-grade cushioning technology, which does not exist in any source data.</p></li><li><p>The LLM ignored a constraint. It recommended a product above the user&#8217;s budget.</p></li><li><p>The safety filter was incomplete. It checked for toxicity but not for factual grounding or constraint adherence.</p></li></ol><p><em>Five distinct failure points. Three different components. Two cascading failures. One root cause (the search API) propagated through the entire pipeline.</em></p><p>You cannot find any of this by evaluating only the final output.</p><h2>Span-Level Evals: Evaluating Each Step Independently</h2><p>This is where traces and evals converge.</p><p>Traditional evals evaluate the system end-to-end. You compare the final output against a ground truth. That tells you whether the system worked, but not where it failed.</p><p>Span-level evals evaluate each span independently. You attach an evaluation metric to each span in the trace. Each step gets its own scorecard.</p><p>Let us apply this to our Rufus trace.</p><h3>Eval for Span 1 (Intent Classifier)</h3><p>Metric: Classification accuracy. <br>Ground truth: product_search<br>System output: product_search<br>Score: 1.0. Correct.</p><h3>Eval for Span 2 (Attribute Extractor)</h3><p>Metric: Attribute extraction F1. <br>Ground truth: {category: &#8220;running_shoes&#8221;, foot_type: &#8220;flat_feet&#8221;, max_price: 5000, feature: &#8220;cushioning&#8221;} <br>System output: Same. <br>Score: 1.0. All attributes correctly extracted.</p><h3>Eval for Span 3 (Search API)</h3><p>Metric 1: Category precision. What fraction of returned products match the requested category? 18 out of 20 products are running shoes. 2 are not. Score: 0.90.</p><p>Metric 2: Price constraint adherence. What fraction of returned products are under the stated max price? 17 out of 20 are under 5000. 3 are above. Score: 0.85.</p><p>Both scores reveal a leaky filter. Neither score would surface from an end-to-end eval.</p><h3>Eval for Span 4 (Reranker)</h3><p>Metric: NDCG (Normalised Discounted Cumulative Gain). 
Did the reranker place the most relevant products at the top?</p><p>If we define relevance as products that match ALL stated criteria (running shoes, flat feet, under 5000, good cushioning), then positions 2 and 4 in the top 5 contain products that violate at least one constraint.</p><p>NDCG@5: 0.72.</p><p>The reranker is optimising for partial relevance. It matches on some attributes while ignoring others.</p><h3>Eval for Span 5 (LLM Response)</h3><p>Metric 1: Faithfulness. Does every claim in the response have a source in the input products? The orthopaedic-grade cushioning technology claim has no source. Faithfulness score: 0.67.</p><p>Metric 2: Constraint adherence. Does the response respect all user-stated constraints? One product exceeds the budget. <br>Score: 0.67 (2 out of 3 recommendations within budget).</p><h3>Eval for Span 6 (Safety Filter)</h3><p>Metric: Hallucination detection rate. What fraction of factually unsupported claims were caught? The safety filter caught 0 out of 1 hallucinated claims. Score: 0.0 for factual grounding.</p><p>Now look at what you have.</p><p>A full diagnostic report. Each component was scored independently. You know that the intent classifier and attribute extractor are working perfectly. You know the search API has a filter leakage problem. You know the reranker needs constraint-aware scoring. You know the LLM has a faithfulness problem. You know the safety filter has a coverage gap.</p><p>This is an eval report built on traces. You cannot produce this without tracing your system.</p><h2>The Trace-to-Eval Pipeline</h2><p>Traces do not just help you debug individual failures. They create a flywheel for continuous improvement.</p><p>Here is how the pipeline works.</p><ol><li><p>Step 1: Your system logs traces from production. Every user query generates a trace with all its spans.</p></li><li><p>Step 2: You sample traces. Not every trace needs evaluation. You pick a subset. Maybe 1 to 5 percent of production traffic. Maybe all traces where the user gave a thumbs down. Maybe all traces where the response latency exceeded a threshold.</p></li><li><p>Step 3: You run automated evals on the sampled traces. LLM-as-a-judge scores each span for faithfulness, relevance, constraint adherence, whatever metrics matter for your product. This is called online evaluation.</p></li><li><p>Step 4: Traces that score poorly get routed to a human review queue. Domain experts look at the trace, examine each span, and annotate where the system failed. These annotated traces become your golden dataset.</p></li><li><p>Step 5: You use the golden dataset for offline evaluation. Before shipping any change to any component, you run the new version against your golden dataset and compare span-level scores.</p></li><li><p>Step 6: The improved system goes to production. It generates better traces. Those traces get sampled and evaluated. The cycle repeats.</p></li></ol><p>This is the trace-eval flywheel. Production traces become eval datasets. Eval datasets drive improvements. Improvements generate better traces. The system gets better every cycle.</p><p>Without traces, this flywheel does not exist. You cannot build a golden dataset if you do not know what each component did at each step.</p><h2>End-to-End Evals vs Span-Level Evals</h2><p>There is a common mistake teams make after they discover span-level evals. They stop running end-to-end evals entirely.</p><p>This is wrong. You need both. Here is why.</p><p>Span-level evals catch component failures. They tell you which step broke. 
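</p><p>As a rough sketch, building on the hypothetical Span objects from earlier, attaching evals to spans can be as simple as mapping each span name to a list of scoring functions and collecting a per-span scorecard. The metric functions and field names below are illustrative, not a standard API.</p><pre><code># Illustrative span-level evals: each function takes a span, returns a 0-1 score.
def category_precision(span):
    products = span.output["products"]
    on_category = [p for p in products if p["category"] == "running_shoes"]
    return len(on_category) / len(products)        # 18/20 = 0.90 in the example

def price_adherence(span):
    products = span.output["products"]
    over_budget = [p for p in products if p["price"] > 5000]
    return 1 - len(over_budget) / len(products)    # 17/20 = 0.85 in the example

# Which evals apply to which span.
SPAN_EVALS = {
    "product_search_api": [category_precision, price_adherence],
    # "response_generator": [faithfulness, constraint_adherence], ...
}

def score_trace(trace):
    """Return a per-span scorecard for one trace."""
    scorecard = {}
    for span in trace.spans:
        evals = SPAN_EVALS.get(span.name, [])
        scorecard[span.name] = {fn.__name__: fn(span) for fn in evals}
    return scorecard
</code></pre><p>Run something like this over a sample of production traces and you get the diagnostic report above: one score per component, per trace, instead of a single pass or fail on the final answer.</p><p>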
But they cannot catch emergent failures. Failures that only appear when components interact.</p><p>Consider this scenario. The intent classifier outputs &#8220;product_search&#8221; correctly. The attribute extractor outputs all four attributes correctly. The search API returns 20 relevant products. The reranker ranks them well. The LLM generates a fluent response. Every span passes its individual eval.</p><p>But the final response is still bad. The LLM picked three products that are all from the same brand. The user sees no variety. The response feels like a sponsored advertisement.</p><p>No individual span failed. The failure is emergent. It exists only in the interaction between the reranker (which promoted similar products) and the LLM (which did not add a diversity constraint).</p><p>End-to-end evals catch this. They evaluate the final output as a whole. Diversity, user satisfaction, and task completion.</p><h4>The framework is simple.</h4><div class="pullquote"><p>Use span-level evals to catch component failures. Where did the pipeline break?</p><p>Use end-to-end evals to catch emergent failures. Does the full pipeline produce good outcomes even when every component looks fine individually?</p><p>Use traces to connect the two. When an end-to-end eval catches a failure, walk the trace to find the root cause.</p></div><p>If you want to go deeper on Advanced Evals (Cohen&#8217;s Kappa, Matthew&#8217;s Correlation Coefficient), Evals for Agentic Architecture, AI Product Sense, AI Strategy, AI Pricing, AI Prototyping, Advanced Prompting, ML Systems, etc., check out my AI PM course (40+ Videos and 25+ Case Studies) [Certification Included]</p><p>Check our highest-rated AI PM course (Including AI PM Interview Preparation) &#183; 4.9/5 &#183; 600+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. <strong><a href="https://topmate.io/technomanagers">Weekly Live Webinars/MasterClass ( Here )</a>, <a href="https://topmate.io/technomanagers">AI PM Resources</a></strong></em></p>]]></content:encoded></item><item><title><![CDATA[Advanced Evals - Evals for RAG]]></title><description><![CDATA[A worked example of Precision@K, Recall@K, and MRR using Google AI Overviews.]]></description><link>https://www.technomanagers.com/p/advanced-evals-evals-for-rag</link><guid isPermaLink="false">https://www.technomanagers.com/p/advanced-evals-evals-for-rag</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Mon, 13 Apr 2026 21:11:45 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/c7c6c965-d99a-477d-9d85-25f4060ea6f4_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You are a product manager at Google. </p><p>You just shipped AI Overviews. </p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.technomanagers.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Technomanagers is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>The feature that puts an AI-generated answer right at the top of search results.</p><p>A user types: &#8220;Why does my iPhone battery drain fast after the iOS 26 update?&#8221;</p><p>Your system does two things. </p><ol><li><p>First, it retrieves five web pages from Google&#8217;s index that it thinks are relevant.</p></li><li><p>Then, it feeds those pages to Gemini and generates a summary answer.</p></li></ol><p>The answer looks clean. The formatting is right. Gemini&#8217;s language is fluent. Your VP sees it in a demo and says ship it.</p><p>But here is the question nobody in the room asked.</p><p>Were those five retrieved pages actually the right ones?</p><blockquote><p>Because if your retrieval pulled garbage, Gemini just summarised garbage. </p></blockquote><p>This is the problem every team building RAG systems runs into. And almost nobody evaluates it correctly.</p><h2>Why Evaluating RAG Is Different From Evaluating LLMs</h2><p>Traditional LLM evals test whether the model&#8217;s output is good. Did it answer correctly? Was the tone right? Did it hallucinate?</p><p><em>RAG evals test something upstream. They test whether the retrieval system fed the right inputs to the model in the first place.</em></p><p>A RAG pipeline has two components. </p><ol><li><p>The retrieval layer that selects documents. </p></li><li><p>And the generation layer that synthesises an answer from those documents. </p></li></ol><p>These are two separate systems. They fail in different ways. They need to be evaluated separately.</p><p>Most teams skip the retrieval evaluation entirely. They look at the final generated answer and if it is good, they assume the whole pipeline works.</p><p>That is a mistake. Because sometimes the model gets lucky. It generates a reasonable answer even from mediocre sources. And sometimes the retrieval is perfect, but the model fumbles the generation.</p><p>RAG evals separate these two failure modes. They tell you exactly where the pipeline broke.</p><p>And the retrieval layer? That is your job as a PM to get right. Because retrieval quality is a product decision. </p><p>&#8212;&gt; How many documents to retrieve? </p><p>&#8212;&gt; Which embedding model to use. </p><p>&#8212;&gt; What similarity threshold to set. </p><p><strong>These are all choices that show up in your PRD, not in a prompt.</strong></p><h2>The RAG Evaluation</h2><p>Let us go back to Google. You are evaluating AI Overviews for the query: &#8220;Why does my iPhone battery drain fast after iOS 26 update.&#8221;</p><p>Your retrieval system pulls five documents. Here is what it returned, in the exact order it ranked them:</p><ol><li><p>Position 1: Apple Support page on iPhone battery health settings. </p></li><li><p>Position 2: A CNET article titled &#8220;Best Android Phones With Long Battery Life in 2025.&#8221; </p></li><li><p>Position 3: A Reddit thread from r/iPhone where users share iOS 26 battery drain fixes. </p></li><li><p>Position 4: A MacRumors article covering iOS 26 release notes and known battery bugs. 
</p></li><li><p>Position 5: An Amazon product listing for an Anker battery case.</p></li></ol><p>Now, you need a ground truth. You need to know, which documents in your entire corpus were actually relevant to this query.</p><p>Your human evaluators (or your golden dataset) say there are exactly four relevant documents in the whole index for this query:</p><ol><li><p>Relevant Doc A: The Apple Support page on battery health settings. </p></li><li><p>Relevant Doc B: The Reddit thread with iOS 26 battery drain fixes. </p></li><li><p>Relevant Doc C: The MacRumors article on iOS 26 release notes and battery bugs.</p></li><li><p>Relevant Doc D: An Apple Developer Forum post about background app refresh causing battery drain in iOS 26.</p></li></ol><p>So here is the picture. Your system retrieved five documents. Three of them are relevant (positions 1, 3, and 4). Two are irrelevant (positions 2 and 5). And one relevant document (the Developer Forum post) was not retrieved at all.</p><p>Let us now measure exactly how good or bad this retrieval was.</p><h2>Precision@K in RAG: Are You Retrieving Junk?</h2><p>Precision answers a simple question. Out of everything you retrieved, how much of it was actually useful?</p><p>The formula is:</p><div class="pullquote"><p>Precision@K = (Number of relevant documents in the top K results) / K</p></div><p>Let us calculate it at different values of K.</p><h4>Precision@1.</h4><p>You look at only the top result. </p><p>Position 1 is the Apple Support page. That is relevant.</p><blockquote><p>Precision@1 = 1/1 = 1.0</p></blockquote><p>Perfect. Your top result is a hit.</p><h4>Precision@3.</h4><p>You look at the top three results. </p><p>Position 1: Apple Support page. &#8212; Relevant. <br>Position 2: CNET Android article. &#8212; Not relevant. <br>Position 3: Reddit iOS 26 thread. &#8212; Relevant.</p><blockquote><p>Precision@3 = 2/3 = 0.67</p></blockquote><p>Two out of three were useful. That CNET Android article diluted the quality.</p><h4>Precision@5.</h4><p>You look at all five results. </p><p>Position 1: Relevant. <br>Position 2: Not relevant. <br>Position 3: Relevant. <br>Position 4: Relevant. <br>Position 5: Not relevant.</p><blockquote><p>Precision@5 = 3/5 = 0.60</p></blockquote><p>Three out of five. 60%. That means 40% of what you fed to Gemini was noise.</p><p>Here is what this metric tells you as a PM.</p><p>A precision of 0.60 at K=5 means your context window is 40% garbage. Gemini has to work harder to ignore the Android article and the Anker battery case listing. Every irrelevant document increases the chance of a confused, diluted, or hallucinated answer.</p><p>If your precision is dropping, you need to look at your embedding model. Your similarity threshold is too loose. You are retrieving documents that are only tangentially related to the query.</p><p><em>Precision is a purity metric. It tells you whether your retrieval has a noise problem.</em></p><h2>Recall@K in RAG: Are You Missing Important Documents?</h2><p>Recall asks the opposite question. Out of everything that should have been retrieved, how much did you actually find?</p><p>The formula is:</p><div class="pullquote"><p>Recall@K = (Number of relevant documents in the top K results) / (Total number of relevant documents in the corpus)</p></div><p>We said there are four relevant documents total. Let us calculate.</p><h4>Recall@1</h4><p>You retrieved one document. It is relevant.</p><p>Recall@1 = 1/4 = 0.25</p><p>You found one out of four relevant documents, 25%. 
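</p><p>If you want to check these numbers yourself, here is a tiny sketch of both formulas applied to this exact example. The document IDs are just shorthand for the five retrieved pages and the four relevant ones above.</p><pre><code># Retrieved results, in the exact order the system ranked them.
retrieved = ["apple_support", "cnet_android", "reddit_ios26",
             "macrumors_ios26", "anker_case"]

# Ground truth: the four documents judged relevant for this query.
relevant = {"apple_support", "reddit_ios26", "macrumors_ios26", "dev_forum"}

def precision_at_k(retrieved, relevant, k):
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / k

def recall_at_k(retrieved, relevant, k):
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)

for k in (1, 3, 5):
    print(k, round(precision_at_k(retrieved, relevant, k), 2),
          round(recall_at_k(retrieved, relevant, k), 2))
# 1 1.0 0.25
# 3 0.67 0.5
# 5 0.6 0.75
</code></pre><p>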
You are missing 75% of the useful information.</p><h4>Recall@3</h4><p>The top three results contain two relevant documents (positions 1 and 3).</p><p>Recall@3 = 2/4 = 0.50</p><p>You have found half the relevant information. Better. But the user is still missing context about the iOS 26 release notes and the developer forum post.</p><h4>Recall@5</h4><p>All five results contain three relevant documents.</p><p>Recall@5 = 3/4 = 0.75</p><p>75%, You captured most of the relevant information. But that fourth document, the Developer Forum post about background app refresh, never made it in.</p><p>And that missing document? It might have been the most important one. It contains the actual technical fix. </p><p>A user reading the AI Overview gets generic advice on battery health settings but misses the specific step to disable background app refresh in iOS 26. That is a coverage gap. And it is invisible if you only look at Precision.</p><p>Here is what Recall tells you as a PM.</p><blockquote><p>Low recall means your users are getting incomplete answers. They are not seeing important perspectives. </p></blockquote><p>In a search product, low recall is how you lose trust. The user tries your AI answer; it does not solve their problem. They scroll past it to the blue links, and eventually, they stop reading AI Overviews entirely.</p><p>If your recall is low, you need to retrieve more documents (increase K). Or you need a better embedding model that captures semantic similarity more broadly.</p><p>But notice the tension. Increasing K improves recall but can hurt precision. </p><p>You pull in more documents, and some of them will be junk. </p><p><em>This is the fundamental tradeoff you manage as a PM. And it is why you need both metrics, not just one.</em></p><h2>The Precision-Recall Tradeoff in RAG Systems</h2><p>Let us put the numbers side by side.</p><p>At K=1: Precision is 1.00, Recall is 0.25. <br>At K=3: Precision is 0.67, Recall is 0.50. <br>At K=5: Precision is 0.60, Recall is 0.75.</p><p>See the pattern? As K increases, precision drops and recall rises. You are pulling in more documents, which means you find more relevant ones (recall goes up), but you also let in more noise (precision goes down).</p><p>This is not a math problem. This is a product problem.</p><div class="callout-block" data-callout="true"><p>If you are building a medical information feature, you want high recall. Missing a relevant safety warning is unacceptable. You tolerate some noise in exchange for completeness.</p></div><div class="callout-block" data-callout="true"><p>If you are building a customer support chatbot where context window tokens are expensive and latency matters, you want high precision. Every irrelevant document wastes tokens and slows response time.</p></div><div class="pullquote"><p>If you are building AI Overviews at Google, you need both to be high. A wrong source embarrasses you publicly. A missing source makes users lose trust. Your job is to find the K that maximises both, and to invest in a retrieval model that pushes the tradeoff curve outward.</p></div><p>This is a classic PM decision. It lives in your PRD. Not in a prompt engineering doc.</p><h2>Mean Reciprocal Rank (MRR): How Fast Does the User Find What They Need?</h2><p>Precision and Recall measure quantity. How many relevant documents did you retrieve versus how many exist?</p><p>MRR measures something different. It measures speed. 
How quickly does the first relevant document appear in your ranked results?</p><p>MRR stands for Mean Reciprocal Rank. Let us break it down from first principles.</p><p>First, Reciprocal Rank.</p><div class="pullquote"><p>For a single query, the Reciprocal Rank is 1 divided by the position of the first relevant document.</p></div><p>Go back to our query. &#8220;Why does my iPhone battery drain fast after iOS 26 update.&#8221;</p><p>The first relevant document is at position 1 (the Apple Support page).</p><p>Reciprocal Rank = 1/1 = 1.0</p><p>Perfect. The user&#8217;s first result was relevant. No scrolling needed.</p><p>But MRR is a system-level metric. It averages the Reciprocal Rank across multiple queries. Because one query hitting position 1 does not mean your system is good. You need to see the pattern.</p><p>Let us say you are evaluating AI Overviews across three queries.</p><p>Query 1: &#8220;Why does my iPhone battery drain fast after iOS 26 update.&#8221; Your system retrieves five documents. The first relevant one is at position 1. Reciprocal Rank = 1/1 = 1.0</p><p>Query 2: &#8220;How to enable dark mode on MacBook Air.&#8221; Your system retrieves five documents. The results are: a Windows dark mode guide at position 1, an unrelated MacBook keyboard shortcut article at position 2, and an Apple Support page on macOS dark mode settings at position 3. The first relevant document is at position 3. Reciprocal Rank = 1/3 = 0.33</p><p>Query 3: &#8220;Is Apple Vision Pro compatible with prescription lenses?&#8221; Your system retrieves five documents. Position 1 is a generic VR headset comparison. Position 2 is Apple&#8217;s official Vision Pro page, mentioning Zeiss optical inserts. Relevant. Reciprocal Rank = 1/2 = 0.50</p><p>Now you calculate MRR:</p><p>MRR = (1.0 + 0.33 + 0.50) / 3 = 1.83 / 3 = 0.61</p><p>Your MRR is 0.61.</p><p>What does this mean?</p><p>An MRR of 1.0 means every single query gets its first relevant document at position 1.</p><p>An MRR of 0.5 means, on average, the first relevant document is around position 2.</p><p>Your score of 0.61 says users are typically finding a relevant result between positions 1 and 2.</p><h3>Here is why MRR matters as a product metric.</h3><p>In a RAG system, the order of retrieved documents affects the generated answer. Most LLMs pay more attention to the documents that appear first in the context. </p><p>This is called positional bias. </p><p>If your first relevant document is buried at position 3 and positions 1 and 2 are noise, the model might give more weight to the noise.</p><p>In a search product, position matters even more directly. Users skim from top to bottom. If the first result is irrelevant, many users bounce. Every position of delay costs you engagement.</p><p>MRR captures this. It rewards systems that put relevant results first. Not just systems that retrieve relevant results somewhere in the list.</p><h2>Precision@K, Recall@K, and MRR Together: A Complete RAG Evaluation</h2><p>Each metric tells you something different about your retrieval system.</p><p>Precision@K tells you: Are you retrieving junk? Is your context window polluted?</p><p>Recall@K tells you: Are you missing important documents? Is your coverage complete?</p><p>MRR tells you: Are the good results ranked at the top? Is your order right?</p><h4>You need all three. Here is why.</h4><ol><li><p><em>A system can have high precision and low recall. </em><br>It retrieves only two documents, both relevant. Great purity. But it missed eight other relevant documents. 
The user gets a narrow, incomplete answer.</p></li><li><p><em>A system can have high recall and low precision. </em><br>It retrieves fifty documents and finds all the relevant ones. Great coverage. But it also pulled in forty irrelevant documents that confuse the model and waste tokens.</p></li><li><p><em>A system can have good precision and recall but bad MRR. </em><br>It retrieves five documents; four are relevant. But the one irrelevant document sits at position 1. The model anchors on it. The answer starts wrong.</p></li></ol><p>The PM&#8217;s job is to optimise all three simultaneously. That means choosing the right K, selecting the right embedding model, tuning the similarity threshold, and potentially reranking results after initial retrieval.</p><h2>Why Most RAG Teams Skip Retrieval Evaluation</h2><p>Here is what I see happen repeatedly.</p><p>A team builds a RAG pipeline. They use an off-the-shelf embedding model. They set K to 5 because someone saw it in a tutorial. They evaluate only the final generated answer. The answers look good in a demo. They ship.</p><p>Three months later, users complain. The AI answers are kind of right, but missing the point.</p><p>The team starts debugging the LLM. They try better prompts. They switch models. They add guardrails. Nothing helps consistently.</p><blockquote><p><em>The problem was never the LLM. The problem was the retrieval. Nobody measured Precision. Nobody measured Recall. Nobody checked MRR. Nobody even built a golden dataset of relevant documents for their queries.</em></p></blockquote><p>The retrieval layer determines the quality ceiling of your entire RAG system. If retrieval is broken, no amount of prompt engineering or model upgrades will fix the output consistently.</p><div class="pullquote"><p>Evaluate the retrieval first. Fix it first. Then evaluate the generation.</p></div><p>If you want to go deeper on Advanced Evals ( Cohen&#8217;s Kappa, Mathew&#8217;s Correlation Coefficient (MCC) ), Evals for Agentic Architecture, AI Product Sense, AI Strategy, AI Pricing, AI Prototyping, Advanced Prompting, ML Systems, etc., check out my AI PM course (40+ Videos and 25+ Case Studies ) <strong>[Certification Included]</strong></p><p>Check our <strong>highest-rated AI PM course (Including AI PM Interview Preparation )&#183; 4.9/5 &#183; 600+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong></p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. <strong><a href="https://topmate.io/technomanagers">Weekly Live Webinars/MasterClass ( Here )</a>, <a href="https://topmate.io/technomanagers">AI PM Resources</a></strong></em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.technomanagers.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Technomanagers is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Will AI Create More Product Managers?]]></title><description><![CDATA[Everyone in tech is asking the same question]]></description><link>https://www.technomanagers.com/p/will-ai-create-more-product-managers</link><guid isPermaLink="false">https://www.technomanagers.com/p/will-ai-create-more-product-managers</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Wed, 08 Apr 2026 19:46:36 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/63d39198-acaa-41a7-bd3c-1ae1d7bd173b_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Before reading this, you can also read the following articles on Technomanagers</p><ol><li><p><a href="https://www.technomanagers.com/p/nvidia-strategy-2026">NVIDIA AI Strategy</a></p></li><li><p><a href="https://www.technomanagers.com/p/memory-in-ai-part-1">Memory in AI</a></p></li><li><p><a href="https://www.technomanagers.com/p/spotifys-ai-strategy">Spotify&#8217;s AI Strategy</a></p></li></ol><p>Everyone in tech is asking the same question. <strong>Will AI replace product managers?</strong></p><p>They are making a 160-year-old mistake. By the end of this article, you will see why the answer is the opposite of what they expect.</p><p>But first. Coal.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.technomanagers.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Technomanagers is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>What Is the Jevons Paradox?</h2><p>In 1865, an economist named William Stanley Jevons noticed something strange. James Watt had made the steam engine more efficient. Everyone expected coal consumption to fall.</p><p>Coal consumption exploded.</p><p>When each unit of coal did more work, the cost of useful work dropped. When the cost dropped, people found more work to do. Factories that could never afford steam power suddenly could.</p><blockquote><p>Jevons called it a paradox. Make something cheaper per unit and you expect less total usage. That is almost never what happens.</p></blockquote><p>Cars got fuel-efficient. People drove more. </p><p>LEDs used less electricity. People installed ten times as many. </p><p>AWS made servers cheap. Companies spun up thousands of microservices where they once ran five.</p><p><strong>Efficiency does not eliminate demand. It creates it.</strong></p><p>This paradox is about to hit product management. 
But to see how we need to answer a question most PMs have never thought clearly about.</p><h2>What Does a Product Manager Actually Do?</h2><p>Not the job description version. The first principles version.</p><p>A PM does three things.</p><ol><li><p>Uncertainty reduction. Talking to users. Analysing data. Running experiments. All of it serves one purpose. Figuring out what to build and for whom.</p></li><li><p>Cross-functional coordination. Keeping engineering, design, data science, and marketing aligned on the same problem.</p></li><li><p>Tradeoff arbitration. Time versus scope. Revenue versus user experience. Short-term versus long-term. The PM makes the call and owns the outcome.</p></li></ol><p>Which of these three does AI make cheaper?</p><p>All three. But not equally.</p><p>And the unevenness is where the future of this profession gets decided.</p><h2>How AI Is Changing Product Management Work</h2><p>Uncertainty reduction just collapsed in cost. AI synthesises 50 user interview transcripts in two minutes. It scans competitors, generates hypotheses, and drafts experiment designs. Cost per unit down 80% in 18 months.</p><p>Coordination got somewhat cheaper. AI drafts PRDs in minutes. Summarises meetings. Translates technical requirements into business language. Maybe 40% cheaper.</p><blockquote><p>Tradeoff arbitration did not get cheaper at all.</p></blockquote><p>AI can present options and model scenarios. But the decision where you weigh strategy against user needs against tech debt against team capacity and say &#8220;we are doing this and not that&#8221; remains human.</p><p>Two of three PM functions got dramatically cheaper. One stayed the same. You already know what Jevons would predict.</p><p>The specifics are wilder than you think.</p><h2>Why AI Will Increase Demand for Product Managers?</h2><p>Most people stop at step one. Existing PMs get more productive. One PM covers what three did. Fewer PMs needed.</p><p>That is the McKinsey argument. The LinkedIn influencer argument. It is step one of four. And the only step that reduces headcount.</p><p><strong>Step two. Latent demand unlocks.</strong></p><p>Every large company has problems that never got product thinking because it was too expensive. Shopify has over 400 internal tools. Before 2024, fewer than 30 had a dedicated PM. When an AI-augmented PM covers three times the surface area, those neglected tools suddenly deserve attention. Shopify added 15 internal-product PM roles in early 2025.</p><p><strong>Step three. New AI product roles emerge.</strong></p><p>Every AI feature needs a PM. Every agent workflow needs someone to define boundaries, failure modes, and user experience. A lot of companies did not have a PM for AI-driven personalisation in 2020. Now there is an entire team. Multiply that across every e-commerce, fintech, and SaaS company.</p><p><strong>Step four. Non-tech industries hire PMs for the first time.</strong></p><p>AI makes building software cheap enough that hospitals, banks, and governments now build their own products. </p><p>Step one reduces PM headcount by 30%. Steps two through four increase it by 200 to 300%.</p><p>Jevons was right. Again.</p><p>This is also why learning to work with AI as a PM is no longer optional. The PMs getting hired in steps two through four are not traditional PMs. They understand how AI systems work and how to build products around them. </p><h2>Will AI Agents Replace Product Managers Completely?</h2><p>The serious counterargument. AI agents will handle tradeoff decisions, too. 
PMs become redundant.</p><p>Three problems with this.</p><ol><li><p>Processing is not judgment. Spotify&#8217;s AI knows what percentage of podcast listeners churn. That is a pattern. Whether to invest in podcasts versus audiobooks versus live audio depends on positioning against Apple, licensing economics, and creator dynamics. Data surfaces patterns. Judgment decides what to do with them.</p></li><li><p>Product decisions are not optimisation problems. Feature A serves power users but alienates new ones. Feature B grows the funnel but adds 15% support costs. Feature C needs a migration that slows everything for two quarters. No formula resolves this.</p></li><li><p>Who sets the goal in the first place? AI optimises toward objectives. Someone decides what those objectives are. Which metrics matter? Which users to prioritise? Which problems to solve? That is the core of PM work. It does not get automated. It gets more valuable.</p></li></ol><p>But here is the part that should worry you if you are a certain type of PM.</p><h2>The Future of Product Management: Three Tiers</h2><p>The market is splitting. The value distribution is brutal.</p><blockquote><p><strong>Tier one - The Compression Zone.</strong></p></blockquote><p>Execution-heavy work. PRDs, dashboards, tickets, standups. AI compresses this by 70 to 80%. If most of your week looks like this, your leverage is deflating every quarter. Not your job. Your leverage.</p><blockquote><p><strong>Tier two - The Leverage Layer.</strong></p></blockquote><p>Systems-level work. Experiment design, metrics frameworks, and feedback loops. AI augments this but does not replace it. PMs here use AI to multiply their output. Their value goes up with AI.</p><blockquote><p><strong>Tier three - The Taste Premium.</strong></p></blockquote><p>The PM who sees what others miss. Who kills the feature that looks great on a spreadsheet but feels wrong. Who sets the vision that aligns everything else.</p><p>The Taste Premium does not get cheaper. It gets scarcer. When supply drops and demand explodes, the price goes up.</p><p>Spreadsheets arrived, and people predicted the end of accountants. </p><p>Canva arrived, and people predicted the end of designers. </p><p>Design headcount exploded. But the premium for world-class brand work went up.</p><p>Democratisation of basic work expands the market. It also concentrates value at the top.</p><h2>How Product Managers Can Prepare for the AI Era?</h2><p>One question decides your next five years.</p><p>Are you building skills that get cheaper when AI improves or skills that get more valuable?</p><p>If you spend your time on uncertainty reduction and coordination, AI is compressing your value. The market will pay less because AI does a version of it for near zero.</p><p>If you spend your time on tradeoff arbitration and taste, the market for you is about to expand. Every new AI-augmented PM and every new product surface needs someone at the top making the calls.</p><p>Moving from the Compression Zone to the Taste Premium does not happen by accident. It requires understanding how AI systems work, how to build products around them, and how to develop the judgment that AI cannot replicate.</p><p>Jevons figured this out in 1865. Coal did not disappear. It powered an industrial revolution.</p><p>Product management is not disappearing. 
It is becoming how every company builds.</p><p>The question is which tier you will be in when it happens.</p><p>We created a course ( 40+ Videos and 25+ Case Studies )  for PMs who want to build in the right direction. How to think about AI as a PM. How to design AI-first products. How to build the judgment layer that AI cannot replace, AI Deepdive, AI Evals and AI Interview Preparation</p><p>Check our <strong>highest-rated AI PM course (Including AI PM Interview Preparation )&#183; 4.9/5 &#183; 600+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong></p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. <strong><a href="https://topmate.io/technomanagers">Weekly Live Webinars/MasterClass ( Here )</a></strong></em></p><p>Technomanagers is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.technomanagers.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Technomanagers is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[How Session-Based RNNs Predict Your Next Swipe in TikTok?]]></title><description><![CDATA[TikTok&#8217;s Rabbit Hole?]]></description><link>https://www.technomanagers.com/p/how-session-based-rnns-predict-your</link><guid isPermaLink="false">https://www.technomanagers.com/p/how-session-based-rnns-predict-your</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Fri, 03 Apr 2026 11:59:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!0mx3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95409d99-6371-4001-80ea-6a571f5900b7_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There is a comfortable lie in Product Management.</p><p>If you have enough historical data on a user, you know exactly what they want. You build your collaborative filtering models. You map out their lifetime preferences. You assume your algorithm is bulletproof.</p><p>Then you look at TikTok.</p><p>And you realise historical data is often a trap.</p><p>If your AI relies on what a user did yesterday, it fails to understand what they crave right now. In platforms where intent shifts by the minute, historical profiling is dead. You need to predict the immediate future based on the immediate past.</p><p>This is the first principle breakdown of how to build the Rabbit Hole effect. 
We will move from traditional recommendation engines to Session-Based Recommendations using Recurrent Neural Networks.</p><p>If you are preparing for AI PM interviews, this is one of the most important system design concepts you can learn. We teach this and many more real interview scenarios in our course.</p><h2>What is TikTok, Really?</h2><p>From a product architecture standpoint, TikTok is not a social network. It is an AI-driven bipartite matching engine.</p><p>It does not care who your friends are. It does not care what you followed last month. It cares about one thing. Matching an infinite supply of highly fragmented content with highly volatile human attention.</p><p>The moment you open the app, you are a blank slate. Not because TikTok does not have your data. But your data from yesterday is almost useless for predicting what you want in the next 30 seconds.</p><p>This is already a fundamentally different product philosophy from Instagram or YouTube. Those platforms are built on the social graph. TikTok is built on the interest graph. And the interest graph is rebuilt from scratch every single session.</p><h2>The Strategic Bet: Kill the Social Graph</h2><p>This is the part most PMs miss entirely.</p><p>TikTok did not just build a better recommendation engine. It made a strategic decision to remove the social graph as the primary distribution mechanism.</p><p>On Instagram, your content reaches your followers first. Then the algorithm decides whether to push it further. On YouTube, subscribers see your videos first. Then the algorithm takes over.</p><p>On TikTok, follower count is almost irrelevant for distribution. The algorithm decides who sees what, independent of social connections. A creator with 200 followers can get 10 million views on a single video if the algorithm detects high engagement in the first few hundred impressions.</p><p>Why does this matter strategically?</p><p>Because it means TikTok does not need network effects to retain users. Traditional social platforms are sticky because your friends are there. You cannot leave Instagram because your social circle is on Instagram. That is a network effect moat.</p><p>TikTok replaced network effects with algorithmic effects. You do not stay on TikTok because your friends are there. You stay because the algorithm understands you better than any other platform. The algorithm itself is the moat.</p><p>This is why TikTok&#8217;s valuation swings wildly depending on whether the recommendation algorithm is included in the deal. Reports from the US sale negotiations showed that TikTok, without its algorithm, could be worth as little as 40 billion dollars. TikTok with its algorithm is worth closer to 200 billion dollars.</p><p>The algorithm is not a feature. 
It is the entire business.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!JhP0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8be1ab7-f765-40da-9ae5-c467afcd70f3_1536x1024.png" width="1456" height="971" alt=""></figure></div><h2>The User Behaviour Problem</h2><p>On traditional platforms like Netflix or Amazon, a user&#8217;s session is slow. They search. They read reviews. They watch a two-hour movie. You have time to understand them.</p><p>On TikTok, user behaviour is chaotic.</p><p>Users do not explicitly tell you what they like. They signal it through micro-actions. A two-second linger. A rapid swipe. A share. Finishing a 15-second loop twice. These are all implicit signals.</p><p>A user might open the app wanting comedy. Three swipes later, they see a video about fixing a sink. Suddenly, their intent shifts entirely to DIY home repair.</p><p>And here is the hardest part. Even if a user is logged in, every time they open the app, their current emotional state is essentially a cold start. They might have had a bad day. They might be bored. They might be curious about something they have never explored before.</p><p>Historical profiles cannot capture this. Only the current session can.</p><h2>The Problem Statement</h2><p>How do we accurately predict and serve the next piece of content to a user when their current intent is unknown, rapidly changing, and largely divorced from their long-term preferences?</p><p>If the system relies on long-term data, it will continue to show comedy even after the user has mentally shifted to DIY. The algorithm will feel clunky. Out of touch.</p><p>This is not a theoretical problem. This directly hits the business.</p><h2>The Metrics Framework: Three Layers</h2><p>Before we solve it, we must measure the pain. And we need to measure it at three distinct layers. Most PMs only think about one layer. That is a mistake.</p><h4>Layer 1: Session Health Metrics (Does the user stay?)</h4><p>Time to First Abandonment tells you how many videos a user swipes through before killing the session. If this number is high, the algorithm is slow to adapt. Think of it as the &#8220;cold start tax.&#8221; How many bad recommendations does the user tolerate before leaving?</p><p>Session Watch Time is the total minutes spent in a continuous app session. TikTok&#8217;s average is 95 minutes per day across multiple sessions. 
If a single session averages less than 8 to 10 minutes, the recommendation engine is leaking users.</p><p>Swipe-to-Completion Ratio is the ratio of videos skipped within three seconds versus videos watched to 80%  or more completion. A bad ratio means the system is serving the wrong content. A healthy For You feed should have a completion ratio above 40%  within the first 10 videos of a session.</p><h4>Layer 2: Engagement Depth Metrics (Does the user care?)</h4><p>Not all engagement is equal. TikTok&#8217;s algorithm weights signals differently because some signals are more honest than others.</p><p>Watch Time Percentage is the strongest signal. A user who watches 95 per cent of a 60-second video has expressed genuine interest. They did not tap a button. They gave you their attention. Attention is the most expensive thing a human can give.</p><p>Replay Rate tracks how many users watch a video more than once. This is a signal that the content was not just good but worth revisiting. Replays are weighted heavily because they are almost impossible to fake.</p><p>Share Rate is even more telling. A user who shares a video is doing free distribution work for TikTok. They are putting their social capital on the line by recommending content to their friends. This is the highest intent signal after a purchase.</p><p>Save Rate means the user wants to come back to this content later. This is a forward-looking intent signal that most platforms underweight.</p><p>Comment Sentiment is trickier. A comment can be positive, negative, or neutral. Raw comment counts are misleading. TikTok&#8217;s system analyses whether comments indicate genuine engagement or hate-watching. Both drive views, but only positive engagement drives long-term session health.</p><h4>Layer 3: Business Impact Metrics (Does the algorithm make money?)</h4><p>Revenue Per Session connects recommendation quality directly to dollars. If the algorithm serves better content, users stay longer, see more ads, and revenue per session increases.</p><p>Ad Completion Rate measures whether users watch the ads placed between videos. If the surrounding content is relevant and engaging, users are in a positive attention state and more likely to watch an ad through. If the content is poor, users are already in &#8220;skip mode&#8221; and will reflexively skip the ad too.</p><p>DAU Retention at Day 1, Day 7, and Day 30 tells you whether the recommendation quality is good enough to bring users back. A single great session means nothing if the user does not return tomorrow. This is the ultimate test.</p><p>Most PMs stop at Layer 1. The best AI PMs connect all three layers into a single causal chain. 
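</p><p>The exact weights are TikTok&#8217;s secret. But the idea of weighting honest signals more heavily is easy to sketch. Every number below is made up purely for illustration.</p><pre><code># Hypothetical weights: larger for signals that are harder to fake.
WEIGHTS = {
    "watch_pct": 0.40,  # fraction of the video actually watched (0 to 1)
    "replay":    0.25,  # 1 if the user watched it more than once
    "share":     0.20,  # 1 if the user shared it
    "save":      0.10,  # 1 if the user saved it for later
    "comment":   0.05,  # 1 if the user left a genuine (non hate-watch) comment
}

def engagement_score(signals):
    """Weighted engagement for one video impression, between 0 and 1."""
    return sum(weight * signals.get(name, 0) for name, weight in WEIGHTS.items())

# Watched 95% of the way through and replayed, but never shared or saved:
print(round(engagement_score({"watch_pct": 0.95, "replay": 1}), 2))  # 0.63
</code></pre><p>The point is not these numbers. The point is that a replay or a share moves the score far more than a passive view ever can.</p><p>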
Better recommendations lead to better session health, which leads to deeper engagement, which leads to higher ad revenue and retention.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!NTxz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Febf69425-1097-409c-b381-a23f7f9b3c33_1536x1024.png" alt=""></figure></div>
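<p>To make Layer 1 concrete, here is a minimal sketch of how these session health metrics could be computed from a raw swipe log. The event fields, thresholds, and weights are illustrative assumptions, not TikTok&#8217;s actual pipeline.</p><pre><code># Toy session health metrics from a single session's event log.
# Each event: (video_id, watch_seconds, video_length_seconds)
session = [
    ("v1", 2.1, 30), ("v2", 1.4, 45), ("v3", 28.0, 30),
    ("v4", 61.0, 60), ("v5", 2.9, 20), ("v6", 15.0, 15),
]

def session_watch_time_minutes(events):
    return sum(watch for _, watch, _ in events) / 60.0

def time_to_first_abandonment(events, skip_seconds=3.0):
    """How many videos the user tolerated before the first rapid skip."""
    for index, (_, watch, _) in enumerate(events):
        if skip_seconds > watch:
            return index
    return len(events)

def swipe_to_completion_ratio(events, skip_seconds=3.0, completion=0.8):
    """One reasonable reading of the metric: completed watches per rapid skip."""
    skips = sum(1 for _, watch, _ in events if skip_seconds > watch)
    completions = sum(1 for _, watch, length in events if watch >= completion * length)
    return completions / max(skips, 1)
</code></pre><p>In production these run as streaming jobs over billions of events, but the definitions the PM signs off on are exactly this small.</p>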
class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Why Traditional Recommendation Methods Fail Here</h2><p>Most PMs are familiar with Matrix Factorisation, also called Collaborative Filtering.</p><p>It works like this. Users who liked video A also liked video B. So if you liked A, the system recommends B.</p><p>This approach has powered Amazon and Netflix for years. But it fails catastrophically on TikTok.</p><p>The reason is simple. It ignores the sequence of actions.</p><p>If you watch Video A, then Video B, then Video C, the order in which you watched them contains massive contextual clues about your shifting intent. Matrix Factorisation treats them as a disorganised bucket of likes. It does not know that C came after B. It does not know that the transition from A to B was a signal.</p><p>Sequence matters. 
<p>Sequence matters. And for sequence, you need a fundamentally different architecture.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!akyH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ad67b7-998c-4b8c-be20-473de914e5ac_1536x1024.png" alt=""></figure></div>
pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Why RNN is the Way Forward</h2><p>This is where the paradigm shift happens.</p><p>Recurrent Neural Networks, specifically architectures like GRU4Rec (Gated Recurrent Units for Recommendations), are designed exclusively for sequential data.</p><p>Think of it this way. An RNN treats a user&#8217;s session like a sentence. If you read the words &#8220;I want to eat an...&#8221; your brain predicts &#8220;apple.&#8221; An RNN does the same thing with user actions. It looks at the strict chronological sequence of the last 10 swipes and uses that sequential memory to predict the 11th.</p><p>This is fundamentally different from collaborative filtering. The RNN does not ask &#8220;what did users like you enjoy?&#8221; It asks &#8220;given the exact order of what you just did, what should come next?&#8221;</p><h2>The PM Requirements Before Any Code is Written</h2><p>Before the ML engineers write a single line of code, the AI PM must define the constraints. If you fail here, the model will be a theoretical success and a production disaster.</p><p>There are three critical requirements.</p><ol><li><p>First is latency. The model must run online inference. When a user swipes, the RNN must update its state and fetch the next video in under 50 milliseconds. If inference takes 200ms, the user sees a loading spinner. On TikTok, a loading spinner is death.</p></li><li><p>Second is defining a session. Is a session defined by 30 minutes of inactivity? Or is it defined by a hard app close? This seems like a small decision but it fundamentally changes how the model trains. Usually, a 30-minute inactivity threshold works best.</p></li><li><p>Third is signal weighting. The PM must define what inputs matter and how much they matter. A like is an explicit signal. A video completion is an implicit signal. The model must ingest both. But watch-time percentage should be weighted highest because it is the most honest signal. People lie with likes. They do not pay attention.</p></li></ol><p>These are PM decisions, not engineering decisions. 
<p>These are PM decisions, not engineering decisions. If you get them wrong, no amount of model tuning will save you.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!0mx3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95409d99-6371-4001-80ea-6a571f5900b7_1536x1024.png" alt=""></figure></div>
pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>How the RNN Actually Works Under the Hood</h2><p>Let us look inside the GRU, the engine of the RNN session model.</p><p>When a user interacts with a video, that video is converted into an embedding vector. Think of the embedding as a numerical fingerprint that captures everything about that video in a compact format.</p><p>The core magic of the GRU is its Hidden State. This hidden state acts as the memory of the current session up to the current point in time.</p><p>As the user swipes to a new video, the GRU updates its memory. It uses two internal mechanisms.</p><p>The Update Gate decides how much of the past session to remember. If the user&#8217;s recent behaviour is consistent, the gate stays mostly open, preserving the session memory.</p><p>The Reset Gate decides how much of the past to forget because the user&#8217;s intent has shifted. If the user suddenly jumped from comedy to cooking, the reset gate activates and says &#8220;Forget the comedy context, something new is happening.&#8221;</p><p>The formula for the memory update looks like this.</p><blockquote><p><em>New Memory = (1 - Update Gate) X Old Memory + Update Gate X Candidate Memory</em></p></blockquote><p>In plain language, the model blends the old session memory with the new signal based on how much the user&#8217;s intent has shifted. This blending happens after every single swipe. 
The memory is always fresh.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!1R-c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29464ab4-137a-4b20-b597-94a0795b53a9_1536x1024.png" alt=""></figure></div>
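<p>Here is what that update looks like as code. This is a toy, randomly initialised GRU cell written to mirror the formula above, not TikTok&#8217;s production model.</p><pre><code>import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, W, U, b):
    """One swipe: blend the old session memory with the new video's signal."""
    z = sigmoid(W["z"] @ x + U["z"] @ h_prev + b["z"])         # update gate
    r = sigmoid(W["r"] @ x + U["r"] @ h_prev + b["r"])         # reset gate
    candidate = np.tanh(W["c"] @ x + U["c"] @ (r * h_prev) + b["c"])
    return (1 - z) * h_prev + z * candidate                     # the blockquote formula

# Toy sizes: 8-dim video embeddings, 16-dim session memory.
rng = np.random.default_rng(0)
d_in, d_h = 8, 16
W = {k: rng.normal(scale=0.1, size=(d_h, d_in)) for k in "zrc"}
U = {k: rng.normal(scale=0.1, size=(d_h, d_h)) for k in "zrc"}
b = {k: np.zeros(d_h) for k in "zrc"}

h = np.zeros(d_h)                          # a fresh session starts with empty memory
for swipe in range(10):                    # ten swipes into the session
    video_embedding = rng.normal(size=d_in)
    h = gru_step(video_embedding, h, W, U, b)
</code></pre><p>The hidden state after the tenth swipe is the session memory that candidate videos are scored against.</p>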
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>How the Model is Trained</h2><p>You cannot train this like a normal classification problem. The video catalogue has millions of items. You cannot ask the model to predict the exact video.</p><p>Instead, we use a technique called Bayesian Personalised Ranking or BPR.</p><p>The idea is simple but powerful. Instead of predicting the exact next video, you train the model to rank the actual next video the user watched higher than a randomly sampled video the user did not watch.</p><p>You take the video that the user actually watched next. You call it the positive item. You randomly sample a video the user did not watch. You call it the negative item. Then you train the model so that the score for the positive item is always higher than the score for the negative item.</p><p>Over millions of such comparisons, the model learns what sequences of behaviour lead to what kinds of content. It learns the grammar of user intent.</p><h2>Measuring Model Success: Offline and Online</h2><p>You need two completely different sets of metrics here. One for the engineers during training. One for the business during A/B testing.</p><p>For offline ML metrics, you track Recall at K. Out of the top 20 videos the RNN predicted, was the actual next video the user watched in that list? If yes, the model is doing its job.</p><p>You also track Mean Reciprocal Rank. It is not enough to be in the top 20. Was it ranked number 1 or number 19? MRR heavily penalises the model if the correct prediction is buried at the bottom of the list.</p><p>For online product metrics, you track Next-Click CTR. Does the user actually watch the immediately next video served by the RNN? You also track Session Length Extension. Does the RNN variant increase the average session length compared to the control group?</p><p>If Recall at 20 is high but Session Length is flat, your model is technically accurate but not creating the Rabbit Hole effect. Both metrics must move together. This is where AI PMs earn their salary. Bridging the gap between what the ML team optimises and what the business actually needs.</p><h2>The Three Pitfalls That Will Destroy Your Product</h2><p>If you implement this blindly, you will destroy your user experience. 
<h2>The Three Pitfalls That Will Destroy Your Product</h2><p>If you implement this blindly, you will destroy your user experience. There are three pitfalls every AI PM must guard against.</p><ol><li><p>The first is the Echo Chamber problem. RNNs are almost too good at detecting immediate intent. If a user pauses on a sad video for two seconds too long, the RNN might plunge them into a depressive rabbit hole, serving nothing but sad content for the rest of the session. The solution is to inject random exploration videos using multi-armed bandits (sketched after this list). You intentionally break the sequence to test for new intents. This is a PM decision, not a model decision. TikTok&#8217;s own algorithm does this. It deliberately injects novelty and diversity into the feed to prevent monotony and to protect users from harmful content spirals.</p></li><li><p>The second is Catastrophic Forgetting. Standard RNNs heavily weight the most recent clicks and forget the beginning of the session. If a session is 100 swipes long, the intent from swipe 10 might still be relevant. But the GRU might have forgotten it entirely. This is why some teams use attention mechanisms on top of GRUs, allowing the model to look back at any point in the session, not just the most recent swipes.</p></li><li><p>The third is Cold Start on New Items. The RNN is great at handling new users because it builds understanding from the very first swipe. But it struggles with brand-new videos that have no embeddings yet. You still need content-based filtering to push new creator videos into the system until they accumulate enough interaction data. TikTok solves this through a tiered distribution system. Every new video is first shown to a small, highly targeted test group. If engagement is strong within that group, the video gets pushed to a larger audience. This is how a creator with 200 followers can wake up with 10 million views.</p></li></ol>
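<p>The exploration injection from the first pitfall can be as small as this. A minimal epsilon-greedy sketch; the 10% rate is an illustrative product choice, and real systems use smarter bandits than uniform random picks.</p><pre><code>import random

def pick_next_video(ranked_candidates, exploration_pool, epsilon=0.1):
    """Mostly exploit the RNN's top pick, but deliberately break the sequence
    some of the time to test whether the user has a new intent."""
    if epsilon > random.random():
        return random.choice(exploration_pool)    # novelty / diversity injection
    return ranked_candidates[0]                   # the model's best guess

next_video = pick_next_video(["v42", "v17"], ["new_creator_clip", "trending_dance"])
</code></pre>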
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d877533-a619-4307-9b48-b9c753bf52e6_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:550890,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.technomanagers.com/i/193059535?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d877533-a619-4307-9b48-b9c753bf52e6_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zOiJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d877533-a619-4307-9b48-b9c753bf52e6_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!zOiJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d877533-a619-4307-9b48-b9c753bf52e6_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!zOiJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d877533-a619-4307-9b48-b9c753bf52e6_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!zOiJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d877533-a619-4307-9b48-b9c753bf52e6_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AI Product Management is not about throwing LLMs at every problem. It is about understanding the structural reality of your user data and connecting it to the business's strategic goals.</p><p>If this article changed how you think about recommendation systems and product strategy, you will find much more depth in our AI PM course. 
We cover system design for recommendations, RAG architectures, AI metrics, agentic systems, and real interview questions from top companies.</p><p>Check our <strong>highest-rated AI PM course (Including AI PM Interview Preparation )&#183; 4.9/5 &#183; 600+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong></p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. <strong><a href="https://topmate.io/technomanagers">Weekly Live Webinars/MasterClass ( Here )</a></strong></em></p><p>Technomanagers is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p>]]></content:encoded></item><item><title><![CDATA[I Reviewed 50 AI PM Job Descriptions. Here’s What Companies Actually Want in 2026.]]></title><description><![CDATA[Crack AI PM Roles]]></description><link>https://www.technomanagers.com/p/i-reviewed-50-ai-pm-job-descriptions</link><guid isPermaLink="false">https://www.technomanagers.com/p/i-reviewed-50-ai-pm-job-descriptions</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Thu, 02 Apr 2026 16:28:33 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4be2a80b-d8d0-45f4-8b0d-cdaa1d1f7fe1_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There is a pattern in how PMs prepare for AI product roles.</p><p>They read about LLMs. They learn to write better prompts.</p><p>They update their resume with words like &#8220;generative AI&#8221; and &#8220;responsible AI.&#8221; They think they are ready.</p><p>The job descriptions say otherwise.</p><p>I spent three weeks reading 50 AI PM job postings. Google, Meta, OpenAI, Eightfold, and a few high-growth AI-native startups.</p><p>I was looking for one thing: what skills are companies actually asking for versus what PMs are actually building.</p><p>The gap is larger than I expected.</p><h2><strong>What Everyone Is Preparing For</strong></h2><p>The standard AI PM preparation list looks like this.</p><p>&#8594; Learn how LLMs work at a high level. Understand transformers.<br>&#8594; Read about RAG and fine-tuning.<br>&#8594; Practice writing product strategy documents that mention AI.<br>&#8594; Build a portfolio project, preferably a chatbot.</p><p>This is not wrong. These things appear in JDs.</p><p>But they appear the same way &#8220;strong communication skills&#8221; appear in a generic PM role. As table stakes.</p><p>As filters that remove clearly unqualified candidates, not signals that separate good candidates from great ones.</p><p>The companies I looked at are past the point where &#8220;I understand what an LLM is&#8221; is a differentiator. They are hiring for people who have operated at the intersection of AI and product. Operated, not studied.</p><p>Three skills kept appearing across JDs in a way that most candidates are not building.</p><h2><strong>Skill 1: Evals</strong></h2><p>This is the biggest gap I found.</p><p>The OpenAI CPO, at Lenny&#8217;s Podcast conference in 2025, said something that should have become required reading for every PM preparing for an AI role. He said, &#8220;The most important thing a product manager can learn to do is write evals.&#8221;</p><p>Most PMs I speak to do not know what an eval is. They think it means user testing. 
It does not.</p><p>An eval is a structured test suite for an AI system. You define a set of inputs, the expected output or behaviour, and a scoring method. You run the system against those inputs. You measure how often it performs correctly. When the model changes, when you change the prompt, when you change the data, you run the evals again. You see what broke.</p><p>In traditional software, a bug is a bug. The system does the wrong thing, and you fix it.</p><p>In LLM-based products, the system does the wrong thing in some cases, the right thing in others, and something ambiguous in a third set. Without evals, you have no way to know which category a new change falls into. You are shipping blind.</p><p>Sample interview questions look like this:</p><ol><li><p>How would you measure the reliability of Rufus, the e-commerce AI Assistant?</p></li><li><p>How would you measure the success of a multi-agent workflow?</p></li></ol><p>Every time you change anything about the system, you run the evals again. If the score drops, you do not ship.</p><blockquote><p><a href="https://topmate.io/technomanagers/1861184">This might help you to prepare in detail about Evals</a></p></blockquote><h2><strong>Skill 2: Model Selection Logic</strong></h2><p>The second skill is the ability to choose between AI approaches, not just use them.</p><p>A year ago, &#8220;AI feature&#8221; meant &#8220;integrate GPT-4 via API and ship.&#8221; That is no longer a differentiated product decision.</p><p>Today, hiring managers want PMs who can reason through the following question: for this specific problem, what is the right approach and why?</p><p>The options a PM now needs to reason through include prompt engineering alone, RAG with a vector database, fine-tuning a smaller model, training a purpose-built model from scratch, or using a rule-based system instead of AI entirely. Each option has a different cost structure, latency profile, accuracy ceiling, maintenance burden, and failure mode.</p><blockquote><p><em>A PM who cannot reason through this tradeoff is not a PM for an AI product. They are a PM who happens to have AI on their roadmap.</em></p></blockquote><p>Let me make this concrete. Say you are building a feature that answers customer queries about an e-commerce return policy. Your choices are:</p><p>Prompt engineering: fast to build, but the model will hallucinate policies that do not exist in your documentation. You have no grounding.</p><p>RAG: you retrieve the relevant policy sections and inject them into the prompt context. The model can now answer accurately against your actual policy. Build time is higher, but accuracy is significantly better.</p><p>Fine-tuning: you train a smaller model specifically on your policy data and support conversations. Latency is lower, cost per query is lower, but you now have a maintenance responsibility. When your policy changes, you need to retrain.</p><p>Rule-based: for simple, high-volume queries like &#8220;what is your return window,&#8221; a rule-based system has zero hallucination risk and near-zero latency. AI adds no value here.</p><blockquote><p><a href="https://topmate.io/technomanagers/1861184">Real AI PM Interview Questions Click Here</a></p></blockquote>
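<p>To show how small the core idea is, here is a toy eval harness for the return-policy example above. The candidate systems are stubs standing in for a prompt-only pipeline and a RAG pipeline; real suites have hundreds of cases and stronger graders, but the shape is the same.</p><pre><code>def run_evals(answer_fn, eval_set):
    """Run one candidate system over a fixed eval set and return its pass rate."""
    passed = sum(1 for query, grade in eval_set if grade(answer_fn(query)))
    return passed / len(eval_set)

# Hand-written eval cases: (input query, grading function the PM defines).
eval_set = [
    ("What is your return window?",   lambda answer: "30 days" in answer),
    ("Do you refund shipping costs?", lambda answer: "shipping" in answer.lower()),
    ("Can I return a used mattress?", lambda answer: "cannot" in answer.lower()),
]

# Stand-in candidates; in reality these call your prompt-only and RAG pipelines.
def prompt_only_assistant(query):
    return "Returns are accepted within 30 days."

def rag_assistant(query):
    return "Per policy doc v3: 30 days window; shipping is refunded; used mattresses cannot be returned."

baseline = run_evals(prompt_only_assistant, eval_set)
candidate = run_evals(rag_assistant, eval_set)
# If a change drops the score below the current baseline, you do not ship.
</code></pre>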
<h2><strong>Skill 3: Failure Mode Thinking</strong></h2><p>Traditional products fail in predictable ways. If a button does not work, it does not work. You find it in QA. You fix it. It stays fixed.</p><p>AI products fail in ways that are probabilistic, context-dependent, and sometimes invisible until they are very visible.</p><p>The failure modes that appear repeatedly in AI PM JDs are: hallucination (the model generates confident false information), latency degradation under load, context window limits causing incomplete reasoning, prompt injection attacks in user-facing LLM features, and confidence calibration problems where the model is wrong but sounds right.</p><p>PMs who can map failure modes before a feature ships are rare.</p><p>Most teams discover failure modes after launch because they were not built into the product definition.</p><p>Hiring managers know this and look for candidates who proactively think about what can go wrong.</p><p>In an interview, this shows up as questions like: &#8220;How would you define done for an AI feature?&#8221; or &#8220;Walk me through how you would monitor this after launch.&#8221;</p><p>A PM who only talks about launch metrics and A/B tests is signalling that they have not thought about probabilistic failure.</p><p>A PM who talks about confidence thresholds, fallback logic, latency monitoring, and a human-in-the-loop escalation path for low-confidence outputs is signalling operational maturity.</p><h2><strong>What This Means for Your Preparation</strong></h2><p>The JDs are not asking for people who know about AI. There are thousands of those.</p><p>They are asking for people who have operated AI products. Who have thought through eval design, made model selection decisions, and mapped failure modes before launch. These are skills you build by doing, not by reading.</p><blockquote><p><em>If you are preparing for an AI PM role right now, I would stop spending time on LLM theory and start spending time on the three skills above.</em></p></blockquote><p>That work is what separates candidates who know AI from candidates who have worked with AI. The JDs are very clear about which one they want.</p><p><em>AI Product Management is the future; you can keep ignoring it, but it will become the baseline in 8 to 14 months.</em></p><p>The way to kill the anxiety is to act: start today and learn AI Product Management from the basics to advanced topics with the Flagship AI PM Course.</p><blockquote><p>You can also check out our <strong>highest-rated AI PM course (Including AI PM Interview Preparation) &#183; 4.9/5 &#183; 600+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong></p></blockquote><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. 
<strong><a href="https://topmate.io/technomanagers">Weekly Live Webinars/MasterClass ( Here )</a></strong></em></p>]]></content:encoded></item><item><title><![CDATA[Spec-Driven Development for Product Managers]]></title><description><![CDATA[Explained with Real Example of Google Maps]]></description><link>https://www.technomanagers.com/p/spec-driven-development-for-product</link><guid isPermaLink="false">https://www.technomanagers.com/p/spec-driven-development-for-product</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Sun, 29 Mar 2026 18:06:15 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9485ad9d-cef7-4958-8d0b-de91660a5335_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You are a Senior Product Manager at Google Maps. </p><p>Your VP walks into your Monday standup and says three words: &#8220;Build Ask Maps.&#8221;</p><p>What is &#8220;Ask Maps&#8221;: Users can type or speak a natural language query directly into Maps and get intelligent, context-aware answers. &#8220;Find me a rooftop restaurant near Koramangala that&#8217;s open after 10 PM and has good reviews for cocktails.&#8221; No filters. No manual search. Just ask.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Uq6v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Uq6v!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif 424w, https://substackcdn.com/image/fetch/$s_!Uq6v!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif 848w, https://substackcdn.com/image/fetch/$s_!Uq6v!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif 1272w, https://substackcdn.com/image/fetch/$s_!Uq6v!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Uq6v!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif" width="1000" height="562" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Ask Maps in action&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Ask Maps in action" title="Ask Maps in action" 
srcset="https://substackcdn.com/image/fetch/$s_!Uq6v!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif 424w, https://substackcdn.com/image/fetch/$s_!Uq6v!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif 848w, https://substackcdn.com/image/fetch/$s_!Uq6v!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif 1272w, https://substackcdn.com/image/fetch/$s_!Uq6v!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd828c6fd-2d12-417e-841f-fb8879b8d9df_1000x562.gif 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Everyone in the room is excited. The engineers want to start prototyping immediately. Someone has already opened Cursor.</p><blockquote><p>And this is exactly where most AI-era product development goes wrong.</p></blockquote><p>Because what happens next is what the industry has started calling &#8220;vibe coding.&#8221; Someone fires a prompt into an AI coding tool. </p><p>The tool generates a working prototype in 20 minutes. Everyone is impressed. The demo looks great. </p><p><em>Three sprints later, the codebase is a mess, the AI feature behaves inconsistently across edge cases, and no one can explain why it sometimes returns results in Tamil Nadu when the user is searching in Telangana.</em></p><p>Spec-driven development is the structured alternative. 
</p><p>And in this article, I want to walk you through exactly what it looks like: not in abstract terms, but through the Ask Maps feature, end to end.</p><p>Before we go deep into Spec-Driven Development, you can check out our earlier articles on the following:</p><ol><li><p><a href="https://www.technomanagers.com/p/uber-autonomous-vehicle-strategy">Uber&#8217;s AI Strategy</a></p></li><li><p><a href="https://www.technomanagers.com/p/memory-in-ai-part-1">Memory in AI</a></p></li><li><p><a href="https://www.technomanagers.com/p/spotifys-ai-strategy">Spotify&#8217;s AI Strategy</a></p></li></ol><h2>What Spec-Driven Development Actually Is</h2><p>Spec-driven development (SDD) is a methodology where you write a formal, machine-readable specification before any code is generated. </p><p>This specification defines the behaviour, constraints, success criteria, and edge cases for a feature. The AI coding agent then generates code that must satisfy the spec. If the generated code does not meet the spec, the build fails automatically.</p><p>This is different from how most teams use AI tools today.</p><p>In the traditional AI-assisted workflow, a developer writes a prompt, the AI generates code, the developer reviews it, finds gaps, re-prompts, and this cycle repeats until something &#8220;feels right.&#8221; </p><p>There is no contract. There is no explicit definition of what the feature must and must not do. The AI guesses. The developer hopes. Technical debt accumulates silently.</p><p>SDD flips this. You start with the contract. You define what Ask Maps must do before you decide how it should be built. The code is a downstream artefact of the spec, not a starting point.</p><p>Three levels of SDD exist: spec-first (the spec guides the AI workflow), spec-anchored (the spec is continuously updated as the feature evolves), and spec-as-source (only the spec is ever edited by humans, never the code directly). For most product teams, spec-first or spec-anchored is the practical operating mode.</p><p>Now, let us apply this to Ask Maps, step by step.</p><h2>Phase 1: Strategic Alignment Before Writing Anything</h2><p>Most PMs think the spec is the first output of the discovery process. It is not. The first output is alignment on what you are actually building.</p><p>Before anyone writes a spec for Ask Maps, the product team needs to resolve a set of foundational questions. These are not design questions. These are strategy questions.</p><h4><strong>What is the primary job to be done?</strong></h4><p>Ask Maps could solve multiple problems. It could be a discovery tool (help users find places they did not know existed). It could be a planning tool (help users build a full day itinerary). It could be a real-time assistant (give live, context-aware answers based on current traffic, weather, and availability). </p><p>These are three different features. They share a surface but diverge completely in their backend requirements, data dependencies, and success metrics.</p><p>At Google Maps, this decision has massive downstream consequences. A discovery-focused Ask Maps integrates deeply with Google&#8217;s restaurant and business index. A planning tool needs multi-stop optimisation logic. A real-time assistant needs live data pipelines for weather, traffic, and business hours APIs.</p><p>The team needs to pick one primary job. Everything else is scope creep.</p><h4><strong>What is the explicit scope boundary?</strong></h4><p>What will Ask Maps not do? This is just as important as what it will do. 
Will it handle transactional requests (&#8221;book a table at this restaurant&#8221;)? Or is it purely informational? Will it work offline? Will it support voice input at launch? What languages will it support on day one?</p><p>Scope boundaries are not limitations. They are the spec&#8217;s load-bearing walls.</p><h4><strong>What are the success metrics?</strong></h4><p>Before a single line of spec is written, the team defines what success looks like. For Ask Maps, this might be: query satisfaction rate above 80% (user rates the answer as helpful), mean response latency under 2 seconds, and a 15% increase in session length compared to the current Maps search flow.</p><p>These numbers are not arbitrary. They will directly inform the non-functional requirements in the spec later.</p><h4><strong>Practical SDD tool behaviour at this stage:</strong></h4><p>In tools like Agent OS, this phase is handled by a &#8220;spec researcher&#8221; sub-agent. It ingests your product brief and roadmap, then surfaces clarifying questions with suggested default answers. </p><p>The PM does not write long paragraphs in response. They respond with &#8220;yes&#8221; or minor corrections. The agent synthesises the answers into a structured requirements brief.</p><p>For Ask Maps, the output of this phase is something like:</p><ul><li><p>Primary job: Place discovery through natural language</p></li><li><p>Scope: Informational only, no transactions, English-first</p></li><li><p>Input modes: Text and voice</p></li><li><p>Success metric: Query satisfaction above 80%, P99 latency under 2 seconds</p></li><li><p>Out of scope for v1: Itinerary building, multi-stop optimisation, transactional bookings</p></li></ul><p>This is not the spec. This is the raw material for the spec.</p><h2>Phase 2: Writing the Ask Maps Specification</h2><p>Now the spec is written. And this is where SDD requires discipline, because the instinct is to write a technical spec. SDD requires a behavioural spec.</p><p>A behavioural spec defines what the system must do and how it must behave from the user&#8217;s perspective and from a system contract perspective. It does not prescribe the implementation.</p><p>Here is what the Ask Maps spec looks like:</p><h4><strong>Feature: Ask Maps</strong> <strong>Version:</strong> 1.0 <strong>Owner:</strong> [PM Name] <strong>Last updated:</strong> [Date]</h4><p><strong>Goal:</strong> Enable Google Maps users to discover places and get location-aware answers through natural language queries, without using traditional filter-based search.</p><h4><strong>User Stories:</strong></h4><ol><li><p>As a Maps user, I want to ask a natural language question about places near me, so that I can discover options I would not find through manual filters.</p></li><li><p>As a Maps user, I want to ask follow-up questions in the same session without re-entering my original context, so that I can refine my search conversationally.</p></li><li><p>As a Maps user, I want Ask Maps to consider my current location, time of day, and day of week automatically, so that I get contextually relevant answers without explicitly stating these.</p></li></ol><h4><strong>Functional Requirements:</strong></h4><ol><li><p>FR-01: The system must accept natural language queries of up to 500 characters via text input. </p></li><li><p>FR-02: The system must accept voice input and convert it to text before processing. </p></li><li><p>FR-03: The system must use the user&#8217;s current GPS-confirmed location as the default geographic context for all queries. 
</p></li><li><p>FR-04: The system must return a minimum of 3 and a maximum of 10 place results per query. </p></li><li><p>FR-05: Each result must include: place name, distance from user, rating, a one-line AI-generated reason for the recommendation, and a direct CTA to navigate.</p></li><li><p>FR-06: The system must support follow-up queries within the same session, preserving the context of the initial query. </p></li><li><p>FR-07: If the system cannot find relevant results with confidence above 0.75, it must surface a &#8220;limited results&#8221; state rather than hallucinating low-quality matches.</p></li></ol><h4><strong>Non-Functional Requirements:</strong></h4><ol><li><p>NFR-01: P50 response latency must be under 1 second. P99 must be under 2 seconds. </p></li><li><p>NFR-02: The system must handle a minimum of 10,000 concurrent queries. </p></li><li><p>NFR-03: The AI recommendation layer must not surface results from businesses that have a Google rating below 3.5 unless explicitly asked by the user. </p></li><li><p>NFR-04: The system must not store the user&#8217;s query text beyond the active session without explicit consent.</p></li></ol><h4><strong>Edge Cases and Failure Modes:</strong></h4><ol><li><p>EC-01: User is in a location with no GPS signal. The system must prompt the user to manually enter a location rather than defaulting to a stale cached location. </p></li><li><p>EC-02: User queries a category with no matches within a 10km radius. The system must expand the radius to 25km and inform the user of this expansion. </p></li><li><p>EC-03: Query contains a language other than English. V1 must return a graceful &#8220;English only&#8221; message. V2 will address multi-language support. </p></li><li><p>EC-04: Query is ambiguous (for example, &#8220;good food near me&#8221;). The system must ask one clarifying question before returning results, not make an assumption.</p></li></ol><h4><strong>Out of Scope:</strong></h4><ul><li><p>Transactional bookings (restaurant reservations, ride bookings)</p></li><li><p>Multi-stop itinerary planning</p></li><li><p>Queries not related to physical places (for example, &#8220;what is the capital of France&#8221;)</p></li></ul><div><hr></div><p>Notice what this spec does not contain: no database schema, no API structure, no infrastructure decisions. Those are implementation choices. The spec is silent on them intentionally. The AI coding agent gets to make those decisions within the constraints of the spec. The spec defines what must be true. The implementation decides how.</p><h2>Phase 3: The Design Document</h2><p>The spec is human-readable. 
The design document is agent-readable.</p><p>Once the spec is approved (and this approval step is non-negotiable in SDD, the PM and engineering lead both sign off before any code is generated), the AI agent translates the spec into a structured design document.</p><p>This document contains:</p><h4><strong>API Contract (from FR-01, FR-02, FR-04):</strong></h4><p>The Ask Maps endpoint accepts POST requests.</p><p>Input schema:</p><ul><li><p>query (string, required, max 500 characters): The natural language query </p></li><li><p>location (object, required): Contains lat (float) and lng (float) from GPS </p></li><li><p>session_id (string, optional): For follow-up query context preservation (FR-06) </p></li><li><p>input_mode (enum: &#8220;text&#8221; | &#8220;voice&#8221;, required)</p></li></ul><p>Output schema:</p><ul><li><p>results (array, min 3, max 10): Each object contains place_id, name, distance_km, rating, ai_reason, navigate_url </p></li><li><p>state (enum: &#8220;success&#8221; | &#8220;limited_results&#8221; | &#8220;clarification_needed&#8221; | &#8220;error&#8221;)</p></li><li><p>clarification_question (string, nullable): Populated only when state is &#8220;clarification_needed&#8221;</p></li></ul><h4><strong>Confidence Gate (from FR-07):</strong></h4><p>The AI recommendation layer must include a confidence score per result. Results with confidence below 0.75 are excluded from the final output array. If this exclusion brings the total results below 3, the system sets the state to &#8220;limited_results&#8221; and returns whatever results passed the threshold.</p><h4><strong>Radius Expansion Logic (from EC-02):</strong></h4><p>Initial query radius: 10km. If the results count is below 3 after confidence filtering, expand to 25km. Append a <code>radius_expanded: true</code> boolean to the response object. The UI layer uses this flag to surface the &#8220;We expanded your search area&#8221; message.</p><h4><strong>Security Constraints (from NFR-04):</strong></h4><p>Query text must not be written to any persistent store. Session data lives in ephemeral cache only, with a TTL of 30 minutes.</p><p>This design document becomes the to-do list for the AI coding agent. Each requirement maps to a specific implementation task. Nothing is left to interpretation.</p><h2>Phase 4: Breaking It Into Testable Tasks</h2><p>In SDD, the design document is decomposed into discrete, independently testable implementation units. This is where the workflow starts looking like traditional engineering project management, except that AI agents are executing the tasks, not humans writing the code from scratch.</p><p>For Ask Maps, the task breakdown looks like this:</p><h4><strong>Task Group A: Core API Layer</strong></h4><ul><li><p>A1: Implement the POST /ask-maps endpoint with input validation (FR-01, FR-02)</p></li><li><p>A2: Implement GPS location ingestion and validation. 
Fail gracefully if coordinates are malformed (EC-01)</p></li><li><p>A3: Implement session management with 30-minute ephemeral TTL (NFR-04, FR-06)</p></li></ul><h4><strong>Task Group B: AI Recommendation Engine</strong></h4><ul><li><p>B1: Integrate with Google Places API for candidate place retrieval</p></li><li><p>B2: Implement confidence scoring model with 0.75 threshold gate (FR-07)</p></li><li><p>B3: Implement radius expansion logic: 10km base, expand to 25km with flag (EC-02)</p></li><li><p>B4: Generate AI-written one-line reasons per result using the LLM layer</p></li></ul><h4><strong>Task Group C: Edge Case Handling</strong></h4><ul><li><p>C1: Implement ambiguity detection. If query is flagged as ambiguous, return clarification question instead of results (EC-04)</p></li><li><p>C2: Implement rating filter: exclude results with Google rating below 3.5 from candidate pool (NFR-03)</p></li><li><p>C3: Implement &#8220;English only&#8221; language detection for V1 (EC-03)</p></li></ul><h4><strong>Task Group D: Non-Functional Requirements</strong></h4><ul><li><p>D1: Load test to confirm P99 latency under 2 seconds at 10,000 concurrent queries (NFR-01, NFR-02)</p></li><li><p>D2: Security audit on query storage to confirm no persistent writes (NFR-04)</p></li></ul><p>Each task has a direct reference back to a specific requirement in the spec. This is the core discipline of SDD. You can always trace any line of code back to a business requirement. If you cannot, that code should not exist.</p><h2>Phase 5: Execution Under Constraints</h2><p>The AI coding agent now generates code. But this is not vibe coding with a spec document sitting nearby. The spec is an active constraint.</p><p>In practice, this means:</p><p>The CI/CD pipeline has automated spec validation checks embedded. </p><p>If the AI agent generates code for the Ask Maps endpoint that does not include the confidence threshold gate, the build fails. Not a code review comment. A hard build failure.</p><p>If the agent generates a response schema that returns results without the ai_reason field, the build fails. Because FR-05 explicitly mandates it.</p><p>If the agent writes query text to a database table (even a logging table), the build fails. Because NFR-04 says it cannot.</p><p>This is what &#8220;executable specification&#8221; means. The spec is not a document someone reads. It is a contract that the system enforces.</p><p>One critical challenge at this phase is what practitioners call context fragmentation. Most AI coding tools understand a single repository. But Google Maps is not a single repository. </p><p>The Ask Maps feature will touch the core Maps search service, the Places API integration layer, the user session service, the UI component library, and the AI/ML serving infrastructure. These live in different codebases, owned by different teams.</p><p>If the AI agent only sees one of these repositories, it will generate code that is locally correct but architecturally inconsistent. It will reinvent session management that already exists in the session service. It will create a new confidence scoring library instead of using the existing ML inference wrapper.</p><blockquote><p>This is why enterprise SDD needs a context engine that maps semantic dependencies across repositories. 
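</p></blockquote><p>That is the consistency half of the problem. The correctness half is the spec gates described above, and it is worth seeing what one of those could look like in practice. The sketch below is ours, not the output of any particular SDD tool: the endpoint, fields, and thresholds come from the Ask Maps spec, while the test client fixture and the test itself are invented for illustration.</p><pre><code># Hypothetical CI gate: fails the build if generated code drifts from the spec.
def test_ask_maps_response_respects_spec(client):   # "client" is an assumed test fixture
    resp = client.post("/ask-maps", json={
        "query": "quiet cafe with wifi near me",
        "location": {"lat": 12.9716, "lng": 77.5946},
        "input_mode": "text",
    })
    body = resp.json()

    # The state enum is part of the contract.
    assert body["state"] in {"success", "limited_results", "clarification_needed", "error"}

    results = body.get("results", [])
    if body["state"] == "success":
        assert len(results) in range(3, 11)   # FR-04: between 3 and 10 results
    for result in results:
        assert result["ai_reason"]            # FR-05: every result carries a reason
        assert result["rating"] >= 3.5        # NFR-03: low-rated places filtered out
</code></pre><blockquote><p>A spec gate keeps each generated change honest; a context engine keeps it consistent with everything around it.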
</p></blockquote><p>For a team without access to enterprise tooling, the practical workaround is explicit cross-repo documentation injected into the AI agent&#8217;s context at task time.</p><h2>Phase 6: Debugging the Spec, Not Just the Code</h2><p>This is the phase most PMs never hear about, and it is arguably the most important.</p><p>In SDD, when the AI generates code that is wrong, you do not fix the code directly. You fix the specification.</p><p>Here is why: AI code generation is non-deterministic. If you fix a bug in the generated code without updating the spec, the next time you regenerate (for a refactor, a new feature, or a regression fix), the AI will reproduce the exact same bug. It is following the spec. The spec said nothing about this case. So the AI guessed.</p><p>Concretely: Imagine the Ask Maps agent generates code that sometimes returns results from a different city when the user is near a city boundary. The radius expansion logic triggered, expanded to 25km, and pulled in results from an adjacent city without informing the user.</p><p>In vibe coding, a developer patches this edge case in the code and moves on.</p><p>In SDD, the PM goes back to the spec, adds a new edge case:</p><p>EC-05: When radius expansion crosses an administrative city boundary, the system must segment results by city and surface a separator in the UI indicating &#8220;Results from [adjacent city]&#8221;.</p><p>The spec is updated. The design document is updated. The CI/CD validation check is updated. The AI agent regenerates the affected module. The fix propagates correctly and permanently.</p><p>This is the compounding benefit of SDD. Every bug you find and fix in the spec makes the entire feature more robust, not just the one line of code that was wrong.</p><h2>Why This Matters for Product Managers Specifically</h2><p>SDD is not just an engineering methodology. It is a PM leverage tool.</p><p>In the traditional development model, the PM writes a PRD, hands it to engineering, and then spends the next three sprints in spec review meetings clarifying requirements that were ambiguous in the document. The PM is a translator, repeatedly.</p><p>In SDD, the spec is the single source of truth that both the PM and the AI agent operate from. When engineering asks, &#8220;Why does the endpoint behave this way?&#8221; the answer is always FR-07 or NFR-03. Not &#8220;I think I mentioned it in the PRD somewhere.&#8221; The spec is precise. The behaviour is traceable.</p><p>For PMs building AI-powered features specifically, this precision is not optional. Research shows AI LLMs generate vulnerable code at rates between 9.8% and 42.1%, and a significant fraction of those vulnerabilities are rated Critical severity. </p><blockquote><p>A PM who cannot articulate the exact constraints their AI feature must operate within is not doing product management. They are doing product wishful thinking.</p></blockquote><p>SDD forces PMs to be specific. That specificity is the PM&#8217;s highest-leverage contribution in an AI-first development environment.</p><h2>The Learning Curve Is Real, But It Pays Off</h2><p>When I first started working through SDD workflows, the upfront planning phase felt slow. Writing behavioural requirements instead of just describing the feature in prose felt overly formal. Defining edge cases before writing a single line of code felt premature.</p><p>Three sprints in, the compounding became obvious. 
The Ask Maps spec, once written, became the source for the engineering scoping document, the QA test plan, the security review checklist, and the launch readiness criteria. The spec was written once and used six times. Every clarifying question in sprint planning was answerable by pointing to a requirement ID.</p><p>The slow part upfront makes everything downstream faster.</p><h2>Where to Go From Here</h2><p>AI Product Management is the future; you can keep ignoring but this will become the baseline in 8 to 14 Months.</p><p>You should take action to kill the anxiety - Start today only, Learn about AI Product Management, start from basics to Advance with the Flagship AI PM Course.</p><p>You can also check out our <strong>Highest rated AI PM course &#183; 4.9/5 &#183; 500+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong> </p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item><item><title><![CDATA[Uber Autonomous Vehicle Strategy ]]></title><description><![CDATA[How Uber uses a hybrid network to solve robotaxi unit economics and scale AI mobility]]></description><link>https://www.technomanagers.com/p/uber-autonomous-vehicle-strategy</link><guid isPermaLink="false">https://www.technomanagers.com/p/uber-autonomous-vehicle-strategy</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Fri, 27 Mar 2026 19:41:13 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f5c6e4c5-24e4-4fb7-a1f0-785174e15062_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Before reading about Uber,  you can read the following articles on Technomanagers</p><ol><li><p><a href="https://www.technomanagers.com/p/nvidia-strategy-2026">NVIDIA AI Strategy</a></p></li><li><p><a href="https://www.technomanagers.com/p/memory-in-ai-part-1">Memory in AI</a></p></li><li><p><a href="https://www.technomanagers.com/p/spotifys-ai-strategy">Spotify&#8217;s AI Strategy</a></p></li></ol><p>Many people think Uber is competing with Waymo and Tesla to build the best self-driving car. </p><p>This is a flawed way to look at the business. </p><p>Uber is not building a car company. </p><p>Uber is building the software layer for physical mobility. </p><p>Let us break down the Uber autonomous vehicle strategy using first principles. </p><h3>The Utilisation Problem</h3><h4>Why does Uber not just buy a massive fleet of self-driving cars and keep all the profit?</h4><p>We need to look at the fundamental unit economics of a transportation marketplace.</p><p>Ride hailing demand fluctuates drastically. A Saturday night has massive demand, while a Tuesday morning has very low demand. </p><p>If a standalone robotaxi company builds enough cars to serve the Saturday night peak, then most of their expensive cars will sit idle on Tuesday morning. </p><p>Idle hardware destroys profitability because you still pay for depreciation and maintenance. 
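</p><p>A toy calculation makes the point. Every number below is invented; the shape of the result is what matters.</p><pre><code># Invented numbers: hourly ride demand at the weekend peak versus a quiet morning.
peak_demand = 10_000          # rides per hour, Saturday night
base_demand = 2_000           # rides per hour, Tuesday morning
rides_per_car_per_hour = 2

fleet_sized_for_peak = peak_demand // rides_per_car_per_hour            # 5,000 cars
quiet_hour_utilisation = base_demand / (fleet_sized_for_peak * rides_per_car_per_hour)

print(f"Cars needed for the peak: {fleet_sized_for_peak}")
print(f"Utilisation on a quiet morning: {quiet_hour_utilisation:.0%}")  # 20%
</code></pre><p>Eighty percent of fleet capacity sits idle in that quiet hour, and depreciation keeps accruing on every car. 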
</p><p>If they only build enough cars for Tuesday morning, then wait times on Saturday night will be too long, and users will leave the platform.</p><blockquote><p><strong>Uber solves this problem through a hybrid network. </strong></p></blockquote><p>They use self-driving cars to serve the base load. </p><p>The base load is the predictable, continuous demand that happens every hour of the day. Uber then uses human drivers to handle the burst capacity. </p><p>Burst capacity is the sudden spike in demand during bad weather or weekends. By pushing the volatile demand to human drivers, Uber ensures that its partner autonomous vehicles stay constantly utilised. </p><p>High utilisation directly leads to profitability.</p><h3>The Big City Myth</h3><h4>Will autonomous vehicles just take over the major cities?</h4><p>Software scales instantly, but physical infrastructure scales very slowly. </p><p>People assume that cities like San Francisco and Los Angeles generate all the rideshare money. </p><p>The latest Uber financial data shows something completely different. Trips in the top twenty US cities represent only twenty five percent of their overall profits.</p><p>The vast majority of Uber profits come from smaller cities and suburbs. It will take a very long time for autonomous vehicle companies to map every rural road and complex suburban driveway. </p><p>Uber already has human drivers covering these areas. Uber owns the demand in these highly profitable long tail markets while the hardware companies take on the massive capital expense of mapping physical geography.</p><h3>The Aggregator Advantage</h3><h4>If cars can drive themselves, why do hardware companies need the Uber platform at all?</h4><p>When a technology becomes a commodity, the company that aggregates customer demand captures the most value. </p><p>If Waymo dominates one city and another startup dominates a different city, the end consumer will have a terrible experience. </p><p>Users do not want to download five different apps and compare wait times.</p><p>Uber is positioning itself as the universal marketplace for mobility. </p><p>It does not matter if the vehicle has a Google brain or a Tesla brain. The expensive robotaxi needs a rider to generate revenue. Uber has hundreds of millions of active users. By acting as the aggregator, Uber forces hardware companies to plug into its routing algorithm. </p><p>Uber does not need to win the artificial intelligence race. Uber just needs to be the default platform where all the artificial intelligence models come to find their customers.</p><h2>Commanding The Pricing Power</h2><p>Whoever controls the user interface controls the pricing power. </p><p>If a customer opens the Uber app, they do not care who manufactured the car. They just want the cheapest and fastest ride possible.</p><p>This consumer behaviour gives Uber total negotiating control over the hardware companies. Uber can force the different self-driving companies to compete directly against each other on the same screen. </p><p>If one hardware company wants a higher cut of the fare, Uber will simply send the customer a cheaper vehicle from a different hardware maker. 
This dynamic will force the hardware companies to lower their prices to win the ride, while Uber maintains its high profit margins on every single transaction.</p><p>What do you think, what&#8217;s the future of mobility?</p><blockquote><p><em><strong><a href="https://topmate.io/technomanagers/1472775">For Full Detailed cases Studies and AI &amp; Strategy &#8212; Download this Book ( 5/5 Rated )</a></strong></em></p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://topmate.io/technomanagers/1472775&quot;,&quot;text&quot;:&quot;Download the Book&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://topmate.io/technomanagers/1472775"><span>Download the Book</span></a></p><p>You can also check out our <strong>Highest rated AI PM course &#183; 4.9/5 &#183; 500+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong> 60% OFF for a limited time &#8212; Code: NYE26</p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item><item><title><![CDATA[Nvidia Strategy 2026]]></title><description><![CDATA[The One Line From GTC 2026 That Nobody Is Talking About]]></description><link>https://www.technomanagers.com/p/nvidia-strategy-2026</link><guid isPermaLink="false">https://www.technomanagers.com/p/nvidia-strategy-2026</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Sun, 22 Mar 2026 18:50:53 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0ce609c7-51cf-4fcf-8bdb-cdfccde499fd_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Let me start with a sentence that almost no one is discussing.</p><p>Jensen Huang, speaking to Stratechery's Ben Thompson right after his GTC 2026 keynote, said this: <em>&#8220;It&#8217;s not just gonna be people banging on our SQL database now, it&#8217;s gonna be a whole bunch of agents banging on it. They&#8217;re gonna need to do it way faster.&#8221;</em></p><p>That is not a hardware announcement. That is a product warning. </p><p>Here is why I think that.</p><p>Every piece of software you have ever used was designed with one assumption baked in at the foundation. A human is on the other side. </p><p>Not just any human. A slow human. A human who reads, pauses, gets distracted, reopens the tab, waits for the page to load, and tolerates 400 milliseconds of latency without complaint. </p><p>Software was optimised for this human. The entire architecture of modern software products, rate limiting, pricing tiers, API design, and query performance budgets were calibrated to serve this specific use case.</p><p>That creature is no longer the primary user. And most products have not noticed yet.</p><h2>Why Tools Break When Agents Show Up</h2><p>Think about what actually happens when you plug an AI agent into an existing software tool.</p><p>The agent does not browse. 
It does not think about whether to click. It does not go to lunch. </p><p>It executes at machine speed, in a loop, with no downtime. A human sales rep might open your CRM 30 times in a workday. </p><p>An agent doing the same research task will hit your CRM API 30,000 times in an hour. The tool was never tested for this. The rate limit was never designed for this. The pricing model was never imagined for this.</p><p>The interesting thing is that this does not break gradually. It breaks at a threshold. Below it, everything works fine, and nobody notices. Cross it, and your authentication server is overwhelmed, your database connection pool is exhausted, your cost per customer quadruples overnight, and your support queue fills up with confused enterprise customers who say &#8220;we did not change anything.&#8221;</p><p>They did not change anything visible. They shipped an agent. That is the invisible change that breaks everything at once.</p><p>The moment when an AI agent becomes the dominant user of a system that was designed for humans, and all the original design assumptions fail simultaneously. It is not a gradual migration. It is a cliff.</p><p>The reason it is a cliff and not a slope comes down to something simple. Human tools were built on human time. Agents operate on machine time. These are not just different speeds. They are different operating philosophies.</p><p>When you build for human time, you tolerate latency because the human tolerates it. You accept occasional errors because the human catches them. You design for sessions because humans use things in sessions. You price per seat because humans use things individually.</p><p>None of that works in machine time. Latency compounds in a chained agent pipeline. A 10-step workflow where each step waits half a second end-to-end takes 5 seconds. Run 1,000 instances of that workflow for an enterprise customer and you are delivering 1,000 simultaneous calls that each need a sub-500ms response. The tool was tested for 50 concurrent users, not 1,000 parallel agents.</p><p>This is not a scalability problem in the traditional sense. You cannot just throw more servers at it. The fundamental design logic of the product is wrong for the new user.</p><h2>The CPU Comeback Explains Everything</h2><p>Here is something that most people found confusing at GTC. Jensen Huang, the man who spent years arguing that CPUs are bottlenecks and everything should be GPU-accelerated, is now selling CPUs. He announced the Vera CPU architecture, specifically designed for AI agent workloads. Why?</p><p>The answer reveals something deep about how agents actually work.</p><p>An agent does not run on a GPU the whole time. An agent reasons, then calls a tool, then waits for the result, then reasons again. </p><p>The reasoning part wants a GPU. The orchestration, the sequential decision making, the &#8220;what should I do next&#8221; part, runs on a CPU. One step at a time. Single-threaded. You cannot parallelise &#8220;what should I do next&#8221; because each decision depends on the previous one.</p><p>So here is what happens without a fast CPU. The agent reasons on the GPU, calls a tool, and then has to coordinate the response through a slow CPU before the next GPU operation starts. </p><blockquote><p>The GPU sits idle. </p></blockquote><p>You have bought millions of dollars of inference hardware that is waiting for a single-threaded CPU bottleneck.</p><p>Huang puts it directly: if the CPU is not fast, it holds back the GPU. 
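</p><p>The same point, written as a loop. This is a deliberately simplified sketch: the helpers and timings are invented, but the structure is the part that matters, because each decision depends on the previous one and everything else waits on it.</p><pre><code>import time

def plan_next_step(state):
    # Orchestration: branchy, single-threaded work that lands on the CPU.
    time.sleep(0.05)                  # invented: 50 ms of deciding what to do next
    return "finish" if state["observations"] else "search_places"

def llm_generate(observations):
    # Stand-in for the GPU-bound reasoning call.
    time.sleep(0.2)                   # invented: 200 ms of token generation
    return f"plan based on {len(observations)} observations"

def call_tool(action):
    # Tool or database call: network latency, nowhere near the GPU.
    time.sleep(0.3)
    return f"result of {action}"

def run_agent(task):
    state = {"task": task, "observations": []}
    while True:
        action = plan_next_step(state)    # sequential: cannot be parallelised
        if action == "finish":
            return llm_generate(state["observations"])
        state["observations"].append(call_tool(action))
        # While the planning and tool calls above run, the accelerator sits idle.
</code></pre><p>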
The most expensive accelerators in the world are bottlenecked by the cheapest component in the chain.</p><p>The data centre was designed for a workload profile that no longer exists. Now it needs to be rebuilt around the actual agent workflow, which is not &#8220;run fast in parallel always&#8221; but &#8220;reason sequentially, then execute in parallel, then reason again.&#8221;</p><p>The product lesson here is not about CPUs. The lesson is that agent workloads expose hidden dependencies throughout the stack. Components that looked fine under the old workload become the binding constraint under the new one. This is true for hardware, and it is equally true for your product&#8217;s architecture.</p><h2>The Inference Pricing Split That Nobody Has Built Yet</h2><p>The Groq acquisition is more interesting than it looks.</p><p>Groq&#8217;s hardware is built for one thing: generating tokens extremely fast with very low latency. It trades raw throughput for speed. More tokens per second per user, fewer users served simultaneously. This is the opposite of what most inference providers optimise for.</p><p>Huang&#8217;s explanation of why Nvidia bought them is worth sitting with. He says there are two fundamentally different products in inference. One is throughput. The other is intelligence per token, and the willingness to pay for speed.</p><h4>Let me translate that into product terms.</h4><p>If your AI product is running document summaries overnight, processing invoices in batch, or generating reports on a schedule, you want cheap tokens. You are not time-sensitive. You will happily queue your requests. Cost per token is the only metric that matters. This is the utility pricing model. </p><p>If your AI product is powering a coding agent that a senior engineer is actively waiting on, the economics flip completely. That engineer costs the company somewhere between INR 3000 and INR 4000  per hour of productive time. </p><p>If your AI responds in 2 seconds instead of 8 seconds, you are not just providing a better experience. You are directly multiplying the productivity of the most expensive person in the building. At that point, cost per token is irrelevant. You will pay a significant premium for speed, because the alternative is wasting an engineer&#8217;s time.</p><p>These are not two tiers of the same product. These are two different products that happen to use the same underlying model.</p><p>Most AI companies today sell one product. Maybe they have a &#8220;basic&#8221; and &#8220;pro&#8221; tier differentiated by model quality or rate limits. Almost none of them have structured pricing around the latency-throughput tradeoff in a way that captures the actual economic value being delivered.</p><p>This is a large gap. The teams that close it first will find that their best customers are willing to pay multiples of what they are paying today for the fast tier, while their cost structure for the slow tier can be dramatically reduced through batching and scheduling.</p><p>Groq gives Nvidia the architecture to serve both ends of this spectrum simultaneously. The strategic insight is not that Groq is better hardware. 
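</p><p>Before the strategy point, it is worth running the arithmetic once. The engineer cost and the 2-second versus 8-second responses reuse the figures above; the request volume is invented and the scenario is deliberately simplified.</p><pre><code># Invented scenario: what is a faster response worth when a person is waiting?
engineer_cost_per_hour = 3500        # INR, midpoint of the range quoted above
requests_per_day = 200               # invented: calls the engineer actively waits on

slow_seconds, fast_seconds = 8, 2
saved_hours_per_day = requests_per_day * (slow_seconds - fast_seconds) / 3600
value_of_speed_per_day = saved_hours_per_day * engineer_cost_per_hour

print(f"Waiting time saved: {saved_hours_per_day:.2f} hours per day")              # ~0.33
print(f"Value of the fast tier: about INR {value_of_speed_per_day:,.0f} per day")  # ~1,167
# For an overnight batch job that nobody is waiting on, the same premium buys
# nothing, which is why the two workloads deserve different pricing.
</code></pre><p>So what is the strategic insight, if not better hardware? 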
It is that Nvidia is acknowledging that inference is not one product and building a system to serve both.</p><h2>The Five Layer Problem: Where Your Moat Actually Lives?</h2><p>There is a framing that Huang uses, which I want to push back on slightly, because I think the standard interpretation leads PMs in the wrong direction.</p><p>He describes AI as a five-layer stack: power, chips, infrastructure, models, and applications. The usual takeaway from this framing is &#8220;figure out which layer you are in and defend your position in it.&#8221; That is fine as far as it goes. But it misses the more important question.</p><p>Where does your customer&#8217;s willingness to pay land, and who captures it?</p><p>Consider what is actually happening in AI applications right now. An enterprise pays 50 dollars per user per month for an AI writing tool. The tool spends 30 dollars of that on inference costs, 10 on engineering and hosting, and makes 10 dollars of margin. Now the model provider raises prices because the new model is smarter. Inference cost goes to 40 dollars. Margin goes to zero. The application layer captured the customer, but the model layer captured the economics.</p><p>This is not a hypothetical. This is the P&amp;L reality for a large number of AI startups right now.</p><p>The moat in the application layer is not the AI. I want to say that again because it is not obvious enough. </p><p>The moat in the application layer is not the AI. The AI is available to everyone. The moat is everything that makes your AI better than anyone else&#8217;s in a specific context: proprietary training data, deep workflow integration, switching costs built from user behaviour over time, regulatory or compliance advantages, and domain expertise baked into your prompting and evaluation infrastructure.</p><p>Strip the AI out of your product and ask what is left. If the answer is &#8220;not much,&#8221; you do not have a moat. You have a feature built on someone else&#8217;s moat.</p><p>The model layer is also not as safe as it looks from the outside. The frontier models compete on capability. The efficient models compete on cost. Everything in the middle is getting squeezed from both directions. </p><p>A mid-tier model that costs 3x a commodity open source model and is 1.2x smarter is not a sustainable business unless that 1.2x maps directly to a specific high-value task that customers cannot accomplish with the cheaper option.</p><p>The infrastructure layer has the most durable economics, and also the highest capital requirements to enter. </p><p>The companies winning here are not winning on features. They are winning on integration depth. The harder it is to extract your data from their system, the more your workflows depend on their primitives, the stickier the business. This is not glamorous. It is the most important thing.</p><h2>What You Should Actually Do</h2><p>If you are a PM reading this, here is where first principles land for each situation.</p><ol><li><p>If you are building a tool with an API, audit it right now for Tool Inversion risk. Not in theory. Actually simulate what happens when an AI agent hits your API at 100 times your current peak load. Your rate limits, your database connection pooling, your authentication flow, your pricing model, all of it was calibrated for humans. Most of it will not survive an enterprise customer deploying a serious agent workflow. 
Find the cliff before your customers fall off it.</p></li><li><p>If you are pricing an AI product, separate your latency-sensitive and throughput-sensitive use cases. They have different economics and different willingness to pay. If you are charging a flat per-seat fee for a product that some customers use for async batch work and others use for real-time agent workflows, you are simultaneously undercharging your highest-value users and overcharging your lowest. That is not a pricing model. That is a cross-subsidy that your competitors will eventually arbitrage.</p></li><li><p>If you are defining your moat, be honest. Not to your investors. To yourself. Which layer does your advantage actually sit in? Domain data and workflow lock-in are durable. Model quality is not. UI differentiation is not. Feature parity is obviously not. If your honest answer is &#8220;our moat is that we were early,&#8221; that is not a moat. That is a head start, and head starts expire.</p></li></ol><p>The SQL database was not broken. It worked perfectly for everything it was designed for.</p><p>The problem was that it was designed for the wrong future.</p><p>If you enjoy reading this article, you will absolutely love our AI Tech and Strategy Book</p><blockquote><p><em><strong><a href="https://topmate.io/technomanagers/1472775">For Full Detailed cases Studies and AI &amp; Strategy &#8212; Download this Book ( 5/5 Rated )</a></strong></em></p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://topmate.io/technomanagers/1472775&quot;,&quot;text&quot;:&quot;Download the Book&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://topmate.io/technomanagers/1472775"><span>Download the Book</span></a></p><p>You can also check out our <strong>Highest rated AI PM course &#183; 4.9/5 &#183; 500+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong> 60% OFF for a limited time &#8212; Code: NYE26</p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. 
For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item><item><title><![CDATA[Extremely Simple Explanation - How MCP works?]]></title><description><![CDATA[How Model Context Protocol Solves the Biggest Bottleneck in AI]]></description><link>https://www.technomanagers.com/p/integrate-ai-agents-with-database</link><guid isPermaLink="false">https://www.technomanagers.com/p/integrate-ai-agents-with-database</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Sat, 21 Mar 2026 04:05:56 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9603b3ff-bec0-4cbb-92f8-b7f50d061328_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Before reading about the Model Context Protocol, you can read the following articles on Technomanagers</p><ol><li><p><a href="https://www.technomanagers.com/p/the-last-product-manager">The LAST Product Manager</a></p></li><li><p><a href="https://www.technomanagers.com/p/memory-in-ai-part-1">Memory in AI</a> </p></li><li><p><a href="https://www.technomanagers.com/p/spotifys-ai-strategy">Spotify&#8217;s AI Strategy</a></p></li></ol><p>Everyone talks about AI agents these days.</p><p>But very few product teams talk about how these agents actually connect to company data.</p><p>To understand why the Model Context Protocol, or MCP is a breakthrough, we first need to look at the main problem in AI development today.</p><h2>The Integration Bottleneck</h2><p>Right now, building an AI product requires writing custom code to connect to every single external tool.</p><p>If you want your AI to read a map you write an API integration.If you want it to read your database you write another one.</p><p>This limits how fast product teams can ship.</p><p>Developers spend all their time writing and maintaining this custom code instead of building core features.</p><p>Every time an external API changes your team has to fix the connection.</p><h2>How MCP Changes the Equation?</h2><p>MCP solves this bottleneck.</p><p>It is a new open source standard that unifies how AI agents connect to data sources.</p><p>It creates a standard interface between the AI application and external servers.The ecosystem relies on three main parts.</p><ol><li><p>First is the host which is the application the user interacts with like a chat interface or a code assistant.</p></li><li><p>Second is the protocol itself which acts as the standard transport layer in the middle.</p></li><li><p>Third are the servers built by data providers like Google or Yahoo.</p></li></ol><p>A single host can connect to many servers at the same time.</p><p>When developers build an MCP server it exposes three main capabilities to the client.</p><h4>First are tools.</h4><p>These are executable functions like fetching a website or searching a map.</p><p>The server provides standard descriptions so the language model knows exactly how to use them.</p><h4>Second are resources.</h4><p>These are knowledge bases or plain files in a drive that the AI can read to pull context.</p><h4>Third are prompts.</h4><p>These are pre-written templates provided by the server developers, making it easier to interact optimally with specific APIs.</p><h2>The End to End 
Workflow</h2><p>So how does this actually work when a user makes a request?</p><p>The process follows a highly structured workflow. It starts with initialisation.</p><p>When the app opens, it connects to its configured servers and asks them to list their capabilities. The servers return detailed descriptions of everything they can do.</p><p>Then comes the user query.</p><p>Imagine you ask your AI assistant for details about a hiking trip.</p><p>The host application takes your question and bundles it with the descriptions of all available tools.</p><p>It sends this combined information to the language model. The model uses its intelligence to evaluate the request.</p><p>It reads the tool descriptions and determines which specific tool to call. It is smart enough to extract the required parameters.</p><p>It maps the location to the exact longitude and latitude required by the map tool. The host then calls the appropriate server to execute the operation. The server fetches the data from the API or database and returns the response in a standardised format.</p><p>Finally, the model reads this retrieved information and generates the comprehensive answer for the user.</p><h2>How to Set Up an MCP Server?</h2><p>Setting up an MCP server is straightforward for developers. You do not need to write complex API wrappers anymore.</p><p>If you are using Claude Desktop, you only need to edit one configuration file. On a Mac you navigate to your Library folder &#8212;&gt; Application Support &#8212;&gt; Claude and open the claude desktop config json file.</p><p>On Windows, you look in your AppData folder to find the same file.</p><p>You open this file and add a block of text specifying your mcpServers.</p><p>Inside this block you tell Claude the specific command to run your local server. You might point it to a local file system or a database connector.</p><p>Once you save the file and restart Claude Desktop, it automatically connects. You will see a small plug icon showing the connection is live. If you use a code editor like Cursor, the process is even easier.</p><p>You just open Cursor settings and look for the MCP section. You click to add a new server and paste in the command. The editor handles everything else.</p><p>This simple setup is exactly why MCP is gaining traction so quickly.</p><h2>What can be some of the challenges?</h2><p>This implementation is not going to be easy. There are some very real problems product managers have to solve.</p><p>Giving an AI standard access to databases means you need strict security and permissions in place. You have to ensure the AI cannot access sensitive user data it is not supposed to see.</p><p>The second challenge is reliability. Standardising responses from hundreds of different APIs requires robust error handling. If an external server fails, the AI needs to fail gracefully instead of breaking the user experience.</p><p>MCP has actually lowered the barrier to entry for building complex AI products.</p><p>For product managers, lower effort means faster shipping. 
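</p><p>That lower effort is visible in the setup step described above. For reference, the mcpServers block inside the claude_desktop_config.json file looks roughly like this. The server name, command, and folder path are placeholders, and the exact shape can change between releases, so treat the current MCP documentation as the source of truth:</p><pre><code>{
  "mcpServers": {
    "local-files": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/you/Documents"]
    }
  }
}
</code></pre><p>One small configuration file, instead of a custom API wrapper per integration, is what lower effort looks like in practice. 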
Faster shipping brings us back to solving core user problems instead of managing infrastructure.</p><p>If this framing changed how you think about AI product design, and if you want the full structured path &#8212; from AI foundations to RAG, Evals, AI strategy, and interview prep &#8212; built specifically for PMs, that is what our course covers.</p><p><strong>Highest rated AI PM course &#183; 4.9/5 &#183; 500+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong> 60% OFF for a limited time &#8212; Code: NYE26</p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item><item><title><![CDATA[The Last Product Manager]]></title><description><![CDATA[Most PMs are on the wrong side]]></description><link>https://www.technomanagers.com/p/the-last-product-manager</link><guid isPermaLink="false">https://www.technomanagers.com/p/the-last-product-manager</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Wed, 18 Mar 2026 17:48:32 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d85ac2e2-d335-4959-b84a-e2aa74c6d2f6_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>What follows is a scenario, not a prediction. Our intent is not to terrify you. It's to make you think clearly about what is already happening</p><p><strong>MARCH 18, 2028 &#183; LINKEDIN</strong></p><div class="pullquote"><p><em>Senior AI Orchestration Lead | 3&#8211;5 years experience managing AI agent systems | Vision-setting, stakeholder alignment, taste-based product direction | Compensation: &#8377;85L&#8211;&#8377;1.2Cr</em></p><p><em>Note: This role does not require traditional product management experience.</em></p></div><p>It received 4,200 applications in 48 hours. The job it replaced was posted 2 years earlier, almost to the day</p><p><strong>MARCH 14, 2026 &#183; LINKEDIN</strong></p><div class="pullquote"><p><em>Senior Product Manager | 5&#8211;7 years of PM experience | PRD writing, roadmap planning, cross-functional alignment, data analysis | Compensation: &#8377;80L&#8211;&#8377;1.1Cr</em></p></div><p>Nobody noticed the titles had changed. They were too busy applying.</p><h2><strong>We Asked the Wrong Question</strong></h2><p>For three years, every PM newsletter, every conference panel, every anxious LinkedIn post asked the same question: Will AI replace product managers?</p><p>It was the wrong question.</p><p>The right question was simpler, and more brutal: <strong>What happens when the things product managers do become free?</strong></p><p>Not cheaper. Free.</p><p>The PRD that took 7 days of procrastination and 1 day of real work? It now takes 12 minutes.</p><p>McKinsey estimated in November 2025 that AI agents could perform tasks that account for <strong>44% of US work hours today.</strong> Not in 2030. Today, with the tools already deployed, already running inside your company&#8217;s tech stack.</p><p>That 44% is not random. 
It is the exact centre of a product manager's role.</p><h2><strong>The Execution Collapse</strong></h2><p>For the longest time, building software was an expensive and slow process. </p><p>A PM was essential to sit between business and engineering, managing the roadmap and prioritising tasks. </p><p>That constraint has vanished. Today, AI tools can build functional prototypes in mere hours. A competent engineer with an AI assistant can replicate core SaaS functionality in weeks.</p><p>This leads to a velocity inversion. Previously, engineers needed PMs for customer context, and PMs needed engineers to build. Now, engineers can spin up prototypes without a spec, and PMs can validate concepts using AI mockups without an engineer. The bridge is no longer required because both sides can cross alone.</p><p>When execution becomes free, the bottleneck moves upstream.</p><p>From <em>&#8220;can we build it?&#8221;</em> to <em><strong>&#8220;should we build it at all?&#8221;</strong></em></p><p>The PM who spent their career answering the first question is now competing with a tool that answers it faster and cheaper</p><h2><strong>The Knowledge Moat Dissolved Overnight</strong></h2><p>Ask any senior PM what their real competitive advantage is. They will not say &#8216;I write good PRDs.&#8217; They will say something like: I know this market. I know these customers. I&#8217;ve seen what works and what doesn&#8217;t. I have judgment that took years to build.</p><p>That was true. It was also the moat.</p><blockquote><p><em>Previously your knowledge of the market, the technology, your own product was valuable. Why? Because it was difficult to get. That barrier is gone.</em></p></blockquote><p><em>&#8212; Reforge, 2025</em></p><p>LLMs arrived. And the barrier fell.</p><p>The specialised knowledge that previously took multiple quarters to develop can now be accessed, synthesised, and applied in minutes or hours. A junior PM with Claude has the same market intelligence that a decade of experience used to produce. Not better. Not worse. The same.</p><p><strong>When knowledge stops being scarce, experience stops commanding a premium.</strong></p><p>The 10-year PM and the 2-year PM now start their research from the same baseline.</p><p>Demand for AI fluency in job postings has grown <strong>7x in two years,</strong> which is faster than any other skill. Workers in AI-fluency-required roles grew from 1 million in 2023 to 7 million in 2025. </p><p>Companies are not asking for more PM experience. They are asking for a different kind of PM entirely.</p><h2><strong>What Survives &#8212;&gt;The Taste Premium</strong></h2><p>Here is what the data does not tell you, and what the panic misses entirely.</p><p>When execution is free, and knowledge is commoditised, one input becomes more valuable, not less.</p><p><strong>Taste.</strong></p><p>Taste is editorial judgment. It is knowing what &#8220;good&#8221; feels like before the data confirms it. It is the ability to look at two technically valid product decisions and know which one will resonate with users and which one will not.</p><p>Here is the uncomfortable truth about taste: it cannot be automated because it is not information. It is a judgment about information. </p><p>It is the PM who kills the feature AI says will win because she talked to 50 users last week and heard something the data didn&#8217;t capture. 
</p><p>It is the PM who sees that the engagement metric is going up but the product is getting worse and has the credibility to say so in a room full of people whose bonuses depend on the metric going up.</p><p><strong>That is the Taste Premium. It compounds in an AI world. It does not depreciate.</strong></p><h2><strong>The Three-Tier PM That Survives</strong></h2><p>By 2028, the PM role will not disappear. It has forked.</p><h4><strong>Tier 1: The Orchestrator</strong></h4><p>Manages fleets of AI agents building products. Thinks in systems, not features. Writes agent instructions instead of PRDs. Evaluates output instead of producing input. This is the most in-demand PM role in 2028. It requires less traditional PM experience and more systems-design thinking.</p><h4><strong>Tier 2: The Taste Maker</strong></h4><p>Sets the vision, the feeling, the why this and not that. Cannot be replaced because the job is inherently about human preference, cultural context, and aesthetic judgment that no model has been trained to simulate. This PM earns the Taste Premium. There are fewer of them than there used to be. They earn more.</p><h4><strong>Tier 3: The Domain Expert</strong></h4><p>Understands a vertical so deeply, like healthcare, fintech, e-commerce, and logistics, that no horizontal AI agent can replicate the judgment. The PM who spent five years building checkout flows at a major e-commerce company knows something no model trained on general internet text knows. Domain depth is a moat that compounds.</p><div class="pullquote"><p>None of this has fully played out yet.</p></div><p>Your job title still says, Product Manager. The 244,000 tech layoffs from last year haven&#8217;t hit your team yet. The 44% of work hours that AI can technically automate today are still mostly being performed by humans.</p><p><em><strong>The canary is still alive.</strong></em></p><p>But the feedback loop has started. The knowledge moat has already dissolved. Vibe coding already exists. The Velocity Inversion is already underway. The signals are not weak. They are just early.</p><p>You have approximately 18 months before the three tiers become obvious to everyone.</p><p>Right now, only the people paying close attention can see them.</p><h2><strong>Three Things to Do This Week</strong></h2><p>Not next quarter. This week.</p><ol><li><p>Open Claude Code or Cursor today. Build one prototype of one idea you&#8217;ve been writing specs about. Ship it to three users before Friday. Feel what it means when execution is free. That feeling is data.</p></li><li><p>List everything you did last week as a PM. Tag each item: knowledge (research, synthesis, analysis) or taste (judgment, vision, direction). Then count. If your taste column has fewer than 5 items, you are primarily operating in the 44% that AI is already automating.</p></li><li><p>Name the one insight that only you can generate, the thing that comes from your specific context, your specific users, your specific domain history. That insight is your Taste Premium. If you cannot name it in one sentence, you do not have it yet.</p></li></ol><blockquote><p><strong>The window is open. </strong>The question is which tier you are building toward.</p></blockquote><p>The Tier 1 and Tier 2 PMs reading this will act on it. The rest will bookmark it and forget it by Monday.</p><p>The three action items above are free and enough to start. If you want to fast-track ( step by step ) and want to do it in detail, I built a course for exactly this. 
<a href="https://topmate.io/technomanagers/1861184">Advance AI Product Management + AI Interview Prepraration (40+Videos + 25 Case Studies)</a></p><p>Top Rated 4.9/5 from 600+ PMs &#183; <strong>Use NYE26 for 60% off</strong> <strong>&#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong> </p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item><item><title><![CDATA[Memory in AI — Part 2]]></title><description><![CDATA[How Instagram decides what to remember?]]></description><link>https://www.technomanagers.com/p/memory-in-ai-part-2</link><guid isPermaLink="false">https://www.technomanagers.com/p/memory-in-ai-part-2</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Sun, 15 Mar 2026 06:42:46 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/8a40ac1b-bcee-4dd7-b240-774fc3415026_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In Part 1, we established the core equation. If you haven;t read the article, <a href="https://www.technomanagers.com/p/memory-in-ai-part-1">please click here</a></p><blockquote><p><strong>Context &#215; Reasoning &#215; Memory = Perceived Intelligence</strong></p></blockquote><p>We named four memory types. In-context. External. In-weights. KV Cache.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.technomanagers.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Want to be in the top 1%? Subscribe!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>But naming them is not the hard part.</p><p>The hard part is this: <strong>which memory type do you use, for which feature, and why?</strong></p><p>That is what Part 2 answers. And we will answer it using a product your users open fifteen times a day &#8212; Instagram.</p><p>Instagram is not a single product.</p><p>It is a bundle of features, each with a different job, each running a different memory strategy &#8212; sometimes on the same screen.</p><p>Open Instagram right now. 
You will find:</p><ul><li><p><strong>Reels</strong> &#8212; a feed that knows exactly what keeps you watching</p></li><li><p><strong>Meta AI</strong> &#8212; a conversational assistant built into the search bar and DMs</p></li><li><p><strong>Explore</strong> &#8212; a discovery surface that learns your taste over time</p></li></ul><p>Each of these features has a different answer to the same question: what should the AI know about you, and when should it know it?</p><p>Most PMs look at these as three different features. </p><p>The right way to look at them is as <strong>three different memory architectures producing three different product outcomes</strong>.</p><h2>Let&#8217;s discuss Meta AI on Instagram</h2><p>When you type a message into Meta AI inside Instagram, four memory systems activate simultaneously. Most users see none of this, and most PMs don&#8217;t either.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!6gZv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397e301a-fa2b-442b-b6e2-7b1a80912235_1920x1080.png" width="1456" height="819" alt=""></figure></div><h4>In Context Memory</h4><p><strong>In-context memory</strong> handles everything in the current conversation. Your message. Meta AI&#8217;s reply. The follow-up question you ask three messages later. All of it lives in a single context window for the duration of that session. When you close the chat, it is gone.</p><h4>In Weight Memory</h4><p><strong>In-weights memory</strong> is the foundation. Meta AI&#8217;s base model was trained on billions of documents. It knows what pasta is. It knows the difference between a North Indian restaurant and an Italian restaurant. It understands that something exciting for a Friday night means something different in Bangalore than it does in Paris. None of that required you to explain it. It is baked into the model&#8217;s parameters from training.</p><h4>External Memory</h4><p><strong>External memory</strong> is where it gets interesting. When you ask Meta AI for restaurant recommendations in your city, it does not guess. It retrieves. Meta&#8217;s interest graph &#8212; built from your likes, saves, follows, watch history, and search patterns &#8212; gets injected into the prompt before the model responds. The model did not learn your preferences. It looked them up.</p><h4>KV Cache Memory</h4><p><strong>KV cache memory</strong> is invisible to you but critical to Meta&#8217;s infrastructure. 
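</p><p>The easiest way to picture it is as a cache keyed on the part of the prompt that never changes. Below is a deliberately naive sketch: a real serving stack caches the attention keys and values inside the model rather than strings, and none of these function names come from Meta.</p><pre><code># Naive illustration of prefix reuse, not a real inference server.
prefix_cache = {}

def expensive_encode(text):
    # Stand-in for pushing the shared system prompt through the model once.
    return {"tokens": len(text.split())}

def process_prefix(system_prompt):
    if system_prompt not in prefix_cache:
        prefix_cache[system_prompt] = expensive_encode(system_prompt)   # paid once
    return prefix_cache[system_prompt]                                  # reused afterwards

def answer(system_prompt, user_message):
    shared = process_prefix(system_prompt)   # identical across millions of sessions
    # Only the user-specific part still needs fresh computation.
    return f"[{shared['tokens']} cached prefix tokens] reply to: {user_message}"
</code></pre><p>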
<h2>Now look at Reels &#8212; a completely different memory stack</h2><p>Reels does not have a context window problem.</p><p>It does not forget you when you close the app. It does not start from zero when you open it again tomorrow.</p><p>That is because <strong>Reels does not use in-context memory at all</strong>.</p><p>Everything Reels knows about you lives in external memory &#8212; a continuously updated store of your engagement signals.</p><p>Every video you watch to completion is a signal. Every video you skip after two seconds is a signal. Every creator you follow, every sound you save, every comment you leave &#8212; all of it writes into an external memory store that the recommendation engine reads from the moment you open the app.</p><p>This is why Reels feels more intelligent over time than Meta AI does. It is not because the recommendation model is smarter than the language model. It is because <strong>Reels was architected with persistent external memory, and Meta AI largely was not</strong>.</p><p>The Goldfish Problem from Part 1? Reels solved it. Meta AI has not.</p><h2>The PM decision framework: Memory Fit Matrix</h2><p>So how do you decide which memory type to use for a given feature? Here is the framework.</p><p>Ask three questions about each piece of information your feature needs to work well.</p><ol><li><p><strong>How often does it change?</strong> <br>Information that changes every message belongs in in-context memory. Information that changes over weeks or months belongs in external memory. Information that almost never changes can be baked into the model via fine-tuning.</p></li><li><p><strong>Is it shared across users or specific to one user?</strong> <br>A system prompt is shared. Your interest graph is not. Shared, stable information is a candidate for KV caching. Personalised, dynamic information needs external retrieval.</p></li><li><p><strong>What breaks if you get it wrong?</strong> <br>If you stuff the wrong information into the in-weights memory through premature fine-tuning, you cannot fix it without a full retraining cycle. If you miss something in in-context memory, the damage is limited to that session. The higher the cost of being wrong, the more you want the information in an editable, retrievable external store rather than baked into weights.</p></li></ol><p>A PM who understands this will make better architectural calls than one who only asks, &#8220;can we use RAG for this?&#8221;</p><p><em>If this framing changed how you think about AI product design, and if you want the full structured path &#8212; from AI foundations to RAG, Evals, AI strategy, and interview prep &#8212; built specifically for PMs, that is what our course covers.</em></p><p><strong>Highest rated AI PM course &#183; 4.9/5 &#183; 500+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong> 60% OFF for a limited time &#8212; Code: NYE26</p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! 
I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.technomanagers.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Technomanagers is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Memory in AI - Part 1]]></title><description><![CDATA[The retention problem hiding inside every AI product]]></description><link>https://www.technomanagers.com/p/memory-in-ai-part-1</link><guid isPermaLink="false">https://www.technomanagers.com/p/memory-in-ai-part-1</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Wed, 11 Mar 2026 17:55:54 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/8a4b47c5-30b8-4801-848a-8365c6194d6b_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Everyone talks about AI features these days.</p><p>But very few product teams understand what actually makes AI feel intelligent.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.technomanagers.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Technomanagers is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>It is not the model. It is not the UI. It is not even the speed.</p><p>It is memory.</p><p>Right now, most AI products operate on a single memory variable.</p><blockquote><p><strong>What the AI knows = What&#8217;s in the current conversation window</strong></p></blockquote><p>The session ends. The knowledge disappears. 
The next conversation starts from zero.</p><p>To understand why, we need to examine what determines whether an AI product feels magical or mediocre.</p><h2>First Principle Breakdown of AI Usefulness&#8230;</h2><p>AI product value depends on three multiplied factors.</p><blockquote><p><strong>Context &#215; Reasoning &#215; Memory = Perceived Intelligence</strong></p></blockquote><p>Most teams obsess over Reasoning. They chase better models, bigger parameters, faster inference.</p><p>But there is a problem. Reasoning without memory is like hiring a brilliant employee who forgets everything at the end of every day.</p><p>You get a fast answer. But you never get a useful relationship.</p><p>This is the gap that separates AI tools people demo from AI tools people actually rely on. Because an AI product with well-designed memory compounds. The longer the user stays, the more context the AI holds. The more context the AI holds, the more useful it becomes. The more useful it becomes, the harder it is to leave. In effect, you are raising the switching cost.</p><h2>So what do we mean by Memory in AI?</h2><p>Before we go further, let&#8217;s understand what memory is, because the term is used very loosely nowadays.</p><p>When we talk about memory in AI systems, we are really talking about one question: What does the model know, and when does it know it?</p><p>There are four distinct types of memory in LLM-based products. Each works differently. Each has a different cost and a different product implication.</p><h4>In-Context Memory</h4><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/ed194150-9097-4c2c-9d71-0f3fd82e312c_1536x1024.png" width="1456" height="971" alt=""></figure></div><p>This is the most immediate form of memory. The AI reads everything in the active session (your messages, its responses, any documents you've pasted) and uses that to respond. When the session ends, it's gone.<br>Context windows have token limits. You can't fit infinite history into them. And the longer the context, the more expensive the inference.</p>
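<p>A quick sketch of what that means in practice: in-context memory is nothing more than the message list sent with each request, trimmed to a token budget. Token counts below are approximated by word counts so the example stays self-contained; this is a toy illustration, not any particular vendor&#8217;s API.</p><pre><code># Toy illustration of in-context memory: the "memory" is just the message list
# sent with each request, trimmed to a budget. Nothing survives the session.

def approx_tokens(text: str) -> int:
    return len(text.split())          # crude stand-in for a real tokenizer

def trim_to_budget(messages: list, budget: int) -> list:
    # Keep the most recent turns that fit; older turns simply fall out,
    # which is exactly why long conversations get "forgotten".
    kept, used = [], 0
    for role, text in reversed(messages):
        cost = approx_tokens(text)
        if used + cost > budget:
            break
        kept.append((role, text))
        used += cost
    return list(reversed(kept))

session = [
    ("user", "My salary is 30 LPA and my risk tolerance is moderate"),
    ("assistant", "Noted. Here is a personalised plan..."),
    ("user", "Now adjust it for a house purchase in three years"),
]
print(trim_to_budget(session, budget=15))
# When this session ends, the list is discarded; the next session starts empty.
</code></pre>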
<h4>External / Retrieval Memory</h4><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/9b862304-6a8a-4d27-8b81-530e7efec393_1536x1024.png" width="1456" height="971" alt=""></figure></div><p>External memory is like a database outside the model that gets queried at runtime and injected into the context.</p><p>The AI doesn&#8217;t store anything itself. Instead, relevant information is retrieved from an external store (a vector database, a CRM, a document store) and inserted into the prompt before the model responds.</p><p>Here, the response quality is determined by the retrieval quality.</p>
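<p>A minimal sketch of that pattern. A real system would use embeddings and a vector database; here a simple word-overlap score stands in so the example runs anywhere, and the stored facts are made up for illustration.</p><pre><code># Self-contained sketch of retrieval memory: facts live outside the model,
# get retrieved per request, and are injected into the prompt.

MEMORY_STORE = [
    "User salary: 30 LPA",
    "User risk tolerance: moderate",
    "User goal: buy a house in 3 years",
    "User preference: avoid crypto exposure",
]

def score(query: str, fact: str) -> int:
    # Crude relevance score; a real system would compare embeddings instead.
    return len(set(query.lower().split()).intersection(fact.lower().split()))

def retrieve(query: str, k: int = 2) -> list:
    return sorted(MEMORY_STORE, key=lambda fact: score(query, fact), reverse=True)[:k]

def augmented_prompt(query: str) -> str:
    # The model never "learned" these facts; they are looked up every time.
    facts = retrieve(query)
    return "Known facts:\n- " + "\n- ".join(facts) + "\n\nQuestion: " + query

print(augmented_prompt("How much should I allocate to index funds for my house goal?"))
</code></pre>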
<h4>In-Weights Memory</h4><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/d21746ba-a6ad-4d9b-b3e6-2dad48f3b101_1536x1024.png" width="1456" height="971" alt=""></figure></div><p>This is the kind of memory where the knowledge is baked into the model during training or fine-tuning. It lives in the billions of parameters of the model itself. You can&#8217;t &#8220;update&#8221; it without retraining, and retraining is expensive.</p><p>Fine-tuning makes sense when your domain has a consistent, learnable structure, e.g. legal, medical, or financial. It's a long-term investment, not a quick feature. Don't reach for fine-tuning when retrieval will do the job.</p>
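<p>One way to internalise that guidance is as a small decision helper. This is only an illustration of the heuristic in the paragraph above, not a formal rule.</p><pre><code># Rough heuristic: prefer cheaper, editable memory unless the knowledge is
# truly stable, shared, and worth a retraining cycle.

def pick_memory(changes_often: bool, user_specific: bool, costly_if_wrong: bool) -> str:
    if changes_often:
        return "in-context (this session)" if user_specific else "external / retrieval store"
    if user_specific or costly_if_wrong:
        return "external / retrieval store (editable without retraining)"
    return "candidate for fine-tuning (in-weights) or a cached shared prompt"

print(pick_memory(changes_often=False, user_specific=True, costly_if_wrong=True))
# -> external / retrieval store (editable without retraining)
</code></pre>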
<h4>In-Cache / KV Cache Memory</h4><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/7229ac6e-9c4b-41ed-8097-290d1684f149_1536x1024.png" width="1456" height="971" alt=""></figure></div><p>This is the most technical type and the most invisible to users. When the same system prompt or document is used repeatedly, the model's intermediate computations (Key-Value cache) can be stored and reused. The model doesn't re-read it from scratch every time.</p>
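<p>A conceptual analogy in code. Real KV caches live inside the inference engine and store attention keys and values, not strings; the only point of this sketch is that expensive work on an identical prefix is done once and then reused.</p><pre><code># Conceptual analogy for KV caching: identical prefixes (like a shared system
# prompt) are processed once and the result is reused across sessions.

import hashlib

PREFIX_CACHE = {}

def expensive_prefill(prefix: str) -> str:
    # Stand-in for the heavy computation of processing a long shared prefix.
    return f"processed({len(prefix)} chars)"

def prefill_with_cache(prefix: str) -> str:
    key = hashlib.sha256(prefix.encode()).hexdigest()
    if key not in PREFIX_CACHE:
        PREFIX_CACHE[key] = expensive_prefill(prefix)   # computed once
    return PREFIX_CACHE[key]                            # reused afterwards

system_prompt = "You are a helpful, safe assistant. Follow the persona guidelines."
for _ in range(3):                                       # three different sessions
    prefill_with_cache(system_prompt)
print(len(PREFIX_CACHE))                                 # 1: the prefix was processed once
</code></pre>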
<h2>The Goldfish Problem</h2><p>Let&#8217;s look at what happens when a product team ignores these memory types and relies entirely on a blank-slate session. This is the Goldfish Problem.</p><p>Imagine a user spends twenty minutes interacting with an AI financial planner. They input their salary, their risk tolerance, and their goal to buy a house in three years. The AI gives them a brilliant, personalised breakdown.</p><p>The next day, the user logs back in and asks, &#8220;How much should I allocate to index funds this month?&#8221;</p><p>The AI responds: &#8220;I can help with that! To get started, what is your current salary and risk tolerance?&#8221;</p><p>The magic is instantly broken. The user realizes they aren&#8217;t talking to an intelligent assistant; they are talking to a calculator that resets every time the screen turns off. That right there is a churn event.</p><div class="pullquote"><p>So the Memory Gap Is a Retention Gap</p></div><p>If this article raised more questions than it answered &#8212; that is a good sign. It means you can see the gap.</p><p>If this article made you think differently about AI products, Part 2 and Part 3 will go deeper into memory architecture.</p><p>And if you want the full structured path &#8212; from AI foundations (ML Algorithms, Systems) to Vibe Coding, RAG, Evals, AI Strategy, and AI Pricing, through to interview prep &#8212; in one place built for PMs, that is what our course covers.</p><p><strong>Highest rated AI PM course &#183; 4.9/5 &#183; 500+ enrollments &#8594; <a href="https://topmate.io/technomanagers/1861184">See testimonials and course details</a></strong> 60% OFF for a limited time &#8212; Code: NYE26</p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. 
For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.technomanagers.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Technomanagers is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[How to Crack the AI PM Interview in 2026]]></title><description><![CDATA[Please find the resources]]></description><link>https://www.technomanagers.com/p/how-to-crack-the-ai-pm-interview</link><guid isPermaLink="false">https://www.technomanagers.com/p/how-to-crack-the-ai-pm-interview</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Sat, 07 Mar 2026 18:33:47 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6c4c89c1-8a99-4d79-bce2-54aad973415a_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A friend of mine got rejected last month from a Senior PM role at a well-known AI startup.</p><p>He had 7 years of product experience, two successful product launches, and a strong portfolio. He was confident going in. </p><p>He came out completely blank.</p><p>The interviewer had asked him:</p><blockquote><p><em>How would you write evaluation metrics for a travel booking agentic workflow?</em></p></blockquote><p>He had never heard the word &#8220;evals&#8221; before that moment.</p><p>Here is the thing.</p><p>The PM interview is not what it was two years ago.</p><p>Most people preparing for PM roles in 2025 and 2026 are still reading the same prep material from 2021. Decode PM, Cracking the PM Interview, a few mock interviews, and some STAR format practice. That was enough back then.</p><p>It is not enough anymore.</p><p>And the people who figure this out late are paying a real price.</p><p>Not just rejection, but the particular sting of rejection after you genuinely thought you had prepared.</p><p>That is a harder thing to recover from.</p><p>So let me walk you through what is actually changing, and why most candidates are not ready for it.</p><h2><strong>The interview shifted</strong></h2><p>When companies like Google, Meta, Notion, and a hundred AI-first startups started hiring AI PMs, they did not just add a couple of AI questions to the existing format. They rewrote the format.</p><p>You now get questions that sound like this:</p><ul><li><p>&#8220;How would you measure the success of GPT 5.0?&#8221;</p></li><li><p>&#8220;Design a RAG system for an enterprise knowledge base. What are your eval criteria?&#8221;</p></li><li><p>&#8220;Your AI feature has a 12% hallucination rate. 
Walk me through how you would reduce it.&#8221;</p></li></ul><p>These are not strategy questions dressed up in AI language.</p><p>These are technical product questions. And the interviewer is not looking for a vague answer about leveraging AI to improve user experience. They want to know if you actually understand how these systems work.</p><p>Most candidates do not.</p><h2><strong>Why is this happening now?</strong></h2><p>AI products are not just features anymore.</p><p>They are the product. When the core of what you are building is a language model, a recommendation engine, or an agentic workflow, then the PM sitting on top of that needs to speak the language of the system.</p><p>Think about it from the hiring manager&#8217;s side.</p><p>They need someone who can sit in a technical review and understand why the model is degrading.</p><p>Someone who can write a proper evaluation framework.</p><p>Someone who can tell the difference between a precision problem and a recall problem, and what each one means for the user experience.</p><p>If you cannot do that, you are not really managing the AI product. You are just writing user stories on top of it.</p><p>That is the gap most candidates have right now. And it is not their fault entirely. The material to prepare for this simply was not available in any structured form until very recently.</p><h2><strong>The supply and demand reality</strong></h2><p>Here is something that makes this more urgent than it might seem.</p><p>AI PM roles are genuinely sought after right now.</p><p>Everyone with any product experience is pivoting toward AI. You have traditional PMs rebranding themselves.</p><p>You have engineers trying to move into product. You have MBA graduates who did an AI course on Coursera.</p><p>Everyone is showing up to the same interviews.</p><blockquote><p>The companies on the other side are few. The good roles are fewer. And the bar for what counts as AI PM ready is rising every quarter.</p></blockquote><p>When supply is high and the bar keeps moving up, the margin between getting the role and not getting it becomes razor-thin.</p><p>One bad answer on an eval question. One moment where you stumble on what RAG actually means. That is sometimes all it takes.</p><h2><strong>What the unprepared candidate looks like</strong></h2><p>Let me be specific about where people go wrong, because it is not always obvious.</p><p>The first mistake is thinking that using AI tools makes you an AI PM. It does not.</p><p>Using ChatGPT, playing with Midjourney, and building a quick Notion AI workflow, these are user experiences.</p><p>They have nothing to do with building AI products. Interviewers have become very good at separating the two.</p><p>The second mistake is treating AI PM prep like traditional PM prep with a few AI keywords added.</p><p>You study product sense, execution, metrics, and then you memorize a few things about large language models. That does not hold up when the interviewer goes one level deeper.</p><p>The third mistake is not knowing what you do not know. This one is the most dangerous.</p><p>Many candidates feel ready because they have been in tech for years and have absorbed some AI knowledge by proximity. But there is a difference between ambient knowledge and working knowledge. The interview exposes that gap quickly.</p><h2><strong>What you actually need to prepare</strong></h2><p>The honest answer is that you need to understand the systems you would be managing. Not at an engineer&#8217;s depth, but at a PM&#8217;s depth. 
There is a difference, and it is a specific kind of knowledge.</p><ul><li><p>You need to understand how retrieval-augmented generation works well enough to answer a system design question about it.</p></li><li><p>You need to know what model evaluation actually involves, what an eval framework looks like, and why it matters for product quality.</p></li><li><p>You need to understand agentic workflows at a conceptual level, because that is where AI products are going in 2026.</p></li><li><p>You also need the strategic layer.</p></li><li><p>AI pricing is a distinct problem from traditional software pricing.</p></li><li><p>AI success metrics are different because the outputs are probabilistic.</p></li><li><p>AI product sense requires thinking about reliability, not just usability.</p></li></ul><p>And if you want to stand out, vibe coding helps. Being able to prototype an AI idea quickly, without waiting for engineering bandwidth, is something that impresses interviewers and actually makes you a better PM on the job.</p><p>None of this is impossible to learn. It just requires learning it in the right sequence, with the right depth.</p><h2><strong>Where most people look for this, and why it does not work</strong></h2><p>The usual routes are scattered and incomplete.</p><p>YouTube has surface-level content. Blog posts cover individual concepts but not the full picture.</p><p>Existing PM courses were built before AI PM was a real category.</p><p>What you need is something that moves from AI foundations to technical architecture to advanced systems like RAG and agents, and then into interview-specific preparation. That whole path, in one place, built specifically for PMs.</p><h3><strong><a href="https://topmate.io/technomanagers/1861184?coupon_code=NYE26">Our Course is TopRated 4.9/5, that too from more than 500 People.</a></strong></h3><p><a href="https://topmate.io/technomanagers/1861184?coupon_code=NYE26">It has 40+ Videos &amp; 25+ Case Studies. Click here for Testimonial and Description of the course. </a> <em><strong>( 60% flat Discount  for limited time frame)</strong></em></p><blockquote><p><em>If the gap I described in this article sounds familiar, <strong>this course was built for you.</strong></em></p></blockquote><p>The gap is real. The good news is it is closable. But it takes more than watching a few YouTube videos about generative AI.</p><p>Start with understanding what you actually do not know. That is usually where the real preparation begins.</p><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item><item><title><![CDATA[What's OpenRAG? 
Future of RAG]]></title><description><![CDATA[AI for Product Managers]]></description><link>https://www.technomanagers.com/p/whats-openrag-future-of-rag</link><guid isPermaLink="false">https://www.technomanagers.com/p/whats-openrag-future-of-rag</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Thu, 05 Mar 2026 20:10:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/18d429c6-a630-4ffe-af17-cea86e623d4c_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Before we move on, read these important articles:</p><ol><li><p><a href="https://www.technomanagers.com/p/ubers-ai-strategy">Spotify&#8217;s AI Strategy</a></p></li><li><p><strong><a href="https://www.technomanagers.com/p/metrics-for-ai-product">AI Product Metrics</a> ( AI PM Interview Question )</strong></p></li><li><p><a href="https://www.technomanagers.com/p/ai-product-management-2026-winners">AI Product Management 2026 &#8212; Winner&#8217;s Playbook</a></p></li></ol><p>In our last article, we discussed the massive Context Windows of modern LLMs; it&#8217;s time to upgrade ourselves.</p><p>Imagine you are a Senior Product Manager at YouTube.</p><p>You are in charge of building a new &#8220;Ask this Video&#8221; Generative AI feature.</p><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/94dc92a9-6869-498c-8736-3829412d11b5_1920x1080.png" width="1456" height="819" alt=""></figure></div><p>The &#8220;Ask this Video&#8221; feature is an AI-powered conversational tool on YouTube that allows viewers to interact directly with the content they are currently watching.</p>
<p>By analysing the video's transcript in real time, it enables users to request summaries, extract specific facts, or clarify concepts without interrupting playback.</p><p>You have a massive problem: AI models are incredibly smart, but they know absolutely nothing about the 3-hour podcast a creator uploaded five minutes ago.</p><p>When a viewer clicks the &#8220;Ask&#8221; button and types, &#8220;Summarise the part where they discuss the new camera features and skip the sponsor read,&#8221; the AI cannot just guess the answer. </p><p>It needs the exact, up-to-the-minute context of that specific video.</p><blockquote><p>How do you build a system that accurately leverages millions of hours of dynamic, unstructured video data without breaking the bank?</p></blockquote><h3><strong>Option 1: Prompt Stuffing (The Context Window)</strong></h3><p>With new models boasting massive context windows, you could try to inject the entire 3-hour transcript, the uploader&#8217;s channel history, and the top 1,000 comments into the prompt every time a viewer asks a question.</p><p>But this has terrible Unit Economics. You pay per token. Stuffing a massive transcript into every single query will cost you a fortune, and it takes the model excessive time to process. </p><p>Plus, the model might get lost in the middle, confusing something said in minute 10 with something said in minute 150.</p><h3><strong>Option 2: Fine-Tuning the Model</strong></h3><p>You could retrain the underlying weights of the LLM on your massive database of YouTube transcripts.</p><p><strong>But</strong> it misses Dynamic Context. Over 500 hours of video are uploaded to YouTube every single minute. </p><p>Fine-tuning is expensive, slow, and static. If a creator uploads a breaking news video at 2:00 PM, your fine-tuned model won&#8217;t know about it until the next massive training run.</p><p>Now, let&#8217;s see how RAG (Retrieval-Augmented Generation) solves the problem.</p><p>But before that, let&#8217;s understand: What is OpenRAG?</p><p>OpenRAG is an open-source platform where all the necessary tools are tightly integrated, making it super easy to set up an effective agentic RAG system in just a few minutes. </p><p>Instead of spending months building a custom data pipeline to connect your AI to your video servers, OpenRAG gives you a pre-configured architecture out of the box.</p><h3>Why do we need OpenRAG?</h3><p>We need it because blindly feeding data to AI is inefficient and inaccurate.</p><ul><li><p>Instead of reading the whole 3-hour transcript, RAG searches the database, finds the exact two paragraphs from minute 45 where the camera is discussed, and feeds only those specific details to the AI. This keeps token costs incredibly low and responses lightning-fast (see the rough numbers below).</p></li><li><p>You need a system that gives you complete freedom to swap out underlying AI models or connect totally new external databases (like a database for YouTube Shorts) without writing complex backend code.</p></li></ul>
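<p>To make the cost point concrete, here is a back-of-the-envelope comparison of prompt stuffing versus retrieval. The per-token price, token counts, and query volume are purely illustrative assumptions, not YouTube&#8217;s or any vendor&#8217;s real numbers.</p><pre><code># Back-of-the-envelope cost comparison: stuffing a full 3-hour transcript
# versus retrieving a few relevant chunks. All numbers are assumptions.

PRICE_PER_1K_INPUT_TOKENS = 0.003   # assumed $ per 1,000 input tokens
FULL_TRANSCRIPT_TOKENS = 40_000     # rough size of a 3-hour transcript (assumption)
RETRIEVED_CHUNK_TOKENS = 800        # a few relevant chunks via RAG (assumption)
QUESTIONS_PER_DAY = 1_000_000       # across all videos (assumption)

def daily_cost(tokens_per_question: int) -> float:
    return tokens_per_question / 1000 * PRICE_PER_1K_INPUT_TOKENS * QUESTIONS_PER_DAY

print(f"Prompt stuffing: ${daily_cost(FULL_TRANSCRIPT_TOKENS):,.0f} per day")
print(f"Retrieval (RAG): ${daily_cost(RETRIEVED_CHUNK_TOKENS):,.0f} per day")
# Prompt stuffing: $120,000 per day
# Retrieval (RAG): $2,400 per day
</code></pre>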
<h3>How OpenRAG Works</h3><p>For a complete and proper RAG system to work, it requires three main components. OpenRAG achieves this with three powerful tools:</p><h4><strong>1. Docling (Intelligent Data Entry)</strong></h4><p>Suppose you are processing a highly technical tutorial video. </p><p>The data isn&#8217;t just spoken words; it&#8217;s a mix of auto-generated transcripts, on-screen text (OCR), and chapter metadata. Traditional parsers dump all this raw text into a file, destroying the timeline structure. </p><p>Docling is the smart parser. It identifies these different components and extracts them properly, keeping timestamps aligned with the text. If this isn&#8217;t done right, your system fills with junk data, and the AI quotes a timestamp that points to the wrong part of the video.</p><h4><strong>2. OpenSearch (Fast Searching)</strong></h4><p>Once Docling processes the video transcripts and metadata, the data goes directly to OpenSearch. </p><p>This is your memory bank. It stores video data chunks as vector representations (mathematical embeddings of text). </p><p>When a viewer asks, &#8220;When do they talk about the battery life?&#8221;, OpenSearch performs similarity searches to retrieve the exact transcript chunks that mathematically match that concept, even if the speaker said &#8220;power efficiency&#8221; instead of &#8220;battery life&#8221;.</p><h4><strong>3. Langflow (The Main Wiring)</strong></h4><p>Think of Langflow as the execution engine. </p><p>It connects your AI models, your OpenSearch database, and the YouTube app interface together. </p><p>It provides a visual workflow-builder UI where you can make custom changes directly in the workflow. When you add your transcript database here, you give it a unique, descriptive name. This is crucial so the AI agent knows to pull from the video content.</p><h3>Challenges with OpenRAG Systems</h3><ul><li><p>If Docling fails to parse a poorly auto-captioned video (where words are mumbled or overlapping), OpenSearch retrieves garbage text, and the LLM confidently hallucinates an incorrect answer.</p></li><li><p>Vector databases can be computationally heavy. Efficient indexing is required so viewers aren&#8217;t left staring at a loading spinner.</p></li><li><p>Adding an orchestration layer (Langflow) and a retrieval step adds latency before the LLM even begins generating the first word of the response.</p></li></ul><p>If you are managing an OpenRAG-based YouTube AI companion, track these (a small instrumentation sketch follows the list):</p><ul><li><p><strong>Retrieval Latency:</strong> How many milliseconds does it take for OpenSearch to find the relevant 30-second chunk of the transcript?</p></li><li><p><strong>Time to First Token (TTFT):</strong> How long does the user wait from clicking &#8220;Ask&#8221; to seeing the very first word generated by the AI?</p></li><li><p><strong>Timestamp Accuracy:</strong> Did the retrieval engine actually fetch the correct segment? If the AI provides a &#8220;Jump to 12:04&#8221; link, users will instantly downvote the answer if the timestamp is wrong.</p></li><li><p><strong>Token Efficiency:</strong> How many tokens are you sending in the augmented prompt vs. how many were necessary? This directly impacts the feature&#8217;s computing cost at YouTube scale.</p></li></ul>
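<p>Here is a minimal sketch of how the first two metrics could be instrumented. The <code>retrieve()</code> and <code>stream_llm()</code> stubs are placeholders for the real OpenSearch query and a streaming model call; only the measurement pattern is the point.</p><pre><code># Sketch: instrumenting Retrieval Latency and Time to First Token (TTFT).
# The stubs below simulate work with sleeps so the example is self-contained.

import time

def retrieve(query):                       # stand-in for the vector search
    time.sleep(0.05)
    return ["transcript chunk at 12:04"]

def stream_llm(prompt):                    # stand-in for a streaming model call
    time.sleep(0.20)
    yield "The"
    yield " battery section starts at 12:04."

def answer_with_metrics(query):
    t0 = time.perf_counter()
    chunks = retrieve(query)
    retrieval_latency_ms = (time.perf_counter() - t0) * 1000

    first_token_at = None
    tokens = []
    for token in stream_llm(f"Context: {chunks}\nQuestion: {query}"):
        if first_token_at is None:
            first_token_at = time.perf_counter()
        tokens.append(token)

    ttft_ms = (first_token_at - t0) * 1000  # measured from the user's "Ask" click
    return "".join(tokens), retrieval_latency_ms, ttft_ms

print(answer_with_metrics("When do they talk about battery life?"))
</code></pre>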
<div><hr></div><p><a href="https://topmate.io/technomanagers/1861184">Crack AI PM Interview | 500+ Success Stories | 4.9/5 Rated Course (having real AI PM Interview Questions from Google, OpenAI, Anthropic, etc.)<br>(35+ Videos) &amp; (Extra 25+ Real Case Studies as well).</a></p><div><hr></div><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/aed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png" width="1400" height="788" alt=""><figcaption class="image-caption">Highest Rated Course &#8212; 4.9 / 5 (500+ Enrollments in the last 2 months) &#8212; <a href="https://topmate.io/technomanagers">Testimonials</a></figcaption></figure></div><blockquote><p><em>For New Year, we are giving an EXTRA 60% OFF on our AI PM Flagship Course for a <strong>very limited time</strong> | (35+ Videos) &amp; (Extra 25+ Real Case Studies as well)<br>Coupon Code &#8212; <strong>NYE26</strong>, Course Link &#8212; <strong><a href="https://topmate.io/technomanagers/1861184?coupon_code=NYE26">Click Here</a></strong></em></p></blockquote><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item><item><title><![CDATA[Spotify's AI Strategy]]></title><description><![CDATA[First Principle Breakdown]]></description><link>https://www.technomanagers.com/p/spotifys-ai-strategy</link><guid isPermaLink="false">https://www.technomanagers.com/p/spotifys-ai-strategy</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Tue, 03 Mar 2026 18:24:54 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/20b36895-b2f3-4b67-9cff-29f653e939b1_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<ol><li><p><a href="https://www.technomanagers.com/p/ubers-ai-strategy">Uber&#8217;s AI Strategy</a></p></li><li><p><strong><a href="https://www.technomanagers.com/p/neural-networks-101">Neural Networks 101 for Product Managers</a></strong></p></li><li><p><strong><a href="https://www.technomanagers.com/p/metrics-for-ai-product">AI Product Metrics</a> ( AI PM Interview Question )</strong></p></li><li><p><a href="https://www.technomanagers.com/p/ai-product-management-2026-winners">AI Product Management 2026 &#8212; Winner&#8217;s Playbook</a></p></li></ol><p>Spotify is a leading global audio streaming platform. </p><p>It provides users with access to millions of songs and podcasts. 
The platform connects creators with listeners worldwide. </p><p>It operates on a freemium business model. Users can choose a free ad-supported tier or a paid premium subscription for uninterrupted listening.</p><h2>What is the ultimate Goal of Spotify?</h2><p>The primary business goal of Spotify is to maximise user engagement and retention while growing its subscriber base. </p><p>The company wants to become the ultimate audio ecosystem. They aim to keep users on the platform for as long as possible. </p><p>Higher engagement directly leads to better ad revenue from free users. </p><p>It also leads to higher retention rates for premium subscribers. This ensures long-term business profitability.</p><h2>So what should be the North Star Metric?</h2><p>A North Star metric is a single metric that best captures the core value a product delivers to its customers. </p><p>The North Star metric for Spotify is Time Spent Listening. </p><p>Increased listening time means users are discovering relevant content and finding value in the platform. </p><p>This metric perfectly aligns user satisfaction with business growth. A user spending hours on the platform is highly unlikely to cancel their subscription.</p><h2>First Principle Breakdown of North Star Metric</h2><p>Time Spent Listening can be expressed as a multiplication of three key terms.</p><blockquote><p><em><strong>Time Spent Listening = Daily Active Users x Sessions per User x Average Session Duration</strong></em></p></blockquote><ul><li><p>Daily Active Users represent the breadth of the user base. </p></li><li><p>Sessions per User represents the frequency of engagement. </p></li><li><p>Average Session Duration represents the depth of engagement. </p></li></ul><p>Improving any of these three terms will mathematically increase the overall Time Spent Listening.</p><h2>How will the AI strategy impact the North Star Metric?</h2><p>AI is the core engine driving these three levers today. </p><p>AI algorithms analyse listening habits to recommend highly personalised content. Better recommendations make users open the app more frequently. This directly increases the Sessions per User. </p><p>Users do not skip tracks when the algorithm plays exactly what they want to hear. This extends the Average Session Duration. </p><p>AI also helps reactivate dormant users through personalised notifications. This reactivation increases Daily Active Users. </p><p>All these AI-driven improvements multiply together to exponentially increase the overall Time Spent Listening.</p><h2>AI Strategic Initiatives</h2><p>Spotify is actively executing an aggressive AI strategy to transition from a passive streaming app into an interactive platform. They are rolling out several key initiatives.</p><h3>First is the AI DJ</h3><p>This feature uses a synthesised voice and generative AI to curate a personalised radio experience. </p><p>It talks to the user and explains why specific songs were chosen. This creates a deeply engaging experience that increases session duration.</p><h3>Second is Prompted Playlists</h3><p>Users can type simple text prompts to generate custom playlists. </p><p>This feature transforms passive listeners into active participants. It directly boosts session frequency.</p><h3>Third is internal engineering velocity</h3><p>Spotify uses an internal AI tool called Honk. </p><p>This tool helps developers write code and fix bugs directly from mobile devices. Faster software development means faster feature releases for users.</p><h3>Fourth is AI music derivatives. 
</h3><p>The platform is developing frameworks to allow creators to monetize AI enabled covers and remixes of their original music. </p><p>This expands the content library and attracts more users.</p><h2>Way Forward AI Strategy</h2><p>The future AI strategy for Spotify will focus on deep cross-format integration and hyper-contextual awareness. </p><p>The platform will predict what a user wants to hear based on their real-time environment or daily calendar. </p><p>The AI will become a proactive companion rather than a reactive tool. We will likely see AI negotiating licensing deals or automatically creating dynamic audiobooks. </p><p>The strategy will shift towards building a proprietary data moat. This data moat will ensure the AI understands user psychology better than any competitor.</p><p><em><a href="https://topmate.io/technomanagers/1861184">If you like this article, you will absolutely love our </a><strong><a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a></strong><a href="https://topmate.io/technomanagers/1861184"> ( having real </a><strong><a href="https://topmate.io/technomanagers/1861184">AI PM Interview Questions</a></strong><a href="https://topmate.io/technomanagers/1861184"> from Google, OpenAI, Anthropic, Amazon, Nvidia, Booking, etc.)</a> - <strong>( 35+ Videos ) &amp; ( Extra </strong>25+ Real Case studies as well ) | 4.9 Rated out of 5</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nwSS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nwSS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png 424w, https://substackcdn.com/image/fetch/$s_!nwSS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png 848w, https://substackcdn.com/image/fetch/$s_!nwSS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png 1272w, https://substackcdn.com/image/fetch/$s_!nwSS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nwSS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png" width="1400" height="788" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:788,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!nwSS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png 424w, https://substackcdn.com/image/fetch/$s_!nwSS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png 848w, https://substackcdn.com/image/fetch/$s_!nwSS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png 1272w, https://substackcdn.com/image/fetch/$s_!nwSS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Highest Rated Course &#8212; 4.9 / 5 ( 500+ Enrollment in last 2 months) &#8212; <a href="https://topmate.io/technomanagers">Testimonials</a></figcaption></figure></div><blockquote><p><em>For New Year, we are giving EXTRA 60% OFF on our AI PM Flagship Course for <strong>very limited Time | ( 35+ Videos ) &amp; ( Extra </strong>25+ Real Case studies as well )<br>Coupon Code &#8212; <strong>NYE26 , Course Link &#8212; <a href="https://topmate.io/technomanagers/1861184?coupon_code=NYE26">Click Here</a></strong></em></p></blockquote><h2><strong>About 
Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item><item><title><![CDATA[Solving the GenAI Latency Problem]]></title><description><![CDATA[The 1-Line Prompting Mistake Killing Your AI’s Unit Economics]]></description><link>https://www.technomanagers.com/p/solving-the-genai-latency-problem</link><guid isPermaLink="false">https://www.technomanagers.com/p/solving-the-genai-latency-problem</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Tue, 24 Feb 2026 19:02:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a87c8cb3-5039-49ac-9482-998a378d440a_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Before we jump into the article, you can check out the <em><strong>Top Rated ( 4.9 / 5 ) Course - 500+ </strong></em>Success Stories.</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nwSS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png"><img src="https://substackcdn.com/image/fetch/$s_!nwSS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faed2c364-7d4d-4169-adee-17a62182b1b4_1400x788.png" width="1400" height="788" alt="" loading="lazy"></a><figcaption class="image-caption">Highest Rated Course &#8212; 4.9 / 5 ( 500+ Enrollment in last 2 months) &#8212; <a href="https://topmate.io/technomanagers">Testimonials</a></figcaption></figure></div><blockquote><p><em>For New Year, we are giving EXTRA 60% OFF on our AI PM Flagship Course for <strong>very limited Time</strong><br>Coupon Code &#8212; <strong>NYE26, Course Link &#8212; <a href="https://topmate.io/technomanagers/1861184?coupon_code=NYE26">Click Here</a></strong></em></p></blockquote><p>Now back to the article.</p><p>Imagine you are a Senior Product 
Manager at Amazon. </p><p>You are building a new GenAI feature, an interactive shopping assistant, right on the product details page.</p><p>A customer is looking at a complex, high-end DSLR camera. </p><p>To make the AI truly helpful, you load the camera&#8217;s entire 50-page technical manual, hundreds of customer reviews, and your rigid system instructions into the LLM&#8217;s context window.</p><blockquote><p>The user types: Is this camera weather-sealed?</p></blockquote><p>Spinning loading wheel. &#8212;&gt; 6 seconds later &#8212;&gt; Yes.</p><p>Then the user asks: What&#8217;s the return policy on this?</p><p>Spinning loading wheel &#8212;&gt; 8 seconds later &#8212;&gt; You have 30 days.</p><p>You realise you have a massive problem. </p><div class="pullquote"><p>The latency is completely ruining the B2C customer experience&#8212;shoppers are abandoning the chat because it&#8217;s too slow. </p></div><p>Also,  your compute costs are destroying the feature&#8217;s unit economics. </p><p>Why? Because for every single question the user asks, the LLM is re-reading and re-processing that massive 50-page manual from scratch.</p><p>Today, we are doing a First Principle Breakdown of the infrastructure solution to this exact problem: Prompt Caching</p><h3>What Prompt Caching is NOT</h3><p>Let&#8217;s clear up a common misconception. When you tell an engineer you want to cache the responses, they might think of traditional output caching (like caching a standard SQL database query).</p><p>Output caching means storing the final generated response. </p><p>If User A asks What is the capital of France? and User B asks the same thing, you just serve the cached answer. </p><p>But in our  scenario, users are asking different dynamic questions about the same static product manual. Output caching won&#8217;t help us here.</p><blockquote><p><em>Prompt caching tackles the input bottleneck. It caches the prompt itself.</em></p></blockquote><h3>The First Principles of Prompt Caching</h3><p>To understand why this saves your latency and cost, we need to look under the hood at how Transformers process text.</p><p>When you send your massive  prompt to an LLM, before it can generate a single token of its answer, it goes through the prefill phase. </p><p>For every single token in your 50-page manual, the model computes Key-Value (KV) pairs across dozens of transformer layers. </p><p>Think of these KV pairs as the model&#8217;s internal mathematical understanding of your prompt&#8212;how every word relates to every other word.</p><p>Computing these KV pairs for thousands of tokens requires millions of operations. It is computationally expensive and painfully slow.</p><p>Prompt caching simply stores these precomputed KV pairs. </p><p>When your Amazon shopper asks their second question, the system recognises the product manual, retrieves the cached mathematical understanding, and only processes the few tokens of the new question.</p><h3>The Prefix Matching Rule: Structuring Your Prompt</h3><p>For this to work, the LLM relies on Prefix Matching. </p><p>The cache system matches your prompt token-by-token from the absolute beginning. 
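</p><p><em>For instance, here is a minimal sketch of this structuring in practice. It assumes a generic OpenAI-style chat completion call; the helper name, the model name, the manual file, and the few-shot text are illustrative, not part of the original example.</em></p><pre><code># Static, cacheable content goes first; only the user question changes per call.
# Some providers cache long identical prefixes automatically; others (e.g. Anthropic)
# need explicit cache markers in the request.
from openai import OpenAI

client = OpenAI()

SYSTEM_INSTRUCTIONS = "You are a helpful Amazon shopping assistant."
FEW_SHOT_EXAMPLES = "Q: Does it ship with a lens? A: Yes, an 18-55mm kit lens."
PRODUCT_MANUAL = open("dslr_manual.txt").read()  # the large, static context

def build_camera_prompt(user_question):
    # Identical prefix on every call, so the provider can reuse its KV cache;
    # the dynamic question is appended last.
    return [
        {"role": "system", "content": SYSTEM_INSTRUCTIONS},
        {"role": "user", "content": FEW_SHOT_EXAMPLES + "\n\n" + PRODUCT_MANUAL},
        {"role": "user", "content": user_question},
    ]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=build_camera_prompt("Is this camera weather-sealed?"),
)
print(response.choices[0].message.content)
</code></pre>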
<p>The moment the system encounters a token that differs from what is cached, caching stops, and expensive normal processing takes over.</p><p>This means your prompt structure directly dictates your unit economics.</p><p>If you are building this Amazon assistant, you must structure your prompt to put all the static content first (exactly as in the sketch above):</p><ol><li><p>System Instructions (You are a helpful Amazon shopping assistant.)</p></li><li><p>Few-Shot Examples (How to format the output)</p></li><li><p>The 50-Page Product Manual (The static context)</p></li><li><p>The User&#8217;s Question (The dynamic content: What&#8217;s the return policy?)</p></li></ol><p>If you make the mistake of putting the user&#8217;s dynamic question at the top of the prompt, the cache will fail immediately on the very first differing token. </p><p>The LLM will have to recalculate the KV pairs for the entire 50-page manual again. Always put the dynamic components at the very end.</p><h3>Key Takeaways for Execution</h3><p>When you sit down with your engineering team to implement this, keep these constraints in mind:</p><ul><li><p>Minimum K Tokens: Prompt caching isn&#8217;t for simple queries. You typically need at least K tokens (a provider-specific minimum) to initiate caching. Below that threshold, the overhead of managing the cache exceeds the compute savings.</p></li><li><p>Time-to-Live (TTL): Caches do not last forever. Providers usually clear them after 5 to 10 minutes to keep data fresh and manage memory (though some architectures allow up to 24 hours).</p></li><li><p>Implicit vs. Explicit API Calls: Some model providers handle this prefix matching automatically. Others require your engineers to explicitly mark which parts of your prompt should be cached in your API calls.</p></li></ul><p>By leveraging prompt caching, we can turn a slow, expensive AI feature into a highly responsive, cost-effective product that your customers actually want to use.</p><blockquote><p><em><a href="https://topmate.io/technomanagers/1861184">If you like this article, you will absolutely love our </a><strong><a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a></strong><a href="https://topmate.io/technomanagers/1861184"> ( having real AI PM Interview Questions from Google, OpenAI, Anthropic, Amazon, Nvidia, Booking, etc.)</a> - <strong>( 35+ Videos ) &amp; ( Extra </strong>25+ Real Case studies as well )</em></p></blockquote><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. 
For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item><item><title><![CDATA[Develop a strategy for OpenAI’s fine-tuning capabilities?]]></title><description><![CDATA[AI PM Interview Question]]></description><link>https://www.technomanagers.com/p/develop-a-strategy-for-openais-fine</link><guid isPermaLink="false">https://www.technomanagers.com/p/develop-a-strategy-for-openais-fine</guid><dc:creator><![CDATA[Shailesh Sharma]]></dc:creator><pubDate>Sun, 22 Feb 2026 16:10:02 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/be200b78-da44-42a8-9644-cf2e94c79c13_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you think you can crack a top-tier AI PM interview just by throwing around buzzwords like &#8216;RAG&#8217;, &#8216;Agents&#8217;, or relying entirely on &#8216;Vibe Coding&#8217;, you are setting yourself up for failure.</p><p>See this question recently asked in a PM Interview: </p><blockquote><p><em>Develop a strategy for OpenAI&#8217;s fine-tuning capabilities?</em></p></blockquote><p>We will see how to answer this question like the Top 1%.</p><h3><strong>First Principles Breakdown</strong> </h3><p>First, let&#8217;s break down using First Principle Thinking and ask some clarifying questions. </p><p>If you look at the concept of fine-tuning on a broad level, we need to ask three fundamental questions: </p><ul><li><p>What is it? </p></li><li><p>What is it required for? </p></li><li><p>And are there any constraints?</p></li></ul><h4><strong>What is it?</strong> </h4><p>At its core, fine-tuning is taking a massive, pre-trained foundation model&#8212;like GPT-4o&#8212;and training it further on a much smaller, highly curated dataset of your own.</p><p>You are essentially providing hundreds or thousands of &#8216;prompt-completion&#8217; pairs to adjust the model&#8217;s internal behaviour.</p><h4><strong>Why is it required?</strong></h4><p>This is where candidates mess up. </p><p>A lot of people confuse fine-tuning with RAG (Retrieval-Augmented Generation). RAG is for giving the model new facts; it is not training. </p><p>But fine-tuning is for teaching the model new behaviour. </p><p>In the case of RAG, you always need to attach the document while calling the LLM, which will increase the token cost and inference latency. </p><p>Base models are generalists. Fine-tuning bridges the gap between general intelligence and domain-specific expertise. </p><p>I hope you are able to get this.</p><h4><strong>Does it have any constraints?</strong></h4><p>Fine-tuning is computationally expensive and requires high-quality data.</p><h3><strong>Clarifying Questions</strong> </h3><p>I hope you are now able to understand the fundamentals of fine-tuning. It&#8217;s time to ask some clarifying questions, as the prompt is broad.</p><ol><li><p><em>What is the Objective of building this? Are we trying to defend against open-source models? Are we trying to increase Enterprise Adoption? Or are we just improving our own model performance? </em><br>For this case, let&#8217;s assume the interviewer says the primary goal is to <strong>Increase Enterprise Adoption</strong>, as OpenAI wants to create a moat. 
This is how you need to ask clarifying questions to narrow your scope.</p></li><li><p><em>When we talk about fine-tuning, what exactly is on the table? Are we talking about supervised fine-tuning (SFT) only, or does it also include Reinforcement Learning from Human Feedback (RLHF)?</em> <br>Let&#8217;s assume the interviewer wants us to look at the <strong>Full Stack</strong>.</p></li><li><p>The final question we can ask is about the modality. <em>Are we sticking strictly to Text, or is Multimodal fine-tuning included?</em> <br>We know multimodal is the future, so let&#8217;s assume the scope is <strong>Multimodal</strong>.</p></li></ol><p>Now we have a good idea about the exact boundaries of the strategy we are going to build. </p><p>Basically, OpenAI needs to think about building a solution for Enterprises, where an Enterprise can fine-tune the model to get their company&#8217;s specific intelligence.</p><h3><strong>High-Level Strategy </strong></h3><p>Let&#8217;s think about the high-level strategy as to why OpenAI even needs to push fine-tuning as a service. Does it really make sense for enterprises?</p><p>A lot of candidates jump directly to the solution, and they don&#8217;t talk about the high-level strategy. </p><p>You might ask, <em>Why can&#8217;t enterprises just use RAG with a General Model? Why do they need a fine-tuned model?</em> </p><p>Because RAG is excellent for retrieving facts, but it will not change the fundamental behaviour or tone of the model. Plus, with RAG, you need to spend a lot of tokens every single time you pass context. </p><p>So definitely, there is a strong need for fine-tuning.</p><p>The second thing is the moat. When enterprises bake their proprietary data and brand voice into a custom model, they create high friction to leave. </p><p>This massively increases the Switching Cost. Rebuilding that exact customised behaviour on a competitor&#8217;s platform is costly and slow. So OpenAI would want to latch onto this opportunity.</p><p>The third thing is that by enabling fine-tuning for Vision and Audio, OpenAI moves from being just a general tool to a specialised industry expert. Think Medical, Legal, Retail&#8212;this is verticalized Intelligence.</p><p>Till now, we have understood the problem statement and thought about the high-level strategy. </p><h3><strong>Pain Points for the Customer Segment  </strong></h3><p>Now let&#8217;s think about the user segment&#8212;in this case, the Enterprises&#8212;and what their current pain points are with adopting fine-tuning. There are broadly three problem statements:</p><ol><li><p><strong>Data Quality and the Cold Start problem.</strong> Organisations have plenty of data, but it is incredibly noisy, unstructured, or unlabelled. They don&#8217;t know where to start.</p></li><li><p><strong>Security and Leakage.</strong> There is a massive fear at the enterprise level that their highly proprietary data will leak into OpenAI&#8217;s general training pool. This creates a lot of friction.</p></li><li><p><strong>Evaluation Complexity.</strong> It is incredibly hard to quantify if a fine-tuned model is actually better for their specific use case, or if it is just different.</p></li></ol><h3><strong>The Solution </strong> </h3><p>On a very high level, these are the pain points of an enterprise, correct? 
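</p><p><em>For concreteness, the &#8220;prompt-completion&#8221; pairs described earlier look roughly like the chat-format JSONL sketched below. This is exactly the kind of clean training file that is hard to produce from noisy raw data. The structure follows OpenAI&#8217;s published chat fine-tuning format, but the company name, questions, and answers are invented for illustration.</em></p><pre><code># Illustrative only: two hand-written training examples in OpenAI's
# chat fine-tuning format (one JSON object per line of a .jsonl file).
import json

samples = [
    {"messages": [
        {"role": "system", "content": "You are AcmeBank's support assistant. Answer in the bank's formal tone."},
        {"role": "user", "content": "Can I raise my daily transfer limit?"},
        {"role": "assistant", "content": "Yes. You can request a higher limit in the app under Settings, then Transfer Limits; large increases need a manual review."},
    ]},
    {"messages": [
        {"role": "system", "content": "You are AcmeBank's support assistant. Answer in the bank's formal tone."},
        {"role": "user", "content": "What is the fee for an international wire?"},
        {"role": "assistant", "content": "Outgoing international wires carry a flat fee of 25 USD for personal accounts."},
    ]},
]

with open("golden_samples.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
</code></pre>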
<p>Now, let&#8217;s think about the solution that we need to build as an OpenAI Product Manager.</p><p>But before that, check out the best value-for-money and most <a href="https://topmate.io/technomanagers/1861184?coupon_code=NYE26">advanced Course on AI Product Management and Cracking AI PM Interviews</a>. See the curriculum of this course: more than 35 Videos and 25 Case Studies. You don&#8217;t need to spend thousands of dollars; we are running a special discount as well, so do check it out from the link above.</p><p>So we can introduce a comprehensive solution: <strong>OpenAI Expert Studio</strong>. </p><p>This would be an end-to-end, secure workspace dedicated entirely to Enterprise Fine-Tuning.</p><ol><li><p>First, <strong>The Data Refiner</strong>. This solves the Data Quality and Cold Start issue. It&#8217;s an LLM-assisted preparation tool that allows enterprises to upload raw, noisy logs. The platform automatically extracts, cleans, and formats this into high-quality Golden Samples that are instantly ready for training.</p></li><li><p>Second, <strong>The Private Space</strong>. To solve the security fear, this provides a Zero-Data Retention training and hosting environment. This guarantees, right at the infrastructure level, that an enterprise&#8217;s custom weights and training data are completely isolated from OpenAI&#8217;s general base models.</p></li><li><p>Third, <strong>The Eval &amp; Diff Dashboard</strong>. This solves evaluation complexity. It acts as a native deployment sandbox where we can run automated regression tests. It provides a visual, side-by-side comparison&#8212;a &#8220;Diff&#8221;&#8212;between the base model and the fine-tuned model so enterprises can mathematically prove performance gains before deploying.</p></li></ol><h3><strong>Metrics </strong></h3><p>Now, let&#8217;s quickly jump into metrics. What are some of the key metrics you will track to measure the success of this strategy?</p><ol><li><p><strong>Number of Custom Models Generating &#8216;X&#8217; Inferences Per Month</strong>. Why? Because simply creating a model isn&#8217;t enough. We need to measure both adoption and quality. If a custom model is generating a high volume of inferences, it means it is actively providing value in production.</p></li><li><p>Then, <strong>Time-to-First-Model, or TTFM</strong>. This measures how effectively our &#8220;Data Refiner&#8221; tool is working. If we reduce the friction of data cleaning, enterprises should be able to spin up their first test model much faster.</p></li><li><p>Finally, a very important metric is the <strong>Enterprise Repeat or Retention Rate</strong>. Why is this important? Because when you launch a new capability, there is always a novelty effect where companies will try it once. But a truly successful enterprise product retains users who come back to iterate, retrain, and refine their models over time.</p></li></ol><blockquote><p><em><a href="https://topmate.io/technomanagers/1861184">If you like this article, you will absolutely love our </a><strong><a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a></strong><a href="https://topmate.io/technomanagers/1861184"> ( having real AI PM Interview Questions from Google, OpenAI, Anthropic, Amazon, Nvidia, Booking, etc.)</a> - <strong>( 35+ Videos ) &amp; ( Extra </strong>25+ Real Case studies as well )</em></p></blockquote><h2><strong>About Author</strong></h2><p><em><a href="https://www.linkedin.com/in/shailesh-sharma/">Shailesh Sharma</a>! 
I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking. For more, check out my <a href="https://topmate.io/technomanagers/1861184">AI Product Management Course</a>, <a href="https://topmate.io/technomanagers/1470531">PM Interview Mastery Course</a>, <a href="https://topmate.io/technomanagers/1472775">Cracking Strategy</a>, and <a href="https://topmate.io/technomanagers">other Resources</a></em></p>]]></content:encoded></item></channel></rss>