Tech for Product Managers | Very Important Chapter

May 08, 2025

As a Product manager, you need to start writing the PRD for a particular feature or a Product.

The PRD contains two important sections. The first section is about the Functional Use Cases, and the second is Non-functional Use Cases.

Understanding the System Design will help you give Non-functional requirements to the Engineers.

Imagine you are a product manager on YouTube. You recently launched Ads on YouTube Shorts.

You launched the Feature and started testing for 1 million users. But as soon as you scale the feature for 10 million users, it starts breaking. Why?

It can happen that the feature was built only for 1–2 million users. You haven’t given any requirements in the PRD about the Peak Concurrent Users, Traffic expected, Latency expected, etc. This can ruin the entire customer experience.

This chapter will give you a very good understanding of System Design and Non-functional Requirements ( NFRs ).

Importance of NFRs

Non-functional Requirements ( NFRs ) are very important for any product or feature. They’re not about what your product does (that’s the functional stuff), but rather how well it does it.

Let’s see what the NFR looks like -

Example 1 — The feature should support up to 1 million concurrent users with an average page load time of under 500 ms.
Example 2 — Image uploads should complete within 1 second for images up to 5MB on a stable 4G connection for 95% of attempts.
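
An NFR like Example 2 becomes useful once engineers can check it against measurements. Here is a minimal sketch in Python, assuming you already have a list of measured upload times in seconds. The function name `meets_upload_nfr` and the sample numbers are hypothetical, for illustration only.

```python
def meets_upload_nfr(upload_times_s, threshold_s=1.0, required_fraction=0.95):
    """Return True if enough uploads finished within the threshold.

    Mirrors Example 2: 95% of attempts should complete within 1 second.
    """
    if not upload_times_s:
        raise ValueError("need at least one measurement")
    within = sum(1 for t in upload_times_s if t <= threshold_s)
    return within / len(upload_times_s) >= required_fraction


# Hypothetical measurements: 19 fast uploads and 1 slow one.
sample = [0.4] * 19 + [2.5]
print(meets_upload_nfr(sample))  # True: 19 of 20 attempts (95%) were within 1 second
```

Writing the NFR with a number, a unit, and a percentage is what makes a check like this possible.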

That covers NFRs. More broadly, every Product Manager should also learn about System Design. System Design is basically a high-level technical understanding of how a product or feature works.

Why does a Product Manager need to understand this?

It is because Product Managers have to make decisions that weigh feasibility against trade-offs. If you don’t understand how your product works at a high level, how would you be able to make those decisions well?

Let’s understand this with the help of an example.

Imagine you are the PM for Instagram Stories. It seems simple on the surface, right? Users upload (photos and videos) that disappear after 24 hours.

Let’s say you’re considering a new feature: “Collaborative Stories”, which allows multiple users to contribute to a single story visible to all their mutual followers.

If you don’t have any understanding of System Design, you would ask vague and generic questions, but someone who has an understanding of System Design will ask the questions like -

What happens if multiple users try to add to the collaborative story at the exact same time?
How will the system handle potential conflicts or ensure a consistent view for everyone? This matters because users who see different versions of the same collaborative story, or who hit errors due to conflicting edits, will be frustrated and confused.
Will collaborative stories require significantly more storage than regular stories? How will this impact our storage costs and potentially the app’s size on users’ devices?
How quickly will new contributions to a collaborative story appear to other participants and viewers? What is the expected latency?

These questions will play an important role in the high-level solution and architecture of the product. Your product release timelines also depend on them. So having a brief understanding of System Design, that is, a high-level technical understanding of how your product works, will give you confidence in making suggestions, not only about features but also about performance, scalability, etc.

Now you know how important it is to have an understanding of System Design. Let’s uncover different aspects of System Design while building any feature.

Scalability

Again, you’re the PM for Instagram Stories.

Imagine this: you open Instagram Stories, and instead of seeing your friends’ updates instantly, you’re staring at a loading spinner… forever. Or maybe you try to upload a story of your morning coffee, and it just hangs there. You would not like it, right? That’s what happens when a system can’t scale.

So scalability, in the case of Instagram Stories, means that as more and more people join Instagram and start using Stories simultaneously, your feature should still run smoothly.

It’s like planning a party. If you invite five friends, your regular living room works fine. But if suddenly 100 people show up, you need a much bigger venue and maybe even some crowd control!

For Instagram Stories, that “bigger venue” and “crowd control” involve some clever behind-the-scenes magic. So now let’s talk about the measures that make your product scalable.

Load Balancing

As more people try to access Instagram Stories at the same time, the incoming traffic needs to be distributed. Load balancers act like traffic cops, directing user requests to the servers that are least busy and most available.

When you open Instagram Stories, your app needs to connect to Instagram’s servers to fetch the latest updates. Instead of sending all those requests to a single overworked server, a load balancer acts as the first point of contact. It intelligently distributes these incoming requests to a pool of available servers that are specifically designed to handle Story-related tasks (fetching, displaying, uploading).
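
To make the "traffic cop" idea concrete, here is a minimal sketch of one common strategy, sending each request to the server with the fewest requests in progress. This is an illustration only, not how Instagram's load balancers are actually built. The class name and the server names are hypothetical.

```python
class LeastBusyBalancer:
    """Toy load balancer: send each request to the server with the fewest active requests."""

    def __init__(self, servers):
        # Map each server name to its count of in-flight requests.
        self.active = {name: 0 for name in servers}

    def route(self):
        # min() returns the first least-loaded server, so ties go to the earliest-listed one.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def finish(self, server):
        # Call when a request completes so the server counts as less busy again.
        self.active[server] -= 1


lb = LeastBusyBalancer(["stories-1", "stories-2", "stories-3"])
print([lb.route() for _ in range(4)])
# ['stories-1', 'stories-2', 'stories-3', 'stories-1']
```

No single server receives all four requests; the load is spread across the pool.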

Content Delivery Network ( CDN )

Think of it like Instagram having mini warehouses located all around the world that store copies of Story content (photos and videos). When you want to view a Story, your app connects to the nearest warehouse instead of having to travel all the way to Instagram’s main “storage hub”.

So when a user in Mumbai uploads a Story, it’s not just stored in one central location, say, California. Copies of that Story are likely distributed to CDN servers in India and other nearby regions. When another user in Mumbai wants to view that Story, their request is routed to the local CDN server, which can deliver the content much faster because it’s geographically closer.
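
The Mumbai example can be sketched in a few lines. This is a simplified model with assumptions: the edge server "pulls" a copy from the central store the first time someone asks for it (real CDNs may also push copies ahead of time), and the latency numbers, `ORIGIN` store, and names are all hypothetical.

```python
class EdgeServer:
    """A CDN location ("mini warehouse") with its own local cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # story_id -> content copied from the origin


# Hypothetical central storage hub.
ORIGIN = {"story-42": b"<video bytes>"}


def fetch_story(story_id, edges_by_latency_ms):
    """Serve from the nearest edge; on a cache miss, pull from origin and keep a copy."""
    nearest = min(edges_by_latency_ms, key=edges_by_latency_ms.get)
    if story_id in nearest.cache:
        return nearest.name, "hit"
    nearest.cache[story_id] = ORIGIN[story_id]
    return nearest.name, "miss"


mumbai, california = EdgeServer("mumbai"), EdgeServer("california")
latency_from_mumbai_user = {mumbai: 20, california: 250}  # illustrative milliseconds

print(fetch_story("story-42", latency_from_mumbai_user))  # ('mumbai', 'miss')
print(fetch_story("story-42", latency_from_mumbai_user))  # ('mumbai', 'hit')
```

The first viewer pays the cost of the long trip to the origin; every later viewer in the same region is served locally.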

Consistency

Now let’s understand consistency with the same example of Instagram Stories. We discussed that replicas of Instagram’s data are stored at multiple places across the world, and the content is delivered to you from the nearest hub. The emphasis is on the word replica: consistency means making sure every replica is an exact copy in real time.

In case of Instagram stories, think of consistency as ensuring that everyone is seeing the same Story. It’s about having a unified and coherent view of the data.

Imagine if you updated your profile picture, and some of your friends saw the new one while others still saw the old one — that would be inconsistent.

Achieving perfect consistency in a system as massive and distributed as Instagram is a really hard problem. There are inherent delays in networks and processing. So, instead of aiming for absolute, immediate consistency in every single scenario, Instagram likely employs a few strategies and often leans towards what’s called eventual consistency for some less critical aspects.

With millions of views happening every second, updating every single view count in absolute real-time across the entire system would be incredibly resource-intensive and could impact performance. So, there might be a trade-off where the view count is eventually consistent — it will get there, but there might be a small delay.
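
Here is a minimal sketch of an eventually consistent view count. It assumes each region counts views locally and merges them during a periodic sync. The class, the region names, and the sync mechanism are hypothetical and much simpler than any real system.

```python
class ViewCounter:
    """Toy eventually consistent counter: each region counts locally and syncs later."""

    def __init__(self, regions):
        self.synced_total = 0                    # total all regions agree on after the last sync
        self.pending = {r: 0 for r in regions}   # views recorded locally since the last sync

    def record_view(self, region):
        self.pending[region] += 1

    def read(self, region):
        # A region sees the last synced total plus only its own unsynced views.
        return self.synced_total + self.pending[region]

    def sync(self):
        # Merge every region's local views into the shared total.
        self.synced_total += sum(self.pending.values())
        for region in self.pending:
            self.pending[region] = 0


views = ViewCounter(["mumbai", "boston"])
for _ in range(3):
    views.record_view("mumbai")
views.record_view("boston")

print(views.read("mumbai"), views.read("boston"))  # 3 1  (temporarily different)
views.sync()
print(views.read("mumbai"), views.read("boston"))  # 4 4  (eventually the same)
```

Before the sync, the two regions disagree; after it, both show the same count. That short window of disagreement is the trade-off being described.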

Availability

Availability means how reliable your product is. It’s usually measured as a percentage of uptime. For example, “99.9% availability” means the service is expected to be working 99.9% of the time. That 0.1% represents the allowed downtime for maintenance or unexpected issues.
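
It helps to convert an availability percentage into actual minutes, because that is what users feel. A short sketch follows; the function name is hypothetical, and the arithmetic is just the percentage applied to the length of the period.

```python
def allowed_downtime_minutes(availability_percent, period_days):
    """Minutes of downtime permitted over `period_days` at the given availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


print(round(allowed_downtime_minutes(99.9, 30), 1))        # 43.2 minutes in a 30-day month
print(round(allowed_downtime_minutes(99.9, 365) / 60, 2))  # 8.76 hours in a 365-day year
```

So "99.9%" still allows roughly 43 minutes of downtime in a month. Whether that is acceptable is a product decision, which is why the number belongs in the PRD.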

Imagine you want to upload a quick Instagram story, but you see a little error message pop up: “Couldn’t upload. Please try again later.” You try refreshing your feed, and other parts of Instagram seem to be working fine, but Stories just isn’t responding. You try to view your friend’s Story, but it just shows a loading spinner that never goes away. You see that other people are complaining on Twitter that Instagram Stories is down for them, too. That’s the unavailability of Instagram Stories. This is a very poor experience.

That’s why Product Managers make sure their products and features are highly available, with as little downtime as possible. Instagram achieves this via redundancy: if one server is down, the load balancer routes requests to another server that is up.
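
Why does redundancy help so much? A rough calculation shows it. This sketch assumes servers fail independently of each other, which is a simplification (a shared power or network failure can take several down at once). The function name is hypothetical.

```python
def combined_availability(single_server_availability, server_count):
    """Probability that at least one of `server_count` servers is up.

    Assumes failures are independent, which real systems only approximate.
    """
    all_down = (1 - single_server_availability) ** server_count
    return 1 - all_down


print(round(combined_availability(0.99, 1), 6))  # 0.99
print(round(combined_availability(0.99, 2), 6))  # 0.9999
print(round(combined_availability(0.99, 3), 6))  # 0.999999
```

Under this assumption, two servers that are each up 99% of the time give a combined 99.99%, because both have to be down at the same moment for users to notice.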

CAP Theorem

Imagine you’re building a system where data needs to be shared and accessed by many people, possibly across different locations (like Instagram Stories). The CAP theorem says that you can only fully guarantee 2 out of 3 fundamental properties at any given time:

Consistency (C): Every read request receives the most recent write, or an error. Think of it as everyone always seeing the absolute latest version of a Story, no matter when or where they look. If I add a sticker, everyone sees it instantly, exactly the same way.

Availability (A): The system is always up and responsive. You can always try to upload or view a Story, and the system will give you something back, even if it’s slightly delayed or not the absolute newest version.

Partition Tolerance (P): The system continues to operate even if there are network partitions (breaks in communication) between different parts of the system. Imagine if the internet cable between Bangalore and Boston gets cut; the system should still allow people in Bangalore to use Stories, and people in Boston to use Stories, even if they can’t see each other’s updates for a bit.

We want all three, but we can have only two. So which two will you prioritise among the three?

Instagram Stories, being a global platform with millions of users interacting constantly, must be Partition Tolerant (P). The internet is inherently unreliable; network hiccups are inevitable across such a vast infrastructure. If a temporary network issue occurs between different regions, Instagram can’t just shut down Stories for everyone. It needs to keep running in the affected areas.

Given that Instagram prioritizes Partition Tolerance, they have to make a trade-off between Consistency (C) and Availability (A). Instagram leans heavily towards Availability (A). Here’s why:

Prioritizing Availability: Instagram wants you to be able to upload and view Stories as much as possible, without frequent errors or downtime. If you’re trying to share a moment, they want that to work. If you’re scrolling through your feed, they want Stories to load.

Implications for Consistency (Eventual Consistency): To achieve high availability in a partitioned environment, Instagram often employs what’s called “eventual consistency.” This means that while they strive for everyone to see the latest data, there might be very brief periods where different users see slightly different states.

For example, you might see a certain number of views on your Story, and if your friend looks at the same time from a different location, they might see a slightly different count. The view counts will eventually synchronise.

Why this trade-off makes sense for Instagram Stories: For a social, real-time platform like Stories, a slight delay in seeing the absolute latest data or a temporary inconsistency in counts is generally less disruptive to the user experience than the entire feature being unavailable during network issues.

Now, imagine another scenario: you transfer money from your account to another. Here it’s absolutely crucial that the transaction is consistent. You don’t want the money deducted from your account but not credited to the recipient’s, or vice versa.

Even if there’s a network issue between the banks, the system would likely prioritize making sure the transaction is correctly recorded on both sides, even if it means a slight delay in the transaction being fully processed and visible to both parties (impacting immediate availability).
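
The two choices, Stories favouring availability and a bank favouring consistency, can be shown side by side. This is a toy model, not a real database: the `Replica` class, its `mode` flag, and the `PartitionedError` exception are hypothetical names used only to show the behaviour during a network partition.

```python
class PartitionedError(Exception):
    """Raised by a CP replica that cannot confirm it has the latest write."""


class Replica:
    def __init__(self, mode):
        if mode not in ("AP", "CP"):
            raise ValueError("mode must be 'AP' or 'CP'")
        self.mode = mode
        self.value = None         # last value this replica knows about
        self.partitioned = False  # True while cut off from the other replicas

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than risk returning stale data.
            raise PartitionedError("cannot confirm latest value during a partition")
        # Availability first (or no partition): answer with what we have, possibly stale.
        return self.value


story_views = Replica("AP")
story_views.value = 120
story_views.partitioned = True
print(story_views.read())  # 120, possibly out of date, but the feature keeps working

bank_balance = Replica("CP")
bank_balance.value = 500
bank_balance.partitioned = True
try:
    bank_balance.read()
except PartitionedError as err:
    print("Request refused:", err)
```

Neither behaviour is "better" in general. The right choice depends on whether a stale answer or no answer hurts your users more.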

I think you now have some clarity about System Design, but let’s look at a case study to build a very good understanding.

Case Study 1 — System Design of YouTube

Here we will understand how YouTube works. At its core, YouTube needs to solve these problems.

How do users upload their videos?
How are these videos stored and organized?
How do users find the videos they want to watch?
How are videos streamed smoothly to millions of viewers simultaneously?

Let’s understand block by block.

How do users upload their videos?

Imagine you are a YouTuber. You have created the video, edited it, and it’s ready to be uploaded to YouTube. As soon as you click the upload button, the video file is sent from your device to YouTube’s servers.

YouTube doesn’t just store the raw video; it does a lot of processing. First, the video is converted into multiple different resolutions (like 360p, 720p, 1080p, 4K). This ensures that users with different internet speeds and devices can watch the video smoothly. Information like the video title, description, and tags provided by the uploader is stored and indexed. This is crucial for search and discovery later.

In parallel, YouTube’s systems analyze the video for various things like copyright issues, inappropriate content, and even to understand what the video is about. Once processed, the different versions of the video and its associated data are stored in YouTube’s massive storage.
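
The "multiple resolutions" step can be sketched as a simple rule. This example assumes the pipeline never upscales, so a 1080p upload is not converted to 4K. That rule, the `LADDER` table, and the function name are assumptions for illustration, not YouTube's documented behaviour.

```python
# Resolutions named in the text, mapped to their height in lines (4K treated as 2160).
LADDER = {"360p": 360, "720p": 720, "1080p": 1080, "4K": 2160}


def renditions_for(source_height):
    """Return the versions to produce: every ladder step at or below the source height."""
    return [name for name, height in LADDER.items() if height <= source_height]


print(renditions_for(1080))  # ['360p', '720p', '1080p']
print(renditions_for(2160))  # ['360p', '720p', '1080p', '4K']
```

Each extra version costs processing time and storage, which is one reason upload-to-publish time is worth stating as an NFR.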

How are these videos stored and organised?

Storing the huge amount of video data on YouTube is a big task. They use a distributed storage system, meaning the data isn’t all in one place. It’s spread across many servers.

This helps in handling the constant influx of new videos. If one storage unit fails, others can still serve the data. ( Hope you are able to recall the concepts ). Distributing the data closer to users can improve playback speed and you will not have buffering issues. Videos are organized using their unique IDs and the metadata associated with them. This makes it easier to retrieve and manage them.
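
Here is one simple way a system could decide where a video lives based on its unique ID, while keeping more than one copy. It is a sketch under assumptions: the node names are hypothetical, and real systems use more sophisticated placement schemes than this.

```python
import hashlib


def storage_nodes_for(video_id, nodes, replicas=2):
    """Pick `replicas` distinct nodes for a video, deterministically from its ID."""
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested number of replicas")
    # Hash the ID so videos spread evenly across nodes regardless of how IDs look.
    digest = hashlib.sha256(video_id.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    # Take consecutive nodes, wrapping around, so the copies land on different nodes.
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical storage servers
print(storage_nodes_for("video-123", nodes))
```

Because the placement depends only on the ID, any server can work out where a video is stored without asking a central directory, and the second copy covers the case where one node fails.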

How do users find the videos they want to watch?

Discovery on YouTube happens via Search, the Homepage, Recommendations, and Notifications. When a user types in a query, YouTube’s search engine looks through the video titles, descriptions, tags, and even the content of the video (using advanced techniques) to find relevant videos. It then ranks these videos based on factors like relevance, popularity, and quality. On the Homepage, the videos are personalized based on your viewing history and interests. Other factors like popularity, trendiness, and your channel subscriptions are also taken into account.
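
To get a feel for "find, then rank", here is a deliberately tiny sketch. It scores a video by how many query words appear in its title, description, or tags, and uses view count as a tie-breaker. The scoring rule and the sample videos are made up for illustration; YouTube's real ranking is far more complex and is not described here.

```python
def rank_videos(query, videos):
    """Order videos by a toy score: keyword matches first, view count as a tie-breaker."""
    words = query.lower().split()

    def score(video):
        text = " ".join([video["title"], video["description"], " ".join(video["tags"])]).lower()
        # Simple substring check per query word; real search engines do much more.
        matches = sum(1 for w in words if w in text)
        return (matches, video["views"])

    matching = [v for v in videos if score(v)[0] > 0]
    return sorted(matching, key=score, reverse=True)


# Hypothetical catalogue.
videos = [
    {"title": "System design basics", "description": "Intro for PMs", "tags": ["tech"], "views": 500},
    {"title": "Cooking pasta", "description": "Dinner ideas", "tags": ["food"], "views": 9000},
    {"title": "System design interview", "description": "Design YouTube", "tags": ["pm"], "views": 2000},
]
print([v["title"] for v in rank_videos("system design", videos)])
# ['System design interview', 'System design basics']
```

Both matching videos contain both query words, so popularity decides the order, and the unrelated video is left out entirely.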

How are videos streamed smoothly to millions of viewers simultaneously?

Once you click on a video, YouTube needs to deliver it to your device without buffering, so that you have a good viewing experience. YouTube is able to achieve this via a CDN ( Content Delivery Network ). This is a network of servers located around the world that store copies of popular videos closer to users. When you request a video, you’re likely getting it from a CDN server near you, reducing latency and improving playback speed. Also, YouTube uses adaptive streaming, which means it automatically adjusts the quality of the video based on your network connection.
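
Adaptive streaming boils down to a repeated decision: given the speed measured right now, which quality should the next chunk of video use? Here is a minimal sketch. The bitrates in `RENDITION_KBPS` and the 0.8 safety margin are illustrative assumptions, not YouTube's actual values.

```python
# Illustrative bitrates in kilobits per second, ordered from lowest to highest quality.
RENDITION_KBPS = {"360p": 800, "720p": 2500, "1080p": 5000, "4K": 16000}


def pick_rendition(measured_kbps, safety_factor=0.8):
    """Choose the highest quality whose bitrate fits within a fraction of measured bandwidth."""
    budget = measured_kbps * safety_factor
    fitting = [name for name, kbps in RENDITION_KBPS.items() if kbps <= budget]
    # Fall back to the lowest quality when even that exceeds the budget.
    return fitting[-1] if fitting else next(iter(RENDITION_KBPS))


print(pick_rendition(30000))  # 4K on a fast connection
print(pick_rendition(4000))   # 720p on a moderate connection
print(pick_rendition(500))    # 360p on a slow connection
```

The safety margin leaves headroom so that a small dip in network speed does not immediately cause buffering.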

Interview Question — YouTube Buffering has increased, what can be the issue?

Now that you know the system design of YouTube, try to think about why a YouTube video can buffer.

1. Inefficient CDN Routing: Users are not connected to a nearby CDN server, so data for a large number of users has to be fetched from far away, leading to delay and buffering. This can happen because YouTube underestimated the traffic, leaving insufficient capacity.
2. CDN Cache Update Delay: Trending or popular videos are not yet in the cache, or the latest edit of a video is not in the cache.
3. Video Processing Delay: Popular YouTubers are uploading super-high-resolution videos with complex effects.

In the book above, we have explained everything with a case-study-based approach. This can be a game changer in your Product Management career.

For More, check out our PM Interview Mastery Course ( Crack PM Interview with First Principle Thinking like the top 1% ) — 5/5 Rated

Hey, I’m Shailesh Sharma! I help PMs and business leaders excel in Product, Strategy, and AI using First Principles Thinking.
For more, check out my Live cohort course, PM Interview Mastery Course, Cracking Strategy, and other Resources


Technomanagers
