How to Design a TikTok

Functional Requirements

  1. Video Upload: Allow users to upload videos to the platform.
  2. Video Playback: Enable users to watch videos.

These two features are enough to build a minimum viable product.

Non-functional Requirements

  1. Scalability: Support a massive user base with high concurrency.
  2. Availability: Ensure the system remains operational even during partial failures.
  3. Low Latency: Minimize delays in video loading and playback.
  4. Fault Tolerance: Handle hardware/network failures gracefully without data loss.

Assumptions

  • User Base: 1 billion daily active users (DAU).
  • Usage Patterns:
    • Each user watches 100 videos per day.
    • Each user uploads 1 video per day.
  • Video Size: Average video size is 10MB.
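
Taken at face value, these assumptions imply the daily volumes computed below. This is only a back-of-envelope sketch. It assumes decimal units (1 PB = 10^9 MB) and that every view transfers the full 10MB file, which ignores caching and lower-quality renditions; the function name `estimate` is made up for illustration.

```python
# Back-of-envelope capacity estimate from the stated assumptions.
DAU = 1_000_000_000            # daily active users
VIEWS_PER_USER = 100           # videos watched per user per day
UPLOADS_PER_USER = 1           # videos uploaded per user per day
VIDEO_SIZE_MB = 10             # average video size in MB
SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_PB = 1_000_000_000      # assumption: decimal units


def estimate(dau=DAU, views=VIEWS_PER_USER, uploads=UPLOADS_PER_USER,
             size_mb=VIDEO_SIZE_MB):
    """Return average request rates and daily data volumes."""
    uploads_per_day = dau * uploads
    views_per_day = dau * views
    return {
        "uploads_per_sec": uploads_per_day / SECONDS_PER_DAY,
        "views_per_sec": views_per_day / SECONDS_PER_DAY,
        # New raw video written to storage each day, before transcoding.
        "new_storage_pb_per_day": uploads_per_day * size_mb / MB_PER_PB,
        # Upper bound on data served: every view downloads the whole file.
        "egress_pb_per_day": views_per_day * size_mb / MB_PER_PB,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions the system averages roughly 11,600 uploads and 1.16 million views per second, writes about 10 PB of new video per day, and serves up to about 1,000 PB per day. Peak traffic would be higher than these averages.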

Database Selection: SQL vs NoSQL

  • SQL Databases:
    • Pros: Strong consistency, relational data support.
    • Cons: Challenges with sharding and hotspot management.
  • NoSQL Databases:
    • Pros: Cost-effective, horizontally scalable.
    • Cons: Limited transactional support.

Video Storage Strategy

Blob Storage (Binary Large Object):

  • Optimized for unstructured data (videos, images, audio).
  • Ideal for storing and retrieving large volumes of large binary files efficiently.

How to upload a video

  1. Since we cannot know in advance what users will upload, exposing the blob storage directly to clients is unsafe.
  2. A better option is to allocate a temporary space that holds the original videos as uploaded by users.
  3. During upload, a video can be split into small segments. This supports resumable uploads (after an interruption, only the missing segments are re-sent) and parallel uploads (multiple segments are sent simultaneously). Both matter for a mobile app, where the network is often unstable.
  4. Once all segments are uploaded, a message queue and a worker pool can merge the segments and verify the integrity of the file. The video should then be encoded into several formats and quality levels, because the version played depends on the client's device and network conditions.
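
The segment-based upload flow above can be sketched as follows. This is a minimal in-memory illustration, not a production design: `split_into_chunks` and `UploadSession` are hypothetical names, SHA-256 is an assumed choice for the integrity check, and a real system would keep segments in the temporary space and run the merge in a queue-driven worker.

```python
import hashlib


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut the raw video into fixed-size segments (the last may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class UploadSession:
    """Hypothetical server-side state for one upload in the temporary space."""

    def __init__(self, total_chunks: int, expected_sha256: str):
        self.total_chunks = total_chunks
        self.expected_sha256 = expected_sha256
        self.chunks: dict[int, bytes] = {}

    def receive(self, index: int, chunk: bytes) -> None:
        """Store one segment; segments may arrive in any order or be re-sent."""
        if not 0 <= index < self.total_chunks:
            raise IndexError(f"segment index {index} out of range")
        self.chunks[index] = chunk

    def missing(self) -> list[int]:
        """Segments the client still has to send; this is what enables resume."""
        return [i for i in range(self.total_chunks) if i not in self.chunks]

    def merge(self) -> bytes:
        """Join the segments in order and verify the checksum of the result."""
        if self.missing():
            raise ValueError(f"upload incomplete, missing {self.missing()}")
        merged = b"".join(self.chunks[i] for i in range(self.total_chunks))
        if hashlib.sha256(merged).hexdigest() != self.expected_sha256:
            raise ValueError("integrity check failed")
        return merged


if __name__ == "__main__":
    video = bytes(range(256)) * 10                  # stand-in for a video file
    parts = split_into_chunks(video, 1000)
    session = UploadSession(len(parts), hashlib.sha256(video).hexdigest())
    session.receive(2, parts[2])                    # out-of-order arrival
    session.receive(0, parts[0])
    print("resume from:", session.missing())        # connection dropped here
    for i in session.missing():
        session.receive(i, parts[i])
    print("merged ok:", session.merge() == video)
```

After a dropped connection the client asks which segments are missing and sends only those, so the work already done is not repeated.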

How to watch a video

  1. To avoid hotspots caused by frequent access to popular videos, we can deploy a CDN near user locations to offload traffic from the blob storage.
  2. Although storing videos in a CDN speeds up delivery and reduces latency, it also comes with high costs. So, we should make sure only the most popular videos are cached there.
  3. An extractor service can periodically identify the most popular videos in the blob storage and push them to the CDN, replacing videos that are no longer popular.
  4. We can also adopt a streaming protocol such as Apple's HTTP Live Streaming (HLS), so that playback starts while the rest of the video is still downloading, which improves the user experience.
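
One way the extractor service's periodic job could decide what to push and what to replace is sketched below. It assumes popularity is measured by recent view counts and that the CDN holds a fixed number of videos; `plan_cdn_refresh` and its parameters are hypothetical names used only for illustration.

```python
import heapq


def plan_cdn_refresh(view_counts: dict[str, int], cached: set[str],
                     capacity: int) -> tuple[list[str], list[str]]:
    """Return (video IDs to push to the CDN, video IDs to evict from it).

    view_counts: recent views per video ID (assumed popularity signal).
    cached:      video IDs currently held by the CDN.
    capacity:    assumed maximum number of videos the CDN should hold.
    """
    # The `capacity` most-viewed videos form the target CDN contents.
    hot = set(heapq.nlargest(capacity, view_counts, key=view_counts.get))
    to_push = sorted(hot - cached)    # popular but not yet cached
    to_evict = sorted(cached - hot)   # cached but no longer popular
    return to_push, to_evict


if __name__ == "__main__":
    counts = {"a": 900, "b": 50, "c": 700, "d": 10}
    push, evict = plan_cdn_refresh(counts, cached={"b", "c"}, capacity=2)
    print("push:", push, "evict:", evict)
```

Computing a diff against the current CDN contents means videos that stay popular are not re-uploaded on every run.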

Show off

  1. We can introduce a recommendation system that suggests videos to users instead of serving a purely chronological feed.
  2. We can introduce a Two-Tower Model to embed user and video features into separate vectors. When a client requests videos, the system can recommend those that match their interests based on vector similarity.
  3. The benefit is that users spend more time watching videos; the drawback is the extra cost of hiring a team to build and deploy the model.
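
The retrieval step of a two-tower setup can be sketched as below. The embeddings here are hard-coded, made-up vectors; in a real system they would be produced by the trained user and video towers. Cosine similarity is an assumed choice of similarity measure, and `cosine` and `recommend` are hypothetical helper names.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(user_vec: list[float], video_vecs: dict[str, list[float]],
              k: int) -> list[str]:
    """Return the k video IDs whose embeddings are closest to the user's."""
    ranked = sorted(video_vecs,
                    key=lambda vid: cosine(user_vec, video_vecs[vid]),
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Made-up 2-dimensional embeddings, for illustration only.
    user = [1.0, 0.0]
    videos = {
        "cooking": [0.9, 0.1],
        "gaming": [0.1, 0.9],
        "travel": [0.6, 0.4],
    }
    print(recommend(user, videos, k=2))
```

This brute-force scan compares the user against every video; at the scale assumed in this article, an approximate nearest-neighbor index would likely be needed instead.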

System Architecture