System Design

It involves taking a problem statement, breaking it down into smaller components and designing each component to work together effectively to achieve the overall goal of the system.
Steps for approaching a system design:
- Understand the problem
  - Identify the users and their needs, as well as the constraints and limitations
- Identify the scope of the system
  - Define the boundaries, what the system will do and what it will not
- Research & analyze existing systems
- Create a HLD
  - outline main components & how they interact with each other
  - rough diagram or flow chart
- Refine the design
- Document the design
- Continuously monitor and improve the system
Design Pattern
- A software design pattern is a general, reusable solution to a commonly occurring problem within a given context in software design
- The steps go like this:
  - Product Requirement Doc
  - Features/Abstract Concepts
  - Data Definitions
  - Objects
  - Database
Process should be reasonable as well. For example, we don't want to stream the source resolution, say 8K video, directly to the viewers; it's impractical. We need to downscale or transcode it to a resolution better suited for streaming.
Abstraction should be there for the end user. They don't necessarily need to know about the process, they need the end result. So, abstract away the problem.
Testing should be at every step: edge cases, common cases, specific cases.

Tip

Extensibility: a measure of the ability to extend a system and the level of effort required to implement the extension

Less coupling, more cohesion. Then it's easy to scale and extend as and when requirements change.

Chapter 1

Databases & DBMS

Database is an organized collection of structured information/data, typically stored electronically in a computer system.
Controlled by DBMS
DBMS serves as an interface between the database and its end-users or programs, allowing users to retrieve, update, and manage how the information is organized and optimized

Components

Schema: shape of a data structure and specifies what kind of data goes where
Table: a collection of related data organized into rows and columns, like a spreadsheet
Column: set of values of a particular type
Row: data is recorded in rows

Types

Challenges

Common challenges faced while running db at scale:

Absorbing significant increases in data volume
Ensuring data security
Keeping up with demand
Maintaining fault tolerance

SQL

sql db is a collection of data items with pre-defined relationships b/w them.
organized as set of tables with cols & rows
each row in a table can be identified by a unique primary key (PK), and rows can reference rows in other tables through foreign keys (FK).
Follow ACID

Materialized Views

A materialized view is a pre-computed data set derived from a query specification and stored for later use. Because the data is pre-computed, querying a materialized view is faster than executing a query against the base table of the view.
This performance difference can be significant when a query is run frequently or is sufficiently complex.
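
As a rough illustration of the idea, the sketch below emulates a materialized view with SQLite, which has no native `MATERIALIZED VIEW` statement. The `orders` and `customer_totals` tables and both helper functions are hypothetical names chosen for this example; databases with built-in support handle the storage and refresh themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER);
    INSERT INTO orders (customer, total) VALUES ('ada', 10), ('ada', 15), ('bob', 7);
""")

def refresh_customer_totals(conn):
    """Recompute the stored result set from the base table."""
    with conn:
        conn.execute("DROP TABLE IF EXISTS customer_totals")
        conn.execute(
            "CREATE TABLE customer_totals AS "
            "SELECT customer, SUM(total) AS total FROM orders GROUP BY customer"
        )

def read_customer_totals(conn):
    """Read the pre-computed rows; no aggregation happens at query time."""
    return dict(conn.execute("SELECT customer, total FROM customer_totals"))

refresh_customer_totals(conn)
```

The trade-off in this sketch is staleness: rows written to `orders` after the last refresh are not visible in `customer_totals` until `refresh_customer_totals` runs again.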

N+1 Query problem

The N+1 query problem happens when the data access framework executes N additional SQL statements to fetch the same data that could have been retrieved when executing the primary SQL query.
The larger the value of N, the more queries will be executed, the larger the performance impact. And, unlike the slow query log that can help you find slow running queries, the N+1 issue won’t be spotted because each individual additional query runs sufficiently fast to not trigger the slow query log.
Commonly seen in GraphQL and ORM
Can be optimized by batching requests
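
A minimal sketch of the problem, assuming a hypothetical `authors`/`books` schema in an in-memory SQLite database. The first function issues one query per author (N+1 in total); the second fetches the same data with a single JOIN. Both function names are made up for this example.

```python
import sqlite3

# Hypothetical schema: authors and their books.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO books VALUES (1, 1, 'Notes'), (2, 1, 'Engines'), (3, 2, 'Kernels');
""")

def titles_n_plus_one(conn):
    """1 query for the authors, then 1 more query per author (N+1 in total)."""
    queries = 0
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    queries += 1
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries

def titles_single_join(conn):
    """Fetch the same data with a single JOIN."""
    result = {}
    rows = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    ).fetchall()
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result, 1
```

With two authors the first version runs three queries and the second runs one; the gap grows linearly with the number of authors.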

Disadvantages

Expensive to maintain
Difficult schema evolution
Performance hits (joins, denormalization)

NoSQL

Unlike in relational databases, data in a NoSQL database doesn't have to conform to a pre-defined schema.
broad category; includes:
- Document DB (MongoDB, DocumentDB)
- Key-Value (Redis, DynamoDB)
- Graph DB (Neo4j, Amazon Neptune)
  - uses graph structures for semantic queries with nodes, edges and properties to store data
  - edges represent relationships b/w the nodes
  - use-cases: Fraud detection, recommendation engines, social network
- Time-Series DB
Follow BASE

SQL vs NoSQL

Storage
Schema
Querying (SQL vs different syntax)
Scalability
- SQL: vertically scalable, can get expensive
- NoSQL: horizontally scalable
Reliability
- SQL wins
Data Intensive/High IO workloads
- NoSQL wins

ACID vs BASE

ACID
- ATOMIC
- CONSISTENT
- ISOLATED
- DURABLE
- Where reliability and consistency are essential
BASE
- Basic Availability
  - db appears to work most of the time
- Soft-state
  - db replicas or stores don't have to be mutually consistent all the time
- Eventual Consistency
  - data might not be consistent immediately but given sufficient time, it becomes consistent.
- where scalability and HA are essential

DB Replication

Replication is a process that involves sharing information to ensure consistency between redundant resources such as multiple databases, to improve reliability, fault-tolerance, or accessibility.

Master-Slave Replication

If the master goes offline, the system can continue to operate in read-only mode until a slave is promoted to a master or a new master is provisioned.
Advantages
- Backups of the entire database have relatively no impact on the master.
- Applications can read from the slave(s) without impacting the master.
- Slaves can be taken offline and synced back to the master without any downtime.
Disadvantages
- Replication adds more hardware and additional complexity.
- Downtime and possibly loss of data when a master fails.
- The more read slaves, the more we have to replicate, which will increase replication lag.

Master-Master Replication

Advantages
- Applications can read from both masters.
- Distributes write load across both master nodes.
- Simple, automatic, and quick failover.
Disadvantages
- Not as simple as master-slave to configure and deploy.
- Conflict resolution comes into play as more write nodes are added and as latency increases.

Synchronous vs Asynchronous Replication

how the data is written to the replica
In synchronous replication, data is written to primary storage and the replica simultaneously.
asynchronous replication copies the data to the replica after the data is already written to the primary storage. Although the replication process may occur in near-real-time, it is more common for replication to occur on a scheduled basis and it is more cost-effective.
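
The difference can be sketched with a toy in-memory model. `Primary`, `Replica`, and `flush` are hypothetical names for this example; a real database ships a replication log over the network rather than copying dictionary entries.

```python
class Replica:
    def __init__(self):
        self.data = {}

class Primary:
    def __init__(self, replica, synchronous):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = []  # writes not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: the replica is updated before the write returns.
            self.replica.data[key] = value
        else:
            # Asynchronous: the write returns now and is shipped later.
            self.pending.append((key, value))

    def flush(self):
        """Asynchronous path: apply queued writes to the replica, in order."""
        for key, value in self.pending:
            self.replica.data[key] = value
        self.pending.clear()
```

In the asynchronous case the replica lags behind the primary until `flush` runs, which is the window in which a primary failure can lose data.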

Indexes

they are used to improve the speed of data retrieval operations on the data store
trade-offs of increased storage, slower writes (not only have to write the data but also update the index) for the benefit of faster reads

An index is a data structure that can be perceived as a table of contents that points us to the location where actual data lives
Two types:
- Dense
- Sparse

Dense Index

for every row
Requires more memory

Sparse Index

subset of rows
less memory
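
A small sketch of both index types over a sorted list of rows. The data, the block size of 4, and the function names are assumptions made for illustration; real storage engines index disk pages rather than Python lists.

```python
import bisect

# Hypothetical sorted "data file": (key, value) rows ordered by key.
rows = [(k, f"row-{k}") for k in range(0, 100, 5)]  # keys 0, 5, ..., 95

# Dense index: one entry per row, mapping key -> position.
dense_index = {key: pos for pos, (key, _) in enumerate(rows)}

# Sparse index: one entry per block of BLOCK rows (the first key of each block).
BLOCK = 4
sparse_positions = list(range(0, len(rows), BLOCK))
sparse_keys = [rows[pos][0] for pos in sparse_positions]

def lookup_dense(key):
    pos = dense_index.get(key)
    return rows[pos][1] if pos is not None else None

def lookup_sparse(key):
    # Find the last indexed key <= key, then scan only that block.
    i = bisect.bisect_right(sparse_keys, key) - 1
    if i < 0:
        return None
    start = sparse_positions[i]
    for k, value in rows[start:start + BLOCK]:
        if k == key:
            return value
    return None
```

Here the dense index holds 20 entries and the sparse index holds 5, at the cost of a short scan inside the block on every sparse lookup.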

Normalization & Denormalization

Surrogate Key: A system-generated value that uniquely identifies each entry in a table when no other column was able to hold properties of a primary key.

Dependencies

Partial dependency: Occurs when a non-key attribute depends on only part of a composite primary key.
Functional dependency: It is a relationship that exists between two attributes, typically between the primary key and non-key attribute within a table.
Transitive functional dependency: Occurs when some non-key attribute determines some other attribute.

Anomalies

Database anomaly happens when there is a flaw in the database due to incorrect planning or storing everything in a flat database. This is generally addressed by the process of normalization.

There are three types of database anomalies:

Insertion anomaly: Occurs when we are not able to insert certain attributes in the database without the presence of other attributes.
Update anomaly: Occurs in case of data redundancy and partial update. In other words, a correct update of the database needs other actions such as addition, deletion, or both.
Deletion anomaly: Occurs where deletion of some data requires deletion of other data.

Normalization

the process of organizing data in a database
creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.

Why?

A fully normalized database allows its structure to be extended to accommodate new types of data without changing the existing structure too much.

Normal Forms - guidelines to ensure that db is normalized

1NF

repeating groups are not permitted
mixing data types in the same column is not permitted

2NF - satisfies 1NF - no partial dependency

3NF - satisfies 2NF - no transitive dependency

BCNF - stronger version of 3NF - for every FD X -> Y, X should be a super key

Advantages

reduces data redundancy
better data design
increases data consistency

Disadvantages

Data design is complex
Slower performance
Maintenance overhead
requires more joins

Denormalization

DB optimization technique when we add redundant data to one or more tables
helps in avoiding costly joins in a relational db
improves read performance at the cost of some write performance

Advantages

faster reads, since fewer joins are needed
simpler queries
Convenient to manage.

Disadvantages

Expensive inserts and updates.
Increases complexity of database design.
Increases data redundancy

CAP Theorem

CAP theorem states that a distributed system can deliver only two of the three desired characteristics Consistency, Availability, and Partition tolerance (CAP).

Consistency

Consistency means that all clients see the same data at the same time, no matter which node they connect to. For this to happen, whenever data is written to one node, it must be instantly forwarded or replicated across all the nodes in the system before the write is deemed "successful".

Availability

Availability means that any client making a request for data gets a response, even if one or more nodes are down.

Partition Tolerance

means the system continues to work despite message loss or partial failure.
A system that is partition-tolerant can sustain any amount of network failure that doesn't result in a failure of the entire network. Data is sufficiently replicated across combinations of nodes and networks to keep the system up through intermittent outages.

CA Tradeoff

In the real world, we can't guarantee the stability of the network, so a distributed system must choose partition tolerance (P).
This means we tradeoff b/w C and A.

CA Db

A CA database delivers consistency and availability across all nodes. It can't do this if there is a partition between any two nodes in the system, and therefore can't deliver fault tolerance.
eg. PostgreSQL

CP Db

A CP database delivers consistency and partition tolerance at the expense of availability. When a partition occurs between any two nodes, the system has to shut down the non-consistent node until the partition is resolved.
eg. MongoDB

AP Db

An AP database delivers availability and partition tolerance at the expense of consistency. When a partition occurs, all nodes remain available but those at the wrong end of a partition might return an older version of data than others. When the partition is resolved, the AP databases typically re-syncs the nodes to repair all inconsistencies in the system.
eg. Apache Cassandra

PACELC Theorem

extension of CAP theorem
introduces latency (L) as an additional attribute of a distributed system.
PACELC theorem states that in the case of Network Partition ‘P’, a distributed system can have tradeoffs between Availability ‘A’ and Consistency ‘C’ Else ‘E’ if there is no Network Partition then a distributed system can have tradeoffs between Latency ‘L’ and Consistency ‘C’.

Partition basically means two nodes are not able to communicate with each other.
One of the major pitfalls of the CAP Theorem was it did not make any provision for Performance or Latency, in other words, CAP Theorem didn’t provide tradeoffs when the system is under normal functioning or non-partitioned.
For example, according to the CAP theorem, a database can be considered available if a query returns a response after 30 days. Obviously, such latency would be unacceptable for any real-world application.

Transactions

A transaction is a series of database operations that are considered to be a "single unit of work". The operations in a transaction either all succeed, or they all fail.
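
A minimal sketch of the all-or-nothing behaviour using SQLite. The `accounts` table, the `CHECK` constraint, and the `transfer` helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative, so the credit is undone too.
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer larger than the source balance fails on the second `UPDATE`, and the rollback also discards the first one, so no money appears from nowhere.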

Tip

Usually, relational databases support ACID transactions, and non-relational databases don't (there are exceptions).

States

A transaction in a db can be in one of the following states:

Distributed Transaction

A distributed transaction is a set of operations on data that is performed across two or more databases. It is typically coordinated across separate nodes connected by a network, but may also span multiple databases on a single server.

Why?

a distributed transaction involves altering data on multiple databases.
all the nodes must commit, or all must abort and the entire transaction rolls back. This is why we need distributed transactions.

Two-Phase Commit

Problems

What if one of the nodes crashes?
What if the coordinator itself crashes?
It is a blocking protocol.
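
Two-phase commit is usually described as a prepare (voting) phase followed by a commit-or-abort phase. The toy coordinator below sketches only that happy-path logic; the `Participant` class and its `can_commit` flag are hypothetical, and none of the failure cases listed above (crashes, timeouts, blocking) are handled.

```python
class Participant:
    """Hypothetical participant node."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all aborted."""
    # Phase 1 (prepare/vote): ask every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit/abort): commit only on a unanimous yes.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

A single "no" vote aborts every participant, which is what gives the protocol its all-or-nothing outcome.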

Three-Phase Commit

helps with the blocking problem in 2PC

Why pre-commit?

If the participant nodes are found in this phase, that means that every participant has completed the first phase. The completion of prepare phase is guaranteed.

Sagas

A saga is a sequence of local transactions. Each local transaction updates the database and publishes a message or event to trigger the next local transaction in the saga. If a local transaction fails because it violates a business rule then the saga executes a series of compensating transactions that undo the changes that were made by the preceding local transactions.

Cons: Hard to debug
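
The saga described above can be sketched as a list of (action, compensation) pairs. Everything here is a simplification for illustration: the order workflow and function names are hypothetical, and a real saga would publish messages or events between services instead of calling local functions.

```python
def run_saga(steps):
    """Run (action, compensation) pairs; on failure undo completed steps in reverse."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True

# Hypothetical order workflow recorded in a simple log.
log = []

def reserve_stock(): log.append("stock reserved")
def release_stock(): log.append("stock released")
def charge_card(): log.append("card charged")
def refund_card(): log.append("card refunded")
def ship_order(): raise RuntimeError("no courier available")  # simulated rule violation
def cancel_shipment(): log.append("shipment cancelled")

succeeded = run_saga([
    (reserve_stock, release_stock),
    (charge_card, refund_card),
    (ship_order, cancel_shipment),
])
```

Because the third step fails, the two completed steps are compensated in reverse order: the card is refunded first, then the stock is released.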

Sharding

Data Partitioning

Data partitioning is a technique to break up a database into many smaller parts. It is the process of splitting up a database or a table across multiple machines to improve the manageability, performance, and availability of a database.

2 methods:
- Horizontal Partitioning (Sharding)
  - we split the table data horizontally based on the range of values defined by the partition key
- Vertical Partitioning
  - we partition the data vertically based on columns. We divide tables into relatively smaller tables with few elements, and each part is present in a separate partition.

Sharding

the data held in each shard is unique and independent of the data held in other shards.
The justification for data sharding is that, after a certain point, it is cheaper and more feasible to scale horizontally by adding more machines than to scale it vertically by adding powerful servers

Partitioning Criteria

Hash-based
List-based
Range-based
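
A rough sketch of how each criterion could route a key to a shard. The shard names, range boundaries, region mapping, and fallback shard are all assumptions made for this example.

```python
import bisect
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # hypothetical shard names

def hash_shard(key: str) -> str:
    """Hash-based: a stable hash of the key, modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

# Range-based: each shard owns keys below its upper bound; the last shard takes the rest.
RANGE_UPPER_BOUNDS = [1000, 2000, 3000]

def range_shard(user_id: int) -> str:
    return SHARDS[bisect.bisect_right(RANGE_UPPER_BOUNDS, user_id)]

# List-based: an explicit mapping from a discrete value to a shard.
REGION_TO_SHARD = {"eu": "shard-0", "us": "shard-1", "apac": "shard-2"}

def list_shard(region: str) -> str:
    return REGION_TO_SHARD.get(region, "shard-3")  # fallback shard is an assumption
```

Note that plain modulo hashing remaps most keys when the shard count changes, which is one reason rebalancing is listed among the costs of sharding.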

When to use?

Quickly scale by adding more shards
Better performance as each machine is under less load
When more concurrent connections are required
Maintain data in distinct geo locations

Adv

Availability
Scalability
- data is distributed across multiple partitions
Query Performance

Disadv

Complexity
Joins across shards are expensive
Rebalancing
- If the data distribution is not uniform or there is a lot of load on a single shard, in such cases, we have to rebalance our shards so that the requests are as equally distributed among the shards as possible.

Database Federation

Federation (or functional partitioning) splits up databases by function. The federation architecture makes several distinct physical databases appear as one logical database to end-users.
All the components are tied together by one or more federal schemas. They specify the information that can be shared by the federation components and to provide a common basis for communication among them.

Federation also provides a cohesive, unified view of data derived from multiple sources. The data sources for federated systems can include databases and various other forms of structured and unstructured data.
Flexible data sharing
access heterogeneous data in a unified way
joins can be costly

Chapter 2

N-tier Architecture

N-tier architecture divides an application into logical layers and physical tiers.
Layers are a way to separate responsibilities and manage dependencies.
Each layer has a specific responsibility. A higher layer can use services in a lower layer, but not the other way around.

Tiers are physically separated, running on separate machines. A tier can call another tier directly, or use asynchronous messaging. Although each layer might be hosted in its own tier, that's not required. Several layers might be hosted on the same tier. Physically separating the tiers improves scalability and resiliency but adds latency from the additional network communication.

An N-tier architecture can be of two types:

In a closed layer architecture, a layer can only call the next layer immediately down.
In an open layer architecture, a layer can call any of the layers below it

Types

Single/1-Tier architecture

It is the simplest one as it is equivalent to running the application on a personal computer. All of the required components for an application to run are on a single application or server.

2-Tier architecture

In this architecture, the presentation layer runs on the client and communicates with a data store. There is no business logic layer or intermediate layer between client and server.

3-Tier architecture

3-Tier is widely used and consists of the following different layers:
- Presentation layer: Handles user interactions with the application.
- Business Logic layer: Accepts the data from the presentation layer, validates it as per business logic and passes it to the data layer.
- Data Access layer: Receives the data from the business layer and performs the necessary operation on the database.

Advantages

Improves availability
better security as layers can behave as firewalls
better scalability

Disadvantages

Increased complexity
increased network latency as no of tiers increases
expensive

Message Brokers

A message broker is software that enables applications, systems, and services to communicate with each other and exchange information.

interdependent services can "talk" with one another directly, even if they were written in different languages or implemented on different platforms.
Message brokers can validate, store, route, and deliver messages to the appropriate destinations. They serve as intermediaries between other applications, allowing senders to issue messages without knowing where the receivers are, whether or not they are active, or how many of them there are.
Decoupling

Models

Point-to-point messaging
- one-to-one relationships b/w sender and receiver
Publish-Subscribe messaging
- topics and subscriptions

Message Brokers vs Event Streaming

Message Brokers support:
- message queues
- pub/sub
- Eg. RabbitMQ, SNS, SQS
Event Streaming supports:
- pub/sub only
- scalable
- but weaker per-message delivery guarantees than a traditional message broker
- fast but not as feature rich as MB
- Eg. Kafka, Kinesis

Message Queues

service-to-service communication
async communication
large scale distributed systems

Features

Push or Pull Delivery
- Pull: continuously querying for new messages
- Push: consumer is notified when a message is available
FIFO
- Head of the queue is processed first
Schedule/Delay Delivery
- At a specific time
At-least-Once Delivery
Exactly-Once Delivery
Dead-Letter Queues
- better inspection for messages that can't be processed successfully

Backpressure

If queues start to grow significantly, the queue size can become larger than memory, resulting in cache misses, disk reads, and even slower performance.
Backpressure can help by limiting the queue size, thereby maintaining a high throughput rate and good response times for jobs already in the queue.
Once the queue fills up, clients get a server busy or HTTP 503 status code to try again later.
Clients can retry a request perhaps with exponential backoff strategy.
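
A minimal sketch of both halves, assuming a bounded in-process queue stands in for the server's job queue. The queue size of 2, the 202 status for accepted jobs, and the backoff parameters are illustrative choices, not values from any particular system.

```python
import queue

jobs = queue.Queue(maxsize=2)  # small bound chosen only for illustration

def submit(job):
    """Accept a job if there is room, else signal the client to retry later."""
    try:
        jobs.put_nowait(job)
        return 202  # accepted
    except queue.Full:
        return 503  # service unavailable: the queue is at capacity

def backoff_delays(base=0.1, factor=2.0, retries=4, cap=5.0):
    """Exponential backoff schedule in seconds (jitter omitted for brevity)."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]
```

A client that receives 503 would sleep for each delay in turn before retrying; production clients usually add random jitter so that many clients do not retry at the same instant.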

Pub-Sub

Eliminates Polling
Decoupled and independent scaling
Fanout
- msg is sent to a topic and then replicated and pushed to multiple endpoints
- parallel processing
Durability
eg. SNS
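
Fanout can be sketched with a toy in-memory broker. The `Broker` class and the topic names are hypothetical; it has no durability, networking, or parallelism, so it only shows the routing idea.

```python
from collections import defaultdict

class Broker:
    """Toy in-memory publish-subscribe broker."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Fanout: every subscriber of the topic receives the message.
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(message)
        return len(handlers)

broker = Broker()
emails, audit = [], []
broker.subscribe("order.created", emails.append)
broker.subscribe("order.created", audit.append)
delivered = broker.publish("order.created", {"order_id": 1})
```

The publisher only knows the topic name; adding a third subscriber requires no change on the publishing side.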

Monoliths and Microservices

Monoliths

self-contained & independent application
built as single unit
can perform all the tasks to satisfy business needs

Adv
- simple to develop or debug
- fast and reliable communication
- easy testing
- supports ACID transactions
Disadv
- maintenance becomes hard as system grows
- tightly coupled, hard to extend
- on each update, entire app is redeployed
- single bug, whole system down
- difficult to scale

Modular Monoliths

A Modular Monolith is an approach where we build and deploy a single application (that's the Monolith part), but we build it in a way that breaks up the code into independent modules for each of the features needed in our application.
reduces dependencies

Microservices

A microservices architecture consists of a collection of small, autonomous services where each service is self-contained and should implement a single business capability.

Each service has a separate codebase, which can be managed by a small development team. Services can be deployed independently and a team can update an existing service without rebuilding and redeploying the entire application.
loosely coupled
small but focused
resilience and FT
highly maintainable

Disadvantages

Complexity of a distributed system
Testing is more difficult
expensive to maintain
network congestion and latency

Best practices

services should be loosely coupled with high functional cohesion
services should communicate through well-designed APIs
data storage should be private to the service that owns the data
ensure API changes are backward compatible

Service-Oriented Architecture (SOA)

Service-oriented architecture (SOA) defines a way to make software components reusable via service interfaces.
These interfaces utilize common communication standards and focus on maximizing application service reusability whereas microservices are built as a collection of various smallest independent service units focused on team autonomy and decoupling.

So, which one and when?

Each has its own advantages and disadvantages.
Advised to start with monolith when building a new system or ask ourselves some questions:
- Is team too large to work effectively on a shared codebase?
- Are teams blocked on other teams?
- Is business mature enough to use microservices?

Tip

We frequently draw inspiration from companies such as Netflix and their use of microservices, but we overlook the fact that we are not Netflix. They went through a lot of iterations and models before they had a market-ready solution, and this architecture became acceptable for them when they identified and solved the problem they were trying to tackle.

Microservices are solutions to complex concerns and if your business doesn't have complex issues, you don't need them.

EDA (Event-Driven Architecture)

Event-Driven Architecture (EDA) is about using events as a way to communicate within a system.
The publisher is unaware of who is consuming an event and the consumers are unaware of each other. Event-Driven Architecture is simply a way of achieving loose coupling between services within a system.

What is an event?

An event is a data point that represents state changes in a system. It doesn't specify what should happen and how the change should modify the system, it only notifies the system of a particular state change. When a user makes an action, they trigger an event.

Common use cases:
- Sagas
- Pub-Sub
- Fanout and parallel processing
- metadata and metrics

Event Sourcing

Instead of storing just the current state of the data in a domain, use an append-only store to record the full series of actions taken on that data. The store acts as the system of record.

Event sourcing is about using events as a state, which is a different approach to storing data. Rather than storing the current state, we're instead going to be storing events.
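
A minimal sketch of the idea, assuming a hypothetical bank-account domain with two event types. The list plays the role of the append-only store, and the current state is derived by replaying it.

```python
events = []  # append-only store; the system of record

def record(event_type, amount):
    events.append({"type": event_type, "amount": amount})

def current_balance(event_log):
    """Rebuild state by replaying every event from the beginning."""
    balance = 0
    for event in event_log:
        if event["type"] == "deposited":
            balance += event["amount"]
        elif event["type"] == "withdrawn":
            balance -= event["amount"]
    return balance

record("deposited", 100)
record("withdrawn", 30)
record("deposited", 5)
```

Replaying a prefix of the log, such as `events[:2]`, yields the state at that earlier point, which is what makes audit trails straightforward with this approach.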

Advantages

Excellent for real-time data reporting
Great for fail safety as data can be reconstituted from the event source
Preferred way of achieving audit logs functionality for high compliance system

Disadvantages

Requires an extremely efficient network infrastructure
Requires reliable way to control message formats, such as schema registry

Command and Query Responsibility Segregation (CQRS)

an architectural pattern that divides a system's actions into commands and queries.
In CQRS, a command is an instruction, a directive to perform a specific task. It is an intention to change something and doesn't return a value, only an indication of success or failure. And, a query is a request for information that doesn't change the system's state or cause any side effects.
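
The split can be sketched with two small classes. `TodoWriteModel` and `TodoReadModel` are hypothetical names, and sharing one in-memory store is a shortcut for this example; CQRS systems often keep a separate read store that is updated from the write side.

```python
class TodoWriteModel:
    """Command side: changes state and reports only success or failure."""
    def __init__(self):
        self._items = {}
        self._next_id = 1

    def add_todo(self, title):
        if not title:
            return False
        self._items[self._next_id] = {"title": title, "done": False}
        self._next_id += 1
        return True

    def complete_todo(self, todo_id):
        if todo_id not in self._items:
            return False
        self._items[todo_id]["done"] = True
        return True


class TodoReadModel:
    """Query side: returns data and never modifies state."""
    def __init__(self, write_model):
        # Shared store keeps the sketch short (an assumption, see the lead-in).
        self._source = write_model._items

    def open_titles(self):
        return [item["title"] for item in self._source.values() if not item["done"]]
```

Commands return only a boolean, while the query returns data and has no side effects, mirroring the definitions above.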

Tip

The core principle of CQRS is the separation of commands and queries. They perform fundamentally different roles within a system, and separating them means that each can be optimized as needed, which distributed systems can really benefit from.

decoupling
independent scaling of read and write workloads
But:
- more complex app design
- increased system maintenance efforts

API Gateway

The API Gateway is an API management tool that sits between a client and a collection of backend services. It is a single entry point into a system that encapsulates the internal system architecture and provides an API that is tailored to each client.
It also has other responsibilities such as authentication, monitoring, load balancing, caching, throttling, logging, etc.

We need it because microservices provide fine-grained APIs ie the clients need to interact with multiple services. So, API GW can provide a single entry point for all clients with some additional features and better management.

Features

Authentication and Authorization
Service Discovery
Caching
Security
Reverse Proxy
Logging, Tracing
IP whitelisting or blacklisting

Advantages

Encapsulates the internal structure of an API.
Provides a centralized view of the API.
Simplifies the client code.
Monitoring, analytics, tracing, and other such features.

Disadvantages

Possible single point of failure.
Might impact performance.
Can become a bottleneck if not scaled properly.
Configuration can be challenging.

Backend for Frontend (BFF) Pattern

In the Backend For Frontend (BFF) pattern, we create separate backend services to be consumed by specific frontend applications or interfaces. This pattern is useful when we want to avoid customizing a single backend for multiple interfaces.

Also, sometimes the output of data returned by the microservices to the front end is not in the exact format or filtered as needed by the front end. To solve this issue, the frontend should have some logic to reformat the data, and therefore, we can use BFF to shift some of this logic to the intermediate layer.

Tip

The primary function of the backend for the frontend pattern is to get the required data from the appropriate service, format the data, and send it to the frontend.

Eg. GraphQL

When to use BFF?

optimize backend for the requirements of a specific client
Customizations are made to a general-purpose backend to accommodate multiple interfaces (like imdb.com and m.imdb.com)

REST, GraphQL, gRPC

API

API stands for Application Programming Interface. It is a set of definitions and protocols for building and integrating application software.

If we want to interact with a computer or system to retrieve information or perform a function, an API helps us communicate what we want to that system so it can understand and fulfill the request.

REST

Representational State Transfer
fundamental unit: resource

Constraints

Uniform Interface
Client-Server
Stateless (no client context shall be stored on the server between requests)
Cacheable (every response should include whether the response is cacheable or not)

HTTP Verbs

HTTP defines a set of request methods to indicate the desired action to be performed for a given resource.

GET
POST
PUT
DELETE
PATCH

HTTP Response Codes

5 classes:

1xx: Informational responses
2xx: Successful responses
3xx: Redirection responses
4xx: Client error responses
5xx: Server error responses
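
Since the class is determined by the first digit, it can be derived with integer division. The helper name and the labels below are assumptions for this sketch.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}

def status_class(code: int) -> str:
    """Map an HTTP status code to its class via the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]
```

For example, `status_class(404)` falls in the client error class and `status_class(503)` in the server error class.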

Advantages

Simple and easy to understand
Flexible and portable
Good caching support
Client and server are decoupled

Disadvantages

Over-fetching of data

GraphQL

query language and server-side runtime for APIs that gives clients exactly the data they request and no more
by Facebook
designed to make APIs fast, flexible
add or deprecate fields without impacting existing queries
Fundamental unit: query
Single url endpoint

Concepts

Schema
- A GraphQL schema describes the functionality clients can utilize once they connect to the GraphQL server.
Queries
- A query is a request made by the client. It can consist of fields and arguments for the query.
Resolvers
- Resolver is a collection of functions that generate responses for a GraphQL query. In simple terms, a resolver acts as a GraphQL query handler.
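
The toy below illustrates the three concepts without using a GraphQL library: the keys of `RESOLVERS` act as a stand-in schema, the list of requested fields stands in for a parsed query, and each function is a resolver. All names and the data are hypothetical, and real GraphQL servers parse query strings and support nesting and arguments.

```python
# Hypothetical data source.
USERS = {1: {"id": 1, "name": "Ada", "email": "ada@example.com", "age": 36}}

# Resolvers: one function per exposed field, each producing that field's value.
RESOLVERS = {
    "id": lambda user: user["id"],
    "name": lambda user: user["name"],
    "email": lambda user: user["email"],
}

def execute(user_id, requested_fields):
    """Resolve only the requested fields; fields outside the schema are rejected."""
    user = USERS[user_id]
    unknown = [f for f in requested_fields if f not in RESOLVERS]
    if unknown:
        raise KeyError(f"fields not in schema: {unknown}")
    return {field: RESOLVERS[field](user) for field in requested_fields}
```

Asking for `["name"]` returns only the name, which is the "no over-fetching" property; `age` exists in the data but is not exposed because no resolver is defined for it.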

Advantages

Eliminates over-fetching of data
Strongly defined schema

Disadvantages

Shifts complexity to server side
Caching becomes hard

Use-Cases

reducing app bandwidth usage as we query multiple resources in a single query
when working with graph-like data model

gRPC

by Google
high performance Remote Procedure Call framework that can run in any environment
It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking, authentication and much more.

Concepts

Protocol Buffers
- provide a language and platform-neutral extensible mechanism for serializing structured data in a forward and backward-compatible way
- It's like JSON, except it's smaller and faster, and it generates native language bindings.
Service Definition
- gRPC is based on the idea of defining a service and specifying the methods that can be called remotely with their parameters and return types

Advantages

Lightweight and efficient
High Performance
Bi-directional streaming

Disadvantages

Limited browser support
Steeper learning curve

Use-Cases

Real-time communication via bi-dir streaming
efficient inter-service communication in microservices
low latency and high throughput communication

Type	Coupling	Chattiness	Performance	Complexity	Caching	Codegen	Discoverability	Versioning
REST	Low	High	Good	Medium	Great	Bad	Good	Easy
GraphQL	Medium	Low	Good	High	Custom	Good	Good	Custom
gRPC	High	Medium	Great	Low	Custom	Great	Bad	Hard

Long Polling, Websockets, Server-Sent Events (SSE)

Web applications were initially developed around a client-server model, where the web client is always the initiator of transactions like requesting data from the server. Thus, there was no mechanism for the server to independently send, or push, data to the client without the client first making a request. Let's discuss some approaches to overcome this problem.

Long Polling

HTTP Long polling is a technique used to push information to a client as soon as possible from the server. Because the client keeps a request open, the server can respond as soon as new data is available instead of waiting for the client's next poll.
In Long polling, the server does not close the connection once it receives a request from the client. Instead, the server responds only if any new message is available or a timeout threshold is reached.

Working

Let's understand how long polling works:

The client makes an initial request and waits for a response.
The server receives the request and delays sending anything until an update is available.
Once an update is available, the response is sent to the client.
The client receives the response and makes a new request immediately or after some defined interval to establish a connection again.
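
The steps above can be simulated in a single process, with a queue standing in for the server's source of updates and a timer standing in for an update arriving later. The function names and the use of 204 for "no update before the timeout" are assumptions for this sketch; no real HTTP is involved.

```python
import queue
import threading

updates = queue.Queue()  # stands in for the server's source of new messages

def handle_poll(timeout):
    """Server side: hold the request until an update arrives or the timeout expires."""
    try:
        return {"status": 200, "data": updates.get(timeout=timeout)}
    except queue.Empty:
        return {"status": 204, "data": None}  # nothing new; the client should poll again

def long_poll_client(max_polls, timeout):
    """Client side: re-issue the request as soon as each response comes back."""
    received = []
    for _ in range(max_polls):
        response = handle_poll(timeout)
        if response["status"] == 200:
            received.append(response["data"])
    return received

def publish_later(message, delay):
    """Simulate an update becoming available on the server after `delay` seconds."""
    timer = threading.Timer(delay, updates.put, args=(message,))
    timer.start()
    return timer
```

The blocking `updates.get(timeout=...)` call is the part that models the server holding the connection open rather than answering immediately.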

Advantages

easy to implement, good for small-scale
nearly universally supported

Disadvantages

not scalable
creates a new conn each time, which can be intensive on the server
reliable message ordering can be an issue for multiple requests

WebSockets

WebSocket provides full-duplex communication channels over a single TCP connection. It is a persistent connection between a client and a server that both parties can use to start sending data at any time.

The client establishes a WebSocket connection through a process known as the WebSocket handshake. If the process succeeds, then the server and client can exchange data in both directions at any time. The WebSocket protocol enables the communication between a client and a server with lower overheads, facilitating real-time data transfer from and to the server.
Initial handshake causes HTTP upgrade to ws://
Connection closed when either one decides.

Advantages

full duplex async messaging
lightweight for both client and server

Disadvantages

more complex than HTTP
not optimized for streaming audio and video data. A technology like WebRTC is better suited in these scenarios.
Terminated connections aren't automatically recovered.
are stateful hence hard to use in large-scale systems as we need to share connection state across servers

Server-Sent Events (SSE)

Server-Sent Events (SSE) is a way of establishing long-term communication between client and server that enables the server to proactively push data to the client.

It is unidirectional, meaning once the client sends the request it can only receive the responses without the ability to send new requests over the same connection.
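
On the wire, the server sends a `text/event-stream` response made of text fields such as `data:`, `event:` and `id:`, with a blank line ending each event. The sketch below only builds those strings; the function names and the `update` event name are hypothetical, and writing the output to an open HTTP response is left out.

```python
def format_sse(data, event=None, event_id=None):
    """Encode one message in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload gets its own "data:" field.
    for part in str(data).split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"

def event_stream(messages):
    """Generator a server could write to an open response, one event per message."""
    for i, message in enumerate(messages, start=1):
        yield format_sse(message, event="update", event_id=i)
```

Because the format is plain text, binary payloads have to be encoded (for example as Base64) before being sent, which is the limitation noted under Disadvantages.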

Advantages

Simple to implement and use for both client and server.
Supported by most browsers.

Disadvantages

Unidirectional nature can be limiting.
Limitation for the maximum number of open connections.
Does not support binary data