05 - Application Layer
Problem 1: Short Answer Questions
-
(a) The available bandwidth on transcontinental links is increasing every year, yet the round-trip latency is starting to approach the speed of light limits. What are the implications of this for applications such as HTTP?
View Answer
In short: The increasing bandwidth-delay product means applications become more latency-limited than bandwidth-limited. HTTP performance is increasingly constrained by round-trip time rather than download speed, making latency optimization (connection reuse, pipelining, multiplexing) more critical than ever.
Elaboration:
The Bandwidth-Delay Product Problem:
Bandwidth-Delay Product = Bandwidth × RTT

Examples:
- 1 Gbps link, 100 ms RTT → 100 Mb = 12.5 MB
- 10 Gbps link, 100 ms RTT → 1 Gb = 125 MB

This is the amount of "data in flight" on the link at any moment.

What This Means for HTTP:
-
Latency becomes the bottleneck
Scenario: download a 1 MB file

Slow link (10 Mbps, 50 ms RTT):
  Time = TCP handshake (50 ms) + HTTP request (50 ms) + transmission (800 ms) ≈ 900 ms
  Dominated by transmission time

Fast link (1 Gbps, 100 ms RTT):
  Time = TCP handshake (100 ms) + HTTP request (100 ms) + transmission (8 ms) ≈ 208 ms
  Dominated by latency/round trips!
-
Connection overhead matters more
With HTTP/1.0 (new connection per object), for 10 objects:
  10 × TCP handshake (1 RTT each) = 10 RTTs of pure setup
  At 100 ms RTT, that is a full second spent on handshakes alone!
  Transmission time for small objects becomes negligible
-
Why bandwidth improvements help less
Going from 100 Mbps to 1 Gbps (10× improvement):
  Saves ~72 ms per 1 MB (80 ms → 8 ms of transmission time)
  But the RTT is still ~100 ms: the speed of light in fiber already puts a
  floor of several tens of milliseconds on a transcontinental round trip
  Latency is the real constraint
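A minimal sketch of this arithmetic in Python (the link parameters are the illustrative values used above, not measurements):

```python
# Estimate HTTP fetch time as connection-setup RTTs plus transmission time.
# Assumes ~2 RTTs of setup: 1 for the TCP handshake, 1 for request/first byte.

def fetch_time(size_bits: float, bandwidth_bps: float, rtt_s: float,
               setup_rtts: float = 2.0) -> float:
    return setup_rtts * rtt_s + size_bits / bandwidth_bps

MB = 8_000_000  # 1 MB expressed in bits
print(f"10 Mbps, 50 ms RTT: {fetch_time(MB, 10e6, 0.050) * 1000:.0f} ms")  # ~900 ms
print(f"1 Gbps, 100 ms RTT: {fetch_time(MB, 1e9, 0.100) * 1000:.0f} ms")   # ~208 ms
```

Raising the bandwidth another 10× changes only the last term, which is already just 8 ms; the RTT term is untouched.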
Implications for HTTP Applications:
- Connection Reuse Critical: Persistent connections (HTTP/1.1) become essential
- Pipelining/Multiplexing Needed: HTTP/2, HTTP/3 multiplexing over single connections
- DNS Caching Essential: DNS lookups add 50-200 ms per domain
- Geographic Distribution: Content delivery networks (CDNs) needed to reduce latency
- Protocol Overhead Matters: TCP/TLS handshakes become dominant cost
- Parallel Connections Help Less: each extra connection pays its own handshake RTTs without lowering the latency floor
Real World Example:
Downloading a webpage with 50 small objects from one server:

Old approach (HTTP/1.0, 6 parallel connections):
  ~⌈50/6⌉ rounds × 2 RTTs per object × 100 ms ≈ 1.8 seconds
  Bandwidth barely used

New approach (HTTP/2 multiplexing, single connection):
  ~200-300 ms (1-2 RTTs of setup, then all objects multiplexed together)
  Same bandwidth, much faster

Conclusion:
As bandwidth increases, applications become latency-bound rather than bandwidth-bound. HTTP design must minimize RTTs through connection multiplexing, protocol efficiency, and geographic proximity rather than expecting bandwidth improvements to help.
-
(b) What’s an authoritative name server? What part of the namespace hierarchy is the City University of New York (CUNY) name server responsible for? Briefly explain.
View Answer
In short: An authoritative name server is the official source for DNS records in a particular zone of the namespace hierarchy. CUNY’s name server (cuny.edu) is responsible for the cuny.edu domain zone, containing all hosts within CUNY (qc.cuny.edu, hunter.cuny.edu, etc.).
Elaboration:
What is an Authoritative Name Server?
An authoritative name server:
- Maintains the official DNS records for a specific zone
- Responds to queries about hosts in that zone
- Is the "source of truth" for that zone
- Does NOT perform recursive queries (typically)

DNS Hierarchy:
Root zone (.)
├── .edu zone (delegated by the root)
│   ├── cuny.edu zone (delegated by .edu)
│   ├── mit.edu zone
│   └── stanford.edu zone
└── .com zone
    ├── google.com zone
    └── amazon.com zone

CUNY’s Responsibility:
CUNY's authoritative nameserver (e.g., ns.cuny.edu or ns1.cuny.edu):

Responsible zone: cuny.edu

Contains records for:
- qc.cuny.edu (Queens College)
- hunter.cuny.edu (Hunter College)
- baruch.cuny.edu (Baruch College)
- host1.cuny.edu, host2.cuny.edu, ... any host *.cuny.edu

Does NOT contain:
- mit.edu records (MIT's server handles those)
- google.com records (Google's server handles those)
- Any hosts outside cuny.edu

How it Works:
Query: what is the IP of qc.cuny.edu?

1. Client → root nameserver: "Who handles .edu?"
   Root: "Ask the .edu TLD nameserver"
2. Client → .edu nameserver: "Who handles cuny.edu?"
   .edu: "Ask ns.cuny.edu (CUNY's nameserver)"
3. Client → ns.cuny.edu (authoritative): "IP of qc.cuny.edu?"
   CUNY: "It's 136.48.100.1" (authoritative answer)

Key Characteristics:
| Aspect | Authoritative | Non-Authoritative (Resolver) |
|---|---|---|
| Source | Official zone records | Cached records |
| Updates | Maintained by zone admin | Cached with TTL |
| Responsibility | One zone only | Many zones (via caching) |
| Records | Complete for zone | Partial (whatever was queried before) |

Conclusion:
CUNY’s authoritative nameserver manages the cuny.edu zone and knows the official IP addresses for all hosts within that zone (qc.cuny.edu, hunter.cuny.edu, etc.). It is the authoritative source for these records in the DNS hierarchy.
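A small sketch of the distinction, using the third-party dnspython package (an assumption: `pip install dnspython`; the server IP below is a placeholder for CUNY's actual nameserver):

```python
import dns.flags
import dns.message
import dns.query
import dns.resolver

# Recursive lookup through whatever resolver the system is configured with:
answer = dns.resolver.resolve("qc.cuny.edu", "A")
print([rr.address for rr in answer])

# Direct query to a specific server: the AA (Authoritative Answer) flag is
# set only when that server is authoritative for the zone being asked about.
query = dns.message.make_query("qc.cuny.edu", "A")
response = dns.query.udp(query, "192.0.2.53", timeout=2)  # placeholder NS IP
print("authoritative:", bool(response.flags & dns.flags.AA))
```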
-
(c) Suppose a user requests a Web page that consists of some text and two images. Will the client send one request and receive 3 response messages? Explain.
View Answer
In short: No. Regardless of HTTP version, the client sends 3 requests (one for the HTML, one for each image) and receives 3 responses. What the HTTP version and connection type change is how many TCP connections are used and how the exchanges are ordered.
Elaboration:
HTTP/1.0 with Non-Persistent Connections:
Web page structure:
- HTML document (text)
- Image 1
- Image 2

Client behavior:
1. Send HTTP GET request for the HTML
   Receive response #1 (HTML)
2. Parse the HTML, find the <img> references
   Send HTTP GET request for Image 1
   Receive response #2 (Image 1)
3. Send HTTP GET request for Image 2
   Receive response #3 (Image 2)

Result: 3 requests, 3 responses ✓

HTTP/1.1 with Persistent Connections:
Client behavior:
1. Send GET for the HTML
   Receive response #1
2. Same TCP connection still open!
   Send GET for Image 1
   Receive response #2
3. Same TCP connection still open!
   Send GET for Image 2
   Receive response #3

Result: 3 requests, 3 responses
But: all over the SAME TCP connection (more efficient)

HTTP/1.1 with Pipelining:
Client behavior (if pipelining is enabled):
1. Send GET request for the HTML
2. Send GET request for Image 1 (without waiting for a response)
3. Send GET request for Image 2 (without waiting for a response)

Then receive:
- Response #1 (HTML)
- Response #2 (Image 1)
- Response #3 (Image 2)

Result: 3 requests, 3 responses
But: requests are pipelined, responses arrive in order

HTTP/2 with Multiplexing:
Client behavior:
1. Single TCP connection
2. Send frames for the HTML request
3. Send frames for the Image 1 request
4. Send frames for the Image 2 request

Server can interleave responses:
- HTML chunks, Image 1 chunks, Image 2 chunks
- All on the same connection, simultaneously

Result: 3 responses, but not necessarily discrete "messages"

The Key Point:
| Aspect | HTTP/1.0 | HTTP/1.1 Persistent | HTTP/2 |
|---|---|---|---|
| Requests | 3 (separate connections) | 3 (same connection) | 3 (same connection) |
| Responses | 3 (separate) | 3 (same connection) | 3 (multiplexed) |
| Sequential? | Yes | Yes (default) | No (interleaved) |
| Efficiency | Poor | Good | Excellent |

Conclusion:
The answer to the question as asked is NO: the client sends three requests (one per object) and receives three responses. How those exchanges happen depends on the HTTP version:
- HTTP/1.0: Three separate TCP connections, three distinct responses
- HTTP/1.1+: One persistent connection, three responses in order
- HTTP/2: One connection, three responses multiplexed together
-
(d) Can two distinct Web pages from the same origin server, e.g., www.mit.edu/research.html and www.mit.edu/students.html, be sent over the same persistent connection? Why or why not?
View Answer
In short: YES. HTTP/1.1 persistent connections allow multiple requests and responses to be exchanged over the same TCP connection. Two distinct web pages from the same origin server can absolutely be sent over the same persistent connection.
Elaboration:
How Persistent Connections Work:
Traditional (HTTP/1.0, non-persistent):
1. TCP connection established
2. GET /research.html
3. Receive response (research.html)
4. TCP connection closed
5. NEW TCP connection established
6. GET /students.html
7. Receive response (students.html)
8. TCP connection closed

With HTTP/1.1 Persistent Connection:
1. TCP connection established (SYN, SYN-ACK, ACK)
2. GET /research.html
   Receive response (research.html)
   Connection stays OPEN
3. GET /students.html (same TCP connection!)
   Receive response (students.html)
   Connection stays OPEN
4. TCP connection closed (on idle timeout or explicit close)

Benefits:
| Benefit | Impact |
|---|---|
| No repeated TCP handshake | Saves 1 RTT per subsequent request |
| No repeated TLS negotiation | Saves ~2 more RTTs per request if HTTPS |
| Connection warm-up | Congestion window keeps growing |
| Network efficiency | Better link utilization |

Example Timeline:
Time 0 ms:   Send GET /research.html
Time 50 ms:  Receive 200 OK + research.html
Time 60 ms:  Send GET /students.html (same connection)
Time 110 ms: Receive 200 OK + students.html
Total time: 110 ms

If separate connections:
Connection 1: TCP handshake (50 ms) + request/response (50 ms) = 100 ms
Connection 2: TCP handshake (50 ms) + request/response (50 ms) = 100 ms
Total: 200 ms (90 ms extra!)

HTTP Request Format (same connection):
GET /research.html HTTP/1.1
Host: www.mit.edu
Connection: keep-alive

[Server sends response; connection remains open]

GET /students.html HTTP/1.1
Host: www.mit.edu
Connection: keep-alive

[Server sends response; connection remains open]

Conditions:
Persistent connections work when:
1. Both pages come from the SAME server (www.mit.edu)
2. HTTP/1.1 is used (the default in modern browsers)
3. The Connection header is not set to "close"
4. Content-Length or chunked encoding frames each response
5. No error closes the connection (e.g., some 500/503 responses)

Conclusion:
YES, two distinct web pages from the same origin server can be sent over the same persistent connection. This is the default behavior in HTTP/1.1, and it significantly improves performance by eliminating TCP handshake overhead.
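A minimal sketch with Python's standard http.client, fetching both pages over one persistent connection (paths from the question; any server that honors keep-alive behaves this way):

```python
import http.client

conn = http.client.HTTPSConnection("www.mit.edu")  # one TCP + TLS setup

for path in ("/research.html", "/students.html"):
    conn.request("GET", path)   # HTTP/1.1 keep-alive is the default
    resp = conn.getresponse()
    body = resp.read()          # drain the body so the socket can be reused
    print(path, resp.status, len(body), "bytes")

conn.close()
```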
-
(e) Can two distinct Web pages from different origin servers, e.g., www.mit.edu/research.html and www.cuny.edu/students.html, be sent over the same persistent connection? Why or why not?
View Answer
In short: NO. Persistent connections are specific to a single server. A connection to www.mit.edu cannot be reused for requests to www.cuny.edu. The client must establish a separate TCP connection to each origin server.
Elaboration:
Why Not?
HTTP/1.1 persistent connections are tied to:
1. Host (e.g., www.mit.edu)
2. Port (e.g., 80 for HTTP)
3. Protocol (HTTP vs HTTPS)

A connection to www.mit.edu:80 is separate from one to www.cuny.edu:80.
It cannot be reused across different servers.

TCP Connection Mechanics:
A TCP connection is identified by the 5-tuple:
- Source IP
- Source port
- Destination IP (www.mit.edu = 128.30.2.36)
- Destination port (80)
- Protocol (TCP)

A connection to www.cuny.edu (136.48.0.1) would have:
- Source IP (same)
- Source port (different)
- Destination IP (different!) ← DIFFERENT SERVER
- Destination port (80)
- Protocol (TCP)

A completely different connection.

HTTP Request Format (different servers):
← Connection to www.mit.edu
GET /research.html HTTP/1.1
Host: www.mit.edu
[Response received]
[Connection closed, or kept open for more mit.edu requests]

← NEW connection to www.cuny.edu
GET /students.html HTTP/1.1
Host: www.cuny.edu
[Response received]

Timeline Comparison:
Same server (www.mit.edu):
Time 0:   GET /research.html
Time 50:  Receive response
Time 60:  GET /students.html (same TCP connection)
Time 110: Receive response
Total: 110 ms

Different servers (www.mit.edu vs www.cuny.edu):
Time 0:   TCP handshake to mit.edu
Time 50:  GET /research.html
Time 100: Receive response
Time 101: TCP handshake to cuny.edu (NEW connection)
Time 151: GET /students.html
Time 201: Receive response
Total: 201 ms (an extra TCP handshake!)

Exception: HTTP Proxies
A proxy can maintain persistent connections to multiple servers:

Browser → Proxy: GET www.mit.edu/research.html
Proxy ←→ www.mit.edu (connection 1)
Browser → Proxy: GET www.cuny.edu/students.html
Proxy ←→ www.cuny.edu (connection 2)

But the browser itself still connects to just ONE proxy;
the proxy manages the connections to the multiple servers.

Modern Workaround: CDNs
Instead of different servers:
- Both pages served from the same CDN edge server
- Same origin (the CDN node) from the browser's point of view
- Persistent connection works

Browser → CDN node for mit.edu content
Browser → same CDN node for cuny.edu content
The connection can be reused within the CDN.

Conclusion:
NO, two web pages from different origin servers cannot use the same persistent connection. Each server requires a separate TCP connection. This is a fundamental limitation of TCP (which is server-specific) and HTTP (which respects TCP connection boundaries).
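A toy sketch of why reuse is impossible across origins: a browser-style connection pool is keyed by (host, port), so the two sites can never map to the same socket (the pool below is illustrative, not a real browser's implementation):

```python
import http.client

pool: dict[tuple[str, int], http.client.HTTPSConnection] = {}

def get_connection(host: str, port: int = 443) -> http.client.HTTPSConnection:
    # Reuse only when (host, port) matches an existing entry.
    key = (host, port)
    if key not in pool:
        pool[key] = http.client.HTTPSConnection(host, port)  # fresh TCP + TLS
    return pool[key]

c1 = get_connection("www.mit.edu")
c2 = get_connection("www.cuny.edu")
print(c1 is c2)  # False: different origins, necessarily different connections
```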
-
(f) With nonpersistent connections between the browser and the origin server, is it possible for a single TCP segment to carry two distinct HTTP request messages? Explain.
View Answer
In short: NO. With nonpersistent connections, each HTTP request requires its own TCP connection (3-way handshake, request, response, close). A TCP segment carries data from one connection only, so two HTTP requests would require two separate TCP connections and thus two separate segments.
Elaboration:
Understanding TCP Segments:
A TCP segment is the unit of data at the transport layer:
- Contains a TCP header + payload (HTTP data)
- Belongs to ONE TCP connection (identified by source/dest IP:port)

One segment = one TCP connection.
It cannot carry data from two different connections.

Nonpersistent Connection Model:
Request 1:
1. SYN (TCP handshake)
2. SYN-ACK
3. ACK
4. [TCP segment with HTTP GET request]
   Carries: GET /page1.html HTTP/1.0\r\n...
5. [Response received, connection closes]

Request 2:
6. NEW SYN (new TCP connection)
7. SYN-ACK
8. ACK
9. [NEW TCP segment with HTTP GET request]
   Carries: GET /page2.html HTTP/1.0\r\n...
10. [Response received, connection closes]

Why Not in One Segment?
Hypothesis: send both in one segment?

GET /page1.html HTTP/1.0\r\n...
GET /page2.html HTTP/1.0\r\n...

Problem 1: Which connection?
- In the nonpersistent model each request opens its own TCP connection
- A segment belongs to exactly one of those connections
- So the two requests can never share a segment

Problem 2: HTTP protocol expectation
- The server reads the first request: GET /page1.html
- It sends the response for page1
- The connection closes (nonpersistent)
- The second request is lost!

Problem 3: The requests aren't delimited
- How would the server know where one HTTP message ends?
- Without Content-Length or keep-alive, a message ends when the connection closes

What WOULD Work (but breaks the nonpersistent model):
If we COULD send two requests in one segment:

GET /page1.html HTTP/1.1\r\n
Host: server.com\r\n
Connection: keep-alive\r\n
Content-Length: 0\r\n
\r\n
GET /page2.html HTTP/1.1\r\n
Host: server.com\r\n
Content-Length: 0\r\n
\r\n

But this REQUIRES:
- A persistent connection (HTTP/1.1)
- Keep-alive behavior
- Proper message framing
- Which is NOT nonpersistent!

TCP and HTTP Constraints:
| Constraint | Implication |
|---|---|
| Nonpersistent = new TCP connection per request | Each request needs its own 3-way handshake |
| A TCP segment belongs to one connection | Can't mix requests from different connections |
| Connection close ends the message | A second request is lost when the connection closes |
| HTTP/1.0 (nonpersistent) has no framing | Can't delimit multiple requests |

Example Timeline:
Time 0:   SYN ——————→
Time 10:  ←———— SYN-ACK
Time 20:  ACK ——————→
Time 30:  GET request ——→   [one segment carries one HTTP request]
Time 80:  ←———— Response
Time 90:  FIN ——————→       [connection closes]
Time 91:  NEW SYN ———→      [new connection for the second request]
Time 101: ←——— SYN-ACK
Time 111: ACK ——————→
Time 121: GET request ——→   [different segment, different connection]
Time 171: ←———— Response

Conclusion:
NO. A TCP segment cannot carry two distinct HTTP request messages in a nonpersistent connection model because:
- Each request requires its own TCP connection
- A TCP segment belongs to exactly one connection
- Nonpersistent connections close after response, losing any additional data
- HTTP/1.0 has no framing mechanism to delimit multiple messages
This is why persistent connections (HTTP/1.1) were invented—to allow multiple requests over one connection and better utilize bandwidth.
-
(g) We know that a separate TCP connection is established for data transfer in FTP. Briefly describe the client and server communication to make this possible.
View Answer
In short: FTP uses two TCP connections: a control connection for commands and a data connection for file transfer. The client sends commands (USER, PASS, RETR, STOR) over the control connection, and the server establishes a data connection when needed, either in active mode (server initiates) or passive mode (client initiates).
Elaboration:
Two Connection Model:
FTP Client                         FTP Server

Control connection ←———————————→   control port 21
(client commands, server responses)

Data connection    ←———————————→   data port 20 (active) or a
(file data transfer)               negotiated port (passive)

Active Mode (Server-Initiated Data Connection):
Step 1: Control Connection Established
  Client → Server: TCP connection to port 21
  This persists for the entire FTP session

Step 2: User Authentication
  Client → Server (control): USER username\r\n
  Server → Client (control): 331 Password required\r\n
  Client → Server (control): PASS password\r\n
  Server → Client (control): 230 Login successful\r\n

Step 3: Announce Data Port, Then Issue Retrieve Command
  Client → Server (control): PORT h1,h2,h3,h4,p1,p2\r\n
  [Tells the server the client's IP and listening data port]
  Client → Server (control): RETR filename\r\n
  Server → Client (control): 150 Opening data connection\r\n

Step 4: Server Initiates Data Connection
  Server → Client: TCP connection from port 20 to the client's data port
  [Data transfer happens on this connection]
  [Transfer completes; the data connection closes]

Step 5: Server Notifies Completion
  Server → Client (control): 226 Transfer complete\r\n

Passive Mode (Client-Initiated Data Connection):
Motivation: firewalls/NAT often block incoming connections to the client

Steps 1-2: Control connection & authentication (same as active)

Step 3: Request Passive Mode
  Client → Server (control): PASV\r\n
  Server → Client (control): 227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)\r\n
  [Response contains the server's IP and a chosen port number]

Step 4: Client Initiates Data Connection
  Client → Server: TCP connection to the provided IP:port
  (The server was listening on that port)

Step 5: Issue Retrieve Command
  Client → Server (control): RETR filename\r\n
  Server → Client (control): 150 Opening data connection\r\n
  [Data transfer on the already-established data connection]
  [Transfer completes; the data connection closes]

Step 6: Completion Notification
  Server → Client (control): 226 Transfer complete\r\n

Active Mode Timeline:
Time 0:   Control connection (client port → server:21)
Time 10:  USER command sent
Time 20:  PASS command sent
Time 30:  PORT + RETR filename commands sent
Time 40:  ← Server initiates data connection (server:20 → client port X)
Time 50:  Data transfer begins
Time 500: Data transfer completes
Time 510: ← 226 Transfer complete (control connection)

Passive Mode Timeline:
Time 0:   Control connection (client port → server:21)
Time 10:  USER command sent
Time 20:  PASS command sent
Time 30:  PASV command sent
Time 40:  ← Server responds with a port number (e.g., 1234)
Time 50:  Data connection initiated (client → server:1234)
Time 60:  RETR filename command sent
Time 70:  Data transfer begins
Time 500: Data transfer completes
Time 510: ← 226 Transfer complete (control connection)

Command Examples on Control Connection:
| Command | Purpose | Response |
|---|---|---|
| USER | Provide username | 331 (need password) |
| PASS | Provide password | 230 (success) or 530 (fail) |
| RETR | Retrieve file | 150 (opening data), 226 (done) |
| STOR | Store file | 150 (opening data), 226 (done) |
| LIST | List directory | 150 (opening data), 226 (done) |
| QUIT | End session | 221 (goodbye) |

Why Separate Data Connection?
1. Protocol separation
   - Control: command/response (ASCII text, small)
   - Data: file transfer (binary, large)
2. Flexibility
   - Transfer multiple files without re-authenticating
   - Use different data rates
   - Resume interrupted transfers
3. Network efficiency
   - Control connection is lightweight
   - Data connection is optimized for throughput
4. Compatibility
   - Works with firewall rules
   - Active or passive mode depending on the network

Key Points:
- Control connection: always client → server:21 (persists)
- Data connection: separate, established per transfer
- Active: the server initiates the data connection from port 20
- Passive: the client initiates to a port the server announces
- Responses on the control connection report data-connection status

Conclusion:
FTP uses a control connection (to port 21) for commands and responses, and a separate data connection (port 20 for active, random port for passive) for actual file transfer. The client sends FTP commands over the control connection, and the server initiates (active mode) or accepts (passive mode) the data connection as needed. This separation allows efficient file transfer while maintaining session control and enabling features like authentication and error reporting.
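A sketch with Python's standard ftplib (host, credentials, and filenames are placeholders); ftplib speaks on the control connection and opens a fresh data connection for each transfer:

```python
from ftplib import FTP

ftp = FTP("ftp.example.com")         # control connection to port 21
ftp.login("username", "password")    # USER / PASS over the control connection
ftp.set_pasv(True)                   # PASV: the client opens the data connection

with open("local_copy.txt", "wb") as f:
    # RETR is sent on the control connection; the file bytes arrive on a
    # separate data connection created just for this transfer.
    ftp.retrbinary("RETR remote_file.txt", f.write)

ftp.quit()                           # QUIT on the control connection
```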
-
(h) Does a user e-mail agent upload an outgoing e-mail to a mail server using POP3/IMAP, or SMTP? Briefly explain.
View Answer
In short: SMTP is used to upload outgoing email. POP3 and IMAP are used only for downloading/retrieving received email. The mail client uses SMTP to send messages to the mail server, which then routes them to recipients.
Elaboration:
Three Separate Protocols:
User's Mail Client
   ↓ SMTP (port 25, 465, or 587): SEND outgoing email
Sender's Mail Server (SMTP server)
   ↓ routes to the recipient's mail server via SMTP
Recipient's Mail Server (POP3/IMAP server)
   ↓ POP3 or IMAP (port 110/995 or 143/993): RETRIEVE email
Recipient's Mail Client

SMTP (Simple Mail Transfer Protocol):
Purpose: SENDING email

Client workflow:
1. User composes email in a mail client (Outlook, Gmail, etc.)
2. User clicks "Send"
3. Mail client connects to the SMTP server (port 587 with TLS)
4. Client authenticates: AUTH LOGIN
5. Client sends the message:
   - MAIL FROM: sender@domain.com
   - RCPT TO: recipient@domain.com
   - DATA (message body)
6. Server accepts and routes the message
7. Connection closes

The server then:
- Looks up the recipient's mail server via DNS
- Connects to the recipient's SMTP server
- Delivers the message
- (May queue it if the recipient's server is unreachable)

POP3 (Post Office Protocol 3):
Purpose: RETRIEVING email (download-and-delete model)

Client workflow:
1. Mail client connects to the POP3 server (port 110, or 995 with TLS)
2. User authenticates with username/password
3. Server returns the list of messages
4. Client downloads the messages
5. Messages are deleted from the server (typically)
6. Connection closes

Characteristics:
- Intended for single-client access
- After download, email is usually removed from the server
- Not ideal for multiple devices

IMAP (Internet Message Access Protocol):
Purpose: RETRIEVING email (keep-on-server model)

Client workflow:
1. Mail client connects to the IMAP server (port 143, or 993 with TLS)
2. User authenticates
3. Server presents the folder structure (Inbox, Drafts, Sent, etc.)
4. Client can:
   - Preview messages without downloading them
   - Download specific messages
   - Delete, flag, and organize messages
   - Synchronize across devices
5. Messages stay on the server
6. Connection can remain open

Characteristics:
- Designed for multiple-client access
- Email remains on the server until explicitly deleted
- Great for access from multiple devices
- More bandwidth-efficient (selective download)

Complete Email Flow:
Alice sends email to Bob:

Step 1 (SMTP - send):
  Alice's client → SMTP server (mail.alice.com:587)
  Sends: alice@alice.com → bob@bob.com

Step 2 (SMTP - route):
  mail.alice.com → mail.bob.com (SMTP)
  Message transferred between servers

Step 3 (IMAP/POP3 - receive):
  Bob's client → mail.bob.com (IMAP port 993)
  Bob downloads/reads the message

Key Distinction:
| Protocol | Direction | Purpose | Ports |
|---|---|---|---|
| SMTP | Client → Server | SEND email | 25, 465, 587 |
| POP3 | Client ← Server | RETRIEVE email | 110, 995 |
| IMAP | Client ← Server | RETRIEVE email | 143, 993 |

Conclusion:
SMTP is used for uploading/sending outgoing email to the mail server. POP3 and IMAP are used for downloading received email from the mail server. These are distinct protocols with different purposes in the email infrastructure.
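A minimal sketch of the sending side with Python's standard smtplib (server, port, and credentials are placeholders):

```python
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Hello"
msg.set_content("Uploaded with SMTP; Bob will fetch it with POP3 or IMAP.")

with smtplib.SMTP("smtp.example.com", 587) as smtp:  # mail submission port
    smtp.starttls()                                  # upgrade to TLS
    smtp.login("alice@example.com", "app-password")
    smtp.send_message(msg)                           # MAIL FROM / RCPT TO / DATA
```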
-
(i) What’s a MIME type? What is it used for? Briefly explain.
View Answer
In short: A MIME type (Multipurpose Internet Mail Extensions) is a standard label that identifies the format of data (e.g., text/plain, image/jpeg, application/pdf). It tells systems how to interpret and display the content, enabling proper handling of diverse file types across networks.
Elaboration:
What is MIME?
MIME type syntax: type/subtype

Examples:
- text/plain (plain text)
- text/html (HTML document)
- image/jpeg (JPEG image)
- image/png (PNG image)
- application/pdf (PDF document)
- application/json (JSON data)
- audio/mpeg (MP3 audio)
- video/mp4 (MP4 video)
- application/zip (ZIP archive)

Purpose:
Without MIME types:
- A system receives a file called "document"
- Is it text? Binary? An image? An archive?
- How should it be displayed?
- What program should open it?
- Ambiguous and error-prone

With MIME types:
- The server sends "Content-Type: application/pdf"
- The client knows it's a PDF
- The client launches a PDF reader
- Content is displayed correctly

Common MIME Types:
| Type | Common Subtypes | Purpose |
|---|---|---|
| text | plain, html, css, javascript | Text-based files |
| image | jpeg, png, gif, svg+xml | Image files |
| audio | mpeg, wav, ogg | Audio files |
| video | mp4, webm, ogg | Video files |
| application | pdf, json, xml, zip, octet-stream | Data/binary files |
| multipart | form-data, mixed, related | Multiple parts in one message |

How MIME Works in HTTP:
HTTP Response:

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 1234

<!DOCTYPE html>
<html> ...

Browser sees "text/html" → renders it as an HTML webpage.

HTTP Response:

HTTP/1.1 200 OK
Content-Type: application/pdf
Content-Length: 50000

[binary PDF data]

Browser sees "application/pdf" → launches the PDF viewer.
MIME in Email:
Email with an attachment:

From: alice@example.com
To: bob@example.com
Subject: Photos
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="boundary123"

--boundary123
Content-Type: text/plain

Here are the photos you requested.
--boundary123
Content-Type: image/jpeg
Content-Transfer-Encoding: base64

[binary image data encoded as base64]
--boundary123--

Mail client:
- Reads “multipart/mixed”
- Recognizes multiple parts
- Displays text portion
- Saves image/jpeg as attachment with .jpg extension
MIME with Parameters:
Content-Type: text/plain; charset=utf-8
  ├─ Type: text
  ├─ Subtype: plain
  └─ Parameter: charset=utf-8 (UTF-8 encoding)

Content-Type: image/jpeg; name="photo.jpg"
  ├─ Type: image
  ├─ Subtype: jpeg
  └─ Parameter: filename for download

Content-Type: multipart/form-data; boundary=----WebKitFormBoundary
  ├─ Type: multipart
  ├─ Subtype: form-data
  └─ Parameter: boundary delimiter for the parts

Why MIME Matters:
-
Content Negotiation
- Server can offer multiple formats
- Client requests preferred format
- “Accept: text/html, application/json”
-
Charset Handling
- “text/html; charset=utf-8”
- Ensures proper character encoding
- Prevents garbled text
-
Plugin/Handler Selection
- OS looks at MIME type
- Launches appropriate application
- User doesn’t need to specify
-
Interoperability
- Standard way to describe content
- Works across all platforms
- Enables automation
Conclusion:
A MIME type is a standard label (e.g., text/html, image/jpeg) that identifies the format of data. It’s used to tell systems how to interpret, display, and handle content, enabling proper routing and processing of diverse file types across email systems, web servers, and applications.
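Python's standard mimetypes module shows the same filename-to-type mapping a web server performs when it fills in Content-Type:

```python
import mimetypes

for name in ("report.pdf", "photo.jpg", "index.html", "data.json"):
    mime, _ = mimetypes.guess_type(name)
    print(f"{name:12s} -> {mime}")

# report.pdf   -> application/pdf
# photo.jpg    -> image/jpeg
# index.html   -> text/html
# data.json    -> application/json
```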
-
(j) Why might a domain address (e.g., www.cnn.com) have several IP addresses?
View Answer
In short: A domain can have multiple IP addresses for load balancing (distributing traffic across servers), geographic redundancy (serving from multiple locations), fault tolerance (if one server fails, others handle traffic), and scalability (handling large traffic volumes).
Elaboration:
Load Balancing:
Single IP address (poor):
  All requests → one server
  Server capacity: 1000 requests/sec
  If 2000 requests arrive: half get dropped

Multiple IP addresses (good):
  DNS returns multiple IPs in rotation
  Requests are distributed across servers
  Total capacity: 4000 requests/sec (4 servers × 1000 each)

www.cnn.com might have:
- 93.184.216.34
- 93.184.216.35
- 93.184.216.36
- 93.184.216.37

How DNS Round-Robin Works:
Client 1 → DNS: What's the IP for www.cnn.com?
         ← DNS: [93.184.216.34, 93.184.216.35, 93.184.216.36, ...]
         Client uses the first IP: 93.184.216.34

Client 2 → DNS: What's the IP for www.cnn.com?
         ← DNS: [93.184.216.35, 93.184.216.36, ..., 93.184.216.34] (rotated list)
         Client uses: 93.184.216.35

Client 3 → DNS: What's the IP for www.cnn.com?
         ← DNS: [93.184.216.36, ..., 93.184.216.34, 93.184.216.35]
         Client uses: 93.184.216.36

Result: traffic spread across all servers

Geographic Redundancy:
Single server (bad):
  Server in New York
  Los Angeles users: ~70 ms RTT
  Europe users: ~100+ ms RTT
  User experience: sluggish

Multiple geographic locations (good):
  Server 1: New York (1.2.3.4)
  Server 2: Los Angeles (1.2.3.5)
  Server 3: London (1.2.3.6)
  Server 4: Tokyo (1.2.3.7)

  User in LA → connects to 1.2.3.5 (local server)
  Latency: ~10 ms (much better)

GeoDNS can be used:
  Clients in the US get US servers
  Clients in Europe get EU servers
  Clients in Asia get Asian servers

Fault Tolerance:
Single server (risky):
  www.cnn.com → 1.2.3.4
  Server 1.2.3.4 crashes → website is DOWN
  All users affected, no redundancy

Multiple servers (safe):
  www.cnn.com → [1.2.3.4, 1.2.3.5, 1.2.3.6, 1.2.3.7]
  Server 1.2.3.4 crashes:
    Clients keep connecting via the other IPs
    Website stays UP with minimal disruption

Health checks:
  A monitoring service checks each server
  If a server is unhealthy, remove it from DNS responses
  Clients automatically avoid the failed server

Traffic Scalability:
Peak traffic analysis:
- Typical traffic: 10,000 requests/sec per server
- Peak traffic (election day, breaking news): 100,000 requests/sec

Single-server solution:
  Would need 10 servers during the peak only
  Expensive and wasteful

Multiple permanent servers:
  Run 4-5 servers normally
  During a peak: some requests queue briefly
  Costs less than scaling to 10
  Handles most spikes gracefully

Real Example: CNN
A DNS lookup for www.cnn.com returns something like:

; <<>> dig www.cnn.com
www.cnn.com.  300  IN  A  151.101.1.67
www.cnn.com.  300  IN  A  151.101.65.67
www.cnn.com.  300  IN  A  151.101.129.67
www.cnn.com.  300  IN  A  151.101.193.67

Note: these are Fastly CDN IPs announced from multiple geographic regions.

Other Reasons:
-
CDN (Content Delivery Network)
- Multiple edge servers worldwide
- Each has own IP
- Users served from nearest edge
- Faster content delivery
-
A/B Testing
- Version A served from IP 1.2.3.4
- Version B served from IP 1.2.3.5
- Different users test different versions
-
Graceful Degradation
- During maintenance: Reduce IPs in DNS response
- Gradually drain traffic from server being updated
- Zero downtime deploys
-
DDoS Mitigation
- Multiple IPs spread attack traffic
- Easier to filter/block attack sources
- Continues serving through attack
Conclusion:
A domain has multiple IP addresses primarily for:
- Load balancing (distribute traffic)
- Geographic redundancy (serve from multiple locations)
- Fault tolerance (survive server failures)
- Scalability (handle traffic spikes)
- Performance (users connect to nearest server)
This is achieved through DNS round-robin, GeoDNS, health checks, and CDN architecture.
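This is easy to observe from code. A sketch with Python's standard socket module (the addresses actually returned for www.cnn.com vary by resolver, location, and time):

```python
import socket

infos = socket.getaddrinfo("www.cnn.com", 443, proto=socket.IPPROTO_TCP)
addresses = sorted({sockaddr[0] for *_, sockaddr in infos})
for ip in addresses:
    print(ip)   # typically several addresses, e.g. CDN edge IPs
```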
-
(k) Why might a Web server have several IP addresses for a single interface?
View Answer
In short: A web server might have multiple IP addresses on a single network interface to host multiple domains/websites, serve different services on different IPs, implement virtual hosting, isolate traffic for different customers, or handle SSL/TLS certificates for multiple domains.
Elaboration:
Virtual Hosting:
Single server, multiple websites:

IP configuration:
  eth0: 1.2.3.4 (www.site1.com)
  eth0: 1.2.3.5 (www.site2.com)
  eth0: 1.2.3.6 (www.site3.com)

All on the same physical network interface (eth0),
all served by the same server machine.

When a client connects:
  Client → 1.2.3.4 (gets site1)
  Client → 1.2.3.5 (gets site2)
  Client → 1.2.3.6 (gets site3)

Before SNI (Server Name Indication):
SSL/TLS used to require a different IP per domain.

Problem: how does the server know which certificate to use?
- The HTTPS handshake happens before the HTTP Host header is sent
- The server doesn't know which domain the client wants
- It can't select the correct certificate

Solution: one IP per SSL domain
- www.site1.com → IP 1.2.3.4 with site1's cert
- www.site2.com → IP 1.2.3.5 with site2's cert
- www.site3.com → IP 1.2.3.6 with site3's cert

Now the server can identify the domain from the incoming IP.

Modern Era (with SNI):
SNI (Server Name Indication) is a TLS extension:
- The client sends the hostname during the TLS handshake
- The server knows which certificate to use
- Multiple domains can share one IP!

But legacy support may still require:
- Multiple IPs for very old clients
- A fallback for incompatible software

Practical Scenarios:
Scenario 1: Shared Hosting Provider
Company: "WebHost.com" shared hosting One physical server hosts 100 customer websites: 192.168.1.100: - IP address 203.0.113.1 → customer1.com - IP address 203.0.113.2 → customer2.com - IP address 203.0.113.3 → customer3.com - ... up to 203.0.113.100 → customer100.com Benefits: - Single server, multiple paying customers - Each customer has own IP (feels exclusive) - Different SSL certs for eachScenario 2: Service Isolation
Large enterprise server configuration:

eth0 (single physical NIC):
- 10.0.1.10: public-facing web server (www.company.com)
- 10.0.1.11: admin dashboard (secure, restricted access)
- 10.0.1.12: API server (api.company.com)
- 10.0.1.13: backup / health-check IP

Benefits:
- Different firewall rules per IP
- Different QoS (Quality of Service) per IP
- Easier access control (e.g., block 10.0.1.11 from outside)

Scenario 3: Multi-Tenant Application
SaaS platform: multiple customers on the same server

eth0:
- 1.2.3.100: Company A instance
- 1.2.3.101: Company B instance
- 1.2.3.102: Company C instance

Each customer accesses their own IP:
  Company A employees → https://app.companyA.com (→ 1.2.3.100)
  Company B employees → https://app.companyB.com (→ 1.2.3.101)
  Company C employees → https://app.companyC.com (→ 1.2.3.102)

Benefits:
- Logical separation (feels like a dedicated server)
- Different SLA/performance tiers per customer
- One customer's instance can be restarted without affecting the others

Linux Configuration Example:
# Configure multiple IPs on a single interface (eth0)
ip addr add 1.2.3.4/24 dev eth0
ip addr add 1.2.3.5/24 dev eth0
ip addr add 1.2.3.6/24 dev eth0

# Verify:
ip addr show eth0
1: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP>
    inet 1.2.3.4/24 scope global eth0
    inet 1.2.3.5/24 scope global secondary eth0
    inet 1.2.3.6/24 scope global secondary eth0

Web Server Configuration (Apache):
# Virtual hosts on different IPs
<VirtualHost 1.2.3.4:443>
    ServerName www.site1.com
    DocumentRoot /var/www/site1
    SSLCertificateFile /path/to/site1.crt
</VirtualHost>

<VirtualHost 1.2.3.5:443>
    ServerName www.site2.com
    DocumentRoot /var/www/site2
    SSLCertificateFile /path/to/site2.crt
</VirtualHost>

<VirtualHost 1.2.3.6:443>
    ServerName www.site3.com
    DocumentRoot /var/www/site3
    SSLCertificateFile /path/to/site3.crt
</VirtualHost>

Benefits Summary:
| Benefit | Use Case |
|---|---|
| Virtual hosting | Host multiple websites on one server |
| SSL/TLS per domain | Each domain with its own certificate (pre-SNI) |
| Service isolation | Different firewall rules per service |
| Tenant separation | Multi-tenant SaaS platforms |
| Performance control | Different rate limits per IP |
| Billing/accounting | Track usage per customer IP |

Modern Alternative: Name-Based Hosting
With SNI support, one IP suffices:

eth0: 1.2.3.4 (only one IP)

<VirtualHost 1.2.3.4:443>
    ServerName www.site1.com
    SSLEngine on
    SSLCertificateFile /path/to/site1.crt
</VirtualHost>

<VirtualHost 1.2.3.4:443>
    ServerName www.site2.com
    SSLEngine on
    SSLCertificateFile /path/to/site2.crt
</VirtualHost>

The client sends the server name in the TLS handshake;
the server picks the correct cert based on that name.
Multiple sites on a single IP!

Conclusion:
A web server may have multiple IP addresses on a single interface for virtual hosting (multiple domains), SSL/TLS per domain (pre-SNI era), service isolation (different traffic types), tenant separation (multi-tenant platforms), or performance/billing management. While modern SNI enables multiple sites on one IP, multiple IPs may still be used for security, isolation, legacy compatibility, or administrative control.
-
(l) Briefly describe what HEAD, GET, POST, PUT, PATCH, DELETE HTTP requests are used for.
View Answer
In short: GET retrieves data; POST sends data for server processing; HEAD is like GET but without response body; PUT replaces an entire resource; PATCH partially updates a resource; DELETE removes a resource. These form the foundation of RESTful APIs.
Elaboration:
GET - Retrieve Data
Purpose: request data without modifying server state

Example:
GET /api/users/123 HTTP/1.1
Host: api.example.com

Response:
200 OK
{ "id": 123, "name": "John Doe", "email": "john@example.com" }

Characteristics:
- Data goes in the URL query string: GET /users?id=123&sort=name
- Idempotent (multiple identical requests = same result)
- Safe (doesn't modify server state)
- Cacheable (browsers cache GET responses)
- Bookmarkable
- Should NOT have a request body

POST - Create or Process Data
Purpose: submit data to the server for processing (create, process, etc.)

Example 1: create a new user
POST /api/users HTTP/1.1
Host: api.example.com
Content-Type: application/json

{ "name": "Jane Doe", "email": "jane@example.com" }

Response:
201 Created
Location: /api/users/124
{ "id": 124, "name": "Jane Doe", "email": "jane@example.com" }

Example 2: form submission
POST /login HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded

username=alice&password=secret123

Characteristics:
- Data in the request body (not in the URL)
- NOT idempotent (repeating the request creates multiple resources)
- NOT safe (modifies server state)
- Not cached (usually)
- Not bookmarkable

HEAD - Retrieve Headers Only
Purpose: like GET but without the response body (headers only)

Example:
HEAD /document.pdf HTTP/1.1
Host: example.com

Response:
200 OK
Content-Type: application/pdf
Content-Length: 50000
Last-Modified: Mon, 01 Jan 2024 10:00:00 GMT
(no body, but the headers describe the file)

Use cases:
1. Check whether a resource exists without downloading it
2. Check the file size before downloading
3. Check the last modification date
4. Verify URL validity
5. Bandwidth-efficient checks

Characteristics:
- Same as GET but no response body
- Faster (no data transfer)
- Useful for large files
- Idempotent and safe

PUT - Replace Entire Resource
Purpose: replace a resource entirely with new data

Example: update user 123 completely
PUT /api/users/123 HTTP/1.1
Host: api.example.com
Content-Type: application/json

{ "name": "John Smith", "email": "john.smith@example.com", "phone": "555-1234" }

Response:
200 OK
{ "id": 123, "name": "John Smith", "email": "john.smith@example.com", "phone": "555-1234" }

Key difference (PUT vs POST):
- PUT: the client specifies the resource ID (PUT /users/123)
- POST: the server generates the resource ID (POST /users)

Characteristics:
- Replaces the entire resource
- Client specifies the ID in the URL
- Idempotent (PUT twice = same result)
- If the resource doesn't exist: may create it (201) or return an error (404)

PATCH - Partial Update
Purpose: partially update a resource (only the changed fields)

Example: update only the name field
PATCH /api/users/123 HTTP/1.1
Host: api.example.com
Content-Type: application/json

{ "name": "John Smith" }

State before PATCH:
{ "id": 123, "name": "John Doe", "email": "john@example.com", "phone": "555-0000" }

Response (after PATCH):
200 OK
{
  "id": 123,
  "name": "John Smith",          ← changed
  "email": "john@example.com",   ← unchanged
  "phone": "555-0000"            ← unchanged
}

PUT for comparison (replaces everything):
PUT /api/users/123
{ "name": "John Smith" }

Result with PUT:
{
  "id": 123,
  "name": "John Smith",
  "email": null,   ← lost!
  "phone": null    ← lost!
}
(fields not specified are removed/nulled)

Characteristics:
- Only the changed fields are required
- More efficient than PUT
- Idempotent (usually)
- Not all servers support PATCH

DELETE - Remove Resource
Purpose: delete a resource from the server

Example: delete user 123
DELETE /api/users/123 HTTP/1.1
Host: api.example.com

Response:
204 No Content
(resource deleted, no body needed)

Or:
200 OK
{ "message": "User 123 deleted successfully" }

Characteristics:
- Removes the resource
- Idempotent (deleting twice has the same effect)
- Safe to call multiple times
- No request body (usually)
- May return 404 if already deleted

Summary Comparison:
| Method | Purpose | Idempotent | Safe | Request Body |
|---|---|---|---|---|
| GET | Retrieve data | Yes | Yes | No |
| HEAD | Retrieve headers | Yes | Yes | No |
| POST | Create/process | No | No | Yes |
| PUT | Replace resource | Yes | No | Yes |
| PATCH | Partial update | Yes* | No | Yes |
| DELETE | Remove resource | Yes | No | No |

(* idempotent only when the patch describes an absolute change)

REST API Example:
Resource: /api/articles/42

GET    /api/articles/42 → retrieve article 42
POST   /api/articles    → create a new article
PUT    /api/articles/42 → replace article 42 entirely
PATCH  /api/articles/42 → update some fields of article 42
DELETE /api/articles/42 → delete article 42
GET    /api/articles    → list all articles

Conclusion:
- GET: Read data (safe, idempotent)
- HEAD: Check headers without body (efficient read)
- POST: Create new resource or trigger action (not idempotent)
- PUT: Replace entire resource (idempotent, client-specified ID)
- PATCH: Partially update resource (idempotent, efficient update)
- DELETE: Remove resource (idempotent)
These form the CRUD operations (Create, Read, Update, Delete) for RESTful APIs.
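A sketch exercising each method with Python's standard http.client against a hypothetical REST API (api.example.com and the /api/users paths echo the examples above and are not a real service):

```python
import http.client
import json

conn = http.client.HTTPSConnection("api.example.com")

def call(method: str, path: str, body: dict | None = None) -> None:
    payload = json.dumps(body) if body is not None else None
    headers = {"Content-Type": "application/json"} if payload else {}
    conn.request(method, path, body=payload, headers=headers)
    resp = conn.getresponse()
    print(method, path, "->", resp.status)
    resp.read()  # drain the body so the persistent connection can be reused

call("GET",    "/api/users/123")                          # read
call("HEAD",   "/api/users/123")                          # headers only
call("POST",   "/api/users",     {"name": "Jane Doe"})    # create
call("PUT",    "/api/users/123", {"name": "John Smith"})  # full replace
call("PATCH",  "/api/users/123", {"name": "John S."})     # partial update
call("DELETE", "/api/users/123")                          # remove
conn.close()
```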
-
(m) Briefly describe the motivation for HTTP/2.0.
View Answer
In short: HTTP/2.0 was motivated by the need to reduce latency and overhead in HTTP/1.1, which suffered from head-of-line blocking, multiple connection limitations, and inefficient header transmission. HTTP/2 introduced multiplexing, binary framing, and header compression to dramatically improve performance.
Elaboration:
Problems with HTTP/1.1:
Problem 1: Head-of-Line Blocking
HTTP/1.1 limitation:

Request 1 → Response 1 (takes 500 ms) ←
Request 2 → Response 2 (takes 500 ms) ←
Request 3 → Response 3 (takes 500 ms) ←
Total: 1500 ms (sequential; pipelining doesn't help much in practice)

If Request 1 is delayed by 500 ms:
  Requests 2 and 3 are stuck waiting
  The "head of line" (first in the queue) blocks the rest

Problem 2: Limited Parallelism
HTTP/1.1: 6-8 parallel connections (browser limit)

A modern website has 100+ resources:
- 100 JavaScript files
- 50 images
- 20 CSS files
- fonts, videos, etc.

Only 6 can download simultaneously; the rest wait in a queue.

Opening more connections wastes resources:
- TCP handshake overhead
- TLS negotiation overhead
- Congestion window reset

Problem 3: Header Compression Missing
HTTP/1.1 headers: plain text, mostly repeated

Example request:
GET /image1.jpg HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0...
Accept-Language: en-US,en;q=0.9
Accept-Encoding: gzip, deflate, br
Cookie: session=abc123; user=john; ...
Size: ~500 bytes

Next request (same domain):
GET /image2.jpg HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0...
Accept-Language: en-US,en;q=0.9
Accept-Encoding: gzip, deflate, br
Cookie: session=abc123; user=john; ...
The same ~500 bytes again!

90% of the headers are identical; resending them is a massive waste.

Problem 4: Text-Based Overhead
HTTP/1.1 uses text-based parsing:

GET /index.html HTTP/1.1\r\n
Host: example.com\r\n
Connection: keep-alive\r\n
\r\n

The parser must:
- Read character by character
- Find the line breaks (\r\n)
- Parse key: value pairs
- Handle whitespace variations
- Error-prone

Humans can read it (nice for debugging), but it is wasteful.

HTTP/2 Solutions:
Solution 1: Multiplexing
HTTP/2: multiple streams on a single connection

Request 1 (stream 1) →
Request 2 (stream 3) →
Request 3 (stream 5) →
← Response 1 (stream 1)
← Response 2 (stream 3)
← Response 3 (stream 5)

One connection, all streams active simultaneously;
no application-level head-of-line blocking.

Timeline:
Time 0:   send req1, req2, req3 (all at once)
Time 100: response 1 arrives (quick)
Time 200: response 3 arrives (quick)
Time 300: response 2 arrives (slow, but it didn't block the others)
Total: 300 ms (instead of 1500 ms with HTTP/1.1)

Solution 2: Binary Framing
HTTP/1.1: text-based
HTTP/2: a binary framing layer

Each message is split into frames:
- HEADERS frame
- DATA frame(s)
- (optional) trailers

Frame format:
[Length: 3 bytes][Type: 1 byte][Flags: 1 byte][Stream ID: 4 bytes][Payload: variable]

Advantages:
- Efficient parsing (binary, not text)
- Fixed format, easier to implement
- Multiplexable (each frame is tagged with its stream ID)
- Frames can be prioritized

Solution 3: Header Compression (HPACK)
HTTP/2: HPACK header compression

First request:
Host: www.example.com
User-Agent: Mozilla/5.0...
Accept: text/html
Size: ~500 bytes

Second request (same domain), only one field changed:
Instead of resending all 500 bytes, send
"same headers as before, except User-Agent"
Size: ~50 bytes (10% of the original!)

How it works:
- Client and server maintain a shared header table
- Previously sent headers are referenced by index
- Only the differences are transmitted
- Typical compression: 85-90% reduction

Solution 4: Server Push
HTTP/1.1:
1. Browser requests index.html
2. Server responds with index.html
3. Browser parses it, sees <link rel="stylesheet" href="style.css">
4. Browser requests style.css
5. Server responds
Latency: 2 round trips for html + css

HTTP/2 Server Push:
1. Browser requests index.html
2. Server responds with index.html
3. Server predicts the client will need style.css
4. Server proactively PUSHes style.css
5. Browser receives both in parallel
Latency: 1 round trip for html + css
The browser doesn't have to finish parsing the HTML to learn about the CSS.

Real-World Performance Improvement:
Benchmark: loading a website with 100 resources

HTTP/1.1 (6 parallel connections):
- TCP handshakes: up to 6 × 50 ms (serialized worst case: 300 ms)
- TLS handshakes: up to 6 × 100 ms (serialized worst case: 600 ms)
- Header transmission: 100 × 500 bytes = 50 KB
- Head-of-line blocking: significant
Total: ~3-5 seconds

HTTP/2 (1 connection, multiplexing):
- TCP handshake: 1 × 50 ms
- TLS handshake: 1 × 100 ms
- Header transmission: compressed ~85% → ~7.5 KB
- No application-level head-of-line blocking
- Efficient binary parsing
Total: ~1-2 seconds

Improvement: 2-3× faster

Conclusion:
HTTP/2.0 was motivated by HTTP/1.1’s inefficiencies: head-of-line blocking, limited parallelism (6-8 connections), repeated headers, and text-based overhead. HTTP/2 addressed these through multiplexing (many streams on one connection), binary framing (efficient parsing), header compression (HPACK), and server push (proactive delivery), resulting in 2-3x faster load times while reducing bandwidth usage.
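A sketch of multiplexing from the client side, using the third-party httpx package with its optional HTTP/2 support (assumptions: `pip install "httpx[http2]"`, and a server that negotiates HTTP/2; the URLs are placeholders):

```python
import asyncio
import httpx

async def main() -> None:
    async with httpx.AsyncClient(http2=True) as client:
        urls = [f"https://www.example.com/resource/{i}" for i in range(3)]
        # All three requests share one connection and are multiplexed.
        responses = await asyncio.gather(*(client.get(u) for u in urls))
        for r in responses:
            print(r.url, r.http_version, r.status_code)  # e.g. HTTP/2 200

asyncio.run(main())
```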
-
(n) Briefly describe the motivation for HTTP/3.0.
View Answer
In short: HTTP/3.0 was motivated by latency problems with TCP and TLS handshakes on high-latency or lossy networks. HTTP/3 replaces TCP with QUIC (UDP-based), which supports 0-RTT resumption, faster handshakes, connection migration, and independent per-stream delivery (no transport-level head-of-line blocking), improving performance on mobile and unreliable networks.
Elaboration:
Problems with HTTP/2 (and TCP):
Problem 1: TCP Handshake Overhead
Every new HTTPS connection requires:

1. TCP handshake (3-way):
   Client → SYN
   Server ← SYN-ACK
   Client → ACK
   Latency: 1 RTT

2. TLS 1.2 handshake:
   Client → ClientHello
   Server ← ServerHello, Certificate, ...
   Client → ClientKeyExchange, ...
   Server ← Finished
   Client → Finished
   Latency: 2 RTT (minimum)

Total: 3 RTT before any data is transmitted.

On a high-latency network (100 ms RTT):
  3 × 100 ms = 300 ms just for handshakes!
Mobile is even worse (high, variable latency).

Problem 2: Head-of-Line Blocking at TCP Layer
HTTP/2 multiplexes over TCP, but TCP itself has head-of-line blocking.

TCP guarantees ordered delivery:
  Packet 1 (request A) sent at time T
  Packet 2 (request B) sent at time T+10 ms
  Network: packet 1 is dropped, packet 2 arrives

TCP must:
1. Wait for packet 1's retransmission
2. Before delivering packet 2 to the application

Even though packet 2 arrived, the application must wait:
HTTP/2 streams are blocked by TCP's head-of-line blocking.

QUIC: streams are delivered independently.
A dropped packet stalls only the stream it belongs to;
the other streams are unaffected.

Problem 3: Connection Establishment Too Expensive
HTTP/2 over TCP:
- Users expect instant page loads
- A connection is established on the first request
- 3 RTTs of overhead is unacceptable

Mobile example:
- RTT: 100 ms (common on 4G)
- 3 RTT = 300 ms just for connection setup
- Added to the total load time

Reusing the connection helps, but:
- Network change (WiFi to cellular): the connection drops
- Roaming between networks: the connection dies
- Mobile users move frequently
- Each new connection: a 300 ms penalty

Problem 4: Congestion Control Per Connection
HTTP/2 scenario: a single TCP connection for multiple streams.
One packet loss → the whole connection slows down.

Example:
- Stream 1 (video): can tolerate loss
- Stream 2 (API call): needs low latency

Packet loss detected:
  TCP backs off (one congestion window for the whole connection)
  Both streams slow down equally
Inefficient: the video stream could wait, the API call can't.

HTTP/3 Solutions:
Solution 1: QUIC Protocol (UDP-based)
HTTP/3 uses QUIC instead of TCP.

QUIC = Quick UDP Internet Connections
- Runs on UDP (connectionless, fast)
- Implements reliability at the QUIC layer (not TCP)

Advantages:
- No separate TCP handshake (transport + TLS handshakes are combined)
- Faster connection setup
- Connection migration support
- Per-stream flow control and independent per-stream delivery

Solution 2: 0-RTT Connection Establishment
TLS 1.3 + QUIC enable 0-RTT:

First connection:
  Client → Initial packet (carrying the ClientHello)
  Server ← response + data
  Latency: 1 RTT (down from 3!)

Resumed connection (with a cached session ticket):
  Client → ClientHello from cache, plus the HTTP request itself
  Server ← data
  Latency: 0 RTT (the request rides along with connection setup!)

Example at 100 ms RTT:
  HTTP/2: 300 ms (3 RTT) of setup
  HTTP/3: 100 ms (1 RTT) fresh, 0 ms (0-RTT) resumed

Solution 3: Independent Per-Stream Delivery
HTTP/2's TCP problem:
  A single ordered byte stream → one lost packet stalls ALL streams.

HTTP/3's QUIC solution:
  Congestion control is still per connection, but loss recovery and
  in-order delivery are per stream.

Scenario:
- Stream 1 (video) loses a packet: only stream 1 waits for the retransmit
- Stream 2 (API) keeps delivering without interruption

The video doesn't block the API from completing.
Better performance for mixed traffic.

Solution 4: Connection Migration
TCP problem:
  A connection is identified by its IP:port pairs.
  If the IP changes, the TCP connection drops.

User scenario:
1. Download starts on WiFi
2. User walks out of WiFi range
3. Device switches to cellular (IP changes)
4. TCP connection drops
5. Download restarts (wasted time and data)

QUIC solution:
  Connections are identified by a Connection ID (not the IP).
  The IP can change and the connection continues!

User scenario with HTTP/3:
1. Download starts on WiFi (QUIC connection)
2. User walks out of range
3. Device switches to cellular (IP changes)
4. QUIC signals: "still me, same Connection ID"
5. Download resumes seamlessly
6. No latency penalty, no data loss

Solution 5: Faster Handshakes
QUIC handshake (1 RTT minimum):
  Client → Initial packet
  Server ← handshake packet + encrypted data
  Client → acknowledgment

Then the HTTP/3 request is sent immediately.
Much faster than TCP (1 RTT) + TLS 1.2 (2 RTT) = 3 RTT before data.

Real-World Performance Impact:
Scenario: a mobile user downloads a webpage.

HTTP/1.1:
- Network change: connection drops
- Must reconnect: TCP (1 RTT) + TLS (2 RTT) ≈ 300 ms at 100 ms RTT
- Then re-request: slow

HTTP/2:
- Same problem: TCP + TLS overhead on every reconnect
- Still multiplexed, but the same TCP latency issue

HTTP/3:
- Network change: QUIC migrates automatically
- Connection continues with 0 RTT of overhead
- Download resumes instantly
- Perception: seamless and fast

Comparison:
| Aspect | HTTP/2 (TCP) | HTTP/3 (QUIC) |
|---|---|---|
| Initial handshake | 1 RTT (TCP) + 2 RTT (TLS 1.2) = 3 RTT | 1 RTT, with 0-RTT resumption |
| Head-of-line blocking | At the TCP layer | Per stream only |
| Connection migration | Breaks | Seamless |
| Mobile friendliness | Poor (reconnects drop) | Excellent (transparent migration) |
| High-latency setup (100 ms RTT) | ~300 ms | ~100 ms, or ~0 ms resumed |

Deployment Status:
HTTP/3 adoption:
- Chrome: full support (2020+)
- Firefox: full support (2021+)
- Safari: full support (2022+)
- Major sites: Google, Facebook, Cloudflare, etc.

Benefits are most visible on:
- Mobile networks (high RTT)
- Network changes (WiFi → cellular)
- High packet-loss scenarios

Conclusion:
HTTP/3.0 was motivated by TCP’s overhead (3+ RTTs of combined handshakes, whole-connection stalls on loss) and poor performance on mobile/unreliable networks. QUIC (UDP-based) provides 0-RTT resumption, connection migration without dropping, independent per-stream loss recovery (no transport-level head-of-line blocking), and faster combined handshakes. The result is dramatically better performance on mobile, high-latency, and unstable networks where connection drops are common.
Problem 2: Dynamic Host Configuration Protocol
We discussed in class that a host’s IP address can either be configured manually, or by Dynamic Host Configuration Protocol (DHCP).
-
(a) Describe the advantages and disadvantages of each approach.
View Answer
Manual IP configuration assigns fixed addresses, providing stability and predictability but suffering from poor scalability, configuration errors, and IP conflicts. DHCP automatically assigns IP addresses and network parameters, scales well, and reduces errors, but depends on a DHCP server and may result in changing IP addresses.
-
(b) Describe how a host gets an IP address using DHCP.
View Answer
A host uses DHCP via the DORA process: it broadcasts DHCPDISCOVER, receives a DHCPOFFER, responds with DHCPREQUEST, and receives DHCPACK, after which it configures its network interface.
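A sketch of just the first DORA step, using the third-party scapy package (assumptions: scapy installed, root privileges, interface name eth0, placeholder MAC address). A real client would then wait for the DHCPOFFER and continue with DHCPREQUEST/DHCPACK:

```python
from scapy.all import BOOTP, DHCP, IP, UDP, Ether, sendp

discover = (
    Ether(src="00:11:22:33:44:55", dst="ff:ff:ff:ff:ff:ff")
    / IP(src="0.0.0.0", dst="255.255.255.255")   # no address yet, so broadcast
    / UDP(sport=68, dport=67)                    # DHCP client -> server ports
    / BOOTP(chaddr=bytes.fromhex("001122334455"))
    / DHCP(options=[("message-type", "discover"), "end"])
)
sendp(discover, iface="eth0")                    # interface name is an assumption
```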
Problem 3: Video Streaming Protocol Selection
Consider an application where a camera at a highway is capturing video of the passing cars at 30 frames/second and sending the video stream to a remote video viewing station over the Internet. You are hired to design an application-layer protocol to solve this problem. Which transport-layer protocol, UDP or TCP, would you use for this application and why? Justify your answer.
View Answer
UDP is preferred for real-time video streaming because it is delay-sensitive and can tolerate packet loss. UDP avoids retransmissions, congestion control delays, and head-of-line blocking present in TCP, resulting in smoother playback.
Problem 4: DNS Recursive vs Iterative Queries
Consider a host H within qc.cuny.edu domain, whose name server is ns.qc.cuny.edu. Suppose that H tries to learn the IP address of the host ringding.cs.umd.edu. Assume that ns.qc.cuny.edu does not have the IP address of ringding.cs.umd.edu in its cache. Further assume that root DNS servers only know the authoritative name server for umd.edu domain.
-
(a) Describe how the IP address of ringding.cs.umd.edu will be resolved assuming no DNS server implements recursive queries.
View Answer
With iterative queries, the local DNS server queries the root server, then the umd.edu server, then the cs.umd.edu server, finally obtaining the IP address of ringding.cs.umd.edu and returning it to the host.
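A rough sketch of that iterative walk with the third-party dnspython package (assumed installed). Here the code follows the referrals itself, the way ns.qc.cuny.edu would on H's behalf; it assumes each referral carries a glue A record:

```python
import dns.message
import dns.query
import dns.rdatatype

name = "ringding.cs.umd.edu"
server = "198.41.0.4"  # a.root-servers.net

while True:
    resp = dns.query.udp(dns.message.make_query(name, "A"), server, timeout=3)
    if resp.answer:                   # the authoritative server answered
        print(resp.answer[0])
        break
    # Otherwise it is a referral: follow a glue A record for one of the
    # nameservers listed in the authority section.
    glue = [rr for rrset in resp.additional
            if rrset.rdtype == dns.rdatatype.A for rr in rrset]
    server = glue[0].address
```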
-
(b) Redo (a) assuming ALL DNS servers implement recursive queries.
View Answer
With recursive queries, the host sends one query to its local DNS server, which recursively contacts all necessary DNS servers and returns the final IP address.
Problem 5: DNS and HTTP Web Page Retrieval
Suppose within your Web browser you click on a link to obtain a Web page. Suppose that the IP address for the associated URL is not cached in your local host so that a DNS look-up is necessary to obtain the IP address. Suppose that $n$ DNS servers are visited before your host receives the IP address from DNS; the successive visits incur RTTs of $RTT_1, RTT_2, \ldots, RTT_n$. Let $RTT_0$ be the RTT between your local host and the Web server containing the Web page and let $B$ bits/sec be the sustained bandwidth.
-
(a) Suppose that the Web page consists of a single object of size $D_1$ bits. Further suppose that the DNS queries are sent over UDP. How much time elapses from when the client clicks on the link until the client receives the object?
View Answer
Total time: $\sum_{i=1}^{n} RTT_i + 2RTT_0 + \frac{D_1}{B}$
-
(b) Redo (a) assuming that the DNS queries are sent over TCP.
View Answer
Total time: $\sum_{i=1}^{n} 2RTT_i + 2RTT_0 + \frac{D_1}{B}$
-
(c) Assume that the user clicks on a link within the just downloaded page and starts downloading a new web page of size $D_2$ bits residing at the same server. Assume this page also consists of a single object. How much time elapses from when the client clicks on the new link until the client receives the new object? Assume that the DNS uses UDP as in (a).
View Answer
Total time: $2RTT_0 + \frac{D_2}{B}$
-
(d) Now assume that the web page to be downloaded in (c) has 6 other embedded objects each with size $D_3$. Assuming that the Web browser implements HTTP/1.0 with non-persistent connections and no parallel TCP connections, how much time elapses from when the client clicks on the new link until the client receives all objects?
View Answer
Total time: $(2RTT_0 + \frac{D_2}{B}) + 6(2RTT_0 + \frac{D_3}{B})$
-
(e) Redo (d) assuming the Web browser implements 4-parallel TCP connections with non-persistent connections.
View Answer
Total time: $(2RTT_0 + \frac{D_2}{B}) + 4RTT_0 + \frac{6D_3}{B}$
-
(f) Redo (d) assuming the Web client uses HTTP/1.1 with persistent connections (no pipelining).
View Answer
Total time: $(2RTT_0 + \frac{D_2}{B}) + 6(RTT_0 + \frac{D_3}{B})$
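A sketch that plugs the six formulas into code, with illustrative values (n = 2 DNS servers; all RTTs, sizes, and bandwidth below are assumed numbers):

```python
dns_rtts = [0.020, 0.050]        # RTT_1 .. RTT_n, seconds
rtt0 = 0.040                     # RTT_0 to the web server, seconds
B = 1e6                          # sustained bandwidth, bits/sec
D1, D2, D3 = 1e5, 2e5, 5e4       # object sizes, bits

t_a = sum(dns_rtts) + 2 * rtt0 + D1 / B                 # (a) DNS over UDP
t_b = sum(2 * r for r in dns_rtts) + 2 * rtt0 + D1 / B  # (b) DNS over TCP
t_c = 2 * rtt0 + D2 / B                                 # (c) DNS cached
t_d = t_c + 6 * (2 * rtt0 + D3 / B)                     # (d) HTTP/1.0, serial
t_e = t_c + 4 * rtt0 + 6 * D3 / B                       # (e) 4 parallel connections
t_f = t_c + 6 * (rtt0 + D3 / B)                         # (f) persistent, no pipelining

for label, t in zip("abcdef", (t_a, t_b, t_c, t_d, t_e, t_f)):
    print(f"({label}) {t * 1000:.0f} ms")
```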
Problem 6: Instant Messaging System Architecture
Suppose you were to implement an instant messaging system such as Yahoo Messenger, which allows any number of users to exist in the system and establish instant messaging sessions among them.
-
(a) Describe the architecture of your system (system components, protocol messages exchanged etc.) to enable users to dynamically learn each other’s current IP addresses and port numbers so that they can seamlessly start instant messaging sessions.
View Answer
Users register IP addresses and ports with a centralized directory server, which clients query to discover peers, while messages are exchanged peer-to-peer.
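A toy sketch of the rendezvous piece: a UDP directory server with the two message types it minimally needs (the JSON message format and port are invented for illustration):

```python
import json
import socket

directory: dict[str, tuple[str, int]] = {}   # user -> (ip, port)

def handle(message: bytes, addr: tuple[str, int]) -> bytes:
    msg = json.loads(message)
    if msg["type"] == "REGISTER":            # a user announces where it listens
        directory[msg["user"]] = (addr[0], msg["port"])
        return json.dumps({"ok": True}).encode()
    if msg["type"] == "LOOKUP":              # a peer asks where a user is
        return json.dumps({"peer": directory.get(msg["user"])}).encode()
    return json.dumps({"error": "unknown type"}).encode()

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 5000))
while True:
    data, addr = sock.recvfrom(4096)
    sock.sendto(handle(data, addr), addr)
```

Once a LOOKUP returns a peer's (IP, port), the two clients talk directly; the server never relays message traffic.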
-
(b) Suppose you were to allow users to have “buddy lists” and learn about the current communication status of their buddies. How would you extend your system to enable this feature?
View Answer
The directory server maintains buddy lists and presence information and notifies users of status changes.
Problem 7: Web Proxy Caching Performance
Assume that Queens College decided to use a Web Proxy, i.e., a Web cache. In this model, each Web browser is set up to send their requests to the Web proxy rather than sending the request directly to the actual Web server. Recall that a Web browser also maintains a local cache. Suppose a user accesses 100 objects one after the other using HTTP/1.0. The size of each object is 10000 bits. Assume that the sustained bandwidth between the user’s PC and the Web Proxy is 10Mbps and has an RTT of 1ms, and the sustained bandwidth between the Web proxy and a Web server is 1Mbps and has an RTT of 100ms.
-
(a) What is the average object retrieval time in the absence of any cache hits?
View Answer
With no cache hits, every request travels PC → proxy → server. Counting one RTT on each leg (1 ms + 100 ms), the average object retrieval time is approximately 101 ms. (Including the transmission delays, 10000 bits / 1 Mbps = 10 ms and 10000 bits / 10 Mbps = 1 ms, raises this to about 112 ms.)
-
(b) Assume that of all user requests, 20 percent is found in the Browser cache, half of the remaining user requests are satisfied from the Web Proxy cache, and the remaining requests make it up to the Web server. What’s the average object retrieval time now?
View Answer
Browser-cache hits (20%) take ≈ 0 ms, proxy hits (40%) take ≈ 1 ms (the PC-proxy RTT), and the remaining 40% go to the server at ≈ 101 ms, so the average is 0.2(0) + 0.4(1) + 0.4(101) ≈ 40.8 ms.
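The averaging, spelled out (this uses the RTT-only model behind the numbers above; add the ~1 ms and ~10 ms transmission delays for a stricter estimate):

```python
t_browser = 0.0     # browser-cache hit: served locally (ms)
t_proxy   = 1.0     # proxy hit: 1 ms RTT to the proxy (ms)
t_server  = 101.0   # miss: 1 ms (PC-proxy) + 100 ms (proxy-server) (ms)

avg = 0.20 * t_browser + 0.40 * t_proxy + 0.40 * t_server
print(f"{avg:.1f} ms")   # 40.8 ms
```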
Problem 8: Alert Notification Protocol Selection
Consider a network-attached burglar alarm which is programmed to notify the police when a burglar enters the house. Suppose that you are to use either HTTP or SMTP to send the notification message. How would you use each protocol to send the message? Which protocol makes more sense to use for this application?
View Answer
With HTTP, the alarm would act as a client and POST the alert to a police web server, which must be reachable at that instant. With SMTP, the alarm emails the alert to a police mailbox, and the mail infrastructure queues and retries delivery. SMTP makes more sense for this application because its store-and-forward design keeps the notification alive through transient outages.
Problem 9: Email with MIME Attachments
Suppose you want to send an e-mail message M with 4 attachments, A1, A2, A3 and A4. Describe how your e-mail client, e.g., Outlook, would send this e-mail?
View Answer
The client builds a single MIME multipart/mixed message: the text M is one part, and each attachment A1-A4 becomes its own MIME part with an appropriate Content-Type, Base64-encoded for safe transfer. The whole message is then uploaded to the mail server over SMTP.
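A sketch with Python's standard email and smtplib modules (addresses, server, and filenames are placeholders standing in for M and A1-A4):

```python
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"], msg["To"], msg["Subject"] = "me@example.com", "you@example.com", "M"
msg.set_content("Body of message M")   # the text/plain part

for name in ("A1.pdf", "A2.jpg", "A3.zip", "A4.txt"):
    with open(name, "rb") as f:
        data = f.read()
    # Each attachment becomes its own MIME part; the email package
    # Base64-encodes the binary data automatically.
    msg.add_attachment(data, maintype="application",
                       subtype="octet-stream", filename=name)

with smtplib.SMTP("smtp.example.com", 587) as s:
    s.starttls()
    s.send_message(msg)   # uploaded to the mail server via SMTP
```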
Problem 10: POP3 and IMAP Mail Protocols
What are POP3 and IMAP used for? What are the advantages of IMAP over POP3?
View Answer
POP3 downloads email locally, while IMAP keeps email on the server and supports synchronization and folders, making IMAP more flexible.