How the Web Works — Caching

Brunda K
Brunda’s Tech Notes
Jan 25, 2023

--

In this post, I’d like to discuss caching of web resources. Reverse proxies can be deployed as caching intermediaries that cache content generated by origin servers. Because repeat requests can be answered from the cache, fewer requests reach the origin servers, which reduces the load on the backend.

When a reverse proxy acts as a caching server, it sits close to the origin server. The caching server caches content by examining the values of various HTTP headers.

The caching server mainly deals with the following questions:

Is the content cacheable?

Content from the origin server can include the Cache-Control: no-store header to indicate that the content must not be cached at all.

Cache-Control: private indicates that the content can be cached, but only in a private cache (for example, the user's browser).

A reverse proxy caching server may only cache content whose Cache-Control header contains neither no-store nor private.

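Responses to requests that contain the “Authorization” header are not normally stored by a shared cache unless the response explicitly allows it, for example with Cache-Control: public or Cache-Control: s-maxage.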
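The cacheability rules above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular proxy implements it: the function names parse_cache_control and shared_cache_may_store are hypothetical, headers are assumed to arrive as plain dicts with canonically cased keys, and other factors a real cache considers (request method, status code, and so on) are left out.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. no-store) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_may_store(request_headers, response_headers):
    """Decide whether a shared cache (hypothetical helper) may store a response."""
    cc = parse_cache_control(response_headers.get("Cache-Control", ""))
    # no-store forbids caching anywhere; private forbids shared caches.
    if "no-store" in cc or "private" in cc:
        return False
    # Authenticated requests are only stored when the response opts in.
    if "Authorization" in request_headers:
        return any(d in cc for d in ("public", "s-maxage", "must-revalidate"))
    return True
```

For example, shared_cache_may_store({}, {"Cache-Control": "private, max-age=60"}) returns False, because the response is marked for private caches only.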

Is the content stale or fresh?

A caching server can determine whether content is stale or fresh based on the time elapsed since it was generated.

Content is considered either fresh or stale by comparing its age, the time elapsed since it was generated, with the value in Cache-Control: max-age=N or Cache-Control: s-maxage=N, where N is in seconds.

If the content was generated more than N seconds ago, the content is considered stale.

Cache-Control: max-age applies to all caches, including browsers, whereas Cache-Control: s-maxage applies only to shared caches, such as the reverse proxy caches that we are currently discussing, where it takes precedence over max-age.
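As a rough sketch of the freshness check, the following Python compares the age of a cached response with its freshness lifetime. The function names freshness_lifetime and is_fresh are hypothetical, and the age is assumed to be known already in seconds; a real cache derives it from the Date and Age headers and may fall back to heuristics when no lifetime is given.

```python
def freshness_lifetime(cache_control, shared=True):
    """Return the freshness lifetime in seconds, or None if unspecified."""
    directives = {}
    for part in cache_control.split(","):
        name, _, arg = part.strip().partition("=")
        directives[name.lower()] = arg
    # A shared cache prefers s-maxage over max-age when both are present.
    if shared and directives.get("s-maxage", "").isdigit():
        return int(directives["s-maxage"])
    if directives.get("max-age", "").isdigit():
        return int(directives["max-age"])
    return None


def is_fresh(age_seconds, cache_control, shared=True):
    """Content is fresh while its age is below the freshness lifetime."""
    lifetime = freshness_lifetime(cache_control, shared)
    return lifetime is not None and age_seconds < lifetime
```

With the header value "max-age=60, s-maxage=300", a response that is 120 seconds old is still fresh for a shared cache (is_fresh(120, ...) with shared=True) but stale for a browser (shared=False).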

Caching strategies

The caching server can serve content from its cache without initiating any connection to the back-end origin server. In some cases, the content needs revalidation before it can be served.

The Cache-Control header lets the web developer request this with the Cache-Control: no-cache directive. With no-cache, the caching server may still store the content, but it must revalidate the content with the origin server each time before serving it.
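The choice between serving from the cache and revalidating can be sketched as a small decision function. This is an illustrative simplification for a shared cache: the name cache_action and its return values are hypothetical, a missing cache entry is represented by None, and content without an explicit lifetime is simply revalidated, whereas real caches may apply heuristic freshness.

```python
def cache_action(cached_cache_control, age_seconds):
    """Return 'fetch', 'revalidate' or 'serve' for a shared cache (sketch)."""
    if cached_cache_control is None:
        return "fetch"  # nothing stored yet: go to the origin server
    directives = {}
    for part in cached_cache_control.split(","):
        name, _, arg = part.strip().partition("=")
        directives[name.lower()] = arg
    # no-cache: stored content must be revalidated before every reuse.
    if "no-cache" in directives:
        return "revalidate"
    # Shared caches prefer s-maxage over max-age.
    lifetime = None
    for key in ("s-maxage", "max-age"):
        if directives.get(key, "").isdigit():
            lifetime = int(directives[key])
            break
    # Stale content, or content with no stated lifetime, is revalidated.
    if lifetime is None or age_seconds >= lifetime:
        return "revalidate"
    return "serve"
```

For instance, cache_action("no-cache, max-age=600", 5) returns "revalidate" even though the content is only five seconds old, while cache_action("max-age=600", 5) returns "serve".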

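Cache validators such as Last-Modified and ETag make this revalidation possible. The caching server sends a conditional request that asks the origin server whether the content has changed since the time recorded in the Last-Modified header (If-Modified-Since), or whether the cached version still matches the current version on the server as identified by the ETag header (If-None-Match). If nothing has changed, the origin server replies with 304 Not Modified and the cached copy is reused.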
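To make the validator check concrete, here is a minimal sketch of how an origin server might answer a conditional request. The function name revalidate and its parameters are hypothetical; entity tags are compared as exact strings (weak ETags and tags containing commas are not handled), and last_modified is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def revalidate(request_headers, current_etag, last_modified):
    """Return 304 if the cached copy is still valid, otherwise 200 (sketch)."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The header may list several entity tags separated by commas.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            return 304
        return 200  # ETag takes precedence over If-Modified-Since
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        # Unchanged if the resource was not modified after the given time.
        if last_modified <= since:
            return 304
    return 200


# Example: the cached copy carries the same ETag as the current version.
modified = datetime(2023, 1, 25, 10, 0, 0, tzinfo=timezone.utc)
status = revalidate({"If-None-Match": '"v2"'}, '"v2"', modified)
```

A 304 result means the caching server can keep serving its stored copy; a 200 means the origin would send the full, updated content.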

Conclusion:

This post described the use of a reverse proxy for caching content. It also briefly covered the Cache-Control header to illustrate how it influences the caching server’s caching strategy.
