Overview

A reverse proxy is a server that sits in front of one or more backend servers: it takes a request from a client, passes it to a backend server, and then returns the backend's response to the requester or end user.

A forward proxy takes requests from one end user and communicates with multiple servers on the backend. A reverse proxy, by contrast, acts as an intermediary between multiple end users and a single server or a limited group of servers.

What makes the reverse proxy so widely used as a CDN origin-pull mechanism is caching. The middle step between the request and the response is the caching mechanism (e.g., memcached): it stores the requested asset in a local cache so that it is ready for the next request.

CDN servers usually have a custom setting mandating how many requests are needed before the server will try to cache the requested asset. Probably the best value for this purpose is 2. This value defines the threshold after which a request is treated as valid and the asset becomes eligible for caching.

Note: An asset requested only once may never be requested again, so it would be a waste of cache space to pile up “one-timer assets.” On the other hand, an asset requested two times will most likely be requested a third, fourth, fifth time, etc., thus making it eligible for caching.

Setting up a reverse proxy with Nginx is quite simple and consists of three main steps: 1) setting up the caching path, 2) calling the cache zone into the vhost config file, and 3) defining the origin location.

Define cache path

http {
    proxy_cache_path /var/cache/nginx keys_zone=test:10m;
    …
}
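
In practice, proxy_cache_path is usually given a few more parameters. The following is a sketch with commonly used options (the path and the test zone name mirror the example above; the specific sizes and timeouts are just illustrative assumptions):

```nginx
http {
    # levels=1:2 spreads cached files across two directory levels,
    # max_size caps the on-disk cache, and inactive evicts entries
    # that have not been accessed for 60 minutes.
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=test:10m
                     max_size=1g inactive=60m use_temp_path=off;
}
```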

Call cache zone into the vhost file

server {
    …
    proxy_cache test;
    …
}

Define back end server or origin location

location / {
    …
    proxy_pass http://domain.com;
    proxy_cache_min_uses 2;
    proxy_cache_valid 200 30d;
    …
}

Without any additional tweaks to show cache status or to append cache status into the log files, you have created a reverse proxy system supported by Nginx.
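
Those tweaks are small, though. Nginx exposes the cache status (MISS, HIT, EXPIRED, and so on) in the $upstream_cache_status variable, which can be returned as a response header and written to the access log. A minimal sketch, assuming the same test zone and origin as above (the log format name and header name are arbitrary choices):

```nginx
http {
    # Record the cache status in each access-log line.
    log_format cache '$remote_addr [$time_local] "$request" '
                     '$status cache:$upstream_cache_status';

    server {
        access_log /var/log/nginx/access.log cache;
        proxy_cache test;

        location / {
            proxy_pass http://domain.com;
            proxy_cache_min_uses 2;
            proxy_cache_valid 200 30d;
            # Expose the cache status to clients for easy debugging.
            add_header X-Cache-Status $upstream_cache_status;
        }
    }
}
```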

What it does is explained in the scheme below:

if the file’s use count is below proxy_cache_min_uses and cache status is MISS -> increment the use count by 1 and proxy_pass the request to the backend server (origin)

  1. End user requests a file from CDN
  2. CDN checks if it’s in cache
  3. if the cache status is MISS for this particular file, its use count (compared against “proxy_cache_min_uses”) is incremented by 1
  4. The request is passed back to the origin server for delivery
  5. Origin server returns file to CDN
  6. CDN delivers file to end user

This process repeats two times (the threshold defined by “proxy_cache_min_uses”), after which the CDN caches the file, so it delivers the same file directly from the cache upon the next request.

if the file’s use count has reached proxy_cache_min_uses (2) and cache status is MISS -> proxy_pass the request to the backend server (origin) AND TRY TO CACHE the file!

if the file is already cached and cache status is HIT -> deliver the file from the proxy cache