Conditional resource caching in nginx


I ran into a problem with one of our sites today: we got promoted by a very popular YouTuber.
Google Analytics was recording around 900 people active on our site in real time.

Although we are prepared for some degree of traffic deviation, this was way above what we were prepared for.

After some tuning of the TCP/IP stack, activating certain performance-mode features of our board software, and making some on-the-fly adjustments to our MySQL server, we got things running again, albeit with massive lag.

One of the major problems was that our board software, in its infinite wisdom, decided to serve some screenshots via PHP instead of straight from the filesystem.

REDACTED - - [21/Mar/2015:16:22:41 -0400] "GET /index.php?app=downloads&module=display&section=screenshot&id=7324 HTTP/1.0" 200 2705 "" "Mozilla/5.0 (Linux; Android 4.1.2; REDACTED"
REDACTED - - [21/Mar/2015:16:22:41 -0400] "GET /index.php?app=downloads&module=display&section=screenshot&id=7324 HTTP/1.0" 200 2705 "" "Mozilla/5.0 (Windows NT 6.1; REDACTED"
REDACTED - - [21/Mar/2015:16:22:41 -0400] "GET /index.php?app=downloads&module=display&section=screenshot&id=7320 HTTP/1.0" 200 3250 "" "Mozilla/5.0 (Linux; Android 4.1.2; REDACTED"
REDACTED - - [21/Mar/2015:16:22:41 -0400] "GET /index.php?app=downloads&module=display&section=screenshot&id=7330 HTTP/1.0" 200 4034 "" "Mozilla/5.0 (Windows NT 6.2; REDACTED"
REDACTED - - [21/Mar/2015:16:22:41 -0400] "GET /index.php?app=downloads&module=display&section=screenshot&id=7319 HTTP/1.0" 200 14571 "" "Mozilla/5.0 (Linux; REDACTED"
REDACTED - - [21/Mar/2015:16:22:41 -0400] "GET /index.php?app=downloads&module=display&section=screenshot&id=7329 HTTP/1.0" 200 26334 "" "Mozilla/5.0 (Windows NT 6.2; REDACTED"
REDACTED - - [21/Mar/2015:16:22:41 -0400] "GET /index.php?app=downloads&module=display&section=screenshot&id=7322 HTTP/1.0" 200 25411 "" "Mozilla/5.0 (Linux; REDACTED"

This is one part of our site we hadn’t actually cached before, and it was now becoming a pressing problem and the root cause (!) of everything else being slow. Not only is this call expensive because it ties up a PHP worker, it is also apparently quite expensive for the database (!).

Unfortunately, a location block in nginx matches only the URI path, not the query string, so it cannot select on GET (or POST) variables. Using an if block to set proxy_cache dynamically is not an option either, as nginx does not allow proxy_cache inside if.

Instead, I got the following workaround working; see the comments:

location / {
    set $nocache 1; # By default, don't cache any response.
    if ($args ~ section=screenshot) { # Is this a screenshot?
        set $nocache 0; # Set this to 0, allowing caching later.
    }
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
    proxy_redirect off;
    proxy_buffering on;
    proxy_set_header        Host            $host;
    proxy_set_header        X-Real-IP       $remote_addr;
    proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header        Accept          "";
    proxy_ignore_headers X-Accel-Expires Expires Cache-Control Set-Cookie;
    proxy_cache STATIC;
    proxy_cache_key "$scheme$request_method$host$request_uri";
    proxy_cache_valid 200 6h;
    proxy_cache_bypass $nocache; # If nocache is 1, don't answer from the cache...
    proxy_no_cache $nocache; # ...and don't store the response in the cache either.
    proxy_pass  http://backend;
    include g17upstream-location-common.conf;
}
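The location block above refers to a cache zone named STATIC and an upstream named backend, neither of which is shown in this post. As a rough sketch only, the matching definitions in the http context could look something like this. The cache directory, zone size, size limit, inactivity timeout and backend address are all assumptions for illustration, not our real values:

```nginx
# Goes in the http { } context, outside any server block.

# Cache storage backing "proxy_cache STATIC;".
# Path, sizes and timeout below are assumed values; adjust to taste.
proxy_cache_path /var/cache/nginx/static
                 levels=1:2
                 keys_zone=STATIC:10m   # shared memory for cache keys
                 max_size=1g            # upper bound for cached data on disk
                 inactive=6h;           # evict entries not requested for 6 hours

# Upstream referenced by "proxy_pass http://backend;".
# The address is a placeholder for wherever the PHP application listens.
upstream backend {
    server 127.0.0.1:8080;
}
```

Without a keys_zone called STATIC, nginx will refuse to load a configuration that uses proxy_cache STATIC, so something along these lines has to exist somewhere in the http context.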

With this in place, nginx caches a response only when the $nocache variable is 0. Since $nocache defaults to 1, nothing is cached unless it has been explicitly opted in. Extending this solution is as simple as adding more if statements that set $nocache to 0.
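To make the "more if statements" part concrete, here is a minimal sketch of the same location with a second rule added. The section=thumbnail request type is made up for the example; our board software may not have such a URL. The second rule also uses nginx's built-in $arg_section variable (the value of the section query parameter) with an exact comparison, which is a bit stricter than a regex over the whole query string:

```nginx
location / {
    set $nocache 1;                     # default: never cache

    if ($args ~ section=screenshot) {   # rule from the config above
        set $nocache 0;
    }
    if ($arg_section = thumbnail) {     # hypothetical second rule, exact match
        set $nocache 0;
    }

    proxy_cache        STATIC;
    proxy_cache_bypass $nocache;        # 1 = don't answer from the cache
    proxy_no_cache     $nocache;        # 1 = don't store the response
    proxy_pass         http://backend;
    # remaining proxy_* directives as in the full config above
}
```

Note that the regex form would also match something like subsection=screenshots, since it only looks for the substring anywhere in the query string. Whether that matters depends on which parameters the application actually uses.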

I hope this helps someone out, as this approach (although slightly ugly) is the best I can think of and I haven’t seen anyone else documenting this.