I’ve blogged before about how this site can easily push out over 2,000 requests/second using only 6 WSGI workers, excluding latency. That’s possible because whole pages can be cached server-side: the entire rendered HTML blob is stored in the cache server (Redis in my case), so no database queries are needed at all.
I wanted my site to still “feel” dynamic, in the sense that once you post a comment (and it’s published), the page’s cache is automatically invalidated, so the user doesn’t have to refresh their browser when they know the page should have changed. To accomplish this I used a hacked cache_page decorator that makes the cache key depend on the content the page depends on. Here’s the code I actually use today for the home page:
    def _home_key_prefixer(request):
        if request.method != 'GET':
            return None
        prefix = urllib.urlencode(request.GET)
        cache_key = 'latest_comment_add_date'
        latest_date = cache.get(cache_key)
        if latest_date is None:
            # when a blog comment is posted, the blog modify_date is incremented
            latest, = (BlogItem.objects
                       .order_by('-modify_date')
                       .values('modify_date')[:1])
            latest_date = latest['modify_date'].strftime('%f')
            cache.set(cache_key, latest_date, 60 * 60)
        prefix += str(latest_date)

        try:
            redis_increment('homepage:hits', request)
        except Exception:
            logging.error('Unable to redis.zincrby', exc_info=True)

        return prefix


    @cache_page_with_prefix(60 * 60, _home_key_prefixer)
    def home(request, oc=None):
        ...
        try:
            redis_increment('homepage:misses', request)
        except Exception:
            logging.error('Unable to redis.zincrby', exc_info=True)
        ...
And in the models I then have this:
    @receiver(post_save, sender=BlogComment)
    @receiver(post_save, sender=BlogItem)
    def invalidate_latest_comment_add_dates(sender, instance, **kwargs):
        cache_key = 'latest_comment_add_date'
        cache.delete(cache_key)
So this means:
- whole pages are cached for a long time for fast access
- updates immediately invalidate the cache for the best user experience
- no need to mess with ANY SQL caching
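The cache_page_with_prefix decorator itself isn’t shown here, so here is a rough, framework-free sketch of the general idea rather than my actual implementation. Everything in it is a stand-in: `cache_with_prefix` is a hypothetical name, a plain dict plays the role of Redis, a “request” is just a dict, and timeouts are left out entirely.

```python
import functools

# Stand-in for the cache server (Redis in the post); a plain dict for illustration.
_cache = {}


def cache_with_prefix(key_prefixer):
    """Cache a view's output under a key that includes key_prefixer(request).

    Hypothetical sketch: no timeouts, and a "request" is a dict with
    'method' and 'path' keys.
    """
    def decorator(view):
        @functools.wraps(view)
        def wrapper(request):
            prefix = key_prefixer(request)
            if prefix is None:
                # The prefixer opted out (e.g. a non-GET request): skip the cache.
                return view(request)
            key = 'page:%s:%s' % (prefix, request['path'])
            if key not in _cache:
                # Cache miss: render the page and store the result.
                _cache[key] = view(request)
            return _cache[key]
        return wrapper
    return decorator


# Hypothetical usage: the prefix changes whenever the content version changes.
content_version = {'value': 1}
render_count = {'value': 0}


def version_prefixer(request):
    if request['method'] != 'GET':
        return None
    return str(content_version['value'])


@cache_with_prefix(version_prefixer)
def home(request):
    render_count['value'] += 1
    return '<html>version %d</html>' % content_version['value']
```

Because the prefix is part of the key, a content change never has to find and delete the cached page. The next request simply computes a different key, misses, and re-renders. In a real setup the stale entry would be left to expire on its own timeout.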
So, the next question is: if posting a comment means that the cache is invalidated and needs to be repopulated, what’s the ratio of cache hits to cache misses? Glad you asked. That’s why I made this page:
It allows me to monitor how often a new blog comment or a general time-out means poor Django needs to re-create the HTML using SQL.
At the time of writing, one in every 25 hits to the homepage requires the server to re-generate the page. And still the content is always fresh and relevant.
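For the record, here is a small sketch of how a figure like that falls out of the two counters. It assumes that `homepage:hits` counts every request (it is incremented in the key prefixer) and `homepage:misses` counts only the requests that actually reach the view. The function names and the numbers in the usage note are made up for illustration.

```python
def miss_ratio(hits, misses):
    """Return the fraction of requests that had to re-render the page.

    Assumes `hits` counts every request and `misses` counts only the
    requests that reached the view.
    """
    if hits == 0:
        return 0.0
    return misses / hits


def one_in_n(hits, misses):
    """Express the miss ratio as "one in every N hits" (None if no misses)."""
    if misses == 0:
        return None
    return round(hits / misses)
```

With hypothetical counts of 1,000 hits and 40 misses, `miss_ratio(1000, 40)` gives 0.04 and `one_in_n(1000, 40)` gives 25.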
The next level of optimization would be to figure out whether a particular page update (e.g. a blog comment posting on a page that isn’t featured on the home page) should or should not invalidate the home page. esp