Application Cache vs. External Cache

Caching is essential to achieve decent response times under heavy workloads. People sometimes ask me why we should bother with another database such as Redis, which is a typical choice for caching. After all, isn't it even faster to keep the cache directly in the application? We save time and CPU resources, we avoid inter-process communication overhead, and we keep our stack smaller, which makes deployments easier.
So, why do we need to spin up a Redis or a Memcached instance?

Sharing the cache between processes

In-application caching is simple and can be done in a very straightforward way in some languages. For example, Python's standard library provides functools.lru_cache, a decorator that memoizes a function using an LRU cache.


from functools import lru_cache

@lru_cache
def fetch_items():
    # Run the expensive DB query and return the results.
    results = expensive_db_query()  # hypothetical query helper
    return results

By caching data in our application, we allocate memory that belongs to the application process. That is, other processes cannot access the cache; they have to recompute things and maintain their own copy. Even if our system consisted of only one application (e.g., a monolithic web app), this may still be an issue: having only one application does not mean we run only one process of it. Web application servers such as uWSGI or Gunicorn typically spawn multiple workers, each running a copy of the application in parallel. Those workers are generally implemented as OS processes, so they share no memory between them. Each instance of the application therefore has to populate and maintain its own cache. If a request whose result is already cached in Worker-1 arrives at Worker-2, we do the work again and populate another cache. We end up consuming more memory for the same thing.
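
Here is a minimal sketch of the problem (the function body is a stand-in for the real query). Each worker is simulated with an OS process, and each one reports its own cache miss, because the LRU cache lives in that process's memory:

from functools import lru_cache
from multiprocessing import Process

@lru_cache
def fetch_items():
    # Stand-in for the expensive DB query.
    return (1, 2, 3)

def worker(name):
    fetch_items()
    # Each process reports misses=1: the caches are not shared.
    print(name, fetch_items.cache_info())

if __name__ == "__main__":
    for name in ("Worker-1", "Worker-2"):
        Process(target=worker, args=(name,)).start()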

Memory management

Another drawback is that we have to beware of the Garbage Collector (GC), where applicable, depending on the language in use. Can we answer the following questions confidently?

  • How well will the GC perform under load, as the cache grows?
  • How will the underlying memory allocator of the interpreter behave?

The more the heap grows, the more work the GC has to do to figure out what it can safely clean. Depending on the GC strategy used by our language runtime, we must make sure we won't face typical GC-related problems such as longer "stop-the-world" pauses or an increased risk of reference cycles.
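
As a rough, CPython-specific illustration (timings vary by machine), the sketch below fills the heap with GC-tracked objects and times a full collection pass, which has to traverse all of them:

import gc
import time

# Simulate a large in-process cache made of GC-tracked objects.
cache = [{"id": i} for i in range(1_000_000)]

start = time.perf_counter()
gc.collect()  # a full pass traverses every tracked container
print(f"full GC pass took {time.perf_counter() - start:.3f}s")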

We also have to be aware of the subtle memory allocation and deallocation rules of the interpreter. For small, short-lived objects, the memory manager will hopefully work well. But what happens with big, long-lived objects? It is common for interpreters to use an arena-based allocation system. For example, the CPython interpreter allocates arenas, pre-allocated chunks of 256 KB, and creates new ones as needed at runtime. An arena can only be returned to the OS once every object inside it has been freed, so filling arenas with long-lived cached objects can pin them and keep the process's memory footprint high.
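
In CPython, the undocumented sys._debugmallocstats() helper dumps pymalloc's arena and pool statistics. As a rough sketch, keeping even a scattered sample of cached objects alive is enough to pin arenas after the cache itself is dropped:

import sys

cache = {i: str(i) * 10 for i in range(1_000_000)}
pinned = list(cache.values())[::1000]  # keep a scattered sample alive
del cache
# Arenas still holding any of the pinned strings cannot be returned
# to the OS, which the printed arena statistics reflect.
sys._debugmallocstats()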

Memory managers of interpreters are primarily designed to handle the temporary objects that are part of a typical program's execution flow. That is a very different use case from an in-RAM database system. The job of the interpreter is to run your program safely; the job of the in-RAM DB is to manage memory efficiently.

Decoupling the cache from the application

Decoupling the cache from the application brings other benefits.
A cache that is independent from the application can be managed as what it is: a standalone database. That means we can make it persistent and highly available, provided the chosen database system (e.g., Redis) offers those capabilities. This is convenient when we need fine-grained control of the cache, based on specific requirements such as:

  • High availability: the cache is still available after a failure
  • Persistence: the cache is still warm after a reboot

That is work we don't need to implement ourselves in our application, which, in principle, should focus on the business problems we are solving.
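
As a minimal sketch of that decoupled approach, assuming a local Redis instance and the redis-py package (expensive_db_query and the 5-minute TTL are illustrative), the earlier memoization example becomes a cache shared by every worker process:

import json
import redis

client = redis.Redis(host="localhost", port=6379)

def fetch_items():
    cached = client.get("items")
    if cached is not None:
        # Cache hit: any worker process sees the same entry.
        return json.loads(cached)
    results = expensive_db_query()  # hypothetical query helper
    # Store the result with a 5-minute expiry.
    client.set("items", json.dumps(results), ex=300)
    return results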

The core job of our typical application is to solve a specific business problem, not to manage a cache and its underlying algorithms. That is a job for other software. After all, we would not try to implement data storage management inside our application; we let the database (e.g., PostgreSQL) take care of it. The same applies to cache management: we let another piece of software deal with it.