Akercode.com

Redis with PostgreSQL And What To Look For

What is Redis

Redis (Remote Dictionary Server) is an open source (BSD licensed), in-memory, key-value data store, meaning we can map different data types (strings, hashes, lists, sets, and sorted sets) to a defined key.

With The Redis String type we can map the string value myvalue to a key called mykey with the set command, and then we can retrieve it with the get command.

> set mykey myvalue
OK
> get mykey
"myvalue"

Since Redis keeps data in random access memory (RAM), the performance and data transfer is blazing fast, and because of this it's perfect use-case for cache, but also has the ability to store persistent data to disk, and be used as a regular database.

What is PostgreSQL

PostgreSQL calls itself the world's most advanced open source database and not without reason.

Even if the initial release was back in 1996, the basis of the system was already described in published papers 10 years prior, in 1986.

It is known for reliability, performance and feature robustness, and is used in many enterprise applications because of its resilience.

PostgreSQL is best suited for systems that require execution of complex queries, or data warehousing and data analysis, unlike MySQL which is the first choice for many web applications which require a database merely for data transactions.

Identifying What To Cache

Slow PostgreSQL Queries

Not all slow queries are bad, but if the UI is dependent on some data, or if it's at the expense of the user experience, it can be ideal to store the data in cache for instant access.

There may be complex queries or large quantities of data that cause high usage of resources.


You can list queries that are running longer than, for example 3 minutes with the pg_stat_activity view which lives in the pg_catalog schema. The output is a detailed list of all currently running processes.

SELECT pid,
       USER,
       pg_stat_activity.query_start,
       now() - pg_stat_activity.query_start AS query_time,
       query,
       state
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '3 minutes';

Output:

Image of a SQL query result showing query time


You can also use EXPLAIN ANALYZE to get detailed execution timings of a query:

EXPLAIN ANALYSE 
SELECT * FROM bar;

Output:

Image of a SQL query result showing statistics for a table

Large PostgreSQL Database Tables

List and order tables by size can give a great overview of where the heavy load is located. Huge tables that are accessed often do have an impact on the response time.

Get table size of all the tables in the public schema, where pg_total_relation_size is the sum of pg_table_size and pg_indexes_size.

SELECT
   table_name,
   pg_size_pretty(pg_total_relation_size(quote_ident(table_name))),
   pg_total_relation_size(quote_ident(table_name)) 
FROM
   information_schema.tables 
WHERE
   table_schema = 'public' 
ORDER BY
   pg_total_relation_size(quote_ident(table_name)) DESC;

Output:

Image of a SQL query result showing database table size