PostgreSQL: Analytics for DBA

Many users of PostgreSQL know that the server at the time of their work collects a variety of statistics, but not everyone knows that it is useful to analyze and how to extract it. This small Toolkit collected some useful queries that give some idea about how to use this "hidden knowledge", which constantly accumulates. These queries can be used to monitor the status of PostgreSQL (manually or with plugins for monitoring systems like Nagios, Cacti or Zabbix), to search for bottlenecks in the server and many other similar tasks. Remember that this is only the tip of the iceberg; in the documentation you can find descriptions of several dozen system ideas that can also be useful for the PostgreSQL administrator.

For correct work need to enable Toolkit options stats_block_level and stats_row_level in postgresql.conf and set the parameter of stats_reset_on_server_start in its sole discretion. If each time you restart the PostgreSQL server you change some important configuration parameters, it makes sense to reset the statistics to monitor the effect of the change. If you are interested in a long-term perspective and restart is not due to changes in PostgreSQL configuration, set the parameter of stats_reset_on_server_start set to off.

the

the hit / read



When you run the query in PostgreSQL first looks whether there is a need to request data in shared memory (shared buffers). If found, the hit is counted if there is a relatively slow system call fread for raising data from the disk or disk cache of the operating system and is considered read. On average, right rule: the greater the ratio of hit/read, the better you have configured PostgreSQL as it's very little reading from disk, basically removing data from shared memory. For most small databases, the relationship must lie in the range 5000-10000. Do not seek, however, to raise the shared_buffers setting, which determines hit/read: too large size of shared memory may lead to loss of performance in databases with write-intensive. It is also worth remembering that fread can be quite fast if the data is in the disk cache of the OS.
 SELECT 
datname, 
CASE 
WHEN blks_read = 0 THEN 0 
ELSE blks_hit / blks_read 
END AS ratio 
FROM 
pg_stat_database;


the

the Number of modifications that occurred in the table



The list on tables: what number of records they were added, modified and deleted since the last statistics reset. DBA should represent which tables are the most loaded in the current database and what is the relationship between different types of modifying queries to them.

SELECT 
relname, 
n_tup_ins, 
n_tup_upd, 
n_tup_del 
FROM 
pg_stat_user_tables 
ORDER BY 
n_tup_upd DESC;


the

Statistics seq scan / index scan



The list on tables: what is the number of queries was performed by sequential viewing; how many requests were made using the index; and the ratio of these two numbers. Allows you to assess whether all necessary indexes are created in this table. If your tables contain a few thousand rows, the consistent view will run slower than index scan, so in the ideal case seqscan-s in these tables should not be. If you are still there, analyze the queries into tables and create appropriate indexes. It is important not to overdo it: the more indexes on the columns of the table become more expensive data update operations.

Also do not forget that after you create the index table to do an ANALYZE, otherwise the query planner won't notice your changes in the table structure.

 SELECT 
relname, 
seq_scan, 
idx_scan, 
CASE 
WHEN idx_scan = 0 THEN 100 
ELSE seq_scan / idx_scan 
END AS ratio 
FROM 
pg_stat_user_tables 
ORDER BY 
ratio DESC;

the

Statistics for indexes



A list of indexes: how many records from the index were returned to queries on that index and how many rows at the same time still had to be viewed in the parent table; the ratio of these two numbers. The essence of these statistics is simple: if you have a lot of fetch-it means the index is out of date and when you run the query, PostgreSQL is forced to look directly in the table as the source for the most current data, which slows down his work. In this case, you must rebuild the index to match the real data in the table.

 SELECT 
indexrelname, 
idx_tup_read, 
idx_tup_fetch, 
CASE 
WHEN idx_tup_fetch = 0 THEN 100 
ELSE idx_tup_read / idx_tup_fetch 
END AS ratio 
FROM 
pg_stat_user_indexes 
ORDER BY 
ratio DESC;


the

Running queries with their duration



A simple list of running server at the moment requests. Is useful when you are not familiar with the system or just do not have time to set it up — it is possible to find and interrupt the "bad" query that is stopping the server (procpid column contains the PID of the process who can make a kill if necessary). Remember, however, that a simple periodic viewing vypolnyaya requests in any case not replace the wonderful log analyzer pgFouine. Also do not forget that the process in which you execute the query, too, falls into the resulting list.

 SELECT 
datname, 
NOW() - query_start AS duration, 
procpid, 
current_query 
FROM
pg_stat_activity 
ORDER BY duration DESC;


the

a List of current locks



A list of current locks the type of lock table and database, where she exhibited and number of the transaction that put the lock. Do not be afraid, if your query produces a long list of lock-ov — not all of them are critical and blocking the table from all possible changes, and even reading. For the analysis of block list should definitely read the documentation about the types in PostgreSQL and when and what queries they are exposed. One of the common cases where the list of locks might come in handy: executing command ps aux | grep ^postgres you see that the status of one of the processes postgres-written and waiting — it just means that the process is waiting for will be charged the lock. What exactly — find out by running this query.

 SELECT 
l.mode 
d.datname, 
c.relname, 
l.granted, 
l.transactionid 
FROM 
pg_locks AS l 
LEFT JOIN pg_database AS d ON l.database= d.oid 
LEFT JOIN pg_class AS c ON l.relation = c.oid;
Article based on information from habrahabr.ru

Комментарии

Популярные сообщения из этого блога

Car navigation in detail

Google has launched an online training course advanced search