After nearly a year of development work, a
A major performance enhancement involves the Free Space Map (FSM), a data structure that tracks what space is available for reuse inside the Postgres backend data store. Every time your database performs an INSERT or UPDATE, Postgres uses this structure to help determine where new information will be written.
The FSM was improved in critical ways, including making two configuration parameters obsolete, and adding a new structure called the visibility map. The visibility map enables partial VACUUMs, significantly reducing I/O overhead on this critical Postgres maintenance routine. Together, these changes represent two huge administrative usability and performance improvements for VACUUM in version 8.4.
What is VACUUM?
VACUUM is a maintenance routine necessary for removing "dead" tuples. Dead tuples are a natural result of Postgres' implementation of multi-version concurrency control (MVCC). When querying a Postgres database, a transaction (for example, a SELECT operation) sees a snapshot of data (a database version) as it was at some point in the past. As changes occur, new snapshots are available. But, the database also keeps the old versions.
When a transaction completes, the snapshot it was using, and some of the data in the snapshot, may have become obsolete (for example: a DELETE or UPDATE operation may have occurred). At that point, the space occupied by the old, obsolete data can be reused. To reuse that space, some process needs to walk through your database and identify which tuples are dead so that their disk space can be reclaimed or reused. This is what VACUUM does.
Normal operation of your database requires a periodic VACUUM. In versions of Postgres released in the last two years, "autovacuum" is enabled by default, and will perform the maintenance operation for you on a regular basis.
How often you run VACUUM depends on your database and the rate at which data changes. More information about running VACUUM and autovacuum configuration can be found in the PostgreSQL manual.
How does the new Free Space Map improve VACUUM?
A few critical features of the new Free Space Map:
- Has a binary tree structure
- Uses 1 byte per heap page
- Shows the maximum amount of contiguous space available in the top space
- The data structure auto-repairs and can be reconstructed from the bottom.
Previously, every time that VACUUM was run, the free space map had to be reconstructed from scratch. The new FSM is stored on disk in separate files inside of
$PGDATA/base/, and is cached in
shared_buffers. During a VACUUM, individual nodes in the map are be updated (also known as "retail" updates) as reusable space is identified.
Because of this more flexible implementation, Postgres is able to track and adjust this data structure without configuration. The FSM can also now grow in size without requiring a database restart. Previous versions of Postgres require that you monitor the results of a VACUUM VERBOSE, or use a contrib module to track FSM-related parameters. If you needed to adjust those parameters, you would need to perform a database restart during a maintenance window.
The practical effect of these changes will be faster initialization of the FSM at system start, persistence of the data structure even in the event of a database crash and less manual configuration. Two configuration options: max_fsm_relations and max_fsm_pages no longer exist in version 8.4.
New data structures: Visibility Map
The second big improvement in free space tracking is the visibility map. This is a bitmap that tracks what data is visible to transactions. So, using the visibility map, you can quickly see which parts of a table, and ultimately the data pages inside that table, are most likely to have free space.
Previously, when VACUUM ran it had to look at every tuple in a table because there was no information stored about which pages were updated since the last VACUUM. This meant that a VACUUM operation was effectively a sequential scan of a table, an expensive read operation for large tables. With the visibility map, VACUUM will now be able to perform partial scans of tables, skipping data pages which are marked as fully visible. Partial scans means fewer disk I/O operations and faster VACUUMs!
Peeking inside your base directory
If you look inside the base directory, you will see directories that correspond to the OIDs of each database in the database cluster. OIDs are the internal identifier that Postgres uses for database objects.
FSM and Visibility Map are now stored in the filesystem alongside the relation storage, with '_fsm' and '_vm' appended to the OID of the table OID name.
lulu:base postgres$ ls -la 16392/ | head -10
drwx------ 211 postgres daemon 7174 Feb 9 19:24 .
drwx------ 14 postgres daemon 476 Jan 11 23:30 ..
-rw------- 1 postgres daemon 8192 Jan 11 23:30 112
-rw------- 1 postgres daemon 8192 Jan 11 23:30 113
-rw------- 1 postgres daemon 57344 Jan 11 23:30 11444
-rw------- 1 postgres daemon 24576 Jan 11 23:30 11444_fsm
-rw------- 1 postgres daemon 8192 Jan 11 23:30 11444_vm
-rw------- 1 postgres daemon 0 Jan 11 23:30 11446
-rw------- 1 postgres daemon 8192 Jan 11 23:30 11448
Looking to the future
The new free space map and visibility map in 8.4 may help pave the way to a long-desired feature: index-only scans. The visibility map makes it possible to determine whether an index value is up-to-date, or needs to be updated with newer data that is in a table. Rather than fetching every value from disk to ensure that the index data is consistent with table data, a smart planner can just consult the visibility map, which will indicate which tuples may have been updated, thus avoiding a table fetch for every index value.
The ability to avoid fetching table data when a query only requests data that is available in an index is an obvious performance win. Think about fetching a primary key value, for example. Index-only scans are supported in most other major databases, and will be a great addition to a future version of Postgres.
ABOUT THE AUTHOR: Selena Deckelmann works for End Point Corporation and is an enthusiastic open source advocate and PostgreSQL specialist. She is co-chair of Open Source Bridge conference and will be speaking at PgCon 2009. In her spare time, she likes to mix drinks for her local Perl and Postgres user groups, and fetch eggs from her chickens.
This was first published in March 2009