This post is not about troubleshooting but rather about an interesting observation. I will start with a quick intro.
Visibility map is a fork of the relation which maintains two bits: one for visible, the other for frozen tuples. These bits set per page. If all tuples in a page are visible to all active transactions, the bit is then set. If at least one tuple is not so, then the bit is not set. These bits are set by vacuum only according to the docs. Although, you will never do it in production but truncate command also resets it which makes sense.
Weird thing can pop up if to consider how visibility map can be affected by failed transactions, although, at first, this may not sound as something that makes sense. Here is a quick demo about the point.
I am creating a table and running vacuum so that to have the relation’s _vm file in place.
postgres=# drop table customers ;
DROP TABLE
postgres=# create table customers (id integer);
CREATE TABLE
postgres=# vacuum customers;
VACUUM
postgres=# SELECT pg_relation_filepath('customers');
pg_relation_filepath
----------------------
base/5/57450
(1 row)Let’s ensure that the _vm file is there now.
postgres@ubuntu:~/18/main/base/5$ ls -lh *57449*
-rw------- 1 postgres postgres 8.0K May 10 18:12 57449
-rw------- 1 postgres postgres 24K May 10 18:12 57449_fsm
-rw------- 1 postgres postgres 8.0K May 10 18:13 57449_vmLet’s see what the bits are when the table is empty.
postgres=# select * from pg_visibility_map('customers', 0);
all_visible | all_frozen
-------------+------------
f | f
(1 row)Good so far: table is empty and, hence, flags not set.
The all_visible flag gets cleared when a DML runs against the table. It can also be the case that the DML in actuality does not create any invisible tuple but tracking that would be too much overhead which is why the semantic of the DML cannot be checked. The bit is reset regardless of the DML.
postgres=# vacuum customers ;
VACUUM
postgres=# select * from pg_visibility_map('customers', 0);
all_visible | all_frozen
-------------+------------
t | t
(1 row)
postgres=# insert into customers values (1);
INSERT 0 1
postgres=# select * from pg_visibility_map('customers', 0);
all_visible | all_frozen
-------------+------------
f | f
(1 row)The other important point here is that the DML does not have to commit to reset the flag as can be witnessed below.
The flags are set here.
postgres=# vacuum customers;
VACUUM
postgres=# select * from pg_visibility_map('customers', 0);
all_visible | all_frozen
-------------+------------
t | t
(1 row)The other transaction updates the table but does not commit
postgres=# start transaction ;
START TRANSACTION
postgres=*# update customers set id = 2 where id = 1;
UPDATE 2and the flags get reset.
Here is the catch about all this.
Suppose, transactions accessing that table (or its individual pages) fail for some reason (it could be due to a uncaught exception, for example) for a non-trivial period of time. From the other hand, queries access the table through index-only scan which needs those bits to avoid scanning the heap. This means in-between this time (start of the failure and the next vacuum run) the queries may suffer as they will do more heap fetches which would otherwise be avoided should the transactions run smoothly.