Get to [2021-training-notes-day-1]. Get to [2021-training-notes-day-3].
Watch the day 2 recording until Laurenz’ machine says Aug 31 17:48.
Indexing continued.
Having an index on name won’t help you with a query whose condition looks like WHERE upper(name) = ….
:cry: What do we do? :smile_cat: Create an index on a function!
CREATE INDEX ON test (upper(name));
For an index to be useful with a WHERE expression the query should look something like:
<indexed expression> <operator> <constant>
The operator must be an operator understood by the optimizer. Use any constant that is constant during the query. To demonstrate how serious this is, consider:
EXPLAIN SELECT * FROM test WHERE id + 0 = 42;
We have an index on id but the database isn’t smart enough to know that anything plus 0 is….that thing.
With a query on an indexed expression, we see that a sequential scan is no longer used. In our example, the estimates were way off and the optimizer thought it would find 20,972 rows when the real answer is 0.
PostgreSQL collects table statistics regularly:
Recalculation of these statistics is triggered by data modification…or when you tell it to with ANALYZE.
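A quick sketch of doing it by hand (assuming the test table from above): run ANALYZE yourself and peek at what the planner recorded via the pg_stats view.

```sql
-- Recompute statistics for one table by hand (autovacuum also triggers this):
ANALYZE test;

-- Peek at what the planner now knows about each column:
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'test';
```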
CREATE TABLE person (id bigint PRIMARY KEY, name text NOT NULL, bday date NOT NULL);
Store dates and times as…dates and times. Don’t store dates as numbers. something something number of seconds since the epoch :laughing:
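If you inherit epoch numbers anyway, a small sketch of converting at the boundary (assuming seconds since the Unix epoch):

```sql
-- Epoch seconds in, real timestamp out:
SELECT to_timestamp(1630428480);

-- And the other direction, if some consumer insists on a number:
SELECT extract(epoch FROM current_timestamp);
```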
SELECT * FROM person where age(bday) > INTERVAL '80 years';
postgres=# CREATE INDEX ON person (age(bday));
ERROR: functions in index expression must be marked IMMUTABLE
Time: 0.980 ms
The “volatility” of a function.
An immutable function always returns the same result for the same arguments, regardless of things like time of day or phase of moon; upper qualifies. We got this error because age is a function of the current time.
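A common workaround (my sketch, not shown in the session): keep the time-dependent arithmetic on the constant side, so the indexed expression is just the bday column.

```sql
CREATE INDEX ON person (bday);

-- Same question as age(bday) > '80 years', but now it matches the
-- <indexed expression> <operator> <constant> shape:
SELECT * FROM person
WHERE bday < current_date - INTERVAL '80 years';
```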
Stan provides an example of GitLab being hit by function volatility:
We got bit by this immutable index problem recently because comparing timestamps without time zone against timestamps WITH time zone doesn’t work, and some schemas had a mix of different types: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/68784
12:02 Break! 12:12 Back!
<digression>
\dTS
SELECT current_timestamp;
SELECT localtimestamp;
Timestamp without time zone: easy and straightforward. Often, though, you want timestamp with time zone.
If you store a timestamp with time zone, it’s converted to UTC and stored that way. When converted to a string, the conversion happens according to parameters in the database.
SHOW timezone;
CREATE TABLE ts (t timestamp with time zone);
INSERT INTO ts VALUES (current_timestamp);
postgres=# TABLE ts;
t
-------------------------------
2021-08-31 16:15:10.902239+00
(1 row)
Time: 0.372 ms
postgres=# INSERT INTO ts VALUES ('2021-08-31 00:00:00+05');
postgres=# TABLE ts;
postgres=# SHOW timezone;
postgres=# SET timezone = 'Europe/Vienna';
postgres=# TABLE ts;
timezone is set properly.</digression>
On btree indexes and LIKE
EXPLAIN SELECT * FROM test WHERE name LIKE 'fluff%';
The query above won’t use the index. The reason is collation: the default collation doesn’t compare strings character by character, so the index’s sort order doesn’t line up with prefix matching.
With an operator class, we can do this. Specifically we want the text pattern operator class. We just create the index a little differently:
CREATE INDEX ON test (name text_pattern_ops);
Take a look at the Index Cond and then the Filter:
# EXPLAIN SELECT * FROM test WHERE name LIKE 'brie%';
QUERY PLAN
---------------------------------------------------------------------------
Index Scan using test_name_idx on test (cost=0.43..8.45 rows=1 width=11)
Index Cond: ((name ~>=~ 'brie'::text) AND (name ~<~ 'brif'::text))
Filter: (name ~~ 'brie%'::text)
(3 rows)
The index we just created can also be used for equality comparisons (WHERE name = ‘fluffy’). So, we don’t need two indexes.
ORDER BY name is about the only thing left that still needs a plain default-collation index.
Access methods: there are others:
SELECT amname, amhandler FROM pg_am;
amname | amhandler
--------+----------------------
heap | heap_tableam_handler
btree | bthandler
hash | hashhandler
gist | gisthandler
gin | ginhandler
spgist | spghandler
brin | brinhandler
:tada: Indexing is one of the strong points of PostgreSQL.
- btree – data types must have a way of being ordered/compared. If not, you can’t use btree. The DEFAULT.
- hash – hash tables persisted on disk. Can only be used for equality searches, which a btree could also handle; a hash index can’t do anything that a btree cannot. Possible speed advantages. Corner case: indexing very large values to save space.
- gist – perfect for weird data types: exotic data types that can’t be used with a btree because they don’t have an order. Consider: geographic data points or range types.
- gin – most useful for composite data types: arrays, full text search, JSON.

Things we won’t talk much about: spgist and brin, very special case indexes.

Next up, range types. These are intervals: floating point ranges, timestamp ranges. Think hotel reservations.
CREATE TABLE reservations (id bigint PRIMARY KEY, name text NOT NULL, t_from timestamp NOT NULL, t_to timestamp NOT NULL);
There are a few different ways that two reservations could overlap, and a query that checks all of them by hand would be annoying. Let’s use a timestamp range instead:
CREATE TABLE reservations (id bigint PRIMARY KEY, name text NOT NULL, res tsrange NOT NULL);
Look at this query:
INSERT INTO reservations VALUES(1, 'room 1', '[2021-08-31 09:00:00,2021-08-31 11:00:00)');
The [ says that 9a is in the range, and the ) at the end says that 11a is not in the range, which permits adjacent reservations.
postgres=# SELECT * FROM reservations WHERE res && '[2021-08-31 10:00:00,2021-08-31 11:00:00)';
id | name | res
----+--------+-----------------------------------------------
1 | room 1 | ["2021-08-31 09:00:00","2021-08-31 11:00:00")
These operators are supported by the gist index:
- && – overlaps operator
- @> – contains operator

The execution plan says a sequential scan will be happening. Lame. Let’s make a gist index:
CREATE INDEX ON reservations USING gist (res);
It’s still using a Sequential Scan so let’s do SET enable_seqscan = off; to force it to use the gist index that we created.
A btree index is a special kind of gist index, but don’t worry about it.
exclusion constraint implemented using a gist index only:
ALTER TABLE reservations ADD EXCLUDE USING gist (res WITH &&);
The thing above will prevent an overlapping entry! Whoa. Here’s how it looks:
postgres=# INSERT INTO reservations VALUES(2, 'room 1', '[2021-08-31 09:20:00,2021-08-31 10:45:00)');
ERROR: conflicting key value violates exclusion constraint "reservations_res_excl"
DETAIL: Key (res)=(["2021-08-31 09:20:00","2021-08-31 10:45:00")) conflicts with existing key (res)=(["2021-08-31 09:00:00","2021-08-31 11:00:00")).
postgres=# INSERT INTO reservations VALUES(2, 'room 1', '[2021-08-31 08:20:00,2021-08-31 08:59:59)');
INSERT 0 1
That’s the coolest thing we’ve seen thus far, IMO.
There are two data types for JSON:
- json –> stored as a string, preserves all formatting. Use it if the JSON is not being touched in the database, only stored and retrieved.
- jsonb –> for in-database processing, prefer jsonb. Use it unless you have a reason not to.

Use :: as the special operator for casting. The right way is:
SELECT CAST ('{ "a": 42, "b": { "x": true, "y": [1, 2, 3] } }' AS json);
The shorter and more fun ways:
SELECT '{ "a": 42, "b": { "x": true, "y": [1, 2, 3] } }'::json;
SELECT '{ "a": 42, "b": { "x": true, "y": [1, 2, 3] } }'::jsonb;
Inserting JSON into the database
INSERT INTO json VALUES (1, '{ "a": 42, "b": { "x": true, "y": [1, 2, 3] } }');
INSERT INTO json VALUES (2, '{ "a": 1, "b": { "y": [3, 4, 5] } }');
Querying JSON in the database!
postgres=# SELECT * FROM json WHERE j ->> 'a' = '42';
id | j
----+---------------------------------------------
1 | {"a": 42, "b": {"x": true, "y": [1, 2, 3]}}
(1 row)
postgres=# SELECT * FROM json where j @> '{ "a": 42 }';
id | j
----+---------------------------------------------
1 | {"a": 42, "b": {"x": true, "y": [1, 2, 3]}}
(1 row)
postgres=# CREATE INDEX ON json USING gin (j);
CREATE INDEX
postgres=# EXPLAIN SELECT * FROM json where j @> '{ "a": 42 }';
QUERY PLAN
--------------------------------------------------------------------------
Bitmap Heap Scan on json (cost=12.00..16.01 rows=1 width=40)
Recheck Cond: (j @> '{"a": 42}'::jsonb)
-> Bitmap Index Scan on json_j_idx (cost=0.00..12.00 rows=1 width=0)
Index Cond: (j @> '{"a": 42}'::jsonb)
(4 rows)
So cool!
Words on JSON in the database:
Use JSON in the database very sparingly. In 98% of all cases, it’s better to avoid it. Think twice before proceeding.
13:03 Break! 13:17 Back!
Sometimes you want to perform a similarity search.
Instead of SELECT * FROM names WHERE name = 'lorenz';
Let’s create an extension with CREATE EXTENSION pg_trgm;.
The pg_trgm extension gives us the % similarity operator that lets us do
SET enable_seqscan = on;
SELECT * FROM test WHERE name % 'lorenz';
The btree index does not support the % operator. OMG.
CREATE INDEX ON test USING gin (name gin_trgm_ops);
A trigram index supports queries like LIKE '%rie%';
USE ILIKE instead of LIKE for case-insensitivity. Neat-o!
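A quick sketch of both points, assuming the trigram index from above is in place:

```sql
-- Infix match: a plain btree can’t help here, the trigram GIN index can.
EXPLAIN SELECT * FROM test WHERE name LIKE '%rie%';

-- Case-insensitive match, also supported by the trigram index:
EXPLAIN SELECT * FROM test WHERE name ILIKE 'BRIE%';
```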
trgm - trigram. All that stuff from the cryptanalysis section of Cryptography is useful!
READ English Letter Frequencies on practicalcryptography.com, served via http. :lock:
# SELECT show_trgm('brie');
show_trgm
-----------------------------
{" b"," br",bri,"ie ",rie}
(1 row)
How cool is that! My name is a little short to make playing with trigrams very fun. (Why trigrams and not quadgrams?)
While trigrams are amazing, they are not ideal for full text search.
We see two spaces padded at the beginning of the string but only one at the end, because matches at the beginning of the string are emphasized.
End of indexes!
Consider a transaction as a group of statements bundled together, as if wrapped in a set of parentheses.
Let’s have some ACID. :microscope:
In PostgreSQL, statements always run in a transaction. By default that’s autocommit mode: if you don’t specifically start a transaction, PostgreSQL puts each statement in one for you.
This gives us interesting advantages:
Either the whole statement runs or none of it runs. You update 20 rows, get an error due to network outage on the 21st, sorry, none of it happened! If you want a transaction that spans multiple statements, you have to start one yourself:
- START TRANSACTION;
- BEGIN;

Transactions cannot be nested; you can only have one transaction at a time. To end the transaction:
- COMMIT;
- ROLLBACK;

The PostgreSQL prompt will show you when you are in a transaction by updating the prompt:
postgres=# BEGIN;
BEGIN
postgres=*#
In the spirit of all or nothing:
postgres=*# SELECT 1 / 0 ;
ERROR: division by zero
postgres=!# SELECT 42;
ERROR: current transaction is aborted, commands ignored until end of transaction block
Once it hits a problem, the transaction is done. You have to start over.
You can use SAVEPOINT to create a subtransaction so that you can ROLLBACK to the save point or the entire transaction. Use SAVEPOINT but don’t overuse it. Performance and concurrency can suffer. Don’t do something like….one SAVEPOINT for each statement.
Take a look:
postgres=# BEGIN;
BEGIN
postgres=*# SELECT 42;
?column?
----------
42
(1 row)
postgres=*# SAVEPOINT alpha;
SAVEPOINT
postgres=*# SELECT 1 / 0;
ERROR: division by zero
postgres=!# SELECT 42;
ERROR: current transaction is aborted, commands ignored until end of transaction block
postgres=!# ROLLBACK TO SAVEPOINT a;
ERROR: savepoint "a" does not exist
postgres=!# ROLLBACK TO SAVEPOINT alpha;
ROLLBACK
postgres=*# SELECT 42;
?column?
----------
42
(1 row)
postgres=*# COMMIT;
COMMIT
Readers never block writers and writers never block readers. In our money transfer example, the “old” data is shown. The UPDATE we are running puts an exclusive lock, preventing any further writes. Locks are held until the end of the transaction. Writers can block writers.
This is an example of Isolation at play. Any subsequent writes that can’t proceed due to the lock can run once the locks are removed at the end of the transaction.
There are 4 isolation levels in the SQL standard:
- READ UNCOMMITTED – there is no way to see uncommitted data in PostgreSQL. If you request this, you’ll actually get READ COMMITTED.
- READ COMMITTED – I never see dirty data from uncommitted transactions; the committed value is always shown. This is the default in PostgreSQL.
- REPEATABLE READ – nothing changes under me. I see the same values; it’s like having a snapshot of the data.
- SERIALIZABLE – completely, absolutely guarantees no anomalies. Rarely used. We won’t talk about it.

Use REPEATABLE READ for complicated reports to make sure you get consistent results, no matter how long the report takes to generate.
13:54 Break! 14:06 Back!
We might have the impression that isolation always works: it doesn’t, it’s an illusion. There’s a good reason for the different isolation levels. The “lower” the isolation level, the fewer guarantees.
There’s a specific kind of anomaly called lost update anomaly. Two options for preventing this:
- SELECT … FOR UPDATE
- SELECT … FOR NO KEY UPDATE (even better)

Either works, but FOR NO KEY UPDATE is preferred because it takes a weaker lock. This approach is better if a conflict is likely. Tradeoff: you introduce locks, which are bad for concurrency.
SELECT amount FROM accounts WHERE id = 1 FOR NO KEY UPDATE;
SELECT amount FROM accounts WHERE id = 1 FOR UPDATE;
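Putting it together, the read-modify-write pattern might look like this (my sketch, using the accounts table from the money transfer example):

```sql
BEGIN;

-- Lock the row before reading, so no concurrent transaction can
-- change it between our read and our write:
SELECT amount FROM accounts WHERE id = 1 FOR NO KEY UPDATE;

-- ... the application computes the new balance here ...

UPDATE accounts SET amount = amount - 100.00 WHERE id = 1;

COMMIT;
```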
There are other ways:
SELECT amount FROM accounts WHERE id = 1 FOR NO KEY UPDATE NOWAIT; <-- Returns an error right away.
SELECT amount FROM accounts WHERE id = 1 FOR NO KEY UPDATE SKIP LOCKED; <-- Won't wait but will pretend that locked rows DO NOT EXIST.
You might not use SKIP LOCKED super often.
You can set the isolation level with something like:
BEGIN ISOLATION LEVEL REPEATABLE READ;
:skull: :lock: There is a deadlock checker that looks for deadlock cycles. It will cancel one of the transactions. A deadlock is a kind of serialization error.
Two deadlocks per day: NBD. Deal with it in the application :smiley_cat: Five deadlocks per minute: worry! :cry:
Completely avoiding deadlocks: not possible. Work to avoid/reduce.
The :bow_and_arrow: bow and arrow analogy:
Hmm. What if we both followed the rule “grab the bow first”? No more :skull: :lock: . Translated to the database: consider a rule like always update the account with the lower ID first.
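Sketched on the accounts table: if every transfer locks the lower id first, two concurrent transfers queue up instead of deadlocking.

```sql
BEGIN;
-- Rule: always touch the account with the lower id first.
UPDATE accounts SET amount = amount - 100 WHERE id = 1;
UPDATE accounts SET amount = amount + 100 WHERE id = 2;
COMMIT;
```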
Recall that locks are always held until the end of the transaction. You might be running into lots of locks because your transactions are too long.
There are two kinds of databases:
Keep transactions short. Keep transactions small. Don’t modify more in a transaction than is necessary for consistency.
Let’s learn more:
postgres=# SELECT * FROM pg_stat_activity WHERE datname IS NOT NULL;
postgres=# SELECT * FROM pg_locks;
Look for locks where granted is f.
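One way to combine the two views and see who is waiting on what (a sketch; the columns are from the stock catalog views):

```sql
SELECT l.pid, l.locktype, l.mode, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE NOT l.granted;   -- only the locks still being waited for
```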
From that first query, we get process IDs!
datid | datname | pid | leader_pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_sta
-------+----------+------+------------+----------+----------+------------------+-------------+-----------------+-------------+-------------------------------+------------------
13395 | postgres | 3734 | | 10 | postgres | psql | | | -1 | 2021-08-31 17:42:07.755783+00 | 2021-08-31 18:28:
13395 | postgres | 3782 | | 10 | postgres | psql | | | -1 | 2021-08-31 17:48:58.460694+00 | 2021-08-31 18:28:
13395 | postgres | 3932 | | 10 | postgres | psql | | | -1 | 2021-08-31 18:29:19.932291+00 |
13395 | postgres | 3968 | | 10 | postgres | psql | | | -1 | 2021-08-31 18:31:18.042713+00 | 2021-08-31 18:32:
(4 rows)
Get more info with SELECT pg_blocking_pids(3968);.
The blocking transaction must be terminated. There are two ways to do this!
- SELECT pg_cancel_backend(PID); – cancels the current statement. This is when you see ERROR: canceling statement due to user request.
- SELECT pg_terminate_backend(PID); – kills the whole session. psql will try (and probably succeed) to reconnect.

The notes above are to get you out of a bind. The right thing is to fix the application so it doesn’t make you have to do this.
You can prevent long-running transactions but be super careful. Observe:
postgres=# SET idle_in_transaction_session_timeout = '2s';
SET
postgres=# BEGIN;
BEGIN
postgres=*# SELECT 42;
FATAL: terminating connection due to idle-in-transaction timeout
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
Consider a value like 1 minute but IDK, up to you!
Use statement_timeout to prevent runaway queries.
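For example (the value here is just illustrative; pick one that fits your workload):

```sql
-- Cancel any single statement that runs longer than 30 seconds:
SET statement_timeout = '30s';
```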
postgres=# SELECT ctid, xmin,xmax, * FROM accounts;
ctid | xmin | xmax | id | name | amount
--------+------+------+----+------+----------
(0,11) | 561 | 0 | 1 | Plop | 9800.00
(0,12) | 562 | 563 | 2 | Brie | 20200.00
(2 rows)
The ctid is the current tuple id, the physical storage location in the table for that row:
Plop is the eleventh entry in the first block.
An UPDATE doesn’t really update, it makes a copy. Observe the change in ctid:
postgres=*# SELECT ctid, xmin, xmax, * FROM accounts;
ctid | xmin | xmax | id | name | amount
--------+------+------+----+------+----------
(0,11) | 561 | 0 | 1 | Plop | 9800.00
(0,14) | 566 | 0 | 2 | Brie | 20300.00
(2 rows)
postgres=*# UPDATE accounts SET amount = amount + 100 WHERE id = 2;
UPDATE 1
postgres=*# SELECT ctid, xmin, xmax, * FROM accounts;
ctid | xmin | xmax | id | name | amount
--------+------+------+----+------+----------
(0,11) | 561 | 0 | 1 | Plop | 9800.00
(0,15) | 566 | 0 | 2 | Brie | 20400.00
(2 rows)
I missed it a bit, but xmin and xmax tell us which transaction introduced a row version and which one invalidated it. There will be multiple versions of a row in a table. Check my screenshot from 14:46. Deleting a row doesn’t remove it, it just sets its xmax. These old versions of the row will eventually need to be deleted; we’ll talk about garbage collection tomorrow. Doing it at commit time would be too soon. Recall how REPEATABLE READ works.
CREATE EXTENSION pageinspect;
A low-level extension that lets you read the raw data. Useful for debugging data corruption:
SELECT * FROM heap_page_item_attrs(get_raw_page('accounts', 0), 'accounts');
:warning: Vacuuming the table cleans up the multi-versioned rows. Each version of a row has its own entry in the index, so you can wind up with duplicate entries in a unique index, but only one row version will be visible. Indexes don’t have the xmin and xmax information.
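A manual run looks like this (autovacuum normally does it for you):

```sql
-- Reclaim dead row versions and report what was cleaned up:
VACUUM VERBOSE accounts;
```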
Q: pg_stat_activity? A: Yes. That table is not indexed; it is stored in memory. No I/O pain.
Use the pg_blocking_pids function to find who is blocking that backend.
Let’s learn about table-level locks. Take a look at Table 13.2. Conflicting Lock Modes. If the cell has an X, those locking modes are not compatible. For example ROW EXCLUSIVE locks are compatible with one another and that’s why we can INSERT multiple times in the same table concurrently.
Avoid SELECTing from a table while you DROP it: even a plain SELECT takes an ACCESS SHARE lock, which conflicts with the ACCESS EXCLUSIVE lock that DROP TABLE needs.
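The same pg_locks view from before also shows these table-level locks, e.g.:

```sql
-- Which relations are locked right now, and how:
SELECT relation::regclass, mode, granted
FROM pg_locks
WHERE locktype = 'relation';
```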