Pular para o conteúdo principal

Article: Preventing Database Gridlock: Recognizing and Resolving Deadlock Scenarios



Learn how deadlocks occur in database systems, understand their impact on performance, and discover practical techniques for identifying potential deadlock scenarios in your SQL code.


Understanding Database Deadlocks: When Your Queries Come to a Standstill

Imagine two trains* approaching each other on a single-track railway. Each train needs to reach the other side, but neither can proceed because the other train is blocking the way. Both engineers are waiting for the other to back up, resulting in a complete standstill. This real-world scenario perfectly illustrates what happens during a database deadlock we face all day.
*) I guess this "train example" to be very popular on it because it seems this was the first technology we face this challenge of having 2 direct opposite forces using the same resource, at least the first popular one we see people talking about in papers. If you know other story on this, please let me know on the comments.
In the database world, a deadlock occurs when two or more transactions are waiting indefinitely for one another to release locks on resources, creating a circular dependency that prevents any of them from proceeding. The database management system eventually detects this situation and terminates one of the transactions (which we could call the "victim"), allowing the others to complete.

There are many ways it could happen and in some level one could say it is one of that things the sets SGDBs apart (since it works in a very high level of operation inside the database application itself) but even though I believe this to be true I can say as well they are much alike, no matter the SGDB we are talking about, if it is a Relational Database type it MUST have some kind of DeadLock Resolution Mechanism because it is a base principal of relational databases. And to ease the explanation we will be using the mysql in this article and considering we are not explaining it on a high level of operation most of the scenarios here will apply for any other relational SGDB as well.

For a deadlock to occur in a database system, four conditions must be present simultaneously (to what we call "The Four Horsemen of Deadlocks Necessary Conditions"):
  1. Mutual Exclusion: Resources (like database rows) cannot be shared simultaneously.
  2. Hold and Wait: Transactions hold resources while waiting for additional ones.
  3. No Preemption: Resources cannot be forcibly taken away from transactions.
  4. Circular Wait: A circular chain of transactions exists, with each waiting for a resource held by the next transaction in the chain.
Understanding these conditions helps us identify potential deadlock scenarios in our database interactions and develop strategies to prevent them.
These are commonly known as well as the "Coffman Conditions" (this was the reference I had when I first get to know of it) or "Coffman's Deadlock Conditions". It was formalized by computer scientist Edward G. Coffman, Jr. in 1971.

Common Database Deadlock: Scenario 1 - The Classic Update Cross-Lock



The most common deadlock occurs when two transactions update the same rows in different orders:

-- Transaction 1
START TRANSACTION;
UPDATE customers SET balance = balance - 100 WHERE id = 123;
-- Some processing time elapses
UPDATE wallets SET payout = 200 WHERE id = 456;
COMMIT;

-- Transaction 2 (running concurrently)
START TRANSACTION;
UPDATE wallets SET payout = 150 WHERE id = 456;
-- Some processing time elapses
UPDATE customers SET balance = balance + 150 WHERE id = 123;
COMMIT;

In this scenario, Transaction 1 locks customer #123 and waits for a lock on wallets #456, while Transaction 2 locks wallets #456 and waits for a lock on customer #123. Neither can proceed, creating a perfect deadlock.

Common Database Deadlock: Scenario 2: The Subtle INSERT IGNORE Deadlock



A less obvious but equally problematic scenario involves concurrent INSERT IGNORE operations:

-- Transaction 1
START TRANSACTION;
INSERT IGNORE INTO content (title, user_id, type) 
VALUES ('Explosive news content', 101, 'politics');
COMMIT;

-- Transaction 2 (running concurrently)
START TRANSACTION;
INSERT IGNORE INTO content (title, user_id, type)
VALUES ('Adventure news content', 102, 'sports');
COMMIT;

This deadlock occurs because both transactions:

First acquire exclusive (X) locks on the Supremum Record* to check for duplicates
Then both try to acquire insert intention locks on the same Supremum Record
When these operations run simultaneously, they can create a circular wait condition that results in a deadlock.
*) Supremum Record is a behind-the-scenes helper that ensures the database knows where a page’s records end, especially when managing locks.

This "INSERT IGNORE" is a MySQL/MariaDB exclusive command (not an SQL99 standard nor any other review that I know of) but for every other SGDB there probably have a similar command. Bellow is a small list of SGDBs and related commands that works similar to the "INSERT IGNORE" to consider and that could result in an "INSERT IGNORE" deadlock as well.

SGDB:Command
PostgreSQL:INSERT ... ON CONFLICT DO NOTHING
SQLite:INSERT OR IGNORE
Oracle:MERGE
SQL Server:MERGE / IF NOT EXISTS INSERT

Common Database Deadlock: Scenario 3: The Index-Related Deadlock



Deadlocks can also occur when different transactions access the same data through different indexes:

-- Transaction 1
START TRANSACTION;
SELECT * FROM wallets WHERE id = 1 FOR UPDATE;
UPDATE wallets SET value = value + 10 WHERE id = 1 AND payout > 100;
COMMIT;

-- Transaction 2 (running concurrently) 
START TRANSACTION;
SELECT * FROM wallets WHERE payout > 100 AND id = 1 FOR UPDATE;
UPDATE wallets SET value = value + 20 WHERE id = 1;
COMMIT;

If the table has separate indexes on both id and payout, the database might use different access paths, potentially causing deadlocks even when operating on the same record.

In the sample we use the "FOR UPDATE" illustration because it may result in a "Gap Lock"** or in a "Next-key lock"* and are specially possible when using InnoDB engine on MySQL/MariaDB.

Gap Locks lock the space between index records, or the gap before the first or after the last index record in an index. Unlike row locks that locks actual data, gap locks prevent other transactions from inserting new rows that would fall within the locked gap.

A Next-key Lock locks the index record and the gap before that index record. For example, if there are index entries for values 10, 20, and 30, the possible next-key lock ranges would be: a) The gap before 10; b) The value 10 and the gap between 10 and 20; c) The value 20 and the gap between 20 and 30; d) The value 30 and the gap after 30 

How to Search for Potential Deadlocks in Your Code Repository

With a better understanding of how the locks works you gain in this article try to identify transaction patterns that resembles those we saw. Search your codebase for transaction blocks that:
  • Begin and end with START TRANSACTION and COMMIT statements (you just need to take care for the commands inside those commands);
  • Include multiple UPDATE, INSERT, or DELETE operations;
  • Operate on related tables;
  • May have conditional logic that affects the order of operations;
  • Look for code like:

START TRANSACTION;
-- Multiple database operations here
-- Particularly on different tables
COMMIT;

The following I try to lay a little more light on those ...

Review Lock Acquisition Order

Examine your transaction code to identify the order in which locks are acquired. Look for instances where different code paths might acquire locks in different orders:

Check for transactions that update related tables in different orders
Look for conditional logic that might change the order of operations
Pay special attention to transactions that might run concurrently

Analyze Database Monitoring Output

Review your database logs and monitoring tools for "Deadlock detected" messages
and "Transaction rollbacks" due to deadlocks.

One toll to use on MySQL/MAriaDB would be the "SHOW ENGINE INNODB STATUS" command output on the section "Latest detected deadlocks". Example deadlock information from MySQL:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2023-04-25 10:15:32 0x7a9c0fa70f90
*** (1) TRANSACTION:
TRANSACTION 284052907210, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 250819, OS thread handle 140155019614944, query id 62870507 172.1.44.6 mydatabase update
INSERT IGNORE INTO...

There you should have access to a report on what happened to the scenario, which command was placed first, which were lock waiting for the first and which of them were dropped (the victim of the stalemate) and issue a Deadlock error.

Bellow a list of alternatives you could search more info on other SGDBs

DatabaseEquivalent to SHOW ENGINE INNODB STATUS
MySQLSHOW ENGINE INNODB STATUS;
PostgreSQLpg_locks, pg_stat_activity, log settings
Oraclev$session, dba_blockers, trace files
SQL Serversys.dm_tran_locks, Extended Events
SQLiteBasic PRAGMA info (no real locks inspection)

Examine Common Deadlock Patterns

Look specifically for:
  • Multiple INSERT IGNORE (mysql specific, for others see the table on the "INSERT IGNORE" item on this article) statements on the same table
  • Transactions that perform read-then-write operations
  • Operations that might escalate from row locks to table locks
  • Foreign key relationships that might create implicit locks

Use Static Code Analysis Tools

Consider using database-specific static analysis tools to identify potential deadlock scenarios in your codebase. These tools can often detect lock acquisition patterns that might lead to deadlocks.

Preventing Deadlocks: Best Practices

  1. Consistent Order: Always access tables and rows in the same order across all transactions;
  2. Keep Transactions Short: Minimize the duration and scope of transactions. This is not only the amount of time the queries inside the transaction takes to run the full script but the amount of commands you have. To shorten it consider not to use a Search then Insert OR Update but Try to Insert and Update on fail mechanism. And try not to use calculations inside the transaction, when opening a transaction try only to push the changes to the database and finish it;
  3. Use Appropriate Isolation Levels: Consider using READ COMMITTED instead of REPEATABLE READ when possible;
  4. Add Retry Logic: Implement deadlock detection and retry mechanisms in your application. When implementing retry logic, avoid immediate retries as they might exacerbate lock contention. Instead, add a delay. Some prefer randomized delay, I myself use a fixed time of half of a second of delay on those cases;
  5. Review Indexes: Ensure proper indexing to minimize lock contention here (sometimes) less is more, if you have too many indexes it will take longer for an update or insert to run;
  6. Monitor and Analyze: Regularly check deadlock logs to identify patterns;
By understanding how deadlocks occur and implementing these proactive measures, you can significantly reduce their occurrence in your database systems, leading to improved performance and reliability.

Referencies:

  • Coffman, E. G., Elphick, M., & Shoshani, A. (1971). "System Deadlocks." ACM Computing Surveys, 3(2), 67-78.
  • MySQL manual;

Comentários

Postagens mais visitadas deste blog

PHP: Always use an index to add a values to an array

  Demand Always identify the array index when manipulating the array directly Description Whenever you are adding a new item to the array you must add an index so that php does not need to recalculate the size of the array to identify in which position the new content will be added. Examples 1: # Bad code: 2: do { 3: $array[] = $newItem; 4: } while(!feof($fp)); 5: 6: # Good code 7: $array = []; 8: $index = 0; 9: do { 10: $array[$index++] = $newItem; 11: } while(!feof($fp)); Examples Explanation When instantiating a new item in an array in the "bad example" php will identify that you expect the item to be added to the end of the array, since array is a primitive type and has no attributes php always needs to call a calculation function to identify the current size of the array before trying to add a new item to the end of it. By monitoring the growth of the array in an external variable and automatically reference which position will...

General: Prevent arbitrary precision arithmetic

Demand Some simple decimal calculations like 4.6 * 100 may lead in irregularities when dealing with numbers. Description During arithmetic operations like addition, subtraction, multiplication, and division, the numbers are manipulated according to the rules defined by the IEEE 754 standard. However, due to the finite precision of the representation, rounding errors may occur, leading to small discrepancies between the expected and actual results of computations. In simpler terms, the computer operates using powers of 2 (binary-based), so whenever we need to work with a number that cannot be exactly represented in binary (like 0.1, which is a base-10 fraction), it uses an approximation rather than an infinite number and is incapable of presenting the exact correct value. To prevent it we may use some libraries to prevent the problem. In Javascript and Typescript we have the BigNumbers library (and others). PHP has a built-in library for arbitrary-precision arithmetic called BCMath (Bin...