TM

[Transaction Management]

A different kind of TM :)

Ch.10

What is a 'transaction'?

'ACID'

'ACID' (+S) is an acronym to express the desirable properties of a transaction:

Tracking updates

Concurrency - 'many at once'

Consider the following pair of transactions, starting with 35 units of an item: purchase 100 units, then sell 30 units:

If the transactions T1 and T2 happen one after another (not concurrently), the PROD_QOH value would/should be 105, as shown above.

Three different kinds of errors are possible, when interleaved transactions are not handled properly.

Error #1: lost updates

'Lost update' problem:

Here, instead of 105, our QOH comes out to be 5, which is incorrect.

Consider this rollback situation:

Proper rollback (with sequential transactions):

Here the total is 5, which is correct - the QOH goes from 35 to 135, gets rolled back to 35, after which the purchase (of 30 units) happens.

Error #2: reading uncommitted data

'Uncommitted data' problem (improper rollback):

Here, the total comes out to 105, when it should have been 5.

Consider the following fix (of a typo made earlier - an incorrect order of 10 units was placed for 1558-QW1 by mistake, instead of ordering 1546-QQ2; now we're fixing that error):

In the above, T1 is a transaction that sums up QOH; T2 is the 'correction' transaction (that fixes the incorrect purchasing).

Proper retrieval of modified data:

The summing transaction gets proper totals for both affected products, so the total (of 92) is correct.

Error #3: improper [premature] retrieval

'Inconsistent retrievals' problem (improper ("before update") retrieval of modified data):

T1 should be doing 65+13 (which would be correct), but instead does 65+23 (which makes it incorrect) - in other words, T1 retrieves the correct value (25) for 1546-QQ2, but gets the incorrect value (23) for 1558-QW1.

Inconsistent retrieval is also called a 'dirty read'.

Look up: non-repeatable reads, phantom row reads.

Concurrent, serializable schedule

A serializable schedule makes concurrency immaterial (non-issue).

Also, DB admins can set a specific 'isolation' level ('TIL') for a DB, which prescribes the extent to which a transaction can be isolated from others [the more the isolation, the more other transactions are 'starved'; the less the isolation, the higher the chances for invalid results].

More on TILs: https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/transaction-isolation-levels?view=sql-server-ver15 and https://www.infoq.com/articles/Isolation-Levels/

TRL pessimistic 'lock' situations: restrooms, RCS, Robert's Rules Of Order..

Locking: granularity

Locking: types [unlocked, locked_read, locked_write]

Read(shared), write(exclusive) lock

'2PL'

The 2PL type that we are discussing, where a transaction acquires all the locks it needs before processing starts, is called 'Conservative' 2PL (or 'Static' 2PL).

2PL [cont'd]

Once all the locks are acquired, a transaction can proceed 'smoothly', ie won't 'hang' or lead to dirty reads etc.

As for the middle point above (unlocking can't precede locking), here is a loose analogy:

OK:
{
  {
    {
      {
        {
          ....
        }
      }
    }
  }
}

Not OK (OK in programming, though!):
{
  {
  }
  {
   {
      {
      }
   }
  }
}

2PL [cont'd]

Deadlocks..

Timestamping-based schemes prevent deadlocks (see upcoming slide); 2PL avoids deadlocks; detection is used in both.

How/why a deadlock occurs

In the above, we see that T1 locks X, T2 locks Y, then deadlock occurs because T1 wants Y and T2 wants X. Maybe you are thinking - why can't each of them release what they are holding (since they might be done with it), and grab what they want next (ie. T1 would release X, T2 would release Y)? BECAUSE THEY CAN'T - they NEED to access what the other has, BEFORE they can release what they have. Eg. T1 might need to read Y that is locked by T2, in order to update its X; T2 might need T1's X to use in an expression to compare with its Y. They need access to each other's resources, **before** they can release their own! That is what causes deadlocking.

Note that more than two transactions can become deadlocked as well, on account of 'circular' (cyclical) waiting.

Here's a humorous take on deadlocking that can occur in human interactions.

Deadlock occurrence in 2PL

Earlier we noted that the 2PL scheme cannot prevent deadlock creation. Here is an example of how a deadlock could occur (at the end of step 5):

Here is one way to get monotonically increasing GUIDs..

Deadlock prevention

Note that wound-wait is a preemptive deadlock prevention scheme, whereas wait-die is a non-preemptive one..

PS: Deadlock detection is periodically carried out by detecting cycles in waiting transactions.

Deadlock 'indifference' ['fix-it-later']