Ways to Control Concurrency

Chapter 18

Only sections 1, 5, 6, 7

 

Brief review of transaction processing

      Because transactions are interleaved by the OS, some sort of concurrency control is needed

      A schedule is the order of operations of concurrently executing transactions

   It takes the interleaving into account

      Some schedules are serializable, others are not

 

Overview of subject

      We will look at ways to control concurrency to ensure the isolation property of concurrently executing transactions

      Most of these techniques ensure serializability of schedules

      They use protocols that guarantee serializability

    Most DBMSs use locking of data items

    Another set of protocols use timestamps

      Another factor that affects concurrency control is the granularity of the data items

     The granularity may be anything from a single attribute or tuple to an entire table

 

Summary of topics for today

      Locking data items

      Binary locks

      Shared/exclusive (read/write) locks

      Two-phase locking

      Problems with two-phase locking

   Deadlock

   Starvation          

 

Locking data items

      A lock is a variable associated with a data item

    Exactly what is meant by a data item is determined by the granularity.

      The lock describes the status of the item: what operations can currently be applied to it

      Usually there is one lock for each data item in the DB

      Locks are used to guarantee the serializability of transaction schedules

      There are problems with using locks: deadlock and starvation

 

Types of locks

      Binary locks

   These are simple two state variables associated with a data item

   But they are so restrictive that they are not used in practice

   We will discuss them to get an idea of how locks work

      Shared/exclusive locks

   These provide a more general locking capability

   They are used in practical DB locking schemes

 

Binary locks

      A binary lock can have two states: locked (1) or unlocked (0)

     If lock(x) = 1, x cannot be accessed; if lock(x) = 0, then x can be accessed

      Two operations are used:

   lock_item(x) and unlock_item(x)

      Look at the implementation of lock_item and unlock_item

      The code for each operation must not be interleaved with other operations while it executes (it is a critical section)

 

Implementing a binary lock

      In its simplest form, a lock can be a record with three fields: <data item, 0 or 1, changeLock>

     It also needs a queue associated with it, for transactions that are waiting to access the item

      The system needs to maintain a lock table

    This would contain the names of those items that are locked (usually a hash table)

    If a data item is not in the table, it is assumed unlocked.

      The DBMS must have a lock manager subsystem to keep track of and control access to locks
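
A minimal sketch, in Python, of how lock_item and unlock_item might be implemented; the class, the names, and the use of a condition variable to stand in for the waiting queue are illustrative assumptions, not the textbook's code.

    import threading

    class BinaryLock:
        # Illustrative binary lock for one data item: value 0 = unlocked, 1 = locked.
        def __init__(self):
            self.value = 0
            self.cond = threading.Condition()    # stands in for the item's waiting queue

        def lock_item(self):
            with self.cond:                      # the whole operation is a critical section
                while self.value == 1:           # item already locked: wait in the queue
                    self.cond.wait()
                self.value = 1                   # acquire the lock

        def unlock_item(self):
            with self.cond:
                self.value = 0                   # release the lock
                self.cond.notify()               # wake one waiting transaction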

 

Rules transactions T must follow

             T must lock(x) before reading or writing x

             T must unlock (x) after all reads and writes are completed

             T must not lock(x) if it already has it locked

             T must not unlock(x) unless it already holds the lock on x

             These rules can be enforced by the lock manager

 

 

Binary lock problems

      A binary lock forces mutual exclusion on the data item

   At most one transaction can access a given item at a time

   If all a transaction wants to do is read, several transactions should be allowed to access at once

   But if a transaction wants to write, it must have exclusive access.

      A multiple-mode (shared/exclusive) lock solves this problem

 

Shared/Exclusive (read/write) locks

      There are three locking options instead of just two

     read_lock(x): other transactions are allowed to read the item

     Every transaction reading x sets a read_lock on x

     write_lock(x): a single transaction exclusively holds the lock on the item

     unlock(x): releases either type of lock

      A lock, then, can have three states: read-locked, write-locked, or unlocked

      Each record in the lock table will have four fields <data item, lock state, num reads, change lock>

      Look at the implementation of read_lock, write_lock, and unlock

      Each function must be indivisible
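
As a rough Python sketch (our own illustration, not the textbook's implementation), a shared/exclusive lock for one data item could look like the following; the state and read-count attributes mirror the lock-table fields described above, and writer starvation is ignored for simplicity.

    import threading

    class SharedExclusiveLock:
        # Illustrative read/write lock for a single data item.
        def __init__(self):
            self.state = "unlocked"              # "unlocked", "read-locked", or "write-locked"
            self.num_reads = 0                   # number of transactions holding a read lock
            self.cond = threading.Condition()    # each operation runs as a critical section

        def read_lock(self):
            with self.cond:
                while self.state == "write-locked":   # readers wait only for a writer
                    self.cond.wait()
                self.state = "read-locked"
                self.num_reads += 1

        def write_lock(self):
            with self.cond:
                while self.state != "unlocked":       # a writer needs exclusive access
                    self.cond.wait()
                self.state = "write-locked"

        def unlock(self):
            with self.cond:
                if self.state == "read-locked":
                    self.num_reads -= 1
                    if self.num_reads == 0:
                        self.state = "unlocked"
                else:                                  # releasing a write lock
                    self.state = "unlocked"
                self.cond.notify_all()                 # wake waiting transactions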

 

Rules transactions T must follow for read/write locks

             T must issue either a read or write lock before reading or writing an item

             T must issue a write lock if it is going to write

             T must unlock after operations are completed

             T must not issue locks if it already holds them

             T must not issue an unlock for a lock it does not hold

 

Conversion of locks

      Sometimes a transaction may want to convert a lock it holds to a different type of lock

      Upgrading

    To convert a read lock to a write lock, no other transaction can be holding a read lock

      Downgrading

    It is easier to convert a write lock to a read lock

      If upgrading and downgrading of locks is permitted, extra information must be kept in the lock table

     The specific transaction holding the lock must be recorded, instead of just the fact that some transaction holds it
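
A minimal sketch of the extra check that conversion needs, assuming the lock table now records which transactions hold each lock; the record fields and function names here are hypothetical.

    def can_upgrade(entry, tid):
        # Upgrade read -> write is allowed only if tid is the sole reader of the item.
        return entry["state"] == "read-locked" and entry["holders"] == {tid}

    def can_downgrade(entry, tid):
        # A write lock is held exclusively, so its holder may always downgrade it.
        return entry["state"] == "write-locked" and tid in entry["holders"]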

 

Locks and serializability

      Just the fact that locks are used does not guarantee serializability

      Look at figure 18.3

      The incorrect result occurred because items were unlocked too early

      To guarantee serializability, an additional protocol must be added

    This will concern the positioning of locks and unlocks for each transaction

    The best known is two-phase locking

 

Two-phase locking

      All locking operations in a transaction precede the first unlock operation

    All locks must be made before anything is unlocked

      A transaction can be divided into two phases

     An expanding phase, where the number of locks held increases

     A shrinking phase, where the number of locks held decreases

      If lock conversion is allowed:

     Upgrading must be done during the expanding phase

    From read lock to write lock

    Downgrading must be done during the shrinking phase

    From write lock to read lock

      It can be proved that if every transaction follows the two-phase locking protocol, the schedule is always serializable.

    Now, no test for a serializable schedule is necessary
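
A small Python sketch (illustrative only) of what the two-phase rule means for a single transaction's sequence of lock and unlock operations:

    def is_two_phase(operations):
        # operations is a list like [("lock", "X"), ("lock", "Y"), ("unlock", "X"), ...]
        # Returns True if no lock operation appears after the first unlock.
        unlocking_started = False
        for op, _item in operations:
            if op == "unlock":
                unlocking_started = True
            elif op == "lock" and unlocking_started:
                return False          # a lock in the shrinking phase violates 2PL
        return True

    # Obeys 2PL:
    print(is_two_phase([("lock", "X"), ("lock", "Y"), ("unlock", "X"), ("unlock", "Y")]))  # True
    # Violates 2PL (locks Y after unlocking X):
    print(is_two_phase([("lock", "X"), ("unlock", "X"), ("lock", "Y"), ("unlock", "Y")]))  # False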

 

Problems with two-phase locking

      It may limit the amount of concurrency that can occur in a schedule

      Not all the possible serializable schedules are permitted.

      Some time slice allocations may result in deadlock

   Look at possible time slices in Figure 18.4

      Starvation may also occur

 

Variations of two-phase locking

      Basic

    This is what we have been talking about

    It guarantees serializability, but may cause deadlock

      Conservative

    All locks are made before the transaction begins execution

    This requires predeclaring its read and write sets

    If any of the items in the sets cannot be locked, none are locked; it waits until all are available for locking

    As you might guess, this is not practical for several reasons

    This is deadlock free

      Strict 2PL

    This is the most popular variation of 2PL

    A transaction does not unlock any of its write locks until after it commits or aborts

    No other transaction can read or write an item that this transaction writes, until it is done

    This is not deadlock free

      Rigorous 2PL

     This is more restrictive than strict 2PL

     A transaction does not release any of its locks until after it commits or aborts
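
To make the differences concrete, here is a hedged Python sketch (not textbook code) that checks whether a transaction's event sequence holds its locks until commit or abort; checking both lock kinds corresponds to rigorous 2PL, checking only write locks to strict 2PL.

    def holds_until_finish(events, kinds=("read", "write")):
        # events: list like [("write_lock", "X"), ("read_lock", "Y"),
        #                    ("unlock", "Y"), ("commit", None), ("unlock", "X")]
        # kinds=("read", "write") checks rigorous 2PL; kinds=("write",) checks strict 2PL.
        lock_kind = {}          # item -> "read" or "write"
        finished = False
        for op, item in events:
            if op in ("commit", "abort"):
                finished = True
            elif op == "read_lock":
                lock_kind[item] = "read"
            elif op == "write_lock":
                lock_kind[item] = "write"
            elif op == "unlock" and not finished and lock_kind.get(item) in kinds:
                return False    # a protected lock was released before the transaction finished
        return True

    # The example sequence in the comment satisfies strict 2PL (the write lock on X
    # is held past commit) but not rigorous 2PL (the read lock on Y is released early).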

 

Summary of variations of 2PL

      Conservative 2PL

   Since it must lock all items before it starts, it is always in the shrinking phase

      Rigorous 2PL

   Since it does not unlock any of its locks until done, it is always in the expanding phase

      The only variation that guarantees freedom from deadlock is conservative.

 

 

Dealing with deadlock

      Deadlock occurs when each transaction in a set of two or more is waiting for some item that is locked by another transaction

      Deadlock prevention protocols

    But they may cause some transactions to be aborted and restarted needlessly

    This is true even though those transactions may never actually cause deadlock

      Deadlock detection and timeouts

    Waiting until deadlock occurs and detecting it, then fixing it is a more practical approach.

 

Deadlock prevention

1. Require a transaction to lock all the items it needs in advance; not very practical

2. Timestamp protocols

    If T1 starts execution before T2, we say that T1 < T2

    Wait-die: 

     An older transaction waits for a younger transaction that holds an item it needs

    A younger transaction requesting an item held by an older transaction is aborted, then restarted

    Wound-wait:

     A younger transaction waits for an older transaction that holds an item it needs

    If the older T needs an item the younger has locked, the younger is aborted, then restarted.
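
A minimal sketch of the two timestamp rules, assuming a smaller timestamp means an older transaction; the function names and return values are ours, not a standard interface.

    def wait_die(requester_ts, holder_ts):
        # Wait-die: an older requester waits; a younger requester dies (aborts, restarts later).
        if requester_ts < holder_ts:      # requester is older
            return "wait"
        return "abort"                    # requester is younger

    def wound_wait(requester_ts, holder_ts):
        # Wound-wait: an older requester wounds (aborts) the younger holder;
        # a younger requester waits for the older holder.
        if requester_ts < holder_ts:      # requester is older
            return "wound holder"
        return "wait"                     # requester is younger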

 

Deadlock prevention

3. No time stamping required

   No waiting

   If a transaction is unable to obtain a lock, it aborts and restarts later

   Causes a lot of unnecessary aborts and restarts

   Cautious waiting

   Suppose you have two transactions T1 and T2.  T2 has a lock on item X that T1 needs.  T1 waits only if T2 is not also waiting for another item

   A transaction only waits for transactions that are not waiting.

   If T2 later becomes blocked, deadlock still will not occur
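
A hedged sketch of the decision each of these schemes makes when a requested lock cannot be granted immediately (illustrative, not a real DBMS interface):

    def no_waiting(lock_granted):
        # No waiting: if the lock is not immediately available, abort and restart later.
        return "proceed" if lock_granted else "abort and restart"

    def cautious_waiting(lock_granted, holder_is_blocked):
        # Cautious waiting: wait only if the transaction holding the item is not
        # itself blocked waiting for some other item; otherwise abort and restart.
        if lock_granted:
            return "proceed"
        return "wait" if not holder_is_blocked else "abort and restart"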

 

Deadlock detection

      A more practical method of dealing with deadlock is to wait until it occurs, then do something.

      The classic way of detecting deadlock is a wait-for graph

     Every executing transaction has a node in the graph; the system keeps track of the items each transaction has locked and the items it is waiting for

     There is a directed edge from every transaction waiting for an item to the transaction that holds the lock on that item.

    If the graph has any cycles, deadlock has occurred.

      The graph must be updated every time a transaction asks for a lock, gets a lock, or releases a lock
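
A compact Python sketch (our own illustration) of detecting a cycle in a wait-for graph with depth-first search:

    def has_cycle(wait_for):
        # wait_for maps each transaction to the set of transactions it is waiting for.
        visited, on_stack = set(), set()

        def dfs(tx):
            visited.add(tx)
            on_stack.add(tx)
            for other in wait_for.get(tx, set()):
                if other in on_stack:                 # back edge: a cycle means deadlock
                    return True
                if other not in visited and dfs(other):
                    return True
            on_stack.discard(tx)
            return False

        return any(tx not in visited and dfs(tx) for tx in wait_for)

    # T1 waits for T2 and T2 waits for T1: deadlock
    print(has_cycle({"T1": {"T2"}, "T2": {"T1"}}))    # True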

 

Deadlock

      Maintaining the wait-for graph and repeatedly checking it for cycles adds overhead

      A decision must be made about when the system should check for deadlock

    This could be based on:

    The number of transactions currently running, or

    The amount of time some transactions have been waiting

      Victim selection

     Usually, the victim should NOT be a transaction that has been running for a long time and has made a lot of updates

      Timeouts instead of deadlock detection

     If a transaction waits longer than some specified period, deadlock is assumed, and it is aborted

    This method is at least simple, with low overhead
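
The timeout scheme can be sketched in a few lines; the 5-second threshold below is an arbitrary illustrative value.

    import time

    def timed_out(wait_started_at, timeout_seconds=5.0):
        # If a transaction has waited longer than the threshold, deadlock is
        # assumed and the transaction is aborted.
        return time.monotonic() - wait_started_at > timeout_seconds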

 

Starvation

      Starvation is when a transaction continually gets aborted or left waiting, while others are executing normally.

   This may occur if the waiting or aborting scheme is unfair

      Ways to assure fairness

   Use a first-come first-served queue for waits

   Give higher priority to transactions that have been waiting the longest or have been aborted

 

Granularity of data items

    Fine granularity refers to small data item size like an attribute

    Coarse granularity refers to large data item size like a disk block or a whole file

      Tradeoffs when deciding on granularity

    The larger the data item size, the lower the degree of concurrency

    The smaller the data item size, the more items there are in the DB

   Every item has a lock associated with it;

   there will be more lock and unlock operations;

   the lock table will be larger;

   if timestamping is used, there will be more timestamps

 

What is the best granularity?

     It depends on the types of transactions involved.

  If most transactions access a small number of records, granularity should be one record

  If most transactions access many records in the same file, use block or file granularity