The First Part of Concurrency Control Summary
Concurrency control is the management of concurrently executing transactions. A transaction is a sequence of read and write operations on the database. Hence, the database only sees the read and write operations. The goal of concurrency control is to handle the interleaving of operations from different transactions correctly.
The figure below shows an example of a transaction: one that transfers $100 from account R to account S. However, the transaction manager does not see some of the steps (3 and 6) of the transaction.
To solve the above issue, every transaction must maintain the following properties, known as ACID.
- Atomicity: Either all actions in the transaction happen or none of them do.
- Consistency: If the database starts out in a consistent state, it should end in a consistent state. For example, in an application that transfers funds from one account to another, the consistency property ensures that the total value of funds in both accounts is the same at the start and end of each transaction.
- Isolation: Each transaction executes in isolation from the others. The intermediate state of a transaction is invisible to other transactions. As a result, transactions that run concurrently appear to be serialized.
- Durability: After a transaction successfully completes, changes to data persist and are not undone, even in the event of a system failure. For example, in an application that transfers funds from one account to another, the durability property ensures that the changes made to each account will not be reversed.
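As a rough illustration of atomicity and consistency, here is a minimal sketch of the funds transfer using Python's built-in sqlite3 module. The table name `accounts`, the starting balances, and the helper names `make_accounts`, `transfer`, and `balances` are assumptions made for this example, not something taken from the figure.

```python
import sqlite3

def make_accounts():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("R", 150), ("S", 50)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (left,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if left < 0:
                raise ValueError("insufficient funds")  # triggers the rollback
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a dict such as {"R": 150, "S": 50}."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = make_accounts()
print(transfer(conn, "R", "S", 100), balances(conn))  # first transfer succeeds
print(transfer(conn, "R", "S", 100), balances(conn))  # second one is rolled back
```

In this sketch the second transfer would overdraw account R, so the debit that was already applied is undone and the total across both accounts stays the same.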
A transaction isolation level is defined by which of the following phenomena it allows:
- Dirty read: A dirty read occurs when a transaction reads data that another transaction has written but not yet committed.
- Non-repeatable read: A non-repeatable read occurs when a transaction reads the same row twice and gets a different value each time.
- Phantom read: A phantom read occurs when the same query is executed twice within a transaction, but the two executions return different sets of rows.
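For reference, the SQL standard names four isolation levels according to which of these phenomena they still permit. The sketch below encodes that mapping as a lookup table. The names `PERMITTED` and `weakest_level_preventing` are made up for this example, and real database systems may behave more strictly than the standard requires.

```python
# Phenomena each SQL-standard isolation level still permits,
# listed from the weakest level to the strongest.
PERMITTED = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED": {"non-repeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}

def weakest_level_preventing(*phenomena):
    """Return the weakest level that permits none of the given phenomena."""
    for level, allowed in PERMITTED.items():  # dicts keep insertion order
        if allowed.isdisjoint(phenomena):
            return level

print(weakest_level_preventing("dirty read"))    # READ COMMITTED
print(weakest_level_preventing("phantom read"))  # SERIALIZABLE
```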
The different States of a Transaction:
Now, let me introduce the schedule. A schedule is a sequence of operations from a set of transactions. A serial schedule is one where transactions are executed one after another, with no interleaving. The keyword here is serial order. We need multiple transactions to run concurrently, and we need the result to be serializable, that is, the same as if they had run one at a time on a single machine. Below are serial schedules. The schedules below are equivalent.
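To make the idea concrete, here is a small sketch that models a schedule as a list of (transaction, action, item) tuples. That representation, the helper names `is_serial` and `run`, and the rule that every write stores "the value read earlier plus one" are all assumptions chosen for illustration.

```python
def is_serial(schedule):
    """A schedule is serial if each transaction's operations are contiguous."""
    order = []
    for txn, _action, _item in schedule:
        if order and order[-1] == txn:
            continue          # still inside the same transaction
        if txn in order:
            return False      # the transaction resumed after another one ran
        order.append(txn)
    return True

def run(schedule, db):
    """Execute a schedule in which every transaction increments what it read."""
    db = dict(db)             # work on a copy of the initial state
    local = {}                # private values read by each transaction
    for txn, action, item in schedule:
        if action == "R":
            local[(txn, item)] = db[item]
        else:                 # "W": write back the value read earlier, plus one
            db[item] = local[(txn, item)] + 1
    return db

serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
interleaved = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]

print(is_serial(serial), run(serial, {"A": 0}))            # True {'A': 2}
print(is_serial(interleaved), run(interleaved, {"A": 0}))  # False {'A': 1}
```

The interleaved schedule loses one of the two increments, which is the kind of anomaly that concurrency control has to rule out.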
See the figure below for all types of schedules. A serial execution is guaranteed to be serializable. A serializable execution is defined as an execution of operations in which concurrently executing transactions appear to be executing serially.
In a serializable schedule, the transactions are executed in a non-serial manner while the end result stays correct and the same as that of some serial schedule. A serializable schedule helps improve both resource utilization and CPU throughput. There are two types of serializability:
Conflict serializable: A schedule is called conflict serializable if it can be transformed into a serial schedule by swapping adjacent non-conflicting operations. Two operations are said to conflict if all of the following conditions hold:
- They belong to different transactions.
- They operate on the same data item.
- At least one of them is a write operation.
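The three conditions above translate directly into code. The sketch below also checks conflict serializability with the standard precedence-graph test (a schedule is conflict serializable exactly when the graph of conflicts between transactions has no cycle), which is a common technique rather than something described in this summary. The tuple representation and the function names are assumptions for the example.

```python
from itertools import combinations

def conflicts(op1, op2):
    """Apply the three conditions to two (transaction, action, item) tuples."""
    (t1, a1, x1), (t2, a2, x2) = op1, op2
    return t1 != t2 and x1 == x2 and "W" in (a1, a2)

def precedence_graph(schedule):
    """Add an edge Ti -> Tj when an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for earlier, later in combinations(schedule, 2):  # pairs keep schedule order
        if conflicts(earlier, later):
            edges.add((earlier[0], later[0]))
    return edges

def is_conflict_serializable(schedule):
    """Return True if the precedence graph has no cycle."""
    edges = precedence_graph(schedule)
    nodes = {txn for txn, _action, _item in schedule}
    while nodes:
        # Transactions with no incoming edge from a remaining transaction can go first.
        free = {n for n in nodes
                if not any(dst == n and src in nodes for src, dst in edges)}
        if not free:
            return False      # every remaining transaction waits on another: a cycle
        nodes -= free
    return True

good = [("T1", "R", "A"), ("T2", "R", "B"), ("T1", "W", "A"), ("T2", "W", "B")]
bad = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]

print(is_conflict_serializable(good))  # True
print(is_conflict_serializable(bad))   # False
```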
View serializable: A schedule is called view serializable if it is view equivalent to a serial schedule (one with no overlapping transactions). Every conflict-serializable schedule is also view serializable, but a view-serializable schedule that contains blind writes may not be conflict serializable.
When the operations of a transaction are interleaved with the operations of other transactions in a schedule, the schedule is called a concurrent schedule. So how can we combine the advantages of both serializability and concurrency?
Schedules and concurrency control mechanisms help. Serializable schedules arrange the steps of the transactions so that each transaction can run to completion without the worry of interference from other transactions. Concurrency control mechanisms manage the flow of transactions and improve the efficiency of the system.
There are two categories of concurrency control: pessimistic and optimistic. Pessimistic concurrency control assumes that there is a high likelihood that two or more transactions will try to access the same resources. A transaction will lock access to the resources it needs to keep others from modifying them. Optimistic concurrency control assumes that it is not likely that multiple transactions will access the same resource. Optimistic techniques will not lock access to resources. Instead, they will check for conflicts at commit time. If it turns out that multiple transactions did indeed access resources concurrently, one or more of them will have to abort.
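The contrast between the two categories can be sketched with a single account object. This is a simplified, single-threaded illustration: the class names, the version counter, and the `commit` method are hypothetical, and a real system would also have to make the validate-and-write step itself atomic.

```python
import threading

class PessimisticAccount:
    """Lock the resource up front so nobody else can modify it meanwhile."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        with self._lock:      # other writers block for the whole read-modify-write
            self.balance += amount

class OptimisticAccount:
    """Take no lock; detect conflicts at commit time with a version number."""

    def __init__(self, balance):
        self.balance = balance
        self.version = 0

    def read(self):
        return self.balance, self.version

    def commit(self, new_balance, read_version):
        """Install the write only if nobody has committed since the read."""
        if self.version != read_version:
            return False      # conflict: the caller has to abort and retry
        self.balance = new_balance
        self.version += 1
        return True

account = OptimisticAccount(100)
b1, v1 = account.read()       # transaction 1 reads
b2, v2 = account.read()       # transaction 2 reads the same version
print(account.commit(b1 + 50, v1))  # True: the first committer wins
print(account.commit(b2 + 25, v2))  # False: stale read, must abort
```

After the failed commit, the second transaction would read again and retry, which is cheap when conflicts are rare and wasteful when they are common.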
Please read the second part of the concurrency control summary.