Data
Integrity
Integrity constraints over relations
A database is only as good as the information stored
in it
A DBMS must help prevent the entry of incorrect
information
An integrity constraint is a condition
specified on the schema that restricts the data that can be stored
Integrity constraints are specified when you create
tables in using SQLs Data Definition Language (DDL)
Types of Integrity constrains
Application based constraints
(semantic integrity constraints)
Schema-based Constraints
(can be specified in the DDL)
Types of general constraints
Domain Constraintmust be atomic values
Key
Constraint
Entity
Integrity Constraint
Referential
Integrity Constraint
Key Constraints
First we need to look at types of keys
Superkey
Any set of attributes in a relation that make a tuple
unique
Candidate key
Primary Key
Foreign Key
Formal definition of Candidate Key
Let R be a relation with attributes A1,
A2,...,An.
Let K (the candidate key) be a set of attributes of R;
K = {Ai,
Aj,...,Ak}
K is a
candidate key of R if and only if the following two properties are
satisfied:
Uniqueness - At any given time, no two tuples of R have
the same value for Ai,
Aj,..., and Ak.
Minimality - None of Ai,
Aj,...Ak
can be discarded from K without destroying the uniqueness property.
These two properties must be time independent.
Primary key
Choosing a
primary key from the candidate keys is arbitrary
Primary keys are the only tuple-level addressing
mechanism within the relation model.
Key integrity constraint states that
a tuple must have a Primary
Key that
is unique and minimal
Entity integrity constraint states that
the Primary Key must not be NULL
All DBMS support this constraint
These constraints are specified on individual
relations.
Foreign key
Specifies a
constraint between two relations
Formal definition of a
Foreign Key
Foreign Key is a set of attributes
FK = {A1, A2,...An}
FK is a foreign
key of R1 if it satisfies two rules:
1. FK has the same domain as the PRIMARY
KEY of R2
We
say FK refers to the relation R2
2. A value of FK must occur as a value of
the PRIMARY KEY
or the FK may be NULL.
Note about foreign keys:
The foreign key must refer to a primary key; they must
have the same domain.
The foreign key does not have to be a primary key.
The foreign key can be NULL.
A foreign key can refer to its own relation.
R1 and R2 need not be distinct.
You must specify two things for a foreign key:
1) the attribute name(s)
2) the relation to which it
refers.
Foreign to
primary key matches represent the glue that holds the database together.
Look at the foreign keys in company DB.
The Company Database foreign keys
dept_location.dnumber references department
department.mgrssn
references employee
works_on.essn references employee
works_on.pno references project
project.dnum references department
dependent.essn references employee
employee.dno
references department
employee.superssn
references employee
Summary of constraints on a Relational
Database
Domain
Constraint
The value of each attribute A must be an atomic value
from the domain for that attribute.
Key Constraint
A primary key must be unique and minimal.
Entity
Integrity Constraint
No primary key value can be NULL
Referential
Integrity Constraint
The value of a foreign key must match a value of the
primary key in the relation to which it refers.
Constraints
violations caused by data manipulation
Manipulation is divided into retrievals or updates
Since retrievals
dont change anything, they do not violate constraints
There are three basic update operations:
Insert a tuple or tuples
Can
violate any of the four types of constraints
Delete a tuple or tuples
Can
modify only the referential integrity constraint
Modify a tuple or tuples
Usually
the only problems occur when a primary or foreign key is modified.
Violations?
Tell which
constraint they violate, if any. Choices:
Key ConstraintA primary key must be unique and minimal.
Entity
Integrity Constraint No primary key
value can be NULL
Referential Integrity Constraint Foreign key violation
Insert <'Product A', 4, 'Belaire',
2> into project
Update the DNO of the employee tuple
with ssn = 999887777 to 7.
Update the ssn of the employee tuple with ssn = 999887777 to 987654321
Delete the works_on tuple with essn=999887777 and pno = 10.
Insert <Production', 4, '943775543'',
'01-oct-88'> into department
Delete the employee tuple with ssn = '987654321'.
Insert
<'453453453', 'John', 'M', '12-dec-60', 'spouse'> into dependent.
Delete the project tuple with pno = 10
What if updates violate the integrity
constraints?
Two choices:
1) Reject the operation that would result in an
illegal state.
2) Accept the operation,
cascade the results to keep the DB legal.