The
Relational Model: Operations
The
Relational Algebra
Operations to manipulate the data
The relations model not only describes the structure
and constraints for a database, but also a set of operations to manipulate the
data
This set of operations constitute the relational
algebra
They enable the user to retrieve data from the
database
The result of a retrieval operation is a new relation,
which has been formed from one or m
Relational algebra
Consists of operations that take one or two(or more)
relations as input and
produce a new relation as their result.
The fundamental operations are:
Select limits
the number of tuples
Project limits
the number of attributes
Union puts two
relations together
set difference subtracts
one relation from another
Cartesian product-- cross product of two relations
Rename renames a relation
Other, non-fundamental operations:
set intersection
natural join
division
assignment
Fundamental Operations
Select
selects tuples that satisfy a given predicate.
called a unary operator because it operates on one relation.
For select we use
the Greek letter sigma F
Form: F
condition (RELATION NAME)
The condition is a boolean expression
Examples of Select:
.F branch_name="Perryridge" (LOAN)
.F amount > 1200 (LOAN)
the Booleans AND/OR can be used to connect conditions
The selection condition is applied independently to
each tuple in the relation
If the condition
evaluated to TRUE, then the tuple is selected
the degree of the resulting relation is the same as the
original one (same # of attributes)
Project
Limits the
attributes to those specified
A unary operator
The Greek letter pi B is the
selection operator
Form: B attribute
list (RELATION NAME)
Example of Project:
B loan_number, amount (LOAN)
The degree is equal to the number of attributes named
in the attribute list
They will be
listed in the same order as in the list
If the attribute list contains only nonkey attributes, there may be duplicates
project removes any duplicates, so the result will still be a
relation
The number of tuples is always <= the number in the
original relation
Select and Project
If you think of a relation as a table,
the select operations selects some of the rows and
discards other;
the project operations keeps some of the columns
and throws the others away
This lets the user look at only the information they
are interested in.
Select and project work on only from one table
Combining Relational Operators
Query: Find
those customers who live in Harrison.
B c_name (F c_city =
"Harrison" (CUSTOMER))
You need to visualize the intermediate table.
If you project before you select, you have lost all
the attributes but the one named.
You could rename
the intermediate table
City_h ฿ (F c_city =
"Harrison" (CUSTOMER)
B c_name(City_h)
Set Theoretic Operations
These are the standard set operations
Union,
intersection and difference
A relation should be thought of as a set of
tuples
These operate on two sets, i.e. two relations, so are
used to merge the relations in different ways
All these
operations must be union-compatible
Union-compatible
Two requirements to union compatibility
the relations must be of the same degree (same # of
attributes)
the domains of corresponding attributes must be the same
Note that the relations in a union may be temporary,
the result of other operations
Duplicate tuples are eliminated
Union c
puts two compatible sets together
a binary operator
Form: (relation1) c (relation2)
Your text uses the convention that the resulting
relation has the same name as the first relation
Query: Find the names of all bank customers
who have either an account or loan or both.
We need information from two relations
Names of all customers with a loan: Bc_name (BORROWER)
Names of all customers with an account: B c_name
(DEPOSITOR)
The c of these two sets
B c_name (BORROWER) c Bc_name(DEPOSITOR)
Or, you could rename the relations then do the union
r1 := B c_name (BORROWER)
r2 := B c_name (DEPOSITOR)
r1 c r2
Set Difference
finds tuples that are in one relation but not in another
Binary operator
Form: relation1 difference relation2
The relations must be union compatible
(attributes same number &
type)
Query: Find all customers of the bank who
have an account but not a loan.
B c_name
(DEPOSITOR) difference B c_name (BORROWER)
Queries
1) Find the loans from Perryridge
where the amount of the loan > 1200
F branch_name="Perryridge AND amount >
1200 (LOAN)
2) Find those customers who live in Harrison.
B c_name (F c_city
= "Harrison" (CUSTOMER))
3) Are there any branches that have accounts but not
loans? Which one(s)? Loans but not account?
4) Find all
customers of the bank who have an either an account or a loan or both.
Answers to queries
3) B branch_name (ACCOUNT) difference
B branch_name
(LOAN)
4) r1 ฿ B c_name (BORROWER)
r2 ฿ B c_name (DEPOSITOR)
r1 c r2
Renaming
You can rename attributes as well as relations
Renaming relations
New_name ฿ (relations algebra expression)
Renaming attributes
temp := Fbalance > 500(Loan)
high(account, bal฿ Baccount-number, balance(temp)
Cartesian-Product X
Cross product of two relations (binary operator)
The relations do not have to be union-compatible
In the new relation
the number of attributes is the sum of # of
attributes in the two relations.
the number of tuples is the product of the
number of tuples in the two relations.
Each tuple
in the first is combined with every tuple in the second
Example:
borrower X loan
Attributes
of resultant relation
5 attributes
since the same attribute name appears in both they must be
distinguished
borrower.loan_number and loan.loan_number
Tuples of
resultant relation
56 tuples
a tuple is constructed out of each possible pairs of
tuples; one from each relation
Cartesian-Product X
Cartesian Product
uses
The X operation gives a lot of useless tuples, but it can
be used to put together relations that are not union compatible
Almost always select is then used to
eliminate the unwanted tuples
Since select is used almost universally with X, a special operation join was
created to specify this sequence as a single operation
Non Fundamental
Operators
The previous fundamental operators of the relational
algebra are sufficient to express any relations-algebra query.
But if we define a few more operations, some of the
queries become simpler.
So the following operators do not give any more power
to the algebra, just make life easier.
Join ฎ
forms a Cartesian product of two relations,
then performs a selection forcing =, < or >
on those attributes that appear in both relations, and then removes all
duplicates
Form: R1 ฎ condition R2
Example:
BORROWER ฎ B.loan-num=L.loan-num LOAN
Resultant
relation
5 attributes, two from Borrower, three from Loan
Only those tuples where the condition was true are
included
Equijoin
If the join
condition is equality (the most common), the join is called an equijoin
In an equijoin,
one or more pairs of attributes have identical values
A natural join is an equijoin
where one of the identical columns is eliminated.
Natural Join *
The standard
definition of a natural join requires the two join attributes have the same
name Borrower * Loan
If they dont have the same names one can be renamed
A nonstandard definition allows you to specify the
join attributes from each relation
The names do not
have to be the same
Employee * (DNO,
DNUMBER) Department
Query: Find the names and amount of all
customers who have loans at the bank.
.B c_name, amount
(BORROWER * LOAN)
Query: Find the names of all branches with
customers living in Harrison who have an account.
Find all customers living in Harrison
R1฿ F
c_city="Harrison" (CUSTOMER)
Find customers in Harrison with an account
R2 ฿ ( R1 * DEPOSITOR)
Now find which branch those accounts are in
.B branch_name (R2 * ACCOUNT ))
You can do all this in one statement, just make sure
you get the order correct
Set Intersection 1
Finds tuples that are in both relation1 and relation2
Form: relation11 relation2
The relations must be union compatible
Query: Find all customers who have both a
loan and an account.
.B c_name (BORROWER) 1 Bc_name(DEPOSITOR)
Query: Are there any customers who have neither
loans nor accounts? Give their names.
R1 ฿ B c_name (BORROWER) c
B
c_name (DEPOSITOR)
B c_name
(CUSTOMER) R1
Division
Suited to queries that include the phrase "for
all"
Query:Find all customers who have an account at all the branches
in Brooklyn
First, get two intermediate relations:
1)Find all branches in Brooklyn
BinB฿B branch_name
(Fbranch_city = "Brooklyn"
(BRANCH))
2) Find all customer's
branches
.CB ฿ B
c_name, branch_name
(DEPOSITOR*ACCOUNT)
3)
divide R2 ๗ R1
The division operation
It is applied to two relations ( R
and Sub) where one is a subset of the other (Sub a subset of R)
R(Z) is divided by Sub(X);
Here X,Y, and Z are sets of attributes
Let Y = Z X
So Z= X union Y
The result relation will have attribute(s) Y
For a tuple to appear in the result of the division
the values must appear in R in combination with every tuple in Sub
Find
all customers who have an account at all the branches in Brooklyn
1)Find all branches in Brooklyn
B branch_name (Fbranch_city = "Brooklyn"
(BRANCH))
2) Find all customer's
branches
.B c_name, branch_name (DEPOSITOR*ACCOUNT)
3)divide 2 by 1 thus:
.B c_name, branch_name (DEPOSITOR *ACCOUNT) divide B branch_name(Fbranch_city="Brooklyn"
(BRANCH))
Additional Relational
Operations
Relational algebra cannot perform some operations that
are needed by a DBMS.
Extra operations have been defined that extend the
power of the relational algebra
Aggregate functions
Grouping
Aggregate
Functions
A mathematical functions that takes a collection of
values and returns a single value as a result
Form function
attribute (RELATION)
Query: Find the sum of salaries of all the part
time employees;
sum salary (PT-WORKS)
Query: find
the average salary.
avg salary (PT-WORKS)
the result of both these queries will be a relation with
one attribute and one tuple
Naming the attributes
either rename, or
concatenation of the function name with the attribute name
common functions are:
sum, avg, min, max, count
if we want to eliminate duplicates the word distinct
can be added
Query: Find the number of branches appearing
in the pt-works relation
count distinct branch_name
(PT-WORKS)
Grouping
Functions
Another type of request involves.
1) grouping the tuples by an
attribute and then
2) applying an aggregate
function independently to each group of tuples.
Form:
<grouping attribute> group <function-attribute-pair list> Relation
Query: Find
the total salary of all part-time employees at each branch individually.
b_name group sum
salary (pt-works)
this means group by b_name and
sum salary by those groups
Query: For
each branch find the sum of the salaries and the maximum salary for part-time
employees.
Result (b_name, sum, max) ฿ ( b_name group sum salary, max salary (PT-WORKS)
Queries
What is the total amount of outstanding loans at each
branch?
Give the branch names of those branches that have
assets over 1,000,000
List the customer names who have accounts at Brighton
In what branches does Johnson have an account?
Working with the
Company database
Query: Retrieve the social
security numbers of all employees who either work in department 5 or directly
supervise an employee who works in department 5
.dept5_emps <-- select dno=5
(employee)
result1 <--
project ssn (dept_emps)
result2 (ssn) <-- project
superssn(dep5_emps)
Result <--
result1 union result2