The Relational Model: Operations

The Relational Algebra

 

Operations to manipulate the data

•      The relations model not only describes the structure and constraints for a database, but also a set of operations to manipulate the data

•      This set of operations constitute the relational algebra

•      They enable the user to retrieve data from the database

•      The result of a retrieval operation is a new relation, which has been formed from one or m

 

Relational algebra

•      Consists of operations that take one or two(or more)  relations as input and  produce a new relation as their result.

The fundamental operations are:

•      Select – limits the number of tuples

•      Project –limits the number of attributes

•      Union – puts two relations together

•      set difference – subtracts one relation from another

•      Cartesian product-- cross product of two relations

•      Rename – renames a relation

 

Other, non-fundamental operations:

•      set intersection

•      natural join

•      division

•      assignment

 

Fundamental Operations   

Select

•      selects tuples that satisfy a given predicate.

•      called a unary operator because it operates on one relation.

•      For select we use the Greek letter sigma F

•      Form: F condition (RELATION NAME)

•      The condition is a boolean expression

 

Examples of Select:

•        .F branch_name="Perryridge" (LOAN)

•      .F amount > 1200 (LOAN)

•      the Booleans AND/OR can be used to connect conditions

•      The selection condition is applied independently to each tuple in the relation

–    If the condition evaluated to TRUE, then the tuple is selected

•       the degree of the resulting relation is the same as the original one (same # of attributes)

Project

•      Limits   the attributes to those specified

•      A unary operator

•      The Greek letter pi B is the selection operator

•      Form: B attribute list (RELATION NAME)

 

Example of Project:

B loan_number, amount (LOAN)

•      The degree is equal to the number of attributes named in the attribute list

–    They will be listed in the same order as in the list

•      If the attribute list contains only nonkey attributes, there may be duplicates

–    project removes any duplicates, so the result will still be a relation

•      The number of tuples is always <= the number in the original relation

 

Select and Project

–   If you think of a relation as a table,

–   the select operations selects some of the rows and discards other;

–   the project operations keeps some of the columns and throws the others away

–   This lets the user look at only the information they are interested in. 

–   Select and project work on only from one table

 

Combining Relational Operators

•      Query: Find those customers who live in Harrison.

B c_name (F c_city = "Harrison" (CUSTOMER))

•      You need to visualize the intermediate table. 

•      If you project before you select, you have lost all the attributes but the one named.

•      You could rename the intermediate table
City_h
฿ (F c_city = "Harrison" (CUSTOMER)

B c_name(City_h)

 

Set Theoretic Operations

•      These are the standard set operations

–   Union, intersection and difference

•      A relation should be thought of as a set of tuples

•      These operate on two sets, i.e. two relations, so are used to merge the relations in different ways

•      All these operations must be union-compatible

 

Union-compatible

•      Two requirements to union compatibility

–   the relations must be of the same degree (same # of attributes)

–   the domains of corresponding attributes must be the same

•      Note that the relations in a union may be temporary, the result of other operations

•      Duplicate tuples are eliminated

Union   c

•      puts two compatible sets together

•      a binary operator

•      Form: (relation1) c (relation2)

•      Your text uses the convention that the resulting relation has the same name as the first relation

 

Query: Find the names of all bank customers who have either an account or loan or both.

•      We need information from two relations

–  Names of all customers with a loan:   Bc_name (BORROWER)

–  Names of all customers with an account: B c_name (DEPOSITOR)

The c of these two sets

  B c_name (BORROWER) c Bc_name(DEPOSITOR)

•      Or, you could rename the relations then do the union

r1 := B c_name (BORROWER)

r2 := B c_name (DEPOSITOR)

r1 c r2

 

Set Difference

•      finds tuples that are in one relation but not in another

•      Binary operator

•      Form: relation1 difference relation2

•      The relations must be union compatible

–   (attributes same number & type)

 

Query: Find all customers of the bank who have an account but not a loan.

B c_name (DEPOSITOR) difference B c_name (BORROWER)

Queries

•      1) Find the loans from Perryridge where the amount of the loan > 1200

       F branch_name="Perryridge“ AND amount > 1200 (LOAN)

•      2) Find those customers who live in Harrison.

       B c_name (F c_city = "Harrison" (CUSTOMER))

•      3) Are there any branches that have accounts but not loans?  Which one(s)?  Loans but not account?

•       4) Find all customers of the bank who have an either  an account or a loan or both.

 

Answers to queries

3)  B branch_name (ACCOUNT) difference
                               
B branch_name (LOAN)

4) r1 ฿ B c_name (BORROWER)

r2 ฿ B c_name (DEPOSITOR)

r1 c r2

 

Renaming

•      You can rename attributes as well as relations

•      Renaming relations

–   New_name ฿ (relations algebra expression)

•      Renaming attributes

•      temp := Fbalance > 500(Loan)

•      high(account, bal฿ Baccount-number, balance(temp)

 

Cartesian-Product  X

•      Cross product of two relations (binary operator)

•      The relations do not have to be union-compatible

•      In the new relation

–    the number of attributes is the sum of # of attributes in the two relations.

–    the number of tuples is the product of the number of tuples in the two relations.

•      Each tuple in the first is combined with every tuple in the second

 

Example:  borrower X loan

•      Attributes of resultant relation

–   5 attributes

–   since the same attribute name appears in both they must be distinguished

–   borrower.loan_number and loan.loan_number

•      Tuples of resultant relation

–   56 tuples

–   a tuple is constructed out of each possible pairs of tuples; one from each relation

Cartesian-Product  X

 

 

Cartesian Product uses

•      The X operation gives a lot of useless tuples, but it can be used to put together relations that are not union compatible

•      Almost always select is then used to eliminate the unwanted tuples

•      Since select is used almost universally with X, a special operation join was created to specify this sequence as a single operation

 

Non Fundamental Operators

•      The previous fundamental operators of the relational algebra are sufficient to express any relations-algebra query. 

•      But if we define a few more operations, some of the queries become simpler. 

•      So the following operators do not give any more power to the algebra, just make life easier.

 

 Join 

•      forms a Cartesian product of two relations,

–  then performs a selection forcing =, < or > on those attributes that appear in both relations, and then removes all duplicates

•      Form:  R1 condition  R2

•       Example:
     BORROWER
B.loan-num=L.loan-num LOAN

 

Resultant relation

•      5 attributes, two from Borrower, three from Loan

•      Only those tuples where the condition was true are included

 

Equijoin

•      If the join condition is equality (the most common), the join is called an equijoin

•      In an equijoin, one or more pairs of attributes have identical values

•      A natural join is an equijoin where one of the identical columns is eliminated.

 

Natural Join  *

•      The standard definition of a natural join requires the two join attributes have the same name   Borrower * Loan

–   If they don’t have the same names one can be renamed

•      A nonstandard definition allows you to specify the join attributes from each relation

–   The names do not have to be the same
Employee * (DNO, DNUMBER) Department

 

 

Query: Find the names and amount of all customers who have loans at the bank.

•     .B c_name, amount (BORROWER * LOAN)

 

Query: Find the names of all branches with customers living in Harrison who have an account.

•     Find all customers living in Harrison
R1
฿ F c_city="Harrison" (CUSTOMER)

•      Find customers in Harrison with an account

•      R2 ฿ ( R1 * DEPOSITOR)

•      Now find which branch those accounts are in

•     .B branch_name (R2 * ACCOUNT ))

•     You can do all this in one statement, just make sure you get the order correct

Set Intersection   1

•      Finds tuples that are in both relation1 and relation2

•      Form: relation11 relation2

•      The relations must be union compatible

 

Query: Find all customers who have both a loan and an account.

•      .B c_name  (BORROWER) 1 Bc_name(DEPOSITOR)

 

•      Query:  Are there any customers who have neither loans nor accounts?  Give their names.

R1 ฿ B c_name (BORROWER) c B c_name (DEPOSITOR)

B c_name (CUSTOMER) – R1

 

 

Division

•      Suited to queries that include the phrase "for all"

•      Query:Find all customers who have an account at all the branches in Brooklyn

•      First, get two intermediate relations:

1)Find all branches in Brooklyn

    BinB฿B branch_name (Fbranch_city = "Brooklyn" (BRANCH))

2) Find all customer's branches

•      .CB ฿ B c_name, branch_name (DEPOSITOR*ACCOUNT)

•      3) divide     R2  ๗ R1

 

The division operation

•      It is applied to two relations ( R and Sub) where one is a subset of the other (Sub a subset of R)

•      R(Z) is divided by Sub(X);

–    Here X,Y, and Z are sets of attributes

•      Let Y = Z – X

–    So Z= X union Y

•      The result relation will have attribute(s) Y

•      For a tuple to appear in the result of the division the values must appear in R in combination with every tuple in Sub

 

Find all customers who have an account at all the branches in Brooklyn

1)Find all branches in Brooklyn

    B branch_name (Fbranch_city = "Brooklyn" (BRANCH))

2) Find all customer's branches

•      .B c_name, branch_name (DEPOSITOR*ACCOUNT)                  

•      3)divide 2 by 1 thus:

•      .B c_name, branch_name (DEPOSITOR *ACCOUNT) divide B branch_name(Fbranch_city="Brooklyn" (BRANCH))

Additional Relational Operations

•      Relational algebra cannot perform some operations that are needed by a DBMS.

•      Extra operations have been defined that extend the power of the relational algebra

–   Aggregate functions

–   Grouping

 

 

Aggregate Functions

•      A mathematical functions that takes a collection of values and returns a single value as a result

•      Form function attribute (RELATION)

•      Query:  Find the sum of salaries of all the part time employees;

            sum salary (PT-WORKS)

•      Query:   find the average salary.

            avg salary (PT-WORKS)

•      the result of both these queries will be a relation with one attribute and one tuple

•      Naming the attributes

–   either rename, or

–   concatenation of the function name with the attribute name

•      common functions are:  sum, avg, min, max, count            

•      if we want to eliminate duplicates the word distinct can be added

Query: Find the number of branches appearing in the pt-works relation

•      count distinct branch_name (PT-WORKS)

 

Grouping Functions

•      Another type of request involves.

–   1) grouping the tuples by an attribute and then

–   2) applying an aggregate function independently to each group of tuples.

•      Form:
<grouping attribute> group <function-attribute-pair list>  Relation

 

•      Query: Find the total salary of all part-time employees at each branch individually.

        b_name group sum salary (pt-works)

•      this means group by b_name and sum salary by those groups

•      Query: For each branch find the sum of the salaries and the maximum salary for part-time employees.

•      Result (b_name, sum, max) ฿    ( b_name group sum salary, max salary (PT-WORKS)

 

Queries

•      What is the total amount of outstanding loans at each branch?

•      Give the branch names of those branches that have assets over 1,000,000

•      List the customer names who have accounts at Brighton

•      In what branches does Johnson have an account?

•             

Working with the Company database

Query:  Retrieve the social security numbers of all employees who either work in department 5 or directly supervise an employee who works in department 5

•      .dept5_emps <-- select dno=5 (employee)

•      result1   <-- project ssn (dept_emps)

•      result2 (ssn) <-- project superssn(dep5_emps)

•      Result   <-- result1 union result2