Data
Manipulation: Retrieval Operations
The DML
component of SQL
Retrieval: three
clauses
• select, from and where.
– Select - use
to list the attributes desired.
• Corresponds
to the π in relational
algebra
– From - list
the relations to be joined.
– Where - the
condition to specify which tuples.
• Corresponds
to sigma in relational algebra
• Select A1,
A2,...An (any
number of attributes)
From R1, R2,...Rm (any number of relations)
Where predicate (can be compound)
The Select
clause
• Query: Find
the names of all branches in the loan relation.
select branch_name
from loan;
• Notice that you are selecting column
• What is the result of this query?
Duplicate tuples
now appear.
• Duplicates are not allowed in sets (i.e. relations)
• In practice, removing duplicate tuples is time
consuming
• So SQL allows duplicates.
• To force the elimination of duplicate tuples use distinct
select distinct branch_name
from loan;
• The keyword all is used to specify not to
remove duplicate tuples;
–
it is the default, so seldom used.
Other options with Select
• The symbol * is used to denote "all the
attributes” .
select
*
from loan;
• The select clause can contain arithmetic expressions.
select
branch_name, loan_num, amount*100
from loan;
The WHERE clause
• Query: Find
all loan numbers for loans made at the Perryridge
branch with loan amounts greater than $1200.
select
loan_num
from loan
where branch_name=‘Perryridge’
and amount > 1200;
• The between and not between operators:
select loan_num
from loan
where amount between
90000 and10000;
The From clause
• The FROM clause by itself defines a Cartesian product if more than one
relation is specified;
–
you must use a
where clause to make it a join.
–
Query: For all customers who have a loan from the bank, find
their names and loan numbers.
select c_name, borrower.loan_num
from borrower, loan
where borrower.loan_num=loan.loan_num; (this is the join condition)
• Query: Find
the names and loan numbers of all customers who have a loan at the Perryridge branch.
• select c_name, borrower.loan_num
from borrower, loan
where borrower.loan_num=loan.loan_num
and branch_name=‘Perryridge’;
The Rename operation
• You can rename either the tables or columns in a
statement using as.
• Renaming a column
select c_name, borrower.loan_num
as loan_id
from borrower, loan
where borrower.loan_num=
loan.loan_num
and branch_name="Perryridge;
Renaming a table
select c_name,
B.loan_num
from borrower as B, loan as
L
where B.loan_num=L.loan_num;
• In Oracle, the as is
optional.
String Operations
• The most commonly used operation on strings is pattern
matching.
• Two symbols describe patterns:
– 1) Percent (%): The % character matches any substring
of zero or more characters.
– 2) Underscore (_): The _ character matches any single
character.
• Patterns are case sensitive
Examples using strings:
• "Perry%" matches any string beginning with
"Perry".
• "%idge%" matches
any string containing "idge" as a
substring.
• "_ _ _" matches any string of exactly three
characters.
• "_ _ _%" matches any string of at least
three characters.
• The like operator is the comparison operator.
select c_name
from customer
where c_street
like "%Main%";
Other comparison operators
– ilike – performs
case-insensitive pattern matching
– You can also use like and ilike
with the not operator
Ordering the Display of Tuples
• order by - used
to control the order in which the tuples are displayed.
• select distinct c_name
from borrower as B, loan as
L
where B.loan_num=L.loan_num
and branch_name
="Perryridge”
order by c_name;
• The default is ascending order by default; but may
specify using desc or asc.
• order by amount desc, loan_num asc;
• order by requires
a sort, so may be costly timewise.
– distinct also requires a sort
Set Operations
• The operations union, intersect, and except
operate about the same as their algebra counterparts. They are all union compatible.
• Each automatically eliminates duplicate tuples. If we want to retain all duplicates write union
all (or intersect all, etc)
Set Operations: union
• Find all the customers having a loan, an account or
both at the bank.
(select
c_name from depositor)
union
(select c_name from borrower);
Set Operations:intersect
• Find all customers who have both a loan and an account
at the bank.
(select
distinct c_name from depositor)
intersect
(select distinct c_name from borrower);
• except query: Find
all customers who have an account but no loan at the bank
(select distinct c_name from depositor)
except
(select distinct c_name from borrower);
Aggregate Functions
• These are functions that take a collection of values
as input and return a single value.
• SQL has five built-in aggregate functions: avg, min, max, sum, count.
–
The functions are
used in select (or having, as we will see later)
• Query: Find
the average account balance at the Perryridge branch.
• select avg(balance)
from account
where branch_name=‘Perryridge’;
Group by:
• Find the average account balance at each branch.
• select branch_name, avg(balance)
from account
group by branch_name;
• Duplicates are not eliminated here. If we want to eliminate them, use distinct.
Another Example
• Query: Find
the number of depositors for each branch.
select branch_name, count(distinct
c_name)
from depositor as D, account as A
where D.account_number=A.account_number
group by branch_name;
• Count can be used four ways:
– count(*)
– count(expression)
– count(all expression)
– count(distinct expression)
Group by using Having
• having is used to impose a predicate which is applied after
the formation of groups.
• Query: List
only those branches where the average account balance is more than $1200.
• select branch_name, avg(balance)
from account
group by branch_name
having avg(balance) > 1200;
Query:Find
the average balance for each customer who lives in Harrison and has at least three accounts.
• select D.c_name, avg(balance)
from depositor as D, account as A, customer as C
where D.account_number = A.account_number
• and D.c_name = C.c_name
and city = "Harrison”
group by D.c_name
having count (distinct D.account_number >=
3;)