Data Manipulation: Retrieval Operations

The DML component of SQL

Retrieval: three clauses

      select, from and where.

   Select - use to list the attributes desired.

    Corresponds to the π in relational algebra

   From - list the relations to be joined.

   Where - the condition to specify which tuples.

    Corresponds to sigma in relational algebra

      Select A1, A2,...An  (any number of attributes)
From R1, R2,...Rm  (any number of relations)
Where predicate (can be compound)

 

The Select clause

      Query: Find the names of all branches in the loan relation.

select branch_name
from loan;

      Notice that you are selecting column

      What is the result of this query?

           

Duplicate tuples now appear.

      Duplicates are not allowed in sets (i.e. relations)

      In practice, removing duplicate tuples is time consuming

      So SQL allows duplicates.

      To force the elimination of duplicate tuples use distinct
        select distinct
branch_name
        from loan;

      The keyword all is used to specify not to remove duplicate tuples;

    it is the default, so seldom used.

 

Other options with Select

      The symbol * is used to denote "all the attributes” .
        select *
        from loan;

      The select clause can contain arithmetic expressions.
        select branch_name, loan_num, amount*100
        from loan;

 

The WHERE clause

      Query: Find all loan numbers for loans made at the Perryridge branch with loan amounts greater than $1200.

        select loan_num
        from loan
        where branch_name=‘Perryridge
           and amount > 1200;

      The between and not between operators:
        select loan_num
        from loan
        where amount between 90000 and10000;

The From clause

      The FROM clause by itself defines a Cartesian product if more than one relation is specified;

   you must use a where clause to make it a join.

   Query: For all customers who have a loan from the bank, find their names and loan numbers.
  select c_name, borrower.loan_num
  from borrower, loan
  where borrower.loan_num=loan.loan_num; (this is the join condition)

      Query: Find the names and loan numbers of all customers who have a loan at the Perryridge branch.

      select c_name, borrower.loan_num
from borrower, loan
where borrower.loan_num=loan.loan_num
        and branch_name=‘Perryridge’;

 

The Rename operation

      You can rename either the tables or columns in a statement using as.

      Renaming a column
    select c_name,   borrower.loan_num as loan_id
    from borrower, loan
    where borrower.loan_num= loan.loan_num
        and branch_name="Perryridge;
Renaming a table
  select c_name, B.loan_num
  from borrower as B, loan as L
   where B.loan_num=L.loan_num;

      In Oracle, the as is optional.

 

String Operations

      The most commonly used operation on strings is pattern matching.

      Two symbols describe patterns:

   1) Percent (%): The % character matches any substring of zero or more characters.

   2) Underscore (_): The _ character matches any single character.

      Patterns are case sensitive

 

Examples using strings:

      "Perry%" matches any string beginning with "Perry".

      "%idge%" matches any string containing "idge" as a substring.

      "_ _ _" matches any string of exactly three characters.

      "_ _ _%" matches any string of at least three characters.

      The like  operator is the comparison operator.
  select c_name
  from customer
   where c_street like "%Main%";
Other comparison operators

   ilikeperforms case-insensitive pattern matching

   You can also use like and ilike with the not operator

 

Ordering the Display of Tuples

      order by - used to control the order in which the tuples are displayed.

      select distinct c_name
 from borrower as B, loan as L
  where B.loan_num=L.loan_num
     and
branch_name ="Perryridge
     order by c_name;          

      The default is ascending order by default; but may specify using desc or asc.

      order by amount desc, loan_num asc;

      order by requires a sort, so may be costly timewise.

   distinct also requires a sort

 

Set Operations

      The operations union, intersect, and except operate about the same as their algebra counterparts.  They are all union compatible.

      Each automatically eliminates duplicate tuples.  If we want to retain all duplicates write union all (or intersect all, etc)

 

Set Operations: union

      Find all the customers having a loan, an account or both at the bank.
 (select c_name from depositor)
 union
(select c_name from borrower);

 

Set Operations:intersect

      Find all customers who have both a loan and an account at the bank.
        (select distinct c_name from depositor)
        intersect
        (select distinct c_name from borrower);

      except query: Find all customers who have an account but no loan at the bank
        (select distinct c_name from depositor)
        except
        (select distinct c_name from borrower);

 

Aggregate Functions

      These are functions that take a collection of values as input and return a single value.

      SQL has five built-in aggregate functions: avg, min, max, sum, count.

    The functions are used in select (or having, as we will see later)

      Query: Find the average account balance at the Perryridge branch.

      select avg(balance)
from account
where branch_name=‘Perryridge’;

 

Group by:

      Find the average account balance at each branch.

      select branch_name, avg(balance)
from account
group by  branch_name;

      Duplicates are not eliminated here.  If we want to eliminate them, use distinct.

 

Another Example

      Query: Find the number of depositors for each branch.
select branch_name, count(distinct c_name)
from depositor as D, account as A
where D.account_number=A.account_number
  group by branch_name;

      Count can be used four ways:

    count(*)

    count(expression)

    count(all expression)

    count(distinct expression)

 

Group by using Having

      having is used to impose a predicate which is applied after the formation of groups.

      Query: List only those branches where the average account balance is more than $1200.

      select branch_name, avg(balance)
from account
group by branch_name
having avg(balance) > 1200;

 

Query:Find the average balance for each customer who lives in Harrison and has at least three accounts.

      select D.c_name, avg(balance)
from depositor as D, account as A, customer as C
where D.account_number = A.account_number

               and D.c_name = C.c_name
        and city = "
Harrison
group by D.c_name
having count (distinct D.account_number >= 3;)