The Relational Model

Chapter 5

 

The relational data model

•      The relational mode was first introduced by Ted Codd in 1970

•      It was immediately attractive because of its simplicity and mathematical foundations

•      The model uses the concept of a mathematical relation as its basic building block

•      It has its theoretical basis in set theory and predicate logic

–    Operations in set theory:
    
Union, intersection, Cartesian product

•      When it was first introduced, the overhead made implementing it VERY slow.

 

Relational Model:  The database is a collection of relations

•      Each relation resembles a table of values

•      But there are important difference between relations and tables stored as“flat files”

•      One of the differences is the terminology

–    New terms were introduced with the relational model because the old terms were not precise

–    With a table the rows are often called records, the columns may be called fields

–    In relational terminology, the meaning is much more precise

 

Relational Terminology

•      A relational schema is made up of a relation name and a list of attributes

•      Attribute is the name of a role played by some domain in the relation

•      Domain

–    A set of atomic values (indivisible values)

–    A pool of values from which the actual attribute must be drawn

•    May be finite or infinite

–    A data type and/or format is also specified for each domain

•      tuple – an ordered list of values corresponding to the attributes in the relational schema

•      cardinality - number of items in a set

–   We speak of the cardinality of a domain

–   The cardinality of a relation is the number of tuples present at any instance of time

•      The degree of a relation  is the number of attributes

•      primary key - unique identifier of a tuple

 

Formal definitions

•      R denotes the relational schema

•      R is made up of a list of attributes, denoted
R(A1, A2, …An).

•      Each attribute Ai is the name of a role played by some domain D in the schema R

•      D the domain of Ai and is denoted by dom(Ai)

•      The degree of a relation is the number of attributes n of R

•      Example:
STUDENT(Name, StudentNumber, Class, Major)

 

A further look at Domains

•      A domain is basically a data type

•      A domain is often more restrictive that the parent data type

–    It is the set of all possible values of the type

–    Can be finite or infinite

•      Must be atomic values

–    No composite attributes (use the simple components)

–    Multivalued attributes must be represented as separate relations

•      Each attribute must be "defined on" exactly one underlying domain

•      At any given time, there may be attributes in the domain that are not currently being used

•      Most DBMS do not support domains: i.e. let you define the possible values, and check to make sure an attribute value is in the domain.

 

The Cartesian product

•      This is all possible combinations of values from the domains

•      If all the domains are finite, the total number of tuples in the Cartesian product for a relation is the product of the cardinality of each domain.

•      Example: 

–    If domain A is the letters of the alphabet;

–    Domain B is the integers 0-99

–    Domain C is the grades a student might receive in a course

–    What would be the cardinality of the total number of possible tuples?

 

Characteristics of relations

•      A relation is defined as a set of tuples

–    Mathematically, elements of a set have no order among them

–    So theoretically, there is no ordering of tuples

–    When records are stored in a file, there IS ordering of the records

•    This just is not part of the relational definition

•      Ordering values within a tuple

–    We defined tuples as an ordered list of values.

–    All the tuples must be ordered in the same way, but which order is unimportant

•      Null is used to represent values that are unknown or do not apply

 

Relational Model Notation

•      Overhead or text 132.

 

The relational model is divided into three parts:

•      Data structure

–     The data in the database is perceived by the user as tables, and nothing but tables

•      Data integrity

–    The tables satisfy certain integrity constraints

•      Data manipulations or operators

–    The operators derive tables from tables

•      The actual implementation of any DBMS is a subset of the relational model.

 

Relational structure summary

•      A relation is perceived by the user as a 2-D structure composed of tuples and attributes.

•      Each table must have a primary key that uniquely identifies each tuple.

•      Each tuple represents a single entity within the entity set.

•      Each column represents an attribute of the entity, and each has a distinct name.

–    All values of an attribute must come from the same domain

•      Some relations represent entities, others relationships

–    Which table in the company DB represent entitites, which relationships?