Database Concepts and Architecture

 

Data models

•      The purpose of a model is to provide a level of abstraction (i.e. hide details)

•      The model is a set of concepts that can be used to describe the structure of a database.  The structure includes:

•    data types and length

•    relationships

•    Constraints

•      A data model often includes:

–    A set of operations for retrievals & updates

–    A set of valid user-defined operations that are allowed

•      Different models have different objectives.

 

Data model categories

–   A High level or conceptual model is close to the way users perceive the data.

•   This type of model is often used to transmit ideas from end user to the DB designer.

•   The E-R diagram is of this type

–   A representational or implementation model hide some details of how the data is stored, but can be implemented on a computer

•   The relational model is of this type

–   A low level or physical model is the model of how the data is physically stored on the computer.

•   This would include stored in a hash table or B+ tree

•      The main data model we will look at is the relational model

–    the data is perceived by the user as tables

–    the operations on the tables generates new tables from base tables

•      But we will use the E-R (Entity-Relationship) model to help with design

•      We will also look at indexes and B+ trees, low level models

•      The Object-oriented model is becoming more and more important, as is the object oriented/relational model

•      The Hierarchical and Network model are older models (covered in the appendix of your text)

 

Database Schemas and Instance

•      Understanding database systems requires making a clear distinction between the database’s structure (the schema) and its contents (the state or instances.)

•      Schema - the description of the structure database

–    the meta data (specified during db design)

–    does not change often

•      Instance - the actual data

–    changes all the time

–    the db at an instant in time

 

The DBMS architecture

•      The architecture helps achieve the important characteristics of a DBMS

•      Three of these important characteristics are:

–   Data independence

–   Multiple user views

–   A catalog to store the metadata that is available to both the user and the DBMS

•      The goal of the architecture used is to separate the user applications and the physical database

 

Three-schema Architecture

•      Reminder:  A schema is a description of the structure of the database

•      Internal level has an internal schema

•      Conceptual level has a conceptual schema

•      External level usually has several external schemas

•      Note that each of these schemas are only descriptions of data;  the data really only exists at the physical level

 

External Level

–   The end user is presented with special views that are tailored to their specific needs

–    Some of these views may be forms to fill out, others for interactive retrieval of information, etc.

–   Many different views may be available  that are completely different that the way the base tables are arranged

Conceptual or logical level

–   Description of the structure of the entire DB

–   Hides the details of physical storage

–   Concentrates on describing entities, data types, relationships, user operations, and constraints.

–   In the relational model, this is the base tables

–   Changes at this level can be made without having any effect on the external level

 

Internal Level

•      This is a description of the physical storage structure of the database

•      Operations performed here are translated into modifications of the contents and structure of the files

 

Advantages of the three tiered architecture

•      This division into levels allows both developers and users to work on their own levels

–    They do not need to know the details of the other levels

–    AND they do not have to know anything about changes in the other levels

•      Most DBMS do not separate the three levels completely.

 

Mappings

•      Note this:

–   The only data that is saved in the database is at the internal level

•   Both the data and metadata are saved by the DBMS at the physical level

–   Each user group refers only to its own external schema

–   so the DBMS must transform a request on an external schema

•      into a request against the conceptual schema,

•      then into a request on the internal schema for processing.

•      What if the request is for a retrieval?

–   A lot of overhead!

•      The process of transforming requests and results between levels are called mappings.

•      These mappings may be time consuming, so some DBMS do not separate the three levels completely.

 

Data independence                                  

•      One reason for the three level architecture is to give data independence.

•      One way to define data independence: the capacity to change the schema at one level of a DB system without having to change the schema at the next higher level

•      If the schema at one level is changed,

–   and the mapping to the next higher level are changed,

–   the schema at the next level can remain unchanged.

 

Two types of data independence

•      Logical data independence

–    Possible to change the conceptual schema without changing external schema or application programs

–    Columns could be added to tables without changing the existing external schema or application programs

–    Changes here necessitate changes to the physical level

•      Physical data independence

–    Possible to change internal schema without changing the conceptual schema

•      Example of physical data independence:

–   DBA decides it is more efficient to store the data in a B+ tree instead of a heap, so changes the internal schema

–   then he changes the mapping to the conceptual schema

–   conceptual schema remains unchanged.

 

Database languages

•       Data Definition Language (DDL) is used to specify the internal & conceptual schemas.

–   The DBMS will identify descriptions of the schema and store the schema descriptions in the DBMS catalog.

–   After the structure is defined and made machine readable, data can be added to the database.

•      Data Manipulation Language (DML) is used to manipulate the DB

•      Manipulation includes:

–   retrieval (querying the DB)

–   inserting records

–   deleting records

–   changing data already entered

 

SQL is both DDL and DML for the different levels

•      It is based  on Sequel ( and often called Sequel)

•      SQL is a comprehensive integrated language that includes constructs for:

–    conceptual schema definition,

–    view definition

–    data manipulation

•      SQL can either be embedded in a host language or stand alone, called a query language.