Database Concepts and Architecture

 

Data models

•      The purpose of a model is to provide a level of abstraction (i.e. hide details)

•      The model is a set of concepts that can be used to describe the structure of a database.  The structure includes:

•    data types and length

•    relationships

•    Constraints

•      A data model often includes:

–    A set of operations for retrievals & updates

–    A set of valid user-defined operations that are allowed

•      Different models have different objectives.

 

Data model categories

–   A High level or conceptual model is close to the way users perceive the data.

•   This type of model is often used to transmit ideas from end user to the DB designer.

•   The E-R diagram is of this type

–   A representational or implementation model hide some details of how the data is stored, but can be implemented on a computer

•   The relational model is of this type

–   A low level or physical model is the model of how the data is physically stored on the computer.

•   This would include stored in a hash table or B+ tree

•      The main data model we will look at is the relational model

–    the data is perceived by the user as tables

–    the operations on the tables generates new tables from base tables

•      But we will use the E-R (Entity-Relationship) model to help with design

•      We will also look at indexes and B+ trees, low level models

•      The Object-oriented model is becoming more and more important, as is the object oriented/relational model

•      The Hierarchical and Network model are older models (covered in the appendix of your text)

 

Database Schemas and Instance

•      Understanding database systems requires making a clear distinction between the database’s structure (the schema) and its contents (the state or instances.)

•      Schema - the description of the database structure

–   Called the meta data (specified during db design)

–   Does not change often

•      Instance - the actual data

–   Changes all the time

–   The instance is the data at an instant in time

 

The DBMS architecture

•      The architecture helps achieve the important characteristics of a DBMS

•      Three of these important characteristics are:

–    Data independence

–    Multiple user views

–    A catalog to store the metadata that is available to both the user and the DBMS

•      The goal of the architecture used (three schemas) is to separate the user applications and the physical database

Three-schema Architecture

•      Reminder:  A schema is a description of the structure of the database

•      Internal schema –describes the physical storage structure of the db

•      Conceptual schema – describes the structure of the entire db for all users

•      External schemas – multiple descriptions of the db, depending on what the particular user needs

•      Notice that each of these schemas are only descriptions of data;  the data really only exists at the physical level

•      Note: most DBMS do not separate the three levels completely

 

External Level

•      The end user is presented with special views that are tailored to their specific needs

•       Some of these views may be forms to fill out, others for interactive retrieval of information, etc.

•      Many different views may be available  that are completely different that the way the base tables are arranged

•      Note in your book the difference interfaces often provided by DBMS

–    Web menus, forms, GUI, Natural language, interfaces for parametric users and the DBA

 

Conceptual or logical level

•      Description of the structure of the entire DB

•      Hides the details of physical storage

•      Concentrates on describing entities, data types, relationships, user operations, and constraints.

•      In the relational model, this is the base tables

•      Changes at this level can be made without having any effect on the external level

 

Internal Level

•      This is a description of the physical storage structure of the database

–    This includes data structures like hash tables, indexes, B+ trees.

•      Operations performed here are translated into modifications of the contents and structure of the files

Advantages of the three tiered architecture

•      This division into levels allows both developers and users to work on their own levels

–    They do not need to know the details of the other levels

–    AND they do not have to know anything about changes in the other levels

 

Mappings

•      Note this:

–   The only data that is saved in the database is at the internal level

•   Both the data and metadata are saved by the DBMS at the physical level

–   Each user group refers only to its own external schema

–   so the DBMS must transform a request on an external schema

•      into a request against the conceptual schema,

•      then into a request on the internal schema for processing.

•      What if the request is for a retrieval?

–   A lot of overhead!

•      The process of transforming requests and results between levels are called mappings.

•      These mappings may be time consuming, so some DBMS do not separate the three levels completely.

 

Data independence                                  

•      One reason for the three level architecture is to give data independence.

•      One way to define data independence: the capacity to change the schema at one level of a DB system without having to change the schema at the next higher level

•      If the schema at one level is changed,

–   and the mapping to the next higher level are changed,

–   the schema at the next level can remain unchanged.

 

Two types of data independence

•      Logical data independence

–    Possible to change the conceptual schema without changing external schema or application programs

–    Columns could be added to tables without changing the existing external schema or application programs

–    Changes here necessitate changes to the physical level

•      Physical data independence

–    Possible to change internal schema without changing the conceptual schema

–    The mappings from physical to conceptual must be changed, of course

 

Data independence (cont.)

•      Example of physical data independence:

–   DBA decides it is more efficient to store the data in a B+ tree instead of a hash table, so changes the internal schema

–   then he changes the mapping to the conceptual schema

–   conceptual schema remains unchanged.

 

Client/Server Architectures

•      First, look at client/servers in general, not necessarily DBMS.

•      In its simplest form, the server is on a (usually powerful) computer that has information to share, and sits and waits for a client to ask it for information

•      The client interacts with the user, finds out what they want, and then goes to the server and asks for the information, the takes it back to the client.

 

Two-tier client/server architecture for DBMS

•      May be implemented several ways

–   Client: user interface & application programs;
Server: the DBMS

–   A connection must be made between the client and server;

•   The standard ODBC provides an API (application programming interface) which allows clients to call the DBMS

•   Vendors provide ODBC drivers to interact with their DBMS

•   Another standard JDBC is used for the Java

•      The tendency has been to shift more and more of the functionality to the client

 

Three-tier client/server  architecture for DBMS

•      Many web applications use a three tier architecture.

•      An server is added between the two tiers we talked about above

•      The middle tier is sometimes called the web server, or the application server

–   One of the main functions of the middle tier is to provide additional security

–   It can also do some processing of requests from the client.

 

Classification of DBMS

•      Database management systems can be classified several ways, including

–   The data model used (this is the main criterion)

•   Possible models: relational, object, relational-object, XML, hierarchical, network,

–   The number of users

•   Possible: single user, multiuser

–   The number of sites that store the data

•   Possible: Centralized, distributed