Database Concepts and Architecture

Data models

• The purpose of a model is to provide a level of abstraction (i.e. hide details)

• The model is a set of concepts that can be used to describe the structure of a database. The structure includes:

• data types and length

• relationships

• Constraints

• A data model often includes:

– A set of operations for retrievals & updates

– A set of valid user-defined operations that are allowed

• Different models have different objectives.

Data model categories

– A High level or conceptual model is close to the way users perceive the data.

• This type of model is often used to transmit ideas from end user to the DB designer.

• The E-R diagram is of this type

– A representational or implementation model hide some details of how the data is stored, but can be implemented on a computer

• The relational model is of this type

– A low level or physical model is the model of how the data is physically stored on the computer.

• This would include stored in a hash table or B+ tree

• The main data model we will look at is the relational model

– the data is perceived by the user as tables

– the operations on the tables generates new tables from base tables

• But we will use the E-R (Entity-Relationship) model to help with design

• We will also look at indexes and B+ trees, low level models

• The Object-oriented model is becoming more and more important, as is the object oriented/relational model

• The Hierarchical and Network model are older models (covered in the appendix of your text)

Database Schemas and Instance

• Understanding database systems requires making a clear distinction between the database’s structure (the schema) and its contents (the state or instances.)

• Schema - the description of the database structure

– Called the meta data (specified during db design)

– Does not change often

• Instance - the actual data

– Changes all the time

– The instance is the data at an instant in time

The DBMS architecture

• The architecture helps achieve the important characteristics of a DBMS

• Three of these important characteristics are:

– Data independence

– Multiple user views

– A catalog to store the metadata that is available to both the user and the DBMS

• The goal of the architecture used (three schemas) is to separate the user applications and the physical database

Three-schema Architecture

• Reminder: A schema is a description of the structure of the database

• Internal schema –describes the physical storage structure of the db

• Conceptual schema – describes the structure of the entire db for all users

• External schemas – multiple descriptions of the db, depending on what the particular user needs

• Notice that each of these schemas are only descriptions of data; the data really only exists at the physical level

• Note: most DBMS do not separate the three levels completely

External Level

• The end user is presented with special views that are tailored to their specific needs

• Some of these views may be forms to fill out, others for interactive retrieval of information, etc.

• Many different views may be available that are completely different that the way the base tables are arranged

• Note in your book the difference interfaces often provided by DBMS

– Web menus, forms, GUI, Natural language, interfaces for parametric users and the DBA

Conceptual or logical level

• Description of the structure of the entire DB

• Hides the details of physical storage

• Concentrates on describing entities, data types, relationships, user operations, and constraints.

• In the relational model, this is the base tables

• Changes at this level can be made without having any effect on the external level

Internal Level

• This is a description of the physical storage structure of the database

– This includes data structures like hash tables, indexes, B+ trees.

• Operations performed here are translated into modifications of the contents and structure of the files

Advantages of the three tiered architecture

• This division into levels allows both developers and users to work on their own levels

– They do not need to know the details of the other levels

– AND they do not have to know anything about changes in the other levels

Mappings

• Note this:

– The only data that is saved in the database is at the internal level

• Both the data and metadata are saved by the DBMS at the physical level

– Each user group refers only to its own external schema

– so the DBMS must transform a request on an external schema

• into a request against the conceptual schema,

• then into a request on the internal schema for processing.

• What if the request is for a retrieval?

– A lot of overhead!

• The process of transforming requests and results between levels are called mappings.

• These mappings may be time consuming, so some DBMS do not separate the three levels completely.

Data independence

• One reason for the three level architecture is to give data independence.

• One way to define data independence: the capacity to change the schema at one level of a DB system without having to change the schema at the next higher level

• If the schema at one level is changed,

– and the mapping to the next higher level are changed,

– the schema at the next level can remain unchanged.

Two types of data independence

• Logical data independence

– Possible to change the conceptual schema without changing external schema or application programs

– Columns could be added to tables without changing the existing external schema or application programs

– Changes here necessitate changes to the physical level

• Physical data independence

– Possible to change internal schema without changing the conceptual schema

– The mappings from physical to conceptual must be changed, of course

Data independence (cont.)

• Example of physical data independence:

– DBA decides it is more efficient to store the data in a B+ tree instead of a hash table, so changes the internal schema

– then he changes the mapping to the conceptual schema

– conceptual schema remains unchanged.

Client/Server Architectures

• First, look at client/servers in general, not necessarily DBMS.

• In its simplest form, the server is on a (usually powerful) computer that has information to share, and sits and waits for a client to ask it for information

• The client interacts with the user, finds out what they want, and then goes to the server and asks for the information, the takes it back to the client.

Two-tier client/server architecture for DBMS

• May be implemented several ways

– Client: user interface & application programs;
Server: the DBMS

– A connection must be made between the client and server;

• The standard ODBC provides an API (application programming interface) which allows clients to call the DBMS

• Vendors provide ODBC drivers to interact with their DBMS

• Another standard JDBC is used for the Java

• The tendency has been to shift more and more of the functionality to the client

Three-tier client/server architecture for DBMS

• Many web applications use a three tier architecture.

• An server is added between the two tiers we talked about above

• The middle tier is sometimes called the web server, or the application server

– One of the main functions of the middle tier is to provide additional security

– It can also do some processing of requests from the client.

Classification of DBMS

• Database management systems can be classified several ways, including

– The data model used (this is the main criterion)

• Possible models: relational, object, relational-object, XML, hierarchical, network,

– The number of users

• Possible: single user, multiuser

– The number of sites that store the data

• Possible: Centralized, distributed