Data Storage
Chapter 13
Storing the Data for Databases
Databases are stored physically as files of
records
Today we will look at some of the aspects of physical storage. This includes:
The fact that it is much slower than electronic storage
A little about how it works
Some ways to speed it up
Memory hierarchy
Primary storage
is the fastest and most expensive
It includes cache memory (static RAM) and main memory (DRAM)
Flash memory
is EEPROM (electrically erasable programmable ROM)
Advantage is the fast access speed
Disadvantage is
that an entire block must be erased and written over at a time
Secondary storage is MUCH slower and less expensive.
Types of secondary storage
The main types of secondary storage include (in order
of decreasing speed)
magnetic disks,
optical storage
and
magnetic tapes
Your book gives a great description of disks, disk drives, and tapes.
Read it if you have not covered this material earlier
One of the relative newcomers to the field is RAID
technology
Disks
Disks are organized into tracks, then sectors, then blocks (or pages)
a
block has a specific hardware address
this
is its surface number, track number, and block number
Transfer from disk
to RAM is in BLOCK units
An entire block is transferred into a buffer in RAM
with a READ command
A WRITE command copies the contents of the buffer to a disk block
Sometimes several contiguous blocks, called a cluster,
are transferred as a unit
Time to transfer a disk block
Seek time --
find the correct track
Rotational delay -- find the correct sector
together these
two take about 15-60 milliseconds
Block transfer time --
Takes about 1-2
milliseconds for each block
Note that this is much less than the seek & rotational delay
So, it takes roughly 15-60 milliseconds (.015 to .06 seconds) to read or write a block on disk
In that time, a typical machine can execute several
million instructions
The time to
retrieve data from secondary storage is very high compared to CPU speeds
Result: I/O is a major bottleneck
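As a rough sanity check on the "several million instructions" claim, here is the arithmetic as a minimal Python sketch (the times come from the ranges above; the instruction rate is an assumed ballpark, not from the slides):

    seek_plus_rotation_ms = 20   # somewhere in the 15-60 ms range above
    transfer_ms = 1.5            # 1-2 ms per block
    block_access_ms = seek_plus_rotation_ms + transfer_ms

    instructions_per_second = 1e9   # assumed ballpark CPU rate
    lost = instructions_per_second * block_access_ms / 1000
    print(f"{lost:.0f} instructions while waiting on one block")  # ~21,500,000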
Secondary storage speed
There is a great need
to improve the performance of disk access
The performance of RAM has increased MUCH faster than that of disk access
The capacities
and speed of RAM double or quadruple every few years
Speed of disk
storage improves at less than 10% per year
Transfer rates
improve at ~ 20% per year
Disk capacity improves at about 50% per year
Data striping
One way to
improve the speed of disk access is data striping
Data striping
distributes data transparently over multiple disks to make them appear as a
single large, fast disk.
It improves
overall I/O because it allows multiple I/Os in parallel
This improves overall transfer rates since the data is
being transferred from many disks
Data striping
also helps balance the load among disks.
In addition, the data is often stored redundantly, with
error checking, improving reliability.
This architecture (called RAID) can improve both
speed and reliability.
Redundant Arrays of Independent Disks (RAID) technology
The main goal of RAID is to even out the widely
different rates of electronic storage and physical storage
It does this by using smaller parallel disks acting as
a single logical disk
The data is striped on the smaller disks, so it can be
read in parallel, thus speeding it up
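A minimal sketch of block-level striping (the disk count is an assumption for illustration): logical block i of the file goes to disk i mod n, so consecutive blocks can be read in parallel.

    N_DISKS = 4  # assumed number of disks in the array

    def place(logical_block):
        """Round-robin (striped) placement of logical blocks across disks."""
        disk = logical_block % N_DISKS            # which disk holds the block
        block_on_disk = logical_block // N_DISKS  # position on that disk
        return disk, block_on_disk

    # blocks 0..7 land on disks 0,1,2,3,0,1,2,3 -- so reading 8
    # consecutive blocks keeps all 4 disks busy at once
    print([place(i) for i in range(8)])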
Improving reliability with RAID
For an array of n disks, the likelihood of failure is n times that of a single disk
If the Mean
Time To Failure (MTTF) of a disk drive is 200,000 hours (about 22.8 years)
then that of a bank of 100 disks becomes only 2000 hours (83.3 days)
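Here is that MTTF arithmetic worked out (a minimal sketch; both numbers come from the example above):

    # MTTF of an array of n independent disks is roughly MTTF_disk / n
    mttf_disk_hours = 200_000          # single-disk MTTF from the example
    n_disks = 100

    print(mttf_disk_hours / (24 * 365.25))        # ~22.8 years for one disk
    mttf_array_hours = mttf_disk_hours / n_disks  # 2,000 hours for the array
    print(mttf_array_hours / 24)                  # ~83.3 days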
So the system
must be prepared for failure.
An obvious solution is to keep the data in more than
one place
There are many
disadvantages to this which must be overcome
One technique for introducing redundancy is called
mirroring.
Data mirroring
Data is written
redundantly to two disks that are treated as one logical disk
If a disk fails, the other disk is used until the
first is repaired
Disk mirroring doubles the rate at which read requests
are handled, since a read can go to either disk
Using mirror disks does not speed up writing
but neither does it slow it down, when
compared with a single disk, if the writing takes place in parallel.
Improving Performance with RAID
Data striping
gives a higher transfer rate, since data is transferred in parallel
Data striping may
be applied at different levels of granularity
Fine-grained disk arrays use small data blocks so that all requests are serviced using all the disks at the same time
The idea is to maximize the data transfer rate
Coarse-grained disk arrays use large data blocks so that small requests can be serviced in parallel, since they will access only a few disks
Large requests can still benefit from high transfer rates by using many disks
RAID architecture level 0 to level 6
Level 0 has no redundancy (which gives the highest write performance), while level 6 keeps two redundant disks
The granularity of the striping varies from bit-level striping
to block-level striping
Designers of a RAID setup have to confront many design decisions:
The level of RAID
The number of
disks
The choice of
parity schemes
The grouping of
disks for striping
Buffering of Blocks
When several blocks are transferred from disk to RAM
and all the block addresses are known, a buffer is needed for each block
The CPU can be reading one buffer while another is
being transferred
This is possible because usually there is a separate
I/O processor
Double buffering eliminates the seek time and
rotational delay for all but the first block
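A minimal sketch of the double-buffering idea in Python (the file name and block size are assumptions for illustration): a background thread fills one buffer while the caller processes the other.

    import threading

    BLOCK_SIZE = 4096  # assumed block size for illustration

    def read_blocks(path):
        # Yield blocks using two buffers: while the caller processes one
        # block, a background thread is already fetching the next one.
        with open(path, "rb") as f:
            current = f.read(BLOCK_SIZE)
            while current:
                result = {}
                t = threading.Thread(
                    target=lambda: result.setdefault("next", f.read(BLOCK_SIZE)))
                t.start()
                yield current          # CPU works on this buffer now
                t.join()               # wait for the other buffer to fill
                current = result["next"]

    for block in read_blocks("employees.dat"):  # hypothetical file name
        pass  # process the block here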
Putting file records on disk
Data is usually
stored in the form of records (we called them tuples)
The record is made of fields of related data
Each field has a specified data type, which may be of
fixed size, or variable
Most DBMS systems also allow images, digitized video or audio streams, or free text
These are referred to as BLOBs (Binary Large Objects)
A file is a sequence
of records
The records may be either fixed length or variable length
Fixed length records make locating a record relatively
easy
The
system must know the address of the block of the first record, and since it
knows the length of each record, it can find the start of all the records
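A minimal sketch of that address arithmetic (the record and block sizes are assumed values, not from the slides):

    RECORD_SIZE = 100   # bytes, assumed fixed-length record
    BLOCK_SIZE = 4096   # bytes, assumed

    records_per_block = BLOCK_SIZE // RECORD_SIZE  # blocking factor

    def locate(record_number):
        """Return (block number within the file, byte offset in that block)
        for the i-th fixed-length record, counting from 0."""
        block = record_number // records_per_block
        offset = (record_number % records_per_block) * RECORD_SIZE
        return block, offset

    print(locate(137))  # record 137 -> block 3, offset 1700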
Variable length records
Why records may be of variable length
a field itself
can be of variable length
a field may be
optional
there may be some
repeating groups
a block may
contain records from different files
How the file system finds each variable length record
The system must
know the address (track, sector, block) of the first record
Separator characters must be used in the record in several ways
They can be used to terminate variable-length fields
If some fields are optional, they can be used to separate <attribute, value> pairs
They can be used to indicate the end of the record
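A minimal sketch of parsing such a record (the particular separator characters are assumptions, not prescribed above):

    FIELD_SEP = "|"    # assumed: terminates each variable-length field
    PAIR_SEP = "="     # assumed: separates attribute from value
    RECORD_END = ";"   # assumed: marks the end of the record

    def parse_record(raw):
        """Parse one variable-length record whose optional fields are stored
        as <attribute, value> pairs, e.g. 'name=Smith|dept=Sales|;'."""
        body = raw.split(RECORD_END, 1)[0]   # drop the record terminator
        fields = {}
        for field in body.split(FIELD_SEP):
            if field:                        # skip the empty trailing piece
                attr, value = field.split(PAIR_SEP, 1)
                fields[attr] = value
        return fields

    print(parse_record("name=Smith|dept=Sales|;"))
    # {'name': 'Smith', 'dept': 'Sales'}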
Record blocking
Since the block is the unit of data transfer, record
addresses are block addresses
If the block is larger
than a record
multiple records per block
blocking
factor is the # of records/block
if records do not completely fill the block, space is
wasted
This space can be used by a partial record, which spans
two blocks
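A minimal sketch of the blocking-factor arithmetic for unspanned records (block size, record size, and record count are assumed values):

    import math

    B = 4096    # block size in bytes (assumed)
    R = 300     # fixed record size in bytes (assumed)
    r = 10_000  # number of records in the file (assumed)

    bfr = B // R            # blocking factor: records per block
    wasted = B - bfr * R    # unused bytes per block if unspanned
    blocks_needed = math.ceil(r / bfr)

    print(bfr, wasted, blocks_needed)  # 13 records/block, 196 bytes wasted, 770 blocks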
Spanned vs. Unspanned
Records
Records that span
blocks
fractions of a record can be stored on a block to use
unfilled space
then, a pointer at the end of the block points to the
next block
If a record is larger than a block, it must be spanned
Allocating files blocks on disk
If a file is
larger than one disk block, several blocks must be allocated to store the
file. This can be done several ways:
Contiguous allocation
reading
the whole file is very fast, using double buffering
expanding
the file is difficult
Linked allocation
easy
to expand the file
slower to read the whole file
Allocating clusters of consecutive blocks
the clusters are linked together
clusters are sometimes called segments or extents.
Indexed allocation
index blocks contain pointers to the actual file
blocks
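A minimal sketch contrasting linked and indexed allocation (the block numbers are invented for illustration):

    # Linked allocation: each block stores the number of the next block,
    # so following the chain costs one block read per block.
    next_block = {17: 42, 42: 8, 8: None}   # assumed on-disk links

    def linked_blocks(first):
        block = first
        while block is not None:
            yield block
            block = next_block[block]

    # Indexed allocation: one index block lists all the file's blocks,
    # so any block can be reached after reading the index.
    index_block = [17, 42, 8]               # assumed index contents

    print(list(linked_blocks(17)))  # [17, 42, 8]
    print(index_block[2])           # direct access to the 3rd block: 8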
File Headers (descriptors)
The file header contains information needed by the
programs that access the file records.
It includes:
Information to
determine the disk address of the file blocks
This
includes the track and sector
Record format
descriptions
field
length & order of fields for unspanned fixed-length records
May
include other information, such as a table for external hashing
To find a record on disk
One or more blocks are copied into RAM buffers
Using the information in the file header a search is
done for the desired record within the buffer
If the address of the block that contains the desired record is not known, a linear search through the file blocks must be done
This can be very
time consuming!
So, a good
file organization makes it easier to find the address of the block
desired
The goal is to have as few block transfers as possible
Good file organizations
What is meant by
a good file organization depends on how the file will be used.
Usually we expect
that some search conditions will be used more often than others.
A good file organization should perform the operations we expect to apply frequently to a file as efficiently as possible
Example
The EMPLOYEE file may be searched frequently by SSN to modify records, so an organization that makes finding records by SSN fast is important
But it may also be used frequently by payroll, which must access the employees by department
Some possible file organizations
Unordered records (heap files)
Sorted files
Hashed files
When deciding on a file organization, need to
consider:
How fast will
retrievals be (read only)
How fast will
updates be (read/write)
Unordered or heap files
Sometimes called a sequential or pile file
It is not the
heap that is used for priority queues
Indexes can be
used with this organization
Inserting records
is very efficient
Just
retrieve the last block in the file, add the record, then write it back to disk
Searching
requires a linear search
Must
search through the file, transferring block after block into RAM
Deleting is also
expensive since it must find the record
First the block where the record is kept must be found and copied into RAM, then the record is deleted from the buffer, then the block is rewritten to disk
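A minimal sketch of the block-access costs just described (the block count b is an assumed value):

    b = 770  # number of blocks in the file (assumed)

    insert_cost = 2          # read the last block, write it back
    avg_search_cost = b / 2  # linear search finds a record after ~b/2 blocks
    miss_cost = b            # an unsuccessful search reads every block

    print(insert_cost, avg_search_cost, miss_cost)  # 2 385.0 770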
Ordered files (sorted)
Must be ordered
on one field in the record (often a key field)
Retrievals are
often faster than unordered
Especially if need to read the records in order
It may be possible to use a binary search (but this doesn't work nearly as well as searching an ordered array)
Access to any but
the ordering field requires a sequential search of all the blocks until found
Inserting records
is expensive, since the file ordering must be maintained
Sometimes an overflow file is used; periodically it is
sorted and merged with the original file
Ordered files are
rarely used without a primary index.
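A minimal sketch of why binary search over the blocks still helps: each probe costs one block access, so a lookup on the ordering field takes about log2(b) block reads instead of b/2 (b assumed as before):

    import math

    b = 770  # number of blocks (assumed)

    linear_avg = b / 2                       # unordered file: ~385 block reads
    binary_worst = math.ceil(math.log2(b))   # ordered file: ~10 block reads

    print(linear_avg, binary_worst)  # 385.0 10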
Hashing
Can result in retrievals in constant time
The idea is to have a function (the hash function)
that takes a key field and turns it into the address where the record is stored
Then a search for the record within the block is done
For most records, only one block access is needed
Types of hashing
Internal hashing
External hashing
Hashing with
dynamic file expansion
Hash functions
The function should give us: key in → index out
We need to be sure that the index is within the range of our array
The mod function will do this for us
Suppose the key field is SS#
If we want 99
records in our array, we could divide by 99 and use the remainder.
Suppose the key field is a name, or other string
We could add
together the ASCII codes of the letters then divide by 99 and use the remainder
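A minimal sketch of those two hash functions (the table size of 99 comes from the example above; the sample keys are made up):

    TABLE_SIZE = 99  # number of slots, from the example above

    def hash_ssn(ssn: int) -> int:
        # numeric key: divide and keep the remainder
        return ssn % TABLE_SIZE

    def hash_name(name: str) -> int:
        # string key: sum the ASCII codes, then keep the remainder
        return sum(ord(ch) for ch in name) % TABLE_SIZE

    print(hash_ssn(123_45_6789))  # some slot in 0..98
    print(hash_name("Smith"))     # some slot in 0..98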
Internal hashing
Used when the database can fit into memory
The information is kept in an array of records
Each record has a
unique key field (name, number, whatever)
A hash function maps the key to an index in the array
A collision
occurs when two keys hash to the same index
Open addressing -- check the array for an open position, and put the record there
Chaining -- the record has a pointer field; when a collision occurs, it points to the space where the record can be found
Double hashing -- apply a second hash function if the first results in a collision
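A minimal sketch of internal hashing with chaining (a Python list stands in for each chain of pointers; the table size is an assumed value):

    TABLE_SIZE = 11  # assumed small table for illustration

    # each slot holds the chain of (key, record) pairs that collided there
    table = [[] for _ in range(TABLE_SIZE)]

    def insert(key, record):
        table[hash(key) % TABLE_SIZE].append((key, record))

    def lookup(key):
        # follow the chain at the hashed slot until the key matches
        for k, rec in table[hash(key) % TABLE_SIZE]:
            if k == key:
                return rec
        return None

    insert(1234, "Smith")
    insert(5678, "Wong")
    print(lookup(1234))  # 'Smith'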
External hashing
External hashing
is used when the database is kept on disk
The address space
is made of buckets, each of which holds multiple records
A bucket may be
one disk block, or a cluster of contiguous blocks
The hash function
maps a key into a bucket number
Then a table in
the file header converts the bucket number into the corresponding disk block
address.
Since each bucket can hold multiple records, the collision problem is less severe than with internal hashing
When a bucket becomes full, it can point to another bucket where records that hash to it are kept
Searching for a
value in a field that is not the hashed field requires a sequential search
This means all the blocks have to be transferred to RAM, then sequentially searched
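A minimal sketch of the external-hashing lookup path (the bucket count, the header's bucket-to-block table, and the block contents are all assumed for illustration):

    N_BUCKETS = 4  # assumed number of buckets

    # table kept in the file header: bucket number -> disk block address
    bucket_to_block = {0: 910, 1: 911, 2: 912, 3: 913}  # assumed addresses

    # stand-in for the disk: block address -> records stored in that block
    disk = {910: [], 911: [], 912: [], 913: []}

    def block_for(key):
        bucket = hash(key) % N_BUCKETS  # hash function -> bucket number
        return bucket_to_block[bucket]  # header table -> disk block address

    def insert(key, record):
        disk[block_for(key)].append((key, record))

    def lookup(key):
        # one block access, then a search within the buffer
        for k, rec in disk[block_for(key)]:
            if k == key:
                return rec
        return None

    insert(1234, "Smith")
    print(lookup(1234))  # 'Smith'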