Databases

database is any system for holding records of information.  It can be manual, for example, a card index, or a folder.  Usually we use it in the context of a computerised system.

If we were to retrieve information from a large manual database, it could take a long time, and there is the possibility of mistakes.  It will take about a second from a computer.  We can also use data to build up profiles, which is much harder, if not impossible, with a manual system.  For example, the Police hold records on a database of every person who has any criminal conviction. Most criminals show a pattern in the way that they commit crimes.  If a crime has a particular pattern to it, then the database can show very quickly a list of suspects.  Any fingerprint patterns can also be compared quickly and DNA patterns can lead the Police to a particular suspect.

Computerised databases use software called database management systems to access, update, and manipulate the data.  Microsoft Access is a DBMS.

The simplest kind of database is called a flat file, in which all the data records are held on a single file.  A data record is a collection of related data, for example, name, address, and telephone number.  They are simple to use, and even a spreadsheet can be used as one, and all word-processor programs allow for tables to be made.  They can be rather like a manual record, but on a computer.  However they have drawbacks:

  • Redundant data can make the file large;
  • Errors can arise through typing the data in different record;
  • It is harder to update and modify the data.

Flat files can be used throughout an organisation, which each department keeping their own records.  This can be referred to as a flat file information and retrieval system.  It has problems such as:

  • Redundancy of data
  • Adding new fields to a file is more difficult as all programs have to be modified;
  • The various different files are not able to link together.

Here is a typical flat file database table:

Student

First Name

Course 1

Course 2

Course 3

Arnfield Jason Physics Chemistry Maths
Byland Claire Biology Maths Chemistry
Caulfield Jason Physics Computing Maths
Denham Ian Computing Technology Electronics
Egan Jonathan Computing Law English

You can see how simple it is.  However if we had a whole college of 1500 students, the table would cover many pages.  A flat file database management system can do a simple search.  Each department has its own copy.

Now let’s rearrange the table by subject:

Subject

Student

Physics Arnfield
Caulfield
Chemistry Arnfield
Byland
Maths Arnfield
Byland
Caulfield
Computing Caulfield
Denham
Egan
And so on…

Notice that some of the cells in the subject column are blank.  The computer may not like that and return an error message “Cannot have a null value”.  Null means nothing in the field.  So we would have to modify our table to:

Subject

Student

Physics Arnfield
Physics Caulfield
Chemistry Arnfield
Chemistry Byland
Maths Arnfield
Mths Byland
Maths Cauliflower
Computing Caulfield
Computing Denham
Computing Egan
And so on…

There is redundancy here, which means that data is repeated.  This is inefficient use of computer space, and with a large file, the wasted space would be horrendous.

All these problems can be solved by a relational database system.  The advantages of this over flat files are many.  The principal advantages are:

  • All data is in a common pool, accessible by all applications.
  • The system is much more flexible.
  • Much less data is needed to be held.
  • Easier to maintain high quality information.

However there are still a couple of major problems:

  • Unproductive maintenance, where programs were still dependent on the structure of data. If a department needed to add, say, a new field, all the programs would have to be altered for all departments.
  • Lack of security, so that even the most sensitive information could be accessed

This was addressed by software called the database management system (DBMS).  There are two essential features:

1.      Storage structure of the data is hidden from the user.

2.      Restricted user access to the data.

Databases have their own particular jargon with which we need to be familiar.  Before that makes sense, we need to understand the hierarchy of data

Level

Term

Meaning

Lowest

Bit Binary digit, “on” or “off”, the 0’s and 1’s.
Byte 8 bits form a byte.  Bytes can represent characters, or other data, such as parts of a picture or a sound
Field Characters* grouped together, e.g. the letters or numbers that form a car registration number.
Record All the information about somebody or something
File A collection of related records

Highest

Database Lots of files linked so that information can be got from several files at the same time.

* Characters are represented by the ASCII codes (American Standard Code for Information Exchange).  With 8 bits, 28 (256) combinations can be represented.  In all its functions, arithmetical, or word-processing, or whatever, the computer is adding these combinations.  The vast majority of modern computers use the ASCII system.  There is another system called EBCDIC, used by a few old-fashioned mainframes.

Before a database is set up, we need to decide how we are going to lay out the records:

1.      The order in which fields are stored

2.      What each field is called

3.      What kinds of field, such as text, numeric, etc.

4.      The field length. Differences between lengths of a particular type of field can cause problems when the computer links the fields.

5.      The range of the data entered into the field.

6.      What validation rules there are.

This allows the automatic validation of data, e.g. rejection of a date like 31-2-84.  If there are to be any kinds of calculations, the field must be number.  In any database, there needs to be aprimary key, a unique identifier for any item.  In this database Cust_ID is the primary key as each customer has their own unique identity.

Notice also that the underscore character (_) is used where there is a space.  This is because some database management systems do not allow white space characters.  The name of the table, tblCustomer, is consistent with the Leszynski–Reddick Convention, which is the industry standard.

How a Relational Database is made up

The key idea for a relational database is that it is made up of different tables which have links between them.  They are linked using the primary keys.

The other fields in a table are dependent on the primary key.  So for the Customer Identity, the fields Address_1, Address_2, etc are dependent on the Customer_ID, which is logical.  Those fields would not depend on the Primary key Goods_ID, which would be in another table.

Go on to Database Design

VN:F [1.8.1_1037]
Rating: 0.0/10 (0 votes cast)
VN:F [1.8.1_1037]
Rating: 0 (from 0 votes)
  • Share/Bookmark

Improve the web with Nofollow Reciprocity.
Easy AdSense by Unreal