What Is a Database?
Before we can answer the central question of this chapter, namely, What is MySQL? you and I must find a common language. Therefore, this section presents a rudimentary database glossary, without going into great detail. (If you have already had significant dealings with relational databases, you can skip the next couple of pages in good conscience.)
There is scarcely to be found a term that is less precise than database. A database can be a list of addresses residing in a spreadsheet program (such as Excel), or it can be the administration files of a telecommunications firm in which several million calls are registered daily, their charges accurately calculated, monthly bills computed, and warning letters sent to those who are in arrears.
A simple database can be a stand-alone operation (residing locally on a computer for a single user), while others may be used simultaneously by thousands of users, with the data parceled out among several computers and dozens of hard drives. The size of a database can range from a few kilobytes into the terabytes.
In ordinary usage, the word "database" is used to refer to the actual data, the resulting database files, the database system (such as MySQL or Oracle), or a database client (such as a PHP script or a program written in C++). Thus there arises a great potential for confusion as soon as two people begin to converse on the subject of databases.
Relations, Database Systems, Servers and Clients
A database is an ordered collection of data, which is normally stored in one or more associated files. The data are structured as tables, where cross references among tables are possible. The existence of such relations among the tables leads to the database being called a relational database.
Let us clarify matters with an example. A database might consist of a table with data on a firm's customers (name, address, etc.), a table with data on the products the firm offers, and finally, a table containing the firm's orders. Through the table of orders it is possible to access the data in the other two tables (for example, via customer and product numbers).
MySQL, Oracle, Microsoft SQL Server, and IBM DB2 are examples of relational database systems. Such a system includes the programs for managing relational databases. Among the tasks of a relational database system are not only the secure storage of data, but also such jobs as the processing of commands for querying, analyzing, and sorting existing data and for storing new data. All of this should be able to take place not only on a single computer, but over a network as well. Instead of a database system we shall often speak of a database server.
Where there are servers, there are clients. Every program that is connected to the database system is called a database client. Database clients have the job of simplifying the use of the database for the end user. No user of a database system in his or her right mind would wish to communicate directly with the database server. That is much too abstract and inconvenient. (Let programmers worry about such direct communication!) Instead, the user has a right to expect convenient tables, list boxes, and so on to enable the location of data or to input new data.
Database clients can assume a variety of forms, and indeed, they are often not recognized by the user as database programs at all. Some examples of this type of client are HTML pages for the display and input of messages in an on-line discussion group, a traditional program with several windows for managing addresses and appointments, and a Perl script for executing administrative tasks. There is thus wide scope for database programming.
Relational Versus Object-Oriented Database Systems
Relational databases have dominated the database world for decades, and they are particularly well suited for business data, which usually lend themselves to structuring in the form of tables. Except for the following two paragraphs, this entire book discusses only relational databases (though we shall not always stress this point).
Another kind of database is the object-oriented database. Such databases can store free-standing objects (without having to arrange them in tables). Although in recent years there has been a trend in the direction of object-oriented programming languages (such as Object-Store, O2, Caché), object-oriented databases have found only a small market niche.
Note that relational databases can be accessed by means of object-oriented programming languages. However, that does not turn a relational database into an object-oriented one. Object-oriented database systems enable direct access to objects defined in the programming language in question and the storage of such objects in the database without conversion (persistency). It is precisely this that is not possible with relational database systems, in which everything must be structured in tables.
Tables, Records, Fields, Queries, SQL, Index, Keys
We have already mentioned tables, which are the structures in which the actual data are located. Every line in such a table is called a data record, or simply record, where the structure of each record is determined by the definition of the table. For example, in a table of addresses every record might contain fields for family name, given name, street, and so on. For every field there are precise conditions on the type of information that can be stored (such as a number in a particular format, or a character string with a predetermined maximum number of characters).
The description of a database consisting of several tables with all of its fields, relations, and indexes (see below) is called a database model. This model defines the construction of the data structures and at the same time provides the format in which the actual data are to be stored.
Tables usually contain their data in no particular order (more precisely, the order is usually that in which the data have been entered or modified). However, for efficient use of the data it is necessary that from these unordered data a list can be created that is ordered according to one or more criteria. It is frequently useful for such a list to contain only a selection of the data in the table. For example, one could obtain a list of all of one's customers, ordered by ZIP code, who have ordered a rubber ducky within the past twelve months.
To create such a list, one formulates queries. The result of the query is again a table; however, it is one that exists in active memory (RAM) and not on the hard drive.
To formulate a query one uses SQL instructions, which are commands for selecting and extracting data. The abbreviation SQL stands for Structured Query Language, which has become a standard in the formulation of database queries. Needless to say, every producer of a database system offers certain extensions to this standard, which dilutes the goal of compatibility among various database systems.
When tables get large, the speed at which a query can be answered depends significantly on whether there is a suitable index giving the order of the data fields. An index is an auxiliary table that contains only information about the order of the records. An index is also called a key.
An index speeds up access to data, but it has disadvantages as well. First, every index increases the amount of storage on the hard drive necessary for the database file, and second, the index must be updated each time the data are altered, and this costs time. (Thus an index saves time in the reading of data, but it costs time in entering and altering data. It thus depends on the use to which the data are to be put whether an index is on the whole a net plus or minus in the quest for efficiency.)
A special case of an index is a primary index, or primary key, which is distinguished in that the primary index must ensure a unique reference to a record. Often, for this purpose one simply uses a running index number (ID number). Primary indexes play a significant role in relational databases, and they can speed up access to data considerably.
What is MySQL?
Part 1: Database glossary
Part 2: MySQL features
Part 3: MySQL limitations
Part 4: MySQL version numbers
Part 5: MySQL licensing
Part 6: MySQL version names
Part 7: MySQL alternatives
The above tip is excerpted from from Chapter 1, "What is MySQL?" of The Definitive Guide to MySQL 5 by Michael Kofler, courtesy of Apress. Find it helpful? Purchase the book here.
About the author: Michael Kofler holds a Ph.D. in computer science from Graz Technical University. He has written a number of successful computer books on topics such as Visual Basic, Visual Basic .NET, and Linux. Michael is the author of The Definitive Guide to MySQL 5, Third Edition and Definitive Guide to Excel VBA, Second Edition from Apress.
Dig deeper on Database Management Products and Solutions