JMA201-01
Fundamentals of Application Development

DEPARTMENT SCHEDULE SYLLABUS RESOURCES
Decomposing Tables and Joins

As we begin to work with databases we must consider normalization. Normalization of database tables is a process that looks to reduce inefficiency. In practical terms, it means we look for repeated values, and then decompose them.

Consider the following database table, based on cars:

I might fill in data like this:

By storing this data in a database, we gain the ability to sort and filter with SQL as we discussed earlier. However, it is inefficient. Notice the repetition in the Make column. If we have a lot of cars with the same make - even more likely if this database is for a Ford dealership, for example - we are repeating the word Ford over and over again.

Second, realize that we have the option to show only cars made by Ford with a simple select statement:

Select * from Cars where Make="Ford"

However - if a user had mistyped when adding a value - let's say they typed ford, or FORD, or Fjord... well, those cars, while MEANT to be Ford, would not show up in a report.

Normalization is a process of decomposing - moving an attribute to a different table, making a primary key, and using the primary key value in place of the attribute value. It eliminates redundancy. Here's how we would do this.

  1. Create a new table to hold the decomposed value. I'm going to create a table called Makes, and choose primary key column and a column to hold the attribute value:

  2. Next, let's see it with some data:

  3. Now is the hard part - we have to find a way to change the old table to make use of our ID's for the Make values. When we use the Primary key of one table as a value in another table, it is called a Foreign key. Here's what we're going to do - change the design if the Cars table by removing the Makes column, and add in a numeric column instead called MID which will hold the makes primary key. (Warning - this is difficult to do after data has been entered into the cars table, because certain constraints - like required - will likely be violated when we add the new column in. While it is possible to do, it is trickier; therefore, we will do this with NO records in the cars table.) So - lets open the Cars table and delete the values out (if you are just creating a database you could skip the deletion step!):

  4. Now, we flip into design view of the table, and remove the Make Column, and add an MID column with a type set to integer:

    And save the table design when we're done.
  5. To represent the data, I now use the foreign key (i.e. the primary key from Makes, MID) in the Cars table:

    A 2 in the MID column of the cars table in this case means that car is a Chevy.
  6. Using this sample data, what would it mean if we tried to put a 5 in the MID column of Cars for a record? There is no Make with an MID of 5. We would have a problem known as a Violation of Referential Integrity. We can (and should) tell Access to prevent these types of problems. To do this, in Access 2010, we will first close all the tables. Then, flip to the Database Tools tab, and click on Relationships:

  7. A window will pop up, asking what tables we want to add to our Relationships diagram:

    Click each table you want to add (both in our case) and click Add. When you have done this, click Close. This closes the Show Table Dialog, and allows us to proceed.
  8. The tables we added show up in the diagram.

    Click the Primary key (MID in Makes) and drag it to the Foreign key (MID in Cars), and let go.
  9. A dialog appears:

    This tells the database that we are making a relationship between the two columns. Check Enforce Referential Integrity, and hit OK.
  10. Now, the diagram shows a line between the related columns with a one sign and an infinity sign. This means MID in makes will have one value for a Make (i.e. Ford) but this value can be used many times in Cars.

  11. Close this relationships window.
  12. Open both tables up, and try to add a new Car. In this example, I'm trying to add a Sentra... I made up a value for MID, guessing that when I get around to adding Nissan to the Makes table I might have used number 7 for MID. However, the DB refuses this - it prevents violations of referential integrity. I got the following message:

    If I wanted to add a Nissan Sentra, I'd have to add Sentra to the Makes table, make a note of it's MID, and then use that in the Cars table. This may seem like a pain, but the issues we discussed above - typos of Ford/ford/fjord - should be eliminated because we only use Ford's key value. Also, note that instead of 4 characters every time we use the word Ford, there's only 1 number (1). If I had 100 Ford car records, I go from 400 caracters to represent that information down to 100!
  13. Now, of course, we have a problem - the Cars table is hard to read, because we don't easily know what make something is. We can solve this with a special type of query known as an Inner Join. Simply put. it instructs a database to:
    1. Retrieve related values from two different tables based on the equality of a key value (MID in our case)
    2. Shows the related values in reassembled record form
  14. So here's my inner join query on these two tables:
    Select Makes.Make, Cars.Model, Cars.Price from Makes Inner Join Cars on Makes.MID=Cars.MID
    And here's what I saw when I ran this:
  15. Congratulations - you've just normalized your first table... unless you were already in database class.