top of page
Search

Database Normalization

  • Writer: Maria Elena Morillo Tejada
    Maria Elena Morillo Tejada
  • May 24, 2022
  • 3 min read

Updated: Dec 6, 2023

Is the process of the structuring a relational database to eliminate or reduce data redundancy and improve data integrity. These rules were proposed by Edgar F. Codd as part of his relational model. The main part of a normalization process is including new tables and establishing relationships between new tables and the principal table follows the normalization rules to design a database more flexible and protect the data.


The process of normalization in a table is a progressive work, that’s means the highest level of database normalization can not be applied if the database doesn’t have the lowest level of normalization. For instances, if you want to implement a Third normal form in your model you have applied the First and the Second normal form before. Normalization has ten normal forms to make improvements in relational databases but implementing the First, Second, and Third can help to improve data integrity also reducing data redundancy.


In this post we will implement the normal forms until the third normal form, because the others in many cases are used for others purpose, like educational or specific rules that a database model design needs.


To start implementing the normal forms we have the set of data showed in the image below, that is totally denormalized.


De-normalized table Person

This table keeps track of a few pieces of information:

- Country where reside the person

- Address

- Phone numbers

- Gender


In this model, you can see how the data is repeated in some columns and doesn’t have any structure established, also has columns that are used for different purposes and not for the main purpose of the table Person, that it is to show the principal information of one person. For the problems mention before, we have to implement the normal forms to eliminate those.


First Normal Form (1FN): “No repeating groups”


The first normal form in a database is implemented if each column has a single value, that’s mean that you can’t have two addresses or two phones in an Address or Phone column.

To implement First normal form, we have to:

  • Create new row to storage each address and phone number in a unique row for each person.

  • Create a row identifier or ‘Primary Key’


We created: the same a row with the different information about the person identified with the primary key in Person table.


1FN



Second Normal Form (2FN): “Eliminate redundant data”


  • First normal form Implemented

  • Each non-key attribute must be functionally dependent on the primary key


The second normal form is implemented if the database is previously in First normal form.

‘Each non-key attribute must be functionally dependent on the primary key’, that’s means that the multiple Phone and Address values for each Person in the above table is not functionally dependent on Person (primary key), so this relationship is not in second normal form.


For this, we have to create a new table to store the multiple data about Address and Phone for each person related to the Person table by a foreign key (see my last post about Keys and Constraints).


Second Normal Form




Third Normal Form(3FN): "Has no transitive functional dependencies"

  • Second normal form implemented

  • Has no transitive functional dependency


The third normal form is implemented if the database is previously in Second normal form.


The third form say: ‘Has no transitive functional dependency’, as you can see, in the table Person with have information about each country of each person, but the information of the Country no depends on the Person primary key.



Third normal form implemented.



Another example is to adding another table to storage all the address postcode.





Conclusion


Database normalization is a common process that helps in eliminating data redundancy and improving in the data integrity. you should normalize if you feel that introducing update or insert anomalies can severely impact the accuracy or performance of your database application.


Also, normalization is a good choice to make relational models more robust and easier to understand. In other part, normalization is not recommended in system s where you are looking for fast and easy return of data, like a data warehouse system, where the information is storage like a heap of data. Lastly, to implement normalization you should know in a more deeply manner about database design and how to do relationships with others tables.

 
 
 

Comments


Post: Blog2 Post

©2022 by María Morillo - Coffee SQL and Tech -

bottom of page