A table is in 1NF if it contains only atomic values. That is, each column should have only one value for each row.
Example:
Suppose we have a table students
that stores information about students and their courses. An unnormalized table might look like this:
CREATE TABLE unnormalized_students (
student_id INT,
student_name VARCHAR(100),
courses VARCHAR(200)
);
INSERT INTO unnormalized_students (student_id, student_name, courses)
VALUES (1, 'John Doe', 'Math,Physics');
To convert it to 1NF, we need to split the courses
column into multiple rows:
CREATE TABLE normalized_students (
student_id INT,
student_name VARCHAR(100),
course VARCHAR(100)
);
INSERT INTO normalized_students (student_id, student_name, course)
VALUES (1, 'John Doe', 'Math'), (1, 'John Doe', 'Physics');
A table is in 2NF if it is in 1NF and every non - key attribute is fully functionally dependent on the primary key.
Example:
Consider a table orders
with columns order_id
, product_id
, product_name
, and quantity
. If the primary key is a composite key of order_id
and product_id
, and product_name
depends only on product_id
, the table is not in 2NF.
-- Non - 2NF table
CREATE TABLE non_2nf_orders (
order_id INT,
product_id INT,
product_name VARCHAR(100),
quantity INT,
PRIMARY KEY (order_id, product_id)
);
-- Normalized to 2NF
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100)
);
CREATE TABLE orders (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id),
FOREIGN KEY (product_id) REFERENCES products(product_id)
);
A table is in 3NF if it is in 2NF and there are no transitive dependencies. A transitive dependency occurs when a non - key attribute depends on another non - key attribute.
Example:
Suppose we have a table employees
with columns employee_id
, department_id
, department_name
. If department_name
depends on department_id
, and department_id
is not part of the primary key, the table is not in 3NF.
-- Non - 3NF table
CREATE TABLE non_3nf_employees (
employee_id INT PRIMARY KEY,
department_id INT,
department_name VARCHAR(100)
);
-- Normalized to 3NF
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(100)
);
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
department_id INT,
FOREIGN KEY (department_id) REFERENCES departments(department_id)
);
Many database design tools, such as MySQL Workbench, can assist in the normalization process. These tools allow you to visualize the database schema, identify potential normalization issues, and generate SQL code for creating normalized tables.
Using the correct data types for columns helps in reducing data redundancy and improving data integrity. For example, use INT
for storing integer values instead of VARCHAR
if the data is always a number.
-- Correct usage of data types
CREATE TABLE users (
user_id INT PRIMARY KEY,
age INT,
username VARCHAR(50)
);
Every table should have a primary key. The primary key uniquely identifies each row in the table and helps in maintaining data integrity.
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100)
);
Foreign keys are used to establish relationships between tables. They ensure that the data in one table refers to valid data in another table.
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
Over - normalization can lead to performance issues, as it may require multiple joins to retrieve data. Therefore, it is important to strike a balance between normalization and performance. In some cases, denormalization (the opposite of normalization) may be necessary for performance reasons.
Maintain detailed documentation of the database schema, including the tables, columns, primary keys, foreign keys, and relationships. This documentation helps in understanding the database structure and making future changes.
As the data requirements change over time, the database schema may need to be updated. Regularly review the schema to ensure that it remains normalized and efficient.
SQL database normalization is a fundamental concept in database design. By following the rules of 1NF, 2NF, and 3NF, and applying the best practices, you can create a database that is efficient, reliable, and easy to maintain. However, it is important to balance normalization with performance requirements and to adapt the schema as the data needs change.