Skip to content
IRC-Coding IRC-Coding
Normal forms 1NF 2NF 3NF BCNF 4NF 5NF Database normalization Functional dependencies

Database Normalization: 1NF to 5NF Complete Guide

Master all database normal forms 1NF through 5NF with functional dependencies, multi-valued dependencies, and join dependencies.

S

schutzgeist

2 min read

Normal Forms: 1NF, 2NF, 3NF, BCNF, 4NF & 5NF Complete

This post is a comprehensive explanation of all normal forms in database theory – from 1NF to 5NF with practical examples.

In a Nutshell

Normal forms are formal rules for structuring relational databases that progressively eliminate redundancy and ensure data integrity. Each higher normal form builds on the previous one.

Compact Technical Description

Normalization is the process of progressively improving database structures through the application of formal rules. Each normal form addresses specific anomalies and dependencies.

Overview of Normal Forms:

  • 1NF: Atomic values, no repeating groups
  • 2NF: 1NF + no partial dependencies
  • 3NF: 2NF + no transitive dependencies
  • BCNF: Stronger 3NF with stricter rules
  • 4NF: 3NF + no multi-valued dependencies
  • 5NF: 4NF + no join dependencies

Mathematical Foundations:

  • Functional Dependencies: X → Y
  • Multi-Valued Dependencies: X →→ Y
  • Join Dependencies: (A, B, C)

Practical Relevance: The first three normal forms (1NF-3NF) are sufficient for most applications. Higher normal forms (4NF-5NF) are relevant for complex data structures and academic contexts.

Exam-Relevant Key Points

  • 1NF: Atomic values, no repeating groups, unique primary keys
  • 2NF: No partial dependencies on composite primary key
  • 3NF: No transitive dependencies via non-key attributes
  • BCNF: Every determinant is a candidate key
  • 4NF: No multi-valued dependencies except trivial ones
  • 5NF: No join dependencies except through candidate keys
  • Functional Dependencies: Foundation of normalization
  • IHK-relevant: Especially 1NF-3NF for database design

Core Components

  1. Functional Dependency (FD): X → Y (X uniquely determines Y)
  2. Multi-Valued Dependency (MVD): X →→ Y (X determines multiple Y values)
  3. Join Dependency (JD): (A, B, C) (table can be reconstructed from joins)
  4. Determinant: Attribute that determines other attributes
  5. Candidate Key: Minimal unique identifiers
  6. Primary Key: Selected candidate key
  7. Partial Dependency: Dependency on part of composite key
  8. Transitive Dependency: Indirect dependency via non-key attributes

Practical Examples

First Normal Form (1NF)

-- Problem: Non-atomic values and repeating groups
CREATE TABLE Bestellungen_Schlecht (
    bestell_id INT,
    kunde VARCHAR(100),
    artikel_liste VARCHAR(500), -- Non-atomic
    telefonnummern VARCHAR(200) -- Repeating group
);

-- 1NF Solution: Atomic values
CREATE TABLE Bestellungen_1NF (
    bestell_id INT PRIMARY KEY,
    kunden_id INT,
    bestelldatum DATE
);

CREATE TABLE Bestellpositionen (
    bestell_id INT,
    positionsnummer INT,
    artikel_id INT,
    menge INT,
    PRIMARY KEY (bestell_id, positionsnummer)
);

CREATE TABLE Kunden_Telefone (
    kunden_id INT,
    telefonnummer VARCHAR(20),
    PRIMARY KEY (kunden_id, telefonnummer)
);

Second Normal Form (2NF)

-- Problem: Partial dependencies
CREATE TABLE Bestellpositionen_2NF_Problem (
    bestell_id INT,
    artikel_id INT,
    menge INT,
    artikel_name VARCHAR(100), -- Depends only on artikel_id
    preis DECIMAL(10,2),       -- Depends only on artikel_id
    PRIMARY KEY (bestell_id, artikel_id)
);

-- 2NF Solution: Eliminate partial dependencies
CREATE TABLE Bestellpositionen_2NF (
    bestell_id INT,
    artikel_id INT,
    menge INT,
    PRIMARY KEY (bestell_id, artikel_id)
);

CREATE TABLE Artikel (
    artikel_id INT PRIMARY KEY,
    artikel_name VARCHAR(100),
    preis DECIMAL(10,2)
);

Third Normal Form (3NF)

-- Problem: Transitive dependencies
CREATE TABLE Mitarbeiter_3NF_Problem (
    mitarbeiter_id INT PRIMARY KEY,
    name VARCHAR(100),
    abteilungs_id INT,
    abteilungs_name VARCHAR(100), -- Transitive: abteilungs_id → abteilungs_name
    standort VARCHAR(50)           -- Transitive: abteilungs_id → standort
);

-- 3NF Solution: Eliminate transitive dependencies
CREATE TABLE Mitarbeiter_3NF (
    mitarbeiter_id INT PRIMARY KEY,
    name VARCHAR(100),
    abteilungs_id INT
);

CREATE TABLE Abteilungen (
    abteilungs_id INT PRIMARY KEY,
    abteilungs_name VARCHAR(100),
    standort VARCHAR(50)
);

Boyce-Codd Normal Form (BCNF)

-- Problem: 3NF but not BCNF
CREATE TABLE Projektmitarbeiter_BCNF_Problem (
    projekt_id INT,
    mitarbeiter_id INT,
    rolle VARCHAR(50),
    PRIMARY KEY (projekt_id, mitarbeiter_id)
);

-- Functional Dependencies:
-- projekt_id, mitarbeiter_id → rolle
-- projekt_id, rolle → mitarbeiter_id (Determinant is not a candidate key!)

-- BCNF Solution
CREATE TABLE Projektrollen (
    projekt_id INT,
    rolle VARCHAR(50),
    mitarbeiter_id INT,
    PRIMARY KEY (projekt_id, rolle)
);

Fourth Normal Form (4NF)

-- Problem: Multi-Valued Dependencies
CREATE TABLE Mitarbeiter_Kenntnisse_4NF_Problem (
    mitarbeiter_id INT,
    programmiersprache VARCHAR(50),
    projekt_id INT,
    PRIMARY KEY (mitarbeiter_id, programmiersprache, projekt_id)
);

-- MVDs: mitarbeiter_id →→ programmiersprache, mitarbeiter_id →→ projekt_id

-- 4NF Solution: Separation of MVDs
CREATE TABLE Mitarbeiter_Sprachen (
    mitarbeiter_id INT,
    programmiersprache VARCHAR(50),
    PRIMARY KEY (mitarbeiter_id, programmiersprache)
);

CREATE TABLE Mitarbeiter_Projekte (
    mitarbeiter_id INT,
    projekt_id INT,
    PRIMARY KEY (mitarbeiter_id, projekt_id)
);

Fifth Normal Form (5NF)

-- Problem: Join Dependencies
CREATE TABLE Lieferanten_Produkte_Kunden_5NF_Problem (
    lieferant_id INT,
    produkt_id INT,
    kunden_id INT,
    PRIMARY KEY (lieferant_id, produkt_id, kunden_id)
);

-- JD: *(Lieferanten_Produkte, Produkte_Kunden, Lieferanten_Kunden)*

-- 5NF Solution: Decomposition into projections
CREATE TABLE Lieferanten_Produkte (
    lieferant_id INT,
    produkt_id INT,
    PRIMARY KEY (lieferant_id, produkt_id)
);

CREATE TABLE Produkte_Kunden (
    produkt_id INT,
    kunden_id INT,
    PRIMARY KEY (produkt_id, kunden_id)
);

CREATE TABLE Lieferanten_Kunden (
    lieferant_id INT,
    kunden_id INT,
    PRIMARY KEY (lieferant_id, kunden_id)
);

Normalization Process

Step 1: Identification of Dependencies

-- Analysis of functional dependencies
-- Example: Order system
-- bestell_id → bestelldatum, kunden_id
-- kunden_id → kundenname, adresse
-- bestell_id, artikel_id → menge, einzelpreis
-- artikel_id → artikelname, lagerbestand

Step 2: Application of Normal Forms

-- 1NF: Ensure atomic values
-- 2NF: Eliminate partial dependencies
-- 3NF: Eliminate transitive dependencies
-- BCNF: Determinants as candidate keys
-- 4NF: Eliminate multi-valued dependencies
-- 5NF: Eliminate join dependencies

Step 3: Verification

-- Test for anomalies
INSERT INTO ... -- Insert anomalies?
UPDATE ...     -- Update anomalies?
DELETE ...     -- Delete anomalies?

Advantages and Disadvantages

Advantages of Normalization

  • Data Integrity: Avoidance of redundancy and inconsistencies
  • Maintainability: Changes required only in one place
  • Storage Efficiency: Reduction of redundancy
  • Consistency: Uniform data representation
  • Scalability: Better performance with large data volumes

Disadvantages

  • Performance: More joins required
  • Complexity: More complex data structure
  • Write Performance: Distributed updates
  • Over-Engineering: Excessive normalization can be counterproductive

Denormalization Strategies

Targeted Redundancy for Performance

-- Denormalized version for reporting
CREATE TABLE Bestellungen_Report (
    bestell_id INT PRIMARY KEY,
    bestelldatum DATE,
    kundenname VARCHAR(100), -- Redundant from customer table
    artikelname VARCHAR(100), -- Redundant from article table
    menge INT,
    gesamtpreis DECIMAL(10,2)
);

Common Exam Questions

  1. When is a table in 3NF but not in BCNF? When a determinant is not a candidate key (example: project employees with roles).

  2. What is the difference between 3NF and BCNF? BCNF is stricter: every determinant must be a candidate key.

  3. Explain Multi-Valued Dependencies! X →→ Y means that for each X value, the set of Y values is independent of other attributes.

  4. When are higher normal forms (4NF, 5NF) practically relevant? With very complex data structures involving many relationships, usually in academic or specialized systems.

Most Important Sources

  1. https://de.wikipedia.org/wiki/Normalisierung_(Datenbank)
  2. https://www.gatech.edu/coe/cse/database-normalization
  3. https://www.sql-tutorial.ru/sql-normalization-complete.html

Keine Bücher für Kategorie "datenbanken" gefunden.

Back to Blog
Share:

Related Posts