Home/blogs/database_index

database_index

Database Index (SQL)

TL;DR

A database index is a data structure that improves the speed of data retrieval operations on a table at the cost of additional space and slower writes. It works similarly to the index of a book — letting the database quickly locate the rows you're querying for.


What is an Index?

An index in SQL databases is a performance optimization feature used to speed up the retrieval of rows. It’s created on one or more columns of a table and works like a lookup table that the database can use instead of scanning the entire dataset.

For example, you can create an index on the email column of a users table to speed up queries that search by email:

CREATE INDEX idx_users_email ON users(email);

Without an index, the database must scan the entire users table (a full table scan) to find the requested data.


How an Index Works

Under the hood, most relational databases use B-trees or variants like B+ trees for indexing. These structures allow logarithmic time complexity (O(log n)) for reads, compared to linear time complexity (O(n)) for full scans.

Imagine a Python example using a dictionary as a basic analogy:

# Unindexed: O(n) search
users = [{'id': 1, 'email': 'a@example.com'}, {'id': 2, 'email': 'b@example.com'}]
def find_user_by_email(email):
    for user in users:
        if user['email'] == email:
            return user

# Indexed: O(1) lookup using hash (simplified)
user_index = {'a@example.com': {'id': 1, 'email': 'a@example.com'},
              'b@example.com': {'id': 2, 'email': 'b@example.com'}}
def find_user_by_email_indexed(email):
    return user_index.get(email)

In reality, SQL databases use tree-based structures (not hash maps) to support range queries and ordering.


Pros and Cons

✅ Pros

❌ Cons


When to Use Indexes

Avoid indexing: