- Published on
What Are SQL Indexes? Why They Matter and How They Work
- Authors
What Are SQL Indexes? Why They Matter and How They Work
Indexes are one of the most important tools for improving database performance.
In this article, you’ll learn:
- What SQL indexes are
- Why they dramatically speed up queries
- How to create them
- An example showing performance differences
- How they work internally
Let’s make indexes intuitive.
What Is an Index?
An index in SQL is a special data structure (usually a B-Tree) that allows the database to find rows much faster, without scanning the entire table.
You can think of an index as:
- A sorted list of values
- Each value points to a row in the main table
- Organized so the database can search it quickly (logarithmic time)
Without an index → the database must check every row.
With an index → the database jumps directly to the matching values.
Why SQL Indexes Were Created
Indexes exist to solve one problem:
Speed up queries that search for specific values.
They were created because full table scans become too slow as datasets grow.
In a table with 10 rows, scanning them all is fast.
In a table with 10 million rows, doing this repeatedly becomes unusable.
Indexes allow:
- Fast lookups
- Fast sorting (
ORDER BY) - Fast joins
- Fast uniqueness checks (via UNIQUE index)
Databases like PostgreSQL, MySQL, and SQLite all rely on them.
Example with JOINs
Indexes also speed up JOINs dramatically.
Let’s use the users and orders tables:
SELECT users.name, orders.product
FROM orders
JOIN users ON orders.user_id = users.id
WHERE users.email = 'bob@example.com';
Recommended indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);
Without Indexes: Full Table Scans on Both Sides
When no indexes exist on users.email and orders.user_id, the database must:
- Scan every row in
usersto find any rows matching the email condition - Scan every row in
ordersto find rows whoseuser_idmatches those user IDs
This is called a sequential scan or full table scan.
Example scale:
userstable: 100,000 rowsorderstable: 1,000,000 rows
The database must examine:
- 100,000 user rows
- 1,000,000 order rows
Even though the final result might be a single row, the search cost is very high.
Approximate performance (varies by machine):
- Checking 100k users + 1M orders
- Without indexes: ~300–600 ms
- With indexes: ~1–3 ms
That's a 100× to 500× improvement.
With Indexes: Fast, Direct Lookup
With the two indexes:
idx_users_email→ allows instant lookup of the user with that emailidx_orders_user_id→ allows instant lookup of all orders belonging to that user
Instead of scanning millions of rows:
- The database jumps directly to the correct entries in the index (a B-tree lookup)
- Only the relevant rows are touched
- The JOIN becomes extremely fast
What Influences Performance?
Exact timing varies by hardware, but these machine-level factors matter most:
| Factor | Effect |
|---|---|
| Disk type (SSD vs HDD) | Huge impact: SSDs are 10–100× faster for random access |
| RAM size | If tables/indexes fit in memory, queries become drastically faster |
| CPU speed | Affects how quickly the B-tree nodes are scanned |
| Buffer cache | Warm caches make repeated queries much faster |
| Table size | The larger the table, the more important indexes become |
Indexes are essential in applications because even powerful machines struggle with repeated full-table scans as data grows.
- Without indexes: the database must read every row in both tables
- With indexes: the database performs fast lookups using efficient data structures
- This makes JOINs scale to millions of rows with millisecond-level response times
How Indexes Work Internally (Simple Explanation)
Most SQL databases use **B-Trees **, which:
- Keep values sorted
- Allow binary-like search
- Have extremely shallow depth (usually 3–5 levels even for huge tables)
This means:
- Finding a row requires only a few comparisons
- Access is predictable and fast
- Indexes grow logarithmically, not linearly
You don’t need to understand the algorithm deeply, but here’s the key idea:
It’s much faster to search a sorted tree than a million unsorted rows.
When Should You Use an Index?
Use an index when you:
SELECTby a column frequently- Filter with
WHERE - Sort using
ORDER BY - JOIN tables on foreign keys
- Enforce uniqueness (
UNIQUE INDEX) - Search using equality or range queries
Indexes are most valuable on:
- username
- created_at
- foreign keys (
user_id,post_id) - slugs
- lookup columns
When Not to Use an Index
Indexes have costs:
- They take disk space
- They slow down writes (INSERT/UPDATE/DELETE must update the index)
Avoid indexing:
- boolean columns
- very low-cardinality columns (e.g., gender, status = “active”)
- columns rarely used in queries
- very large text columns