What Are SQL Indexes? Why They Matter and How They Work

Indexes are one of the most important tools for improving database performance.

In this article, you’ll learn:

What SQL indexes are
Why they dramatically speed up queries
How to create them
An example showing performance differences
How they work internally

Let’s make indexes intuitive.

What Is an Index?

An index in SQL is a special data structure (usually a B-Tree) that allows the database to find rows much faster, without scanning the entire table.

You can think of an index as:

A sorted list of values
Each value points to a row in the main table
Organized so the database can search it quickly (logarithmic time)

Without an index → the database must check every row.
With an index → the database jumps directly to the matching values.

Why SQL Indexes Were Created

Indexes exist to solve one problem:

Speed up queries that search for specific values.

They were created because full table scans become too slow as datasets grow.
In a table with 10 rows, scanning them all is fast.
In a table with 10 million rows, doing this repeatedly becomes unusable.

Indexes allow:

Fast lookups
Fast sorting (ORDER BY)
Fast joins
Fast uniqueness checks (via UNIQUE index)

Databases like PostgreSQL, MySQL, and SQLite all rely on them.

Example with JOINs

Indexes also speed up JOINs dramatically.

Let’s use the users and orders tables:

SELECT users.name, orders.product
FROM orders
JOIN users ON orders.user_id = users.id
WHERE users.email = 'bob@example.com';

Recommended indexes

CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);

Without Indexes: Full Table Scans on Both Sides

When no indexes exist on users.email and orders.user_id, the database must:

Scan every row in users to find any rows matching the email condition
Scan every row in orders to find rows whose user_id matches those user IDs

This is called a sequential scan or full table scan.

Example scale:

users table: 100,000 rows
orders table: 1,000,000 rows

The database must examine:

100,000 user rows
1,000,000 order rows

Even though the final result might be a single row, the search cost is very high.

Approximate performance (varies by machine):

Checking 100k users + 1M orders
Without indexes: ~300–600 ms
With indexes: ~1–3 ms

That's a 100× to 500× improvement.

With Indexes: Fast, Direct Lookup

With the two indexes:

idx_users_email → allows instant lookup of the user with that email
idx_orders_user_id → allows instant lookup of all orders belonging to that user

Instead of scanning millions of rows:

The database jumps directly to the correct entries in the index (a B-tree lookup)
Only the relevant rows are touched
The JOIN becomes extremely fast

What Influences Performance?

Exact timing varies by hardware, but these machine-level factors matter most:

Factor	Effect
Disk type (SSD vs HDD)	Huge impact: SSDs are 10–100× faster for random access
RAM size	If tables/indexes fit in memory, queries become drastically faster
CPU speed	Affects how quickly the B-tree nodes are scanned
Buffer cache	Warm caches make repeated queries much faster
Table size	The larger the table, the more important indexes become

Indexes are essential in applications because even powerful machines struggle with repeated full-table scans as data grows.

Without indexes: the database must read every row in both tables
With indexes: the database performs fast lookups using efficient data structures
This makes JOINs scale to millions of rows with millisecond-level response times

How Indexes Work Internally (Simple Explanation)

Most SQL databases use **B-Trees **, which:

Keep values sorted
Allow binary-like search
Have extremely shallow depth (usually 3–5 levels even for huge tables)

This means:

Finding a row requires only a few comparisons
Access is predictable and fast
Indexes grow logarithmically, not linearly

You don’t need to understand the algorithm deeply, but here’s the key idea:

It’s much faster to search a sorted tree than a million unsorted rows.

When Should You Use an Index?

Use an index when you:

SELECT by a column frequently
Filter with WHERE
Sort using ORDER BY
JOIN tables on foreign keys
Enforce uniqueness (UNIQUE INDEX)
Search using equality or range queries

Indexes are most valuable on:

email
username
created_at
foreign keys (user_id, post_id)
slugs
lookup columns

When Not to Use an Index

Indexes have costs:

They take disk space
They slow down writes (INSERT/UPDATE/DELETE must update the index)

Avoid indexing:

boolean columns
very low-cardinality columns (e.g., gender, status = “active”)
columns rarely used in queries
very large text columns