You have identified a slow API endpoint. You enabled the slow query log, isolated the bottleneck SQL statement, and realized a crucial WHERE clause was filtering on an unindexed column.
You confidently deployed a migration to add a B-Tree index. You ran the migration, verified the schema, and triggered the query again.
The result? The query is still slow.
When you run EXPLAIN ANALYZE, you see the dreaded words: Seq Scan (PostgreSQL), Table Scan (SQL Server), or ALL (MySQL). The optimizer is completely ignoring your new index.
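As a concrete illustration, a PostgreSQL plan for such a query looks roughly like the following. The table name, cost figures, and timings here are illustrative, not taken from a real system:

```sql
EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'ARCHIVED';
-- Seq Scan on orders  (cost=0.00..18584.00 rows=48000 width=72)
--                     (actual time=0.021..95.118 rows=50214 loops=1)
--   Filter: ((status)::text = 'ARCHIVED'::text)
--   Rows Removed by Filter: 949786
```

The "Rows Removed by Filter" line is a useful tell: the engine read the whole table and threw most of it away.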
This is not a bug in the database engine. It is a mathematical decision made by the Cost-Based Optimizer (CBO). This article explores exactly why databases ignore indexes and provides the rigorous technical fixes required to resolve it.
The Cost-Based Optimizer: It's Just Math
To fix the problem, you must understand the "Why." The Query Optimizer’s job is not to use indexes; its job is to retrieve data with the lowest estimated "cost."
Cost is usually defined by I/O operations.
- Sequential Scan: Reading data blocks linearly from the disk. This is fast per-block because it utilizes read-ahead buffering.
- Index Scan: Reading the B-Tree (Random I/O) to find a pointer, then jumping to the Heap (Table) to fetch the row (more Random I/O).
The Tipping Point: If your query retrieves more than roughly 5-10% of the table's total rows (the exact threshold varies with row width, caching, and cost settings), the optimizer calculates that the overhead of bouncing between the Index and the Heap is more expensive than simply reading the entire table sequentially.
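You can inspect the inputs to this calculation directly. In PostgreSQL, the planner's per-page cost constants default to the values below; as rough back-of-envelope arithmetic, scanning a 10,000-page table sequentially costs about 10,000 x 1.0, while fetching 200,000 rows through an index can cost up to 200,000 random heap visits x 4.0:

```sql
-- PostgreSQL planner cost constants (defaults shown)
SHOW seq_page_cost;     -- 1 : cost of one sequential page read
SHOW random_page_cost;  -- 4 : cost of one random page read
-- On fast SSDs, lowering random_page_cost (e.g. to 1.1, a commonly
-- recommended value) makes the planner more willing to pick indexes.
```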
If your index is being ignored, you likely fall into one of three specific implementation traps: Cardinality issues, Non-SARGable queries, or Data Type mismatches.
Trap 1: The Cardinality (Selectivity) Trap
Cardinality refers to the uniqueness of data values in a column. An index on a unique ID has high cardinality. An index on a Boolean column (is_active) or a low-variance status column has low cardinality.
If you have 1,000,000 rows and query for status = 'active', and 990,000 rows are active, the index is useless. The database would have to perform 990,000 random lookups. A sequential scan is orders of magnitude faster here.
The Fix: Partial Indexes
If you frequently query for the rare condition (e.g., the 1% of users who are banned), a full-column index can work, but it is large and its statistics describe the common values as much as the rare one. A Partial Index forces the optimizer's hand by only storing the relevant rows, making it small, cheap to maintain, and unambiguously selective.
Scenario: You want to quickly find archived orders, but 95% of your orders table consists of completed orders.
-- BAD: A full index on a low-cardinality column
-- The optimizer will likely ignore this for high-volume statuses
CREATE INDEX idx_orders_status ON orders (status);
-- GOOD: A Partial Index (PostgreSQL); SQL Server's equivalent is a Filtered Index
-- This index is tiny and highly selective. (MySQL does not support these.)
CREATE INDEX idx_orders_archived
ON orders (created_at)
WHERE status = 'ARCHIVED';
Now, when you run this query:
SELECT * FROM orders WHERE status = 'ARCHIVED';
The database sees an index whose predicate exactly matches the query and which contains only a small subset of the data. At that size, the index path is virtually guaranteed to be cheaper than a Sequential Scan.
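You can verify the switch with EXPLAIN. With the partial index in place, PostgreSQL typically produces a plan of roughly this shape (cost figures illustrative):

```sql
EXPLAIN SELECT * FROM orders WHERE status = 'ARCHIVED';
-- Bitmap Heap Scan on orders  (cost=512.42..4830.17 rows=50000 width=72)
--   Recheck Cond: ((status)::text = 'ARCHIVED'::text)
--   ->  Bitmap Index Scan on idx_orders_archived
```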
Trap 2: Functions on Columns (Non-SARGable Queries)
This is one of the most common reasons for ignored indexes in modern application development.
SARGable stands for Search ARGument ABLE. For a query to be SARGable, the engine must be able to leverage the B-Tree structure. If you wrap a column in a function, the database cannot navigate the tree. It must compute that function for every single row in the table to see if it matches.
The Problem: wrapping the column
Imagine you have an index on created_at.
-- SLOW: Non-SARGable
-- The DB must calculate DATE(created_at) for every row in the table
-- BEFORE it can compare it to '2023-10-01'.
SELECT * FROM users
WHERE DATE(created_at) = '2023-10-01';
The Fix: Move the Math to the Value
Always transform the input value, not the stored column data. You want to compare the raw column against a range.
-- FAST: SARGable
-- The DB can immediately jump to the start of the range in the B-Tree
-- and stop reading at the end of the range.
SELECT * FROM users
WHERE created_at >= '2023-10-01 00:00:00'
AND created_at < '2023-10-02 00:00:00';
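Functions are not the only non-SARGable shape. Any predicate the engine cannot map onto the index's left-to-right sort order forces a scan; a few common offenders (the table and column names here are illustrative, assuming indexes on the filtered columns):

```sql
SELECT * FROM employees WHERE salary * 1.1 > 50000;   -- arithmetic on the column: scan
SELECT * FROM employees WHERE salary > 50000 / 1.1;   -- SARGable rewrite: range scan
SELECT * FROM users WHERE email LIKE '%@example.com'; -- leading wildcard: the B-Tree
                                                      -- can only seek on a known prefix
```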
Alternative Fix: Functional Indexes
If you absolutely cannot change the query (perhaps it is generated by a legacy ORM), you can index the result of the function.
-- PostgreSQL (expression index)
-- Caveat: this works when created_at is a plain timestamp; for
-- timestamptz, DATE() depends on the session time zone, is not
-- IMMUTABLE, and CREATE INDEX will reject it.
CREATE INDEX idx_users_created_date ON users (DATE(created_at));
-- MySQL (5.7+): use a stored Generated Column
ALTER TABLE users ADD COLUMN created_date DATE AS (DATE(created_at)) STORED;
CREATE INDEX idx_users_created_date ON users (created_date);
-- SQL Server: use a Computed Column (note the different syntax)
ALTER TABLE users ADD created_date AS CAST(created_at AS DATE);
CREATE INDEX idx_users_created_date ON users (created_date);
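On MySQL 8.0.13 or later, you can also skip the generated column and index the expression directly as a functional key part; note the required extra parentheses. The generated-column route remains the portable fallback for older versions:

```sql
-- MySQL 8.0.13+: functional key part (extra parentheses are mandatory)
CREATE INDEX idx_users_created_date ON users ((CAST(created_at AS DATE)));
```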
Trap 3: The Silent Killer - Implicit Type Casting
Data type mismatches usually occur when the application language (e.g., JavaScript/TypeScript) treats types loosely, but the database is strict.
Scenario: You have a reference_code column defined as VARCHAR (or text in Postgres), but your records store numeric-looking values like "10023".
Your API receives a JSON number 10023. Your ORM or query builder passes it as an integer.
-- The schema
CREATE TABLE transactions (
id SERIAL PRIMARY KEY,
reference_code VARCHAR(50) -- Note: String type
);
CREATE INDEX idx_ref_code ON transactions(reference_code);
-- The Query (Sent by application)
SELECT * FROM transactions WHERE reference_code = 10023;
Why the Index Fails
In SQL Server, integer has higher data type precedence than varchar, so the database implicitly casts the column rather than the value; MySQL behaves similarly, converting both sides to a numeric type before comparing. (PostgreSQL is stricter: it typically refuses the comparison with an "operator does not exist: character varying = integer" error, which at least surfaces the bug immediately.)
-- What the DB actually runs:
SELECT * FROM transactions
WHERE CAST(reference_code AS INTEGER) = 10023;
This effectively wraps the column in a function (see Trap #2). The index is invalidated because the B-Tree is sorted alphabetically (string), not numerically.
The Fix: Align Your Types
Ensure your query parameters match the column definition exactly.
Option A: Fix the Application Code (Preferred)
// TypeScript / Node.js example
const refCodeInput: number = 10023;
// Cast to string BEFORE sending to the DB driver
const query = 'SELECT * FROM transactions WHERE reference_code = $1';
const values = [refCodeInput.toString()]; // sends '10023' as a string
// e.g. with node-postgres: await client.query(query, values);
Option B: Explicit Casting in SQL
If you are writing raw SQL, cast the value, not the column.
-- The DB can now use the string-based B-Tree index
SELECT * FROM transactions
WHERE reference_code = '10023';
-- OR
WHERE reference_code = CAST(10023 AS VARCHAR);
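With the types aligned, the plan flips back to an index access. In PostgreSQL you should see something like the following (cost figures omitted, shape illustrative):

```sql
EXPLAIN SELECT * FROM transactions WHERE reference_code = '10023';
-- Index Scan using idx_ref_code on transactions
--   Index Cond: ((reference_code)::text = '10023'::text)
```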
Deep Dive: Checking Statistics
Sometimes, your query is SARGable, your types match, and cardinality is fine, but the index is still ignored. This usually means the database's internal statistics are stale.
The database estimates row counts based on cached statistics. If you recently bulk-inserted or deleted 1,000,000 rows, the DB might think the table is still empty (and choose a Seq Scan) or huge (and choose a bad index).
The Fix: Update Statistics
PostgreSQL:
-- Updates stats for a specific table
ANALYZE VERBOSE transactions;
SQL Server:
UPDATE STATISTICS transactions;
MySQL:
ANALYZE TABLE transactions;
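In PostgreSQL you can check whether statistics are actually stale before forcing a refresh: pg_stat_user_tables records when the table was last (auto)analyzed and how many rows have changed since:

```sql
-- PostgreSQL: how stale are the stats for this table?
SELECT relname, last_analyze, last_autoanalyze,
       n_live_tup, n_mod_since_analyze
FROM pg_stat_user_tables
WHERE relname = 'transactions';
```

A large n_mod_since_analyze relative to n_live_tup is a strong hint that the planner is working from an outdated picture.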
Summary
When an index is ignored, avoid the temptation to add more indexes. Instead, investigate the relationship between your query structure and the B-Tree.
- Check Cardinality: Is the query returning >10% of the table? If so, a Sequential Scan is correct. If you need a subset, use Partial Indexes.
- Check SARGability: Are you applying functions to the column (YEAR(date), UPPER(email))? Rewrite the query to filter against the raw column.
- Check Data Types: Are you comparing Strings to Numbers? Align the input types to match the database schema exactly.
The database is deterministic. If you provide a path that is mathematically cheaper than reading the whole table, the optimizer will take it.