Most likely, you’ll be asked this question in the context of discussing indexes, and they’ll expect you to understand that the order in which an index is created and used matters.
Let’s start with creation
When creating a composite index, you should be guided by selectivity. The most selective column (the one that narrows down results the most) should come first - for example, customer_id. Then you take a less selective column, such as status, and the last column, which should have the lowest selectivity, for example, created_at.
The reason for this order is the index structure. Simply put, it’s nested, and we can only use status efficiently if we already have customer_id.
Let’s look at an example
The index will be created correctly, but you need to explain the rule: when querying using this index, it will be efficient only if the values are provided from left to right.
So, what rules will the query planner follow when using this index:
- Obviously, if we provide
customer_id,status, andcreated_at, the index will work as expected. - If we provide only
customer_idandstatus, the index will still be used and will search by those two columns. - If we provide
customer_idandcreated_at, the index can still be used, but the query planner won’t takecreated_atinto account because we skipped status and thus broke the index prefix. You can see this by runningEXPLAIN. - If we provide only
created_at, the index won’t be used and a full scan will be performed, because we didn’t providecustomer_id, which is the first column (the index prefix).