I’m working on what is basically a fancy ETL script for migrating user model data from MongoDB to SQLite. The purpose is to check that
- the relational model can accurately, losslessly represent all the document model data currently in Mongo;
- the transformation logic is correct and lossless.
It does this by copying all the data over to SQLite, then copying it back and checking that it’s still the same. These are the steps the test performs:
- load user data from a MongoDB archive instance, as Mongo model objects
- transform them into bundles of SQLite model objects
- save users (only) to SQLite
- save associated user data to SQLite (linked ETH wallets, friendships). This requires the UserIDs from the previous step
- re-load users AND associated data from SQLite
- transform the SQLite models back into Mongo objects
- diff the Mongo objects from step 1 and step 6, and assert they are equivalent
If the check in step 7 passes, this means that every intermediate step is semantically correct.
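Put together, the harness looks roughly like this (a minimal sketch; the helper names are hypothetical stand-ins for the real ones, and sqlx with the mattn/go-sqlite3 driver is an assumption):

func TestMongoToSQLiteRoundTrip(t *testing.T) {
	// In-memory database: fast, and thrown away after the test (see below).
	db := DB{DB: sqlx.MustConnect("sqlite3", ":memory:")}
	mongoUsers := loadMongoArchive() // step 1: load Mongo model objects
	for _, mu := range mongoUsers {
		bundle := toSQLiteModels(mu)    // step 2: transform to SQLite models
		uID := db.SaveUser(bundle.User) // step 3: save the user row first
		db.SaveUserData(uID, bundle)    // step 4: wallets, friendships need the UserID
	}
	for _, mu := range mongoUsers {
		loaded := db.LoadUserBundle(mu.KeycloakUUID) // step 5: re-load user + data
		roundTripped := toMongoModel(loaded)         // step 6: convert back
		assertEquivalent(t, mu, roundTripped)        // step 7: diff vs. the original
	}
}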
Because this is a testing and iteration stage, I do these operations on an in-memory SQLite database. So I expect them to be very fast. The data size is 3571 users, which is not big. I was therefore surprised when running this test took almost 20 seconds. This is way too slow to rapidly iterate (also a red flag since these same queries are used in production code), so I investigated.
Timing info and query analysis
After adding some timing collection info to the test, I narrowed the culprit down to steps 5 and onward. Step 5 (querying the saved data from SQLite) has several SQL queries in it, so I further broke down the timing data by query function. Here’s what I found (timing data is in microseconds per user):
Iterations: 3571
GetUserByKeycloakUUID : 39.1 µs
GetWalletsForUser : 7.1 µs
GetActiveFriendshipsForUser : 4451.1 µs
assembling SQLite IR : 15.9 µs
SQLite To Mongo conversion : 0.5 µs
diffing : 24.6 µs
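The collection mechanism for this can be very simple; here's a minimal sketch (hypothetical names, not my exact code) that accumulates wall time per label, to be divided by the iteration count at the end:

type timings map[string]time.Duration

// track starts a stopwatch for a label; calling the returned func adds the
// elapsed time to that label's running total. Requires `import "time"`.
func (t timings) track(label string) func() {
	start := time.Now()
	return func() { t[label] += time.Since(start) }
}

// In the per-user loop:
//	stop := tm.track("GetActiveFriendshipsForUser")
//	friendships := db.GetActiveFriendshipsForUser(uID)
//	stop()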
Clearly, the slow part is querying for a user’s active friendships: at ~4450 µs per user across 3571 users, this one query alone accounts for roughly 16 of the ~20 seconds. In the sample set, the friendships table has about 300,000 rows.
Here is the table and the query function:
create table friendships (
    rowid integer primary key,
    user1_id integer not null references users(rowid),
    user2_id integer not null references users(rowid),
    -- `status` can be "accepted", or 2 flavors of "pending" or "rejected"
    status integer not null references friendship_status(rowid),
    is_deleted integer not null default 0, -- soft deletion
    -- ...other fields
    unique(user1_id, user2_id),
    check(user1_id < user2_id) -- prevent mirror-image duplicates
) strict;
-- `user1_id` is already indexed by the `unique` clause
create index friendships_user2 on friendships(user2_id);
create index friendships_index_status_is_deleted on friendships(status, is_deleted);
// GetActiveFriendshipsForUser returns the user's active (status = 1),
// non-soft-deleted friendships, whether the user is user1 or user2.
func (db DB) GetActiveFriendshipsForUser(uID UserID) (ret []Friendship) {
	PanicIf(db.DB.Select(&ret, `
		select `+friendshipSQLFields+`
		from friendships
		where status = 1
		and is_deleted = 0
		and (user1_id = ? or user2_id = ?)
	`, uID, uID))
	return
}
This doesn’t look like it should be super slow. Everything in it is indexed, especially the main filter on user1_id and user2_id. It should just be 2 index scans (one for the case where the given user is user1, one where they’re user2), assembling a set of around 100-200 rowids (300,000 double-ended friendships among 3571 users averages out to ~170 per user), looking them up in the table, and then filtering by status = 1 and is_deleted = 0.
SQLite’s expert
To figure out what’s going on, let’s check the expert tool. SQLite’s expert is a debugging tool, built into the SQLite CLI shell, that explains how a query will run and suggests new indexes that would make it faster.
From the SQLite docs:
For most non-trivial SQL databases, the key to performance is creating the right SQL indexes. In this context “the right SQL indexes” means those that cause the queries that an application needs to optimize to run fast. The “.expert” command can assist with this by proposing indexes that might assist with specific queries, were they present in the database.
The “.expert” command is issued first, followed by the SQL query on a separate line.
sqlite> .expert
sqlite> select *
   ...> from friendships
   ...> where status = 1
   ...> and is_deleted = 0
   ...> and (user1_id = ? or user2_id = ?)
   ...> ;
(no new indexes)
SEARCH friendships USING INDEX friendships_index_status_is_deleted (status=? AND is_deleted=?)
Unsurprisingly, there are no new index recommendations. However, the query plan is quite bad: it’s doing the index lookup on the (status, is_deleted) index. This index is going to have very bad selectivity, i.e., it doesn’t filter the results down very much. There are only 5 distinct statuses, and the majority of rows are likely in a single status (“friendship accepted”), with almost all the rest in one of the two flavors of “friendship pending”. And is_deleted won’t help at all, because the Mongo data model didn’t support soft-deleting friendships, so it will be 0 in every row.
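A quick way to see the skew for yourself, sketched with the same sqlx-style helpers the query functions use (the output struct's shape here is my assumption):

// Sketch: check how well (status, is_deleted) actually discriminates.
var dist []struct {
	Status    int `db:"status"`
	IsDeleted int `db:"is_deleted"`
	N         int `db:"n"`
}
PanicIf(db.DB.Select(&dist, `
	select status, is_deleted, count(*) as n
	from friendships
	group by status, is_deleted
`))
// Expect one or two (status, is_deleted) pairs to account for
// almost all 300,000 rows.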
I wanted it to use the indexes on user1_id and user2_id. I know that these will have much better selectivity. However, the query planner didn’t want to. I thought this was odd, so I checked what it did if I removed the status and is_deleted clauses from the query altogether:
sqlite> .expert
sqlite> select *
   ...> from friendships
   ...> where (user1_id = ? or user2_id = ?)
   ...> ;
(no new indexes)
MULTI-INDEX OR
INDEX 1
SEARCH friendships USING INDEX sqlite_autoindex_friendships_1 (user1_id=?)
INDEX 2
SEARCH friendships USING INDEX friendships_user2 (user2_id=?)
That’s the query plan I want. However, now the query will be wrong; I still need the status and is_deleted filters.
Special SQLite solution (not quite optimal, but interesting)
Since SQLite is an in-process database, unlike client-server database engines, there’s no real boundary between the database engine and the application logic: rows don’t have to cross a network or IPC boundary. So I can, without penalty, move the status = 1 and is_deleted = 0 “where” clauses into Go code:
func (db DB) GetActiveFriendshipsForUser(uID UserID) (ret []Friendship) {
	// "is_deleted" and "status" filters removed from the "where" clause!
	PanicIf(db.DB.Select(&ret, `
		select `+friendshipSQLFields+`
		from friendships
		where (user1_id = ? or user2_id = ?)
	`, uID, uID))
	// Implement the "where" clause artificially
	tmp := ret[:0] // Allocation-free filtering, reusing the slice's underlying array
	for _, f := range ret {
		if f.Status == 1 && !f.IsDeleted {
			tmp = append(tmp, f)
		}
	}
	ret = tmp
	return
}
This is both fun (it feels cheeky: “Umm, you can’t just do that!!!”) and it solves the problem, by taking advantage of SQLite’s nature as an embedded DB. Running the timing test again:
Iterations: 3571
GetUserByKeycloakUUID : 33.7 µs
GetWalletsForUser : 5.9 µs
GetActiveFriendshipsForUser : 146.7 µs
assembling SQLite IR : 9.6 µs
SQLite To Mongo conversion : 0.3 µs
diffing : 20.5 µs
The performance problem is gone; querying a user’s friendships is down from 4451.1 microseconds to 146.7 microseconds, a 30x speedup. This operation is no longer meaningfully a bottleneck, and the problem is solved.
More optimal solution (not specific to SQLite)
However, we can still do a bit better. The “just do it in Go” solution, although fun, is slightly sub-optimal: it involves an extra traversal of the results slice, and some copying within it. By reformulating the query and the indexes, we can push that filtering into the index scan itself and squeeze out some more performance.
Consider the original query again (the one I wish I could run):
select * from friendships
where status = 1
and is_deleted = 0
and (user1_id = ? or user2_id = ?)
No single index can serve the last condition, because it ORs together two different columns. But, as the query planner noticed above (when I stripped out the other conditions), what I’m really trying to do here is combine 2 separate lookups: one where the user in question is “user 1”, and one where they’re “user 2”. So the query can be reformulated as this:
select *
from friendships
where user1_id = ?
and status = 1
and is_deleted = 0
union all
select *
from friendships
where user2_id = ?
and status = 1
and is_deleted = 0
This is logically the same query, just re-arranged. Whether it’s more readable to a person is a matter of opinion; but it’s definitely more tractable for the query planner.
Now run THIS through SQLite expert:
sqlite> .expert
sqlite> select *
   ...> from friendships
   ...> where user1_id = ?
   ...> and status = 1
   ...> and is_deleted = 0
   ...> union all
   ...> select *
   ...> from friendships
   ...> where user2_id = ?
   ...> and status = 1
   ...> and is_deleted = 0
   ...> ;
CREATE INDEX friendships_idx_4b676fd3 ON friendships(user2_id, status, is_deleted);
CREATE INDEX friendships_idx_2f8b69da ON friendships(user1_id, status, is_deleted);
COMPOUND QUERY
LEFT-MOST SUBQUERY
SEARCH friendships USING INDEX friendships_idx_2f8b69da (user1_id=? AND status=? AND is_deleted=?)
UNION ALL
SEARCH friendships USING INDEX friendships_idx_4b676fd3 (user2_id=? AND status=? AND is_deleted=?)
This time, expert recommends new indexes, and they cover every column the query filters on! If I try them out and run the timing tests again, I get this:
Iterations: 3571
GetUserByKeycloakUUID : 34.7 µs
GetWalletsForUser : 6.4 µs
GetActiveFriendshipsForUser : 110.2 µs
assembling SQLite IR : 9.2 µs
SQLite To Mongo conversion : 0.3 µs
diffing : 21.8 µs
This isn’t a huge difference, but it does give us another 1.33x speedup, by folding the remaining where clauses into the index lookup and eliminating the extra traversal. Since querying friend lists is a frequent operation in our app, this is probably worth it.
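For completeness, the final version of the Go query function would look something like this (same wrapper and helpers as before):

func (db DB) GetActiveFriendshipsForUser(uID UserID) (ret []Friendship) {
	// Two fully-indexed lookups glued together with "union all";
	// no post-filtering in Go needed anymore.
	PanicIf(db.DB.Select(&ret, `
		select `+friendshipSQLFields+`
		from friendships
		where user1_id = ?
		and status = 1
		and is_deleted = 0
		union all
		select `+friendshipSQLFields+`
		from friendships
		where user2_id = ?
		and status = 1
		and is_deleted = 0
	`, uID, uID))
	return
}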
Summary
The final version of the “Get Friendships” query is 40x faster than the original. By getting annoyed at this test taking 20 seconds to run, I found a big improvement for a query frequently used in production code. The test itself now takes around 3 seconds; the problematic query went from ~99% of the test’s runtime to about 10%, and is no longer a major perf bottleneck.
Takeaways
Specific:
- SQL’s `or` operation doesn’t play nice with indexed columns; if the operands of an `or` are both indexed, breaking it into 2 queries with `union` can help a lot
General:
- Do timing and performance analysis to find bottlenecks
- Making tests fast is a good way to make production code fast. Don’t accept slow tests
- Making your code fast is always beneficial, because it means you can run it more frequently while developing. Slow code slows down your development.
- Use in-memory SQLite to make testing faster
- If a simple query on an in-memory database takes more than a few hundred microseconds, something is probably wrong. A full millisecond is “kill yourself” territory.
- Use SQLite `expert` to find ways to improve your query
- Use SQLite
Reference: timings (repeated from above)
Initial version
Iterations: 3571
GetUserByKeycloakUUID : 39.1 µs
GetWalletsForUser : 7.1 µs
GetActiveFriendshipsForUser : 4451.1 µs
assembling SQLite IR : 15.9 µs
SQLite To Mongo conversion : 0.5 µs
diffing : 24.6 µs
Doing filtering in Go
Iterations: 3571
GetUserByKeycloakUUID : 33.7 µs
GetWalletsForUser : 5.9 µs
GetActiveFriendshipsForUser : 146.7 µs
assembling SQLite IR : 9.6 µs
SQLite To Mongo conversion : 0.3 µs
diffing : 20.5 µs
Using union
Iterations: 3571
GetUserByKeycloakUUID : 34.7 µs
GetWalletsForUser : 6.4 µs
GetActiveFriendshipsForUser : 110.2 µs
assembling SQLite IR : 9.2 µs
SQLite To Mongo conversion : 0.3 µs
diffing : 21.8 µs