I’m working on what is basically a fancy ETL script for migrating user model data from MongoDB to SQLite. The purpose is to check that
- the relational model can accurately, losslessly represent all the document model data currently in Mongo;
- the transformation logic is correct and lossless.
It does this by copying all the data over to SQLite, then copying it back and checking that it’s still the same. These are the steps the test performs:
- load user data from a MongoDB archive instance, as Mongo model objects
- transform them into bundles of SQLite model objects
- save users (only) to SQLite
- save associated user data to SQLite (linked ETH wallets, friendships). This requires the UserIDs from the previous step
- re-load users AND associated data from SQLite
- transform the SQLite models back into Mongo objects
- diff the Mongo objects from step 1 and step 6, and assert they are equivalent
If the check in step 7 passes, this means that every intermediate step is semantically correct.
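Put together, the harness looks roughly like this (a minimal sketch; the helper names are hypothetical stand-ins for the real ones, and sqlx with the mattn/go-sqlite3 driver is an assumption):

func TestMongoToSQLiteRoundTrip(t *testing.T) {
	// In-memory database: fast, and thrown away after the test (see below).
	db := DB{DB: sqlx.MustConnect("sqlite3", ":memory:")}
	mongoUsers := loadMongoArchive() // step 1: load Mongo model objects
	for _, mu := range mongoUsers {
		bundle := toSQLiteModels(mu)    // step 2: transform to SQLite models
		uID := db.SaveUser(bundle.User) // step 3: save the user row first
		db.SaveUserData(uID, bundle)    // step 4: wallets, friendships need the UserID
	}
	for _, mu := range mongoUsers {
		loaded := db.LoadUserBundle(mu.KeycloakUUID) // step 5: re-load user + data
		roundTripped := toMongoModel(loaded)         // step 6: convert back
		assertEquivalent(t, mu, roundTripped)        // step 7: diff vs. the original
	}
}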
Because this is a testing and iteration stage, I do these operations on an in-memory SQLite database. So I expect them to be very fast. The data size is 3571 users, which is not big. I was therefore surprised when running this test took almost 20 seconds. This is way too slow to rapidly iterate (also a red flag since these same queries are used in production code), so I investigated.
Timing info and query analysis
After adding some timing collection info to the test, I narrowed the culprit down to steps 5 and onward. Step 5 (querying the saved data from SQLite) has several SQL queries in it, so I further broke down the timing data by query function. Here’s what I found (timing data is in microseconds per user):
Iterations: 3571
GetUserByKeycloakUUID : 39.1 µs
GetWalletsForUser : 7.1 µs
GetActiveFriendshipsForUser : 4451.1 µs
assembling SQLite IR : 15.9 µs
SQLite To Mongo conversion : 0.5 µs
diffing : 24.6 µs
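The collection mechanism for this can be very simple; here's a minimal sketch (hypothetical names, not my exact code) that accumulates wall time per label, to be divided by the iteration count at the end:

type timings map[string]time.Duration

// track starts a stopwatch for a label; calling the returned func adds the
// elapsed time to that label's running total. Requires `import "time"`.
func (t timings) track(label string) func() {
	start := time.Now()
	return func() { t[label] += time.Since(start) }
}

// In the per-user loop:
//	stop := tm.track("GetActiveFriendshipsForUser")
//	friendships := db.GetActiveFriendshipsForUser(uID)
//	stop()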
Clearly, the slow part is querying for a user’s active friendships: at ~4450 µs per user across 3571 users, this one query alone accounts for roughly 16 of the ~20 seconds. In the sample set, the friendships table has about 300,000 rows.
Here is the table and the query function:
create table friendships (
    rowid integer primary key,
    user1_id integer not null references users(rowid),
    user2_id integer not null references users(rowid),
    -- `status` can be "accepted", or 2 flavors of "pending" or "rejected"
    status integer not null references friendship_status(rowid),
    is_deleted integer not null default 0, -- soft deletion
    -- ...other fields
    unique(user1_id, user2_id),
    check(user1_id < user2_id) -- prevent mirror-image duplicates
) strict;
-- `user1_id` is already indexed by the `unique` clause
create index friendships_user2 on friendships(user2_id);
create index friendships_index_status_is_deleted on friendships(status, is_deleted);
// GetActiveFriendshipsForUser returns the user's active (status = 1),
// non-soft-deleted friendships, whether the user is user1 or user2.
func (db DB) GetActiveFriendshipsForUser(uID UserID) (ret []Friendship) {
	PanicIf(db.DB.Select(&ret, `
		select `+friendshipSQLFields+`
		from friendships
		where status = 1
		and is_deleted = 0
		and (user1_id = ? or user2_id = ?)
	`, uID, uID))
	return
}
This doesn’t look like it should be super slow. Everything in it is indexed, especially the main filter on user1_id and user2_id. It should just be 2 index scans (one for the case where the given user is user1, one where they’re user2), assembling a set of around 100-200 rowids (300,000 double-ended friendships among 3571 users averages out to ~170 per user), looking them up in the table, and then filtering by status = 1 and is_deleted = 0.
SQLite’s expert
To figure out what’s going on, let’s check the expert tool. SQLite’s expert is a debugging tool, built into the SQLite CLI shell, that explains how a query will run and suggests new indexes that would make it faster.
From the SQLite docs:
For most non-trivial SQL databases, the key to performance is creating the right SQL indexes. In this context “the right SQL indexes” means those that cause the queries that an application needs to optimize to run fast. The “.expert” command can assist with this by proposing indexes that might assist with specific queries, were they present in the database.
The “.expert” command is issued first, followed by the SQL query on a separate line.
sqlite> .expert
sqlite> select *
   ...> from friendships
   ...> where status = 1
   ...> and is_deleted = 0
   ...> and (user1_id = ? or user2_id = ?)
   ...> ;
(no new indexes)
SEARCH friendships USING INDEX friendships_index_status_is_deleted (status=? AND is_deleted=?)
Unsurprisingly, there are no new index recommendations. However, the query plan is quite bad: it’s doing the index lookup on the (status, is_deleted) index. This index is going to have very bad selectivity, i.e., it doesn’t filter the results down very much. There are only 5 distinct statuses, and the majority of rows are likely in a single status (“friendship accepted”), with almost all the rest in one of the two flavors of “friendship pending”. And is_deleted won’t help at all, because the Mongo data model didn’t support soft-deleting friendships, so it will be 0 in every row.
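A quick way to see the skew for yourself, sketched with the same sqlx-style helpers the query functions use (the output struct's shape here is my assumption):

// Sketch: check how well (status, is_deleted) actually discriminates.
var dist []struct {
	Status    int `db:"status"`
	IsDeleted int `db:"is_deleted"`
	N         int `db:"n"`
}
PanicIf(db.DB.Select(&dist, `
	select status, is_deleted, count(*) as n
	from friendships
	group by status, is_deleted
`))
// Expect one or two (status, is_deleted) pairs to account for
// almost all 300,000 rows.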
I wanted it to use the indexes on user1_id and user2_id. I know that these will have much better selectivity. However, the query planner didn’t want to. I thought this was odd, so I checked what it did if I removed the status and is_deleted clauses from the query altogether:
sqlite> .expert
sqlite> select *
   ...> from friendships
   ...> where (user1_id = ? or user2_id = ?)
   ...> ;
(no new indexes)
MULTI-INDEX OR
INDEX 1
SEARCH friendships USING INDEX sqlite_autoindex_friendships_1 (user1_id=?)
INDEX 2
SEARCH friendships USING INDEX friendships_user2 (user2_id=?)
That’s the query plan I want. However, now the query will be wrong; I still need the status and is_deleted filters.
Special SQLite solution (not quite optimal, but interesting)
Since SQLite is an in-process database, unlike client-server database engines, there’s no real boundary between the database engine and the application logic: rows don’t have to cross a network or IPC boundary. So I can, without penalty, move the status = 1 and is_deleted = 0 “where” clauses into Go code:
func (db DB) GetActiveFriendshipsForUser(uID UserID) (ret []Friendship) {
	// "is_deleted" and "status" filters removed from the "where" clause!
	PanicIf(db.DB.Select(&ret, `
		select `+friendshipSQLFields+`
		from friendships
		where (user1_id = ? or user2_id = ?)
	`, uID, uID))
	// Implement the "where" clause artificially
	tmp := ret[:0] // Allocation-free filtering, reusing the slice's underlying array
	for _, f := range ret {
		if f.Status == 1 && !f.IsDeleted {
			tmp = append(tmp, f)
		}
	}
	ret = tmp
	return
}
This is both fun (it feels cheeky: “Umm, you can’t just do that!!!”) and it solves the problem, by taking advantage of SQLite’s nature as an embedded DB. Running the timing test again:
Iterations: 3571
GetUserByKeycloakUUID : 33.7 µs
GetWalletsForUser : 5.9 µs
GetActiveFriendshipsForUser : 146.7 µs
assembling SQLite IR : 9.6 µs
SQLite To Mongo conversion : 0.3 µs
diffing : 20.5 µs
The performance problem is gone; querying a user’s friendships is down from 4451.1 microseconds to 146.7 microseconds, a 30x speedup. This operation is no longer meaningfully a bottleneck, and the problem is solved.
More optimal solution (not specific to SQLite)
However, we can still do a bit better. The “just do it in Go” solution, although fun, is slightly sub-optimal: it involves an extra traversal of the results slice, and some copying within it. By reformulating the query and the indexes, we can push that filtering into the index scan itself and squeeze out some more performance.
Consider the original query again (the one I wish I could run):
select * from friendships
where status = 1
and is_deleted = 0
and (user1_id = ? or user2_id = ?)
No single index can serve the last condition, because it ORs together two different columns. But, as the query planner noticed above (when I stripped out the other conditions), what I’m really trying to do here is combine 2 separate lookups: one where the user in question is “user 1”, and one where they’re “user 2”. So the query can be reformulated as this:
select *
from friendships
where user1_id = ?
and status = 1
and is_deleted = 0
union all
select *
from friendships
where user2_id = ?
and status = 1
and is_deleted = 0
This is logically the same query, just re-arranged. Whether it’s more readable to a person is a matter of opinion; but it’s definitely more tractable for the query planner.
Now run THIS through SQLite expert:
sqlite> .expert
sqlite> select *
   ...> from friendships
   ...> where user1_id = ?
   ...> and status = 1
   ...> and is_deleted = 0
   ...> union all
   ...> select *
   ...> from friendships
   ...> where user2_id = ?
   ...> and status = 1
   ...> and is_deleted = 0
   ...> ;
CREATE INDEX friendships_idx_4b676fd3 ON friendships(user2_id, status, is_deleted);
CREATE INDEX friendships_idx_2f8b69da ON friendships(user1_id, status, is_deleted);
COMPOUND QUERY
LEFT-MOST SUBQUERY
SEARCH friendships USING INDEX friendships_idx_2f8b69da (user1_id=? AND status=? AND is_deleted=?)
UNION ALL
SEARCH friendships USING INDEX friendships_idx_4b676fd3 (user2_id=? AND status=? AND is_deleted=?)
This time, expert recommends new indexes, and they cover every column the query filters on! If I try them out and run the timing tests again, I get this:
Iterations: 3571
GetUserByKeycloakUUID : 34.7 µs
GetWalletsForUser : 6.4 µs
GetActiveFriendshipsForUser : 110.2 µs
assembling SQLite IR : 9.2 µs
SQLite To Mongo conversion : 0.3 µs
diffing : 21.8 µs
This isn’t a huge difference, but it does give us another 1.33x speedup, by folding the remaining where clauses into the index lookup and eliminating the extra traversal. Since querying friend lists is a frequent operation in our app, this is probably worth it.
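For completeness, the final version of the Go query function would look something like this (same wrapper and helpers as before):

func (db DB) GetActiveFriendshipsForUser(uID UserID) (ret []Friendship) {
	// Two fully-indexed lookups glued together with "union all";
	// no post-filtering in Go needed anymore.
	PanicIf(db.DB.Select(&ret, `
		select `+friendshipSQLFields+`
		from friendships
		where user1_id = ?
		and status = 1
		and is_deleted = 0
		union all
		select `+friendshipSQLFields+`
		from friendships
		where user2_id = ?
		and status = 1
		and is_deleted = 0
	`, uID, uID))
	return
}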
Summary
The final version of the “Get Friendships” query is 40x faster than the original. By getting annoyed at this test taking 20 seconds to run, I found a big improvement for a query frequently used in production code. The test itself now takes around 3 seconds; the problematic query went from ~99% of the test’s runtime to about 10%, and is no longer a major perf bottleneck.
Takeaways
Specific:
- SQL’s `or` operation doesn’t play nice with indexed columns; if the operands of an `or` are both indexed, breaking it into 2 queries with `union` can help a lot
General:
- Do timing and performance analysis to find bottlenecks
- Making tests fast is a good way to make production code fast. Don’t accept slow tests
- Making your code fast is always beneficial, because it means you can run it more frequently while developing. Slow code slows down your development.
- Use in-memory SQLite to make testing faster
- If a simple query on an in-memory database takes more than a few hundred microseconds, something is probably wrong. A full millisecond is “kill yourself” territory.
- Use SQLite `expert` to find ways to improve your query
- Use SQLite
Reference: timings (repeated from above)
Initial version
Iterations: 3571
GetUserByKeycloakUUID : 39.1 µs
GetWalletsForUser : 7.1 µs
GetActiveFriendshipsForUser : 4451.1 µs
assembling SQLite IR : 15.9 µs
SQLite To Mongo conversion : 0.5 µs
diffing : 24.6 µs
Doing filtering in Go
Iterations: 3571
GetUserByKeycloakUUID : 33.7 µs
GetWalletsForUser : 5.9 µs
GetActiveFriendshipsForUser : 146.7 µs
assembling SQLite IR : 9.6 µs
SQLite To Mongo conversion : 0.3 µs
diffing : 20.5 µs
Using union
Iterations: 3571
GetUserByKeycloakUUID : 34.7 µs
GetWalletsForUser : 6.4 µs
GetActiveFriendshipsForUser : 110.2 µs
assembling SQLite IR : 9.2 µs
SQLite To Mongo conversion : 0.3 µs
diffing : 21.8 µs