Performance

The metric we are currently focusing on is execution speed: we want great performance from limited resources.

There are of course other metrics that we monitor and want to improve, such as ranking and spelling quality. We will get to those.

Query execution speed depends heavily on the performance of SQLite3 and FTS5, the core database components. In practice, this comes down to raw I/O performance and how well crafted the queries are.

The I/O cost can be cut significantly by having enough memory to hold the entire database file in cache. Letarette pre-loads the database file on launch to force it into the OS cache, so that cost never hits client search requests.

Preliminary benchmarks

Load generation

By using the load generation tool lrload from the Letarette repo, it's fairly easy to perform repeatable measurements.

The lrload tool has no dependencies other than a reachable Letarette service, and can be run in two modes: agent and runner.

The tool is launched in agent mode on the hosts that will generate the load requests. Several instances can be run on a single host to make use of more cores.

$ ./lrload agent
[INFO] 2019/12/30 00:02:14 Agent waiting for load requests

When all agents have been started, the lrload tool is started in runner mode, using a JSON-format "test set" file. There are a couple of examples in the repo.

The load runner can be run from any host; it starts a synchronized run on all available agents and collects the results over NATS messaging.

$ ./lrload run testdata/simple.json
Testset run on 1 concurrent agents in 2.95s

Success ratio: 100.0000%

Query processing times (seconds):
Mean:   0.01157186
Median: 0.00556206
90%:    0.028416386
95%:    0.028746996
99%:    0.036061194

Total roundtrip times (seconds):
Mean:   0.011806804
Median: 0.005537647
90%:    0.028416386
95%:    0.028716529
99%:    0.036061194
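
Percentile figures like the 90/95/99% lines above are derived from the collected per-request samples. As a generic illustration (not lrload's actual implementation), a nearest-rank percentile over a set of timings can be computed like this:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0-100) of the samples
// using the nearest-rank method: sort the samples and pick the
// value at rank ceil(p/100 * n).
func percentile(samples []float64, p float64) float64 {
	// Copy before sorting so the caller's slice is untouched.
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	rank := int(math.Ceil(p/100*float64(len(s)))) - 1
	if rank < 0 {
		rank = 0
	}
	return s[rank]
}

func main() {
	times := []float64{0.005, 0.006, 0.011, 0.028, 0.036}
	fmt.Println(percentile(times, 90))
}
```

Nearest-rank always returns an actual observed sample; interpolating methods can instead return values between samples.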

Test data set

The test data used here is a set of 1M articles extracted from the English Wikipedia. The title and text body of each article have been loaded as Letarette documents into three index shards.

Each shard contains roughly:

  • 333k documents
  • 1.9M unique terms
  • 1.8GB indexed text
  • 4.5GB index file

In these test runs, there is no stop-word handling, and the Letarette cache is turned off to measure raw index performance.

The tests were run on 1 to 10 concurrent agents, using two basic test sets, "simple" and "stopwords". The "simple" test set is 200 iterations of a single word, and the "stopwords" set is 50 iterations of a random selection from the 15 most common words in the index.

The Letarette service instances ran on DigitalOcean GP droplets with 4 vCPUs and 16GB memory.

It is clear that once all four CPUs are fully loaded, the time per request scales linearly with the number of concurrent agents.
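
That linear scaling is what simple arithmetic predicts: with c cores saturated and a fixed per-query service time s, N concurrent in-flight requests take roughly N·s/c each. A minimal sketch of that model (the numbers are illustrative, not the measured figures above):

```go
package main

import "fmt"

// latency estimates per-request time with n concurrent requests,
// cores CPU cores, and a fixed per-query service time (seconds).
// Below saturation each request gets a core to itself; above it,
// requests share the cores and time grows linearly with n.
func latency(n, cores int, service float64) float64 {
	if n <= cores {
		return service
	}
	return float64(n) * service / float64(cores)
}

func main() {
	for _, n := range []int{1, 4, 8, 16} {
		fmt.Printf("agents=%2d  est. latency=%.3fs\n", n, latency(n, 4, 0.012))
	}
}
```

Real behavior adds queueing variance and cache effects on top of this, but the first-order trend past saturation is the same straight line.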

WIP!

Letarette is pre-release software.

Features, performance enhancements and bug fixes are being worked on.

Please give it a try, and leave feedback if you find something odd, challenging or interesting!