Letarette - lrcli docs

Documentation

`lrcli` - the Letarette CLI

lrcli is a tool for performing operations on and interacting with a Letarette installation. Most operations are performed on a specific instance, and need to be run with direct access to the index database file.

Some operations connect via NATS, and can be run from any host with access to the system NATS servers.

`lrcli` configuration

lrcli uses a subset of the configuration environment variables that the letarette service uses, and in addition, accepts command-line arguments for specific lrcli operations.

All operations that access the database directly accept the -d <database> argument, overriding the environment setting.

Searching

lrcli can be used to search a Letarette cluster.

Run lrcli search -l 10 -p 2 docs 'animal -dog' to get page 2 of the search results from the "docs" space, matching the query "animal -dog", containing at most 10 hits per page.

There is a short initial delay before the query is executed while lrcli works out the current sharding configuration. Use -g 1 to make the Search Agent assume a shard size of 1 instead of waiting for the information.

Searches can also be performed interactively by running lrcli search -i docs.
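When the non-interactive search command is driven from a script, the query string is the part most likely to be mangled by shell quoting, since it can contain spaces and terms starting with a dash. The following is a minimal sketch, assuming Python is available on the host; the helper name `search_command` is hypothetical and only assembles the flags shown above. Passing the resulting list to `subprocess.run` avoids shell quoting altogether.

```python
import shlex


def search_command(space, query, limit, page):
    """Build the argument list for a non-interactive lrcli search.

    Only the flags documented above are used: -l (hits per page) and -p (page).
    """
    return ["lrcli", "search", "-l", str(limit), "-p", str(page), space, query]


cmd = search_command("docs", "animal -dog", limit=10, page=2)

# shlex.join shows the equivalent, safely quoted shell command line.
print(shlex.join(cmd))
```

The list in `cmd` can be handed to `subprocess.run(cmd)` on a host where lrcli is installed and can reach the NATS servers.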

Index operations

The index operations all access the index db file directly, and can be run while the Letarette service is running.

`lrcli index stats`

Shows some statistics and the current stemmer settings for an index.

`lrcli index check`

Runs an integrity check on the index.

`lrcli index compress`

Compresses previously uncompressed documents in the index.

`lrcli index optimize`

Performs step-wise index optimization by merging index b-trees. If cancelled before completion, the index will be partially optimized. Running the operation again will pick up where it left off.

`lrcli index rebuild`

Completely rebuilds the index from all collected raw data. Since the whole rebuild is performed as one transaction, it requires about twice the index size of free disk space.

`lrcli index pgsize <size>`

Sets the page size for all new index storage blobs. The default size is 16384.

`lrcli index forcestemmer`

The letarette service refuses to start with a stemming config that differs from the one the index was initially created with, since this would create an inconsistent index state. To force letarette to accept the new settings (from the current environment), run lrcli index forcestemmer.

Bulk-loading documents

Documents can be bulk-loaded into an index from JSON files. This can be used to reduce the initial sync time, or to create an index without a Document Manager.

During bulk-load, documents are loaded from files containing streams of JSON objects of the following format:

{
    "id": "string ID",
    "title": "document title",
    "text": "document text",
    "date": "ISO8601 date string"
}

The title or text field can be left empty.

Example file content:

{
  "id": "27532073",
  "title": "Love Me Do",
  "date": "2019-02-23T14:15:29.120Z"
}
{
  "id": "27532077",
  "title": "And I Love Her",
  "text": "A nice song",
  "date": "2019-02-23T14:15:29.120Z"
}
{
  "id": "27532078",
  "title": "Help!",
  "date": "2019-02-23T14:15:29.120Z"
}

Note that there is no comma between each JSON object.
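Because the file is a stream of concatenated objects rather than a single JSON array, a plain `json.load` call will reject it. The sketch below shows one way to read such a stream in Python for a sanity check before loading, using `json.JSONDecoder.raw_decode`. The names `iter_json_stream` and `unknown_fields` are hypothetical helpers and not part of lrcli. Which fields are mandatory is not spelled out here, so the check only flags keys outside the four documented ones.

```python
import json

# The four fields documented in the format above.
KNOWN_FIELDS = {"id", "title", "text", "date"}


def iter_json_stream(text):
    """Yield each object from a string of concatenated JSON objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def unknown_fields(doc):
    """Return the keys of doc that are not part of the documented format."""
    return sorted(set(doc) - KNOWN_FIELDS)


sample = """
{"id": "27532073", "title": "Love Me Do", "date": "2019-02-23T14:15:29.120Z"}
{
  "id": "27532077",
  "title": "And I Love Her",
  "text": "A nice song",
  "date": "2019-02-23T14:15:29.120Z"
}
"""

docs = list(iter_json_stream(sample))
for doc in docs:
    print(doc.get("id"), unknown_fields(doc))
```

A malformed object raises `json.JSONDecodeError` at the point where parsing fails, which helps locate the problem in a large file.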

The source file can be gzipped. If the documents do not have IDs, lrcli load can auto-assign IDs by using the -a flag. This can be used to create indexes that do not need updating.

lrcli load [-d <db>] [-m <max>] [-a] <space> <json>
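A bulk-load file is often generated by a script that exports documents from the primary data source. The following is a minimal sketch of such an export in Python, assuming that one object per line is an acceptable layout for the stream and that UTC timestamps with millisecond precision, as in the example file, are wanted. The function names and the file name `beatles.json.gz` are hypothetical.

```python
import gzip
import json
import os
import tempfile
from datetime import datetime, timezone


def iso8601_utc(dt):
    """Format an aware datetime as ISO 8601 in UTC, e.g. 2019-02-23T14:15:29.120Z."""
    stamp = dt.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return stamp.replace("+00:00", "Z")


def write_bulk_file(path, docs):
    """Write docs as a gzipped stream of JSON objects with no separating commas."""
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for doc in docs:
            out.write(json.dumps(doc, ensure_ascii=False))
            out.write("\n")


updated = iso8601_utc(datetime(2019, 2, 23, 14, 15, 29, 120000, tzinfo=timezone.utc))
docs = [
    {"id": "27532073", "title": "Love Me Do", "date": updated},
    {"id": "27532077", "title": "And I Love Her", "text": "A nice song", "date": updated},
]

# Written to a temporary directory here; any writable path will do.
path = os.path.join(tempfile.mkdtemp(), "beatles.json.gz")
write_bulk_file(path, docs)
print(path)
```

Following the usage line above, the resulting file would then be passed as the `<json>` argument, together with the target space.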

Updating synonyms

The index synonym list can be listed or updated from a file:

lrcli synonyms [-d <db>] [<json>]

Each synonym definition is a JSON array of the following format:

["description", ["synonym1", "synonym2", ...]]

Example:

["CM", ["cm", "centimeter", "centi"]]
["DM", ["dm", "decimeter", "deci"]]
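A synonym file can also be generated rather than written by hand. This is a small sketch in Python, assuming the file holds one definition per line with no separating commas, as the example suggests. The helper name `format_synonyms` is hypothetical.

```python
import json


def format_synonyms(groups):
    """Return one JSON array per line: ["description", ["synonym1", ...]]."""
    lines = [json.dumps([description, list(words)]) for description, words in groups.items()]
    return "\n".join(lines) + "\n"


groups = {
    "CM": ["cm", "centimeter", "centi"],
    "DM": ["dm", "decimeter", "deci"],
}

content = format_synonyms(groups)
print(content, end="")
```

The returned string can be written to a file and passed as the `<json>` argument of lrcli synonyms.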

Spelling index update

The spell fix index is automatically updated during normal operation. To force an update, run lrcli spelling update <mincount>, where <mincount> is the minimum number of occurrences of a word in the main index for it to be included in the spell fix index.

Monitoring

lrcli monitor starts a basic cluster monitor that prints cluster status messages from the different service instances.

Low-level database access

The index database can be queried directly for debugging or development. The database is a plain SQLite3 database, but it uses special extensions, which make it hard to query using the regular sqlite3 tool.

Using lrcli sql, the current index database can be queried directly like this:

lrcli sql "select count(*) from docs"

Since it might be hard to get the escaping of all quote characters right, the query can be loaded from a file:

lrcli sql @query.sql

Database migrations

The database schema is version managed using migrations. If, during development of Letarette itself, the migration state gets out of sync, lrcli can be used to reset the current migration level:

lrcli resetmigration 4

Note: this is only needed during development of Letarette when the service refuses to launch due to a previous unsuccessful migration.
