Types & Handles
| Type | Description |
|---|---|
KVStore | Opaque handle to an open database. |
KVColumnFamily | Opaque handle to a column family. |
KVIterator | Opaque handle to a key-value iterator. |
KVStoreStats | Statistics counters (see Diagnostics). |
KVStoreConfig | Configuration struct for kvstore_open_v2. |
Error Codes
| Code | Value | Description |
|---|---|---|
KVSTORE_OK | 0 | Operation succeeded. |
KVSTORE_ERROR | 1 | Generic error. |
KVSTORE_BUSY | 5 | Database locked by another connection. |
KVSTORE_LOCKED | 6 | Database locked within the same connection. |
KVSTORE_NOMEM | 7 | Memory allocation failed. |
KVSTORE_READONLY | 8 | Database is read-only. |
KVSTORE_CORRUPT | 11 | Database file is corrupted. |
KVSTORE_NOTFOUND | 12 | Key or column family not found. |
KVSTORE_AUTH_FAILED | 13 | Wrong password or encrypted store opened without a key. |
KVSTORE_PROTOCOL | 15 | Database lock protocol error. |
All codes are aliases for the corresponding SQLITE_* values.
Constants
Journal Modes
| Constant | Value | Description |
|---|---|---|
KVSTORE_JOURNAL_DELETE | 0 | Rollback journal mode. |
KVSTORE_JOURNAL_WAL | 1 | Write-Ahead Logging — recommended. |
Sync Levels
| Constant | Value | Description |
|---|---|---|
KVSTORE_SYNC_OFF | 0 | No fsync — fastest, not crash-safe. |
KVSTORE_SYNC_NORMAL | 1 | WAL-safe — survives process crash (default). |
KVSTORE_SYNC_FULL | 2 | Power-safe — fsync on every commit. |
Checkpoint Modes
| Constant | Value | Description |
|---|---|---|
KVSTORE_CHECKPOINT_PASSIVE | 0 | Non-blocking copy; may not flush all frames. |
KVSTORE_CHECKPOINT_FULL | 1 | Wait for writers, then copy all frames. |
KVSTORE_CHECKPOINT_RESTART | 2 | Like FULL, then reset WAL write position. |
KVSTORE_CHECKPOINT_TRUNCATE | 3 | Like RESTART, then truncate WAL file to zero. |
Other
| Constant | Value | Description |
|---|---|---|
KVSTORE_MAX_COLUMN_FAMILIES | 64 | Maximum column families per database. |
KVSTORE_NO_TTL | -1 | Sentinel: key exists but has no expiry. |
Configuration — KVStoreConfig
typedef struct KVStoreConfig KVStoreConfig;
struct KVStoreConfig {
    int journalMode;   /* KVSTORE_JOURNAL_WAL (default) or KVSTORE_JOURNAL_DELETE */
    int syncLevel;     /* KVSTORE_SYNC_NORMAL (default), _OFF, or _FULL */
    int cacheSize;     /* page cache in pages (0 = default = 2000 pages ≈ 8 MB) */
    int pageSize;      /* page size in bytes (0 = 4096; new databases only) */
    int readOnly;      /* 1 = open read-only; default 0 */
    int busyTimeout;   /* ms to retry on SQLITE_BUSY (0 = fail immediately) */
    int walSizeLimit;  /* auto-checkpoint every N commits in WAL mode (0 = off) */
};
| Field | Default | Notes |
|---|---|---|
journalMode | KVSTORE_JOURNAL_WAL | WAL strongly recommended. |
syncLevel | KVSTORE_SYNC_NORMAL | Safe against process crash. |
cacheSize | 2000 (~8 MB) | Larger cache improves read-heavy loads. |
pageSize | 4096 bytes | Ignored for existing databases. |
readOnly | 0 | |
busyTimeout | 0 (fail immediately) | Set > 0 for multi-process workloads. |
walSizeLimit | 0 (disabled) | Auto-checkpoint every N committed writes. |
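To make the cache default concrete, the page-cache footprint is roughly cacheSize multiplied by pageSize. The sketch below is plain arithmetic in Python; the helper name `cache_bytes` is hypothetical and not part of SNKV.

```python
def cache_bytes(cache_pages: int = 2000, page_size: int = 4096) -> int:
    """Approximate page-cache footprint: number of pages times bytes per page."""
    return cache_pages * page_size


# Defaults from the table above: 2000 pages of 4096 bytes.
print(cache_bytes())        # 8192000 bytes, i.e. about 8 MB
# Doubling cacheSize doubles the footprint.
print(cache_bytes(4000))    # 16384000
```

Use this kind of estimate when sizing cacheSize for read-heavy loads; actual memory use may differ slightly because of per-page bookkeeping.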
Database Lifecycle
kvstore_open_v2
Open or create a database with full configuration control. Pass NULL for pConfig to use all defaults (WAL mode, NORMAL sync, 8 MB cache). Returns KVSTORE_AUTH_FAILED if the file is an encrypted store — use kvstore_open_encrypted instead.
/* default open */
KVStore *kv;
kvstore_open_v2("mydb.db", &kv, NULL);

/* fully configured */
KVStoreConfig cfg = {0};
cfg.journalMode = KVSTORE_JOURNAL_WAL;
cfg.syncLevel = KVSTORE_SYNC_FULL;
cfg.cacheSize = 4000;
cfg.busyTimeout = 5000;
kvstore_open_v2("mydb.db", &kv, &cfg);

/* read-only */
KVStoreConfig ro = {0};
ro.readOnly = 1;
kvstore_open_v2("mydb.db", &kv, &ro);
Returns KVSTORE_OK on success, or an error code otherwise.
kvstore_open
Simplified open. Equivalent to kvstore_open_v2 with only journalMode set; all other fields use defaults. Returns KVSTORE_AUTH_FAILED if the file is an encrypted store.
KVStore *kv = NULL;
kvstore_open("mydata.db", &kv, KVSTORE_JOURNAL_WAL);
kvstore_close
Close the database and free all resources. Any uncommitted transaction is rolled back. Runs a WAL checkpoint before closing.
Key-Value Operations — Default Column Family
kvstore_put
Insert or update a key-value pair. If the key already exists, its value is replaced. Keys and values are binary-safe. Without an explicit transaction, each call auto-commits.
kvstore_get
Retrieve a value by key. The returned buffer is heap-allocated by SNKV. Caller must free with sqliteFree() / snkv_free().
void *value = NULL;
int vlen = 0;
int rc = kvstore_get(kv, "user:1", 6, &value, &vlen);
if (rc == KVSTORE_OK) {
    printf("%.*s\n", vlen, (char*)value);
    sqliteFree(value);
}
Returns KVSTORE_OK if the key is found, or KVSTORE_NOTFOUND if it does not exist.
kvstore_delete
Delete a key. Returns KVSTORE_NOTFOUND if the key did not exist.
kvstore_exists
Check if a key exists without reading its value. More efficient than kvstore_get for existence checks. Sets *pExists to 1 if found, 0 otherwise.
Column Family Management
Column families provide multiple logical key-value namespaces within a single file. Each is stored in its own B-tree. Maximum KVSTORE_MAX_COLUMN_FAMILIES (64) per database.
kvstore_cf_create
Create a new column family. zName max 255 characters, must be unique. *ppCF receives the handle.
kvstore_cf_open
Open an existing column family. Returns KVSTORE_NOTFOUND if it does not exist.
kvstore_cf_get_default
Get a handle to the default column family. Always exists; created automatically on open.
kvstore_cf_drop
Delete a column family and all its data permanently. The default column family cannot be dropped. Also removes any hidden TTL index CFs.
kvstore_cf_list
List all column families. Caller must free each name and the array with sqliteFree().
char **names = NULL;
int count = 0;
kvstore_cf_list(kv, &names, &count);
for (int i = 0; i < count; i++) {
    printf("%s\n", names[i]);
    sqliteFree(names[i]);
}
sqliteFree(names);
kvstore_cf_close
Release the column family handle. Does not delete data.
CF Key-Value Operations
Identical to the default-CF variants but take a KVColumnFamily* handle.
Free values returned by kvstore_cf_get with sqliteFree().
Iterators
Iterate all key-value pairs in ascending lexicographic key order.
kvstore_iterator_create / kvstore_cf_iterator_create
Create a forward iterator for the default or a specific column family.
kvstore_iterator_first / next / eof
Position at first entry, advance to next, or test for end-of-sequence.
kvstore_iterator_key / value
Read the current key or value. Pointers are owned by the iterator — do not free them. Valid until the next next() or close() call.
KVIterator *it = NULL;
kvstore_iterator_create(kv, &it);
kvstore_iterator_first(it);
while (!kvstore_iterator_eof(it)) {
    void *key, *val;
    int klen, vlen;
    kvstore_iterator_key(it, &key, &klen);
    kvstore_iterator_value(it, &val, &vlen);
    printf("%.*s\n", klen, (char*)key);
    kvstore_iterator_next(it);
}
kvstore_iterator_close(it);
kvstore_iterator_close
Release the iterator and its cursor. Always call this when done.
Prefix Iterators
Pre-positioned at the first key matching the prefix. Stop automatically when keys no longer match. Do not call kvstore_iterator_first() — read directly, then call next().
KVIterator *it = NULL;
kvstore_prefix_iterator_create(kv, "user:", 5, &it);
/* already at first match; do NOT call first() */
while (!kvstore_iterator_eof(it)) {
    void *key;
    int klen;
    kvstore_iterator_key(it, &key, &klen);
    kvstore_iterator_next(it);
}
kvstore_iterator_close(it);
Reverse Iterators
Traverse keys in descending order. Use kvstore_iterator_last() and kvstore_iterator_prev() instead of first() and next(). All key/value/eof/close accessors work identically.
/* Full reverse scan */
KVIterator *it = NULL;
kvstore_reverse_iterator_create(kv, &it);
kvstore_iterator_last(it);
while (!kvstore_iterator_eof(it)) {
    void *key;
    int klen;
    kvstore_iterator_key(it, &key, &klen);
    kvstore_iterator_prev(it);
}
kvstore_iterator_close(it);

/* Reverse prefix: already positioned, do NOT call last() */
kvstore_reverse_prefix_iterator_create(kv, "user:", 5, &it);
while (!kvstore_iterator_eof(it)) {
    kvstore_iterator_prev(it);
}
kvstore_iterator_close(it);
Iterator Seek
Jump an open iterator to an arbitrary key in O(log N) time. Works on forward and reverse iterators, including prefix iterators.
kvstore_iterator_seek
Reposition the iterator without closing and reopening it.
- Forward iterator: seek to the first key >= (pKey, nKey).
- Reverse iterator: seek to the last key <= (pKey, nKey).
If the seek lands outside a prefix iterator’s prefix, the iterator immediately enters the eof state. Call kvstore_iterator_key / kvstore_iterator_next as usual after a successful seek.
| Return | Meaning |
|---|---|
KVSTORE_OK | Seek succeeded (check eof to know if a key was found). |
KVSTORE_ERROR | Invalid arguments. |
/* forward: scan from "m" onwards */
KVIterator *it = NULL;
kvstore_iterator_create(kv, &it);
kvstore_iterator_seek(it, "m", 1);
while (!kvstore_iterator_eof(it)) {
    void *key;
    int klen;
    kvstore_iterator_key(it, &key, &klen);
    kvstore_iterator_next(it);
}
kvstore_iterator_close(it);

/* reverse: scan from "m" backwards */
kvstore_reverse_iterator_create(kv, &it);
kvstore_iterator_seek(it, "m", 1);
while (!kvstore_iterator_eof(it)) {
    kvstore_iterator_prev(it);
}
kvstore_iterator_close(it);
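The seek rules can be modelled on an ordinary sorted list, which may help when reasoning about where an iterator lands. The sketch below is not SNKV code: `seek_forward` and `seek_reverse` are hypothetical helpers, and it assumes keys compare byte-wise in lexicographic order.

```python
from bisect import bisect_left, bisect_right


def seek_forward(sorted_keys, target, prefix=b""):
    """First key >= target; None models eof (no such key, or key outside prefix)."""
    i = bisect_left(sorted_keys, target)
    if i == len(sorted_keys):
        return None
    key = sorted_keys[i]
    return key if key.startswith(prefix) else None


def seek_reverse(sorted_keys, target, prefix=b""):
    """Last key <= target; None models eof (no such key, or key outside prefix)."""
    i = bisect_right(sorted_keys, target) - 1
    if i < 0:
        return None
    key = sorted_keys[i]
    return key if key.startswith(prefix) else None


keys = sorted([b"apple", b"mango", b"melon", b"user:1", b"user:2"])
print(seek_forward(keys, b"m"))                # b'mango'
print(seek_reverse(keys, b"m"))                # b'apple'
print(seek_forward(keys, b"zebra", b"user:"))  # None
```

A None result corresponds to the iterator entering the eof state, which is why callers should test eof after a seek that returns KVSTORE_OK.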
Transactions
Explicit ACID transactions for batching operations atomically. Without an explicit transaction, each put/delete auto-commits. Wrapping bulk writes in a single transaction is significantly faster.
kvstore_begin(kv, 1);
kvstore_put(kv, "key1", 4, "val1", 4);
kvstore_put(kv, "key2", 4, "val2", 4);
kvstore_commit(kv); /* both writes are atomic */
Set cfg.busyTimeout > 0 to auto-retry on lock contention, or handle the return code and retry manually.
TTL / Key Expiry
Per-key expiry with zero overhead for stores that never use TTL. Hidden index CFs are created lazily on first use and are invisible in kvstore_cf_list.
kvstore_put_ttl / kvstore_get_ttl
/* write key that expires in 30 minutes */
int64_t exp = kvstore_now_ms() + 30 * 60 * 1000;
kvstore_put_ttl(kv, "sess:abc", 8, token, tlen, exp);

/* read with lazy expiry */
void *val = NULL;
int nVal = 0;
int64_t rem = 0;
int rc = kvstore_get_ttl(kv, "sess:abc", 8, &val, &nVal, &rem);
if (rc == KVSTORE_OK) {
    /* rem = ms remaining */
    sqliteFree(val);
}
kvstore_ttl_remaining
| Return | *pnRemaining | Meaning |
|---|---|---|
KVSTORE_OK | KVSTORE_NO_TTL (-1) | Key exists, no expiry. |
KVSTORE_OK | 0 | Key just expired (lazy delete performed). |
KVSTORE_OK | N > 0 | N ms remain. |
KVSTORE_NOTFOUND | — | Key does not exist. |
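The four rows of the table can be read as a small decision procedure. The sketch below mirrors it in Python using the numeric values listed under Error Codes and Constants; `describe_ttl` is a hypothetical helper, not an SNKV function.

```python
# Values copied from the Error Codes and Constants tables.
KVSTORE_OK = 0
KVSTORE_NOTFOUND = 12
KVSTORE_NO_TTL = -1


def describe_ttl(rc: int, remaining_ms: int) -> str:
    """Translate a (return code, *pnRemaining) pair into a readable state."""
    if rc == KVSTORE_NOTFOUND:
        return "missing"
    if rc != KVSTORE_OK:
        return "error"
    if remaining_ms == KVSTORE_NO_TTL:
        return "permanent"
    if remaining_ms == 0:
        return "expired"
    return f"expires in {remaining_ms / 1000:.1f} s"


print(describe_ttl(KVSTORE_OK, KVSTORE_NO_TTL))  # permanent
print(describe_ttl(KVSTORE_OK, 1500))            # expires in 1.5 s
```

Checking the return code before the remaining value matters, because *pnRemaining carries no meaning when the key is not found.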
kvstore_purge_expired
Scan the expiry index and delete all expired keys in one write transaction. O(expired keys) — uses sorted expire-time prefix to stop at first non-expired entry.
The CF-level functions below take a KVColumnFamily* handle. Each user CF maintains independent TTL indexes; purging CF A never touches CF B.
kvstore_cf_put_ttl / kvstore_cf_get_ttl
CF-level equivalents of kvstore_put_ttl / kvstore_get_ttl. Semantics are identical; the TTL indexes are created lazily per CF and are fully independent.
kvstore_cf_ttl_remaining
Return remaining TTL for a key in this column family without fetching the value. Return codes and *pnRemaining semantics are identical to kvstore_ttl_remaining.
kvstore_cf_purge_expired
Scan and delete all expired keys in this column family only, in a single write transaction. O(expired keys). Purging CF A never affects other column families.
Conditional Insert (Put-If-Absent)
Atomically insert a key-value pair only when the key does not already exist. Expired keys (lazy TTL) are treated as absent. Supports an optional TTL on the new entry.
kvstore_put_if_absent / kvstore_cf_put_if_absent
expire_ms | Behaviour |
|---|---|
0 | No TTL — new entry is permanent. |
> 0 | Absolute expiry in ms since Unix epoch (kvstore_now_ms() + delta_ms). |
The existence check and the write are performed inside a single write transaction, so no other writer can insert the same key concurrently.
int inserted = 0;
kvstore_put_if_absent(kv, "lock", 4, "owner", 5, 0, &inserted);
if (inserted) {
    puts("acquired lock");
} else {
    puts("lock already held");
}

/* with TTL: session token that expires in 1 hour */
int64_t exp = kvstore_now_ms() + 3600 * 1000;
kvstore_put_if_absent(kv, session_id, sid_len, token, tok_len, exp, NULL);
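The interaction between put-if-absent and lazy expiry is easier to see in a toy model. The Python sketch below is an in-memory stand-in, not SNKV itself: the class `TinyStore`, its injectable clock, and the rule that an entry counts as expired once the clock reaches its expiry time are all assumptions made for illustration.

```python
import time


class TinyStore:
    """In-memory model of put-if-absent with lazy TTL expiry (illustrative only)."""

    def __init__(self, now_ms=None):
        self._data = {}  # key -> (value, expire_ms); expire_ms == 0 means permanent
        self._now_ms = now_ms or (lambda: int(time.time() * 1000))

    def put_if_absent(self, key, value, expire_ms=0):
        entry = self._data.get(key)
        if entry is not None:
            _, old_expire = entry
            # A live entry blocks the insert; an expired one is treated as absent.
            if old_expire == 0 or old_expire > self._now_ms():
                return False
        self._data[key] = (value, expire_ms)
        return True


clock = [1000]  # fake clock in ms so the example is deterministic
store = TinyStore(now_ms=lambda: clock[0])
print(store.put_if_absent(b"lock", b"worker-A", expire_ms=1500))  # True
print(store.put_if_absent(b"lock", b"worker-B"))                  # False
clock[0] = 1500                                                   # TTL elapses
print(store.put_if_absent(b"lock", b"worker-B"))                  # True
```

In the real store the check and the write share one write transaction; the model gets the same effect only because it is single-threaded.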
Bulk Clear
Remove every key-value pair from a column family in a single atomic write transaction. TTL index entries are also cleared. Much faster than iterating and deleting individually.
kvstore_clear / kvstore_cf_clear
Truncates the column family’s B-tree in O(pages) time using SQLite’s BtreeClearTable — the B-tree structure (root page) is preserved so future inserts work immediately without re-opening.
Close all iterators on the column family before clear is called. BtreeClearTable invalidates all cursors on the cleared table; accessing a stale cursor is undefined behaviour.
/* flush a rate-limit namespace at midnight */
kvstore_cf_clear(rate_limit_cf);

/* wipe everything in the default CF */
kvstore_clear(kv);
Key Count
Count keys in a column family without iterating over them. Uses SQLite’s sqlite3BtreeCount which reads only page headers — O(pages) ≈ O(N/100).
kvstore_count / kvstore_cf_count
Counts only the main data B-tree of the column family. TTL index CFs are separate B-trees and are not included. Expired-but-not-yet-purged keys are included in the count — call kvstore_purge_expired first for an exact live count.
int64_t n;
kvstore_count(kv, &n);
printf("default CF has %lld entries\n", (long long)n);

kvstore_purge_expired(kv, NULL);
kvstore_count(kv, &n);
printf("live entries after purge: %lld\n", (long long)n);
Diagnostics & Maintenance
kvstore_errmsg
Return the last error message string. Do not free. Valid until the next operation on the same handle.
kvstore_stats / kvstore_stats_reset
kvstore_stats fills the KVStoreStats struct. All counters are cumulative from open (or last reset). kvstore_stats_reset zeros them; nDbPages is a live gauge and is never zeroed.
| Field | Type | Description |
|---|---|---|
nPuts | uint64_t | Total put operations (includes put_ttl, put_if_absent). |
nGets | uint64_t | Total get operations (includes get_ttl). |
nDeletes | uint64_t | Total delete operations. |
nIterations | uint64_t | Total iterators created. |
nErrors | uint64_t | Total errors encountered. |
nBytesRead | uint64_t | Total value bytes returned by get operations. |
nBytesWritten | uint64_t | Total (key + value) bytes written by put operations. |
nWalCommits | uint64_t | Write transactions successfully committed. |
nCheckpoints | uint64_t | WAL checkpoints performed (auto + explicit). |
nTtlExpired | uint64_t | Keys lazily expired on get / exists calls. |
nTtlPurged | uint64_t | Keys removed by kvstore_purge_expired. |
nDbPages | uint64_t | Current total pages in the database file (gauge, not zeroed by reset). |
kvstore_integrity_check
Walk the entire B-tree and verify structural integrity. Returns KVSTORE_OK or KVSTORE_CORRUPT with details in *pzErrMsg. Caller must free *pzErrMsg with sqliteFree().
kvstore_sync
Force all pending changes to disk. If a write transaction is active, performs a commit-and-reopen cycle to flush the WAL.
kvstore_incremental_vacuum
Reclaim unused pages and shrink the database file. Pass 0 to free all unused pages, or a positive nPage to limit work per call (good for latency-sensitive apps).
kvstore_incremental_vacuum(kv, 0);  /* free all */
kvstore_incremental_vacuum(kv, 50); /* incremental: 50 pages */
kvstore_checkpoint
Copy WAL frames back into the main database file. pnLog and pnCkpt may be NULL. Requires no active write transaction. Non-WAL databases: no-op returning KVSTORE_OK.
int nLog = 0, nCkpt = 0;
kvstore_checkpoint(kv, KVSTORE_CHECKPOINT_PASSIVE, &nLog, &nCkpt);
printf("%d frames total, %d checkpointed\n", nLog, nCkpt);

/* reclaim WAL disk space */
kvstore_checkpoint(kv, KVSTORE_CHECKPOINT_TRUNCATE, NULL, NULL);
Memory Management
| Macro | Description |
|---|---|
sqliteMalloc(n) | Allocate n zero-initialised bytes. |
sqliteFree(p) | Free memory allocated by SNKV. |
sqliteRealloc(p, n) | Resize allocation. |
sqliteStrDup(s) | Duplicate a C string (equivalent to sqlite3_mprintf("%s", s)). |
snkv_malloc(n) | Alias for sqliteMalloc (single-header build). |
snkv_free(p) | Alias for sqliteFree (single-header build). |
Memory returned by kvstore_get, kvstore_cf_get, kvstore_cf_list, or kvstore_integrity_check must be freed with sqliteFree(). Iterator key/value pointers are iterator-owned; do NOT free them.
Thread Safety
- All kvstore_* calls are mutex-protected, so a single KVStore handle is safe across threads.
- In WAL mode, multiple readers run concurrently with a single writer.
- Only one write transaction can be active at a time; concurrent writes return KVSTORE_BUSY.
- For maximum throughput, each thread can open its own KVStore to the same file.
- KVIterator handles are not thread-safe; do not share them across threads.
Encryption
Per-value AES-grade encryption using XChaCha20-Poly1305 (via Monocypher). Passwords are stretched with Argon2id. All existing APIs — put, get, delete, iterate, TTL, column families — work transparently on encrypted stores. Keys are stored in plaintext; only values are encrypted.
kvstore_open_encrypted
Open or create an encrypted store. On a fresh file a new Argon2id-derived key is generated from the password. On an existing encrypted file the password is verified before access is granted. If the file is a plain (non-encrypted) store with existing data, all values are encrypted in-place using the supplied password — the store becomes fully encrypted transparently. Returns KVSTORE_AUTH_FAILED only if the password is wrong on an already-encrypted file. *ppKV is set to NULL on any error — no resource leak.
/* Create / open encrypted */
KVStore *kv = NULL;
int rc = kvstore_open_encrypted("mydb.db", "hunter2", 7, &kv, NULL);
if (rc == KVSTORE_AUTH_FAILED) {
    puts("wrong password"); /* kv is NULL here, so do not use it */
} else if (rc == KVSTORE_OK) {
    /* All standard operations work transparently */
    kvstore_put(kv, "key", 3, "secret", 6);
    kvstore_close(kv);
}
kvstore_is_encrypted
Returns 1 if the store was opened with kvstore_open_encrypted, 0 otherwise.
kvstore_reencrypt
Change the encryption password in-place. Derives a new key from the new password, re-encrypts every value in every column family, and updates the authentication metadata — all in a single atomic transaction. The old password is invalid immediately after this call returns.
kvstore_reencrypt(kv, "new-pass", 8);
/* kv is still open and usable with the new password */
kvstore_remove_encryption
Decrypt all values in-place and remove the authentication metadata. After this call the store is a plain (non-encrypted) SNKV database and can be opened with kvstore_open or kvstore_open_v2.
kvstore_remove_encryption(kv);
kvstore_close(kv);

/* Now opens without a password */
KVStore *plain = NULL;
kvstore_open("mydb.db", &plain, KVSTORE_JOURNAL_WAL);
kvstore_now_ms
Current time in milliseconds since the Unix epoch. Useful for computing absolute TTL expiry timestamps: kvstore_now_ms() + delta_ms.
- KDF: Argon2id (m=64 MB, t=3 iterations) — password → 256-bit key
- Cipher: XChaCha20-Poly1305 — 24-byte random nonce + 16-byte MAC per value
- Overhead: 40 bytes per stored value (nonce + MAC)
- Key scope: wiped from memory on kvstore_close
- Column families: all CFs share one key; re-encryption covers all CFs atomically
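The 40-byte overhead follows directly from the nonce and MAC sizes listed above. The Python sketch below does the arithmetic; the helper names are hypothetical, and it assumes the ciphertext is the same length as the plaintext (as with a stream cipher) and that no other framing is added.

```python
NONCE_BYTES = 24  # random nonce stored with each value
MAC_BYTES = 16    # Poly1305 authentication tag


def stored_size(plaintext_len: int) -> int:
    """Bytes on disk for one encrypted value: plaintext plus nonce plus MAC."""
    return plaintext_len + NONCE_BYTES + MAC_BYTES


def total_overhead(value_count: int) -> int:
    """Extra bytes across a store holding value_count encrypted values."""
    return value_count * (NONCE_BYTES + MAC_BYTES)


print(stored_size(6))             # 46: a 6-byte value such as "secret"
print(total_overhead(1_000_000))  # 40000000: about 40 MB for a million values
```

The overhead is fixed per value, so it weighs most heavily on stores with many small values.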
Vector Store
An HNSW approximate nearest-neighbour index built on top of KVStore, using the
usearch C SDK.
Vectors and KV data share a single .db file. The index is rebuilt from
the database on open and saved to a .usearch sidecar on close for fast
reload. All writes are atomic KVStore transactions; the usearch index is updated after
each successful commit.
Include kvstore_vec.h and link against libsnkv_vec.a
(built with make vector). The core store (libsnkv.a / snkv.h)
does not include the vector layer, which requires g++ to compile the usearch C++ core.
Types & Constants
Opaque handle
KVVecStore | Opaque handle returned by kvstore_vec_open. |
Result structs
KVVecSearchResult | pKey/nKey, pValue/nValue, distance, pMetadata/nMetadata. Free with kvstore_vec_free_results. |
KVVecKeyResult | pKey/nKey, distance. Free with kvstore_vec_free_key_results. |
KVVecStats | Filled by kvstore_vec_stats. See Stats section. |
KVVecItem | One item for kvstore_vec_put_batch: pKey/nKey, pVal/nVal, pVec, pMeta/nMeta. |
Distance spaces
KVVEC_SPACE_L2 | 0 | Squared Euclidean (‖a−b‖²); ordering matches true L2, but values are not square-rooted. Default. |
KVVEC_SPACE_COSINE | 1 | Cosine distance (1 − dot(a,b) / (‖a‖·‖b‖)) |
KVVEC_SPACE_IP | 2 | Inner product (negative dot product) |
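The three distance definitions in the table can be written out directly. The Python sketch below follows the table's wording (including "negative dot product" for the inner-product space) and is only a reference calculation, not the usearch implementation; the function names are hypothetical.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean distance: sum of squared component differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus the cosine similarity of a and b (both must be non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def inner_product_distance(a, b):
    """Negative dot product, so larger dot products rank as closer."""
    return -sum(x * y for x, y in zip(a, b))


d2 = squared_l2([0, 0], [3, 4])
print(d2)             # 25
print(math.sqrt(d2))  # 5.0, the true Euclidean distance
```

Because the square root is monotonic, ranking by the squared value gives the same nearest neighbours as ranking by true L2.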
KVVEC_SPACE_L2 returns squared Euclidean distance; the square root is omitted for performance. Relative ordering is identical to true L2, so nearest-neighbour results are unaffected. To recover metric distance, take sqrtf(result.distance).
Index precision (dtype)
KVVEC_DTYPE_F32 | 0 | float32 — full precision (default) |
KVVEC_DTYPE_F16 | 1 | float16 — half RAM, negligible recall loss |
KVVEC_DTYPE_I8 | 2 | int8 — quarter RAM, cosine-like metrics only |
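The "half RAM" and "quarter RAM" notes follow from element width: 4 bytes for float32, 2 for float16, 1 for int8. The Python sketch below estimates raw vector payload only; the helper name and the short dtype labels are hypothetical, and HNSW graph links are deliberately left out of the estimate.

```python
# Bytes per vector component for each index precision.
BYTES_PER_ELEMENT = {"f32": 4, "f16": 2, "i8": 1}


def vector_bytes(dim: int, dtype: str = "f32") -> int:
    """Raw payload of one vector: dimension times bytes per component."""
    return dim * BYTES_PER_ELEMENT[dtype]


for name in ("f32", "f16", "i8"):
    # Payload for one million 128-dimensional vectors, in bytes.
    print(name, vector_bytes(128, name) * 1_000_000)
```

For 128 dimensions this gives 512, 256, and 128 bytes per vector respectively.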
Extra error codes
KVVEC_DIM_MISMATCH | 20 | Opened with wrong dim for existing store |
KVVEC_DTYPE_MISMATCH | 21 | Opened with wrong dtype for existing store |
KVVEC_SPACE_MISMATCH | 22 | Opened with wrong space for existing store |
KVVEC_INDEX_DROPPED | 23 | Index has been dropped; search unavailable |
KVVEC_INDEX_EMPTY | 24 | Index is empty; put a vector first |
KVVEC_BAD_VECTOR | 25 | Vector shape or value is invalid |
Open / Close
kvstore_vec_open
Open or create a vector store. dim, space, and dtype are immutable after the first open — passing different values on a subsequent open returns the corresponding mismatch error. Pass pPassword/nPassword for an encrypted store; sidecar persistence is disabled for encrypted stores. *ppVS is NULL on any error.
KVVecStore *vs = NULL;
int rc = kvstore_vec_open(
    "store.db", 128, KVVEC_SPACE_COSINE,
    0, 0, 0,            /* defaults */
    KVVEC_DTYPE_F32,
    NULL, 0,            /* plain (unencrypted) */
    &vs
);
if (rc == KVVEC_DIM_MISMATCH) puts("wrong dim for existing store");
kvstore_vec_close
Close the store and free all resources. For unencrypted file-backed stores, saves the HNSW graph to {path}.usearch and the next-id stamp to {path}.usearch.nid so the next open can skip the O(n·dim) rebuild. Safe to call with NULL.
Writes
kvstore_vec_put
Insert or update a key/value pair and add its vector to the HNSW index. The KVStore write (all 5 internal CFs) is one atomic transaction; the usearch index update happens after commit. If the key already exists, the old usearch label is removed before the new one is added (count stays constant on overwrite).
float vec[128] = { /* ... */ };
kvstore_vec_put(vs, "doc:1", 5, "hello", 5, vec,
                0,                       /* no TTL */
                "{\"tag\":\"ai\"}", 12); /* metadata */
kvstore_vec_put_batch
Insert or update nItems records in one atomic transaction. Last-write-wins for duplicate keys within the batch. All usearch insertions happen after the commit.
KVVecItem items[3] = {
{ "k1", 2, "val1", 4, vec1, NULL, 0 },
{ "k2", 2, "val2", 4, vec2, NULL, 0 },
{ "k3", 2, "val3", 4, vec3, meta, nMeta },
};
kvstore_vec_put_batch(vs, items, 3, 0);
kvstore_vec_kv_put
Plain KV write — does not update the vector index. Useful for storing non-vector data (config, counters) alongside vector keys in the same file.
Reads
kvstore_vec_get
Fetch value bytes. TTL-aware: returns KVSTORE_NOTFOUND and lazily deletes the key if it has expired. Caller must snkv_free(*ppVal).
kvstore_vec_get_vector
Fetch the stored float32 vector. *ppVec is a flat array of dim floats. Caller must snkv_free(*ppVec).
kvstore_vec_get_metadata
Fetch JSON metadata bytes. Returns KVSTORE_OK with *ppMeta=NULL when no metadata has ever been written to this store (tags CF does not yet exist). Returns KVSTORE_NOTFOUND with *ppMeta=NULL when the tags CF exists but this specific key has no metadata. Both cases are not errors. Caller must snkv_free(*ppMeta) on KVSTORE_OK with non-NULL result.
kvstore_vec_contains
Returns 1 if the key exists and is not expired, 0 otherwise.
kvstore_vec_count
Returns the number of active (non-deleted) vectors in the HNSW index (usearch_size).
Search
kvstore_vec_search
Approximate nearest-neighbour search. Expired keys are filtered out transparently. Results are sorted ascending by distance. Caller must pass the array to kvstore_vec_free_results. Returns KVVEC_INDEX_EMPTY if no vectors have been inserted, KVVEC_INDEX_DROPPED after kvstore_vec_drop_index.
float q[128] = { /* ... */ };
KVVecSearchResult *res = NULL;
int n = 0;

/* Basic ANN search */
kvstore_vec_search(vs, q, 5, 0, 0, 0.0f, &res, &n);
kvstore_vec_free_results(res, n); /* free before reusing res */
res = NULL;

/* With exact rerank (oversample×top_k candidates → exact distance sort) */
kvstore_vec_search(vs, q, 5, 1, 3, 0.0f, &res, &n);
for (int i = 0; i < n; i++)
    printf("%.*s dist=%.4f val=%.*s\n",
           res[i].nKey, (char*)res[i].pKey, res[i].distance,
           res[i].nValue, (char*)res[i].pValue);
kvstore_vec_free_results(res, n);
kvstore_vec_search_keys
ANN search returning (key, distance) pairs only — no value fetch. Cheaper than kvstore_vec_search when values are not needed. Caller must pass the array to kvstore_vec_free_key_results.
kvstore_vec_free_results / kvstore_vec_free_key_results
Free the result arrays returned by the search functions. Safe to call with NULL.
Delete
kvstore_vec_delete
Delete the key from the KV store, all 5 vector CFs, and the HNSW index — all in one atomic transaction. Returns KVSTORE_NOTFOUND if the key does not exist.
TTL
Pass a non-zero expire_ms to kvstore_vec_put (absolute milliseconds since the Unix epoch — use kvstore_now_ms() + delta). Expired vectors are filtered out transparently during search and lazily deleted on get. Use kvstore_vec_purge_expired for bulk cleanup.
/* Expire in 60 seconds */
int64_t exp = kvstore_now_ms() + 60000;
kvstore_vec_put(vs, "sess:abc", 8, token, nToken, vec, exp, NULL, 0);

/* Bulk-delete all expired vectors */
int nPurged = 0;
kvstore_vec_purge_expired(vs, &nPurged);
Stats
kvstore_vec_stats
Fill *pStats with current index configuration and runtime state. Always returns KVSTORE_OK.
dim | Vector dimension |
space | KVVEC_SPACE_* |
dtype | KVVEC_DTYPE_* |
connectivity | HNSW M parameter |
expansion_add | ef_construction |
expansion_search | ef at query time |
count | Active vectors in HNSW index (usearch_size) |
capacity | Allocated HNSW capacity |
fill_ratio | count / capacity — 0–1 ratio (multiply by 100 for %). Auto-doubles at 0.9. |
vec_cf_count | Entries in _snkv_vec_ CF (may include expired) |
has_metadata | 1 if _snkv_vec_tags_ CF exists |
sidecar_enabled | 1 for unencrypted file-backed stores |
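The fill_ratio field and the auto-doubling note can be illustrated with a short calculation. The Python sketch below is a model, not the library's growth code: the helper names are hypothetical, and treating the threshold as "ratio greater than or equal to 0.9" is an assumption.

```python
def fill_ratio(count: int, capacity: int) -> float:
    """count / capacity as a 0 to 1 ratio; an empty index reports 0.0."""
    return count / capacity if capacity else 0.0


def capacity_after_growth(count: int, capacity: int, threshold: float = 0.9) -> int:
    """Double capacity until the fill ratio drops below the threshold (assumed rule)."""
    while capacity and count / capacity >= threshold:
        capacity *= 2
    return capacity


print(fill_ratio(450, 1000))             # 0.45
print(capacity_after_growth(900, 1000))  # 2000
print(capacity_after_growth(899, 1000))  # 1000
```

Monitoring fill_ratio this way shows when the next capacity doubling, and its memory cost, is likely to occur.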
Maintenance
kvstore_vec_purge_expired
Scan the _snkv_vec_ CF and delete all expired vectors from every CF and the HNSW index in one atomic transaction. *pnDeleted (may be NULL) is set to the number removed. Use this in preference to kvstore_purge_expired to keep vector CFs in sync. Returns KVSTORE_OK (0 deleted) when there are no TTL entries.
kvstore_vec_drop_index
Drop all five _snkv_vec*_ column families and free the in-memory HNSW index. KV data in the default CF is preserved and readable via kvstore_vec_get. After this call, kvstore_vec_search returns KVVEC_INDEX_DROPPED. Sidecar files are removed.
kvstore_vec_drop_index(vs); /* index gone, KV intact */
void *v = NULL;
int n = 0;
kvstore_vec_get(vs, "doc:1", 5, &v, &n); /* still works */
snkv_free(v);
Build
The vector layer requires g++ (C++17) to compile the usearch HNSW core. The default make / make test targets use pure gcc and are unaffected.
# Build libsnkv_vec.a (core + usearch objects)
make vector

# Build and run the comprehensive example
make vector-examples
make run-vector-examples

# Build and run the test suite (150 tests)
make test-vector
Internal storage (5 Column Families)
_snkv_vec_ | key → float32[dim] — stored vector |
_snkv_vec_idk_ | key → uint64 label (8B big-endian) — maps key to HNSW label |
_snkv_vec_idi_ | uint64 label → key — reverse lookup used during search |
_snkv_vec_meta_ | config keys: ndim, metric, connectivity, expansion_add, expansion_search, dtype, next_id |
_snkv_vec_tags_ | key → JSON bytes — created lazily on first metadata write |
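The 8-byte big-endian label encoding used by the idk and idi column families can be reproduced with the standard struct module. The Python sketch below shows the byte layout only; `encode_label` and `decode_label` are hypothetical names, not SNKV functions.

```python
import struct


def encode_label(label: int) -> bytes:
    """Pack an unsigned 64-bit label as 8 big-endian bytes."""
    return struct.pack(">Q", label)


def decode_label(raw: bytes) -> int:
    """Unpack 8 big-endian bytes back into the integer label."""
    return struct.unpack(">Q", raw)[0]


labels = [1, 255, 256, 70000]
encoded = [encode_label(n) for n in labels]
print(encoded[0].hex())            # 0000000000000001
print(sorted(encoded) == encoded)  # True: byte order matches numeric order
```

Big-endian packing keeps byte-wise key order identical to numeric order, which suits a B-tree keyed on raw bytes.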
Sidecar files
{path}.usearch | usearch HNSW binary graph — saved on close, loaded on open |
{path}.usearch.nid | 8-byte big-endian next-id stamp — used to detect stale sidecars |
Sidecar files are not written for encrypted stores or in-memory stores (zPath=NULL). If the stamp in .usearch.nid does not match the stored next_id, the sidecar is discarded and the index is rebuilt from the CFs.
Installation
# Install from PyPI
pip install snkv

# Or build from source
cd python
python3 setup.py build_ext --inplace
Quick Start
from snkv import KVStore

with KVStore("mydb.db") as db:
    db["hello"] = "world"
    print(db["hello"].decode())  # world
    print(db.get("missing"))     # None
    for key, value in db:
        print(key, value)        # b'hello' b'world'
Constants
Journal Mode
| Constant | Description |
|---|---|
JOURNAL_WAL | Write-Ahead Logging — concurrent readers (default). |
JOURNAL_DELETE | Rollback journal mode. |
Sync Level
| Constant | Description |
|---|---|
SYNC_OFF | No fsync — fastest, not crash-safe. |
SYNC_NORMAL | Survives process crash (default). |
SYNC_FULL | fsync on every commit — strongest durability. |
Checkpoint Mode
| Constant | Description |
|---|---|
CHECKPOINT_PASSIVE | Non-blocking copy (default). |
CHECKPOINT_FULL | Wait for all readers, then copy all frames. |
CHECKPOINT_RESTART | Like FULL, then reset WAL write position. |
CHECKPOINT_TRUNCATE | Like RESTART, then truncate WAL to zero bytes. |
TTL
| Constant | Value | Description |
|---|---|---|
NO_TTL | -1 | Returned by ttl() when key has no expiry. |
Exceptions
snkv.Error                  ← base class for all SNKV errors
├── snkv.NotFoundError      (also subclass of KeyError)
├── snkv.BusyError
├── snkv.LockedError
├── snkv.ReadOnlyError
├── snkv.CorruptError
└── snkv.AuthError
| Exception | Raised when |
|---|---|
Error | Generic SNKV error (base class). |
NotFoundError | Key does not exist; also a KeyError. |
BusyError | Database locked and busy_timeout expired. |
LockedError | Write lock conflict with a concurrent transaction. |
ReadOnlyError | Write attempted on a read-only store. |
CorruptError | Integrity check detected corruption. |
AuthError | Wrong password supplied to an encrypted store, or encrypted store opened without a password via KVStore(). |
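The practical effect of NotFoundError also being a KeyError is that existing dict-style error handling keeps working. The sketch below uses stand-in classes that mirror the tree above rather than importing snkv, so the class bodies and the `lookup` helper are illustrative assumptions.

```python
class Error(Exception):
    """Stand-in for snkv.Error, the base class."""


class NotFoundError(Error, KeyError):
    """Stand-in for snkv.NotFoundError: both an Error and a KeyError."""


class BusyError(Error):
    """Stand-in for snkv.BusyError."""


def lookup(table, key):
    """Raise NotFoundError on a miss, the way db[key] is documented to behave."""
    try:
        return table[key]
    except KeyError:
        raise NotFoundError(key) from None


try:
    lookup({}, "missing")
except KeyError as exc:          # a plain KeyError handler still catches it
    print(type(exc).__name__)    # NotFoundError
```

Catching the base Error instead covers every SNKV failure, while catching KeyError covers only missing keys.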
KVStore
The main entry point. Opens or creates a key-value store at a given path. Always use as a context manager to ensure proper cleanup. Raises AuthError if the file is an encrypted store — use KVStore.open_encrypted() instead.
Opening a Store — KVStore(path, *, **config)
# Minimal
db = KVStore("mydb.db")

# In-memory
db = KVStore()  # or KVStore(None)

# Fully configured
db = KVStore(
    "mydb.db",
    journal_mode=JOURNAL_WAL,
    sync_level=SYNC_NORMAL,
    cache_size=4000,
    busy_timeout=5000,
    wal_size_limit=200,
)

# Context manager (recommended)
with KVStore("mydb.db") as db:
    db["key"] = "value"
Core Operations
put(key, value, ttl=None) → None
Insert or overwrite a key-value pair. ttl is seconds until expiry (int or float); None means no expiry. Both key and value accept str, bytes, bytearray, or memoryview.
db.put("user:1", b"\x01\x02\x03")
db.put("session", "tok", ttl=3600)  # expires in 1 hour
db.put("cache", data, ttl=0.5)      # half a second
get(key, default=None) → bytes | None
Return the stored value as bytes, or default if the key does not exist or has expired. Never raises NotFoundError — use db[key] for that.
val = db.get("user:1")        # bytes or None
val = db.get("missing", b"")  # b'' if not found
delete(key) → None
Delete a key. Raises NotFoundError (KeyError) if the key does not exist.
exists(key) → bool
Return True if the key exists, without fetching the value.
TTL / Key Expiry
ttl(key) → float | None
| Return value | Meaning |
|---|---|
| Positive float | Seconds remaining. |
0.0 | Key just expired (lazy delete performed). |
None | Key exists but has no expiry. |
raises NotFoundError | Key does not exist. |
db.put("session", "tok", ttl=3600)
print(db.ttl("session"))  # e.g. 3599.97
print(db.ttl("perm"))     # None (no TTL)
purge_expired() → int
Scan the TTL index and delete all expired keys in a single transaction. Returns the count of keys deleted. O(expired keys).
n = db.purge_expired()
print(f"Removed {n} expired keys")
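The lazy-expiry behaviour described above can be modelled in a few lines. The sketch below is a toy in-memory stand-in, not snkv's implementation: the class name `LazyTTLModel`, the injectable `clock`, and the use of `KeyError` in place of `NotFoundError` are all assumptions made so the example runs on its own.

```python
import time


class LazyTTLModel:
    """Toy model of lazy TTL expiry (hypothetical; not snkv's code)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be simulated
        self._data = {}       # key -> value
        self._expiry = {}     # key -> absolute deadline (the "TTL index")

    def put(self, key, value, ttl=None):
        self._data[key] = value
        if ttl is None:
            self._expiry.pop(key, None)  # assumption: overwrite drops old TTL
        else:
            self._expiry[key] = self._clock() + ttl

    def _expired(self, key):
        deadline = self._expiry.get(key)
        return deadline is not None and self._clock() >= deadline

    def _drop(self, key):
        self._data.pop(key, None)
        self._expiry.pop(key, None)

    def get(self, key, default=None):
        if self._expired(key):
            self._drop(key)              # lazy delete on access
        return self._data.get(key, default)

    def ttl(self, key):
        if key not in self._data:
            raise KeyError(key)          # stands in for NotFoundError
        deadline = self._expiry.get(key)
        if deadline is None:
            return None                  # key exists, no expiry
        remaining = deadline - self._clock()
        if remaining <= 0:
            self._drop(key)              # lazy delete, then report 0.0
            return 0.0
        return remaining

    def purge_expired(self):
        expired = [k for k in self._expiry if self._expired(k)]
        for k in expired:
            self._drop(k)
        return len(expired)

    def count(self):
        # Expired-but-not-yet-purged keys are still counted.
        return len(self._data)
```

With a fake clock, an expired key stays in `count()` until it is either touched (lazy delete) or swept by `purge_expired()`, which mirrors the return-value table for `ttl()`.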
Conditional Insert / Bulk Clear / Key Count
put_if_absent(key, value, ttl=None) → bool
Atomically insert key only when it is absent (or has expired). Returns True if inserted, False if already present. Safe for distributed locks and deduplication.
# Distributed lock with auto-release
inserted = db.put_if_absent(b"lock:job-1", b"worker-A", ttl=30)
if inserted:
    run_job()  # only one worker reaches here
# Deduplication
if db.put_if_absent(b"msg:001", b"payload"):
    process()  # first write wins
clear() → None
Remove every key-value pair from the default column family in a single atomic write transaction. TTL index entries are cleared atomically. Runs in O(pages) — much faster than iterating and deleting individually. Close all iterators before calling.
db.clear()                  # truncate default CF; all keys gone
db.put(b"fresh", b"start")  # normal inserts work immediately after
count() → int
Return the number of entries in the default column family. Reads page-level nCell headers via sqlite3BtreeCount — O(pages), not O(rows). Expired-but-not-yet-purged keys are included; call purge_expired() first for an exact live count.
n = db.count()      # e.g. 10000
db.purge_expired()
n = db.count()      # exact live count
Dict-like Interface
| Syntax | Equivalent | Notes |
|---|---|---|
| db["key"] | get | Raises NotFoundError on miss/expiry. |
| db["key"] = "val" | put | No TTL. |
| db["key", ttl] = "val" | put(ttl=...) | TTL in seconds. |
| del db["key"] | delete | Raises NotFoundError on miss. |
| "key" in db | exists | |
| for k, v in db | iterator() | Yields (bytes, bytes). |
db["session:abc"] = "active"
db["token:xyz", 60] = "bearer-abc"  # expires in 60 s
print(db["session:abc"])            # b'active'
del db["session:abc"]
Transactions
begin(write=False) / commit() / rollback()
ACID transactions for batching operations atomically. Without an explicit transaction, each individual put/delete auto-commits.
db.begin(write=True)
try:
    db["a"] = "1"
    db["b"] = "2"
    db.commit()
except Exception:
    db.rollback()
    raise
Column Families
with db.create_column_family("users") as users:
    users["alice"] = b"admin"
with db.open_column_family("users") as users:
    print(users["alice"])  # b'admin'
names = db.list_column_families()  # ["users", ...]
db.drop_column_family("users")
Iterators
# Forward scan
for key, value in db.iterator():
    print(key.decode(), value.decode())
# Reverse scan
for key, value in db.iterator(reverse=True):
    print(key.decode(), value.decode())
# Prefix, ascending
for key, value in db.iterator(prefix="user:"):
    print(key, value)
# Prefix, descending
for key, value in db.iterator(prefix="user:", reverse=True):
    print(key, value)
Maintenance
sync() / vacuum(n_pages=0) / integrity_check() / checkpoint(mode) / stats() / stats_reset()
stats() returns a dict with these keys:
| Key | Description |
|---|---|
| puts | Total put operations (includes TTL puts and put_if_absent). |
| gets | Total get operations (includes TTL gets). |
| deletes | Total delete operations. |
| iterations | Total iterators created (each iterator(), prefix_iterator(), etc. call counts as one). |
| errors | Total non-NOTFOUND error returns. |
| bytes_read | Total value bytes returned by get operations. |
| bytes_written | Total key+value bytes written by put operations. |
| wal_commits | Total WAL write transaction commits. |
| checkpoints | Total WAL checkpoint operations. |
| ttl_expired | Total keys lazily deleted on access due to TTL expiry. |
| ttl_purged | Total keys deleted by purge_expired(). |
| db_pages | Current live database page count (always up-to-date; not reset by stats_reset()). |
import snkv
from snkv import CHECKPOINT_TRUNCATE, CorruptError
# Vacuum after bulk deletes
db.vacuum()
# Checkpoint and reclaim WAL space
nlog, nckpt = db.checkpoint(CHECKPOINT_TRUNCATE)
print(f"WAL: {nlog} frames, {nckpt} checkpointed")
try:
    db.integrity_check()
except CorruptError as e:
    print(f"Corruption: {e}")
st = db.stats()
print(st["puts"], st["gets"], st["bytes_written"], st["db_pages"])
db.stats_reset()  # counters zeroed; db_pages still live
Lifecycle
close() / context manager
Close the store and release all resources. All column family and iterator handles must be closed first. Always prefer the context manager form.
with KVStore("mydb.db") as db:
    ...
# db.close() called automatically
ColumnFamily
A logical namespace within a KVStore. Obtained via db.create_column_family(), db.open_column_family(), or db.default_column_family(). Always use as a context manager.
Core Operations
with db.open_column_family("users") as cf:
    cf.put("alice", b"admin")
    cf.put("token", b"xyz", ttl=300)  # 5 minutes
    role = cf.get("alice")            # b'admin'
    cf.delete("alice")
CF TTL
remaining = cf.ttl("token")  # e.g. 284.3
n = cf.purge_expired()
print(f"Cleaned {n} entries")
CF Conditional Insert / Bulk Clear / Key Count
with db.open_column_family("dedup") as cf:
    # First write wins: safe deduplication
    if cf.put_if_absent(b"msg:001", b"hello"):
        process()
    print(cf.count())  # entry count for this CF only
    cf.clear()         # other CFs unaffected
    print(cf.count())  # 0
CF Dict Interface
| Syntax | Notes |
|---|---|
| cf["key"] | Raises NotFoundError on miss/expiry. |
| cf["key"] = "val" | No TTL. |
| cf["key", ttl] = "val" | TTL in seconds. |
| del cf["key"] | Raises NotFoundError on miss. |
| "key" in cf | Existence check. |
| for k, v in cf | Forward iteration over all keys. |
cf["alice"] = "admin"
cf["session:u42", 300] = "logged-in"
print(cf["alice"])  # b'admin'
del cf["alice"]
CF Iterators
for key, value in cf.iterator(prefix="user:", reverse=True):
    print(key, value)
Iterator
Ordered cursor over key-value pairs. Forward iterators yield keys ascending; reverse iterators yield keys descending. Each iteration step yields a (bytes, bytes) tuple.
Python Iterator Protocol
The most common usage — iterate with a for loop or context manager:
# Simple forward
for key, value in db.iterator():
    print(key.decode(), "->", value.decode())
# Reverse with context manager (auto-closes)
with db.iterator(reverse=True) as it:
    for key, value in it:
        print(key, value)
Manual Control
# Manual forward
it = db.iterator()
it.first()
while not it.eof:
    print(it.key.decode(), it.value.decode())
    it.next()
it.close()
# Manual reverse
it = db.reverse_iterator()
it.last()
while not it.eof:
    k, v = it.item()
    print(k, v)
    it.prev()
it.close()
# seek: jump to any position in O(log N)
with db.iterator() as it:
    it.seek(b"user:carol")  # forward: first key >= target
    while not it.eof:
        print(it.key)
        it.next()
# seek on a prefix iterator: boundary still enforced
with db.iterator(prefix=b"user:") as it:
    it.seek(b"user:m")  # skips to "user:m..." within prefix
    while not it.eof:
        print(it.key)
        it.next()
# seek returns self, so calls can be chained
key = db.iterator().seek(b"target").key
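The forward `seek` semantics ("first key >= target", with a prefix boundary still enforced) can be illustrated on a plain sorted list. This is a model only: `seek_forward` is a hypothetical function, byte-wise lexicographic key order is assumed, and clamping a target that sorts before the prefix to the start of the prefix range is also an assumption.

```python
from bisect import bisect_left


def seek_forward(sorted_keys, target, prefix=b""):
    """Return the keys a forward iterator would yield after seek(target)."""
    # O(log N) jump to the first key >= target (never before the prefix range).
    start = bisect_left(sorted_keys, max(target, prefix))
    out = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # left the prefix range: the iterator would report eof
        out.append(key)
    return out


keys = sorted([b"admin:root", b"user:alice", b"user:carol",
               b"user:mallory", b"zone:1"])
print(seek_forward(keys, b"user:carol"))
print(seek_forward(keys, b"user:m", prefix=b"user:"))
```

The first call runs on past the `user:` keys because no prefix is set; the second stops as soon as a key falls outside the prefix.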
Type Notes
| Input type | Behaviour |
|---|---|
| str | Encoded to UTF-8 bytes automatically. |
| bytes | Passed through as-is. |
| bytearray | Converted to bytes. |
| memoryview | Converted to bytes. |
Return values are always bytes.
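The coercion rules in the table can be written as a single normalising function. This is a sketch of the rules as documented, not the binding's actual code: `to_bytes` is a hypothetical name, and raising `TypeError` for any other input type is an assumption.

```python
def to_bytes(obj):
    """Normalise a key or value following the type table above."""
    if isinstance(obj, str):
        return obj.encode("utf-8")   # str is encoded to UTF-8
    if isinstance(obj, bytes):
        return obj                   # bytes pass through unchanged
    if isinstance(obj, (bytearray, memoryview)):
        return bytes(obj)            # copied into an immutable bytes object
    raise TypeError(f"unsupported type: {type(obj).__name__}")


print(to_bytes("user:1"))
print(to_bytes(bytearray(b"\x01\x02")))
```

Because every input ends up as bytes, a value written as `"tok"` is read back as `b"tok"`.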
KVStore across threads. Column family and iterator handles are not independently thread-safe.
Encryption
Transparent per-value encryption using XChaCha20-Poly1305 with Argon2id key derivation. All existing methods — put, get, delete, iterators, TTL, column families — work without modification on an encrypted store.
KVStore.open_encrypted / is_encrypted / reencrypt / remove_encryption
open_encrypted is a class method that opens or creates an encrypted store. password may be bytes or str (UTF-8 encoded). If the file is a plain store with existing data, all values are encrypted in-place and the store becomes encrypted — existing data is preserved. Raises AuthError only on a wrong password for an already-encrypted store.
from snkv import KVStore, AuthError
# Create / open encrypted store
with KVStore.open_encrypted("mydb.db", b"hunter2") as db:
    db[b"secret"] = b"classified"
    print(db.is_encrypted())  # True
    print(db[b"secret"])      # b"classified" (transparent decrypt)
# Wrong password raises AuthError
try:
    KVStore.open_encrypted("mydb.db", b"wrong")
except AuthError:
    print("bad password")
# Change password in-place (re-encrypts all values atomically)
with KVStore.open_encrypted("mydb.db", b"hunter2") as db:
    db.reencrypt(b"new-strong-pass")
# Remove encryption: store becomes plain
with KVStore.open_encrypted("mydb.db", b"new-strong-pass") as db:
    db.remove_encryption()
with KVStore("mydb.db") as db:  # plain open works now
    print(db[b"secret"])
# open_encrypted on a plain store encrypts it in-place
with KVStore.open_encrypted("mydb.db", b"fresh-pass") as db:
    print(db.is_encrypted())  # True: store is now encrypted
    print(db[b"secret"])      # b"classified" (data preserved)
snkv.Error raised when a wrong password is supplied to an already-encrypted store. Calling open_encrypted on a plain store encrypts it in-place instead of raising AuthError.
VectorStore
SNKV key-value store with an integrated usearch HNSW vector index. All vector data lives inside the same .db file using dedicated column families.
Index persistence: For unencrypted, file-backed stores the in-memory index is saved to {path}.usearch on close() and reloaded on the next open — skipping the O(n×d) rebuild. If the sidecar is absent, stale, or corrupt it is silently discarded and the index is rebuilt from the column families. Encrypted stores and in-memory stores always rebuild from column families (a plaintext sidecar would defeat encryption).
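The persistence rule reduces to a small decision: a sidecar is used only for unencrypted, file-backed stores. The function below is an illustrative sketch; `sidecar_path` is a hypothetical name and it does not cover the stale or corrupt sidecar checks.

```python
def sidecar_path(path, password=None):
    """Return the sidecar path, or None when the index is always rebuilt."""
    if path is None:
        return None          # in-memory store: nothing to persist
    if password is not None:
        return None          # encrypted store: no plaintext sidecar
    return f"{path}.usearch"


print(sidecar_path("store.db"))
print(sidecar_path("store.db", password=b"secret"))
print(sidecar_path(None))
```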
Requires the vector extra: pip install snkv[vector]
Installation
# installs snkv + usearch>=2.9 + numpy>=1.21
pip install snkv[vector]
Opening a VectorStore
VectorStore(path, dim, *, space, connectivity, expansion_add, expansion_search, dtype, password, **kv_kwargs)
| Parameter | Default | Description |
|---|---|---|
| path | (required) | Path to .db file. None for in-memory. |
| dim | (required) | Vector dimension. Fixed for the lifetime of the store. |
| space | "l2" | Distance metric: "l2" (squared L2), "cosine", or "ip" (inner product). |
| connectivity | 16 | HNSW M parameter. Must be ≥ 1. |
| expansion_add | 128 | HNSW expansion during index build. Must be ≥ 1. |
| expansion_search | None | None restores the stored value (falls back to 64 for new stores). Explicit value overrides stored. |
| dtype | "f32" | In-memory HNSW index precision: "f32" (default), "f16" (half RAM), or "i8" (quarter RAM). On-disk storage in _snkv_vec_ is always float32 regardless of dtype. Lower precision reduces memory at a small recall cost; see Quantization below. |
| password | None | bytes or str. Opens/creates an encrypted store. Disables the sidecar: the index is always rebuilt from encrypted CFs. |
from snkv.vector import VectorStore
import numpy as np
# Plain store
with VectorStore("plain.db", dim=128, space="cosine") as vs:
    vs.vector_put(b"doc:1", b"hello world", np.random.rand(128).astype("f4"))
# Encrypted store (separate file)
with VectorStore("encrypted.db", dim=128, password=b"secret") as vs:
    vs.vector_put(b"doc:1", b"hello world", np.random.rand(128).astype("f4"))
# Quantized store (separate file, since dim and dtype are fixed per store):
# half the RAM for the in-memory HNSW index
with VectorStore("quantized.db", dim=768, space="cosine", dtype="f16") as vs:
    vs.vector_put(b"doc:1", b"hello world", np.random.rand(768).astype("f4"))
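For intuition about the three `space` options, here is a pure-Python sketch of the distances on small lists. The squared-L2 form follows the parameter table; writing cosine and inner-product distance as one minus the similarity is an assumption about how the underlying index reports them, so treat those two as illustrative.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean distance (no square root taken)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_distance(a, b):
    """Assumed form: 1 - cosine similarity. Undefined for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def inner_product_distance(a, b):
    """Assumed form: 1 - dot product, so larger dot products rank closer."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


print(squared_l2([0.0, 0.0], [3.0, 4.0]))
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
```

In all three cases a smaller number means a closer match, which is why search results are sorted by distance ascending.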
Quantization: dtype controls only the in-memory HNSW graph precision — on-disk vectors in _snkv_vec_ are always float32. Memory per vector in the index: f32 = dim × 4 bytes, f16 = dim × 2 bytes, i8 = dim × 1 byte. For 1 M vectors at dim=768: f32 ≈ 3 GB, f16 ≈ 1.5 GB, i8 ≈ 768 MB. dtype is immutable — validated against the stored value on reopen.
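The memory figures above follow from simple arithmetic, sketched here. `index_vector_bytes` is a hypothetical helper; it counts only the raw vector components and ignores HNSW graph links and allocator overhead, so real usage will be somewhat higher.

```python
BYTES_PER_COMPONENT = {"f32": 4, "f16": 2, "i8": 1}


def index_vector_bytes(n_vectors, dim, dtype="f32"):
    """Raw bytes held for vector components in the in-memory index."""
    return n_vectors * dim * BYTES_PER_COMPONENT[dtype]


for dtype in ("f32", "f16", "i8"):
    gb = index_vector_bytes(1_000_000, 768, dtype) / 1e9
    print(f"{dtype}: {gb:.3f} GB")
```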
Writing Vectors
vector_put(key, value, vector, *, ttl=None, metadata=None) → None
Store a key/value pair and add its vector to the HNSW index atomically. The SNKV write (all column families) is one transaction; the in-memory usearch update happens after commit. On overwrite the old usearch entry is removed and replaced.
| Parameter | Description |
|---|---|
| key | bytes or str |
| value | bytes or str: arbitrary payload stored in the default CF |
| vector | array-like, shape (dim,); stored as float32 on disk |
| ttl | Finite positive float: seconds until expiry. None = no expiry. |
| metadata | dict or None. None = preserve existing. {} = clear. {...} = write new. Must be JSON-serializable. |
vs.vector_put(b"doc:1", b"hello", np.random.rand(128).astype("f4"))
vs.vector_put(
    b"doc:2", b"world", np.random.rand(128).astype("f4"),
    ttl=3600, metadata={"category": "news", "score": 0.9},
)
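The three-way `metadata` rule (None preserves, `{}` clears, a non-empty dict replaces) is easy to get wrong on overwrite, so here is a small model of it. `resolve_metadata` is a hypothetical function, and representing "cleared" as `None` is an assumption based on `get_metadata` returning `None` for keys without metadata.

```python
import json


def resolve_metadata(existing, new):
    """Return the metadata that would be stored after an overwrite."""
    if new is None:
        return existing      # None: keep whatever was stored before
    if not new:
        return None          # {}: clear the stored metadata
    json.dumps(new)          # raises TypeError if not JSON-serializable
    return dict(new)         # {...}: replace with the new dict


old = {"category": "news"}
print(resolve_metadata(old, None))
print(resolve_metadata(old, {}))
print(resolve_metadata(old, {"category": "blog"}))
```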
vector_put_batch(items, *, ttl=None) → None
Insert or update multiple items in a single SNKV transaction with one bulk index.add call. 5–20× faster than N individual vector_put calls for bulk loads. Duplicate keys in the batch are deduplicated — last occurrence wins. If any item has an invalid vector shape the entire batch rolls back.
| Parameter | Description |
|---|---|
| items | Iterable of (key, value, vector) or (key, value, vector, metadata) |
| ttl | Finite positive float: uniform expiry applied to all items. |
vs.vector_put_batch([
    (b"a", b"val-a", np.random.rand(128).astype("f4")),
    (b"b", b"val-b", np.random.rand(128).astype("f4"), {"tag": "x"}),
])
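The "last occurrence wins" rule for duplicate keys can be pictured as a dict keyed by the item's key. This is only a model of the documented outcome: `dedupe_last_wins` is a hypothetical name, and the order of the surviving items is not specified by the text above.

```python
def dedupe_last_wins(items):
    """Collapse duplicate keys so the last tuple for each key survives."""
    latest = {}
    for item in items:
        latest[item[0]] = item   # a later item overwrites an earlier one
    return list(latest.values())


batch = [
    (b"a", b"old", [0.0, 0.0]),
    (b"b", b"val-b", [1.0, 0.0]),
    (b"a", b"new", [0.5, 0.5], {"tag": "x"}),
]
print(dedupe_last_wins(batch))
```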
Reading Vectors & Values
vector_get(key) → np.ndarray
Return the stored vector as np.ndarray(dim,) float32. Raises NotFoundError if no vector is stored for the key. Raises VectorIndexError if the index has been dropped.
get(key, default=None) → bytes | None
Return value bytes from the default CF, or default if missing or expired. Identical to KVStore.get.
get_metadata(key) → dict | None
Return the stored metadata dict, or None if the key has no metadata or the tags CF has never been created. Returns None silently on corrupted JSON.
vec = vs.vector_get(b"doc:1")     # np.ndarray(128,)
val = vs.get(b"doc:1")            # b"hello"
meta = vs.get_metadata(b"doc:2")  # {"category": "news", "score": 0.9}
ANN Search
search(query, top_k=10, *, filter=None, rerank=False, oversample=3, max_distance=None) → List[SearchResult]
Approximate nearest-neighbour search. Returns up to top_k results sorted by distance ascending. Expired keys are skipped transparently.
| Parameter | Description |
|---|---|
| query | array-like, shape (dim,) |
| top_k | Maximum results to return (must be > 0). |
| filter | dict: equality filter (all fields must match); callable: predicate (metadata_dict) → bool; None: no filter. |
| rerank | bool: re-score ANN candidates with exact float32 distances. |
| oversample | int ≥ 1: candidate pool = top_k × oversample when filter or rerank is active (default 3). |
| max_distance | float or None: drop results with distance > threshold, applied last after rerank. |
SearchResult fields: key: bytes, value: bytes, distance: float, metadata: dict | None.
Important: result.metadata is None unless a filter argument is provided. This is intentional — metadata loading is skipped when no filtering is needed to avoid the per-candidate CF read cost. To read metadata without filtering, use get_metadata(key) after the search.
q = np.random.rand(128).astype("f4")
# Basic ANN search: result.metadata is None here
results = vs.search(q, top_k=5)
for r in results:
    print(r.key, r.distance, r.value)
    # r.metadata is None; use get_metadata() if you need it
    meta = vs.get_metadata(r.key)  # dict or None
# With metadata filter: result.metadata IS populated
results = vs.search(q, top_k=5, filter={"category": "news"})
for r in results:
    print(r.metadata)  # {"category": "news", ...}
# Exact rerank + distance cutoff
results = vs.search(q, top_k=5, rerank=True, max_distance=0.5)
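How `oversample`, `filter`, and `max_distance` interact can be sketched as a post-processing step over ANN candidates. This is a simplified model, not the library's code: `pool_size`, `matches`, and `finalize` are hypothetical names, candidates are plain `(key, distance, metadata)` tuples, and treating missing metadata as an empty dict is an assumption.

```python
def pool_size(top_k, oversample=3, *, filter=None, rerank=False):
    """Number of ANN candidates to request before post-processing."""
    if top_k <= 0 or oversample < 1:
        raise ValueError("top_k must be > 0 and oversample >= 1")
    active = filter is not None or rerank
    return top_k * oversample if active else top_k


def matches(metadata, filter):
    """Equality filter for dicts, predicate call for callables."""
    if filter is None:
        return True
    meta = metadata or {}
    if callable(filter):
        return bool(filter(meta))
    return all(k in meta and meta[k] == v for k, v in filter.items())


def finalize(candidates, top_k, *, filter=None, max_distance=None):
    """Filter, sort by distance ascending, truncate, then apply the cutoff."""
    kept = [c for c in candidates if matches(c[2], filter)]
    kept.sort(key=lambda c: c[1])
    kept = kept[:top_k]
    if max_distance is not None:
        kept = [c for c in kept if c[1] <= max_distance]  # applied last
    return kept


cands = [
    (b"a", 0.9, {"category": "news"}),
    (b"b", 0.1, {"category": "blog"}),
    (b"c", 0.3, {"category": "news"}),
]
print(pool_size(2, filter={"category": "news"}))
print(finalize(cands, 2, filter={"category": "news"}, max_distance=0.5))
```

Oversampling matters because a filter can discard most of the pool; with too small a pool, fewer than top_k results may survive.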
search_keys(query, top_k=10) → List[Tuple[bytes, float]]
ANN search returning (key, distance) pairs only — no value fetch, no metadata. Faster than search() when you only need keys.
pairs = vs.search_keys(q, top_k=10)
for key, dist in pairs:
    print(key, dist)
Deleting
delete(key) → None
Delete key from the KV store and HNSW index atomically. Raises NotFoundError if the key does not exist or has expired. If the vector index has been dropped, falls back to a plain KV delete.
vs.delete(b"doc:1")
TTL
Vectors stored with ttl= are lazily expired: expired keys are skipped transparently in search(), search_keys(), and get(). Call vector_purge_expired() to reclaim disk space and remove them from the in-memory index.
# Store with 1-hour TTL
vs.vector_put(b"session", b"data", vec, ttl=3600)
# Expired keys are silently skipped in search
results = vs.search(q, top_k=5)  # won't include b"session" after expiry
# Bulk-purge expired vectors (keeps in-memory index in sync)
n = vs.vector_purge_expired()
print(f"purged {n} vectors")
Stats & Info
vector_stats() → dict
Return full index configuration and runtime state.
| Key | Description |
|---|---|
| dim | Vector dimension |
| space | Distance metric |
| dtype | In-memory index precision |
| connectivity | HNSW M parameter |
| expansion_add | HNSW build expansion |
| expansion_search | HNSW search expansion |
| count | Active vectors in usearch index |
| capacity | Allocated usearch capacity |
| fill_ratio | count / capacity |
| vec_cf_count | Entries in _snkv_vec_ CF (may include expired) |
| has_metadata | True if metadata CF exists |
| sidecar_enabled | True for unencrypted file-backed stores; False for encrypted or in-memory |
vector_info() → dict
Subset of vector_stats(): dim, space, dtype, count, connectivity, expansion_add, expansion_search.
__len__() → int
Active vector count. Same as vector_stats()["count"].
print(len(vs))            # active vector count
print(vs.vector_stats())  # full dict
Maintenance
vector_purge_expired() → int
Scan all vector entries, delete those whose KV value has expired, remove them from the HNSW index, and return the count deleted. Always call this instead of KVStore.purge_expired() on a VectorStore to keep the vector column families in sync.
n = vs.vector_purge_expired()
print(f"removed {n} expired vectors")
drop_vector_index
drop_vector_index() → None
Drop all five internal vector column families (_snkv_vec_, _snkv_vec_idk_, _snkv_vec_idi_, _snkv_vec_meta_, _snkv_vec_tags_) permanently. KV data in the default CF is preserved and accessible via get(). The sidecar file ({path}.usearch) is also deleted. After this call, search() and vector_put() raise VectorIndexError.
vs.drop_vector_index()
val = vs.get(b"doc:1")  # still works; KV data intact
Dict Interface
Standard dict operations are forwarded to the plain KV layer. Reads and assignments do not interact with the vector index; del removes the key from both the KV store and the vector index.
| Syntax | Equivalent |
|---|---|
| vs[key] | get(key), raises NotFoundError if missing |
| vs[key] = value | Plain KV put; does not update the vector index |
| del vs[key] | delete(key); removes the key from both KV and vector index |
| key in vs | True if key exists in default CF |
| len(vs) | Active vector count |
Lifecycle
close() / context manager
Close all column family handles and the underlying KVStore. For unencrypted file-backed stores, saves the in-memory usearch index to {path}.usearch so the next open can skip the rebuild. Use as a context manager (with VectorStore(...) as vs:) for automatic cleanup.
VectorIndexError
Raised when the usearch index is not initialised or has been dropped. Subclass of Exception. Import from snkv.vector.
import numpy as np
from snkv.vector import VectorStore, VectorIndexError, SearchResult
with VectorStore("store.db", dim=128) as vs:
    vs.vector_put(b"k", b"v", np.zeros(128, dtype="f4"))
    results = vs.search(np.zeros(128, dtype="f4"), top_k=1)
    print(results[0].key, results[0].distance)