Home

Metadata

vecs allows you to associate key-value pairs of metadata with indexes and ids in your collections. You can then add filters to queries that reference the metadata metadata.

Types#

Metadata is stored as binary JSON. As a result, allowed metadata types are drawn from JSON primitive types.

  • Boolean
  • String
  • Number

The technical limit of a metadata field associated with a vector is 1GB. In practice you should keep metadata fields as small as possible to maximize performance.

Metadata Query Language#

The metadata query language is based loosely on mongodb's selectors.

vecs currently supports a subset of those operators.

Comparison Operators#

Comparison operators compare a provided value with a value stored in metadata field of the vector store.

OperatorDescription
$eqMatches values that are equal to a specified value
$neMatches values that are not equal to a specified value
$gtMatches values that are greater than a specified value
$gteMatches values that are greater than or equal to a specified value
$ltMatches values that are less than a specified value
$lteMatches values that are less than or equal to a specified value
$inMatches values that are contained by scalar list of specified values

Logical Operators#

Logical operators compose other operators, and can be nested.

OperatorDescription
$andJoins query clauses with a logical AND returns all documents that match the conditions of both clauses.
$orJoins query clauses with a logical OR returns all documents that match the conditions of either clause.

Performance#

For best performance, use scalar key-value pairs for metadata and prefer $eq, $and and $or filters where possible. Those variants are most consistently able to make use of indexes.

Examples#


year equals 2020


_10
{"year": {"$eq": 2020}}


year equals 2020 or gross greater than or equal to 5000.0


_10
{
_10
"$or": [
_10
{"year": {"$eq": 2020}},
_10
{"gross": {"$gte": 5000.0}}
_10
]
_10
}


last_name is less than "Brown" and is_priority_customer is true


_10
{
_10
"$and": [
_10
{"last_name": {"$lt": "Brown"}},
_10
{"is_priority_customer": {"$gte": 5000.00}}
_10
]
_10
}


priority contained by ["enterprise", "pro"]


_10
{
_10
"priority": {"$in": ["enterprise", "pro"]}
_10
}