Indexing

VectorChord's index type vchordrq divides vectors into lists and searches only a subset of lists closest to the query vector. It provides fast build time and low memory consumption, while delivering significantly better performance than both hnsw and ivfflat.

Let's start by creating a table named items with an embedding column of type vector(n), and then populate it with sample data.

sql
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
INSERT INTO items (embedding) SELECT ARRAY[random(), random(), random()]::real[] FROM generate_series(1, 1000);

To create a vchordrq index, you can use the following SQL.

sql
CREATE INDEX ON items USING vchordrq (embedding vector_l2_ops);

After the index is built, you can perform a vector search using it to retrieve the nearest neighbors to a vector.

sql
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 10;

You can also add filters to vector search queries as needed.

sql
SELECT * FROM items WHERE id % 7 <> 0 ORDER BY embedding <-> '[3,1,2]' LIMIT 10;

Tuning: Balancing query throughput and accuracy

When the table contains only a small number of rows, you usually don't need to set the index options.

sql
CREATE INDEX ON items USING vchordrq (embedding vector_l2_ops);

SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 10;

However, as the number of rows grows, partitioning becomes necessary and can be configured through index options. Options are specified as a TOML (Tom's Obvious, Minimal Language) string. Refer to the Indexing Options reference below for more information.

sql
CREATE INDEX ON items USING vchordrq (embedding vector_l2_ops) WITH (options = $$
[build.internal]
lists = [1000]
$$);

SET vchordrq.probes TO '10';
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 10;

The parameter lists should be tuned based on the number of rows. The following table provides guidelines for choosing an appropriate value. When querying, choose vchordrq.probes accordingly.

| Number of Rows | Recommended Number of Partitions | Example `lists` |
| --- | --- | --- |
|  | N/A | `[]` |
|  |  | `[2000]` |
|  |  | `[10000]` |
|  |  | `[80000]` |

The process of building an index involves two steps: partitioning the vector space first, and then inserting rows into the index. The first step, partitioning the vector space, can be sped up using multiple threads.

sql
CREATE INDEX ON items USING vchordrq (embedding vector_l2_ops) WITH (options = $$
[build.internal]
lists = [1000]
build_threads = 8
$$);

The second step, inserting rows into the index, can be parallelized by configuring PostgreSQL's parallel index build workers (the max_parallel_maintenance_workers GUC parameter). Refer to PostgreSQL Tuning. It is common practice to set both build.internal.build_threads and the number of PostgreSQL parallel workers to the number of CPU cores.

For most datasets using cosine similarity, enabling residual_quantization and build.internal.spherical_centroids may improve both QPS and recall. We recommend validating this on a representative sample of your production data before enabling it in production.

sql
CREATE INDEX ON items USING vchordrq (embedding vector_cosine_ops) WITH (options = $$
residual_quantization = true
[build.internal]
lists = [1000]
spherical_centroids = true
build_threads = 8
$$);

Tuning: Optimize the indexing time

To improve the build speed, you may opt to use more shared memory to accelerate the process by setting build.pin to 2.

sql
CREATE INDEX ON items USING vchordrq (embedding vector_l2_ops) WITH (options = $$
...
build.pin = 2
$$);

For large tables with more than 50 million rows, the build.internal process requires significant time and memory. Let D be the vector dimension used for partitioning, C be build.internal.lists[-1], F be build.internal.sampling_factor, I be build.internal.kmeans_iterations, and T be build.internal.build_threads. The build time grows roughly in proportion to D · F · C² · I / T, and usually takes more than one day.
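To get a feel for that scaling, here is a minimal sketch in Python. It assumes the clustering step behaves like Lloyd's k-means over F · C sampled D-dimensional vectors; the constant factor is unknown, so only ratios between configurations are meaningful:

```python
# Illustrative sketch only: relative cost of the build.internal clustering
# step, assuming it scales like Lloyd's k-means over F * C sampled vectors.
#   D = vector dimension used for partitioning
#   C = build.internal.lists[-1] (number of centroids)
#   F = build.internal.sampling_factor
#   I = build.internal.kmeans_iterations
#   T = build.internal.build_threads

def kmeans_cost(D: int, C: int, F: int = 256, I: int = 10, T: int = 1) -> float:
    samples = F * C                 # vectors sampled for clustering
    return D * samples * C * I / T  # distance evaluations, divided by threads

# Doubling lists[-1] roughly quadruples the clustering cost...
print(kmeans_cost(768, 2000) / kmeans_cost(768, 1000))       # -> 4.0
# ...while adding build_threads divides it.
print(kmeans_cost(768, 2000) / kmeans_cost(768, 2000, T=8))  # -> 8.0
```

Note the quadratic dependence on C: this is why the last value of lists dominates the build time of large indexes.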

If this applies to you, you can use hierarchical clustering to speed up the process, albeit at the expense of some accuracy. In our benchmark, hierarchical clustering was 100 times faster than the default algorithm, while query recall decreased only from 95.6% to 94.9%.

sql
CREATE INDEX ON items USING vchordrq (embedding vector_l2_ops) WITH (options = $$
...
kmeans_algorithm.hierarchical = {}
$$);

Tuning: Optimize the memory usage during indexing

When the indexing process starts, VectorChord shows the estimated amount of memory that will be allocated, such as:

shell
INFO:  clustering: estimated memory usage is 1.49 GiB

If the value exceeds your expectations or your physical memory constraints, it is wise to cancel the build and review this chapter. There are several options that can help reduce memory usage.

The memory usage is mostly determined by:

  • D: The vector dimension used for partition. This is either the actual vector dimension or the k-means dimension if set.
  • F: build.internal.sampling_factor.
  • C: build.internal.lists[-1].

Based on our experience, reducing D has the least impact on accuracy, so that is a good starting point. Decreasing F is also a reasonable option. Since C is much more sensitive, it should be the last thing you consider.
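A back-of-the-envelope sketch of how these three factors combine, assuming the clustering memory is dominated by a buffer of F · C sampled vectors stored as 4-byte floats (an assumption for illustration; the real estimator includes additional overhead):

```python
# Rough memory model: float32 bytes * dims * samples-per-centroid * centroids.
# This is an illustrative assumption, not VectorChord's exact estimator.

def clustering_mem_bytes(D: int, F: int, C: int) -> int:
    return 4 * D * F * C

base = clustering_mem_bytes(D=768, F=256, C=2000)
print(base / 2**30)  # -> ~1.46 (GiB)

# Applying the reductions suggested below (D: 768 -> 100, F: 256 -> 64):
reduced = clustering_mem_bytes(D=100, F=64, C=2000)
print(reduced / base)  # -> ~0.033, i.e. about 3% of the original usage
```

Under this model, reducing D and F together shrinks the sample buffer multiplicatively, which is why those two knobs are the recommended starting point.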

For your reference, this configuration has little impact on query accuracy:

  • Reduce D from 768 to 100.
  • Reduce F from 256 to 64.
sql
CREATE INDEX ON items USING vchordrq (embedding vector_l2_ops) WITH (options = $$
...
kmeans_dimension = 100
sampling_factor = 64
$$);

If the resulting accuracy is not acceptable, you can also refer to External Build to offload the indexing workload to other machines.

Reference

Operator Classes vchordrq

The following table lists all available operator classes supported by vchordrq.

| Operator Class | Description | Operator 1 | Operator 2 |
| --- | --- | --- | --- |
| `vector_l2_ops` | index works for `vector` type and Euclidean distance | `<->(vector,vector)` | `<<->>(vector,vector)` |
| `vector_ip_ops` | index works for `vector` type and negative inner product | `<#>(vector,vector)` | `<<#>>(vector,vector)` |
| `vector_cosine_ops` | index works for `vector` type and cosine distance | `<=>(vector,vector)` | `<<=>>(vector,vector)` |
| `halfvec_l2_ops` | index works for `halfvec` type and Euclidean distance | `<->(halfvec,halfvec)` | `<<->>(halfvec,halfvec)` |
| `halfvec_ip_ops` | index works for `halfvec` type and negative inner product | `<#>(halfvec,halfvec)` | `<<#>>(halfvec,halfvec)` |
| `halfvec_cosine_ops` | index works for `halfvec` type and cosine distance | `<=>(halfvec,halfvec)` | `<<=>>(halfvec,halfvec)` |
| `vector_maxsim_ops` | index works for `vector[]` type and scalable vector similarity | `@#(vector[],vector[])` | N/A |
| `halfvec_maxsim_ops` | index works for `halfvec[]` type and scalable vector similarity | `@#(halfvec[],halfvec[])` | N/A |

<->, <#> and <=> are operators defined by pgvector.

| Name | Description |
| --- | --- |
| `<->` | squared Euclidean distance |
| `<#>` | negative dot product |
| `<=>` | cosine distance |
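As a plain-Python reference for the three pgvector operators listed above (illustration only; this is not how the index evaluates them):

```python
import math

def l2_squared(u, v):        # "<->": squared Euclidean distance
    return sum((a - b) ** 2 for a, b in zip(u, v))

def neg_dot(u, v):           # "<#>": negative dot product
    return -sum(a * b for a, b in zip(u, v))

def cosine_distance(u, v):   # "<=>": 1 - cosine similarity
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norms

q, x = [3.0, 1.0, 2.0], [2.0, 2.0, 2.0]
print(l2_squared(q, x))       # -> 2.0
print(neg_dot(q, x))          # -> -12.0
print(cosine_distance(q, x))
```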

<<->>, <<#>>, <<=>> and @# are operators defined by VectorChord.

For more information about <<->>, <<#>>, <<=>>, refer to Similarity Filter.

For more information about @#, refer to Multi-Vector Retrieval.

The operator classes for MaxSim are available since version 0.3.0.

Indexing Options vchordrq

residual_quantization

  • Description: This option determines whether residual quantization is used. If you are not familiar with residual quantization, you can read this blog for more information. In short, residual quantization is a technique that improves the accuracy of vector search by quantizing the residuals of the vectors.
  • Type: boolean
  • Default: false
  • Example:
    • residual_quantization = false means that residual quantization is not used.
    • residual_quantization = true means that residual quantization is used.

degree_of_parallelism since v1.0.0

  • Description: This option is a hint that specifies the degree of parallelism. In most cases, you do not need to change it. If you are using a CPU with more than 32 threads and wish to utilize more threads for PostgreSQL, you may set it to the number of threads for better performance.
  • Type: integer
  • Default: 32
  • Domain: [1, 256]
  • Example:
    • degree_of_parallelism = 32 hints to the index that 32 or fewer processes may access the index concurrently.
    • degree_of_parallelism = 64 hints to the index that 64 or fewer processes may access the index concurrently.

build.pin since v0.2.1

  • Description: This option determines whether shared memory is used for indexing. For large tables, you can choose to enable this option to speed up the build process.
  • Type: union of integer and boolean
  • Default:
    • -1 since v1.0.0
    • false until v0.5.3
  • Domain:
    • {-1, 0, 1, 2, false, true} since v1.0.0
    • {false, true} until v0.5.3
  • Example:
    • build.pin = 2 means the hot portion of the index is cached in memory.
    • build.pin = 1 means a subset of the hot portion of the index is cached in memory, consuming less memory.
    • build.pin = 0 means that this feature is enabled but nothing is actually cached. This option is for debugging purposes only.
    • build.pin = -1 means that this feature is disabled.
    • build.pin = false is the legacy form of build.pin = -1.
    • build.pin = true is the legacy form of build.pin = 1.

Default Build Options since v0.5.3

This is the default behavior for index building: the index will not be partitioned. In terms of semantics, build.default = {} is similar to build.internal.lists = [].

Internal Build Options vchordrq

build.internal.lists

  • Description: This option determines the hierarchical structure of the vector space partitioning.
  • Type: list of integers
  • Default:
    • [] since v0.3.0
    • [1000] until v0.2.2: implicit behavior is not ideal
  • Example:
    • build.internal.lists = [] means that the vector space is not partitioned.
    • build.internal.lists = [4096] means the vector space is divided into 4096 cells.
    • build.internal.lists = [4096, 262144] means the vector space is divided into 4096 cells, and those cells are further divided into 262144 smaller cells.
  • Note: The index partitions the vector space into multiple Voronoi cells based on centroids, iteratively creating a hierarchical space partition tree. Each leaf node in this tree represents a region with an associated list storing vectors in that region. During insertion, vectors are placed in the lists corresponding to their leaf nodes. For queries, the index optimizes search by excluding lists whose leaf nodes are distant from the query vector, effectively pruning the search space. The values in lists should be scaled with the number of vectors in the table; see the guidelines table above.
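The pruning idea in the note can be illustrated with a toy sketch (not VectorChord's implementation): every vector is stored in the list of its nearest centroid, and a query scans only the closest lists.

```python
# Toy sketch of nearest-centroid list selection. `probes` lists closest to
# the query are scanned; vectors in all other lists are never touched.

def nearest_lists(query, centroids, probes):
    """Indices of the `probes` centroids closest to `query` (squared L2)."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    ranked = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    return ranked[:probes]

centroids = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]]
# Lists 1 and 3 are far from the query, so they are pruned.
print(nearest_lists([1.0, 2.0], centroids, probes=2))  # -> [0, 2]
```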

build.internal.spherical_centroids

  • Description: This option determines whether to perform spherical K-means, in which the centroids are L2-normalized after each iteration.
  • Type: boolean
  • Default: false
  • Example:
    • build.internal.spherical_centroids = false means that spherical k-means is not performed.
    • build.internal.spherical_centroids = true means that spherical k-means is performed.
  • Note: Set this to true if your model generates embeddings that use cosine similarity as the metric.

build.internal.sampling_factor since v0.2.0

  • Description: This option determines the number of vectors the K-means algorithm samples per cluster. The higher this value, the slower the build, the greater the memory consumption during the build, and the better the search accuracy.
  • Type: integer
  • Domain: [0, 1024]
  • Default: 256
  • Example:
    • build.internal.sampling_factor = 256 means that the K-means algorithm samples 256 · C vectors, where C is the maximum value in build.internal.lists.
    • build.internal.sampling_factor = 32 means that the K-means algorithm samples 32 · C vectors, where C is the maximum value in build.internal.lists. This reduces K-means' time and memory usage to approximately 1/8 of what it would be with the default value of 256.
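Assuming sampling_factor = F means F vectors are sampled per cluster, for F · C samples in total (with C the largest value in lists), the savings from lowering F are straightforward to compute:

```python
# Illustrative arithmetic only: total sample count under the assumption
# that the K-means pass draws `factor` vectors per centroid.

def sample_count(factor: int, lists: list[int]) -> int:
    C = max(lists)  # the maximum value in build.internal.lists
    return factor * C

default = sample_count(256, [1000])
reduced = sample_count(32, [1000])
print(default)            # -> 256000
print(reduced / default)  # -> 0.125, i.e. 1/8 of the default
```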

build.internal.kmeans_iterations since v0.2.2

  • Description: This option determines the number of iterations for K-means algorithm. The higher this value, the slower the build.
  • Type: integer
  • Domain: [0, 1024]
  • Default: 10
  • Example:
    • build.internal.kmeans_iterations = 10 means that the K-means algorithm performs 10 iterations.
    • build.internal.kmeans_iterations = 100 means that the K-means algorithm performs 100 iterations.

build.internal.build_threads

  • Description: This option determines the number of threads used by the K-means algorithm. The higher this value, the faster the build, but the greater the load on the server during the build.
  • Type: integer
  • Domain: [1, 255]
  • Default: 1
  • Example:
    • build.internal.build_threads = 1 means that the K-means algorithm uses 1 thread.
    • build.internal.build_threads = 4 means that the K-means algorithm uses 4 threads.

build.internal.kmeans_algorithm since v1.0.0

  • Description: This option determines the K-means algorithm to be used.
  • Type: object
  • Example:
    • build.internal.kmeans_algorithm.lloyd = {}. This uses Lloyd's algorithm. This is the default value.
    • build.internal.kmeans_algorithm.hierarchical = {}. This uses hierarchical clustering. Compared to Lloyd's algorithm, this approach is much faster, but it may cause a loss of accuracy.

build.internal.kmeans_dimension since v1.0.0

  • Description: This option determines the dimension to use for K-means input and output. This feature employs dimensionality reduction and expansion via resampling, effectively reducing K-means' time and memory consumption, but it may cause a loss of accuracy.
  • Type: union of integer and null
  • Default: null
  • Example:
    • If this option is not set, this feature is disabled.
    • build.internal.kmeans_dimension = 100 means that K-means will process vectors with 100 dimensions. For original vectors of, for example, 768 dimensions, this reduces K-means' time and memory usage to approximately 100/768 (about 13%) of what it would be without this feature.

Search Parameters vchordrq

vchordrq.enable_scan since v0.5.0

  • Description: Enables or disables the query planner's use of vchordrq index scan. It's for testing purposes.
  • Type: boolean
  • Default: on
  • Example:
    • vchordrq.enable_scan = off disables the query planner's use of vchordrq index scan.
    • vchordrq.enable_scan = on enables the query planner's use of vchordrq index scan.

vchordrq.probes

  • Description: This GUC parameter vchordrq.probes controls how many partitions at each level of the space partition tree are probed during a query. The more probes, the more accurate the search, but also the slower it is.
  • Type: list of integers
  • Default:
    • '' (an empty list) since v0.3.0
    • 10 until v0.2.2: the default value was 10 before `lists` defaulted to empty
  • Example:
    • SET vchordrq.probes = 1 means that only one probe is used.
    • SET vchordrq.probes = 10 means that ten probes are used.
  • Note: The default value is an empty list. The length of this option must match the length of lists.
    • If lists = [], then probes must be '' (empty).
    • If lists = [11, 22], then probes can be '2,4' or '4,8'. It must not have a mismatched length, for example, '', '3', '7,8,9', or '5,5,5,5'.
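The shape rule can be expressed as a tiny checker (a hypothetical helper, not part of VectorChord): probes is valid only when it has one entry per level of lists.

```python
# Hypothetical validity check mirroring the note above: the length of
# `probes` must equal the length of `lists`.

def valid_probes(lists, probes):
    return len(probes) == len(lists)

print(valid_probes([], []))               # -> True  (lists = [] needs empty probes)
print(valid_probes([11, 22], [2, 4]))     # -> True
print(valid_probes([11, 22], [4, 8]))     # -> True
print(valid_probes([11, 22], [3]))        # -> False (too short)
print(valid_probes([11, 22], [7, 8, 9]))  # -> False (too long)
```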

vchordrq.epsilon

  • Description: Even after pruning, the number of retrieved vectors remains substantial. The index employs the RaBitQ algorithm to quantize vectors into bit vectors, which require just 1/32 of the memory of single-precision floating-point vectors. Most computations are integer-based, leading to faster processing. Unlike conventional quantization algorithms, RaBitQ estimates not only distances but also their lower bounds. The index computes the lower bound for each vector and dynamically adjusts the number of vectors whose distances must be recalculated, thus balancing performance and accuracy. The GUC parameter vchordrq.epsilon controls the conservativeness of the lower bounds of distances. The higher the value, the higher the accuracy, but the worse the performance. The default value tends to favor higher accuracy than most use cases need, so you can try lowering this parameter to achieve better performance.
  • Type: real
  • Default: 1.9
  • Domain: [0.0, 4.0]
  • Example:
    • SET vchordrq.epsilon = 0.1 indicates a very optimistic lower-bound estimation. Set it this way if your dataset is not sensitive to the lower-bound estimation at the precision you need.
    • SET vchordrq.epsilon = 4.0 indicates a very pessimistic lower-bound estimation. Set it this way if your dataset is very sensitive to the lower-bound estimation at the precision you need.
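The adaptive reranking described above can be sketched as follows. This is a toy stand-in for RaBitQ, not the real algorithm: each candidate carries an estimated distance, an error term, and its exact distance, and the exact distance is recomputed only when the lower bound (estimate minus an epsilon-scaled error) could still beat the current k-th best. A smaller epsilon prunes more aggressively; a larger one recomputes more distances.

```python
import heapq

def rerank(candidates, k, epsilon):
    """candidates: iterable of (estimated, error, exact) distances."""
    best = []         # max-heap via negation: worst of the current top-k on top
    exact_evals = 0
    for est, err, exact in candidates:
        lower = est - epsilon * err
        if len(best) == k and lower >= -best[0]:
            continue                 # the bound proves it cannot enter the top-k
        exact_evals += 1             # otherwise pay for an exact distance
        heapq.heappush(best, -exact)
        if len(best) > k:
            heapq.heappop(best)      # evict the worst of the top-k
    return sorted(-d for d in best), exact_evals

cands = [(1.0, 0.5, 1.1), (2.0, 0.5, 1.9), (5.0, 0.5, 5.2)]
print(rerank(cands, k=1, epsilon=0.1))  # -> ([1.1], 1): optimistic, prunes more
print(rerank(cands, k=1, epsilon=4.0))  # -> ([1.1], 2): pessimistic, recomputes more
```

Both settings return the same top-1 here; the difference is how many exact distance computations were paid for along the way.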

vchordrq.prewarm_dim removed in v0.4.0

  • Description: The vchordrq.prewarm_dim GUC parameter is used to precompute the RaBitQ projection matrix for the specified dimensions. This can help to reduce the latency of the first query after the PostgreSQL cluster is started.
  • Type: list of integers
  • Default: 64,128,256,384,512,768,1024,1536
  • Example:
    • ALTER SYSTEM SET vchordrq.prewarm_dim = '64,128' means that the projection matrix will be precomputed for dimensions 64 and 128.
  • Note:
    • This setting requires a database restart to take effect.
    • Since v0.4.0, a new algorithm made this option obsolete.