---
title: "Hybrid Search"
description: "How XI Lucent combines vector cosine similarity and BM25 full-text search with Reciprocal Rank Fusion, how filters work, and how scoring is composed."
published: 2026-05-14T12:11:22.280794+00:00
updated: 2026-05-14T12:11:22.280794+00:00
tags: ["concepts", "hybrid-search", "lucent", "retrieval"]
url: https://xiobjects.com/docs/xio/lucent/concepts/hybrid-search
source: XI Objects
---

<!-- xion:doctype xion+markdown -->
<!-- xion:metadata
{
  "version": "1.0",
  "content_type": "application/xion\u002Bmarkdown",
  "source_type": "xi-content/doc",
  "generator": "xio-content-publisher/1.0.0",
  "generated": "2026-05-14T12:10:19.1830419\u002B00:00",
  "encoding": "utf-8",
  "render_intent": "markdown",
  "title": "Hybrid Search",
  "slug": "xio/lucent/concepts/hybrid-search",
  "copyright": "\u00A9 2026 XI Objects Inc"
}
-->

# Hybrid Search

Lucent's default retrieval strategy runs two search passes in parallel and fuses them. Vector similarity excels at semantic matches ("what does lucent do") where no keyword appears verbatim in the relevant chunks. BM25 full-text search excels at exact-term retrieval ("what is the default value of HybridRrfK"). Neither alone handles both cases well. Hybrid search handles both.

## How the default strategy works

```mermaid
sequenceDiagram
    participant Q as QueryAsync
    participant VS as Vector Store
    participant FTS as FTS5 Store
    participant RRF as RRF Fusion
    participant S as Scorer

    Q->>VS: cosine KNN (query vector)
    Q->>FTS: BM25 (query text)
    VS-->>RRF: ranked candidates
    FTS-->>RRF: ranked candidates
    RRF->>RRF: deduplicate + fuse
    RRF->>S: merged list
    S-->>Q: final results
```

The two search paths run concurrently. Results from each path are a ranked list of `ScoredChunk` objects. Fusion happens over ranks, not raw scores.
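The concurrency described above can be sketched roughly as follows. This is a minimal illustration, not Lucent's implementation: `ParallelSearchSketch`, `SearchBothAsync`, and the two delegates are hypothetical stand-ins for the vector and FTS paths, and candidates are reduced to chunk IDs ordered best first.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelSearchSketch
{
    // Hypothetical stand-ins for the two search paths; each returns chunk IDs, best first.
    public static async Task<(IReadOnlyList<string> Vector, IReadOnlyList<string> Text)> SearchBothAsync(
        Func<Task<IReadOnlyList<string>>> vectorSearch,
        Func<Task<IReadOnlyList<string>>> textSearch)
    {
        // Start both searches before awaiting either, so they can overlap in time.
        Task<IReadOnlyList<string>> vectorTask = vectorSearch();
        Task<IReadOnlyList<string>> textTask = textSearch();

        // Wait for both ranked lists; fusion needs the two of them together.
        await Task.WhenAll(vectorTask, textTask);

        return (await vectorTask, await textTask);
    }
}
```

Starting both tasks before the first `await` is what makes the passes overlap; awaiting one search and then starting the other would run them sequentially.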

## Reciprocal Rank Fusion

RRF scores each candidate as the sum of `1 / (k + rank)` across the lists it appears in, where `rank` is the candidate's position in a given list and `k` is a constant that dampens the impact of top-ranked documents from either list alone. Lucent defaults to `k = 60` (`HybridRrfK`), the value proposed in the original RRF paper and widely used as the default.

During fusion, candidates are deduplicated so that each chunk appears once, and the combined list is then sorted by descending fused score. A chunk that appears in both the vector results and the FTS results gets credit from both, so genuinely relevant chunks that match both semantically and lexically naturally rise to the top.
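As a rough sketch of unweighted RRF, assuming candidates are identified by a string chunk ID, ranks are 1-based, and each ID appears at most once per list (`RrfSketch` and `Fuse` are hypothetical names, not part of Lucent's API):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RrfSketch
{
    // Fuses ranked lists of chunk IDs (best first) into one list sorted by fused score.
    // Ranks are treated as 1-based, which is an assumption of this sketch.
    public static List<KeyValuePair<string, double>> Fuse(
        IEnumerable<IReadOnlyList<string>> rankedLists, int k = 60)
    {
        // Keying by chunk ID deduplicates chunks that appear in more than one list.
        var fused = new Dictionary<string, double>();

        foreach (var list in rankedLists)
        {
            for (int i = 0; i < list.Count; i++)
            {
                double contribution = 1.0 / (k + i + 1);
                fused.TryGetValue(list[i], out double current); // current is 0 if unseen
                fused[list[i]] = current + contribution;
            }
        }

        return fused.OrderByDescending(pair => pair.Value).ToList();
    }
}
```

With a vector list `a, b, c` and a text list `b, d`, chunk `b` collects `1/62 + 1/61` and ranks first, ahead of `a` (`1/61`), `d` (`1/62`), and `c` (`1/63`).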

## Vector and text weights

The `VectorWeight` option (default `0.7`) scales each list's RRF contribution before the contributions are summed. With a vector weight of `0.7`, a vector result contributes `0.7 * (1 / (k + rank))` and a text result contributes `0.3 * (1 / (k + rank))`; the text weight is the remainder, `1 - VectorWeight`. Increasing the vector weight toward `1.0` makes retrieval more semantic; decreasing it makes keyword matches more influential.
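A worked example may make the weighting concrete. The notation is introduced here for illustration only: `w_v` is the vector weight, `r_v(d)` and `r_t(d)` are the chunk's ranks in the vector and text lists, ranks are assumed to be 1-based, and a list that does not contain the chunk is assumed to contribute zero.

```latex
s(d) = \frac{w_v}{k + r_v(d)} + \frac{1 - w_v}{k + r_t(d)}

% w_v = 0.7, k = 60, chunk ranked 1st by vector search and 3rd by BM25:
s(d) = \frac{0.7}{61} + \frac{0.3}{63} \approx 0.01148 + 0.00476 = 0.01624

% Same weights, chunk ranked 1st by BM25 only:
s(d) = \frac{0.3}{61} \approx 0.00492
```

Under these assumptions, a chunk found only by the text path at rank 1 scores well below one found only by the vector path at rank 1 (about `0.00492` versus `0.01148`). These are raw weighted contributions; how Lucent maps them onto the documented `Score` range is not covered here.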

Adjust the weight per query via `RetrievalOptions`, or globally in `LucentOptions`, where the setting is named `HybridSearchVectorWeight`:

```csharp
// Per-query request (a weight override via RetrievalOptions is not shown here)
var result = await engine.QueryAsync("docs", new QueryRequest
{
    Text = "HybridRrfK default value",
    TopK = 10,
});

// Global default, shift toward keyword search
services.AddLucent(opts =>
{
    opts.HybridSearchVectorWeight = 0.5f;
    opts.HybridRrfK = 60;
});
```

## Score composition in results

Each `ScoredChunk` in `QueryResult.Chunks` carries three score fields:

| Field | Range | Meaning |
|-------|-------|---------|
| `Score` | 0–1 | Final fused score from RRF (then re-ranker if configured) |
| `VectorScore` | -1–1 | Raw cosine similarity from the vector search path |
| `TextScore` | 0–1 | Normalized BM25 score from the FTS path |

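`VectorScore` is absent when the chunk was not surfaced by the vector path, and `TextScore` is absent when it was not surfaced by the FTS path, so a chunk found by only one path carries only that path's score. The `Source` field is `"vector"`, `"text"`, or `"hybrid"` depending on which paths surfaced the chunk.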
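The relationship between the per-path scores and `Source` can be sketched as below. This assumes an absent score is modelled as a nullable `float`; the actual property types on `ScoredChunk` are not stated here, and `SourceSketch.Describe` is a hypothetical helper.

```csharp
using System;

public static class SourceSketch
{
    // Hypothetical helper: derives the Source label from which per-path scores are present.
    // Modelling an absent score as null is an assumption of this sketch.
    public static string Describe(float? vectorScore, float? textScore) =>
        (vectorScore.HasValue, textScore.HasValue) switch
        {
            (true, true) => "hybrid",
            (true, false) => "vector",
            (false, true) => "text",
            _ => throw new ArgumentException("A chunk must come from at least one search path."),
        };
}
```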

## Filters

Filter expressions narrow the candidate set before ranking. They're applied at the store level, so filtered chunks never enter the search paths at all.

```csharp
using static Lucent.Filter;

var result = await engine.QueryAsync("specs", new QueryRequest
{
    Text = "authentication flow",
    Filter = And(
        Eq("category", "security"),
        Not(Eq("status", "archived"))
    )
});
```

Filters compose as a tree of `FilterExpression` cases:

| Expression | Meaning |
|-----------|---------|
| `Eq(key, value)` | Exact match on a metadata field |
| `Ne(key, value)` | Not equal |
| `In(key, values)` | Matches any of the given values |
| `Contains(key, substr)` | Substring match |
| `StartsWith(key, prefix)` | Prefix match |
| `Gt`, `Gte`, `Lt`, `Lte` | Numeric or datetime comparison with `FilterValueKind` |
| `Exists(key)` | Field is present |
| `NotExists(key)` | Field is absent |
| `And(...)` | All operands must match |
| `Or(...)` | Any operand must match |
| `Not(expr)` | Inverts the operand |

Filter keys match against `ChunkMetadata` fields. For structural metadata (`PageNumber`, `SlideIndex`, etc.), use the field name directly. For custom metadata from `AddDocumentRequest.Metadata`, use the key as-is.

The SQLite adapter translates filter trees to SQL `WHERE` clauses. Alternative vector store adapters (Qdrant, pgvector) translate to their respective native filter languages.
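To illustrate the general idea of translating a filter tree into a SQL `WHERE` clause, here is a small sketch. Everything in it is an assumption made for illustration: the node records are simplified stand-ins rather than Lucent's `FilterExpression`, and the sketch assumes metadata lives in a JSON column named `metadata` that is read with SQLite's `json_extract`. The real adapter's schema and SQL may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical filter node types for illustration; not Lucent's FilterExpression.
public abstract record Node;
public sealed record EqNode(string Key, string Value) : Node;
public sealed record NotNode(Node Operand) : Node;
public sealed record AndNode(IReadOnlyList<Node> Operands) : Node;
public sealed record OrNode(IReadOnlyList<Node> Operands) : Node;

public static class SqlFilterSketch
{
    // Walks the tree recursively and returns a parameterized WHERE fragment.
    // Values and JSON paths are bound as parameters, never concatenated into the SQL.
    // And/Or nodes are assumed to have at least one operand.
    public static string ToWhere(Node node, List<object> parameters)
    {
        switch (node)
        {
            case EqNode eq:
                return $"json_extract(metadata, {Bind(parameters, "$." + eq.Key)}) = {Bind(parameters, eq.Value)}";
            case NotNode negation:
                // SQL NULL semantics for missing fields are ignored in this sketch.
                return $"NOT ({ToWhere(negation.Operand, parameters)})";
            case AndNode conjunction:
                return "(" + string.Join(" AND ", conjunction.Operands.Select(o => ToWhere(o, parameters))) + ")";
            case OrNode disjunction:
                return "(" + string.Join(" OR ", disjunction.Operands.Select(o => ToWhere(o, parameters))) + ")";
            default:
                throw new NotSupportedException($"Unsupported filter node: {node.GetType().Name}");
        }
    }

    // Appends a value to the parameter list and returns its placeholder name.
    private static string Bind(List<object> parameters, object value)
    {
        parameters.Add(value);
        return "@p" + (parameters.Count - 1);
    }
}
```

For a tree equivalent to the `And(Eq("category", "security"), Not(Eq("status", "archived")))` filter above, this sketch produces `(json_extract(metadata, @p0) = @p1 AND NOT (json_extract(metadata, @p2) = @p3))` with the parameters `$.category`, `security`, `$.status`, `archived`.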

## Score threshold

`QueryRequest.ScoreThreshold` (default `0.0`) filters out results whose final score falls below the threshold before returning. Setting it to `0.35` or `0.4` is a reasonable starting point for cutting noise from weakly relevant results. Too high a threshold and you'll miss relevant chunks; start conservative and raise it based on observed result quality.
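Conceptually, the threshold is a simple cut on the final score. A minimal sketch, using hypothetical `(Id, Score)` tuples in place of `ScoredChunk` and assuming a result exactly at the threshold is kept:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ThresholdSketch
{
    // Keeps results whose final score is at or above the threshold, preserving order.
    public static List<(string Id, double Score)> Apply(
        IEnumerable<(string Id, double Score)> results, double threshold) =>
        results.Where(r => r.Score >= threshold).ToList();
}
```

With the default of `0.0` and scores in the documented 0 to 1 range, nothing is removed.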

## Disabling full-text search

If you want pure vector search without the FTS component, swap to `VectorOnlyStrategy`:

```csharp
services.AddLucent(opts =>
{
    opts.AddRetrievalStrategy<VectorOnlyStrategy>();
});
```

This skips FTS entirely. The `SqliteFtsStore` is still registered and populated at ingest time; it just isn't queried. If you know you'll never use FTS, you can also remove the text search store registration to skip indexing:

```csharp
services.AddLucent(opts =>
{
    opts.AddTextSearchStore<NullTextSearchStore>();
    opts.AddRetrievalStrategy<VectorOnlyStrategy>();
});
```

## Re-ranking with CrossEncoderScorer

After RRF fusion, an optional `IScorer` pass can re-rank results. The default `NoOpScorer` is a pass-through. `CrossEncoderScorer` feeds each (query, chunk) pair through an ONNX cross-encoder model and re-scores:

```csharp
services.AddLucent(opts =>
{
    opts.AddScorer<CrossEncoderScorer>();
});
```

Cross-encoder re-ranking runs after the full fusion step and operates over the `TopK` candidates returned by retrieval. It's slower than RRF because the model is evaluated once per (query, chunk) pair, but it typically produces higher precision in the final ranked list.
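The shape of a re-ranking pass can be sketched as follows. This is not the `IScorer` interface, whose members are not shown here; `RerankSketch.Rerank` is a hypothetical helper, and the `scorePair` delegate stands in for the cross-encoder model call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RerankSketch
{
    // Re-scores every candidate against the query and sorts by the new score, best first.
    // scorePair is a stand-in for a cross-encoder; it is called once per candidate.
    public static List<(string Text, double Score)> Rerank(
        string query,
        IEnumerable<string> candidates,
        Func<string, string, double> scorePair)
    {
        return candidates
            .Select(text => (Text: text, Score: scorePair(query, text)))
            .OrderByDescending(c => c.Score)
            .ToList();
    }
}
```

Because the scorer runs once per candidate, the cost grows linearly with the number of candidates passed in, which is why the pass is limited to the fused shortlist rather than the whole index.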