---
title: "Ingest Documents"
description: "How to ingest text, streams, files, metadata, and XION-signed content into a Lucent collection."
published: 2026-05-14T12:11:30.963447+00:00
updated: 2026-05-14T12:11:30.963447+00:00
tags: ["guide", "ingestion", "lucent"]
url: https://xiobjects.com/docs/xio/lucent/guides/ingest-documents
source: XI Objects
---

<!-- xion:doctype xion+markdown -->
<!-- xion:metadata
{
  "version": "1.0",
  "content_type": "application/xion\u002Bmarkdown",
  "source_type": "xi-content/doc",
  "generator": "xio-content-publisher/1.0.0",
  "generated": "2026-05-14T12:10:20.2192317\u002B00:00",
  "encoding": "utf-8",
  "render_intent": "markdown",
  "title": "Ingest Documents",
  "slug": "xio/lucent/guides/ingest-documents",
  "copyright": "\u00A9 2026 XI Objects Inc"
}
-->

# Ingest Documents

All ingestion goes through `IKnowledgeEngine.AddDocumentAsync`. The properties you set on `AddDocumentRequest` determine whether you ingest text, a stream, custom metadata, or XION-signed content.

## Prerequisites

- `Xio.Lucent` installed and registered with `services.AddLucent()`
- `engine.ProvisionAsync()` called at startup

## Ingest plain text

Pass content as a string. Lucent detects the content type automatically; set `ContentTypeHint` to skip detection.

```csharp
var result = await engine.AddDocumentAsync("my-collection", new AddDocumentRequest
{
    Content = "The quick brown fox...",
    DocumentId = "fox-article",
    ContentTypeHint = "text/plain"
});
```

`DocumentId` is optional. If you omit it, Lucent derives a deterministic ID from a BLAKE3 hash of the content. The same content always produces the same ID, which makes idempotent ingestion straightforward.
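To make the idea of a content-derived ID concrete, here is a minimal sketch. It is not Lucent's implementation: `DeriveDocumentId` is a hypothetical helper, and SHA-256 stands in for BLAKE3 because BLAKE3 is not part of the .NET base class library. The format of the ID that Lucent actually generates is not specified in this guide.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derives a stable ID from document content.
// Assumption: SHA-256 is used here as a stand-in for BLAKE3.
static string DeriveDocumentId(string content)
{
    byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(content));
    // Lowercase hex keeps the ID easy to log and compare.
    return Convert.ToHexString(digest).ToLowerInvariant();
}

string first = DeriveDocumentId("The quick brown fox...");
string second = DeriveDocumentId("The quick brown fox...");
string edited = DeriveDocumentId("The quick brown fox!");

Console.WriteLine(first == second); // True: identical content, identical ID
Console.WriteLine(first == edited); // False: changed content, different ID
```

Because the ID depends only on the content, re-running an ingestion job over unchanged input targets the same document rather than creating a duplicate.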

## Ingest a file stream

Pass a forward-only stream for binary formats. Lucent reads the stream once and doesn't seek.

```csharp
await using var file = File.OpenRead("/docs/spec.pdf");

var result = await engine.AddDocumentAsync("my-collection", new AddDocumentRequest
{
    ContentStream = file,
    Source = new SourceInfo
    {
        FileName = "spec.pdf",
        FilePath = "/docs/spec.pdf",
        LastModified = File.GetLastWriteTimeUtc("/docs/spec.pdf")
    }
});

Console.WriteLine($"{result.ChunkCount} chunks from {result.ContentType}");
```

The `Source` record is optional, but providing `FileName` improves content type detection and populates `ChunkMetadata.FilePath`. `ContentStream` and `Content` are mutually exclusive; setting both throws.
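If you build requests dynamically, a pre-flight check can surface the mutual-exclusivity rule before the call reaches Lucent. The sketch below is caller-side code, not part of the Lucent API: `EnsureSingleContentSource` is a hypothetical helper, and rejecting a request with neither value set is an assumption that this guide does not confirm.

```csharp
#nullable enable
using System;
using System.IO;

// Hypothetical pre-flight check mirroring the rule that Content and
// ContentStream are mutually exclusive. Not part of the Lucent API.
static void EnsureSingleContentSource(string? content, Stream? contentStream)
{
    if (content is not null && contentStream is not null)
        throw new ArgumentException("Set either Content or ContentStream, not both.");

    // Assumption: a request with no content at all is also invalid.
    if (content is null && contentStream is null)
        throw new ArgumentException("Set one of Content or ContentStream.");
}

using var stream = new MemoryStream(new byte[] { 1, 2, 3 });
EnsureSingleContentSource(null, stream);       // passes: stream only
EnsureSingleContentSource("plain text", null); // passes: text only
```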

## Attach metadata

Any key-value pairs in `AddDocumentRequest.Metadata` are attached to every chunk from that document. They're queryable via filter expressions at retrieval time.

```csharp
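// Open the file to ingest (the path is illustrative)
await using var file = File.OpenRead("privacy-policy-v3.pdf");

var result = await engine.AddDocumentAsync("policies", new AddDocumentRequest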
{
    ContentStream = file,
    Source = new SourceInfo { FileName = "privacy-policy-v3.pdf" },
    Metadata = new Dictionary<string, string>
    {
        ["category"] = "legal",
        ["version"] = "3",
        ["status"] = "approved",
        ["owner"] = "legal-team"
    }
});
```

Later, restrict a query to approved legal documents:

```csharp
using static Lucent.Filter;

var result = await engine.QueryAsync("policies", new QueryRequest
{
    Text = "data retention obligations",
    Filter = And(Eq("category", "legal"), Eq("status", "approved"))
});
```
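To illustrate what `And(Eq(...), Eq(...))` expresses, the following sketch evaluates the same filter shape against in-memory metadata dictionaries. The `Eq` and `And` functions here are illustrative stand-ins written for this example, not Lucent's `Filter` implementation, and treating a missing key as a non-match is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative stand-ins for Eq and And; NOT Lucent's Filter implementation.
// Assumption: a chunk that lacks the key does not match.
static Func<IReadOnlyDictionary<string, string>, bool> Eq(string key, string value) =>
    metadata => metadata.TryGetValue(key, out var actual) && actual == value;

// A conjunction matches only when every clause matches.
static Func<IReadOnlyDictionary<string, string>, bool> And(
    params Func<IReadOnlyDictionary<string, string>, bool>[] clauses) =>
    metadata => clauses.All(clause => clause(metadata));

var approved = new Dictionary<string, string> { ["category"] = "legal", ["status"] = "approved" };
var draft = new Dictionary<string, string> { ["category"] = "legal", ["status"] = "draft" };

var filter = And(Eq("category", "legal"), Eq("status", "approved"));

Console.WriteLine(filter(approved)); // True
Console.WriteLine(filter(draft));    // False
```

Since metadata is attached to every chunk of a document, a filter like this effectively includes or excludes whole documents at retrieval time.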

## Ingest XION-signed content

Set `IsXionContent = true` to trigger XION verification before ingestion. The document must be a valid XION-signed artifact; if verification fails, Lucent rejects the document and throws an exception.

```csharp
var xionMarkdown = File.ReadAllText("report.xion.md");

var result = await engine.AddDocumentAsync("verified-docs", new AddDocumentRequest
{
    Content = xionMarkdown,
    DocumentId = "q1-report",
    IsXionContent = true
});
```

On success, every chunk carries a `XionAttribution` record on its `ChunkMetadata`. See [XRAG: Verified-Provenance RAG](/docs/xio/lucent/concepts/xrag) for the full details.

XRAG requires `AddLucentXionVerification` in your DI registration. Attempting to ingest XION content without it throws at runtime.

## Update a document

Calling `AddDocumentAsync` with an existing `DocumentId` is an upsert. Lucent compares the BLAKE3 hash of the incoming content against the stored hash:

- Same hash, same embedder model: no-op. Returns immediately with the existing chunk count and zero durations.
- Same hash, different model: re-embeds all chunks with the new model.
- Different hash: runs the full pipeline, replaces all chunks.

This means you can call `AddDocumentAsync` on every deploy without worrying about redundant work for unchanged files.
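The three upsert outcomes can be sketched as a small decision function. This is a minimal illustration of the rules listed above, not Lucent's internal code: `DecideUpsertAction`, its parameters, and the returned action names are all hypothetical.

```csharp
using System;

// Hypothetical decision helper; the action names are illustrative, not Lucent types.
static string DecideUpsertAction(
    string storedHash, string storedModel, string incomingHash, string currentModel)
{
    bool sameContent = string.Equals(storedHash, incomingHash, StringComparison.Ordinal);
    if (!sameContent)
        return "full-pipeline"; // content changed: run everything, replace all chunks

    bool sameModel = string.Equals(storedModel, currentModel, StringComparison.Ordinal);
    return sameModel
        ? "no-op"     // nothing changed: return the existing chunk count
        : "re-embed"; // same content, new embedder model: re-embed all chunks
}

Console.WriteLine(DecideUpsertAction("abc123", "model-a", "abc123", "model-a")); // no-op
Console.WriteLine(DecideUpsertAction("abc123", "model-a", "abc123", "model-b")); // re-embed
Console.WriteLine(DecideUpsertAction("abc123", "model-a", "def456", "model-a")); // full-pipeline
```

The content hash is checked first, so a changed document always goes through the full pipeline regardless of which model is configured.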

## Delete a document

```csharp
await engine.DeleteDocumentAsync("my-collection", "fox-article");
```

This removes the document registry entry and all associated chunks from both the vector store and the full-text index.

## List documents

```csharp
var docs = await engine.ListDocumentsAsync("my-collection");

foreach (var doc in docs)
{
    Console.WriteLine($"{doc.DocumentId} — {doc.ChunkCount} chunks, ingested {doc.IngestedAt:u}");
}
```

`DocumentInfo` carries the document ID, content type, chunk count, model ID, source info, and ingest timestamp.

## Export and import

Lucent can export a collection to a single binary stream and restore it elsewhere. The SQLite adapter uses `VACUUM INTO` for an atomic snapshot with no WAL or SHM sidecars.

```csharp
// Export
await using var backup = File.Create("collection-backup.lucent");
await engine.ExportAsync("my-collection", backup);

// Import, typically into a fresh engine instance (the same variable is reused here for brevity)
await using var restored = File.OpenRead("collection-backup.lucent");
await engine.ImportAsync("my-collection", restored);
```

## Next Steps

- [Query a Collection](/docs/xio/lucent/guides/query-collection) — search, filters, score thresholds
- [XRAG: Verified-Provenance RAG](/docs/xio/lucent/concepts/xrag) — provenance querying and signed source retrieval
- [Chunking Strategies](/docs/xio/lucent/concepts/chunking) — choose the right chunker for your content