
# Managing Knowledge Base Sources

> Add, update, and remove the documents that power your agent's knowledge

Sources are the individual documents inside a knowledge base. This guide covers the full lifecycle: adding sources of every type, monitoring processing status, handling failures, and removing outdated content.

<Note>
  Before adding sources you need an existing knowledge base. See [Knowledge Bases](/guides/knowledge-bases-overview) if you haven't created one yet.
</Note>

***

## Adding sources

Truedy supports two source types: **document** (a file you upload) and **web** (URLs you want crawled).

### Document source

Upload a PDF, DOCX, TXT, or Markdown file. The file must be base64-encoded in the request body. Maximum file size is 50 MB.

<CodeGroup>
  ```bash curl theme={null}
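  # Encode your file first, as a single line with no wrapping:
  #   macOS:                 base64 -i manual.pdf > manual_b64.txt
  #   Linux (GNU coreutils): base64 -w 0 manual.pdf > manual_b64.txt
  # Then paste the base64 string into the "data" field.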

  curl -X POST https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/document \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "name": "Product Manual v3",
      "file": {
        "filename": "manual.pdf",
        "content_type": "application/pdf",
        "data": "<BASE64_ENCODED_CONTENT>"
      }
    }'
  ```

  ```python Python theme={null}
  import base64
  import requests

  with open("/path/to/manual.pdf", "rb") as f:
      encoded = base64.b64encode(f.read()).decode()

  response = requests.post(
      "https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/document",
      headers={"Authorization": "Bearer YOUR_API_KEY"},
      json={
          "name": "Product Manual v3",
          "file": {
              "filename": "manual.pdf",
              "content_type": "application/pdf",
              "data": encoded,
          },
      },
  )
  source = response.json()["data"]
  print(f"Document source created: {source['id']} — status: {source['status']}")
  ```

  ```javascript JavaScript theme={null}
  const fs = require('fs')

  const fileBuffer = fs.readFileSync('/path/to/manual.pdf')
  const encoded = fileBuffer.toString('base64')

  const res = await fetch(
    'https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/document',
    {
      method: 'POST',
      headers: {
        Authorization: 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        name: 'Product Manual v3',
        file: {
          filename: 'manual.pdf',
          content_type: 'application/pdf',
          data: encoded,
        },
      }),
    }
  )
  const { data: source } = await res.json()
  console.log(`Document source created: ${source.id} — status: ${source.status}`)
  ```
</CodeGroup>

**Request body:**

| Field               | Type   | Required | Description                                                                                                                               |
| ------------------- | ------ | -------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
| `name`              | string | Yes      | Display name for this source                                                                                                              |
| `description`       | string | No       | Optional description                                                                                                                      |
| `file.filename`     | string | Yes      | Original filename including extension (e.g. `manual.pdf`)                                                                                 |
| `file.content_type` | string | Yes      | MIME type: `application/pdf`, `text/plain`, `application/vnd.openxmlformats-officedocument.wordprocessingml.document`, or `text/markdown` |
| `file.data`         | string | Yes      | Base64-encoded file content                                                                                                               |

<Tip>
  PDFs with selectable text index best. Scanned image-only PDFs produce poor results. Where possible, export text-based PDFs or convert to plain text.
</Tip>
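
The request-body rules above can be checked locally before anything is sent. The following is a minimal sketch, not part of the Truedy API: `build_document_payload`, `MAX_FILE_BYTES`, and `CONTENT_TYPES` are hypothetical names, the extension-to-MIME mapping is inferred from the table above, and it assumes the 50 MB limit applies to the raw file before base64 encoding.

```python
import base64
from pathlib import Path

# Assumption: "50 MB" means 50 * 1024 * 1024 bytes, measured before encoding.
MAX_FILE_BYTES = 50 * 1024 * 1024

# MIME types listed in the request-body table, keyed by file extension.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".txt": "text/plain",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".md": "text/markdown",
}


def build_document_payload(path, name, description=None):
    """Return a JSON-serializable body for the document source endpoint."""
    file_path = Path(path)
    suffix = file_path.suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")

    raw = file_path.read_bytes()
    if len(raw) > MAX_FILE_BYTES:
        raise ValueError(
            f"{file_path.name} is {len(raw)} bytes; the limit is {MAX_FILE_BYTES}"
        )

    payload = {
        "name": name,
        "file": {
            "filename": file_path.name,
            "content_type": CONTENT_TYPES[suffix],
            "data": base64.b64encode(raw).decode("ascii"),
        },
    }
    # "description" is optional, so it is only included when provided.
    if description is not None:
        payload["description"] = description
    return payload
```

The returned dictionary can be passed as `json=` to the `requests.post` call shown in the Python example above.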

### Web source

Truedy crawls the provided URLs, extracts text, and indexes the content. Pages must be publicly accessible without authentication.

<CodeGroup>
  ```bash curl theme={null}
  curl -X POST https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/web \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "name": "Help Centre: Billing",
      "urls": ["https://help.example.com/billing"],
      "max_depth": 1
    }'
  ```

  ```python Python theme={null}
  import requests

  response = requests.post(
      "https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/web",
      headers={"Authorization": "Bearer YOUR_API_KEY"},
      json={
          "name": "Help Centre: Billing",
          "urls": ["https://help.example.com/billing"],
          "max_depth": 1,
      },
  )
  source = response.json()["data"]
  print(f"Web source queued: {source['id']}")
  ```

  ```javascript JavaScript theme={null}
  const res = await fetch(
    'https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/web',
    {
      method: 'POST',
      headers: {
        Authorization: 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        name: 'Help Centre: Billing',
        urls: ['https://help.example.com/billing'],
        max_depth: 1,
      }),
    }
  )
  const { data: source } = await res.json()
  console.log(`Web source queued: ${source.id}`)
  ```
</CodeGroup>

**Request body:**

| Field         | Type             | Required | Description                                                                                                         |
| ------------- | ---------------- | -------- | ------------------------------------------------------------------------------------------------------------------- |
| `name`        | string           | Yes      | Display name for this source                                                                                        |
| `description` | string           | No       | Optional description                                                                                                |
| `urls`        | array of strings | Yes      | One or more URLs to crawl                                                                                           |
| `max_depth`   | integer          | No       | Link-follow depth. `0` = index only the given URLs. `1` = also follow links from those pages. Max `3`. Default `1`. |
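
The constraints in this table can be validated on the client before a crawl is queued. This is a sketch under stated assumptions: `build_web_payload` is a hypothetical helper, and the requirement that each URL be an absolute `http` or `https` address is an assumption rather than a documented rule. The `0` to `3` range for `max_depth` and its default of `1` come from the table above.

```python
from urllib.parse import urlparse

MAX_DEPTH = 3  # maximum link-follow depth stated in the table above


def build_web_payload(name, urls, max_depth=1, description=None):
    """Return a JSON-serializable body for the web source endpoint."""
    urls = list(urls)
    if not urls:
        raise ValueError("Provide at least one URL")

    for url in urls:
        parsed = urlparse(url)
        # Assumption: only absolute http(s) URLs can be crawled.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {url!r}")

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_depth, bool) or not isinstance(max_depth, int):
        raise ValueError("max_depth must be an integer")
    if not 0 <= max_depth <= MAX_DEPTH:
        raise ValueError(f"max_depth must be between 0 and {MAX_DEPTH}")

    payload = {"name": name, "urls": urls, "max_depth": max_depth}
    if description is not None:
        payload["description"] = description
    return payload
```

For example, `build_web_payload("Help Centre: Billing", ["https://help.example.com/billing"], max_depth=0)` produces a body that indexes only the given page.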

***

## Checking source status

### List all sources

<CodeGroup>
  ```bash curl theme={null}
  curl https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources \
    -H "Authorization: Bearer YOUR_API_KEY"
  ```

  ```python Python theme={null}
  import requests

  sources = requests.get(
      "https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources",
      headers={"Authorization": "Bearer YOUR_API_KEY"},
  ).json()

  for s in sources["data"]:
      print(f"{s['name']:40s}  {s['status']}")
  ```
</CodeGroup>

**Response:**

```json theme={null}
{
  "data": [
    {
      "id": "src_01J...",
      "kb_id": "kb_01J8X...",
      "name": "Product Manual v3",
      "kind": "document",
      "status": "ready",
      "created_at": "2025-09-12T10:05:00Z",
      "updated_at": "2025-09-12T10:06:30Z"
    },
    {
      "id": "src_02K...",
      "kb_id": "kb_01J8X...",
      "name": "Help Centre: Billing",
      "kind": "web",
      "status": "processing",
      "created_at": "2025-09-12T10:06:00Z",
      "updated_at": "2025-09-12T10:06:00Z"
    }
  ]
}
```

### Status meanings

| Status       | Meaning                                               |
| ------------ | ----------------------------------------------------- |
| `pending`    | Source has been received and is queued for processing |
| `processing` | Content is being extracted and indexed                |
| `ready`      | Indexed and available for search during calls         |
| `failed`     | Processing encountered an error — see below           |

Only `ready` sources contribute to your agent's knowledge. Sources in any other state are excluded from search.
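
Because sources move through `pending` and `processing` before they become searchable, a script that adds sources usually needs to wait for them to settle. The sketch below polls until every source is `ready` or `failed`. The function name, the polling interval, and the timeout are illustrative choices, not documented values. `fetch_sources` is any callable you supply that returns the `data` array from the list endpoint.

```python
import time

# "ready" and "failed" are the two states a source does not leave on its own.
TERMINAL_STATUSES = {"ready", "failed"}


def wait_for_sources(
    fetch_sources,
    interval_seconds=5.0,
    timeout_seconds=600.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll until every source is ready or failed, then return the final list.

    Raises TimeoutError if any source is still pending or processing
    once timeout_seconds have elapsed.
    """
    deadline = clock() + timeout_seconds
    while True:
        sources = fetch_sources()
        if all(s["status"] in TERMINAL_STATUSES for s in sources):
            return sources
        if clock() >= deadline:
            raise TimeoutError("Sources still pending or processing after timeout")
        sleep(interval_seconds)
```

A suitable `fetch_sources` wraps the list request shown above, for example `lambda: requests.get(url, headers=headers).json()["data"]`. Check the returned list for `failed` entries before relying on the knowledge base.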

### When a source fails

Common causes of a `failed` status:

| Cause                          | How to resolve                                                                        |
| ------------------------------ | ------------------------------------------------------------------------------------- |
| Unsupported file type          | Re-upload as PDF, DOCX, TXT, or Markdown                                              |
| URL blocked by `robots.txt`    | Contact the site owner, or upload the content as a TXT or Markdown document source    |
| URL requires login             | The page must be publicly accessible; upload the content as a document source instead |
| Empty or unextractable content | Verify the file has selectable text, or convert it to plain text and re-upload        |
| File too large                 | Split the document into files under 50 MB and add each as a separate source           |

To resubmit a failed source, delete it and add it again after resolving the underlying issue.
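
The delete-then-add flow for a failed source can be wrapped in one function. This is a sketch, not an official client: `resubmit_source` is a hypothetical helper, `http` is any object with requests-style `delete()` and `post()` methods (the `requests` module itself works), and it assumes the `kind` value returned by the list endpoint (`document` or `web`) matches the final path segment of the corresponding create endpoint, as it does in the examples on this page.

```python
BASE_URL = "https://api.truedy.ai/api/public/v1"


def resubmit_source(http, api_key, kb_id, source, payload):
    """Delete a failed source, then add it again with a corrected payload.

    source  -- one entry from the list endpoint's "data" array
    payload -- the corrected request body for the new source
    """
    if source["status"] != "failed":
        raise ValueError(f"Source {source['id']} is {source['status']}, not failed")

    headers = {"Authorization": f"Bearer {api_key}"}
    sources_url = f"{BASE_URL}/knowledge-bases/{kb_id}/sources"

    # A successful deletion returns 204 No Content.
    deleted = http.delete(f"{sources_url}/{source['id']}", headers=headers)
    if deleted.status_code != 204:
        raise RuntimeError(f"Delete failed with HTTP {deleted.status_code}")

    # Assumption: "kind" doubles as the endpoint suffix (document or web).
    created = http.post(f"{sources_url}/{source['kind']}", headers=headers, json=payload)
    created.raise_for_status()
    return created.json()["data"]
```

Calling `resubmit_source(requests, "YOUR_API_KEY", "KB_ID", failed_source, fixed_payload)` returns the new source, which starts its own processing cycle.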

***

## Updating sources

Sources do not have a dedicated update endpoint. To update a source, delete it and add a new one with the revised content, so that the index is rebuilt cleanly from the latest version. The replacement is a new source: like any other, it must reach `ready` status before its content is searchable.

<CodeGroup>
  ```bash curl theme={null}
  # 1. Delete the old source
  curl -X DELETE https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/SOURCE_ID \
    -H "Authorization: Bearer YOUR_API_KEY"

  # 2. Re-add with updated content (document example)
  curl -X POST https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/document \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "name": "Product Manual v4",
      "file": {
        "filename": "manual_v4.pdf",
        "content_type": "application/pdf",
        "data": "<BASE64_ENCODED_CONTENT>"
      }
    }'
  ```

  ```python Python theme={null}
  import base64
  import requests

  source_id = "src_01J..."

  # 1. Delete the old source
  deleted = requests.delete(
      f"https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/{source_id}",
      headers={"Authorization": "Bearer YOUR_API_KEY"},
  )
  deleted.raise_for_status()  # stop if the delete failed, to avoid duplicate sources

  # 2. Re-add with updated file
  with open("/path/to/manual_v4.pdf", "rb") as f:
      encoded = base64.b64encode(f.read()).decode()

  new_source = requests.post(
      "https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/document",
      headers={"Authorization": "Bearer YOUR_API_KEY"},
      json={
          "name": "Product Manual v4",
          "file": {
              "filename": "manual_v4.pdf",
              "content_type": "application/pdf",
              "data": encoded,
          },
      },
  ).json()
  print(f"Replacement source created: {new_source['data']['id']}")
  ```
</CodeGroup>

***

## Removing sources

Deleting a source removes it from the search index immediately. In-progress calls are not affected, but all subsequent calls will no longer have access to that content.

<CodeGroup>
  ```bash curl theme={null}
  curl -X DELETE https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/SOURCE_ID \
    -H "Authorization: Bearer YOUR_API_KEY"
  ```

  ```python Python theme={null}
  import requests

  response = requests.delete(
      "https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/SOURCE_ID",
      headers={"Authorization": "Bearer YOUR_API_KEY"},
  )
  if response.status_code == 204:
      print("Source deleted.")
  ```
</CodeGroup>

A successful deletion returns HTTP `204 No Content`.

***

## Best practices

**Keep sources focused.** One topic per source is better than a single giant document. Focused sources produce cleaner search results and make it easier to update individual pieces of content.

**Use descriptive names.** Names like `"Refund Policy — EU"` or `"Pricing: Pro Plan"` are far easier to debug than `"doc1"` or `"untitled"`.

**Verify web pages are publicly accessible.** Before adding a web source, open the URL in a private/incognito browser tab. If you see a login screen or a 4xx error, Truedy will not be able to crawl it — use a document source instead.

**Test with your agent.** After all sources reach `ready` status, use the agent test panel in the Truedy dashboard and ask questions that should be answered from the KB. If the agent gives a vague or incorrect answer, check that the relevant source content is specific and unambiguous.

**Avoid duplicate content.** If the same information appears in multiple sources, the agent may retrieve conflicting passages. Consolidate overlapping content into a single authoritative source.
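
One inexpensive check before uploading a batch of documents is to look for byte-identical files. The sketch below groups local files by SHA-256 digest. `find_duplicate_files` is a hypothetical helper, and it only catches exact copies: two differently worded documents covering the same topic will not be flagged and still need manual review.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(paths):
    """Return groups of paths whose file contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        by_digest[digest].append(str(path))
    # Keep only digests shared by more than one file.
    return [group for group in by_digest.values() if len(group) > 1]
```

Each returned group lists files with the same content, so only one file from each group needs to be added as a source.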

<Warning>
  Deleting a knowledge base deletes all of its sources permanently. This cannot be undone. Detach the KB from all agents before deleting it to avoid disrupting live calls.
</Warning>