Sources are the individual documents inside a knowledge base. This guide covers the full lifecycle: adding sources of every type, monitoring processing status, handling failures, and removing outdated content.
Before adding sources you need an existing knowledge base. See Knowledge Bases if you haven’t created one yet.
Adding sources
Truedy supports two source types: document (a file you upload) and web (URLs you want crawled).
Document source
Upload a PDF, DOCX, TXT, or Markdown file. The file must be base64-encoded in the request body. Maximum file size is 50 MB.
# Encode your file first, stripping any line wraps so the output is one JSON-safe string:
#   base64 -i manual.pdf | tr -d '\n' > manual_b64.txt
# Then paste the base64 string into the "data" field.
curl -X POST https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/document \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Product Manual v3",
"file": {
"filename": "manual.pdf",
"content_type": "application/pdf",
"data": "<BASE64_ENCODED_CONTENT>"
}
}'
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Display name for this source |
| description | string | No | Optional description |
| file.filename | string | Yes | Original filename including extension (e.g. manual.pdf) |
| file.content_type | string | Yes | MIME type: application/pdf, text/plain, application/vnd.openxmlformats-officedocument.wordprocessingml.document, or text/markdown |
| file.data | string | Yes | Base64-encoded file content |
PDFs with selectable text index best. Scanned image-only PDFs produce poor results. Where possible, export text-based PDFs or convert to plain text.
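The encode-and-upload steps above can be wrapped in a small script. What follows is a minimal sketch, not an official client: `content_type_for` and `build_document_payload` are hypothetical helper names, the extension-to-MIME mapping simply mirrors the table above, and the 50 MB limit is assumed to mean 50 × 1024 × 1024 bytes of the original (unencoded) file. The body is written to a file so that a large base64 string never has to fit on the command line.

```shell
#!/usr/bin/env bash
# Sketch: prepare a document-source request body. Helper names are illustrative.

# Assumption: "50 MB" means 50 * 1024 * 1024 bytes of the unencoded file.
MAX_BYTES=$((50 * 1024 * 1024))

# Map a filename extension to one of the MIME types listed in the table above.
content_type_for() {
  case "${1##*.}" in
    pdf)  echo "application/pdf" ;;
    txt)  echo "text/plain" ;;
    docx) echo "application/vnd.openxmlformats-officedocument.wordprocessingml.document" ;;
    md)   echo "text/markdown" ;;
    *)    return 1 ;;
  esac
}

# Usage: build_document_payload FILE NAME OUT
# Writes the JSON request body for FILE to OUT. NAME and the filename must not
# contain double quotes or backslashes: this sketch does no JSON escaping.
build_document_payload() {
  local file=$1 name=$2 out=$3 ctype size
  ctype=$(content_type_for "$file") || { echo "unsupported file type: $file" >&2; return 1; }
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "file exceeds 50 MB: $file" >&2
    return 1
  fi
  {
    printf '{"name": "%s", "file": {"filename": "%s", "content_type": "%s", "data": "' \
      "$name" "$(basename "$file")" "$ctype"
    base64 < "$file" | tr -d '\n'   # remove line wraps so the string is valid inside JSON
    printf '"}}\n'
  } > "$out"
}

# Example (not executed here):
#   build_document_payload manual.pdf "Product Manual v3" payload.json &&
#   curl -X POST https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/document \
#     -H "Authorization: Bearer YOUR_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d @payload.json
```

Passing the body with `-d @payload.json` makes curl read it from the file, which avoids shell argument-length limits for large uploads.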
Web source
Truedy crawls the provided URLs, extracts text, and indexes the content. Pages must be publicly accessible without authentication.
curl -X POST https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/web \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Help Centre: Billing",
"urls": ["https://help.example.com/billing"],
"max_depth": 1
}'
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Display name for this source |
| description | string | No | Optional description |
| urls | array of strings | Yes | One or more URLs to crawl |
| max_depth | integer | No | Link-follow depth. 0 = index only the given URLs. 1 = also follow links from those pages. Max 3. Default 1. |
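Since max_depth only accepts values from 0 to 3, it can be worth validating it before sending the request. The sketch below builds a web-source body from a name, a depth, and one or more URLs. `build_web_payload` is a hypothetical helper, and it assumes the name and URLs contain no double quotes or backslashes, because it does no JSON escaping.

```shell
#!/usr/bin/env bash
# Sketch: build a web-source request body. The helper name is illustrative.

# Usage: build_web_payload NAME MAX_DEPTH URL [URL...]
# Prints the JSON body on stdout, or fails if MAX_DEPTH is outside 0..3.
build_web_payload() {
  local name=$1 depth=$2 urls="" url
  shift 2
  case "$depth" in
    0|1|2|3) ;;
    *) echo "max_depth must be an integer from 0 to 3" >&2; return 1 ;;
  esac
  [ "$#" -ge 1 ] || { echo "at least one URL is required" >&2; return 1; }
  for url in "$@"; do
    # Append each URL as a quoted, comma-separated JSON array element.
    urls="${urls:+$urls, }\"$url\""
  done
  printf '{"name": "%s", "urls": [%s], "max_depth": %s}\n' "$name" "$urls" "$depth"
}

# Example (not executed here):
#   build_web_payload "Help Centre: Billing" 1 https://help.example.com/billing |
#   curl -X POST https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/web \
#     -H "Authorization: Bearer YOUR_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d @-
```

With `-d @-`, curl reads the request body from standard input, so the payload can be piped straight in.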
Checking source status
List all sources
curl https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources \
-H "Authorization: Bearer YOUR_API_KEY"
Response
{
  "data": [
    {
      "id": "src_01J...",
      "kb_id": "kb_01J8X...",
      "name": "Product Manual v3",
      "kind": "document",
      "status": "ready",
      "created_at": "2025-09-12T10:05:00Z",
      "updated_at": "2025-09-12T10:06:30Z"
    },
    {
      "id": "src_02K...",
      "kb_id": "kb_01J8X...",
      "name": "Help Centre: Billing",
      "kind": "web",
      "status": "processing",
      "created_at": "2025-09-12T10:06:00Z",
      "updated_at": "2025-09-12T10:06:00Z"
    }
  ]
}
Status meanings
| Status | Meaning |
|---|---|
| pending | Source has been received and is queued for processing |
| processing | Content is being extracted and indexed |
| ready | Indexed and available for search during calls |
| failed | Processing encountered an error; see "When a source fails" below |
Only ready sources contribute to your agent’s knowledge. Sources in any other state are excluded from search.
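Because only ready sources are searchable, a deployment script may want to wait until processing finishes. The following is a rough sketch of such a polling loop. The helper names, the 10-second interval, and the default of 30 attempts are arbitrary choices, not documented values. Statuses are extracted with grep and sed for portability, which relies on the response shape shown above; a proper JSON parser would be more robust.

```shell
#!/usr/bin/env bash
# Sketch: wait until every source in a knowledge base is ready.
# All function names here are illustrative, not part of any Truedy tooling.

# Read a list-sources response on stdin and print each status, one per line.
list_statuses() {
  grep -o '"status"[[:space:]]*:[[:space:]]*"[a-z]*"' | sed 's/.*"\([a-z]*\)"$/\1/'
}

# Succeed only if stdin contains at least one source and all are "ready".
all_ready() {
  local statuses
  statuses=$(list_statuses)
  [ -n "$statuses" ] && ! printf '%s\n' "$statuses" | grep -qv '^ready$'
}

# Usage: wait_for_ready KB_ID API_KEY [ATTEMPTS]
# Returns 0 when all sources are ready, 2 if any source failed, 1 on timeout or request error.
wait_for_ready() {
  local kb_id=$1 api_key=$2 attempts=${3:-30} body i=0
  while [ "$i" -lt "$attempts" ]; do
    body=$(curl -fsS "https://api.truedy.ai/api/public/v1/knowledge-bases/$kb_id/sources" \
      -H "Authorization: Bearer $api_key") || return 1
    if printf '%s' "$body" | list_statuses | grep -q '^failed$'; then
      echo "a source failed processing" >&2
      return 2
    fi
    printf '%s' "$body" | all_ready && return 0
    sleep 10   # arbitrary polling interval
    i=$((i + 1))
  done
  echo "timed out waiting for sources" >&2
  return 1
}

# Example (not executed here):
#   wait_for_ready KB_ID YOUR_API_KEY && echo "knowledge base is ready"
```

Stopping early on a failed status avoids waiting out the full timeout for a source that will never become ready.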
When a source fails
Common causes of a failed status:
| Cause | How to resolve |
|---|---|
| Unsupported file type | Re-upload as PDF, DOCX, TXT, or Markdown |
| URL blocked by robots.txt | Contact the site owner, or save the content to a file and add it as a document source |
| URL requires login | The page must be publicly accessible; otherwise, save the content to a file and add it as a document source |
| Empty or unextractable content | Verify the file has selectable text, or convert the content to plain text and upload it as a document source |
| File too large | Split the document into smaller files (each no larger than 50 MB) and add each as a separate source |
To resubmit a failed source, delete it and add it again after resolving the underlying issue.
Updating sources
Sources do not have a dedicated update endpoint. The recommended approach is to delete the existing source and add a new one with the revised content. This ensures the index is rebuilt cleanly from the latest version.
# 1. Delete the old source
curl -X DELETE https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/SOURCE_ID \
-H "Authorization: Bearer YOUR_API_KEY"
# 2. Re-add with updated content (document example)
curl -X POST https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/document \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Product Manual v4",
"file": {
"filename": "manual_v4.pdf",
"content_type": "application/pdf",
"data": "<BASE64_ENCODED_CONTENT>"
}
}'
Removing sources
Deleting a source removes it from the search index immediately. In-progress calls are not affected, but all subsequent calls will no longer have access to that content.
curl -X DELETE https://api.truedy.ai/api/public/v1/knowledge-bases/KB_ID/sources/SOURCE_ID \
-H "Authorization: Bearer YOUR_API_KEY"
A successful deletion returns HTTP 204 No Content.
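In a script, the 204 response can be checked explicitly, which is useful before re-adding a source as part of an update. This is a minimal sketch with hypothetical helper names (`deletion_succeeded`, `delete_source`). It uses curl's `-w '%{http_code}'` option to capture the status code and treats anything other than 204 as a failure.

```shell
#!/usr/bin/env bash
# Sketch: delete a source and confirm the documented 204 response.
# Function names are illustrative.

# Succeed only for the status code this guide documents for a successful delete.
deletion_succeeded() {
  [ "$1" = "204" ]
}

# Usage: delete_source KB_ID SOURCE_ID API_KEY
delete_source() {
  local kb_id=$1 source_id=$2 api_key=$3 code
  # -o /dev/null discards the body; -w prints only the HTTP status code.
  code=$(curl -s -o /dev/null -w '%{http_code}' -X DELETE \
    "https://api.truedy.ai/api/public/v1/knowledge-bases/$kb_id/sources/$source_id" \
    -H "Authorization: Bearer $api_key")
  if deletion_succeeded "$code"; then
    echo "deleted $source_id"
  else
    echo "delete failed for $source_id (HTTP $code)" >&2
    return 1
  fi
}

# Example (not executed here):
#   delete_source KB_ID SOURCE_ID YOUR_API_KEY
```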
Best practices
Keep sources focused. One topic per source is better than a single giant document. Focused sources produce cleaner search results and make it easier to update individual pieces of content.
Use descriptive names. Names like "Refund Policy — EU" or "Pricing: Pro Plan" are far easier to debug than "doc1" or "untitled".
Verify web pages are publicly accessible. Before adding a web source, open the URL in a private/incognito browser tab. If you see a login screen or a 4xx error, Truedy will not be able to crawl it — use a document source instead.
Test with your agent. After all sources reach ready status, use the agent test panel in the Truedy dashboard and ask questions that should be answered from the KB. If the agent gives a vague or incorrect answer, check that the relevant source content is specific and unambiguous.
Avoid duplicate content. If the same information appears in multiple sources, the agent may retrieve conflicting passages. Consolidate overlapping content into a single authoritative source.
Deleting a knowledge base deletes all of its sources permanently. This cannot be undone. Detach the KB from all agents before deleting it to avoid disrupting live calls.