Latency with Files/Knowledge base Vapi AI #support


Elastic

04/02/2025, 10:32 AM

Any ideas how I can improve latency when using the Files/Knowledge base docs/PDFs? I currently have a 500 KB file attached to my Dementia agent. Also, does the agent reference all the docs in the Files folder? I noticed I can only add one file to my agent, but I'm able to add multiple files to the Files folder (what is the purpose of that?). I have also noticed there is a file size restriction of 1MB, which again points to potential latency issues. Is there a way to use multiple files as reference docs and somehow improve the latency between conversation responses? Thanks! https://cdn.discordapp.com/attachments/1356939261856710786/1356939262041395230/Screen_Shot_2025-04-02_at_6.28.37_AM.png?ex=67ee63aa&is=67ed122a&hm=15463e8461a95f1d76fc196f52f70d5f528f2277a9fd6296e88931a8a7118a98&

Vapi Support Bot

04/02/2025, 10:33 AM

Hi there! 👋 I'm unable to provide specific information about whether the agent references all docs in the Files folder or why you can only add one file to your agent while adding multiple to the Files folder. However, I can definitely help with improving latency when using Files/Knowledge base!

## Improving Latency with Files/Knowledge Base

Here are several ways to improve latency when using Files/Knowledge base in Vapi:

### 1. File Size Optimization

The 1MB warning you're seeing is important! According to the [documentation](https://docs.vapi.ai/knowledge-base), exceeding 1MB in combined file size "may significantly affect call latency and user experience." Since you mentioned your current file is 500KB, you're halfway to that threshold.

### 2. Content Chunking

Breaking down information into smaller, digestible pieces can help. According to [information I found](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1336243643043938395), you can customize how files are ingested and chunked with these parameters:

- `splitDelimiters` (default: `[.!?\\n]`)
- `targetSplitsPerChunk` (default: 20 splits per chunk)
- `rebalanceChunks` (set to `true` to evenly distribute content)

### 3. File Format Considerations

Vapi supports Markdown, PDF, plain text, and Microsoft Word formats. Some formats may process more efficiently than others.

### 4. Using Multiple Files

For using multiple files as reference docs, you'll need to use the Vapi API. According to [multiple sources](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1336909771290185802), Vapi has migrated from Canonical to Trieve, and you now need to attach Knowledge Bases using the API.

### 5. Knowledge Base Organization

Consider implementing:

- Categorization and tagging for quicker retrieval
- A robust indexing system
- Regular content reviews to remove outdated information

### 6. Consider Trieve Integration

Vapi Support Bot

04/02/2025, 10:33 AM

You can create a knowledge base on Trieve and use their API to dynamically manage chunks, potentially improving performance according to [this information](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1339000190488543273).

Is there anything specific from these strategies you'd like me to elaborate on?

Sources:

- [Ways to improve latency when using Files/Knowledge base in Vapi](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1336243643043938395)
- [Impact of file size on latency in Vapi Files/Knowledge base](https://docs.vapi.ai/knowledge-base)
- [Methods for optimizing files to reduce latency in Vapi Files/Knowledge base](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1336909771290185802)
- [Using multiple files as reference documents in Vapi agents](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1339000190488543273)
- [File size restriction of 1MB in Vapi Files/Knowledge base](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1336674769055252594)
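To make the chunking parameters named above (`splitDelimiters`, `targetSplitsPerChunk`, `rebalanceChunks`) more concrete, here is a minimal local sketch of what they plausibly control. This is an assumption-based illustration in plain Python, not Vapi's or Trieve's actual implementation; the function names `split_text` and `chunk_splits` are hypothetical.

```python
import re


def split_text(text, split_delimiters=r"[.!?\n]"):
    """Split text on the delimiter pattern, dropping empty pieces."""
    return [s.strip() for s in re.split(split_delimiters, text) if s.strip()]


def chunk_splits(splits, target_splits_per_chunk=20, rebalance_chunks=True):
    """Group splits into chunks of roughly target_splits_per_chunk each.

    Assumed behavior: with rebalance_chunks=True the splits are spread
    evenly across chunks; otherwise the last chunk takes the remainder.
    """
    if not splits:
        return []
    # Ceiling division: number of chunks needed to hold every split.
    n_chunks = -(-len(splits) // target_splits_per_chunk)
    if rebalance_chunks:
        base, extra = divmod(len(splits), n_chunks)
        sizes = [base + 1 if i < extra else base for i in range(n_chunks)]
    else:
        sizes = [target_splits_per_chunk] * (n_chunks - 1)
        sizes.append(len(splits) - target_splits_per_chunk * (n_chunks - 1))
    chunks, start = [], 0
    for size in sizes:
        chunks.append(splits[start:start + size])
        start += size
    return chunks
```

Under these assumptions, 45 sentences with a target of 20 would become three chunks of 15 when rebalanced, or chunks of 20, 20, and 5 when not. Smaller, more even chunks are the kind of tuning the thread suggests may help retrieval.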

Sahil

04/02/2025, 1:44 PM

Hey Elastic, can you tell me more about the types of files you're using and the content in them? Apart from this, you can attach multiple files to an assistant for use alongside the KB, but the total file size must be less than 1MB for better performance and reduced latency. Lastly, to help track down this issue, could you share:

- The call ID
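A quick way to check the combined-size guidance before uploading is to sum the file sizes locally. The sketch below is an illustration only: it assumes "1MB" means 1,000,000 bytes (the exact definition is not stated in this thread), and the helper names are hypothetical.

```python
import os

# Assumption: "1MB" is read as 10**6 bytes; adjust if the platform counts differently.
MAX_COMBINED_BYTES = 1_000_000


def combined_size(paths):
    """Return the total size in bytes of all files in paths."""
    return sum(os.path.getsize(p) for p in paths)


def within_limit(paths, limit=MAX_COMBINED_BYTES):
    """True if the combined size is strictly less than the limit."""
    return combined_size(paths) < limit
```

For example, `within_limit(["dementia_guide.pdf", "faq.md"])` (hypothetical file names) would report whether those two files together stay under the threshold.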

Elastic

04/02/2025, 2:18 PM

Dementia Agent Call ID: c9f5c59d-09bb-4565-be1d-7b45c66d90ba (this is the urgent one) Thanks!

Elastic

04/02/2025, 2:26 PM

By the way, the other agent (Cleaning Booking services) is working fine; I'm just not sure what's happening with the Dementia one. Call ID for the cleaning service: e94158ed-ce21-47e0-83b4-fda91d6c1da3. Maybe there is some correlation, just hoping. Thanks.

Sahil

04/07/2025, 4:18 PM

Hey Elastic, first of all, apologies for the delay. I was out of office for a couple of days. I'd suggest you check both call IDs again, as I don't see the KB being used in either of the shared calls.

Sahil

04/07/2025, 4:19 PM

Hypothetically, if you cannot use bigger files with Vapi, then you have to use a provider such as Trieve, where you can use bigger files, and then trigger the KB through tools only. That way you avoid hallucinations, and your tool calls keep working even when it's a transfer tool call or tool calling within squads.
