Very strange tokenization issue in non-English rep...
# support
Could this be caused by a Vapi setting, or something like that?
I have run some more tests and can confirm that this setting is the cause: "voice": {"chunkPlan": {"enabled": False}}
With chunking disabled, we get spaces between chunks in non-English responses, for example in Italian.
With English responses, however, everything works properly.
@Shubham Bajaj @User are you aware of this strange behaviour? Maybe disabling it also removes basic polishing steps, such as trimming whitespace from chunks?
The chunkPlan.enabled setting controls how the text is processed before being sent to the voice provider. When disabled, it bypasses several important text processing steps, including:
1. Basic text formatting
2. Whitespace handling
3. Chunk boundary detection based on punctuation
interface ChunkPlan {
  /**
   * This determines whether the model output is chunked before being sent to the voice provider. Default true.
   *
   * Usage:
   * - To rely on the voice provider's audio generation logic, set this to false.
   * - If seeing issues with quality, set this to true.
   *
   * If disabled, Vapi-provided audio control tokens like <flush /> will not work.
   * @default true
   */
  enabled?: boolean;
  /**
   * This is the minimum number of characters in a chunk.
   *
   * Usage:
   * - To increase quality, set this to a higher value.
   * - To decrease latency, set this to a lower value.
   *
   * @default 30
   */
  minCharacters?: number;
  /**
   * These are the punctuations that are considered valid boundaries for a chunk to be created.
   *
   * Usage:
   * - To increase quality, constrain to fewer boundaries.
   * - To decrease latency, enable all.
   *
   * Default is automatically set to balance the trade-off between quality and latency based on the provider.
   */
  punctuationBoundaries?: Punctuation[];
  /**
   * This is the plan for formatting the chunk before it is sent to the voice provider.
   */
  formatPlan?: FormatPlan;
}
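To make the two tuning knobs concrete, here is a minimal Python sketch of punctuation-based chunking with a minimum chunk length. It is an illustration only, not Vapi's actual implementation: the function name `chunk_stream`, the default boundary set, and the whitespace normalization step are all assumptions. Only the 30-character default comes from the type definition above.

```python
import re
from typing import Iterable, Iterator, Sequence

# Hypothetical boundary set for illustration; the real default depends on the provider.
DEFAULT_BOUNDARIES = (".", "!", "?", ";")


def chunk_stream(
    tokens: Iterable[str],
    min_characters: int = 30,
    punctuation_boundaries: Sequence[str] = DEFAULT_BOUNDARIES,
) -> Iterator[str]:
    """Group streamed text fragments into chunks that end at a punctuation boundary."""
    boundaries = tuple(punctuation_boundaries)
    buffer = ""
    for token in tokens:
        buffer += token
        # Assumed polishing step: collapse whitespace runs and trim both ends.
        candidate = re.sub(r"\s+", " ", buffer).strip()
        # Emit only when the chunk is long enough AND ends on a boundary character.
        if len(candidate) >= min_characters and candidate.endswith(boundaries):
            yield candidate
            buffer = ""
    # Flush whatever is left once the stream ends.
    tail = re.sub(r"\s+", " ", buffer).strip()
    if tail:
        yield tail


tokens = ["Ciao,", "  come", " stai", " oggi?", " Spero", " tutto", " bene", " con", " il", " lavoro."]
for chunk in chunk_stream(tokens, min_characters=10):
    print(repr(chunk))
```

In this sketch a lower `min_characters` emits the first sentence as soon as its question mark arrives (lower latency), while the default of 30 waits and sends both sentences as one chunk (more context for the voice provider).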
For non-English responses, keeping chunkPlan.enabled: true is particularly important because it helps with:
- Proper sentence boundary detection using language-appropriate punctuation
- Whitespace normalization between chunks
- Improved prosody and natural speech flow
Keep chunkPlan.enabled: true for all languages, especially non-English ones. If you're experiencing specific issues with chunk sizes or latency, you can tune these parameters instead:
{
  "voice": {
    "chunkPlan": {
      "enabled": true,
      "minCharacters": 30, // Adjust between 1-80
      "punctuationBoundaries": [".", "!", "?", ";"] // Customize based on your needs
    }
  }
}
This maintains the text processing benefits while giving you control over chunk behavior.
The reason English might work better with enabled: false is that many voice providers have better built-in handling for English text, but for other languages, Vapi's chunk processing helps ensure consistent quality.
Let me know if you'd like to explore specific chunking configurations for your Italian use case!
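Since the original setting was written as a Python dict, the same fragment can be built in Python, which also avoids the `//` comments that strict JSON does not allow. This is a rough sketch: `build_voice_config` is a hypothetical helper, and the 1-80 range check simply mirrors the range mentioned in the example above rather than anything verified against the API.

```python
import json
from typing import Any, Dict, Optional, Sequence


def build_voice_config(
    min_characters: int = 30,
    punctuation_boundaries: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Return a `voice` fragment that keeps chunking enabled (hypothetical helper)."""
    # Range taken from the "Adjust between 1-80" note above; treat it as an assumption.
    if not 1 <= min_characters <= 80:
        raise ValueError("min_characters must be between 1 and 80")
    chunk_plan: Dict[str, Any] = {"enabled": True, "minCharacters": min_characters}
    # Leave punctuationBoundaries out unless set, so the provider-based default applies.
    if punctuation_boundaries is not None:
        chunk_plan["punctuationBoundaries"] = list(punctuation_boundaries)
    return {"voice": {"chunkPlan": chunk_plan}}


# Serialize to comment-free JSON that can be merged into an assistant payload.
print(json.dumps(build_voice_config(30, [".", "!", "?", ";"]), indent=2))
```

Omitting `punctuationBoundaries` by default is a deliberate choice in this sketch, because the type definition says the default is chosen per provider.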