What is Vapi's structure mentioned in Documentation...
# support
v
@Pigeon I apologize for any confusion caused by the missing information. The expected structure for integrating a custom LLM with VAPI should be detailed in the documentation, but if it's not there or not clear, I can certainly help clarify. Typically, the expected structure would define how your custom LLM should communicate with VAPI, including the request and response formats. This would involve specifying the API endpoints, the required headers, the body of the requests, and the expected responses. Since the specific structure isn't provided in the link you mentioned, I recommend checking other sections of the VAPI documentation for a detailed guide on the expected structure. If the information is still not available, please reach out to support for more detailed guidance on this matter. They should be able to provide you with the necessary structure for your custom LLM integration.
You can continue this conversation by mentioning me in the message.
Got it. Tagging @User for additional help.
p
My FastAPI endpoint code:
```python
import time
from typing import Any
from fastapi import APIRouter

router = APIRouter()


@router.post("/completions", status_code=200)
def respond(payload: dict) -> Any:
    """
    Endpoint for receiving responses without authentication.
    """
    user_messages = [msg for msg in payload["messages"] if msg["role"] == "user"]
    if user_messages:
        print(f"User content: {user_messages[-1]['content']}")
        print("--------------------------------")

    response = {
        "id": "chatcmpl-8mcLf78g0quztp4BMtwd3hEj58Uof",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": "gpt-3.5-turbo-0613",
        "system_fingerprint": None,
        "choices": [
            {
                "index": 0,
                "delta": {"content": "Hi, how are you doing?"},
                "logprobs": None,
                "finish_reason": "stop",
            }
        ],
    }

    print("--------------------------------")
    print(response)

    return response
```
I don't hear my hardcoded message on the call.
So, what's the correct response structure I have to send to Vapi for it to create a bot message and TTS it?
Call ID: c4599cea-b165-416c-828f-69c393cbf359
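For reference, the hardcoded body above mixes two OpenAI shapes: `delta` belongs to streaming `chat.completion.chunk` objects, while a non-streaming `chat.completion` carries the text under `message`. For comparison, the non-streaming shape looks roughly like this (a sketch of the OpenAI format, not anything Vapi-specific):
```json
{
  "id": "chatcmpl-8mcLf78g0quztp4BMtwd3hEj58Uof",
  "object": "chat.completion",
  "created": 1732814021,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hi, how are you doing?"},
      "finish_reason": "stop"
    }
  ]
}
```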
Also tried piping through Groq, which is supposed to serve completions in the same format as OpenAI:
```python
from groq import Groq

client = Groq()  # assumed: GROQ_API_KEY set in the environment


@router.post("/completions", status_code=200)
def respond(payload: dict) -> Any:
    """
    Endpoint for receiving responses without authentication.
    """
    user_messages = [msg for msg in payload["messages"] if msg["role"] == "user"]
    if user_messages:
        print(f"User content: {user_messages[-1]['content']}")
        print("--------------------------------")

    chat_completion = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Hi, how are ya?",
            }
        ],
        model="llama3-8b-8192",
    )

    print("----------------------------")
    print(chat_completion.choices[0].message.content)
    print(chat_completion.model_dump())

    return chat_completion.model_dump()
```
Groq responded with:
```
2024-11-28 19:13:39 Hi! I'm just a language model, I don't have feelings like humans do, but I'm here to help you with any questions or tasks you have! How can I assist you today?
2024-11-28 19:13:39 {'id': 'chatcmpl-9d94d051-66b6-4f04-b1f8-2a59d74af156', 'choices': [{'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'content': "Hi! I'm just a language model, I don't have feelings like humans do, but I'm here to help you with any questions or tasks you have! How can I assist you today?", 'role': 'assistant', 'function_call': None, 'tool_calls': None}}], 'created': 1732814021, 'model': 'llama3-8b-8192', 'object': 'chat.completion', 'system_fingerprint': 'fp_179b0f92c9', 'usage': {'completion_tokens': 41, 'prompt_tokens': 16, 'total_tokens': 57, 'completion_time': 0.034166667, 'prompt_time': 0.000146059, 'queue_time': 0.013842211, 'total_time': 0.034312726}, 'x_groq': {'id': 'req_01jdsw797beewax7v08dgv2arz'}}
```
But yet again I can't hear the message Groq responded with TTSed, nor do I get a message in the Vapi log.
Call ID: c7867d0e-0963-4ad0-aca0-c79ea407cfdb
I'm really lost on how this is supposed to work... I'd expect
Hi! I'm just a language model, I don't have feelings like humans do, but I'm here to help you with any questions or tasks you have! How can I assist you today?
to be TTSed in a phone call. But nothing happens...
The log shows no errors either...
@User
My ngrok seems to show the response was sent with 200:
```http
HTTP/1.1 200 OK
Server: uvicorn
Content-Length: 744
Content-Type: application/json
Date: Thu, 28 Nov 2024 17:49:37 GMT
```
With body:
```json
{
  "id": "chatcmpl-61a93f6f-2e8a-4632-a965-c5a1e7ad8310",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Hi! I'm just a language model, so I don't have emotions or feelings like humans do, but I'm \"functioning\" well and ready to help you with any questions or tasks you have! How about you? How's your day going?",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1732816180,
  "model": "llama3-8b-8192",
  "object": "chat.completion",
  "system_fingerprint": "fp_a97cfe35ae",
  "usage": {
    "completion_tokens": 53,
    "prompt_tokens": 16,
    "total_tokens": 69,
    "completion_time": 0.044166667,
    "prompt_time": 0.002902825,
    "queue_time": 0.012564495,
    "total_time": 0.047069492
  },
  "x_groq": {
    "id": "req_01jdsy94smfm79ppb8fpk248xa"
  }
}
```
@User @User so how long does it take to get actual support here?
s
@Pigeon here's how you need to use custom-llm with Vapi: https://github.com/VapiAI/advanced-concepts-custom-llm
p
I'm not using OpenAI
I need a response structure that works
s
We parse all the different LLM responses into the OpenAI Messages format. You just need to stream the response back to Vapi and follow the OpenAI Messages structure.
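To make that concrete: assuming Vapi consumes the standard OpenAI streaming wire format (the reply above says to follow the OpenAI structure, and the linked repo streams), each server-sent event carries a `chat.completion.chunk` whose text sits under `delta`, and the stream ends with a `[DONE]` sentinel. A rough sketch of what goes over the wire (ids and timestamps are placeholders):
```
data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "created": 1732814021, "model": "llama3-8b-8192", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hi, "}, "finish_reason": null}]}

data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "created": 1732814021, "model": "llama3-8b-8192", "choices": [{"index": 0, "delta": {"content": "how are ya?"}, "finish_reason": null}]}

data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "created": 1732814021, "model": "llama3-8b-8192", "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}

data: [DONE]
```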
@Pigeon what did you find wrong here? I wanted to understand your POV to help you better.
This is very old
and not working
This example isn't streaming
And now you're saying that I must stream, so what's the actual truth?
s
@Pigeon Yes, you're correct, the docs are outdated, and you need to stream the response back to Vapi. Sorry for the inconvenience caused to you.
p
So maybe update them? People come, try to implement it, and waste many hours on something that is completely false...
Thanks, both OpenAI and Groq work with the new code you provided (https://github.com/VapiAI/advanced-concepts-custom-llm) using AsyncOpenAI and AsyncGroq.
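For anyone who lands on this thread later, here is a minimal sketch of the streaming version that worked, modeled on the pattern in the linked repo rather than copied from it. It assumes the `groq` SDK's `AsyncGroq` client, FastAPI's `StreamingResponse`, and GROQ_API_KEY set in the environment; the `event_stream` helper name is just illustrative:
```python
import json
from typing import Any

from fastapi import APIRouter
from fastapi.responses import StreamingResponse
from groq import AsyncGroq

router = APIRouter()
client = AsyncGroq()  # reads GROQ_API_KEY from the environment


@router.post("/completions", status_code=200)
async def respond(payload: dict) -> Any:
    """Stream OpenAI-format chunks back to Vapi as server-sent events."""

    async def event_stream():
        # Forward the conversation Vapi sent us straight to the model.
        stream = await client.chat.completions.create(
            messages=payload["messages"],
            model="llama3-8b-8192",
            stream=True,
        )
        async for chunk in stream:
            # Each chunk is already an OpenAI-style chat.completion.chunk.
            yield f"data: {json.dumps(chunk.model_dump())}\n\n"
        yield "data: [DONE]\n\n"

    return StreamingResponse(event_stream(), media_type="text/event-stream")
```
The key difference from the earlier attempts is that the completion is forwarded chunk by chunk as it arrives, rather than returned as a single JSON body once the model finishes.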
s
@Pigeon checking if this ticket can be closed?
p
Yes, thank you