AI Server provides a unified API to process requests for AI services to access LLMs, Image Generation, Transcription, and more. The API is designed to be simple to use and easy to integrate into your applications providing many supported languages and frameworks.
Chat UI​
AI Server's Chat UI lets upi send Open AI Chat requests with custom system prompts to any of its active LLMs:
https://localhost:5006/Chat
Making a Chat Request​
To make a chat request to AI Server, you can use the /api/OpenAiChatCompletion
endpoint. This endpoint requires a OpenAiChatCompletion
request DTO that contains a property matching the OpenAI
API.
Sync Open AI Chat Completion​
var client = GetLocalApiClient(AiServerUrl);
var response = client.Post(new OpenAiChatCompletion {
Model = "llama3.1:8b",
Messages =
[
new() { Role = "system", Content = "You are a helpful AI assistant." },
new() { Role = "user", Content = "How do LLMs work?" }
],
MaxTokens = 50
});
var answer = response.Choices[0].Message.Content;
This request will generate a response from the llama3:8b
model using the system
and user
messages provided. This will perform the operation synchronously, waiting for the response to be generated before returning it to the client.
Alternatively, you can call the same endpoint asynchronously by using the /api/QueueOpenAiChatCompletion
endpoint. This will queue the request for processing and return a URL to check the status of the request and download the response when it's ready.
Queued Open AI Chat Completion​
var client = GetLocalApiClient(AiServerUrl);
var response = client.Post(new QueueOpenAiChatCompletion
{
Request = new()
{
Model = "gpt-4-turbo",
Messages =
[
new() { Role = "system", Content = "You are a helpful AI assistant." },
new() { Role = "user", Content = "How do LLMs work?" }
],
MaxTokens = 50
},
});
// Poll for Job Completion Status
GetOpenAiChatStatusResponse status = new();
while (status.JobState is BackgroundJobState.Started or BackgroundJobState.Queued)
{
status = await client.GetAsync(new GetOpenAiChatStatus { RefId = response.RefId });
await Task.Delay(1000);
}
var answer = status.Result.Choices[0].Message.Content;
Additional optional features on the request to enhance the usage of AI Server include:
- RefId: A unique identifier for the request specified by the client to more easily track the progress of the request.
- Tag: A tag to help categorize the request for easier tracking.
RefId
and Tag
are available on both synchronous and asynchronous requests, where as Queue requests also support:
- ReplyTo: A URL to send a POST request to when the request is complete.
Open AI Chat with ReplyTo Callback​
The Queued API also accepts a ReplyTo Web Callback for a more reliable push-based App integration where responses are posted back to a custom URL Endpoint:
var correlationId = Guid.NewGuid().ToString("N");
var response = client.Post(new QueueOpenAiChatCompletion
{
//...
ReplyTo = $"https://example.org/api/OpenAiChatResponseCallback?CorrelationId=${correlationId}"
});
Your callback can add any additional metadata on the callback to assist your App in correlating the response with
the initiating request which just needs to contain the properties of the OpenAiChatResponse
you're interested in
along with any metadata added to the callback URL, e.g:
public class OpenAiChatResponseCallback : IPost, OpenAiChatResponse, IReturnVoid
{
public Guid CorrelationId { get; set; }
}
public object Post(OpenAiChatResponseCallback request)
{
// Handle OpenAiChatResponse callabck
}
Unless your callback API is restricted to only accept requests from your AI Server, you should include a
unique Id like a Guid
in the callback URL that can be validated against an initiating request to ensure
the callback can't be spoofed.
Using the AI Server Request DTOs with other OpenAI compatible APIs​
One advantage of using AI Server is that it provides a common set of request DTOs in 11 different languages that are compatible with OpenAI's API. This allows you to switch between OpenAI and AI Server without changing your client code. This means you can switch to using typed APIs in your preferred language with your existing service providers OpenAI compatible APIs, and optionally switch to AI Server when you're ready to self-host your AI services for better value.
var client = new JsonApiClient("https://api.openai.com");
client.BearerToken = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
// Using AI Server DTOs with OpenAI API
var request = new OpenAiChatCompletion {
Model = "gpt-4-turbo",
Messages = [
new() { Role = "system", Content = "You are a helpful AI assistant." },
new() { Role = "user", Content = "What is the capital of France?" }
],
MaxTokens = 20
};
var response = await client.PostAsync<OpenAiChatResponse>(
"/v1/chat/completions",
request);
This shows usage of the OpenAiChat
request DTO directly with OpenAI's API using the ServiceStack JsonApiClient
, so you get the benefits of using typed APIs in your preferred language with your existing service providers OpenAI compatible APIs.