I'm trying to implement an AI chat interface with a typewriter effect, like ChatGPT's. My backend API is written in Go and uses sashabaranov/go-openai to talk to OpenAI. Here's the controller, which is served at the route /answer/api/v1/chat/completion:
package controller

import (
	"io"
	"os"
	"strings"

	"github.com/apache/incubator-answer/internal/base/handler"
	"github.com/gin-gonic/gin"
	"github.com/sashabaranov/go-openai"
	"github.com/segmentfault/pacman/errors"
)

// ChatController chat controller
type ChatController struct{}

// NewChatController creates a new ChatController
func NewChatController() *ChatController {
	return &ChatController{}
}

// ChatCompletion godoc
// @Summary Get Chat Completion
// @Description Get Chat Completion
// @Tags api-answer
// @Accept json
// @Produce json
// @Param prompt query string true "Prompt for the chat completion"
// @Router /answer/api/v1/chat/completion [get]
// @Success 200 {object} map[string]interface{}
func (cc *ChatController) ChatCompletion(ctx *gin.Context) {
	prompt := ctx.Query("prompt")
	if prompt == "" {
		handler.HandleResponse(ctx, errors.New(400, "prompt is required"), nil)
		return
	}

	apiKey := os.Getenv("OPENAI_API_KEY")
	client := openai.NewClient(apiKey)
	stream, err := client.CreateChatCompletionStream(ctx, openai.ChatCompletionRequest{
		Model: openai.GPT4,
		Messages: []openai.ChatCompletionMessage{
			{Role: openai.ChatMessageRoleSystem, Content: "You are a helpful assistant."},
			{Role: openai.ChatMessageRoleUser, Content: prompt},
		},
	})
	if err != nil {
		handler.HandleResponse(ctx, err, nil)
		return
	}
	defer stream.Close()

	ctx.Writer.Header().Set("Content-Type", "text/event-stream")
	ctx.Writer.Header().Set("Cache-Control", "no-cache")
	ctx.Writer.Header().Set("Connection", "keep-alive")

	// An SSE frame ends at a blank line, so a raw newline inside a token would
	// split the frame; encode newlines with a placeholder the client reverses.
	const NEWLINE = "$NEWLINE$"
	for {
		select {
		case <-ctx.Done():
			return
		default:
			response, err := stream.Recv()
			if err == io.EOF {
				// Stream finished; emit a terminal event so the client's
				// "finished" listener can close the EventSource.
				ctx.Writer.Write([]byte("event: finished\ndata: done\n\n"))
				ctx.Writer.Flush()
				return
			}
			if err != nil {
				handler.HandleResponse(ctx, err, nil)
				return
			}
			if len(response.Choices) > 0 && response.Choices[0].Delta.Content != "" {
				contentWithNewlines := strings.ReplaceAll(response.Choices[0].Delta.Content, "\n", NEWLINE)
				ctx.Writer.Write([]byte("event: token\n"))
				ctx.Writer.Write([]byte("data: " + contentWithNewlines + "\n\n"))
				ctx.Writer.Flush()
			}
		}
	}
}
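To separate the SSE write/flush mechanics from the OpenAI call, here's a minimal standalone Gin handler using the same pattern. This is a rough sketch; the /sse-test route, tick count, and port :8080 are placeholders I picked for testing. If flushing works, hitting it with curl -N (curl's no-buffer flag) should print one event per second:

package main

import (
	"fmt"
	"time"

	"github.com/gin-gonic/gin"
)

func main() {
	r := gin.Default()
	r.GET("/sse-test", func(ctx *gin.Context) {
		ctx.Writer.Header().Set("Content-Type", "text/event-stream")
		ctx.Writer.Header().Set("Cache-Control", "no-cache")
		ctx.Writer.Header().Set("Connection", "keep-alive")

		ticker := time.NewTicker(time.Second)
		defer ticker.Stop()
		for i := 0; i < 5; i++ {
			select {
			case <-ctx.Request.Context().Done():
				return
			case <-ticker.C:
				// One SSE frame per tick; with working flushes these arrive a second apart.
				fmt.Fprintf(ctx.Writer, "event: token\ndata: tick %d\n\n", i)
				ctx.Writer.Flush()
			}
		}
	})
	r.Run(":8080") // placeholder port
}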
On the frontend, I have a React page that opens an EventSource and listens for each token as it arrives:
/* eslint-disable */
import React, { useState, useCallback } from 'react';

const Chats = () => {
  const [output, setOutput] = useState<string>('');
  const [isStarted, setIsStarted] = useState(false);
  const [isStreamFinished, setIsStreamFinished] = useState<boolean>(false);
  const prompt = 'mic check'; // Replace with your actual prompt
  const NEWLINE = '$NEWLINE$';

  const startChat = useCallback(() => {
    setIsStarted(true);
    setOutput('');
    const eventSource = new EventSource(
      `/answer/api/v1/chat/completion?prompt=${encodeURIComponent(prompt)}`,
    );
    eventSource.addEventListener('error', () => eventSource.close());
    eventSource.addEventListener('token', (e) => {
      // Reverse the server-side newline encoding so line breaks survive SSE framing
      const token = e.data.replaceAll(NEWLINE, '\n');
      console.log(token);
      setOutput((prevOutput) => prevOutput + token);
    });
    eventSource.addEventListener('finished', (e) => {
      console.log('finished', e);
      eventSource.close();
      setIsStreamFinished(true);
    });
    return () => {
      eventSource.close();
    };
  }, [prompt]);

  return (
    <div>
      <p>Prompt: {prompt}</p>
      {!isStarted && <button onClick={startChat}>Start</button>}
      <div>{output}</div>
    </div>
  );
};

export default Chats;
What actually happens when I hit the Start button is that nothing renders until the completion finishes, and then all the tokens appear at once. With a prompt that generates a longer response, the upfront pause is longer, but the tokens still show up in one burst. I'm not sure whether the delay is on the frontend or the backend.
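To narrow down which side is buffering, here's a small Go client sketch that reads the raw SSE stream and stamps each line as it arrives (the base URL is a placeholder; adjust it to wherever the API is actually served). If lines trickle in here but the browser still gets everything at once, the problem is on the frontend or a proxy in between; if everything arrives in one burst even here, the backend (or something in front of it) is buffering:

package main

import (
	"bufio"
	"fmt"
	"net/http"
	"time"
)

func main() {
	// Placeholder URL; point it at the running API server.
	resp, err := http.Get("http://localhost:8080/answer/api/v1/chat/completion?prompt=mic%20check")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Read the body line by line and print each line with its arrival time.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		fmt.Printf("%s | %s\n", time.Now().Format("15:04:05.000"), scanner.Text())
	}
	if err := scanner.Err(); err != nil {
		panic(err)
	}
}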