Managing State for Streaming AI Responses


LLMs generate text token by token. APIs for models like Claude and GPT stream these tokens as they're generated instead of waiting for the complete response. Streaming reduces perceived latency—users see text appearing immediately instead of staring at a spinner for several seconds.

Streaming creates a state management problem. Text arrives as chunks over a persistent connection. You need to accumulate chunks, track connection status, handle errors, and show completion. React's useState and useEffect manage this flow. Server-Sent Events (SSE) provide the server-to-browser channel that delivers chunks as they arrive.

TL;DR

Use Server-Sent Events to stream LLM responses from FastAPI to Next.js. Track state explicitly: idle, streaming, complete, error. Update UI as chunks arrive. Close connections when complete or on error. Use React's useState for content accumulation and useEffect for connection lifecycle.

How Server-Sent Events Work

SSE lets servers push data to browsers over HTTP. The browser creates an EventSource, and the server keeps the connection open and sends events in a plain-text format:

```text
data: {"type": "chunk", "text": "Hello"}\n\n
data: {"type": "chunk", "text": " world"}\n\n
data: {"type": "done"}\n\n
```

Each message starts with data:, contains JSON, and ends with double newlines. The browser fires events as messages arrive.
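This framing is simple enough to encode and parse directly. A minimal sketch (the helper names encodeEvent and parseEvents are illustrative, not part of any library):

```typescript
type SSEMessage = { type: string; text?: string }

// Encode one object as a single SSE data event.
function encodeEvent(msg: SSEMessage): string {
  return `data: ${JSON.stringify(msg)}\n\n`
}

// Split a raw buffer on blank lines and parse each `data:` payload.
function parseEvents(buffer: string): SSEMessage[] {
  return buffer
    .split("\n\n")
    .filter(block => block.startsWith("data: "))
    .map(block => JSON.parse(block.slice("data: ".length)))
}

const raw = encodeEvent({ type: "chunk", text: "Hello" }) + encodeEvent({ type: "done" })
// parseEvents(raw) → [{ type: "chunk", text: "Hello" }, { type: "done" }]
```

In the browser, EventSource does this framing for you; the parser above is the logic you'd need if you consumed the stream with fetch instead.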

FastAPI Streaming Endpoint

Stream LLM responses from FastAPI using StreamingResponse:

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from anthropic import Anthropic
import json

app = FastAPI()
client = Anthropic()

# EventSource can only make GET requests, so the endpoint must be GET
@app.get("/api/analyze")
def analyze_ticket(ticket_id: str):
    def generate():
        # Stream from Anthropic; FastAPI runs sync generators in a threadpool
        with client.messages.stream(
            model="claude-sonnet-4-5-20250929",
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": f"Analyze ticket {ticket_id}"
            }]
        ) as stream:
            for text in stream.text_stream:
                payload = json.dumps({"type": "chunk", "text": text})
                yield f"data: {payload}\n\n"

        yield f"data: {json.dumps({'type': 'done'})}\n\n"

    return StreamingResponse(
        generate(),
        media_type="text/event-stream"
    )
```

The endpoint yields chunks as they arrive from Anthropic's streaming API. Each chunk is a JSON message. The final message signals completion.

React State Management

Manage streaming state in React with useState and useEffect:

```typescript
"use client"

import { useState, useEffect } from "react"

type StreamState = "idle" | "streaming" | "complete" | "error"

export function TicketAnalysis({ ticketId }: { ticketId: string }) {
  const [state, setState] = useState<StreamState>("idle")
  const [content, setContent] = useState("")
  const [error, setError] = useState<string | null>(null)

  useEffect(() => {
    // Reset accumulated content whenever ticketId changes
    setContent("")
    setError(null)
    setState("streaming")

    const eventSource = new EventSource(
      `/api/analyze?ticket_id=${ticketId}`
    )

    eventSource.addEventListener("message", (event) => {
      const data = JSON.parse(event.data)

      if (data.type === "chunk") {
        setContent(prev => prev + data.text)
      } else if (data.type === "done") {
        setState("complete")
        eventSource.close()
      }
    })

    eventSource.addEventListener("error", () => {
      setError("Stream failed")
      setState("error")
      eventSource.close()
    })

    return () => eventSource.close()
  }, [ticketId])

  if (state === "error") {
    return <div>Error: {error}</div>
  }

  return (
    <div>
      <p>{content}</p>
      {state === "streaming" && (
        <span className="animate-pulse">▋</span>
      )}
    </div>
  )
}
```

State flows linearly: idle → streaming → (complete | error). Content accumulates as chunks arrive. The cursor shows during streaming, disappears when complete.

State Management Best Practices

Always close connections. EventSource stays open until you close it. Clean up in useEffect's return function to prevent memory leaks.

Track state explicitly. Don't derive streaming status from content length or other signals. Use dedicated state variables.

Handle errors. Network failures, API errors, and timeouts all trigger the EventSource error event. Set error state and close the connection.

Accumulate content correctly. Use setContent(prev => prev + data.text) to append chunks. Direct assignment loses previous content.

Manage dependencies. Include variables like ticketId in useEffect dependencies. When they change, React closes the old connection and opens a new one.
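These rules can be collapsed into a pure reducer that is easy to unit-test. A sketch, with action names mirroring the SSE message types (the Snapshot and Action shapes are illustrative):

```typescript
type StreamState = "idle" | "streaming" | "complete" | "error"

type Snapshot = { state: StreamState; content: string; error: string | null }

type Action =
  | { type: "start" }
  | { type: "chunk"; text: string }
  | { type: "done" }
  | { type: "fail"; message: string }

// Pure state transitions: idle → streaming → (complete | error).
function reduce(prev: Snapshot, action: Action): Snapshot {
  switch (action.type) {
    case "start":
      // Starting a stream resets accumulated content and any old error.
      return { state: "streaming", content: "", error: null }
    case "chunk":
      // Append, never assign: assignment would lose earlier chunks.
      return { ...prev, content: prev.content + action.text }
    case "done":
      return { ...prev, state: "complete" }
    case "fail":
      return { ...prev, state: "error", error: action.message }
    default:
      return prev
  }
}
```

A component could drive this with React's useReducer in place of the three separate useState calls, dispatching one action per SSE message.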

Error Recovery

Streams fail. Handle reconnection:

```typescript
const [retryCount, setRetryCount] = useState(0)
const maxRetries = 3

useEffect(() => {
  const eventSource = new EventSource(
    `/api/analyze?ticket_id=${ticketId}`
  )

  // message handler as before (omitted)

  eventSource.addEventListener("error", () => {
    eventSource.close()

    if (retryCount < maxRetries) {
      // Exponential backoff: 1s, 2s, 4s
      setTimeout(() => {
        setRetryCount(prev => prev + 1)
        setContent("")
        setState("streaming")
      }, 1000 * Math.pow(2, retryCount))
    } else {
      setState("error")
      setError(`Failed after ${maxRetries} retries`)
    }
  })

  return () => eventSource.close()
}, [ticketId, retryCount])
```

Retry with exponential backoff. Reset content on retry. Stop after max attempts.
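The delays above (1 s, 2 s, 4 s) come from 1000 * 2^retryCount. A small helper makes the schedule and its cap explicit (a sketch; backoffDelay and the 30-second cap are illustrative choices, not from the code above):

```typescript
// Delay in ms before retry attempt n (0-indexed), capped at maxDelayMs
// so late retries don't wait arbitrarily long.
function backoffDelay(retry: number, baseMs = 1000, maxDelayMs = 30_000): number {
  return Math.min(baseMs * Math.pow(2, retry), maxDelayMs)
}
// backoffDelay(0) → 1000, backoffDelay(1) → 2000, backoffDelay(2) → 4000
```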
