Building a "chat with internet access" interface used to require complex orchestration: web scrapers, vector databases, and RAG (Retrieval-Augmented Generation) pipelines. Even with those in place, frontend developers often hit a wall trying to stream these responses smoothly to the client.
The combination of Next.js 14, the Vercel AI SDK, and Perplexity's API eliminates this friction. Perplexity provides an LLM with built-in internet access, while the Vercel AI SDK handles the complex Server-Sent Events (SSE) required for real-time text streaming.
This guide provides a production-ready implementation for integrating live search capabilities into your Next.js application.
The Problem: Why Server-Side Streaming Breaks
Modern users expect LLM interactions to feel instant. They want to see the cursor move the moment they hit "Enter."
However, implementing search-based chat creates a "double latency" problem. The server must first query a search index (latency 1), process the results, and then generate an answer (latency 2). In a standard HTTP request/response model, the user stares at a loading spinner for 5–10 seconds.
Furthermore, streaming text from a Server Component (RSC) to a Client Component requires traversing the network boundary. Doing this manually involves managing ReadableStreams, text encoding/decoding, and maintaining connection state—logic that is brittle and prone to memory leaks.
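For a sense of what that manual plumbing involves, here is a stripped-down sketch: a hand-rolled `ReadableStream` on the server side and the matching decode loop on the client. This is illustrative only — production code would also need abort signals and error propagation, which is exactly the brittle part the SDK handles for us.

```typescript
// A hand-rolled text stream: roughly the boilerplate the SDK replaces.
// Sketch only — real code also needs abort handling and error propagation.
function manualTextStream(chunks: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream({
    start(controller) {
      for (const chunk of chunks) {
        controller.enqueue(encoder.encode(chunk)); // text -> bytes
      }
      controller.close();
    },
  });
}

// The client must do the symmetric work: read chunks and decode bytes back to text.
async function readAll(stream: ReadableStream<Uint8Array>): Promise<string> {
  const decoder = new TextDecoder();
  const reader = stream.getReader();
  let text = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    text += decoder.decode(value, { stream: true });
  }
  return text;
}
```

Multiply this by reconnects, aborts, and partial chunks, and the appeal of an abstraction becomes obvious.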
The Solution: Abstraction via Vercel AI SDK
We will solve this by utilizing the Vercel AI SDK's streamText function. This utility abstracts the stream management and standardizes the output format, allowing us to swap providers (like OpenAI or Perplexity) without rewriting frontend logic.
We will use Perplexity's llama-3.1-sonar-large-128k-online model, which automatically performs internet searches to ground its responses in real-time data.
Prerequisites
Ensure you have the following installed in your Next.js 14 project:
```bash
npm install ai @ai-sdk/openai zod
```
You will also need a Perplexity API key in your .env.local file:
```bash
PERPLEXITY_API_KEY=pplx-xxxxxxxxxxxxxxxxxxxxx
```
Step 1: Configure the Perplexity Provider
As of this writing, the Vercel AI SDK has no dedicated "Perplexity" import. However, because Perplexity's API is OpenAI-compatible, we can use the @ai-sdk/openai provider with a custom baseURL.
Create a utility file to configure the client. This ensures we don't repeat configuration logic across different API routes.
```ts
// lib/ai-provider.ts
import { createOpenAI } from '@ai-sdk/openai';

// Create a custom OpenAI-compatible instance pointing at Perplexity's API
export const perplexity = createOpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY || '',
  baseURL: 'https://api.perplexity.ai/',
});
```
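Because Perplexity speaks the OpenAI chat-completions dialect, the provider above is conceptually equivalent to a raw fetch against its /chat/completions endpoint. The sketch below only constructs the request shape — field names follow the OpenAI-compatible convention — to show what the SDK assembles for us on every call; it is not executed against the live API.

```typescript
// Sketch: the OpenAI-compatible request the provider sends on our behalf.
// Field names follow the OpenAI chat-completions convention.
function buildPerplexityRequest(apiKey: string, userQuery: string) {
  return {
    url: 'https://api.perplexity.ai/chat/completions',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'llama-3.1-sonar-large-128k-online',
        messages: [{ role: 'user', content: userQuery }],
        stream: true, // ask for SSE chunks rather than one JSON blob
      }),
    },
  };
}
```

Passing this shape to `fetch(url, init)` would yield an SSE stream; the SDK issues exactly this kind of request and parses the stream for us.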
Step 2: Create the API Route Handler
In Next.js 14 (App Router), we use Route Handlers to process the chat request. This handler runs on the server, keeping your API key secure.
We will use the streamText function. This function connects to Perplexity, manages the search-and-generate lifecycle, and pipes the output directly to the client response.
Create app/api/chat/route.ts:
```ts
// app/api/chat/route.ts
import { perplexity } from '@/lib/ai-provider';
import { streamText, convertToCoreMessages } from 'ai';

// Allow streaming responses up to 30 seconds
export const maxDuration = 30;

export async function POST(req: Request) {
  // 1. Parse the request body sent by the client hook
  const { messages } = await req.json();

  // 2. Call the Perplexity API with the specific "online" model
  const result = await streamText({
    model: perplexity('llama-3.1-sonar-large-128k-online'),
    // Convert the Vercel SDK message format into the format expected by the provider
    messages: convertToCoreMessages(messages),
    // Optional: system prompt to guide behavior
    system: 'You are a helpful search assistant. Always provide citations where possible.',
  });

  // 3. Return a streaming response
  return result.toDataStreamResponse();
}
```
Key Technical Details
- `convertToCoreMessages`: The frontend sends a specific JSON structure. This utility sanitizes it and converts it into the message format expected by the LLM provider.
- `toDataStreamResponse()`: Converts the AI stream into a standard HTTP Response object set up for streaming (headers like `Transfer-Encoding: chunked` are handled automatically).
- Model Selection: The `-online` suffix in the model name triggers Perplexity's search subsystem. If you use a non-online model, it behaves like a standard static LLM.
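Note that the payload produced by toDataStreamResponse() is not raw text: in current AI SDK versions, each chunk arrives as a prefixed line, with text deltas serialized as `0:"..."` entries and other prefixes reserved for metadata and finish events. A minimal parser for the text parts might look like the sketch below — the protocol is an SDK internal and can change between versions, so treat this as illustrative.

```typescript
// Sketch: extract text deltas from AI SDK data-stream lines.
// Text parts are prefixed with `0:` followed by a JSON-encoded string.
// Protocol details vary by SDK version — illustrative only.
function extractTextDeltas(raw: string): string {
  let text = '';
  for (const line of raw.split('\n')) {
    if (line.startsWith('0:')) {
      text += JSON.parse(line.slice(2)); // the delta is a JSON string
    }
  }
  return text;
}
```

You almost never need to do this yourself — the useChat hook (Step 3) parses this format for you — but knowing the wire shape helps when inspecting responses in the network tab.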
Step 3: Build the Search Interface
Now we need a client component to consume this stream. The useChat hook from the ai package handles the complex state management: appending user messages immediately, handling the incoming stream chunks, and updating the UI in real-time.
Create components/perplexity-search.tsx:
```tsx
// components/perplexity-search.tsx
'use client';

import { useChat } from 'ai/react';
import { useEffect, useRef } from 'react';

export default function PerplexitySearch() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/chat', // Points to our new route handler
    onError: (error) => {
      console.error('Search failed:', error);
    },
  });

  const messagesEndRef = useRef<HTMLDivElement>(null);

  // Auto-scroll to the bottom as the stream updates
  useEffect(() => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  }, [messages]);

  return (
    <div className="flex flex-col h-[600px] w-full max-w-2xl mx-auto border rounded-xl overflow-hidden bg-gray-50 dark:bg-gray-900 border-gray-200 dark:border-gray-800">
      {/* Messages Area */}
      <div className="flex-1 overflow-y-auto p-6 space-y-6">
        {messages.length === 0 && (
          <div className="text-center text-gray-500 mt-20">
            <h3 className="text-lg font-semibold">Perplexity Search</h3>
            <p className="text-sm">Ask anything. I have real-time internet access.</p>
          </div>
        )}
        {messages.map((m) => (
          <div
            key={m.id}
            className={`flex ${m.role === 'user' ? 'justify-end' : 'justify-start'}`}
          >
            <div
              className={`max-w-[85%] rounded-lg px-4 py-3 text-sm leading-relaxed ${
                m.role === 'user'
                  ? 'bg-blue-600 text-white'
                  : 'bg-white dark:bg-gray-800 text-gray-900 dark:text-gray-100 shadow-sm border border-gray-200 dark:border-gray-700'
              }`}
            >
              <div className="whitespace-pre-wrap">{m.content}</div>
            </div>
          </div>
        ))}
        {/* Invisible element to target for auto-scrolling */}
        <div ref={messagesEndRef} />
      </div>

      {/* Input Area */}
      <div className="p-4 bg-white dark:bg-gray-900 border-t border-gray-200 dark:border-gray-800">
        <form onSubmit={handleSubmit} className="relative">
          <input
            className="w-full p-3 pr-12 rounded-md border border-gray-300 dark:border-gray-700 bg-transparent focus:outline-none focus:ring-2 focus:ring-blue-500 dark:text-white"
            value={input}
            placeholder="Search the web..."
            onChange={handleInputChange}
            disabled={isLoading}
          />
          <button
            type="submit"
            disabled={isLoading || !input.trim()}
            className="absolute right-2 top-2 p-1.5 bg-blue-600 text-white rounded hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
          >
            {/* Simple SVG arrow icon */}
            <svg
              xmlns="http://www.w3.org/2000/svg"
              width="20"
              height="20"
              viewBox="0 0 24 24"
              fill="none"
              stroke="currentColor"
              strokeWidth="2"
              strokeLinecap="round"
              strokeLinejoin="round"
            >
              <path d="m5 12 7-7 7 7" />
              <path d="M12 19V5" />
            </svg>
            <span className="sr-only">Send</span>
          </button>
        </form>
      </div>
    </div>
  );
}
```
Deep Dive: How the Stream Works
Understanding the data flow is critical for debugging edge cases.
- Initiation: When `handleSubmit` fires, `useChat` sends a POST request to `/api/chat`.
- Server Execution: The server invokes the `streamText` function, which opens a socket connection to Perplexity's API.
- Buffering vs. Streaming: Unlike a standard fetch where you await the full text response, `streamText` listens for data chunks (tokens) as they arrive.
- Serialization: The Vercel AI SDK wraps these chunks in a specific protocol format (formatted as text lines) and pipes them into the HTTP Response.
- Rehydration: The client's `useChat` hook reads this `ReadableStream`. As new chunks arrive, it does not replace the message state; it appends to the last message in the `messages` array. This triggers a React re-render, producing the "typing" effect.
Handling Edge Cases and Pitfalls
1. Vercel Serverless Timeouts
Perplexity searches can be slow. If the search takes longer than 10 seconds (the default on Vercel's Hobby tier), the function will time out.
The Fix: Ensure you export the maxDuration config in your route handler (as shown in Step 2) if you are on Vercel Pro. If you are on the Hobby tier, consider using the Edge Runtime, although it has stricter compatibility requirements.
```ts
// Optional: Use the Edge Runtime for longer timeouts on the Hobby tier
// Note: ensure all imported libraries are edge-compatible
export const runtime = 'edge';
```
2. Rendering Markdown
Perplexity returns Markdown (headers, lists, bold text), but the raw text output in the code above renders it as plain text rather than formatted HTML.
To fix this, install react-markdown and wrap the message content (the prose classes below also assume the @tailwindcss/typography plugin):
```tsx
import ReactMarkdown from 'react-markdown';

// Inside the map loop:
<div className="prose dark:prose-invert text-sm">
  <ReactMarkdown>{m.content}</ReactMarkdown>
</div>
```
3. Citations Handling
Perplexity's API returns citations in a citations field, separate from the content stream. The Vercel AI SDK streamText focuses primarily on the content delta.
If strict citation linking is required for your use case, you may need to parse the raw stream manually or wait for the onFinish callback in streamText to process the full metadata attached to the response object.
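If you do end up handling the raw response yourself, Perplexity's REST payloads carry sources as a top-level citations array of URL strings. A defensive extractor might look like the sketch below — the field name is taken from Perplexity's API documentation and may change, so the function returns an empty array rather than throwing when the shape differs:

```typescript
// Sketch: pull the citations array out of a parsed Perplexity response.
// The top-level `citations` field (an array of URL strings) comes from
// Perplexity's API docs; treat the exact shape as subject to change.
function extractCitations(response: unknown): string[] {
  if (
    typeof response === 'object' &&
    response !== null &&
    Array.isArray((response as { citations?: unknown }).citations)
  ) {
    // Keep only string entries, in case the shape drifts.
    return (response as { citations: unknown[] }).citations.filter(
      (c): c is string => typeof c === 'string',
    );
  }
  return [];
}
```

Rendering these as numbered links under each assistant message is then a straightforward UI exercise.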
Conclusion
By leveraging the Vercel AI SDK as an abstraction layer, we've collapsed a complex streaming architecture into roughly 50 lines of code. You now have a robust, streaming search interface that taps Perplexity's live index without the overhead of building your own web scraper.
This setup is extensible. Because we adhered to the ai SDK standards, swapping Perplexity for GPT-4 or Claude in the future is as simple as changing the model string in your Route Handler.