Building a "chat with internet access" interface used to require complex orchestration: web scrapers, vector databases, and RAG (Retrieval-Augmented Generation) pipelines. Even with those in place, frontend developers often hit a wall trying to stream these responses smoothly to the client.
The combination of Next.js 14, the Vercel AI SDK, and Perplexity's API eliminates this friction. Perplexity provides an LLM with built-in internet access, while the Vercel AI SDK handles the complex Server-Sent Events (SSE) required for real-time text streaming.
This guide provides a production-ready implementation for integrating live search capabilities into your Next.js application.
The Problem: Why Server-Side Streaming Breaks
Modern users expect LLM interactions to feel instant. They want to see the cursor move the moment they hit "Enter."
However, implementing search-based chat creates a "double latency" problem. The server must first query a search index (latency 1), process the results, and then generate an answer (latency 2). In a standard HTTP request/response model, the user stares at a loading spinner for 5–10 seconds.
Furthermore, streaming text from a Server Component (RSC) to a Client Component requires traversing the network boundary. Doing this manually involves managing ReadableStreams, text encoding/decoding, and maintaining connection state—logic that is brittle and prone to memory leaks.
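For a sense of what that manual plumbing involves, here is a stripped-down sketch: a hand-rolled `ReadableStream` on the server side and the matching decode loop on the client. This is illustrative only — production code would also need abort signals and error propagation, which is exactly the brittle part the SDK handles for us.

```typescript
// A hand-rolled text stream: roughly the boilerplate the SDK replaces.
// Sketch only — real code also needs abort handling and error propagation.
function manualTextStream(chunks: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream({
    start(controller) {
      for (const chunk of chunks) {
        controller.enqueue(encoder.encode(chunk)); // text -> bytes
      }
      controller.close();
    },
  });
}

// The client must do the symmetric work: read chunks and decode bytes back to text.
async function readAll(stream: ReadableStream<Uint8Array>): Promise<string> {
  const decoder = new TextDecoder();
  const reader = stream.getReader();
  let text = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    text += decoder.decode(value, { stream: true });
  }
  return text;
}
```

Multiply this by reconnects, aborts, and partial chunks, and the appeal of an abstraction becomes obvious.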
The Solution: Abstraction via Vercel AI SDK
We will solve this by utilizing the Vercel AI SDK's streamText function. This utility abstracts the stream management and standardizes the output format, allowing us to swap providers (like OpenAI or Perplexity) without rewriting frontend logic.
We will use Perplexity's llama-3.1-sonar-large-128k-online model, which automatically performs internet searches to ground its responses in real-time data.
Prerequisites
Ensure you have the following installed in your Next.js 14 project:
```bash
npm install ai @ai-sdk/openai zod
```
You will also need a Perplexity API key in your .env.local file:
```bash
PERPLEXITY_API_KEY=pplx-xxxxxxxxxxxxxxxxxxxxx
```
Step 1: Configure the Perplexity Provider
As of this writing, the Vercel AI SDK has no dedicated "Perplexity" import. However, because Perplexity's API is OpenAI-compatible, we can use the @ai-sdk/openai provider with a custom baseURL.
Create a utility file to configure the client. This ensures we don't repeat configuration logic across different API routes.
```ts
// lib/ai-provider.ts
import { createOpenAI } from '@ai-sdk/openai';

// Create a custom OpenAI-compatible instance pointing at Perplexity's API
export const perplexity = createOpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY || '',
  baseURL: 'https://api.perplexity.ai/',
});
```
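Because Perplexity speaks the OpenAI chat-completions dialect, the provider above is conceptually equivalent to a raw fetch against its /chat/completions endpoint. The sketch below only constructs the request shape — field names follow the OpenAI-compatible convention — to show what the SDK assembles for us on every call; it is not executed against the live API.

```typescript
// Sketch: the OpenAI-compatible request the provider sends on our behalf.
// Field names follow the OpenAI chat-completions convention.
function buildPerplexityRequest(apiKey: string, userQuery: string) {
  return {
    url: 'https://api.perplexity.ai/chat/completions',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'llama-3.1-sonar-large-128k-online',
        messages: [{ role: 'user', content: userQuery }],
        stream: true, // ask for SSE chunks rather than one JSON blob
      }),
    },
  };
}
```

Passing this shape to `fetch(url, init)` would yield an SSE stream; the SDK issues exactly this kind of request and parses the stream for us.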
Step 2: Create the API Route Handler
In Next.js 14 (App Router), we use Route Handlers to process the chat request. This handler runs on the server, keeping your API key secure.
We will use the streamText function. This function connects to Perplexity, manages the search-and-generate lifecycle, and pipes the output directly to the client response.
Create app/api/chat/route.ts:
```ts
// app/api/chat/route.ts
import { perplexity } from '@/lib/ai-provider';
import { streamText, convertToCoreMessages } from 'ai';

// Allow streaming responses up to 30 seconds
export const maxDuration = 30;

export async function POST(req: Request) {
  // 1. Parse the request body sent by the client hook
  const { messages } = await req.json();

  // 2. Call the Perplexity API with the specific "online" model
  const result = await streamText({
    model: perplexity('llama-3.1-sonar-large-128k-online'),
    // Convert the Vercel SDK message format into the format expected by the provider
    messages: convertToCoreMessages(messages),
    // Optional: system prompt to guide behavior
    system: 'You are a helpful search assistant. Always provide citations where possible.',
  });

  // 3. Return a streaming response
  return result.toDataStreamResponse();
}
```
Key Technical Details
- `convertToCoreMessages`: The frontend sends a specific JSON structure. This utility sanitizes it and converts it into the message format expected by the LLM provider.
- `toDataStreamResponse()`: Converts the AI stream into a standard HTTP Response object set up for streaming (headers like `Transfer-Encoding: chunked` are handled automatically).
- Model Selection: The `-online` suffix in the model name triggers Perplexity's search subsystem. If you use a non-online model, it behaves like a standard static LLM.
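Note that the payload produced by toDataStreamResponse() is not raw text: in current AI SDK versions, each chunk arrives as a prefixed line, with text deltas serialized as `0:"..."` entries and other prefixes reserved for metadata and finish events. A minimal parser for the text parts might look like the sketch below — the protocol is an SDK internal and can change between versions, so treat this as illustrative.

```typescript
// Sketch: extract text deltas from AI SDK data-stream lines.
// Text parts are prefixed with `0:` followed by a JSON-encoded string.
// Protocol details vary by SDK version — illustrative only.
function extractTextDeltas(raw: string): string {
  let text = '';
  for (const line of raw.split('\n')) {
    if (line.startsWith('0:')) {
      text += JSON.parse(line.slice(2)); // the delta is a JSON string
    }
  }
  return text;
}
```

You almost never need to do this yourself — the useChat hook (Step 3) parses this format for you — but knowing the wire shape helps when inspecting responses in the network tab.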
Step 3: Build the Search Interface
Now we need a client component to consume this stream. The useChat hook from the ai package handles the complex state management: appending user messages immediately, handling the incoming stream chunks, and updating the UI in real-time.
Create components/perplexity-search.tsx:
```tsx
// components/perplexity-search.tsx
'use client';

import { useChat } from 'ai/react';
import { useEffect, useRef } from 'react';

export default function PerplexitySearch() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/chat', // Points to our new route handler
    onError: (error) => {
      console.error('Search failed:', error);
    },
  });

  const messagesEndRef = useRef<HTMLDivElement>(null);

  // Auto-scroll to the bottom as the stream updates
  useEffect(() => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  }, [messages]);

  return (
    <div className="flex flex-col h-[600px] w-full max-w-2xl mx-auto border rounded-xl overflow-hidden bg-gray-50 dark:bg-gray-900 border-gray-200 dark:border-gray-800">
      {/* Messages Area */}
      <div className="flex-1 overflow-y-auto p-6 space-y-6">
        {messages.length === 0 && (
          <div className="text-center text-gray-500 mt-20">
            <h3 className="text-lg font-semibold">Perplexity Search</h3>
            <p className="text-sm">Ask anything. I have real-time internet access.</p>
          </div>
        )}
        {messages.map((m) => (
          <div
            key={m.id}
            className={`flex ${m.role === 'user' ? 'justify-end' : 'justify-start'}`}
          >
            <div
              className={`max-w-[85%] rounded-lg px-4 py-3 text-sm leading-relaxed ${
                m.role === 'user'
                  ? 'bg-blue-600 text-white'
                  : 'bg-white dark:bg-gray-800 text-gray-900 dark:text-gray-100 shadow-sm border border-gray-200 dark:border-gray-700'
              }`}
            >
              <div className="whitespace-pre-wrap">{m.content}</div>
            </div>
          </div>
        ))}
        {/* Invisible element to target for auto-scrolling */}
        <div ref={messagesEndRef} />
      </div>

      {/* Input Area */}
      <div className="p-4 bg-white dark:bg-gray-900 border-t border-gray-200 dark:border-gray-800">
        <form onSubmit={handleSubmit} className="relative">
          <input
            className="w-full p-3 pr-12 rounded-md border border-gray-300 dark:border-gray-700 bg-transparent focus:outline-none focus:ring-2 focus:ring-blue-500 dark:text-white"
            value={input}
            placeholder="Search the web..."
            onChange={handleInputChange}
            disabled={isLoading}
          />
          <button
            type="submit"
            disabled={isLoading || !input.trim()}
            className="absolute right-2 top-2 p-1.5 bg-blue-600 text-white rounded hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
          >
            {/* Simple SVG arrow icon */}
            <svg
              xmlns="http://www.w3.org/2000/svg"
              width="20"
              height="20"
              viewBox="0 0 24 24"
              fill="none"
              stroke="currentColor"
              strokeWidth="2"
              strokeLinecap="round"
              strokeLinejoin="round"
            >
              <path d="m5 12 7-7 7 7" />
              <path d="M12 19V5" />
            </svg>
            <span className="sr-only">Send</span>
          </button>
        </form>
      </div>
    </div>
  );
}
```
Deep Dive: How the Stream Works
Understanding the data flow is critical for debugging edge cases.
- Initiation: When `handleSubmit` fires, `useChat` sends a POST request to `/api/chat`.
- Server Execution: The server invokes the `streamText` function, which opens a socket connection to Perplexity's API.
- Buffering vs. Streaming: Unlike a standard fetch where you await the full text response, `streamText` listens for data chunks (tokens) as they arrive.
- Serialization: The Vercel AI SDK wraps these chunks in a specific protocol format (formatted as text lines) and pipes them into the HTTP Response.
- Rehydration: The client's `useChat` hook reads this `ReadableStream`. As new chunks arrive, it does not replace the message state; it appends to the last message in the `messages` array. This triggers a React re-render, producing the "typing" effect.
Handling Edge Cases and Pitfalls
1. Vercel Serverless Timeouts
Perplexity searches can be slow. If the search takes longer than 10 seconds (the default on Vercel's Hobby tier), the function will time out.
The Fix: Ensure you export the maxDuration config in your route handler (as shown in Step 2) if you are on Vercel Pro. If you are on the Hobby tier, consider using the Edge Runtime, although it has stricter compatibility requirements.
```ts
// Optional: Use the Edge Runtime for longer timeouts on the Hobby tier
// Note: ensure all imported libraries are edge-compatible
export const runtime = 'edge';
```
2. Rendering Markdown
Perplexity returns Markdown (headers, lists, bold text), but the raw text output in the code above renders it as plain text rather than formatted HTML.
To fix this, install react-markdown and wrap the message content (the prose classes below also assume the @tailwindcss/typography plugin):
```tsx
import ReactMarkdown from 'react-markdown';

// Inside the map loop:
<div className="prose dark:prose-invert text-sm">
  <ReactMarkdown>{m.content}</ReactMarkdown>
</div>
```
3. Citations Handling
Perplexity's API returns citations in a citations field, separate from the content stream. The Vercel AI SDK streamText focuses primarily on the content delta.
If strict citation linking is required for your use case, you may need to parse the raw stream manually or wait for the onFinish callback in streamText to process the full metadata attached to the response object.
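If you do end up handling the raw response yourself, Perplexity's REST payloads carry sources as a top-level citations array of URL strings. A defensive extractor might look like the sketch below — the field name is taken from Perplexity's API documentation and may change, so the function returns an empty array rather than throwing when the shape differs:

```typescript
// Sketch: pull the citations array out of a parsed Perplexity response.
// The top-level `citations` field (an array of URL strings) comes from
// Perplexity's API docs; treat the exact shape as subject to change.
function extractCitations(response: unknown): string[] {
  if (
    typeof response === 'object' &&
    response !== null &&
    Array.isArray((response as { citations?: unknown }).citations)
  ) {
    // Keep only string entries, in case the shape drifts.
    return (response as { citations: unknown[] }).citations.filter(
      (c): c is string => typeof c === 'string',
    );
  }
  return [];
}
```

Rendering these as numbered links under each assistant message is then a straightforward UI exercise.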
Conclusion
By leveraging the Vercel AI SDK as an abstraction layer, we've collapsed a complex streaming architecture into roughly 50 lines of code. You now have a robust, streaming search interface that taps Perplexity's live index without the overhead of building your own web scraper.
This setup is extensible. Because we adhered to the ai SDK standards, swapping Perplexity for GPT-4 or Claude in the future is as simple as changing the model string in your Route Handler.