You’ve likely been there: You build a sophisticated agent using Google's Gemini 1.5 Pro. You define your tools perfectly. Yet, in production, the model decides to "chat" about the weather instead of calling the get_weather function. Or worse, it calls the function but passes "London" as a bare string when your schema explicitly demanded a structured object with coordinate fields.
Nondeterministic behavior in function calling (Tool Use) is the primary bottleneck preventing AI demos from becoming enterprise-grade software. When an LLM hallucinates parameters or ignores tools, it breaks the application loop and erodes user trust.
This guide moves beyond basic tutorials. We will implement a strictly typed, deterministic function-calling architecture using the Gemini API, TypeScript, and the Node.js SDK.
The Root Cause: Why Gemini Ignores Your Tools
To fix function calling, you must understand that LLMs do not "call functions." They predict tokens.
When you provide a tool definition, the Gemini API converts that JSON Schema into a textual representation that sits in the system context. When the model generates a response, it calculates the probability of generating a text token (like "The") versus a special token indicating the start of a function call.
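Concretely, when the model takes the function-call path, the decoded candidate contains a functionCall part instead of a text part. A sketch of the two shapes, following the Gemini response format (the values here are illustrative, not real output):

```typescript
// Illustrative shape of a candidate part when the model emits a
// function call rather than text (name/args values are made up).
const candidatePart = {
  functionCall: {
    name: "get_weather",
    args: { city: "London" },
  },
};

// Versus the conversational path, where the part is plain text:
const textPart = { text: "The weather in London is lovely today." };
```

The model's "decision" to call a tool is literally the choice between emitting one of these two part types.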
Failures occur for three specific technical reasons:
- Ambiguous Schema Descriptions: If your parameter descriptions are vague, the attention mechanism fails to map the user's intent to the specific JSON field. Gemini relies heavily on the description field, treating it as a system instruction.
- Context Pollution: If your conversation history contains previous malformed tool calls or excessive chatty context, the model’s probability distribution shifts away from structured data generation back to conversational text.
- Default mode: AUTO Instability: By default, Gemini decides whether it should call a tool. In high-stakes logic flows, relying on the model's discretion is a mistake.
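To make the first failure mode concrete, here is a before/after sketch of a parameter description. The parameter names are hypothetical; the point is that the description doubles as an instruction the model follows:

```typescript
// Vague: the model must guess what format "location" means.
const vagueLocationParam = {
  type: "STRING",
  description: "The location.",
};

// Instructive: the description tells the model how to normalize input,
// which is exactly how Gemini uses it.
const instructiveLocationParam = {
  type: "STRING",
  description:
    "The city name in English, e.g. 'London' or 'Tokyo'. " +
    "If the user gives a landmark or airport code, resolve it to the city first.",
};
```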
The Fix: Implementing Strict Determinism
We will resolve these issues by implementing a FunctionDeclaration with rigorous schema validation and utilizing the toolConfig parameter to enforce behavior.
Prerequisites
Ensure you have the latest version of the Google Generative AI SDK.
npm install @google/generative-ai dotenv
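The code in this guide loads the API key from the environment via dotenv, so a .env file like the following is assumed in the project root (the variable name matches process.env.GEMINI_API_KEY used later):

```
GEMINI_API_KEY=your-api-key-here
```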
1. The Robust Tool Definition
Hand-writing raw JSON schemas is error-prone, but the SDK requires the exact structure, so precision here pays off. The keys to reliability are the description fields and the required array.
We will build a stock price checker. Notice the verbosity in the description fields—this is "Prompt Engineering for Schemas."
import { FunctionDeclaration, SchemaType } from "@google/generative-ai";
// Define the tool with explicit, instructive descriptions
const getStockPriceTool: FunctionDeclaration = {
name: "getStockPrice",
description: "Retrieves the current stock price and currency for a given ticker symbol. Must be used whenever the user asks about financial market data.",
parameters: {
type: SchemaType.OBJECT,
properties: {
ticker: {
type: SchemaType.STRING,
description: "The stock ticker symbol (e.g., AAPL, GOOGL). If the user provides a company name, convert it to the ticker.",
},
exchange: {
type: SchemaType.STRING,
description: "The stock exchange code. Default to 'NASDAQ' if not specified or implied.",
},
},
// CRITICAL: Explicitly listing required fields reduces hallucination
required: ["ticker"],
},
};
2. Forcing the Hand: toolConfig
This is the configuration most developers overlook. If your application state requires a tool output (e.g., a CLI agent), you should not let the model choose to chat.
We use functionCallingConfig to set the mode.
const toolConfig = {
functionCallingConfig: {
// Options: "AUTO" (model decides), "ANY" (forces tool use), "NONE" (forces text)
// Use "ANY" if the next step MUST be a function call.
// "AUTO" suits general chatbots like this demo, which handles both the
// tool branch and the plain-chat branch; reinforce via system prompt.
mode: "AUTO",
},
};
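When the next step truly must be a tool call, a stricter configuration is available. Alongside mode, the Gemini API accepts allowedFunctionNames, which narrows the model's choice to specific tools:

```typescript
// Forces a function call on the next turn and restricts which tool
// the model may pick. Useful for CLI agents and state machines where
// free-form chat would break the application loop.
const forcedToolConfig = {
  functionCallingConfig: {
    mode: "ANY", // the model MUST emit a function call
    allowedFunctionNames: ["getStockPrice"], // and only this one
  },
};
```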
3. The Execution Loop
Below is a production-ready implementation. It handles the initial request, executes the actual JavaScript function, and feeds the result back to Gemini for the final natural language response.
import {
GoogleGenerativeAI,
Part
} from "@google/generative-ai";
import * as dotenv from "dotenv";
dotenv.config();
// 1. Mock API Implementation
async function fetchStockAPI(ticker: string, exchange: string = "NASDAQ") {
// Simulate API latency and data return
console.log(`[System] Fetching data for ${ticker} on ${exchange}...`);
await new Promise((resolve) => setTimeout(resolve, 500));
// deterministic mock data
const mockDb: Record<string, number> = {
"AAPL": 175.50,
"GOOGL": 140.20,
"MSFT": 402.10
};
const price = mockDb[ticker.toUpperCase()];
if (!price) {
throw new Error(`Ticker ${ticker} not found on ${exchange}`);
}
return { ticker, price, currency: "USD", exchange };
}
// 2. The Main Execution Logic
async function runAgent(userQuery: string) {
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY || "");
// Initialize model with tools and config
const model = genAI.getGenerativeModel({
model: "gemini-1.5-flash", // Flash is faster/cheaper for tool use; Pro for complex logic
tools: [{ functionDeclarations: [getStockPriceTool] }],
toolConfig: toolConfig, // defined in previous section
});
const chat = model.startChat();
console.log(`[User] ${userQuery}`);
// Send initial message
const result = await chat.sendMessage(userQuery);
const response = result.response;
const functionCalls = response.functionCalls();
// 3. Handle the Function Call
if (functionCalls && functionCalls.length > 0) {
const call = functionCalls[0];
const { name, args } = call;
if (name === "getStockPrice") {
try {
// Validate args before execution (Defensive Programming).
// The SDK types `args` loosely, so narrow it to the schema's shape.
const { ticker, exchange = "NASDAQ" } = args as { ticker: string; exchange?: string };
// Execute actual code
const apiResponse = await fetchStockAPI(ticker, exchange);
// 4. Send Function Response back to Model
// The model needs the result to generate the natural language answer
const parts: Part[] = [
{
functionResponse: {
name: "getStockPrice",
response: {
name: "getStockPrice",
content: apiResponse, // Send the JSON object back
},
},
},
];
const finalResult = await chat.sendMessage(parts);
console.log(`[Agent] ${finalResult.response.text()}`);
} catch (error: any) {
console.error("Tool Execution Failed:", error.message);
// Optional: Feed error back to model so it can apologize to user
}
}
} else {
// Model decided to just chat
console.log(`[Agent] ${response.text()}`);
}
}
// Execute
runAgent("What is the current trading price for Apple?");
Deep Dive: Why This Works
The functionResponse Structure
Notice specifically how we construct the follow-up message. We do not simply append text. We send a functionResponse part.
When Gemini sees a functionResponse, it treats it as the logical conclusion of the previous functionCall. This closes the attention loop. If you mistakenly send the API result back as standard user text, the model may hallucinate that the user provided the data, leading to confusing conversation contexts.
Schema Validation
By marking ticker as required and typing every parameter strictly, we narrow the space of outputs the model can legally sample for those tokens. Even if your global temperature is set to 0.9 for creativity, argument generation is constrained by the declared schema far more tightly than free-form text, though adherence is still not an absolute guarantee.
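Because schema adherence is strong but not absolute, it is worth mirroring the schema's required array in a small runtime guard before executing the tool. A minimal sketch with no external dependencies (parseStockArgs is a hypothetical helper name):

```typescript
interface StockArgs {
  ticker: string;
  exchange: string;
}

// Narrows the loosely typed args object the SDK hands back into the
// shape our schema promised, throwing early on a violation.
function parseStockArgs(args: Record<string, unknown>): StockArgs {
  if (typeof args.ticker !== "string" || args.ticker.trim() === "") {
    throw new Error("Schema violation: 'ticker' must be a non-empty string");
  }
  return {
    ticker: args.ticker.toUpperCase(),
    // Mirror the schema's documented default for the optional field.
    exchange: typeof args.exchange === "string" ? args.exchange : "NASDAQ",
  };
}
```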
Handling Edge Cases and Pitfalls
1. The "Infinite Loop" Trap
Sometimes, if the API returns an error or an unexpected format, the model will simply try to call the function again with the exact same parameters.
Solution: Implement a retry counter or a max_steps variable in your loop. If the model tries to call the same tool with the same arguments twice and fails, terminate the loop and return a fallback message.
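The guard can be as small as comparing the incoming call against recent failures: if the model repeats a call identical to one that already failed, terminate with a fallback. A sketch under those assumptions (ToolCall, shouldAbort, and MAX_STEPS are illustrative names, not SDK APIs):

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };

const MAX_STEPS = 5; // hard ceiling on model round-trips per request

// Returns true when the model retries a call identical to one that
// already failed, signaling the loop to stop and return a fallback.
function shouldAbort(failedCalls: ToolCall[], next: ToolCall): boolean {
  const key = (c: ToolCall) => `${c.name}:${JSON.stringify(c.args)}`;
  return failedCalls.some((c) => key(c) === key(next));
}
```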
2. Hallucinated Parameters
Gemini might invent parameters that aren't in your schema, such as adding a date field to our stock tool if the user says "What was Apple's price yesterday?"
Solution: Inside your execution block, sanitize the input.
// Inside your execution block: drop any keys the schema does not declare
const allowedKeys = ["ticker", "exchange"];
const cleanArgs = Object.fromEntries(
Object.entries(args).filter(([key]) => allowedKeys.includes(key))
);
Strictly speaking, you should ignore extra keys rather than throwing errors; throwing on an unexpected key often sends the model into a retry loop.
3. Multiple Tool Calls
Gemini 1.5 supports parallel function calling. The response.functionCalls() method returns an array.
Code Adjustment: Always iterate over functionCalls(), even if you only expect one.
const functionCalls = response.functionCalls();
if (functionCalls) {
const responses = await Promise.all(functionCalls.map(async (call) => {
// execute logic for each call (execute() is your own dispatcher)
return {
functionResponse: {
name: call.name,
response: { content: await execute(call) }
}
};
}));
// Send all responses back in one message
await chat.sendMessage(responses);
}
Conclusion
Reliable function calling with Gemini is less about hoping the AI is "smart enough" and more about strict contract definition. By leveraging detailed schema descriptions, enforcing required fields, and correctly handling the functionResponse loop, you transform a probabilistic chatbot into a deterministic logic engine.
The code provided here serves as a template. For production, wrap the fetchStockAPI in a try/catch block that returns a JSON error object (e.g., { error: "Ticker invalid" }) rather than throwing an exception. This allows the LLM to gracefully recover and ask the user for clarification.