Prompt API: Chrome’s Built-in AI Powerhouse with Gemini Nano
What is Prompt API?
Prompt API is an experimental feature from Chrome (currently available in the Origin Trial for Chrome 138 and later versions) that allows developers to harness the power of the Gemini Nano model through API calls. This innovative tool enables processing of natural language, images, and audio inputs directly within the browser, generating text outputs. It opens up a world of possibilities for web applications, including:
- AI-driven search: answering user questions based on webpage content
- Personalized content: dynamically categorizing news articles for user filtering
- Multimodal applications: processing text, images, and audio to generate descriptions, transcriptions, or classification results
Hardware and Usage Requirements
To utilize Prompt API effectively, both developers and end-users must meet specific hardware and software requirements:
- Operating System: Windows 10/11, macOS 13+ (Ventura or later), or Linux. Android, iOS, and ChromeOS are not currently supported.
- Storage Space: at least 22GB of free storage (the model size may change with updates). You can check the current size at chrome://on-device-internals. If free storage drops below 10GB, the model is automatically deleted and must be redownloaded.
- GPU: a graphics card with more than 4GB of VRAM (video random access memory).
- Network: an unlimited-data (unmetered) connection such as Wi-Fi or Ethernet is required.
Getting Started with Prompt API
Before diving into implementation, familiarize yourself with Google’s Generative AI Use Policy. The core functionality of Prompt API revolves around two key functions in the LanguageModel namespace:
- LanguageModel.availability(): checks whether the model is available, returning statuses like “available,” “downloadable,” or “unavailable.”
- LanguageModel.create(): creates a session and triggers the model download if necessary. Developers can enhance the user experience by monitoring download progress through events like downloadprogress.
Model Download Process
While Prompt API comes built into Chrome, the Gemini Nano model itself downloads separately when the API is first used by a website. To check whether the model is ready, use the asynchronous LanguageModel.availability() function, which returns one of the following statuses:
- "unavailable": the implementation doesn’t support the requested options, or doesn’t support prompting a language model at all.
- "downloadable": the implementation supports the requested options, but some components (like the language model or fine-tuning data) need to be downloaded first.
- "downloading": the implementation supports the requested options, but an ongoing download must complete before a session can be created.
- "available": the implementation supports the requested options without any new downloads.
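For instance, a minimal sketch of branching on these statuses might look like the following (the UI helper functions are hypothetical placeholders):
const status = await LanguageModel.availability();
switch (status) {
  case "unavailable":
    hideAiFeatures(); // hypothetical: the device or requested options aren't supported
    break;
  case "downloadable":
  case "downloading":
    showDownloadNotice(); // hypothetical: warn that a model download is needed or in progress
    break;
  case "available":
    enableAiFeatures(); // hypothetical: the model is ready to use
    break;
}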
To initiate the model download and create a language model session, use the asynchronous LanguageModel.create() function. When the availability status is "downloadable", it’s good practice to monitor download progress to keep users informed:
const session = await LanguageModel.create({
monitor(m) {
m.addEventListener("downloadprogress", (e) => {
console.log(`Downloaded ${e.loaded * 100}%`);
});
},
});
Understanding Model Parameters
The params() function provides valuable information about the language model’s capabilities, returning an object with these fields:
- defaultTopK: the default top-K value (3)
- maxTopK: the maximum allowed top-K value (8)
- defaultTemperature: the default temperature setting (1.0)
- maxTemperature: the highest allowed temperature value (2.0)
await LanguageModel.params();
// Returns: {defaultTopK: 3, maxTopK: 8, defaultTemperature: 1, maxTemperature: 2}
For those new to these terms, top-K controls how many candidate responses the model considers when generating output, while temperature affects randomness—lower values create more focused, predictable results, while higher values produce more varied, creative outputs.
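As an illustrative sketch (the specific values are arbitrary, not recommendations), you might configure a focused session for classification tasks and a more creative one for brainstorming:
const params = await LanguageModel.params();
// Low temperature and top-K: focused, repeatable output
const focusedSession = await LanguageModel.create({
  temperature: 0.2,
  topK: 1,
});
// Higher temperature, clamped to the allowed maximum: more varied output
const creativeSession = await LanguageModel.create({
  temperature: Math.min(params.defaultTemperature * 1.5, params.maxTemperature),
  topK: params.maxTopK,
});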
Creating and Managing Sessions
Once Prompt API is ready, you can create sessions using the create() function. These sessions allow interaction with the model through the prompt() or promptStreaming() functions.
Customizing Your Session
You can customize each session using an optional options object with topK and temperature parameters, which default to the values returned by LanguageModel.params(). Important note: when initializing a new session, you must either specify both topK and temperature or neither.
const params = await LanguageModel.params();
const slightlyHighTemperatureSession = await LanguageModel.create({
  // Clamp to the allowed maximum with Math.min (Math.max here would always yield 2.0)
  temperature: Math.min(params.defaultTemperature * 1.2, params.maxTemperature),
  topK: params.defaultTopK,
});
The create() function also accepts a signal field in its options object, allowing you to pass an AbortSignal to terminate the session:
const controller = new AbortController();
stopButton.onclick = () => controller.abort();
const session = await LanguageModel.create({
signal: controller.signal,
});
Using Initial Prompts
Initial prompts provide context about previous interactions, enabling features like continuing conversations after a browser restart. Here’s how to set them up:
const session = await LanguageModel.create({
initialPrompts: [
{ role: "system", content: "You are a helpful and friendly assistant." },
{ role: "user", content: "What is the capital of Italy?" },
{ role: "assistant", content: "The capital of Italy is Rome." },
{ role: "user", content: "What language is spoken there?" },
{
role: "assistant",
content: "The official language of Italy is Italian. [...]",
},
],
});
Guiding Responses with Prefixes
Beyond replaying previous conversation turns, you can include new “assistant” role messages in a prompt to shape the model’s subsequent responses. For example:
const followup = await session.prompt([
{
role: "user",
content: "I'm nervous about my presentation tomorrow",
},
{
role: "assistant",
content: "Presentations are tough!",
},
]);
In some cases, you might want to pre-fill part of the “assistant” response to guide the model toward a specific format. Add prefix: true to the trailing “assistant” message to achieve this:
const characterSheet = await session.prompt([
{
role: "user",
content: "Create a TOML character sheet for a gnome barbarian",
},
{
role: "assistant",
content: "```toml\n",
prefix: true,
},
]);
Appending Messages Without Prompting
Processing multimodal inputs can take time, so pre-sending planned prompts to populate the session can help the model start processing earlier. While initialPrompts works during session creation, the append() method lets you add context after creation:
const session = await LanguageModel.create({
initialPrompts: [
{
role: "system",
content:
"You are a skilled analyst who correlates patterns across multiple images.",
},
],
expectedInputs: [{ type: "image" }],
});
fileUpload.onchange = async () => {
await session.append([
{
role: "user",
content: [
{
type: "text",
value: `Here's one image. Notes: ${fileNotesInput.value}`,
},
{ type: "image", value: fileUpload.files[0] },
],
},
]);
};
analyzeButton.onclick = async (e) => {
analysisResult.textContent = await session.prompt(userQuestionInput.value);
};
The promise returned by append() resolves once the prompt has been validated, processed, and added to the session; it rejects if the prompt can’t be appended.
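A minimal sketch of handling that rejection, for example when the appended content can’t be validated or added:
try {
  await session.append([
    { role: "user", content: "Additional context for the next prompt." },
  ]);
} catch (error) {
  // append() rejects if the prompt can't be validated or added to the session
  console.error("Could not append to the session:", error);
}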
Session Limits and Quotas
Each language model session has a maximum token capacity. You can track usage and remaining capacity using these session properties:
console.log(`${session.inputUsage}/${session.inputQuota}`);
This helps prevent hitting limits during long conversations or complex tasks.
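For example, you could measure a prompt’s cost up front with the session’s measureInputUsage() method (covered again below for schema measurement) and only send it if it fits:
const nextPrompt = "Summarize our conversation so far.";
const usage = await session.measureInputUsage(nextPrompt);
if (usage > session.inputQuota - session.inputUsage) {
  console.warn("Prompt would exceed the remaining quota; consider a fresh session.");
} else {
  console.log(await session.prompt(nextPrompt));
}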
Maintaining Conversation Context
Each session tracks conversation context, considering previous interactions in future responses until the context window is full:
const session = await LanguageModel.create({
initialPrompts: [
{
role: "system",
content:
"You are a friendly, helpful assistant specialized in clothing choices.",
},
],
});
const result1 = await session.prompt(
"What should I wear today? It is sunny. I am unsure between a t-shirt and a polo.",
);
console.log(result1);
const result2 = await session.prompt(
"That sounds great, but oh no, it is actually going to rain! New advice?",
);
console.log(result2);
In this example, the model remembers the initial clothing question when providing updated advice for rainy weather.
Enforcing JSON Output Format
To ensure the model follows a specific JSON structure, pass a JSON schema in the responseConstraint field of the options object to prompt() or promptStreaming():
const session = await LanguageModel.create();
const schema = {
"type": "boolean"
};
const post = "Mugs and ramen bowls, both a bit smaller than intended- but that's how it goes with reclaim. Glaze crawled the first time around, but pretty happy with it after refiring.";
const result = await session.prompt(
`Is this post about pottery?\n\n${post}`,
{
responseConstraint: schema,
}
);
console.log(JSON.parse(result));
// Returns: true
By default, the implementation may include the schema in messages sent to the language model, which consumes part of your input quota. You can measure this usage by passing the responseConstraint option to session.measureInputUsage().
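For example:
const usage = await session.measureInputUsage("Is this post about pottery?", {
  responseConstraint: schema,
});
console.log(`The schema-constrained prompt would consume ${usage} tokens.`);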
To avoid this behavior, use the omitResponseConstraintInput option, but be sure to include formatting guidance in your prompt:
const ratingSchema = {
  type: "object",
  properties: { rating: { type: "number", minimum: 0, maximum: 5 } },
  required: ["rating"],
};
const result = await session.prompt(
  `
  Summarize this feedback into a rating between 0-5, only outputting a JSON
  object { rating }, with a single property whose value is a number:
  The food was delicious, service was excellent, will recommend.
  `,
  { responseConstraint: ratingSchema, omitResponseConstraintInput: true },
);
Cloning Sessions
To conserve resources, you can clone an existing session using the clone() function. Cloning resets the conversation context but preserves the initial prompts. The function accepts an optional options object with a signal field for termination:
const controller = new AbortController();
stopButton.onclick = () => controller.abort();
const clonedSession = await session.clone({
signal: controller.signal,
});
Cloning is useful when you want to explore different directions in a conversation without starting from scratch.
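As a sketch, each clone starts fresh from the original session’s initial prompts, so you can branch without rebuilding the session:
// Both clones share the original session's initial prompts but no later context
const poemBranch = await session.clone();
const storyBranch = await session.clone();
const poem = await poemBranch.prompt("Write a short poem about autumn.");
const story = await storyBranch.prompt("Write a two-sentence story about autumn.");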
Interacting with the Model
You can prompt the model using either prompt() for complete responses or promptStreaming() for incremental results.
Non-Streaming Output
For shorter responses, use prompt(), which returns the full result once generated:
// First check if a session can be created based on model availability
const available = await LanguageModel.availability();
if (available !== "unavailable") {
  const session = await LanguageModel.create();
  // Prompt the model and wait for the complete result
  const result = await session.prompt("Write me a poem!");
  console.log(result);
}
Streaming Output
For longer responses, promptStreaming() provides a ReadableStream that delivers partial results as they’re generated:
const available = await LanguageModel.availability();
if (available !== "unavailable") {
  const session = await LanguageModel.create();
  // Prompt the model and stream the result as it's generated
  const stream = session.promptStreaming("Write me an extra-long poem!");
  for await (const chunk of stream) {
    console.log(chunk);
  }
}
Streaming creates a more responsive user experience for lengthier outputs like articles, stories, or detailed explanations.
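A common pattern is to append chunks to the page as they arrive. This sketch assumes a hypothetical output element and that each chunk contains only newly generated text:
const output = document.querySelector("#output"); // hypothetical element
output.textContent = "";
const stream = session.promptStreaming("Write me an extra-long poem!");
for await (const chunk of stream) {
  output.textContent += chunk; // render partial results immediately
}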
Stopping Prompts
Both prompt() and promptStreaming() accept an optional second parameter with a signal field, allowing you to stop processing:
const controller = new AbortController();
stopButton.onclick = () => controller.abort();
const result = await session.prompt("Write me a poem!", {
signal: controller.signal,
});
This is particularly useful for implementing user-initiated cancellation of long-running requests.
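Because aborting rejects the pending promise (typically with a DOMException named "AbortError"), a sketch of graceful cancellation might look like this:
try {
  const result = await session.prompt("Write me a poem!", {
    signal: controller.signal,
  });
  console.log(result);
} catch (error) {
  if (error.name === "AbortError") {
    console.log("Prompt was cancelled by the user.");
  } else {
    throw error;
  }
}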
Terminating Sessions
When you no longer need a session, call destroy() to free resources. Destroyed sessions can’t be reused, and any ongoing operations will be aborted:
await session.prompt(
"You are a friendly, helpful assistant specialized in clothing choices."
);
session.destroy();
// This promise will reject with an error indicating the session is destroyed
await session.prompt(
"What should I wear today? It is sunny, and I am unsure between a t-shirt and a polo."
);
It’s good practice to destroy sessions when they’re no longer needed, especially in single-page applications that might remain open for extended periods.
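For instance, a sketch of tying cleanup to a view’s lifecycle in a single-page application (the lifecycle functions are hypothetical):
let session = null;

async function openAssistantView() {
  session = await LanguageModel.create();
}

function closeAssistantView() {
  if (session) {
    session.destroy(); // aborts any in-flight prompts and frees resources
    session = null;
  }
}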
Multimodal Capabilities
Starting with Chrome 138 Canary, Prompt API supports audio and image inputs for local experimentation, with text output. These capabilities enable exciting new features:
- Transcribing audio messages in chat applications
- Generating descriptions of uploaded images for captions or alt text
const session = await LanguageModel.create({
// { type: "text" } is optional unless specifying expected input languages
expectedInputs: [{ type: "audio" }, { type: "image" }],
});
const referenceImage = await (await fetch("/reference-image.jpeg")).blob();
const userDrawnImage = document.querySelector("canvas");
const response1 = await session.prompt([
{
role: "user",
content: [
{
type: "text",
value:
"Give a helpful artistic critique of how well the second image matches the first:",
},
{ type: "image", value: referenceImage },
{ type: "image", value: userDrawnImage },
],
},
]);
console.log(response1);
const audioBlob = await captureMicrophoneInput({ seconds: 10 });
const response2 = await session.prompt([
{
role: "user",
content: [
{ type: "text", value: "My response to your critique:" },
{ type: "audio", value: audioBlob },
],
},
]);
Multimodal Demonstrations
For practical examples of Prompt API with audio input, check out the Mediarecorder Audio Prompt demo. For image input examples, see the Canvas Image Prompt demo.
Performance Best Practices
Prompt API for the web is still under development. For optimal performance, follow these best practices for session management:
- Reuse sessions when possible rather than creating a new one for each interaction (see the sketch after this list)
- Monitor input usage to avoid hitting quota limits
- Destroy sessions properly when conversations end
- Use streaming for longer responses to improve perceived performance
- Provide clear user feedback during model downloads and processing
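A minimal sketch of the session-reuse pattern from the first item, caching a single session behind a helper:
let cachedSession = null;

// Reuse one session across interactions instead of creating one per prompt
async function getSession() {
  if (!cachedSession) {
    cachedSession = await LanguageModel.create();
  }
  return cachedSession;
}

async function ask(question) {
  const session = await getSession();
  return session.prompt(question);
}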
Application Scenarios
Prompt API enables a wide range of practical applications across different content types:
Text Applications
- Summarizing hotel reviews
- Generating structured data like star ratings
- Creating product descriptions from specifications
- Answering questions based on webpage content
- Categorizing news articles for personalized feeds
Image Applications
- Classifying images (e.g., detecting identification documents)
- Generating alt text for accessibility
- Comparing product images for similarities
- Analyzing visual content for specific features
- Creating captions for photos
Audio Applications
- Transcribing audio messages in encrypted chats
- Filtering live recordings in music collections
- Converting voice notes to text
- Analyzing audio content for specific patterns
- Generating descriptions of audio clips
Important Considerations
Permission Policies
By default, only top-level windows and same-origin iframes can use Prompt API. Cross-origin iframes require the allow="language-model" attribute.
Web Workers Limitation
Currently, Prompt API doesn’t support Web Workers and must run in the main document or an iframe.
Privacy and Security
Always adhere to Google’s AI usage policies and ensure user data remains secure. Since processing happens locally in the browser, sensitive information doesn’t leave the user’s device, but you should still implement appropriate data handling practices.
Storage Management
The model may be automatically deleted if storage space drops below 10GB, requiring redownload when space becomes available. Inform users about this possibility to manage expectations.
Providing Feedback
Your feedback helps shape the future of Prompt API and Gemini Nano. Here’s how you can contribute:
- Join the Early Access Program
- Submit bug reports or feature requests for Chrome’s implementation
- Share feedback on the API structure by commenting on existing issues or opening new ones in the Prompt API GitHub repository
- Participate in standardization efforts through the Web Incubator Community Group
Your input directly influences the development of this API and all built-in AI APIs, potentially leading to specialized task APIs for specific use cases like audio transcription or image description.
As Prompt API continues to evolve, it promises to unlock new possibilities for web developers, bringing powerful AI capabilities directly to browsers while maintaining user privacy through local processing. By integrating these tools, developers can create more intelligent, responsive, and accessible web applications that work seamlessly across devices.