From Zero to One: Building Your First ChatGPT App with OpenAI Apps SDK
Have you ever imagined ChatGPT not just answering questions, but also showing an interactive to-do list, a 3D solar system model, or even a pizza-ordering interface? The OpenAI Apps SDK makes this possible. This article provides a complete walkthrough of how to use the Apps SDK and its ecosystem tools to build and deploy your own embedded ChatGPT application, step by step.
Article Summary
The OpenAI Apps SDK allows developers to create interactive application interfaces for ChatGPT. Its core is a server built on the Model Context Protocol (MCP), which exposes tools to the model and combines them with frontend components (Widgets) to enable rich interactions. Developers need to build a frontend component and an MCP server. By defining tools and returning responses with UI metadata, rich embedded applications can be rendered within ChatGPT conversations. This article, based on official examples, details the complete workflow from environment setup, component development, and server deployment to integration within ChatGPT.
Part 1: Understanding Core Concepts and Architecture
Before we start coding, we need to clarify a few key concepts. The entire workflow of the OpenAI Apps SDK revolves around two core parts: the frontend component and the MCP server.
What is the Model Context Protocol (MCP)?
MCP is an open protocol for connecting large language models to external tools, data, and user interfaces. Think of it as a standardized bridge between ChatGPT and your service.
A minimal MCP server designed for the Apps SDK needs to implement three core capabilities:
- List Tools: Tell ChatGPT which tools your server supports, and the required input and output format for each tool.
- Call Tools: When ChatGPT decides to use a tool, it sends a request. Your server executes the corresponding action (like querying a database or processing data) and returns structured results.
- Return Widgets: Alongside the returned data, include metadata that can be rendered as an interface, allowing ChatGPT to display an interactive widget directly within the conversation.
Key Point: MCP is transport-agnostic. You can host it over Server-Sent Events (SSE) or Streamable HTTP; the Apps SDK supports both.
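Under the hood, these capabilities are ordinary JSON-RPC 2.0 messages. As a rough illustration (the tool definition here is hypothetical, modeled on the to-do example later in this article), a tools/list response and a tools/call request look approximately like this:

```javascript
// Illustrative MCP message shapes (JSON-RPC 2.0). The tool name, ids, and
// schema are made up for this sketch; real servers produce their own.
const toolsListResponse = {
  jsonrpc: "2.0",
  id: 1,
  result: {
    tools: [
      {
        name: "add_todo",
        description: "Creates a todo item with the given title.",
        // JSON Schema describing the tool's expected input
        inputSchema: {
          type: "object",
          properties: { title: { type: "string" } },
          required: ["title"],
        },
      },
    ],
  },
};

const toolsCallRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: { name: "add_todo", arguments: { title: "Read the Apps SDK docs" } },
};

console.log(toolsListResponse.result.tools[0].name); // "add_todo"
console.log(toolsCallRequest.method); // "tools/call"
```

The server's job is simply to answer these two kinds of messages; everything else (widget rendering, conversation flow) is handled by the host.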
How Does the Apps SDK Work?
Your frontend component (an HTML file or a webpage built with a framework) will be loaded into an iframe within the ChatGPT interface. This component communicates with ChatGPT through a global object called window.openai.
- window.openai.toolOutput: When ChatGPT loads the iframe, it injects the result of the most recent tool call into this property. Your component can read it to initialize the interface.
- window.openai.callTool: Your component can call this function to ask ChatGPT to invoke another tool on the MCP server in response to user interaction. The call returns new structured content, keeping the interface in sync.
In simple terms, the MCP server handles logic and data processing, the frontend component handles presentation and interaction, and the Apps SDK is responsible for seamlessly embedding them into ChatGPT’s conversation flow.
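That division of labor can be sketched in a few lines. Note that createWidgetSync and the fake host below are names invented for this sketch; only the toolOutput property and the callTool function are part of the window.openai surface described above:

```javascript
// Sketch of the widget/host split described above. createWidgetSync and the
// fake host are illustrative; only toolOutput and callTool come from the
// window.openai surface itself.
function createWidgetSync(openai) {
  let tasks = [...(openai?.toolOutput?.tasks ?? [])]; // initial data injected by the host
  return {
    get tasks() { return tasks; },
    // Ask the host to run a server-side tool, then adopt the returned state.
    async invoke(name, payload) {
      const response = await openai.callTool(name, payload);
      if (response?.structuredContent?.tasks) {
        tasks = response.structuredContent.tasks;
      }
      return tasks;
    },
  };
}

// A stand-in for window.openai, for running this sketch outside ChatGPT.
const fakeHost = {
  toolOutput: { tasks: [{ id: "t1", title: "Existing task", completed: false }] },
  async callTool(name, payload) {
    return {
      structuredContent: {
        tasks: [...this.toolOutput.tasks, { id: "t2", title: payload.title, completed: false }],
      },
    };
  },
};

const widget = createWidgetSync(fakeHost);
widget.invoke("add_todo", { title: "New task" }).then((tasks) => {
  console.log(tasks.length); // 2
});
```

Inside ChatGPT, fakeHost would simply be window.openai, and the server (not the widget) would own the task list.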
Part 2: Exploring Official Examples and Toolkits
Before building your own app, take a look at what OpenAI already provides. Reusing these pieces can greatly improve your development efficiency and quality.
1. Apps SDK UI Component Library
To help developers quickly build interfaces that match ChatGPT’s design style, OpenAI provides Apps SDK UI. This is a lightweight design system based on Tailwind CSS and React.
Its core advantages include:
- Out-of-the-box design tokens: preset design specifications for colors, typography, spacing, shadows, and more.
- Accessible components: built on Radix UI primitives, ensuring good accessibility support.
- Deep Tailwind 4 integration: utility classes are defined directly from the design tokens for a smooth development experience.
Installation and setup are very simple:
npm install @openai/apps-sdk-ui
Then, at the top of your global CSS file (e.g., main.css), import the necessary styles:
@import "tailwindcss";
@import "@openai/apps-sdk-ui/css";
/* Ensure Tailwind can scan for classes within the component library */
@source "../node_modules/@openai/apps-sdk-ui";
Now you can import and use pre-built, consistently styled components like <Button>, <Badge>, and <Card> in your React components.
2. Apps SDK Examples Repository
OpenAI maintains a feature-rich Examples Repository, which contains multiple runnable frontend components and their corresponding MCP servers. This is an excellent learning resource and source of inspiration.
A Look at the Repository Structure:
- src/: frontend source code for the example components.
- assets/: built HTML, JS, and CSS bundles.
- MCP server examples in multiple languages:
  - Pizzaz (Node & Python): a pizza shop example featuring list views, carousels, map views, and a checkout flow. It uses the Apps SDK UI library.
  - Solar System (Python): an interactive 3D solar system viewer.
  - Kitchen Sink Lite (Node & Python): a "kitchen sink" example demonstrating all core functions, such as reading/setting widget state, calling tools, and using host APIs (e.g., opening external links).
  - Shopping Cart (Python): demonstrates how to use widgetSessionId to maintain shopping cart state across multiple tool calls.
  - Authenticated (Python): demonstrates tool calls requiring OAuth authentication.
Preparation for Running the Examples:
- Environment requirements: Node.js 18+, Python 3.10+; the pnpm package manager is recommended.
- General steps:
  - Clone the repository and install dependencies: pnpm install
  - Build all components: pnpm run build (this generates static files in the assets/ directory)
  - Start the static file server: pnpm run serve (serves files at http://localhost:4444)
  - Navigate to a specific server directory and start the corresponding MCP server as instructed (e.g., uvicorn pizzaz_server_python.main:app --port 8000).
Note: If you are using Chrome version 142 or higher, you need to disable the #local-network-access-check flag (set it in chrome://flags/) to view locally running widget UIs properly. Remember to restart your browser after making the change.
Part 3: Hands-on Tutorial – Building a To-Do List App
Theory is one thing, but hands-on practice is better. Let’s follow the official quickstart guide to build a simple to-do list application.
Step 1: Create the Frontend Component (todo-widget.html)
We create a standalone HTML file containing all the styles, structure, and logic.
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Todo list</title>
    <style>
      /* Style definitions: fonts, colors, layout, forms, and list styles */
      :root { font-family: "Inter", system-ui, -apple-system, sans-serif; }
      body { background: #f6f8fb; padding: 16px; }
      main { background: #fff; max-width: 360px; border-radius: 16px; padding: 20px; box-shadow: 0 12px 24px rgba(15, 23, 42, 0.08); }
      /* ... more specific style definitions */
    </style>
  </head>
  <body>
    <main>
      <h2>Todo list</h2>
      <form id="add-form">
        <input id="todo-input" placeholder="Add a task" />
        <button type="submit">Add</button>
      </form>
      <ul id="todo-list"></ul>
    </main>
    <script type="module">
      // Core logic code
      const listEl = document.querySelector("#todo-list");
      const formEl = document.querySelector("#add-form");
      const inputEl = document.querySelector("#todo-input");

      // 1. Initialize the task list, preferring data injected by ChatGPT
      let tasks = [...(window.openai?.toolOutput?.tasks ?? [])];

      // 2. Render function: rebuilds the list DOM from the tasks array
      // (a minimal sketch; elaborate the markup and styling as needed)
      const render = () => {
        listEl.innerHTML = "";
        for (const task of tasks) {
          const li = document.createElement("li");
          li.dataset.id = task.id;
          const checkbox = document.createElement("input");
          checkbox.type = "checkbox";
          checkbox.checked = Boolean(task.completed);
          const label = document.createElement("span");
          label.textContent = task.title;
          li.append(checkbox, label);
          listEl.append(li);
        }
      };

      // 3. Listen for global state update events (triggered by ChatGPT)
      window.addEventListener("openai:set_globals", (event) => {
        if (event.detail?.globals?.toolOutput?.tasks) {
          tasks = event.detail.globals.toolOutput.tasks;
          render();
        }
      });

      // 4. Unified function for calling tools
      const callTodoTool = async (name, payload) => {
        if (window.openai?.callTool) {
          // In the ChatGPT environment, call via the SDK
          const response = await window.openai.callTool(name, payload);
          if (response?.structuredContent?.tasks) {
            tasks = response.structuredContent.tasks;
            render();
          }
        } else {
          // When testing locally, simulate the tool calls
          if (name === "add_todo") {
            tasks = [...tasks, { id: crypto.randomUUID(), title: payload.title, completed: false }];
          }
          if (name === "complete_todo") {
            tasks = tasks.map(task => task.id === payload.id ? { ...task, completed: true } : task);
          }
          render();
        }
      };

      // 5. Bind form submit event (add task)
      formEl.addEventListener("submit", async (e) => {
        e.preventDefault();
        const title = inputEl.value.trim();
        if (title) {
          await callTodoTool("add_todo", { title });
          inputEl.value = "";
        }
      });

      // 6. Bind list checkbox change event (complete task)
      listEl.addEventListener("change", async (e) => {
        const checkbox = e.target;
        if (checkbox.type === "checkbox") {
          const id = checkbox.closest("li")?.dataset.id;
          if (id) {
            await callTodoTool("complete_todo", { id });
          }
        }
      });

      // Initial render
      render();
    </script>
  </body>
</html>
Step 2: Build the MCP Server (server.js)
The frontend handles interaction; the backend (MCP server) provides the tools and processes the data. We'll use Node.js and the official MCP SDK.
import { createServer } from "node:http";
import { readFileSync } from "node:fs";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { z } from "zod";

// Read the HTML component we just created
const todoHtml = readFileSync("public/todo-widget.html", "utf8");

let todos = []; // Simple in-memory storage
let nextId = 1;
function createTodoServer() {
  const server = new McpServer({ name: "todo-app", version: "0.1.0" });

  // Crucial step: register the UI resource so ChatGPT knows how to fetch this component
  server.registerResource(
    "todo-widget",
    "ui://widget/todo.html", // This URI will be referenced in tool metadata
    {},
    async () => ({
      contents: [{
        uri: "ui://widget/todo.html",
        mimeType: "text/html+skybridge", // Special MIME type indicating this is an Apps SDK widget
        text: todoHtml,
        _meta: { "openai/widgetPrefersBorder": true } // Optional metadata: suggests displaying a border
      }]
    })
  );

  // Register the "Add Todo" tool
  server.registerTool(
    "add_todo",
    {
      title: "Add todo",
      description: "Creates a todo item with the given title.",
      inputSchema: { title: z.string().min(1) }, // Use Zod to define input parameter validation
      _meta: {
        "openai/outputTemplate": "ui://widget/todo.html", // Key: specifies which UI component renders this tool's response
        "openai/toolInvocation/invoking": "Adding todo", // Text displayed while invoking
        "openai/toolInvocation/invoked": "Added todo", // Text displayed after invocation
      }
    },
    async (args) => {
      const title = args?.title?.trim?.() ?? "";
      if (!title) {
        return {
          content: [{ type: "text", text: "Missing title." }],
          structuredContent: { tasks: todos }
        };
      }
      const todo = { id: `todo-${nextId++}`, title, completed: false };
      todos = [...todos, todo];
      return {
        content: [{ type: "text", text: `Added "${todo.title}".` }],
        structuredContent: { tasks: todos } // Return structured task list data
      };
    }
  );

  // Register the "Complete Todo" tool (similar structure)
  server.registerTool(
    "complete_todo",
    {
      title: "Complete todo",
      description: "Marks a todo as done by id.",
      inputSchema: { id: z.string().min(1) },
      _meta: {
        "openai/outputTemplate": "ui://widget/todo.html",
        "openai/toolInvocation/invoking": "Completing todo",
        "openai/toolInvocation/invoked": "Completed todo",
      }
    },
    async (args) => {
      const id = args?.id;
      const todo = todos.find(task => task.id === id);
      if (todo) {
        todos = todos.map(task => task.id === id ? { ...task, completed: true } : task);
        return {
          content: [{ type: "text", text: `Completed "${todo.title}".` }],
          structuredContent: { tasks: todos }
        };
      }
      return {
        content: [{ type: "text", text: `Todo ${id} was not found.` }],
        structuredContent: { tasks: todos }
      };
    }
  );

  return server;
}
// Create an HTTP server and mount the MCP service at the specified path (/mcp)
const httpServer = createServer(async (req, res) => {
  const url = new URL(req.url, `http://${req.headers.host ?? "localhost"}`);

  // Handle CORS preflight requests
  // (These permissive headers are a local-development sketch; tighten them for production.)
  if (req.method === "OPTIONS" && url.pathname === "/mcp") {
    res.writeHead(204, {
      "Access-Control-Allow-Origin": "*",
      "Access-Control-Allow-Methods": "GET, POST, DELETE, OPTIONS",
      "Access-Control-Allow-Headers": "content-type, mcp-session-id, mcp-protocol-version",
    }).end();
    return;
  }

  // Route requests for the /mcp path to the MCP server for handling
  if (url.pathname === "/mcp" && ["POST", "GET", "DELETE"].includes(req.method)) {
    const server = createTodoServer();
    // Stateless mode: no session id is generated, so each request gets a fresh transport
    const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
    await server.connect(transport);
    await transport.handleRequest(req, res);
    return;
  }

  // Return 404 for other requests
  res.writeHead(404).end("Not Found");
});

httpServer.listen(8787, () => console.log("Todo MCP server listening on http://localhost:8787/mcp"));
Step 3: Run and Test Locally
- Start the server: ensure package.json has "type": "module" set, then run node server.js.
- Test with MCP Inspector, a visual tool for checking that your server lists its tools correctly and that tool calls work: npx @modelcontextprotocol/inspector@latest http://localhost:8787/mcp
- Expose the server to the public internet: for the remote ChatGPT service to reach your local server, use a tunneling tool such as ngrok: ngrok http 8787. After running, it prints a temporary public URL (like https://abc123.ngrok.app).
Step 4: Integrate into ChatGPT
- Enable Developer Mode in ChatGPT settings.
- Go to Settings > Connectors and click Create.
- In the URL field, enter your MCP server endpoint: the ngrok URL from the previous step plus the /mcp path (e.g., https://abc123.ngrok.app/mcp).
- Name your app (e.g., "My Todo List"), add a description, and create it.
- Open a new conversation, click the "+" or "More" button next to the input box, and select the app you just added from the list.
- Now try asking ChatGPT: "Help me add a task: Read the Apps SDK documentation." ChatGPT will call your add_todo tool and render the interactive to-do list within the conversation. You can check off tasks directly in the list.
Part 4: Advanced Techniques and Best Practices
After mastering the basic build process, understanding the following concepts will make your application more powerful and robust.
1. State Management: The Power of widgetSessionId
The shopping cart example demonstrates an important concept: maintaining state across tool calls. When a user modifies a shopping cart through multiple conversation turns (e.g., “add milk”, “add bread”), how do you ensure both the model and the widget see the same, latest state?
The answer is using _meta["widgetSessionId"]. The MCP server returns a unique session ID in its response. Subsequently, whenever the widget calls a tool via window.openai.callTool, this ID is automatically attached to the request. The server can use this ID to look up and update previous state (e.g., shopping cart data stored in server memory or a database), ensuring continuity and consistency of state.
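Server-side, the idea reduces to keying state by that session ID. The sketch below is illustrative only: the Map, getCart, and addItem are made-up names, and a production server would keep this state in a database or cache rather than process memory.

```javascript
// Illustrative session store keyed by widgetSessionId. Helper names are
// invented for this sketch; production code would use persistent storage.
const cartsBySession = new Map();

function getCart(widgetSessionId) {
  if (!cartsBySession.has(widgetSessionId)) {
    cartsBySession.set(widgetSessionId, { items: [] });
  }
  return cartsBySession.get(widgetSessionId);
}

// A tool handler would read the session id from the request metadata,
// mutate that session's state, and echo the id back in its response _meta.
function addItem(widgetSessionId, item) {
  const cart = getCart(widgetSessionId);
  cart.items = [...cart.items, item];
  return {
    structuredContent: { items: cart.items },
    _meta: { widgetSessionId }, // same id flows back so follow-up calls reuse this state
  };
}

addItem("session-1", { name: "milk" });
const result = addItem("session-1", { name: "bread" });
console.log(result.structuredContent.items.length); // 2 - both turns share one cart
```

Two calls carrying the same session ID see the same cart, which is exactly the "add milk, then add bread" continuity described above.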
2. Deep Interaction Between Widget and Host
The Kitchen Sink Lite example shows how a frontend component can deeply utilize the host capabilities provided by window.openai:
- setWidgetState: the widget can proactively update its own state. This state is persisted and visible to the model in subsequent tool calls.
- requestDisplayMode: request a switch in display mode (e.g., compact or expanded).
- openExternal: safely open an external link.
- sendFollowUpMessage: send a follow-up message as the assistant.
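Because these APIs only exist inside ChatGPT, widget code often guards them so the same bundle also runs in a plain browser. The wrapper below is a sketch: createHost is a made-up helper, and the argument shapes passed to requestDisplayMode and openExternal are assumptions to be checked against the SDK reference.

```javascript
// Sketch: defensively wrapping host APIs so the widget degrades gracefully
// outside ChatGPT. The four API names come from the SDK; the argument
// shapes used here are assumptions, not confirmed signatures.
function createHost(openai) {
  const available = (fn) => typeof openai?.[fn] === "function";
  return {
    // Persist widget state so the model sees it on later tool calls.
    saveState: (state) =>
      available("setWidgetState") ? openai.setWidgetState(state) : Promise.resolve(null),
    // Ask the host to change how the widget is displayed.
    setDisplayMode: (mode) =>
      available("requestDisplayMode") ? openai.requestDisplayMode({ mode }) : Promise.resolve(null),
    // Open a link via the host instead of window.open.
    openLink: (href) =>
      available("openExternal") ? openai.openExternal({ href }) : Promise.resolve(null),
  };
}

// Outside ChatGPT, window.openai is undefined and every call becomes a no-op.
const host = createHost(undefined);
host.saveState({ draft: true }).then((r) => console.log(r)); // null
```

Inside ChatGPT you would call createHost(window.openai) once and use the wrapper everywhere else in the widget.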
3. Deployment Considerations
When you’re ready to deploy your app to a production environment:
- Set environment variables: in your server environment, set BASE_URL=https://your-server.com so that static resource reference paths are generated correctly.
- Persist state: as the shopping cart example shows, production environments should persist state (like cart contents) in a server-side database or cache, not just in memory.
- Authentication integration: for tools requiring user login, refer to the Authenticated server example to implement an authentication flow such as OAuth.
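For the BASE_URL point, the URL construction can be sketched in a few lines; the localhost fallback and the assetUrl helper name are illustrative, not part of the SDK.

```javascript
// Sketch: resolving static asset URLs from the BASE_URL environment
// variable. The fallback port and helper name are made up for this example.
const BASE_URL = process.env.BASE_URL ?? "http://localhost:8787";

function assetUrl(path) {
  // new URL() joins safely regardless of trailing slashes on BASE_URL.
  return new URL(path, BASE_URL).toString();
}

console.log(assetUrl("/assets/todo-widget.js"));
// with BASE_URL unset: "http://localhost:8787/assets/todo-widget.js"
```

Using new URL() instead of string concatenation avoids the classic double-slash and missing-slash bugs when the environment variable changes between environments.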
FAQ: Frequently Asked Questions
Q: Do I have to use React and the Apps SDK UI library?
A: Absolutely not. Your frontend component can be built with any technology stack you prefer (Vanilla HTML/JS, Vue, Svelte, etc.). The Apps SDK UI library is merely an optional toolkit for improving development efficiency and consistency.
Q: Can MCP servers only be written in Node.js or Python?
A: No. MCP is a protocol; any language that implements it can be used. Official SDKs are provided for TypeScript/Node.js and Python, and community implementations for other languages may exist.
Q: What do I need to do every time I update my MCP server (e.g., add a new tool)?
A: You need to go to the Settings > Connectors page in ChatGPT, find your app, and click the Refresh button. This prompts ChatGPT to re-fetch the latest tool list and configuration from your server.
Q: Can I charge for my app?
A: This documentation does not cover commercialization policies. Please refer to the OpenAI official platform’s developer terms and guidelines.
Q: Besides rendering UI in an iframe, what else can MCP do?
A: The core of MCP is providing "tools." Even without returning a UI widget, you can use it to let ChatGPT call your API, query your database, or execute any operation you define. Returning UI widgets is a capability the Apps SDK adds on top of MCP, specifically designed to enrich the ChatGPT interaction experience.
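A "headless" tool of that kind is just a handler returning content and structuredContent, with no openai/outputTemplate in its metadata. The handler below is entirely hypothetical (the order data and lookupOrderStatus name are invented), but it follows the same response shape as the to-do server in Part 3:

```javascript
// Sketch: a tool result with no widget attached. Without an
// "openai/outputTemplate" in the tool's _meta, ChatGPT just uses the
// returned content. Handler name and data are made up for illustration.
async function lookupOrderStatus(args) {
  // Pretend database; a real handler would query your own systems.
  const orders = { "order-42": { status: "shipped", eta: "2 days" } };
  const order = orders[args.orderId];
  if (!order) {
    return { content: [{ type: "text", text: `Order ${args.orderId} not found.` }] };
  }
  return {
    content: [{ type: "text", text: `Order ${args.orderId} is ${order.status}.` }],
    structuredContent: { orderId: args.orderId, ...order },
  };
}

lookupOrderStatus({ orderId: "order-42" }).then((r) => {
  console.log(r.structuredContent.status); // "shipped"
});
```

The model reads structuredContent either way; the widget is purely an optional presentation layer on top.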
Conclusion and Next Steps
Through this article, you have understood the core concepts of the OpenAI Apps SDK, mastered the complete process of building an interactive ChatGPT application from scratch, and glimpsed more advanced application patterns. From simple to-do lists to complex 3D visualizations, the boundaries of the ChatGPT application ecosystem are being defined by developers.
Your next steps could be:
- Dive deeper into the examples repository: run the Pizzaz or Solar System examples to experience more complex interactions.
- Customize with your data: modify the handlers in the example servers to connect them to your real business systems (databases, APIs).
- Create a brand-new component: place it in the src/ directory of the examples repository, where the build system will pick it up automatically. Then, following the examples, write a dedicated MCP server to drive it.
Building ChatGPT apps is more than just creating a new feature; it’s about shaping the future interface for human-AI collaboration. Now is the time to turn your idea into that amazing interactive moment within a ChatGPT conversation.

