
AIRI — A Practical Guide for Developers and Creators
AIRI is an open source project that aims to make “cyber life” — a digital companion that can chat, act, and even play games — available and practical for anyone to run, extend, and customize. This guide translates the original Chinese README into clear, approachable English and reorganizes the material so you can quickly understand what AIRI is, what it can do today, and how to start using and contributing to it. All content in this post is strictly drawn from the original project README.
Quick summary
AIRI is a browser-first, modular collection of tools and subprojects for building interactive digital agents — from simple chat companions to animated VTuber-style characters that can speak, listen, animate, and even interact with games such as Minecraft and Factorio. The project is designed to run in modern browsers (including mobile via PWA), supports multiple LLM backends and audio services, and provides separate packages for memory, embeddings, UI elements and integrations.
Who this guide is for
- Developers who want a practical path to run or extend an interactive digital agent in the browser or on the desktop.
- Artists and modelers who want to plug Live2D or VRM characters into a running agent.
- Researchers and hobbyists who want to experiment with multi-modal agents that combine chat, voice, and animation.
- Project contributors looking for a clear, minimal setup and an understanding of the project layout.
Why AIRI exists
AIRI is inspired by public examples of game-playing, chat-capable virtual performers. The goal is not to reproduce any single closed-source system, but to create an open, modular toolkit that lets individuals run their own “digital life” systems locally or in the cloud — with a strong emphasis on browser compatibility and developer ergonomics. AIRI positions itself as a set of interoperable packages and apps so that people with varied skills can contribute UI components, models, or runtime integrations.
What makes AIRI different
- Browser-first design. AIRI is built to run in modern browsers and supports Web APIs such as WebGPU, WebAudio, Web Workers, WebAssembly, and WebSocket. This makes it possible to run rich, interactive agents without forcing users to install heavyweight native applications. PWA support is available so the same app can be used on mobile devices. (A minimal capability-detection sketch follows this list.)
- Multi-modal capabilities. The project integrates chat, voice input, speech recognition, text-to-speech, and animated characters (VRM and Live2D), enabling agents to speak, listen, and show facial motion and simple idle animations.
- Game and agent integrations. AIRI includes agent implementations and tooling that enable the agent to play games such as Minecraft and Factorio, and to interact on chat platforms like Discord and Telegram.
- Modular ecosystem. An array of subprojects and packages focus on memory layers, embedded databases, icon sets, and deployment helpers. This modular approach lets contributors focus on a single piece without changing the entire codebase.
These design choices emphasize running advanced features in the browser while giving extension points for local or cloud backends when required.
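To make the browser-first claim concrete, the sketch below checks whether the key Web APIs are actually available before loading heavyweight features. It is an illustrative TypeScript snippet using only standard browser APIs (navigator.gpu, AudioContext, WebAssembly, Worker), not code from the AIRI repository.
// Illustrative capability check (not AIRI source code).
async function checkBrowserCapabilities() {
  // navigator.gpu is untyped without @webgpu/types, so cast loosely here.
  const gpu = (navigator as any).gpu
  // Requesting an adapter confirms WebGPU is actually usable, not just present.
  const adapter = gpu ? await gpu.requestAdapter() : null
  return {
    webgpu: adapter !== null,
    webaudio: typeof AudioContext !== 'undefined',
    wasm: typeof WebAssembly !== 'undefined',
    workers: typeof Worker !== 'undefined',
  }
}
checkBrowserCapabilities().then(caps => console.log('capabilities:', caps))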
Current project status — features and roadmap
The README tracks progress by feature area. The list below reflects what the project has implemented and what’s still under construction. Each line is taken from the project status summary.
Cognition and agents
- Agent gameplay and interaction: AIRI can control agents that play Minecraft and Factorio.
- Chat agents: running on Telegram and Discord is supported.
- Memory: browser-side database support exists using DuckDB WASM or embedded SQLite; a full-featured memory layer called Alaya is under active development.
- Local browser inference (WebGPU-based) is noted as a future item.
Audio and speech
- Browser microphone input is available (a minimal capture sketch follows this list).
- Discord audio capture is supported.
- Client-side speech recognition and speaking detection are implemented.
- ElevenLabs is listed as an integrated TTS provider.
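For orientation, here is what the standard browser microphone path looks like: a generic getUserMedia plus WebAudio sketch with a naive volume-based speaking check. This is an assumption-laden illustration of the technique, not AIRI's actual capture or speaking-detection code.
// Generic browser microphone capture with a naive speaking check (illustrative only).
async function captureMicrophone() {
  // Prompts the user for microphone permission.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  const ctx = new AudioContext()
  const source = ctx.createMediaStreamSource(stream)
  // An AnalyserNode exposes the live waveform, enough for cheap speaking detection.
  const analyser = ctx.createAnalyser()
  source.connect(analyser)
  const samples = new Float32Array(analyser.fftSize)
  setInterval(() => {
    analyser.getFloatTimeDomainData(samples)
    const rms = Math.sqrt(samples.reduce((sum, s) => sum + s * s, 0) / samples.length)
    if (rms > 0.04) // threshold chosen arbitrarily for the sketch
      console.log('speaking, rms =', rms.toFixed(3))
  }, 250)
}
captureMicrophone()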
Animation and presentation
- VRM support with model control and basic animations such as automatic blinking, gaze control, and idle eye movement (blinking is sketched below).
- Live2D support with the same basic automated behaviors.
This status snapshot is useful when choosing where to begin: if your goal is to play with agents in the browser, the current features already support voice, basic animation and remote chat integration. If you need advanced memory or fully local inference, those areas are still being expanded.
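To give a feel for what “automatic blinking” involves on the VRM side, here is a minimal sketch built on three.js and the @pixiv/three-vrm library. That AIRI uses this exact library and expression API is an assumption; the sketch only illustrates the general technique of pulsing a blink expression over time, and 'avatar.vrm' is a placeholder path.
import * as THREE from 'three'
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js'
import { VRMLoaderPlugin, type VRM } from '@pixiv/three-vrm'
// Load a VRM model; the loader plugin parses VRM extensions out of the glTF.
const loader = new GLTFLoader()
loader.register(parser => new VRMLoaderPlugin(parser))
const gltf = await loader.loadAsync('avatar.vrm') // placeholder model path
const vrm: VRM = gltf.userData.vrm
// Pulse the 'blink' expression: eyes closed for ~150 ms roughly every 4 seconds.
const clock = new THREE.Clock()
let elapsed = 0
function animate() {
  requestAnimationFrame(animate)
  const delta = clock.getDelta()
  elapsed += delta
  vrm.expressionManager?.setValue('blink', elapsed % 4 < 0.15 ? 1.0 : 0.0)
  vrm.update(delta) // applies expression weights, spring bones, etc.
}
animate()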
Supported LLM and service backends
AIRI is designed to be backend-agnostic and lists numerous LLM providers and local runtimes that the project supports or aims to support through its xsai integration. The README documents a long list of supported and work-in-progress backends. Below is a faithful transcription of that list from the source.
Currently listed as supported (marked with a check in the README):
- OpenRouter, vLLM, SGLang, Ollama, Google Gemini, OpenAI, Anthropic Claude, DeepSeek, Qwen, xAI, Groq, Mistral, Cloudflare Workers AI, Together.ai, Fireworks.ai, Novita, Zhipu (智谱), SiliconFlow (硅基流动), StepFun (阶跃星辰), Baichuan (百川), Minimax, Moonshot (月之暗面), Player2, Tencent Hunyuan (腾讯混元).
Listed as not fully integrated or work in progress:
- Azure OpenAI, iFLYTEK Spark (讯飞星火), VolcEngine Doubao (火山引擎豆包), and other platform-specific endpoints are noted as partial or WIP in the README.
This wide compatibility is intended to let developers choose a backend they prefer — whether a cloud API or a local runtime — and plug it into the AIRI stack. The README flags integration state per backend so you can plan expected work required for each one.
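Most of the backends above (OpenRouter, vLLM, Ollama, and many of the cloud providers) expose OpenAI-compatible HTTP endpoints, which is what makes swapping them practical. As a rough orientation, a direct call to such an endpoint looks like the sketch below; this is not AIRI's own integration code (which goes through xsai), and the base URL and model name are placeholders.
// Illustrative call to an OpenAI-compatible chat endpoint (placeholders throughout).
const baseURL = 'http://localhost:11434/v1' // e.g. a local Ollama instance
const response = await fetch(`${baseURL}/chat/completions`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    // Cloud providers require a real key; many local runtimes ignore this header.
    'Authorization': 'Bearer sk-placeholder',
  },
  body: JSON.stringify({
    model: 'llama3.2', // placeholder model name
    messages: [{ role: 'user', content: 'Say hello to an AIRI experimenter.' }],
  }),
})
const data = await response.json()
console.log(data.choices[0].message.content)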
Subprojects and components
AIRI acts as a meta-project that spawns and depends on smaller packages. These subprojects provide pieces you’ll likely use directly or as reference when building custom features. Here is a curated list taken directly from the project index.
- unspeech — a proxy server for /audio/transcriptions and /audio/speech, intended as a universal gateway for ASR and TTS backends.
- hfup — tooling to package and deploy the project to Hugging Face Spaces.
- @proj-airi/duckdb-wasm — an easy wrapper around @duckdb/duckdb-wasm to enable browser-side, embedded database use.
- @proj-airi/drizzle-duckdb-wasm — a Drizzle ORM driver for DuckDB WASM, used to organize data access patterns in browser memory layers.
- airi-factorio and related packages — a set of tools and libraries to allow AIRI agents to interact with Factorio and its servers, including the autorio automation libraries.
- UI and icon packages such as @proj-airi/lobe-icons provide visual resources to standardize appearance across AIRI apps.
These components reflect the project’s practical focus: rich UI, in-browser data handling, audio integration, and game interoperability. Because each component is its own package, you can adopt just the pieces relevant to your project.
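To see why an embedded, in-browser database is attractive for the memory layer, here is a minimal sketch against the upstream @duckdb/duckdb-wasm package, following its documented jsDelivr bundle setup. The @proj-airi/duckdb-wasm wrapper exists precisely to shorten this kind of boilerplate; its own API is not shown here.
import * as duckdb from '@duckdb/duckdb-wasm'
// Pick the WASM bundle best suited to this browser, served from jsDelivr.
const bundle = await duckdb.selectBundle(duckdb.getJsDelivrBundles())
// Workers cannot be loaded cross-origin directly, hence the importScripts shim.
const workerUrl = URL.createObjectURL(
  new Blob([`importScripts("${bundle.mainWorker}");`], { type: 'text/javascript' }),
)
const db = new duckdb.AsyncDuckDB(new duckdb.ConsoleLogger(), new Worker(workerUrl))
await db.instantiate(bundle.mainModule, bundle.pthreadWorker)
// Query entirely inside the browser; no server round-trip involved.
const conn = await db.connect()
const result = await conn.query('SELECT 42 AS answer')
console.log(result.toArray().map(row => row.toJSON()))
await conn.close()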
Development quickstart — run it locally
If you want to get a working copy running locally, the README gives a minimal set of commands. Use these commands from the project root; they assume you have pnpm installed and a cloned repository. Run the exact commands below:
pnpm i
pnpm dev
To run individual pieces:
- Web version (the same build that powers airi.moeru.ai): pnpm dev:web
- Desktop “tamagotchi” version (a lightweight desktop pet mode): pnpm dev:tamagotchi
- Local documentation website: pnpm -F @proj-airi/docs dev
These instructions give you a straightforward path to a development server or the specific sub-app you want to test. The README refers contributors to CONTRIBUTING.md for more in-depth development guidance.
Architecture overview
The project README includes a component diagram (Mermaid) that shows how the core pieces relate. The diagram contains the main runtime, memory drivers, UI components, stage and app layers, and agent integrations for games. A faithful translation of the diagram labels is included below so you can paste it into a Mermaid renderer if you wish to visualize the architecture in your own notes.
flowchart TD
Core("Core")
Unspeech("unspeech")
DBDriver("@proj-airi/drizzle-duckdb-wasm")
MemoryDriver("[WIP] Memory Alaya")
DB1("@proj-airi/duckdb-wasm")
SVRT("@proj-airi/server-runtime")
Memory("Memory")
STT("STT")
Stage("Stage")
StageUI("@proj-airi/stage-ui")
UI("@proj-airi/ui")
subgraph AIRI
DB1 --> DBDriver --> MemoryDriver --> Memory --> Core
UI --> StageUI --> Stage --> Core
Core --> STT
Core --> SVRT
end
subgraph UI_Components
UI --> StageUI
UITransitions("@proj-airi/ui-transitions") --> StageUI
UILoadingScreens("@proj-airi/ui-loading-screens") --> StageUI
FontCJK("@proj-airi/font-cjkfonts-allseto") --> StageUI
FontXiaolai("@proj-airi/font-xiaolai") --> StageUI
end
subgraph Apps
Stage --> StageWeb("@proj-airi/stage-web")
Stage --> StageTamagotchi("@proj-airi/stage-tamagotchi")
Core --> RealtimeAudio("@proj-airi/realtime-audio")
Core --> PromptEngineering("@proj-airi/playground-prompt-engineering")
end
subgraph Server_Components
Core --> ServerSDK("@proj-airi/server-sdk")
ServerShared("@proj-airi/server-shared") --> SVRT
ServerShared --> ServerSDK
end
STT -->|Speaking| Unspeech
SVRT -->|Playing Factorio| F_AGENT
SVRT -->|Playing Minecraft| MC_AGENT
subgraph Factorio_Agent
F_AGENT("Factorio Agent")
F_API("Factorio RCON API")
factorio-server("factorio-server")
F_MOD1("autorio")
F_AGENT --> F_API -.-> factorio-server
F_MOD1 -.-> factorio-server
end
subgraph Minecraft_Agent
MC_AGENT("Minecraft Agent")
Mineflayer("Mineflayer")
minecraft-server("minecraft-server")
MC_AGENT --> Mineflayer -.-> minecraft-server
end
XSAI("xsAI") --> Core
XSAI --> F_AGENT
XSAI --> MC_AGENT
Core --> TauriMCP("@proj-airi/tauri-plugin-mcp")
Memory_PGVector("@proj-airi/memory-pgvector") --> Memory
This diagram highlights a few core ideas:
- A browser-embedded DB (DuckDB WASM) feeds into a memory layer.
- The UI and stage layers are decoupled from core logic so different frontends can be developed independently.
- Game agents and server runtime components are connected through a common core and xsai integration (a minimal Mineflayer sketch follows this list).
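The Minecraft leg of the diagram runs through Mineflayer, a widely used Node.js bot library. For orientation, a standalone Mineflayer bot looks like the sketch below; host, port, and username are placeholders, and AIRI's actual agent layers LLM-driven decision making on top of this kind of connection rather than the hardcoded replies shown here.
import mineflayer from 'mineflayer'
// Connect a bot to a Minecraft server (placeholder host/port/username).
const bot = mineflayer.createBot({
  host: 'localhost',
  port: 25565,
  username: 'airi_sketch_bot',
})
bot.once('spawn', () => {
  bot.chat('Hello! A bare Mineflayer bot is online.')
})
// Echo player chat back; an agent would route this through an LLM instead.
bot.on('chat', (username, message) => {
  if (username === bot.username) return
  bot.chat(`You said: ${message}`)
})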
How to contribute — a minimal, practical path
If you want to contribute but don’t know where to begin, the README suggests these minimal steps. They are designed to lower the friction for first-time contributors. These suggestions are taken from the contribution guidance in the source.
- Browse the repository and pick one subproject you find interesting — for example @proj-airi/duckdb-wasm or airi-factorio.
- Read CONTRIBUTING.md for the project’s development norms and expectations.
- Open a discussion or issue to introduce yourself and explain what you plan to work on.
- If you’re an artist or modeler, consider contributing Live2D or VRM assets; if you’re a frontend developer, add UI components; if you prefer backend work, focus on integrations or memory drivers.
The project explicitly welcomes a wide range of skill sets — you do not need to be an expert in Vue, TypeScript, or WebGPU to make meaningful contributions. Creating a dedicated subdirectory for your integration (React, Svelte, or other) is encouraged.
Subproject highlights and related tools
AIRI has generated several useful side projects and companion tools that are worth knowing about if you are building or deploying agents from this stack. These are listed in the README and include unspeech, hfup, demodel, inventory, MCP Launcher, and others. Many of these projects solve practical problems — from audio proxying to packaging for hosting platforms. The README lists these resources as part of the broader AIRI ecosystem.
Similar projects and ecosystem context
The README provides a short list of open source and commercial projects related to AIRI. These help to situate AIRI within a larger ecosystem of agent and VTuber projects. The list includes open-source recreations, agent frameworks, and community projects. This is a helpful reference if you want to compare approaches or find complementary tooling.
Practical tips for getting the most out of AIRI
These concise, practical recommendations are distilled from the README’s guidance and project structure. They aim to reduce wasted effort when you start experimenting.
- Start with the docs site. If you want to understand the architecture and APIs first, run the local documentation server (pnpm -F @proj-airi/docs dev) before launching the full app.
- Pick a familiar LLM backend. The project supports many backends. Choose one you already know to reduce integration friction, then expand to others once the core flow works. The README marks which backends are fully supported and which are WIP.
- Work on UI or assets as a safe first step. The UI and animation layers are cleanly decoupled from the core logic. Contributing model assets, UI components, or icons is an effective way to get involved without changing core runtime code.
- Use the modular packages. Many features are implemented as independent packages; adopt just what you need (e.g., duckdb-wasm for local memory, unspeech for audio proxying).
Frequently asked questions (FAQ)
Is AIRI open source?
Yes. The project and many of its packages are hosted on GitHub, and a dedicated organization (@proj-airi) manages subprojects. The README includes links to the main repository and subpackages.
How can I try a live demo?
The README references a web demo hosted at airi.moeru.ai and provides a Discord invite for community interaction. Use the demo and community channels to see the project in action.
What do I need to run the project locally?
The README recommends pnpm as the package manager. The essential commands are pnpm i and pnpm dev, with additional subcommands for web, tamagotchi (desktop pet), and docs.
Can AIRI run on mobile devices?
AIRI has a browser-first design and includes PWA support, which allows usage on modern mobile browsers. The README highlights browser compatibility as a primary design goal.
Does AIRI support local inference?
The README lists local browser inference on WebGPU as a target but marks pure browser-based local inference as not yet complete. The project does, however, support many local and remote runtime options through xsai integrations.
Acknowledgements and lineage
The README credits a number of projects and design inspirations that helped shape AIRI’s approach to UI, rendering and tooling. It also notes that the project was inspired by publicly visible game-playing VTuber efforts and points to multiple community projects and libraries that guided the design choices. The acknowledgements section lists contributors and related projects referenced by the maintainers.
Closing notes
AIRI is presented in the README as an evolving, modular toolkit for building browser-first, multimodal digital agents. The project’s core strengths are its browser compatibility, modular subprojects, and a clear path for contributors across design, front-end, backend and AI-focused disciplines. If your interest is in practical experimentation — running a demo locally, integrating a favorite LLM backend, contributing a Live2D or VRM model, or extending the system to play a specific game — the README provides the commands, component references, and an active ecosystem of subprojects to get started.
Structured data snippet (FAQ + HowTo)
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "FAQPage",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "What is AIRI?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "AIRI is an open source, browser-first collection of tools for creating interactive digital agents that can chat, speak, listen, animate and interact with games."
          }
        },
        {
          "@type": "Question",
          "name": "How do I run AIRI locally?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "Install dependencies with pnpm i, then run pnpm dev. Use pnpm dev:web for the web app or pnpm dev:tamagotchi for the desktop pet mode."
          }
        }
      ]
    },
    {
      "@type": "HowTo",
      "name": "Quick local start",
      "step": [
        { "@type": "HowToStep", "text": "Clone the repository" },
        { "@type": "HowToStep", "text": "pnpm i" },
        { "@type": "HowToStep", "text": "pnpm dev (or pnpm dev:web / pnpm dev:tamagotchi)" }
      ]
    }
  ]
}