LobsterAI: An Open-Source 24/7 Personal Agent for Full-Scenario Office Automation

Core Question This Article Answers: In an era saturated with cloud-based AI chatbots, is there a solution that can autonomously operate your computer, handle complex office tasks, and keep your data strictly private on your own device?

If you have ever stayed late to compile a weekly report or spent hours refreshing web pages to gather industry information, you may have envisioned a “digital employee” living inside your computer. You would simply tell it, “Finish this report for me,” and it would act like a human—operating software, searching for information, generating files, and presenting the results. LobsterAI is designed to turn this vision into reality.

Developed by NetEase Youdao, LobsterAI is more than just a chatbot; it is a full-scenario personal assistant agent. Unlike traditional dialog-based AI, LobsterAI possesses powerful “execution capabilities.” Built on Electron, it runs on your local desktop and can autonomously invoke tools, manipulate files, and execute commands. Crucially, it stands by 24/7 and allows you to trigger it remotely via mobile IM platforms like DingTalk, Feishu (Lark), Telegram, and Discord.

Redefining Productivity: The Core Value of LobsterAI

Core Question of This Section: What substantive changes can LobsterAI bring to our daily work routine?

The core design philosophy of LobsterAI is the “Cowork Mode.” This is a working mode that deeply integrates conversation with execution. In traditional AI interactions, the AI often provides text advice or code snippets, leaving you to copy, paste, save files, and run the code yourself. In Cowork Mode, LobsterAI executes these operations directly in your local environment or an isolated sandbox.

Imagine this scenario: You need to create a PowerPoint presentation containing data charts. In LobsterAI, you simply input: “Make a quarterly report PPT based on this Excel data, including bar charts and trend analysis.” The Agent will automatically invoke the xlsx skill to read the data, apply data analysis logic to generate charts, and finally call the pptx skill to produce the slide deck. The entire process requires no manual intervention in layout or calculation.

Office Capabilities Covering All Scenarios

LobsterAI aims to cover the full workflow of daily office work. It comes with 16 built-in core skills, ranging from information gathering to content production:

  • Data Analysis & Document Processing: Supports generating Word documents, Excel spreadsheets, PowerPoint presentations, and processing PDFs. Whether writing weekly reports, creating financial statements, or preparing briefing materials, it handles the task with ease.
  • Multimedia Creation: Integrated Remotion video generation capabilities allow for the creation of promotional videos or data visualization animations. It also supports Canvas drawing design for quickly producing posters or charts.
  • Automation & Information Retrieval: The Playwright skill automates web operations such as filling out forms or scraping web data. The built-in web-search skill quickly collects and organizes information from the internet.
  • System Interaction: Supports local system tool operations, allowing file management and execution of system commands, achieving deep integration with the operating system.

Author Insight: The Unique Value of Local Agents

In an era where cloud services dominate, LobsterAI’s choice to keep core computing and storage local is thought-provoking. For enterprises and individuals alike, a great deal of sensitive data (such as financial reports and internal documents) is unsuitable for upload to cloud AI platforms. By executing tasks locally and storing data in SQLite, LobsterAI keeps that data on the user’s device. This return of “data sovereignty” is one of its greatest competitive advantages in office scenarios.

Cowork Mode: The Engine That Lets AI “Act”

Core Question of This Section: How does LobsterAI ensure it can execute tasks autonomously while maintaining system security?

The soul of LobsterAI lies in its Cowork System. This is a working session system built on the Claude Agent SDK. It allows the Agent to autonomously plan task steps, invoke tools, and execute them under user supervision.

Flexible Execution Modes

To balance execution efficiency with security, LobsterAI offers three execution modes:

| Mode | Applicable Scenario | Security Implication |
| --- | --- | --- |
| auto | General Tasks | The system intelligently judges the task nature and selects local or sandbox environments automatically. Suitable for most scenarios. |
| local | Trusted Tasks | Runs directly in the local environment at full speed. Highest efficiency, suitable for processing non-sensitive local file operations. |
| sandbox | Untrusted/Complex Tasks | Runs in an isolated Alpine Linux virtual machine. Even if dangerous commands are executed, the host system remains unaffected. Security is prioritized. |
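
To make the three modes concrete, here is a minimal TypeScript sketch of how a mode setting might resolve to an execution environment. It is not LobsterAI's actual code: the `TaskTraits` fields and the heuristic used for `auto` are assumptions made for illustration, since the article does not say how `auto` decides.

```typescript
// Hypothetical sketch: these names are illustrative, not LobsterAI's actual API.
type ExecutionMode = "auto" | "local" | "sandbox";
type Environment = "local" | "sandbox";

interface TaskTraits {
  runsGeneratedCode: boolean; // the task executes code the Agent just wrote
  staysInWorkingDir: boolean; // file access is limited to the working directory
}

// "local" and "sandbox" are explicit user choices; only "auto" needs a decision.
function resolveEnvironment(mode: ExecutionMode, task: TaskTraits): Environment {
  if (mode !== "auto") return mode;
  // Assumed heuristic: isolate anything that runs generated code
  // or reaches outside the working directory.
  return task.runsGeneratedCode || !task.staysInWorkingDir ? "sandbox" : "local";
}

console.log(resolveEnvironment("auto", { runsGeneratedCode: false, staysInWorkingDir: true })); // local
console.log(resolveEnvironment("auto", { runsGeneratedCode: true, staysInWorkingDir: true }));  // sandbox
```

The point of the sketch is that an explicit choice always wins, and only `auto` involves a judgment call.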

Permission Gating and Streaming Feedback

In a fully autonomous Agent system, users are often most concerned about “losing control.” What if the Agent deletes an important file by mistake? LobsterAI has designed a strict Permission Gating Mechanism.

All tool calls involving sensitive operations (such as deleting files, sending emails, executing terminal commands) must undergo explicit user approval. When the Agent requests to perform such an operation, a CoworkPermissionModal pops up in the frontend interface, detailing the impending action. The user can choose “Approve Once” or “Approve for Session” (trusting all operations of that tool in the current conversation).

Simultaneously, the system achieves streaming event feedback through IPC (Inter-Process Communication). You can see the Agent’s thought process, the tools being executed, and the incremental content generated in real-time. This transparency greatly enhances user trust in the AI.
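
One way to picture this streaming feedback is as a sequence of small typed events that the interface folds into its current view. The sketch below is an assumption-laden illustration: the event names (`thinking`, `tool_start`, `tool_end`, `content_delta`) and the `SessionView` shape are hypothetical and are not taken from LobsterAI's source.

```typescript
// Hypothetical event shapes; the real IPC payloads may differ.
type CoworkEvent =
  | { kind: "thinking"; text: string }
  | { kind: "tool_start"; tool: string }
  | { kind: "tool_end"; tool: string }
  | { kind: "content_delta"; delta: string };

interface SessionView {
  thoughts: string[];     // the Agent's reasoning steps so far
  runningTools: string[]; // tools currently executing
  content: string;        // incrementally generated output
}

// Pure reducer: each streamed event produces the next view state.
function applyEvent(view: SessionView, event: CoworkEvent): SessionView {
  switch (event.kind) {
    case "thinking":
      return { ...view, thoughts: [...view.thoughts, event.text] };
    case "tool_start":
      return { ...view, runningTools: [...view.runningTools, event.tool] };
    case "tool_end":
      return { ...view, runningTools: view.runningTools.filter((t) => t !== event.tool) };
    case "content_delta":
      return { ...view, content: view.content + event.delta };
  }
}

const stream: CoworkEvent[] = [
  { kind: "thinking", text: "Need to search first" },
  { kind: "tool_start", tool: "web-search" },
  { kind: "tool_end", tool: "web-search" },
  { kind: "content_delta", delta: "Here is " },
  { kind: "content_delta", delta: "the brief." },
];
const finalView = stream.reduce(applyEvent, { thoughts: [], runningTools: [], content: "" });
console.log(finalView.content); // Here is the brief.
```

A reducer of this kind fits naturally with the Redux Toolkit state management used in the renderer, although how the project actually wires it up is not described here.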

Scenario Example:
Suppose you tell the Agent to “Clean up temporary files in the download folder older than one month.” The Agent will first list the file inventory, then initiate a permission request: “About to perform file deletion operation. Approve?”. Only after you confirm and click approve will the Agent actually execute the deletion command. This is not just a security measure but a process of intent confirmation.
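
The difference between “Approve Once” and “Approve for Session” can be sketched in a few lines of TypeScript. This is a simplified model under stated assumptions: `PermissionGate`, `ToolCall`, and the synchronous `ask` callback are hypothetical stand-ins (a real modal such as CoworkPermissionModal would answer asynchronously), and none of them are LobsterAI's actual classes.

```typescript
// Hypothetical model of permission gating; not LobsterAI's real implementation.
type Decision = "deny" | "approve_once" | "approve_session";

interface ToolCall {
  tool: string;       // e.g. "delete_file"
  summary: string;    // what the user sees in the approval dialog
  sensitive: boolean; // only sensitive calls need approval
}

type AskUser = (call: ToolCall) => Decision;

class PermissionGate {
  private readonly ask: AskUser;
  private readonly sessionApproved: { [tool: string]: boolean } = {};

  constructor(ask: AskUser) {
    this.ask = ask;
  }

  // Returns true when the call may proceed.
  allows(call: ToolCall): boolean {
    if (!call.sensitive) return true;
    if (this.sessionApproved[call.tool] === true) return true; // trusted earlier in this session
    const decision = this.ask(call);
    if (decision === "approve_session") this.sessionApproved[call.tool] = true;
    return decision !== "deny";
  }
}

// Usage: the callback stands in for the user clicking a button in the dialog.
const gate = new PermissionGate(() => "approve_session");
const cleanup: ToolCall = { tool: "delete_file", summary: "Delete 12 files in Downloads", sensitive: true };
console.log(gate.allows(cleanup)); // true (user asked, approved for the session)
console.log(gate.allows(cleanup)); // true (no second prompt)
```

Note that session approval is keyed by tool name, matching the description of trusting all operations of that tool in the current conversation.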

Technical Architecture: A Solid Foundation

Core Question of This Section: How is LobsterAI architected to ensure cross-platform compatibility and data security?

LobsterAI is not a simple pile of scripts; it adopts a modernized, strict Electron process isolation architecture. Understanding its architecture helps us utilize it better.

Process Model Details

LobsterAI Architecture Diagram

  • Main Process: This is the brain of the application. It manages window lifecycles, persists SQLite data, and executes the CoworkRunner engine. It hosts the IM Gateway, handling instructions from external platforms like DingTalk and Feishu. Security is central to the main process design; it enables context isolation, disables node integration, and ensures frontend code cannot directly access underlying system resources.
  • Preload Script: Acting as a security bridge, it exposes a limited API to the frontend via contextBridge. It defines the cowork namespace, allowing the frontend to safely initiate sessions and receive streaming events.
  • Renderer Process: This is the user interface. Built on React 18 and TypeScript, it uses Redux Toolkit for state management and Tailwind CSS for styling. UI logic runs here, while anything that touches the system must go through the main process via IPC.

This three-tier architecture balances the development efficiency of web technologies with the security requirements of a desktop application.
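
The “limited API” idea behind the preload script can be illustrated without Electron itself. In the sketch below, `invoke` stands in for Electron's `ipcRenderer.invoke`, and the object returned by `createCoworkApi` is the kind of value a preload script would pass to `contextBridge.exposeInMainWorld`. The channel names and the `CoworkApi` methods are hypothetical; the article only confirms that a cowork namespace exists.

```typescript
// Hypothetical channel names; only the "cowork" namespace is mentioned in the article.
const ALLOWED_CHANNELS = ["cowork:startSession", "cowork:stopSession"] as const;

// Stand-in for ipcRenderer.invoke(channel, payload).
type Invoke = (channel: string, payload: unknown) => unknown;

interface CoworkApi {
  startSession(prompt: string): unknown;
  stopSession(sessionId: string): unknown;
}

// Preload side: the renderer receives these two functions and nothing else,
// so it cannot name an arbitrary channel.
function createCoworkApi(invoke: Invoke): CoworkApi {
  return {
    startSession: (prompt) => invoke("cowork:startSession", { prompt }),
    stopSession: (sessionId) => invoke("cowork:stopSession", { sessionId }),
  };
}

// Main-process side: a second check that rejects any channel not on the list.
function handleInvoke(channel: string, payload: unknown): unknown {
  if (!ALLOWED_CHANNELS.some((allowed) => allowed === channel)) {
    throw new Error("Blocked IPC channel: " + channel);
  }
  return { channel, payload };
}

const api = createCoworkApi(handleInvoke);
console.log(api.startSession("Summarize today's news"));
```

Checking on both sides is a defense-in-depth choice: the preload script narrows what the frontend can ask for, and the main process still refuses anything unexpected.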

Technology Stack Overview

To meet high-performance and modern development needs, LobsterAI utilizes the following core technology stack:

  • Framework & Build: Electron 40 provides the cross-platform runtime; Vite 5 handles rapid building and hot reloading.
  • Frontend Ecosystem: React 18 combined with TypeScript ensures code robustness; Tailwind CSS implements highly customizable UI; Redux Toolkit handles complex state management.
  • AI & Storage: The underlying AI engine adopts the Claude Agent SDK. Local storage uses sql.js (WebAssembly version of SQLite), achieving fully localized data management.

Skill System: An Infinitely Extensible Toolbox

Core Question of This Section: Beyond built-in functions, how can we extend the Agent’s capabilities according to our needs?

LobsterAI defines its capabilities through “Skills.” Each skill is a collection of specific functions, configured in the SKILLs/skills.config.json file. This modular design allows the Agent’s capability boundaries to be extended infinitely.

Built-in Skills Detail

LobsterAI provides a rich skill library by default, capable of meeting the vast majority of office scenarios:

| Skill Name | Function Description | Typical Application Scenario |
| --- | --- | --- |
| web-search | Web Information Retrieval | Gathering competitor data, querying industry news, organizing academic literature. |
| pptx | PowerPoint Creation | Automatically generating quarterly report PPTs, making product introduction slides. |
| remotion | Video Generation | Creating data visualization dynamic videos, generating simple marketing clips. |
| playwright | Web Automation | Automatically logging into internal systems to scrape reports, batch filling web forms. |
| scheduled-task | Scheduled Tasks | Daily automatic email summarization, weekly work report generation. |
| imap-smtp-email | Email Processing | Auto-replying to customer emails, categorizing and organizing the inbox. |

Custom Skills: Creating an Exclusive Agent

For developers or users with specific needs, LobsterAI provides the skill-creator skill. This allows users to create brand-new skills through natural language descriptions or code definitions, supporting hot loading.

Scenario Example:
You are a DevOps engineer who frequently needs to query server status. You can create a “Server Monitor” skill that defines the SSH scripts for connecting to your servers and the logic for parsing their status. Once it is created, you simply tell LobsterAI: “Check the production environment server load,” and it will automatically invoke that skill to execute the task.
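
Conceptually, a skill system like this boils down to a registry that maps names to callable units and accepts new entries at runtime. The sketch below is a hypothetical model, not LobsterAI's skill format: the `Skill` interface, the `SkillRegistry` class, and the `server-monitor` skill are all invented for illustration, and the SSH step is replaced by a sample `uptime` string so the example runs anywhere.

```typescript
// Hypothetical skill model; LobsterAI's real skill definition format is not shown in the article.
interface Skill {
  name: string;
  description: string;
  run(input: string): string;
}

class SkillRegistry {
  private skills: { [name: string]: Skill } = {};

  // Registering an existing name replaces it, which is what hot loading amounts to here.
  register(skill: Skill): void {
    this.skills[skill.name] = skill;
  }

  invoke(name: string, input: string): string {
    if (!Object.prototype.hasOwnProperty.call(this.skills, name)) {
      throw new Error("Unknown skill: " + name);
    }
    return this.skills[name].run(input);
  }
}

const registry = new SkillRegistry();
registry.register({
  name: "server-monitor",
  description: "Summarize server load from `uptime` output",
  // A real skill would fetch this text over SSH; here it arrives as the input.
  run(raw) {
    const match = /load averages?:\s*([\d.]+)/.exec(raw);
    return match ? "1-minute load: " + match[1] : "load not found";
  },
});

const sample = " 10:15:01 up 12 days,  3:02,  2 users,  load average: 0.52, 0.58, 0.59";
console.log(registry.invoke("server-monitor", sample)); // 1-minute load: 0.52
```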

Persistent Memory: The Secret to Getting Better with Use

Core Question of This Section: How does AI maintain continuous understanding of user preferences across sessions?

Traditional AI conversations are often “forgetful”: every time a new window is opened, the previous context is lost. LobsterAI introduces a Persistent Memory system to solve this pain point. It automatically extracts the user’s personal information and preferences from conversations and stores them in the local database.

Memory Extraction Mechanism

The system automatically analyzes content after each round of conversation, extracting different types of memory:

  • Personal Profile: Such as “My name is John” or “I am a Product Manager.” This type of information has the highest confidence level, and the Agent treats it as long-term background knowledge.
  • Personal Preferences: Such as “I like a concise style” or “I don’t like using emojis.” When generating documents or replies later, the Agent will automatically follow these style guides.
  • Active Notification: When you explicitly say “Remember this…”, the system records the information with the highest priority.
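
The three memory types above can be modeled as candidates with a confidence score, filtered by a threshold. The sketch below is a deliberately simple rule-based stand-in: the real system presumably relies on the language model rather than regular expressions, and the rule patterns, confidence values, and `minConfidence` parameter are all assumptions made for illustration.

```typescript
// Hypothetical sketch of memory extraction; patterns and scores are invented.
type MemoryKind = "profile" | "preference" | "explicit";

interface Memory {
  kind: MemoryKind;
  text: string;
  confidence: number;
}

interface Rule {
  kind: MemoryKind;
  pattern: RegExp; // capture group 1 holds the text to remember
  confidence: number;
  alwaysKeep: boolean; // explicit "remember this" requests bypass the threshold
}

const RULES: Rule[] = [
  { kind: "explicit", pattern: /^remember (?:this|that)[:,]?\s*(.+)$/i, confidence: 0.9, alwaysKeep: true },
  { kind: "profile", pattern: /^(my name is .+|i am an? .+)$/i, confidence: 0.9, alwaysKeep: false },
  { kind: "preference", pattern: /^(i (?:don't |do not )?(?:like|prefer) .+)$/i, confidence: 0.6, alwaysKeep: false },
];

// minConfidence plays the role of a strictness knob: higher means fewer memories are kept.
function extractMemories(message: string, minConfidence: number): Memory[] {
  const sentences = message.split(/[.!?]\s*/).map((s) => s.trim()).filter((s) => s.length > 0);
  const kept: Memory[] = [];
  for (const sentence of sentences) {
    for (const rule of RULES) {
      const match = rule.pattern.exec(sentence);
      if (!match) continue;
      if (rule.alwaysKeep || rule.confidence >= minConfidence) {
        kept.push({ kind: rule.kind, text: match[1], confidence: rule.confidence });
      }
      break; // the first matching rule decides the sentence
    }
  }
  return kept;
}

const message = "My name is John. I like a concise style. Remember this: the demo is on Friday.";
console.log(extractMemories(message, 0.8).map((m) => m.kind)); // [ 'profile', 'explicit' ]
console.log(extractMemories(message, 0.5).map((m) => m.kind)); // [ 'profile', 'preference', 'explicit' ]
```

Raising the threshold drops the lower-confidence preference while the explicit request survives, which mirrors the trade-off between recording noise and missing information.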

Practical Application of Memory

The experience improvement brought by this memory mechanism is significant. For instance, the first time you use it, you tell the Agent, “I prefer viewing code output in Markdown format and prefer Chinese replies.” In countless subsequent interactions, whether you command it to write code or generate documents, it will default to following this format without you needing to repeat yourself.

Author Reflection:
The memory function is a key step towards “Personalized Assistants.” However, the accuracy of memory is crucial. LobsterAI introduces a “Capture Strictness” setting, allowing users to control the sensitivity of automatic extraction. This reflects a product design trade-off: overly aggressive extraction might record noise, while being too conservative might miss important information. Giving control to the user is the best solution to this contradiction.

IM Remote Control: Commanding Work Anytime, Anywhere

Core Question of This Section: How can we break the physical limitations of the desktop client to achieve mobile office automation?

Modern office work is often not limited to the desk. LobsterAI’s IM Integration feature turns your mobile phone into a remote control console. Through configuration, you can bridge the Agent to DingTalk, Feishu, Telegram, or Discord.

How It Works:
When you send a message on your phone via DingTalk saying “Help me check today’s industry news and generate a brief,” the IM Gateway receives the message and passes it to the local CoworkRunner via IPC. The Agent executes the search, organization, and generation tasks on your computer, and pushes the results back to your phone upon completion.
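
The round trip described here (message in, task run, result pushed back) can be sketched as a small gateway function. Everything in this example is hypothetical: `createGateway`, the `Runner` and `Reply` callback types, and especially the allow-list of chats, which is an assumed safeguard that the article does not mention. Callbacks are used instead of a real IM SDK so that the sketch stays self-contained.

```typescript
// Hypothetical gateway model; real IM SDKs and CoworkRunner's interface will differ.
type ImPlatform = "dingtalk" | "feishu" | "telegram" | "discord";

interface IncomingMessage {
  platform: ImPlatform;
  chatId: string;
  text: string;
}

// Stand-in for handing a prompt to the local Agent and waiting for its result.
type Runner = (prompt: string, onDone: (result: string) => void) => void;
// Stand-in for sending a message back through the IM platform.
type Reply = (platform: ImPlatform, chatId: string, text: string) => void;

function createGateway(runner: Runner, reply: Reply, allowedChats: string[]) {
  return function onMessage(msg: IncomingMessage): void {
    // Assumed safeguard: ignore chats that were not explicitly bound to this Agent.
    if (allowedChats.indexOf(msg.platform + ":" + msg.chatId) === -1) return;
    runner(msg.text, (result) => reply(msg.platform, msg.chatId, result));
  };
}

// Usage with fake runner and reply functions.
const gateway = createGateway(
  (prompt, onDone) => onDone("Finished: " + prompt),
  (platform, chatId, text) => console.log("[" + platform + " " + chatId + "] " + text),
  ["dingtalk:team-chat"]
);
gateway({ platform: "dingtalk", chatId: "team-chat", text: "Generate today's industry brief" });
// [dingtalk team-chat] Finished: Generate today's industry brief
```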

Typical Scenario:
You are on the subway heading home and suddenly remember a document hasn’t been sent. You open Telegram and send a command to LobsterAI: “Send the final_version.docx on the desktop to client client_name.” The Agent executes the email sending task on your home computer, and seconds later you receive a “Sent Successfully” feedback. This seamless mobile office experience greatly liberates productivity.

Scheduled Tasks: The Ultimate Form of Automation

Core Question of This Section: How can repetitive work be completed automatically, without human intervention?

For cyclical work, LobsterAI provides powerful Scheduled Task functionality. This is not just a simple alarm clock, but an intelligent task scheduling system.

You can create tasks in two ways:

  1. Conversational Creation: Simply say “Help me collect the latest tech news every morning at 9 AM.” The Agent automatically parses the time intent and creates a Cron task.
  2. GUI Interface Creation: Manually configure detailed execution rules in the settings panel.

Application Scenarios:

  • News Collection: Automatically search for news with specified keywords at 09:00 daily, generate a summary, and push it to your IM.
  • Email Organization: Check the inbox every hour, automatically archive spam, and mark important emails.
  • System Monitoring: Check server status every 10 minutes and alert immediately if an anomaly is detected.

These tasks rely on Cron expressions, supporting minute, hour, day, week, and month granularities, truly achieving “configure once, run automatically.”
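
For readers unfamiliar with Cron syntax, the three scenarios above correspond to the standard five-field expressions `0 9 * * *` (daily at 09:00), `0 * * * *` (every hour on the hour), and `*/10 * * * *` (every 10 minutes). The TypeScript sketch below shows how such an expression can be checked against a point in time. It is a minimal illustration, not the scheduler LobsterAI uses: it supports only `*`, `*/n`, plain numbers, and comma lists, and it requires every field to match, whereas full cron implementations also handle ranges and treat day-of-month and day-of-week specially.

```typescript
// Minimal five-field cron matcher: minute hour day-of-month month day-of-week.
// Supports "*", "*/n", single numbers, and comma-separated lists only.
function fieldMatches(field: string, value: number, min: number): boolean {
  return field.split(",").some((part) => {
    if (part === "*") return true;
    if (/^\*\/\d+$/.test(part)) {
      const step = parseInt(part.slice(2), 10);
      return (value - min) % step === 0; // steps count from the field's minimum
    }
    return /^\d+$/.test(part) && parseInt(part, 10) === value;
  });
}

function cronMatches(expr: string, date: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error("Expected 5 cron fields, got " + fields.length);
  const values = [date.getMinutes(), date.getHours(), date.getDate(), date.getMonth() + 1, date.getDay()];
  const mins = [0, 0, 1, 1, 0]; // lowest legal value of each field
  return fields.every((field, i) => fieldMatches(field, values[i], mins[i]));
}

const nineAm = new Date(2024, 0, 15, 9, 0); // Monday, 15 January 2024, 09:00 local time
console.log(cronMatches("0 9 * * *", nineAm));    // true  (daily news collection)
console.log(cronMatches("0 * * * *", nineAm));    // true  (hourly inbox check)
console.log(cronMatches("*/10 * * * *", nineAm)); // true  (10-minute server check)
console.log(cronMatches("30 9 * * *", nineAm));   // false
```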

Quick Start: Deploying Your First LobsterAI

Core Question of This Section: How can an average developer get LobsterAI running on their device?

As an open-source project, the deployment process for LobsterAI is transparent and standardized.

Prerequisites

Before starting, ensure your development environment meets the following requirements:

  • Node.js: Version >= 24 and < 25. Using nvm for version management is recommended.
  • npm: Installed along with Node.js.
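
The version constraint is easy to get wrong by a major version, so here is a small TypeScript sketch of the check. The function name `isSupportedNode` is hypothetical; in a real script you would pass it `process.version`, which Node.js reports in the form `v24.x.y`.

```typescript
// Checks the stated requirement: Node.js >= 24 and < 25.
// Accepts version strings with or without the leading "v".
function isSupportedNode(version: string): boolean {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) return false;
  const major = parseInt(match[1], 10);
  return major >= 24 && major < 25;
}

console.log(isSupportedNode("v24.3.0"));  // true
console.log(isSupportedNode("v22.11.0")); // false
console.log(isSupportedNode("v25.0.0"));  // false
```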

Installation Steps

First, clone the project repository and install dependencies:

# Clone the repository
git clone https://github.com/netease-youdao/LobsterAI.git
cd LobsterAI

# Install dependencies
npm install

Starting Development Mode

Run the following command to start the development server. This launches both the Vite hot-reload service and the Electron window.

npm run electron:dev

By default, the frontend interface runs on http://localhost:5175. You can modify the code here, and the interface will refresh in real-time.

Production Build and Packaging

If you need to package the application as an executable for distribution, use the following commands:

# Compile TypeScript and package frontend assets
npm run build

# Package for different platforms
npm run dist:mac     # macOS
npm run dist:win     # Windows
npm run dist:linux   # Linux

The packaged installers will be output in the release/ directory.

Data Storage and Security Model

Core Question of This Section: Where is my data stored, and how does the system defend against potential security threats?

In the AI era, data security is one of the user’s top concerns. LobsterAI made security a core consideration from the start of its design.

Localized Data Storage

All data—including chat history, configuration information, scheduled tasks, and memory data—is stored in a local SQLite database. The file name is lobsterai.sqlite, located in the user data directory. This means that as long as you don’t actively upload it, your data stays on your hard drive.

The database table structure is clearly designed:

  • kv: Stores application configuration.
  • cowork_sessions / cowork_messages: Stores session and message history.
  • scheduled_tasks: Stores scheduled task definitions.
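
If you want to find lobsterai.sqlite on disk, it helps to know where Electron puts the user data directory by default: under `%APPDATA%` on Windows, `~/Library/Application Support` on macOS, and `$XDG_CONFIG_HOME` (or `~/.config`) on Linux, in a folder named after the application. The sketch below computes that default path. Two things are assumptions: the folder name passed as `appName` (the article does not state it, so `LobsterAI` is a guess), and that the project does not override Electron's default location.

```typescript
// Computes Electron's default userData location plus the database file name.
// The appName value used in the usage lines is a guess, not confirmed by the article.
type Platform = "darwin" | "win32" | "linux";

interface PathEnv {
  APPDATA?: string;         // Windows roaming app data directory
  XDG_CONFIG_HOME?: string; // Linux config directory override
}

function defaultDbPath(platform: Platform, home: string, appName: string, env: PathEnv = {}): string {
  const dbFile = "lobsterai.sqlite";
  if (platform === "win32") {
    const appData = env.APPDATA || home + "\\AppData\\Roaming";
    return [appData, appName, dbFile].join("\\");
  }
  const base = platform === "darwin"
    ? home + "/Library/Application Support"
    : env.XDG_CONFIG_HOME || home + "/.config";
  return [base, appName, dbFile].join("/");
}

console.log(defaultDbPath("darwin", "/Users/ana", "LobsterAI"));
// /Users/ana/Library/Application Support/LobsterAI/lobsterai.sqlite
console.log(defaultDbPath("linux", "/home/ana", "LobsterAI"));
// /home/ana/.config/LobsterAI/lobsterai.sqlite
```

Inside the application itself, the same directory is what Electron returns from `app.getPath("userData")`.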

Multi-Layer Security Defense

  1. Process Isolation: Electron’s renderer process is strictly isolated from the main process. The renderer cannot directly access Node.js APIs or system files; it must communicate through limited interfaces exposed by the preload script.
  2. Sandbox Execution: Code or scripts of uncertain origin can be run in an Alpine Linux sandbox. The sandbox is an isolated virtual environment; even if malicious code is executed, it cannot access the host file system.
  3. Permission Approval: As mentioned earlier, sensitive operations require manual approval. This prevents “hallucination” behaviors by the Agent from leading to data loss.

Practical Summary / Action Checklist

To facilitate a quick start, here is a brief action checklist:

  1. Installation: Ensure Node.js version is correct, run npm install, then npm run electron:dev.
  2. Basic Configuration: Set the working directory in the settings panel and choose the execution mode (default auto recommended).
  3. Memory Settings: Enable “Auto Capture” to let the Agent learn your preferences.
  4. IM Binding: If remote control is needed, fill in the Tokens for platforms like DingTalk/Feishu in settings.
  5. Task Testing: Try inputting “Search for the latest AI news for me” to observe the Agent’s execution flow and permission requests.
  6. Scheduled Tasks: Try creating a simple scheduled task, like “Report time every minute,” to verify the scheduling system works.

One-Page Summary

LobsterAI is a full-scenario personal assistant agent open-sourced by NetEase Youdao, built on Electron and React.

  • Core Capabilities: Autonomous execution of data analysis, document generation, video creation, email handling, and other office tasks.
  • Technical Highlights: Cowork Mode (local/sandbox execution), Persistent Memory System, IM Remote Control, Scheduled Tasks.
  • Security Architecture: Process isolation + Permission gating + Local SQLite storage, ensuring data privacy.
  • Target Audience: Office workers needing to automate repetitive tasks, developers pursuing efficiency.
  • License: MIT License, supporting free commercial use and secondary development.

Frequently Asked Questions (FAQ)

1. Which operating systems does LobsterAI support?
LobsterAI is cross-platform, supporting macOS (Intel and Apple Silicon), Windows, and Linux desktops. Combined with IM integration, it also supports mobile remote control.

2. What environment is needed to run LobsterAI?
You need to install Node.js, version 24 or higher but below 25. The npm package manager is also required.

3. Is data uploaded to the cloud?
No. LobsterAI adopts a local-first strategy. All data is stored in a local SQLite database; chat logs and configurations never leave your device.

4. How can I prevent the Agent from accidentally deleting my important files?
LobsterAI has a built-in permission gating mechanism. All sensitive operations (like deleting files or sending emails) require explicit user approval before execution.

5. What is Cowork Mode?
Cowork Mode is the core of LobsterAI. It allows the AI Agent to autonomously execute tools, manipulate files, and run commands in a local or sandbox environment, rather than just generating text suggestions.

6. Can I use LobsterAI on my mobile phone?
LobsterAI is a desktop application, but it supports remote triggering via IM platforms like DingTalk, Feishu, Telegram, and Discord. You can send commands on your phone to have the desktop Agent execute tasks.

7. How does the Agent remember my preferences?
The system has persistent memory enabled by default. You can tell the Agent your preferences directly in the conversation (e.g., “I like Markdown format”), and it will automatically extract and record them for application in future dialogues.