Local Data Desensitization: An Innovative Solution to AI Service Privacy Leaks
In today’s digital landscape, artificial intelligence services have become indispensable components of our daily lives and professional workflows. However, as AI applications proliferate, a critical challenge has emerged: the risk that AI services leak private data. From the early-2025 data breaches involving DeepSeek and OmniGPT to more recent privacy incidents in immersive-translation tools, these events are stark reminders that AI conversation records containing sensitive information face unprecedented security challenges.
AI service providers typically store user conversation records in plaintext format. These records may contain sensitive data such as names, ID numbers, phone numbers, home addresses, medical information, and even business secrets. Should a data breach occur, users face multiple risks including identity theft, financial losses, and privacy violations.
Traditional data desensitization methods rely primarily on regular-expression replacement. While straightforward, this approach has significant limitations: it can only apply predefined rules to fixed-format data such as phone numbers and ID cards, and it is ineffective for format-free information such as names, addresses, and passwords. Named Entity Recognition (NER) can identify more types of sensitive information, yet when conversations involve complex interpersonal relationships, generic placeholders such as [PERSON1] and [PERSON2] can confuse AI models and degrade response quality.
PromptMask introduces a novel local data desensitization solution that ingeniously combines the strengths of local small models and large models, protecting user privacy while maintaining AI service quality. This approach addresses the core tension between data utility and confidentiality that has long plagued AI service providers.
The Privacy Crisis in AI Services
AI service providers face an inherent dilemma: they need user data to improve services while simultaneously protecting that data from exposure. When conversations contain personal details, medical histories, or confidential business information, the stakes become exceptionally high. Consider a scenario where a user discusses their health concerns with an AI assistant – if this data is stored insecurely, it could lead to insurance discrimination, employment issues, or social stigma.
The consequences of AI privacy breaches extend far beyond individual harm. Organizations using AI services may inadvertently expose trade secrets or strategic plans through employee interactions. In sectors like healthcare and finance, such breaches could violate regulatory compliance requirements, resulting in legal penalties and loss of customer trust.
Traditional desensitization methods fall short in addressing these complex scenarios. Regular expression-based systems can identify patterns like phone numbers or email addresses but struggle with context-dependent sensitivities. For instance, a name might be sensitive in one context but acceptable in another. NER systems, while more sophisticated, create artificial placeholders that disrupt the semantic relationships between entities, potentially altering the meaning of the conversation.
How PromptMask Works
PromptMask operates through a three-tiered architecture that maintains data utility while ensuring privacy protection. The system first processes user input locally using a lightweight desensitization model, then sends the sanitized text to the cloud-based large language model, and finally reconstructs the response with appropriate entity references.
The local desensitization model performs two critical functions: entity detection and replacement. It identifies sensitive entities such as names, locations, dates, and custom-defined patterns. Instead of using generic placeholders, PromptMask employs a unique “entity relationship mapping” technique that preserves the semantic connections between entities while obscuring their actual values.
For example, consider this conversation:
User: “I need to book a flight from Beijing to Shanghai on March 15th.”
Traditional desensitization: “I need to book a flight from [CITY1] to [CITY2] on [DATE1].”
PromptMask: “I need to book a flight from Origin City to Destination City on Travel Date.”
This approach maintains the logical flow of the conversation while protecting sensitive information. The cloud model can still understand the relationships between entities and provide relevant responses.
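The round trip behind this example can be sketched as a reversible placeholder mapping: mask locally, call the cloud model on sanitized text only, then restore the real values. The function names and the stubbed cloud call below are assumptions for illustration, not PromptMask’s actual API.

```python
# Illustrative sketch of the desensitize -> cloud -> restore round trip.
# Function names and the stubbed cloud call are assumptions, not PromptMask's API.

def mask(text: str, mapping: dict) -> str:
    """Local step 1: swap real entity values for semantic placeholders."""
    for placeholder, real in mapping.items():
        text = text.replace(real, placeholder)
    return text

def cloud_llm(sanitized: str) -> str:
    """Step 2: the remote model only ever sees sanitized text (stubbed here)."""
    return f"Searching flights: {sanitized}"

def restore(response: str, mapping: dict) -> str:
    """Local step 3: put the real values back into the model's response."""
    for placeholder, real in mapping.items():
        response = response.replace(placeholder, real)
    return response

mapping = {
    "Origin City": "Beijing",
    "Destination City": "Shanghai",
    "Travel Date": "March 15th",
}
masked = mask("a flight from Beijing to Shanghai on March 15th", mapping)
# masked == "a flight from Origin City to Destination City on Travel Date"
final = restore(cloud_llm(masked), mapping)
# final contains the real cities and date again
```

Because the mapping never leaves the local machine, the cloud provider stores only the placeholder form of the conversation.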
Technical Implementation
Setting up PromptMask requires several key components. The system runs on Python 3.8+ and requires PyTorch 1.9.0 or later. Installation follows these steps:
1. Clone the repository from the official GitHub page
2. Install dependencies using pip install -r requirements.txt
3. Download the pre-trained desensitization model weights
4. Configure the settings file with your entity patterns and replacement rules
The configuration file allows users to define custom sensitive patterns using regular expressions or entity types. For instance:
```yaml
entities:
  - name: PERSON
    patterns: ['\b[A-Z][a-z]+ [A-Z][a-z]+\b']
    replacement: 'User Name'
  - name: LOCATION
    patterns: ['\b[A-Z][a-z]+, [A-Z]{2}\b']
    replacement: 'City, State'
```
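Applying such a configuration amounts to running each pattern as a regular-expression substitution. The sketch below hardcodes the parsed form of the configuration to stay free of a YAML dependency; it illustrates the idea rather than PromptMask’s actual matching engine.

```python
import re

# Parsed form of the configuration above (hardcoded here for self-containment;
# PromptMask's real loader and rule engine may differ).
ENTITIES = [
    {"name": "PERSON",
     "patterns": [r"\b[A-Z][a-z]+ [A-Z][a-z]+\b"],
     "replacement": "User Name"},
    {"name": "LOCATION",
     "patterns": [r"\b[A-Z][a-z]+, [A-Z]{2}\b"],
     "replacement": "City, State"},
]

def desensitize(text: str) -> str:
    """Apply every configured pattern for every entity type in order."""
    for entity in ENTITIES:
        for pattern in entity["patterns"]:
            text = re.sub(pattern, entity["replacement"], text)
    return text

print(desensitize("John Smith lives in Austin, TX."))
# -> User Name lives in City, State.
```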
The system processes text in batches to optimize performance. For typical use cases, it can handle approximately 1000 tokens per second on a standard CPU, with significant improvements when using GPU acceleration.
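As a rough illustration of the batching idea (the chunk size and helper below are arbitrary choices for the sketch, not PromptMask’s defaults):

```python
def batches(items, size):
    """Split a list of requests into fixed-size chunks for batched inference."""
    return [items[i:i + size] for i in range(0, len(items), size)]

requests = [f"message {i}" for i in range(10)]
chunks = batches(requests, size=4)
# 10 requests -> 3 chunks of sizes 4, 4, 2
```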
Optimization Techniques
PromptMask incorporates several optimization techniques to ensure efficient operation:
- Quantized Models: The local desensitization model uses quantized weights to reduce memory footprint and computational requirements. This allows the system to run efficiently on consumer hardware without compromising detection accuracy.
- Batch Processing: Text is processed in batches rather than as single inputs, reducing overhead and improving throughput. This is particularly beneficial for applications handling multiple simultaneous requests.
- Caching Mechanism: Common sensitive-information patterns are cached to avoid redundant processing. The system maintains a cache of frequently encountered entities and their replacements, significantly speeding up repeated interactions.
- Hardware Acceleration: GPU acceleration is supported for local inference, providing up to 5x performance improvement on compatible hardware. The system automatically detects available hardware and optimizes processing accordingly.
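A minimal sketch of the caching idea, using Python’s functools.lru_cache over a per-segment desensitization function. The phone-number pattern is just an example rule, not one of PromptMask’s.

```python
import re
from functools import lru_cache

# Example rule only; PromptMask's actual rules are configuration-driven.
PHONE = re.compile(r"\b\d{3}-\d{4}\b")

@lru_cache(maxsize=4096)
def mask_segment(segment: str) -> str:
    """Desensitize one text segment; repeated segments hit the cache."""
    return PHONE.sub("Phone Number", segment)

mask_segment("call 555-0199")  # computed
mask_segment("call 555-0199")  # served from cache
print(mask_segment.cache_info().hits)  # 1
```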
Security Considerations
When implementing PromptMask, several security measures should be observed:
- Model Updates: Regularly update the local desensitization model to ensure it can identify the latest entity patterns. The development team releases monthly updates to address emerging privacy concerns.
- Configuration Management: Securely store configuration files containing sensitive patterns. These files should be encrypted at rest and accessible only to authorized personnel.
- Network Isolation: Ensure the local model service operates within a trusted network environment. Implement firewall rules to restrict access to the local processing component.
- Access Control: Implement appropriate API access controls for the desensitization service. Use authentication mechanisms like API keys or OAuth tokens to prevent unauthorized use.
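The access-control point can be as simple as a constant-time API-key check in front of the desensitization service. The hardcoded key below is purely illustrative; a real deployment would load it from a secrets store or environment variable.

```python
import hmac

# Illustrative only: never hardcode keys in a real deployment.
EXPECTED_API_KEY = "example-key-123"

def authorized(presented_key: str) -> bool:
    """Constant-time comparison prevents timing attacks on the key check."""
    return hmac.compare_digest(presented_key, EXPECTED_API_KEY)

print(authorized("example-key-123"))  # True
print(authorized("wrong-key"))        # False
```

hmac.compare_digest is preferred over == here because it takes the same time regardless of where the strings first differ.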
Future Development Directions
The PromptMask project continues to evolve with several planned enhancements:
- Multimodal Support: Future versions will extend desensitization capabilities to image and audio data, addressing privacy concerns in multimodal AI applications that process both text and non-text inputs.
- Adaptive Desensitization: The system will incorporate context-aware desensitization that adjusts protection levels based on conversation context; for example, a medical consultation might trigger higher protection thresholds than a casual chat.
- Federated Learning Integration: By incorporating federated learning techniques, PromptMask can improve entity detection while preserving user privacy, learning from distributed data without centralizing sensitive information.
- Industry Standards: The development team actively participates in establishing industry standards for AI data desensitization, including collaborating with regulatory bodies to ensure compliance with emerging privacy regulations.
Conclusion
As we increasingly rely on AI services in our personal and professional lives, addressing data privacy concerns becomes paramount. PromptMask offers an innovative solution that balances the need for data utility with the imperative of confidentiality. By leveraging local processing to protect sensitive information before it reaches the cloud, the system maintains service quality while safeguarding user privacy.
The “local-small model, cloud-large model” architecture represents a paradigm shift in AI service design. It enables organizations to deliver sophisticated AI experiences without compromising on privacy – a balance that has been elusive in the industry. As AI applications continue to expand into sensitive domains like healthcare, finance, and legal services, technologies like PromptMask will become essential components of responsible AI deployment.
Through the synergistic work of local small models and cloud large models, PromptMask achieves an elegant balance reminiscent of “divine machinery with a hidden mask” – concealing real information to benefit others while preserving semantic relationships to accomplish tasks. This philosophy of “hiding oneself to benefit others, concealing truth to achieve goals” embodies the true essence of privacy protection in the AI era.