Meet Redakt: Practical GDPR Compliance for AI Teams
TL;DR
Telling employees "don't enter personal data into AI tools" doesn't work without giving them a way to comply. Redakt is an open-source PII anonymizer built on Microsoft Presidio that sits between your employees and their AI tools. Paste text in, get an anonymized version with placeholders. Paste the AI's response back, get the original values restored. The server never stores PII. It runs on your infrastructure, inside your network. No additional data processing agreements needed. The tool is free, the code is open.
Earlier this month I wrote about shadow AI and the compliance gap: employees using unapproved AI tools with personal data are creating quiet GDPR liability across Europe, and the gap between what the law requires and what companies actually do is growing every day.
The response made one thing clear: people know they have a problem. Current measures feel like we are all playing whack-a-mole.
The advice that post ended with was honest but incomplete. Department-specific guidelines, approved tool lists, clearer communication — all necessary, all insufficient. Because even with perfect policies, you still have the same fundamental problem: an employee sitting in front of ChatGPT with a paragraph of text containing a customer's name, email, and order history, and no practical way to strip it out before hitting enter.
So I built something.
A Tool Better Than Policy
A sales rep pastes a customer's name, email, and order history into ChatGPT's free tier, using a personal account, to draft a follow-up email. Thirty seconds later they have a polished result. They send it and never think about it again. That thirty seconds just created a potential personal data breach as defined in GDPR Article 4(12): personal data transmitted to a third party without a DPA, without a legal basis, and without the data subject's knowledge. Once the company finds out, Article 33's 72-hour notification clock starts ticking.
The sales rep isn't reckless. They're doing what every productivity blog tells them to do. I can't blame them. The pressure to adopt AI is real. The problem isn't motivation. It's that "just anonymize it first" is advice without a mechanism.
What does "anonymize it" even mean to someone who isn't a data protection specialist? Manually find every name and replace it with "Person A"? You can't solve a behavioral problem with a policy document. You solve it with a tool.
Introducing Redakt
Redakt is an open-source web application and REST API for detecting and anonymizing PII in text before it reaches an AI tool. It wraps Presidio, Microsoft's proven PII detection framework, and adds a practical workflow designed for exactly the scenario above.
Here's how it works:
1. Paste your text. The employee takes the text they want to send to an AI tool and pastes it into Redakt's web interface.
2. Redakt detects and replaces PII. Names become <PERSON_1>, email addresses become <EMAIL_1>, phone numbers become <PHONE_1>. Every entity type gets a numbered placeholder that preserves the structure and meaning of the original text.
3. Copy the anonymized text into your AI tool. The cleaned version goes into ChatGPT, Claude, DeepL, or whatever the employee prefers. The AI generates its response using the placeholders.
4. Paste the AI's response back into Redakt. The response comes back with <PERSON_1> and <EMAIL_1> intact. Redakt's deanonymization restores the original values and the employee has a finished, personalized result.
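The placeholder round-trip above can be sketched in a few lines of Python. This is a toy illustration of the mapping logic only, not Redakt's actual implementation: the regex here is a trivial stand-in for Presidio's detection, and only handles email addresses.

```python
import re

def anonymize(text):
    """Replace detected entities with numbered placeholders.

    Toy detector: matches email addresses only. In Redakt,
    detection is done by Presidio's recognizers.
    """
    mapping = {}  # placeholder -> original value (kept client-side)
    counter = {"EMAIL": 0}

    def substitute(match):
        value = match.group(0)
        for placeholder, original in mapping.items():
            if original == value:
                return placeholder  # same value -> same placeholder
        counter["EMAIL"] += 1
        placeholder = f"<EMAIL_{counter['EMAIL']}>"
        mapping[placeholder] = value
        return placeholder

    anonymized = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", substitute, text)
    return anonymized, mapping

def deanonymize(text, mapping):
    """Restore the original values in the AI's response."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text

clean, mapping = anonymize("Contact anna@example.com about the order.")
# clean == "Contact <EMAIL_1> about the order."
restored = deanonymize("Reply sent to <EMAIL_1>.", mapping)
# restored == "Reply sent to anna@example.com."
```

The important property is where `mapping` lives: it stays with the caller, so the component doing detection never needs to remember anything between requests.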
The mapping between placeholders and real values lives in the browser session. It never touches the server. The server processes text, detects PII, returns results, and forgets. Stateless by design.
The Compliance Tool Is Compliant
Every architectural decision in Redakt was made to minimize the compliance burden of the tool itself.
No PII at rest. The server never stores personal data. It processes text in memory and discards it. This means Redakt doesn't become another system you need to write a privacy policy for.
No additional DPA required. Because Redakt runs on your infrastructure and doesn't persist data, you don't need a Data Processing Agreement with anyone to use it. Compare that to sending the same data to a cloud-based anonymization service, which would itself require a DPA, international transfer mechanisms, and all the same compliance overhead you're trying to avoid.
Enterprise internal deployment. One docker compose up command and you have the full stack running inside your network. Your data never leaves your infrastructure. No cross-border transfer concerns. No third-party processing.
REST API for automation. The same anonymization capabilities available through the web UI are exposed as API endpoints. AI agents and automated workflows can call Redakt programmatically. This matters as companies move from individual AI tool usage toward agentic workflows where prompts are generated and sent without human intervention.
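A programmatic caller might look something like the sketch below. The endpoint path and JSON field names are illustrative assumptions, not Redakt's documented contract; check the repository's API docs for the real ones.

```python
import json
import urllib.request

# Assumed address of an internal Redakt deployment.
REDAKT_URL = "http://localhost:8000"

def build_request(endpoint, payload):
    """Build a JSON POST request for the Redakt API.

    The endpoint path and field names are assumptions
    for illustration, not the documented contract.
    """
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{REDAKT_URL}{endpoint}",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def anonymize_remote(text):
    """Call the (assumed) /anonymize endpoint and return its JSON result."""
    req = build_request("/anonymize", {"text": text})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# An agentic workflow would anonymize before prompting and
# deanonymize after, so only placeholders ever leave the network:
#   result = anonymize_remote("Email anna@example.com about invoice 4711.")
#   response = call_llm(result["anonymized_text"])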
Built on Presidio. This isn't a regex-based toy. Microsoft Presidio combines pattern matching (for structured PII like email addresses, IBANs, and tax IDs), NLP-based named entity recognition (for person names, locations, and organizations), and contextual scoring (surrounding words like "email" or "phone" boost detection confidence). It ships with 13 German-specific recognizers: Steueridentifikationsnummer, Reisepass, Personalausweis, KFZ-Kennzeichen, and more. For a European audience, this coverage matters.
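The layered approach, pattern matching plus contextual scoring, can be illustrated with a toy recognizer in pure Python. This is a simplified sketch of the idea, not Presidio's code; the pattern, context words, and score values are all made up for illustration.

```python
import re

# Toy illustration of Presidio-style layered detection:
# a base pattern score, boosted when context words appear nearby.
PHONE_PATTERN = re.compile(r"\+?\d[\d /-]{7,}\d")
CONTEXT_WORDS = {"phone", "tel", "mobile", "call"}

def score_phone(text):
    """Return (matched_span, confidence) pairs for phone-like spans."""
    results = []
    for m in PHONE_PATTERN.finditer(text):
        confidence = 0.4  # base score from the pattern alone
        # Look at the 30 characters before the match for context words.
        window = text[max(0, m.start() - 30):m.start()].lower()
        if any(word in window for word in CONTEXT_WORDS):
            confidence += 0.4  # context boost, like Presidio's enhancer
        results.append((m.group(0), confidence))
    return results

with_context = score_phone("Call me at +49 171 2345678")
without_context = score_phone("ref 12345678901")
```

The same number-like string scores higher next to the word "call" than in isolation, which is exactly why contextual scoring reduces both false positives and false negatives compared to patterns alone.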
This Isn't Magic
PII detection isn't perfect. No system catches 100% of personal data. Context-dependent PII — a street address that doesn't match a known pattern, a nickname, an indirect identifier — can slip through. Redakt leans toward over-detection (flagging something that isn't PII is better than missing something that is), but it's a layer of protection, not a guarantee.
This doesn't make free-tier AI tools compliant. Even with anonymized text, using free-tier tools for business purposes raises other compliance questions (terms of service, data retention policies, lack of enterprise controls). Redakt reduces the personal data risk, but the ideal setup is still: enterprise-tier tools with proper DPAs, plus anonymization as a defense-in-depth layer.
Behavioral adoption is still the hard part. The tool exists. Getting every employee to use it before every prompt is a change management challenge, not a technical one. But having a concrete, easy-to-use tool makes that conversation much more practical than "just be careful with personal data."
A Way Forward
Compliance isn't about restricting AI use. Every regulation I've worked through as an AI engineer has the same underlying logic: you can use these tools, but you need to protect the people whose data you're processing. That's not an unreasonable ask. It's the minimum.
The code is on GitHub. The predecessor post on shadow AI and the compliance gap gives the full regulatory context. Redakt is one tool for one part of that problem. It's open source because compliance tooling shouldn't be a profit center; it should be infrastructure.