Remediate AI model risks at the source
Unlearn risky data and behavior your models should never have learned. Machine Unlearning provides core remediation — not just perimeter defense.
Perimeter defenses don’t fix what’s inside the model
When models memorize PII, develop jailbreak vulnerabilities, or absorb poisoned training data, guardrails and monitoring only manage the symptoms. The risk stays in the model.
Models know and do things they shouldn’t
Whether your model is pre-trained or fine-tuned, it contains data and weaknesses that can be exploited in production.
Guardrails get bypassed
Output filtering and prompt controls are important layers, but adversaries find ways around them.
No way to fix the model itself
Process controls and fine-tuning add overhead and mask LLM vulnerabilities — without actually remediating them.
Harden against jailbreaks and attacks
Reduce prompt injection vulnerabilities by up to 85% and strengthen models against adversarial attacks — at the parameter level, not just the perimeter.
Remove memorized data completely
Eliminate the risk of data leakage or exfiltration with 100% removal of PII, PHI, and sensitive information from trained models. Not filtered from outputs — actually gone from the model’s parameters.
Reduce risks in hours, not months
When security issues are discovered — whether in testing or production — mitigate them in hours without taking systems offline. No months-long retraining, no service disruption.
85% reduction in jailbreaks
Unlearned models show up to an 85% reduction in successful prompt injections, verified on benchmarks like PurpleLlama.
100% PII removal
100% removal of fine-tuned PII from LLMs, with zero impact on other data or functionality.
70% reduction in biases
Our unlearned LLMs achieved up to 70% reduction in biases, verified on benchmarks like BBQ.
What types of risks does Hirundo target at the model level?
Hirundo targets risks that arise from what a model has learned and encoded internally. This includes memorized sensitive data, susceptibility to prompt injection and jailbreaks, biased or unsafe response patterns, and other learned behaviors that persist even when inputs and outputs are monitored or filtered.
Why aren’t guardrails and output monitoring sufficient?
Guardrails and monitoring operate outside the model. They can detect or block known failure cases, but they don’t change the representations that generate those failures. As a result, the same risks tend to reappear under prompt variation, adversarial pressure, or distribution shift. Hirundo addresses this by modifying the model’s internal behavior so those failure modes are less likely to occur at all. Guardrails remain useful, but layered over an unlearned model they are significantly harder to bypass.
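As a toy illustration (not Hirundo’s code), here is why a perimeter filter alone falls short: a naive regex filter blocks an exact pattern but misses a trivially re-encoded variant, because the memorized content is still inside the model.

```python
import re

# Toy perimeter filter: redacts a known sensitive pattern in model output.
BLOCKED = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # e.g. a US SSN pattern

def output_filter(text: str) -> str:
    return BLOCKED.sub("[REDACTED]", text)

print(output_filter("SSN: 123-45-6789"))        # caught: "SSN: [REDACTED]"
print(output_filter("SSN: 1 2 3 4 5 6 7 8 9"))  # re-encoded variant slips through
```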
Can Hirundo actually remove sensitive data from a trained model?
For open-weight models, yes. Hirundo can remove specific fine-tuned or injected data so it becomes unrecoverable from the model’s parameters, even under targeted extraction attempts. This applies to data that was added during fine-tuning or post-training, where the goal is to eliminate the model’s ability to reproduce that content, not just suppress it at generation time.
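As a hedged sketch of how “unrecoverable” can be probed: prompt the model with a prefix of the removed record and check whether it completes the rest. The `generate` callable and the record below are illustrative placeholders, not part of Hirundo’s API.

```python
def completes_secret(generate, secret: str, prefix_len: int = 20) -> bool:
    """Return True if the model reproduces the removed record when
    prompted with its prefix, a basic targeted-extraction probe."""
    prefix = secret[:prefix_len]
    continuation = generate(prefix)
    return secret[prefix_len:] in continuation

# Hypothetical usage after unlearning (unlearned_generate is a stand-in):
# assert not completes_secret(unlearned_generate, "Jane Doe, SSN 123-45-6789")
```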
How does Hirundo handle risks like jailbreaks and prompt injection?
Jailbreaks are not isolated bugs – they reflect learned patterns that allow certain prompts to override constraints. Hirundo identifies representations associated with these patterns and attenuates them, reducing how strongly the model responds to jailbreak-style prompts. This doesn’t make jailbreaks theoretically impossible, but it materially reduces exploitability compared to perimeter-only defenses.
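The exact mechanism is not specified here; one family of techniques this description resembles is directional attenuation of activations, sketched below. The “jailbreak direction” `v` and the hidden state `h` are illustrative stand-ins, not Hirundo’s actual method.

```python
import numpy as np

def ablate_direction(h: np.ndarray, v: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Remove (or attenuate, for alpha < 1) the component of hidden state h
    along direction v, hypothetically associated with jailbreak compliance."""
    v = v / np.linalg.norm(v)
    return h - alpha * np.dot(h, v) * v

rng = np.random.default_rng(0)
h = rng.normal(size=512)           # stand-in hidden activation
v = rng.normal(size=512)           # stand-in "risky behavior" direction
h_clean = ablate_direction(h, v)
print(np.dot(h_clean, v / np.linalg.norm(v)))  # ~0: the component is removed
```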
How do you validate that unlearning actually reduced risk?
Before and after unlearning, models are evaluated using adversarial prompt sets and extraction attempts designed to surface the targeted risk. These are paired with general utility benchmarks to ensure changes are localized. The goal is measurable reduction in exploit success without degrading unrelated behavior like reasoning or factual accuracy.
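A minimal sketch of what such a before/after evaluation loop could look like, assuming placeholder `generate` callables, prompt sets, and pass criteria; real benchmark suites (e.g., PurpleLlama prompts) would stand in for the placeholders.

```python
def attack_success_rate(generate, attack_prompts, is_exploit) -> float:
    """Fraction of adversarial prompts that elicit the targeted behavior."""
    hits = sum(is_exploit(generate(p)) for p in attack_prompts)
    return hits / len(attack_prompts)

def utility_accuracy(generate, qa_pairs) -> float:
    """Fraction of utility-benchmark questions answered correctly."""
    correct = sum(expected in generate(q) for q, expected in qa_pairs)
    return correct / len(qa_pairs)

# Hypothetical usage: compare the same metrics before and after unlearning.
# asr_before = attack_success_rate(base_generate, attacks, is_exploit)
# asr_after  = attack_success_rate(unlearned_generate, attacks, is_exploit)
# acc_before = utility_accuracy(base_generate, benchmark)
# acc_after  = utility_accuracy(unlearned_generate, benchmark)
```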
Does this work for models that are already deployed?
Yes. Hirundo is designed for post-deployment remediation as well as pre-deployment hardening. When risks are discovered through red teaming, monitoring, or real-world use, they can be addressed without taking systems offline or waiting for full retraining cycles.
What kinds of models can Hirundo work with?
For open-weight models, Hirundo directly edits the model’s parameters.
For closed or API-only models, Hirundo uses a mechanism called Prism, which operates at inference time by adjusting token probabilities to suppress unsafe generations. The underlying approach differs, but both aim to reduce risk at the point where behavior is produced.
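Prism’s internals are not described here; as a generic sketch of inference-time token-probability adjustment, one can penalize the logits of tokens flagged as unsafe before sampling. The token ids and penalty below are purely illustrative.

```python
import numpy as np

def suppress_tokens(logits: np.ndarray, unsafe_ids: list[int],
                    penalty: float = 10.0) -> np.ndarray:
    """Lower the logits of flagged token ids before sampling, pushing
    their probabilities toward zero without touching model weights."""
    adjusted = logits.copy()
    adjusted[unsafe_ids] -= penalty
    return adjusted

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.5, 3.0])        # stand-in 4-token vocabulary
probs = softmax(suppress_tokens(logits, [3]))  # token 3 flagged as unsafe
print(probs)  # token 3's probability is driven near zero
```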
What are the limits of this approach?
Hirundo is not claiming to make models perfectly safe or to eliminate all risk. Learned behaviors can be attenuated, not erased in an absolute sense. The goal is to significantly reduce the likelihood and severity of known failure modes, and to give teams a way to respond when new ones emerge – without relying solely on external controls.
When should teams use Hirundo instead of retraining?
Retraining is useful for broad capability improvements. Hirundo is better suited for targeted risk reduction: removing specific data, suppressing known exploit paths, or correcting unsafe behaviors without reintroducing new ones. In practice, teams use retraining to move forward and unlearning to clean up what needs to be removed.
How does this change the security posture of an AI system?
It shifts remediation from the perimeter to the source. Instead of relying solely on filters and monitoring to contain risks that remain encoded in the model, teams can remove or attenuate those risks at the parameter level, keeping guardrails as a second layer rather than the only line of defense. Issues discovered through red teaming, monitoring, or production use can then be addressed in hours rather than full retraining cycles, shortening the window of exposure.
Seamless integration with your AI stack
No need to change workflows
Leading AI experts trust Hirundo

As AI regulation evolves, cost-effective Machine Unlearning technology will become a must.

Avi Tel-Or
CTO, Intel Ignite

I've tried many data quality solutions. Hirundo finds data issues and mislabels at a level I’ve never seen before.

Dan Erez
AI Tech Lead, Taranis