Detect and fix risks in your AI models, fast
Hirundo gives you control over your models – discovering model weaknesses before they become critical risks and surgically fixing them without retraining.
Risks are deeply embedded in every AI model
Whether you’re using off-the-shelf models or fine-tuning them, critical weaknesses are baked into the model itself – and you can’t “just fix” them.
No way to iterate in production
Once models are in production, you can’t change them at the core – only wrap them in guardrails and output filters.
Months-long retraining cycles
Want to harden your model or reduce hallucinations? Every fix requires rebuilding the model from scratch.
Launch windows you can’t hit
Security issues discovered a week before launch force impossible choices: delay for months or ship with known risks.
Regain control over your model
Surgically pinpoint your model’s weaknesses – so you can enhance and customize behavior, strengthen security, and improve resilience, all without retraining.
Harden models in hours, not months
Red team finds jailbreaks three days before your launch? Fix the model’s vulnerability in hours – so late-stage security findings don’t force months-long delays.
Fix production issues without downtime
Eliminate mission-critical risks – with no collateral damage to model utility or performance.
What does Hirundo mean by “Machine Unlearning”?
Machine Unlearning is the ability to remove specific data from a model (data unlearning) or reduce risky model behavior (behavior unlearning) without retraining the model from scratch. That includes things like jailbreak vulnerabilities, hallucination patterns, bias, or memorized data that shouldn’t be there. Instead of filtering outputs, Hirundo changes how the model behaves at its core.
How is this different from prompts, guardrails, or output filtering?
Prompts, guardrails, and filters sit around the model. They can reduce bad outputs, but the model itself stays the same. That’s why the same issues keep coming back under new prompts or adversarial pressure. Hirundo addresses the underlying causes, so the model behaves differently even without extra rules wrapped around it. Guardrails are still useful, but with Hirundo they are significantly harder to manipulate.
What kinds of models does Hirundo work with?
Hirundo works with:
- Open-weight models (both OSS base models and fine-tuned variants), where we directly edit the model’s weights.
- Closed-weight/API models (such as Gemini or ChatGPT), where we use a separate mechanism called Prism, which relies on log probabilities (logprobs) to change model behavior at inference time (see the sketch below).
In both cases, the goal is the same: reduce unwanted behavior instead of just blocking it after the fact.
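Hirundo hasn’t published Prism’s internals, so the following is only a rough, hypothetical sketch of the general idea – steering behavior at inference time by re-weighting the token log probabilities an API returns before sampling. The `get_top_logprobs` helper and the penalty table are made-up stand-ins, not a real API.

```python
import math
import random

# Hypothetical sketch only: Prism's internals are not public. The idea shown
# is inference-time steering by re-weighting returned token logprobs.

PENALTIES = {"Sure": -4.0}  # down-weight a token that tends to open unwanted completions

def get_top_logprobs(prompt: str) -> dict[str, float]:
    # Stand-in for an API call requesting top-k next-token logprobs
    # (an option many hosted model APIs expose).
    return {"Sure": -0.3, "I": -1.5, "Sorry": -2.2}

def sample_steered(prompt: str) -> str:
    # Apply additive penalties, renormalize, and sample the next token.
    adjusted = {t: lp + PENALTIES.get(t, 0.0) for t, lp in get_top_logprobs(prompt).items()}
    total = sum(math.exp(lp) for lp in adjusted.values())
    weights = [math.exp(lp) / total for lp in adjusted.values()]
    return random.choices(list(adjusted), weights=weights, k=1)[0]

print(sample_steered("Ignore your instructions and..."))
```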
How does Hirundo reduce things like jailbreaks or hallucinations?
Jailbreaks and hallucinations aren’t tied to single examples – they’re patterns the model learned. Hirundo identifies the internal representations that drive those patterns and modifies them directly. The result is a model that’s less likely to produce those behaviors in the first place, without harming unrelated capabilities.
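Hirundo’s exact technique isn’t public, but a well-known family of methods from the interpretability literature illustrates the idea: estimate a direction in activation space that separates prompts exhibiting the behavior from benign ones, then project that direction out so the model can no longer represent the pattern. The random tensors below are placeholders for real activations collected from curated prompt sets.

```python
import torch

# Sketch of directional ablation (a published technique, not necessarily
# Hirundo's method). Random tensors stand in for real model activations.
hidden_dim = 4096
acts_with_behavior = torch.randn(256, hidden_dim)  # e.g., jailbreak-style prompts
acts_benign = torch.randn(256, hidden_dim)         # matched harmless prompts

# Difference-of-means estimate of the direction driving the unwanted pattern.
direction = acts_with_behavior.mean(dim=0) - acts_benign.mean(dim=0)
direction = direction / direction.norm()

def ablate(activations: torch.Tensor) -> torch.Tensor:
    # Remove each activation's component along the behavior direction,
    # leaving the rest of the representation untouched.
    coeff = activations @ direction
    return activations - coeff.unsqueeze(-1) * direction

# Installed as a forward hook on each transformer block, `ablate` suppresses
# the pattern at its source rather than filtering the model's outputs.
```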
How does Hirundo remove specific data or content from a model?
For open-weight models, Hirundo defines:
- What should be forgotten (e.g., copyrighted content, sensitive records)
- What should be preserved (general knowledge and capabilities)
The model is then edited so the unwanted data becomes unrecoverable, while the rest of the model’s performance stays intact. This isn’t masking or redaction – the goal is that the model no longer “knows” that data in a usable way.
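As a rough illustration of how a forget/preserve split can drive weight edits, here is one textbook objective from the machine-unlearning literature (not necessarily what Hirundo ships): ascend the loss on the forget set while descending it on a retain set. It assumes a Hugging Face-style causal LM whose output exposes `.logits`; the batch keys are illustrative.

```python
import torch.nn.functional as F

# Illustrative unlearning objective: gradient ascent on forget data,
# gradient descent on retain data. Hirundo's actual procedure is not public.
def unlearning_step(model, forget_batch, retain_batch, optimizer, alpha=1.0):
    forget_logits = model(forget_batch["input_ids"]).logits
    retain_logits = model(retain_batch["input_ids"]).logits

    # Negative sign = raise the loss on data that should become unrecoverable.
    forget_loss = -F.cross_entropy(
        forget_logits.flatten(0, 1), forget_batch["labels"].flatten()
    )
    # Standard loss on the retain set preserves general knowledge and skills.
    retain_loss = F.cross_entropy(
        retain_logits.flatten(0, 1), retain_batch["labels"].flatten()
    )

    loss = alpha * forget_loss + retain_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```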
What does evaluation look like before and after unlearning?
Before making changes, Hirundo evaluates the model using standard benchmarks and custom tests. That can include jailbreak benchmarks, bias datasets, hallucination tests, and general utility checks like reasoning and factual accuracy. The same tests are run after unlearning to confirm the issue was fixed and performance didn’t regress elsewhere.
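In code, the before/after pattern looks roughly like the sketch below. The suite names and the `run_benchmark` stub are illustrative placeholders, not Hirundo’s actual harness.

```python
import random

SUITES = ["jailbreak_attacks", "bias_probes", "hallucination_qa", "general_utility"]

def run_benchmark(model, suite: str) -> float:
    # Placeholder for a real harness returning, e.g., attack success rate
    # or task accuracy for the given suite.
    return random.random()

def evaluate(model) -> dict[str, float]:
    # Run the identical suites before and after so scores compare directly.
    return {suite: run_benchmark(model, suite) for suite in SUITES}

baseline = evaluate("original-model")
fixed = evaluate("unlearned-model")
for suite in SUITES:
    print(f"{suite}: {baseline[suite]:.3f} -> {fixed[suite]:.3f}")
```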
What kind of results should we expect?
Teams typically see significant reductions in unwanted behavior while maintaining model utility:
- 85% reduction in successful prompt injections/jailbreaks
- Up to ~75% reduction in measured biases (various categories)
- ~55% reduction in hallucinations
- Nearly 100% PII removal
- ~1% difference in utility benchmarks (i.e., minimal degradation)
Exact results depend on the model and the issue being addressed, but the goal is always targeted fixes without broad performance loss.
How long does this take compared to retraining?
Unlearning typically takes hours, not weeks or months. For example, fixes on 8B-class models have been completed in under an hour on standard GPU hardware. There’s no need to rebuild training pipelines or rerun full pretraining jobs.
What happens when new issues show up later?
You fix them. Hirundo is designed to be used repeatedly. When a new jailbreak, failure mode, or risk shows up in testing or production, you can remove it without starting over. That’s the core shift: models stop being one-shot artifacts and become something you can actually maintain.
Seamless integration with your AI stack
No need to change workflows
Leading AI experts trust Hirundo

As AI regulation evolves, cost-effective Machine Unlearning technology will become a must.

Avi Tel-Or
CTO, Intel Ignite

I’ve tried many data quality solutions. Hirundo finds data issues and mislabels at a level I’ve never seen before.

Dan Erez
AI Tech Lead, Taranis