Harmful content

We define harmful content as any language that attacks, abuses or discriminates against a person or a group based on an identity attribute (e.g. sexism, racism, religious discrimination, ableism, LGBTQI+ discrimination), in line with evolving international standards (e.g. those of the U.N. and E.U.). We also track threats of violence and non-identity-based abuse (e.g. non-identity-based forms of online bullying).

Test potentially harmful content

This is an early release of CaliberAI, and you may notice false positives and false negatives as we refine our predictive models.




What makes us different

CaliberAI's pioneering tools are powered by a unique, high-quality dataset of defamation examples, curated by a team of annotation experts, and augment human capability to detect language that carries a high level of legal and defamatory risk.


Unique data

Unique, carefully crafted datasets train multiple machine learning models for production deployment.

Expert led

Expert annotation, overseen by a diverse, publisher-led team with deep expertise in news, law, linguistics and computer science.

Explainable outputs

Pre-processing and post-processing with explainable AI outputs.
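To make the idea of an explainable output concrete, here is a minimal, purely illustrative sketch. CaliberAI's real system uses trained machine learning models; the keyword lexicon, category names and function below are hypothetical stand-ins, intended only to show the shape of a flag decision that is tied to the evidence behind it.

```python
# Hypothetical sketch of an explainable harmful-content check.
# The real product uses trained classifiers; this keyword lexicon
# is a placeholder to illustrate the output structure only.

HYPOTHETICAL_LEXICON = {
    "identity_attack": ["placeholder_slur_a", "placeholder_slur_b"],
    "threat_of_violence": ["i will hurt", "kill you"],
}

def classify(text: str) -> dict:
    """Return a flag decision plus the evidence that triggered it."""
    lowered = text.lower()
    hits = {
        category: [term for term in terms if term in lowered]
        for category, terms in HYPOTHETICAL_LEXICON.items()
    }
    matched = {category: terms for category, terms in hits.items() if terms}
    return {
        "flagged": bool(matched),
        "categories": sorted(matched),
        # The explanation is what makes the output auditable:
        # each category flag points at the text that caused it.
        "explanation": matched,
    }
```

In a production pipeline the pre-processing step (normalisation, tokenisation) would run before the model and the post-processing step would attach explanations like the `explanation` field above, so that a human reviewer can see why a passage was flagged rather than just that it was.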

Contact us

Get a closer look at how our solutions work and learn how CaliberAI's technology can integrate with your technology stack and editorial workflow.

Get in touch at sales@caliberai.net