How safe is AI? Toward Trustworthy AI
Computer Science / Digital

As AI systems grow more pervasive, questions of safety and trust demand urgent, practical answers.
Start: 17:00
End: 00:00

At a glance

Technische Universität Dresden (TUD)
ScaDS.AI Dresden/Leipzig
Andreas-Pfitzmann-Bau
1020
Nöthnitzer Straße 46
01187 Dresden (Dresdner Süden)

Description

This session presents five research efforts spanning the AI safety landscape: IITGnGPT, a classifier that identifies AI-generated content across model families; UnityAI-Guard 1.0 and 2.0, toxicity detectors for low-resource Indian languages that cover 17 fine-grained harm categories across 12 languages; a demonstration of backdoor attacks on YOLO object detectors, text classifiers, generative models, and translation systems, which reveals a systemic adversarial threat surface; and SangrahaTox, a benchmark dataset for auditing multimodal models for stereotypes, bias, and toxicity. Together, these projects chart a path from detecting what AI produces, to exposing how it can be manipulated, to measuring the biases it silently encodes, offering both diagnostic tools and a broader framework for building trustworthy AI.

Please note that the presentation will be held in English.

Information on the event format

Presentation

Stations

Münchner Platz

  • 3 (tram)

Helmholtzstraße

  • 85 (bus)