Advancing Knowledge

Exploring the frontiers of AI and design to create technology that benefits humanity. Our research focuses on responsible AI development and human-centered design.

AI Safety · Data Augmentation · LLMs

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee*, Haebin Seong*, Dong Bok Lee, Minki Kang, Xiaoyin Chen, Dominik Wagner, Yoshua Bengio, Juho Lee, Sung Ju Hwang

An effective data augmentation strategy for distilling robust safety guard models for LLMs.

ICLR 2025
Safety Guardrails · Efficiency · Orchestration

SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models

Seanie Lee*, Dong Bok Lee*, Dominik Wagner, Minki Kang, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang

An adaptive framework for selecting safety guardrails to balance efficiency and accuracy in LLM deployment.

ACL 2025 Findings
Ethics · Vulnerabilities · Security

BiasJailbreak: Analyzing Ethical Biases and Jailbreak Vulnerabilities in Large Language Models

Isack Lee*, Haebin Seong*

A comprehensive analysis of how ethical biases in LLMs correlate with their susceptibility to jailbreak attacks.

arXiv

Collaborate With Us

Interested in contributing to our research? We welcome collaboration with researchers, practitioners, and organizations who share our vision for responsible technology.