Yong Zheng-Xin

Current: Computer Science Ph.D. @ Brown University
Past: Research Scientist Intern @ Meta AI, Research Collaborator @ Cohere Labs


I am a fifth-year PhD student at Brown University advised by Prof. Stephen Bach. I am fortunate to have interned/collaborated at Meta (GenAI, FAIR) and Cohere Labs, and my research is funded by Open Philanthropy (for technical AI safety research @ RFP 2025).

My current focus is on reasoning models. Some recent discoveries include:

  • After benign training, reasoning models can hide safety misbehavior in CoT (preprint), or even try to reason themselves out of safety guardrails (preprint). Our work offers interpretability analysis and some potential solutions.
  • Test-time scaling can improve zero-shot multilingual reasoning (preprint).

I also work on safety alignment. I discovered that low-resource languages can jailbreak GPT-4 (⭑Best Paper Award, NeurIPS 2023 SoLaR Workshop; featured on New Scientist), which became the seminal work for multilingual red-teaming. My follow-up work used mechanistic interpretability to study crosslingual generalization of detoxification alignment training (EMNLP 2024 Findings) and adversarial attacks (NAACL 2025 Findings). We also recently released a survey on multilingual AI safety (preprint).

In addition, I work on multilingual LLMs and low-resource NLP. I co-developed the Aya model (⭑Best Paper Award, ACL 2024) and worked on making speech models robust to accents (INTERSPEECH 2025). I also studied how pretrained LLMs can learn low-resource languages effectively through language adaptation (ACL 2023) and synthetic data (EMNLP 2024 Findings), as well as how they naturally mix languages (i.e., code-switching) (EMNLP 2023 CALCS, EMNLP 2023 Findings, ACL 2023 Findings).

Actively seeking full-time research roles in industry.


Featured work/preprints (see all)

  1. Yik Siu Chan*, Zheng-Xin Yong*, and Stephen H. Bach
    arXiv preprint, 2025
  2. Zheng-Xin Yong, Beyza Ermis, Marzieh Fadaee, and 2 more authors
    arXiv preprint, 2025
  3. Zheng-Xin Yong, M. Farid Adilazuarda, Jonibek Mansurov, and 7 more authors
    arXiv preprint, 2025
  4. Samuele Poppi, Zheng-Xin Yong, Yifei He, and 4 more authors
    NAACL Findings, 2025
  5. Xiaochen Li*, Zheng-Xin Yong*, and Stephen H Bach
    EMNLP Findings, 2024
  6. Ahmet Üstün*, Viraat Aryabumi*, Zheng-Xin Yong*, and 14 more authors
    ACL, 2024 (Best Paper Award)
  7. Zheng-Xin Yong, Cristina Menghini, and Stephen Bach
    NeurIPS Workshop: Socially Responsible Language Modelling Research (SoLaR), 2023 (Best Paper Award)

news

09 / 2025 Received a grant from Open Philanthropy (RFP 2025) for my technical AI safety research!
07 / 2025 Gave an invited talk at the Google Multilinguality Reading Group.
05 / 2025 Gave an invited talk at the MilaNLP lab.
05 / 2025 1 paper accepted! Work on understanding accent bias in ASR models was accepted to INTERSPEECH’25. Work was done during my Meta internship.
02 / 2025 1 paper accepted! Work on cross-lingual finetuning attacks was accepted to NAACL’25 Findings. Work was done during Meta internship.
09 / 2024 4 papers accepted! LexC-Gen, SEACrowd, and crosslingual alignment were accepted to EMNLP. CVQA was accepted to NeurIPS.
08 / 2024 Aya Model paper received the ⭑Best Paper Award at ACL 2024.
07 / 2024 Gave an invited talk at London Data Week.
06 / 2024 Started research scientist internship at Meta AI (FAIR)!
05 / 2024 1 paper accepted! A Safe Harbor for AI Evaluation and Red Teaming was accepted to ICML.
02 / 2024 Released Aya model and dataset papers! I also presented Aya multilingual safety research at Aya Grand Finale.
11 / 2023 Co-organized the tutorial on the current status of NLP in South East Asia at AACL 2023.
10 / 2023 “Low-Resource Languages Jailbreak GPT-4” received the ⭑Best Paper Award at the NeurIPS 2023 Socially Responsible Language Modeling (SoLaR) workshop.
09 / 2023 Joined Cohere For AI’s Responsible Deployment Team for Aya red-teaming.
08 / 2023 Served as an Area Chair for the Multilingualism & Linguistic Diversity track at EMNLP 2023.
05 / 2023 Media: Our code-switching paper was featured by Wired.
05 / 2023 3 papers accepted! BLOOM+1, BLOOMZ, and the code-switching survey were accepted to ACL 2023.
03 / 2022 2 papers accepted! T0 was accepted to ICLR (Spotlight). PromptSource was accepted to ACL Demo.
06 / 2021 Started PhD at Brown University.

Miscellaneous


✈️ I lived in six different countries (USA, UK, South Korea, Argentina, India, and Germany) for at least four months each while studying at Minerva University.

🕺 I love dancing. I used to teach a bit of Lindy Hop and salsa. I also learned the full K-pop choreo of “Let’s kill this love” for my undergrad graduation.

👃 I used to work on lung cancer diagnosis research (Google Science Fair 2016 Finalist; work published in the Journal of Thoracic Disease, 2016, and featured in IEEE Spectrum) before pivoting to AI.