Yong Zheng-Xin

Computer Science Ph.D. student @ Brown University
Research Scientist Intern @ Meta AI (FAIR), Collaborator @ Cohere For AI

I am an incoming fourth-year Ph.D. student in Computer Science at Brown University, advised by Prof. Stephen Bach. I’m fortunate to have collaborated with amazing researchers at Cohere For AI and at Meta AI (FAIR and GenAI teams). I am currently interning at Meta AI and will return to Brown in Spring 2025.

Lately, I have focused on making multilingual LLMs safe for all users, especially after I discovered that low-resource languages can jailbreak GPT-4 (⭑Best Paper Award, NeurIPS 2023 Socially Responsible Language Modeling Workshop). This work was highlighted in the International Scientific Report on the Safety of Advanced AI 2024, and it catalyzed a paradigm shift in industry AI labs toward multilingual red-teaming.

My other notable contributions to AI safety include:

  • 🔍 I study why safety alignment works or fails in multilingual contexts (often with mechanistic interpretability): for instance, why toxicity reduction can generalize across languages (Findings of EMNLP 2024) and why monolingual finetuning attacks can undo multilingual safety guardrails (preprint).
  • 🔓 I have done safety red-teaming research in frontier AI labs.
    • Meta AI (GenAI and FAIR): I worked on understanding finetuning attacks on multilingual LLMs such as Llama-3.1 and Qwen-2 (preprint). I also worked on red-teaming Massively Multilingual Speech models.
    • Cohere For AI: I worked on red-teaming Aya-101 (⭑Best Paper Award, ACL 2024).
  • 🌐 I think about how the global open science community can contribute to AI safety research. For instance, I joined the advocacy for A Safe Harbor for AI Evaluation and Red Teaming (ICML 2024) (with an 📄 open letter signed by 300+ researchers and covered by The Washington Post and VentureBeat) to ensure legal and technical protections for AI red-teaming by independent researchers.

I also work on making LLMs helpful for all users by enabling foundation models to overcome language barriers and support underrepresented languages. I’ve worked on adapting LLMs to low-resource languages (ACL 2023) and generating synthetic data for extremely low-resource languages (Findings of EMNLP 2024). I have also worked on massively multilingual LLMs and speech technology with the following labs/groups:

  • Meta FAIR: I worked on mitigating accent bias for the Massively Multilingual Speech model.
  • Cohere For AI: I served as a Malay language co-ambassador, coordinating data collection efforts for the Malay language in the Aya dataset.
  • BigScience: I led the language-modeling group to adapt BLOOM to unseen languages. I also contributed to T0 (one of the earliest instruction-following LLMs), BLOOM (at the time, the world’s largest open multilingual LLM), and mT0/BLOOMZ (the world’s first instruction-following multilingual LLMs).

As a Malaysian, I also contributed to NLP for Southeast Asian (SEA) languages. I’ve hosted *ACL tutorials, helped curate the SEACrowd data hub (EMNLP 2024), and studied how well LLMs can handle SEA linguistic phenomena, such as code-switching (EMNLP 2023 CALCS Workshop), and understand cultures in the SEA region (NeurIPS 2024).

Other Misc Stuff:

  • If you want to chat or collaborate on any of the research directions above (or just talk about graduate school), feel free to email me at contact [dot] yong @ brown [dot] edu.
  • My favorite hobby is dancing, especially salsa and bachata. I also dance a bit of Lindy Hop, Argentine Tango, and K-pop.
    I usually check out the local dance scene when I travel to conferences; if you also enjoy dancing, hit me up and we can check it out together.
  • I went to Minerva University during undergrad, so I had the opportunity to travel and live in six different cities around the world: 🇺🇸 San Francisco, 🇰🇷 Seoul, 🇮🇳 Hyderabad, 🇩🇪 Berlin, 🇦🇷 Buenos Aires, and 🇬🇧 London.

selected publications (see all)

  1. Preference Tuning For Toxicity Mitigation Generalizes Across Languages
    Xiaochen Li*, Zheng-Xin Yong*, and Stephen H. Bach
    EMNLP Findings, 2024
  2. Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
    Ahmet Üstün*, Viraat Aryabumi*, Zheng-Xin Yong*, and 14 more authors
    ACL, 2024 (Best Paper Award)
  3. Low-Resource Languages Jailbreak GPT-4
    Zheng-Xin Yong, Cristina Menghini, and Stephen Bach
    NeurIPS Workshop: Socially Responsible Language Modelling Research (SoLaR), 2023 (Best Paper Award)

news

09 / 2024 LexC-Gen and mechanistic explanations of why removing toxicity generalizes across languages were accepted to Findings of EMNLP 2024. SEACrowd was also accepted to EMNLP 2024. CVQA was accepted to NeurIPS 2024 Datasets & Benchmarks.
08 / 2024 Aya Model paper received the ⭑Best Paper Award at ACL 2024.
07 / 2024 Gave a talk about multilingual AI safety at London Data Week (organized by The Alan Turing Institute and supported by the Mayor of London).
06 / 2024 Meta AI: Started my research scientist internship at Meta AI (FAIR), working on Massively Multilingual Speech (MMS) models. Also collaborated with the GenAI Trust Team on a multilingual safety project.
02 / 2024 The Aya model and dataset papers are released! I presented Aya multilingual safety research at the Aya Grand Finale.
11 / 2023 Co-organized the tutorial Current Status of NLP in South East Asia at AACL 2023.
10 / 2023 “Low-Resource Languages Jailbreak GPT-4” received the ⭑Best Paper Award at the NeurIPS 2023 Socially Responsible Language Modeling (SoLaR) workshop.
09 / 2023 Cohere For AI: Joined the Responsible Deployment Team for Aya red-teaming.
05 / 2023 Interviewed by Wired about our code-switching paper and grassroots research initiative for Southeast Asian (SEA) languages.
03 / 2022 T0 is accepted to ICLR 2022 (Spotlight) and its blog post is out! PromptSource is also accepted to ACL 2022 Demo track.
06 / 2021 Started my Ph.D. at Brown University.