Yong, Zheng-Xin

contact dot yong at brown dot edu

I am a first-year PhD student in the Department of Computer Science at Brown University, advised by Prof. Stephen Bach. My current research interests are weak supervision, structured prediction, and multilingual NLP.

Prior to my PhD, I completed my BSc at Minerva University, where I had the opportunity to study and live in multiple countries, including Argentina, India, Germany, South Korea, the U.K., and the USA.

Google Scholar  /  Twitter  /  LinkedIn  /  Github  /  Blog

News
Projects

BigScience: The Summer of Language Models
The BigScience effort is an exciting international collaboration aimed at training a very large, open language model for the research community. I'm participating in the modeling working group, specifically in the prompt engineering and multilinguality subgroups.

Publications
2022

Adapting BigScience Multilingual Model to Unseen Languages
Zheng-Xin Yong* and Vassilina Nikoulina*.
(* denotes equal contribution)
ACL 2022 Workshop "Challenges & Perspectives in Creating Large Language Models" (Non-archival), 2022.
PDF Project

@article{yong:arxiv2022,
 Author = {Zheng-Xin Yong and Vassilina Nikoulina},
 Title = {Adapting BigScience Multilingual Model to Unseen Languages},
 Volume = {arXiv:2204.04873 [cs.LG]},
 Year = {2022}
}

Frame Shift Prediction
Zheng-Xin Yong, Patrick D Watson, Tiago Timponi Torrent, Oliver Czulo, and Collin F Baker.
Language Resources and Evaluation Conference (LREC), 2022.
PDF Project

@inproceedings{yong:lrec2022,
 Author = {Zheng-Xin Yong and Patrick D Watson and Tiago Timponi Torrent and Oliver Czulo and Collin F Baker},
 Title = {Frame Shift Prediction},
 Booktitle = {Proceedings of the 13th Language Resources and Evaluation Conference},
 Year = {2022}
}

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Stephen H. Bach*, Victor Sanh*, Zheng-Xin Yong, Albert Webson, Colin Raffel, Nihal V. Nayak, Abheesht Sharma, Taewoon Kim, M Saiful Bari, Thibault Fevry, Zaid Alyafeai, Manan Dey, Andrea Santilli, Zhiqing Sun, Srulik Ben-David, Canwen Xu, Gunjan Chhablani, Han Wang, Jason Alan Fries, Maged S. Al-shaibani, Shanya Sharma, Urmish Thakker, Khalid Almubarak, Xiangru Tang, Mike Tian-Jian Jiang, and Alexander M. Rush.
Association for Computational Linguistics (ACL) Demo track, 2022.
PDF Data / Tool

@article{bach:acldemo22,
 Author = {Stephen H. Bach and Victor Sanh and Zheng-Xin Yong and Albert Webson and Colin Raffel and Nihal V. Nayak and Abheesht Sharma and Taewoon Kim and M Saiful Bari and Thibault Fevry and Zaid Alyafeai and Manan Dey and Andrea Santilli and Zhiqing Sun and Srulik Ben-David and Canwen Xu and Gunjan Chhablani and Han Wang and Jason Alan Fries and Maged S. Al-shaibani and Shanya Sharma and Urmish Thakker and Khalid Almubarak and Xiangru Tang and Mike Tian-Jian Jiang and Alexander M. Rush},
 Title = {PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts},
 Volume = {arXiv:2202.01279 [cs.LG]},
 Year = {2022}
}

Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh*, Albert Webson*, Colin Raffel*, Stephen H. Bach*, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng-Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Stella Biderman, Leo Gao, Tali Bers, Thomas Wolf, and Alexander M. Rush
International Conference on Learning Representations (ICLR), 2022.
PDF Blog Data Language Model

@inproceedings{sanh:iclr22,
 Author = {Victor Sanh and Albert Webson and Colin Raffel and Stephen H. Bach and Lintang Sutawika and Zaid Alyafeai and Antoine Chaffin and Arnaud Stiegler and Teven Le Scao and Arun Raja and Manan Dey and M Saiful Bari and Canwen Xu and Urmish Thakker and Shanya Sharma Sharma and Eliza Szczechla and Taewoon Kim and Gunjan Chhablani and Nihal Nayak and Debajyoti Datta and Jonathan Chang and Mike Tian-Jian Jiang and Han Wang and Matteo Manica and Sheng Shen and Zheng-Xin Yong and Harshit Pandey and Rachel Bawden and Thomas Wang and Trishala Neeraj and Jos Rozen and Abheesht Sharma and Andrea Santilli and Thibault Fevry and Jason Alan Fries and Ryan Teehan and Stella Biderman and Leo Gao and Tali Bers and Thomas Wolf and Alexander M. Rush},
 Title = {Multitask Prompted Training Enables Zero-Shot Task Generalization},
 Booktitle = {International Conference on Learning Representations (ICLR)},
 Year = {2022}
}

2020

Semi-supervised Deep Embedded Clustering with Anomaly Detection for Semantic Frame Induction
Zheng-Xin Yong and Tiago Timponi Torrent.
Language Resources and Evaluation Conference (LREC), 2020.
PDF Project

@inproceedings{yong:lrec2020,
 Author = {Zheng-Xin Yong and Tiago Timponi Torrent},
 Title = {Semi-supervised Deep Embedded Clustering with Anomaly Detection for Semantic Frame Induction},
 Booktitle = {Proceedings of the 12th Language Resources and Evaluation Conference},
 Year = {2020}
}

This website is inspired by Jon Barron's website.