Hi! I am a second-year PhD student in the Natural Language Processing group at the University of Hong Kong (HKUNLP). I am fortunate to be advised by Dr. Tao Yu (core), Dr. Lingpeng Kong and Prof. Ben Kao. My primary interests are Data Science and Natural Language Processing. Previously, I graduated from the Chinese University of Hong Kong with a degree in Computer Science in 2022.

If you are proficient in Python and would like to work with me, please email me at hjsu@cs.hku.hk

Publications


ARKS: Active Retrieval in Knowledge Soup for Code Generation
Hongjin Su, Shuyang Jiang, Yuhang Lai, Haoyuan Wu, Boao Shi, Che Liu, Qian Liu, Tao Yu
Preprint
[paper] [code] [data] [website]

Generative Representational Instruction Tuning
Niklas Muennighoff, Hongjin Su, Liang Wang, Nan Yang, Furu Wei, Tao Yu, Amanpreet Singh, Douwe Kiela
Preprint
[paper] [code] [models] [blog]

Lemur: Harmonizing Natural Language and Code for Language Agents
Yiheng Xu*, Hongjin Su*, Chen Xing*, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu
ICLR 2024 Spotlight (Top 5%)
[paper] [code] [model] [blog]

OpenAgents: An Open Platform for Language Agents in the Wild
Tianbao Xie*, Fan Zhou*, Zhoujun Cheng*, Peng Shi*, Luoxuan Weng*, Yitao Liu*, Toh Jing Hua, Junning Zhao, Qian Liu, Che Liu, Leo Z. Liu, Yiheng Xu, Hongjin Su, Dongchan Shin, Caiming Xiong, Tao Yu
Preprint
[paper] [code] [blog]

One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Hongjin Su*, Weijia Shi*, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu
ACL 2023 Findings
[paper] [model (over 2M downloads)] [website] [code (used by over 3.8k repos)]

Selective Annotation Makes Language Models Better Few-Shot Learners
Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu
ICLR 2023
[paper] [code]

Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation
Shizhe Diao, Ruijia Xu, Hongjin Su, Yilei Jiang, Yan Song, Tong Zhang
ACL 2021
[paper] [code]