SongYang Gao
Verified email at m.fudan.edu.cn
Title · Cited by · Year
Secrets of rlhf in large language models part i: Ppo
R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Q Liu, ...
arXiv preprint arXiv:2307.04964, 2023
Cited by 29 · 2023
Secrets of rlhf in large language models part i: Ppo
R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Q Liu, ...
arXiv preprint arXiv:2307.04964, 2023
Cited by 20 · 2023
Self-polish: Enhance reasoning in large language models via problem refinement
Z Xi, S Jin, Y Zhou, R Zheng, S Gao, T Gui, Q Zhang, X Huang
arXiv preprint arXiv:2305.14497, 2023
Cited by 14 · 2023
Secrets of rlhf in large language models part ii: Reward modeling
B Wang, R Zheng, L Chen, Y Liu, S Dou, C Huang, W Shen, S Jin, E Zhou, ...
arXiv preprint arXiv:2401.06080, 2024
Cited by 9 · 2024
Loramoe: Revolutionizing mixture of experts for maintaining world knowledge in language model alignment
S Dou, E Zhou, Y Liu, S Gao, J Zhao, W Shen, Y Zhou, Z Xi, X Wang, ...
arXiv preprint arXiv:2312.09979, 2023
Cited by 7 · 2023
Kernel-whitening: Overcome dataset bias with isotropic sentence embedding
S Gao, S Dou, Q Zhang, X Huang
arXiv preprint arXiv:2210.07547, 2022
Cited by 6 · 2022
Decorrelate irrelevant, purify relevant: Overcome textual spurious correlations from a feature perspective
S Dou, R Zheng, T Wu, S Gao, J Shan, Q Zhang, Y Wu, X Huang
arXiv preprint arXiv:2202.08048, 2022
Cited by 5 · 2022
Trace: A comprehensive benchmark for continual learning in large language models
X Wang, Y Zhang, T Chen, S Gao, S Jin, X Yang
arXiv preprint arXiv:2310.06762, 2023
Cited by 5 · 2023
Loramoe: Revolutionizing mixture of experts for maintaining world knowledge in language model alignment
S Dou, E Zhou, Y Liu, S Gao, J Zhao, W Shen, Y Zhou
arXiv preprint arXiv:2312.09979, 2023
Cited by 5 · 2023
Secrets of rlhf in large language models part ii: Reward modeling
B Wang, R Zheng, L Chen, Y Liu, S Dou, C Huang, W Shen, S Jin, E Zhou, ...
arXiv preprint arXiv:2401.06080, 2024
Cited by 4 · 2024
Farewell to aimless large-scale pretraining: Influential subset selection for language model
X Wang, W Zhou, Q Zhang, J Zhou, S Gao, J Wang, M Zhang, X Gao, ...
arXiv preprint arXiv:2305.12816, 2023
Cited by 4 · 2023
Tooleyes: Fine-grained evaluation for tool learning capabilities of large language models in real-world scenarios
J Ye, G Li, S Gao, C Huang, Y Wu, S Li, X Fan, S Dou, Q Zhang, T Gui, ...
arXiv preprint arXiv:2401.00741, 2024
Cited by 3 · 2024
Navigating the OverKill in Large Language Models
C Shi, X Wang, Q Ge, S Gao, X Yang, T Gui, Q Zhang, X Huang, X Zhao, ...
arXiv preprint arXiv:2401.17633, 2024
Cited by 2 · 2024
Delve into ppo: Implementation matters for stable rlhf
R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Y Zhou, ...
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023
Cited by 2 · 2023
On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection
S Gao, S Dou, Q Zhang, X Huang, J Ma, Y Shan
arXiv preprint arXiv:2306.15705, 2023
Cited by 2 · 2023
DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization
S Gao, S Dou, Y Liu, X Wang, Q Zhang, Z Wei, J Ma, Y Shan
arXiv preprint arXiv:2306.15164, 2023
Cited by 2 · 2023
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages
J Ye, S Li, G Li, C Huang, S Gao, Y Wu, Q Zhang, T Gui, X Huang
arXiv preprint arXiv:2402.10753, 2024
Cited by 1 · 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
S Gao, Q Ge, W Shen, S Dou, J Ye, X Wang, R Zheng, Y Zou, Z Chen, ...
arXiv preprint arXiv:2401.11458, 2024
Cited by 1 · 2024
Trace: A comprehensive benchmark for continual learning in large language models
X Wang, Y Zhang, T Chen, S Gao, S Jin, X Yang, Z Xi, R Zheng, Y Zou, ...
arXiv preprint arXiv:2310.06762, 2023
Cited by 1 · 2023
CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing
S Gao, S Dou, J Shan, Q Zhang, X Huang
arXiv preprint arXiv:2305.02865, 2023
Cited by 1 · 2023