Machine learning for synthetic data generation: a review Y Lu, M Shen, H Wang, X Wang, C van Rechem, T Fu, W Wei arXiv preprint arXiv:2302.04062, 2023 | 144 | 2023 |
Factorization bandits for interactive recommendation H Wang, Q Wu, H Wang Proceedings of the AAAI Conference on Artificial Intelligence 31 (1), 2017 | 129 | 2017 |
Contextual bandits in a collaborative environment Q Wu, H Wang, Q Gu, H Wang Proceedings of the 39th International ACM SIGIR conference on Research and …, 2016 | 128 | 2016 |
Learning hidden features for contextual bandits H Wang, Q Wu, H Wang Proceedings of the 25th ACM International on Conference on Information and …, 2016 | 93 | 2016 |
Adversarial domain adaptation for machine reading comprehension H Wang, Z Gan, X Liu, J Liu, J Gao, H Wang arXiv preprint arXiv:1908.09209, 2019 | 77 | 2019 |
Unbiased learning to rank: online or offline? Q Ai, T Yang, H Wang, J Mao ACM Transactions on Information Systems (TOIS) 39 (2), 1-29, 2021 | 66 | 2021 |
Factorization bandits for online influence maximization Q Wu, Z Li, H Wang, W Chen, H Wang Proceedings of the 25th ACM SIGKDD International Conference on Knowledge …, 2019 | 45 | 2019 |
Variance reduction in gradient exploration for online learning to rank H Wang, S Kim, E McCord-Snook, Q Wu, H Wang Proceedings of the 42nd International ACM SIGIR Conference on Research and …, 2019 | 40 | 2019 |
Solving verbal comprehension questions in IQ test by knowledge-powered word embedding H Wang, F Tian, B Gao, J Bian, TY Liu arXiv preprint arXiv:1505.07909, 2015 | 40* | 2015 |
Global and local differential privacy for collaborative bandits H Wang, Q Zhao, Q Wu, S Chopra, A Khaitan, H Wang Proceedings of the 14th ACM Conference on Recommender Systems, 150-159, 2020 | 38 | 2020 |
Efficient exploration of gradient space for online learning to rank H Wang, R Langley, S Kim, E McCord-Snook, H Wang The 41st International ACM SIGIR Conference on Research & Development in …, 2018 | 38 | 2018 |
Dynamic ensemble of contextual bandits to satisfy users' changing interests Q Wu, H Wang, Y Li, H Wang The World Wide Web Conference, 2080-2090, 2019 | 32 | 2019 |
AutoDefense: Multi-agent LLM defense against jailbreak attacks Y Zeng, Y Wu, X Zhang, H Wang, Q Wu arXiv preprint arXiv:2403.04783, 2024 | 31 | 2024 |
Embodied LLM agents learn to cooperate in organized teams X Guo, K Huang, J Liu, W Fan, N Vélez, Q Wu, H Wang, TL Griffiths, ... arXiv preprint arXiv:2403.12482, 2024 | 22 | 2024 |
PARL: A unified framework for policy alignment in reinforcement learning from human feedback S Chakraborty, A Bedi, A Koppel, H Wang, D Manocha, M Wang, F Huang The Twelfth International Conference on Learning Representations, 2024 | 22* | 2024 |
PairRank: Online pairwise learning to rank by divide-and-conquer Y Jia, H Wang, S Guo, H Wang Proceedings of the Web Conference 2021, 146-157, 2021 | 20 | 2021 |
Communication efficient distributed learning for kernelized contextual bandits C Li, H Wang, M Wang, H Wang Advances in Neural Information Processing Systems 35, 19773-19785, 2022 | 17 | 2022 |
Incentivized exploration for multi-armed bandits under reward drift Z Liu, H Wang, F Shen, K Liu, L Chen Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 4981-4988, 2020 | 12 | 2020 |
When are linear stochastic bandits attackable? H Wang, H Xu, H Wang International Conference on Machine Learning, 23254-23273, 2022 | 11 | 2022 |
Provable benefits of policy learning from human preferences in contextual bandit problems X Ji, H Wang, M Chen, T Zhao, M Wang arXiv preprint arXiv:2307.12975, 2023 | 9 | 2023 |