Risk and parameter convergence of logistic regression Z Ji, M Telgarsky arXiv preprint arXiv:1803.07300, 2018 | 353* | 2018 |
Gradient descent aligns the layers of deep linear networks Z Ji, M Telgarsky arXiv preprint arXiv:1810.02032, 2018 | 268 | 2018 |
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow relu networks Z Ji, M Telgarsky arXiv preprint arXiv:1909.12292, 2019 | 213 | 2019 |
Directional convergence and alignment in deep learning Z Ji, M Telgarsky Advances in Neural Information Processing Systems 33, 17176-17186, 2020 | 194 | 2020 |
Characterizing the implicit bias via a primal-dual analysis Z Ji, M Telgarsky Algorithmic Learning Theory, 772-804, 2021 | 78* | 2021 |
Gradient descent follows the regularization path for general losses Z Ji, M Dudík, RE Schapire, M Telgarsky Conference on Learning Theory, 2109-2136, 2020 | 70 | 2020 |
Neural tangent kernels, transportation mappings, and universal approximation Z Ji, M Telgarsky, R Xian arXiv preprint arXiv:1910.06956, 2019 | 57 | 2019 |
Think before you speak: Training language models with pause tokens S Goyal, Z Ji, AS Rawat, AK Menon, S Kumar, V Nagarajan arXiv preprint arXiv:2310.02226, 2023 | 52 | 2023 |
Early-stopped neural networks are consistent Z Ji, J Li, M Telgarsky Advances in Neural Information Processing Systems 34, 1805-1817, 2021 | 45 | 2021 |
Fast margin maximization via dual acceleration Z Ji, N Srebro, M Telgarsky International Conference on Machine Learning, 4860-4869, 2021 | 40 | 2021 |
Generalization bounds via distillation D Hsu, Z Ji, M Telgarsky, L Wang arXiv preprint arXiv:2104.05641, 2021 | 38 | 2021 |
Actor-critic is implicitly biased towards high entropy optimal policies Y Hu, Z Ji, M Telgarsky arXiv preprint arXiv:2110.11280, 2021 | 26 | 2021 |
Reproducibility in optimization: Theoretical framework and limits K Ahn, P Jain, Z Ji, S Kale, P Netrapalli, GI Shamir Advances in Neural Information Processing Systems 35, 18022-18033, 2022 | 23 | 2022 |
Social welfare and profit maximization from revealed preferences Z Ji, R Mehta, M Telgarsky International Conference on Web and Internet Economics, 264-281, 2018 | 8 | 2018 |
Approximation power of random neural networks B Bailey, Z Ji, M Telgarsky, R Xian arXiv preprint arXiv:1906.07709, 2019 | 7 | 2019 |
Wikidata Vandalism Detection - The Loganberry Vandalism Detector at WSDM Cup 2017 Q Zhu, H Ng, L Liu, Z Ji, B Jiang, J Shen, H Gui arXiv preprint arXiv:1712.06922, 2017 | 7 | 2017 |
Depth Dependence of μP Learning Rates in ReLU MLPs S Jelassi, B Hanin, Z Ji, SJ Reddi, S Bhojanapalli, S Kumar arXiv preprint arXiv:2305.07810, 2023 | 6 | 2023 |
Agnostic learnability of halfspaces via logistic loss Z Ji, K Ahn, P Awasthi, S Kale, S Karp International Conference on Machine Learning, 10068-10103, 2022 | 6 | 2022 |
Convex analysis at infinity: An introduction to astral space M Dudík, RE Schapire, M Telgarsky arXiv preprint arXiv:2205.03260, 2022 | 5 | 2022 |
Efficient Document Ranking with Learnable Late Interactions Z Ji, H Jain, A Veit, SJ Reddi, S Jayasumana, AS Rawat, AK Menon, F Yu, ... arXiv preprint arXiv:2406.17968, 2024 | 1 | 2024 |