Assaf Hallak
NVIDIA Research
Verified email at nvidia.com
Title
Cited by
Year
Contextual Markov decision processes
A Hallak, D Di Castro, S Mannor
arXiv preprint arXiv:1502.02259, 2015
Cited by 131, 2015
Consistent on-line off-policy evaluation
A Hallak, S Mannor
International Conference on Machine Learning, 1372-1383, 2017
Cited by 86, 2017
Lifetime value marketing using reinforcement learning
G Theocharous, A Hallak
RLDM 2013, 19, 2013
Cited by 51, 2013
Generalized emphatic temporal difference learning: Bias-variance analysis
A Hallak, A Tamar, R Munos, S Mannor
Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016
Cited by 47, 2016
Off-policy model-based learning under unknown factored dynamics
A Hallak, F Schnitzler, T Mann, S Mannor
International Conference on Machine Learning, 711-719, 2015
Cited by 31, 2015
Model selection in Markovian processes
A Hallak, D Di-Castro, S Mannor
Proceedings of the 19th ACM SIGKDD international conference on Knowledge …, 2013
Cited by 23, 2013
Cumulative success-based recommendations for repeat users
E Yom-Tov, A Hallak, N Koenigstein
US Patent App. 15/605,525, 2018
Cited by 11, 2018
On covariate shift of latent confounders in imitation and reinforcement learning
G Tennenholtz, A Hallak, G Dalal, S Mannor, G Chechik, U Shalit
arXiv preprint arXiv:2110.06539, 2021
Cited by 8, 2021
Improve agents without retraining: Parallel tree search with off-policy correction
G Dalal, A Hallak, S Dalton, S Mannor, G Chechik
Advances in Neural Information Processing Systems 34, 5518-5530, 2021
Cited by 5, 2021
System identification framework
G Theocharous, AJ Hallak
US Patent 10,558,987, 2020
Cited by 5, 2020
Automatic representation for lifetime value recommender systems
A Hallak, Y Mansour, E Yom-Tov
arXiv preprint arXiv:1702.07125, 2017
Cited by 4, 2017
Emphatic TD Bellman operator is a contraction
A Hallak, A Tamar, S Mannor
arXiv preprint arXiv:1508.03411, 2015
Cited by 4, 2015
Planning and learning with adaptive lookahead
A Rosenberg, A Hallak, S Mannor, G Chechik, G Dalal
arXiv preprint arXiv:2201.12403, 2022
Cited by 2, 2022
Testing a marketing strategy offline using an approximate simulator
A Hallak, G Theocharous
US Patent App. 14/080,038, 2015
Cited by 2, 2015
Reinforcement Learning with a Terminator
G Tennenholtz, N Merlis, L Shani, S Mannor, U Shalit, G Chechik, ...
arXiv preprint arXiv:2205.15376, 2022
Cited by 1, 2022
Off-policy evaluation for MDPs with unknown structure
A Hallak, F Schnitzler, T Mann, S Mannor
arXiv preprint arXiv:1502.03255, 2015
Cited by 1, 2015
SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search
G Dalal, A Hallak, G Thoppe, S Mannor, G Chechik
arXiv preprint arXiv:2301.13236, 2023
2023
Method for fast and better tree search for reinforcement learning
S Mannor, AJ Hallak, G Dalal, ST Dalton, I Frosio, G Chechik
US Patent App. 17/824,680, 2022
2022
SoftTreeMax: Policy Gradient with Tree Search
G Dalal, A Hallak, S Mannor, G Chechik
arXiv preprint arXiv:2209.13966, 2022
2022
How to sample if you must: on optimal functional sampling
A Hallak, S Mannor
arXiv preprint arXiv:1208.2417, 2012
2012