Philip Thomas

Cited by

	All	Since 2019
Citations	4565	3527
h-index	32	29
i10-index	55	48

Citations per year

Year	2011	2012	2013	2014	2015	2016	2017	2018	2019	2020	2021	2022	2023	2024
Citations	16	27	28	41	68	137	179	257	411	568	680	725	813	327

Public access (based on funding mandates)

27 articles available
0 articles not available

Co-authors

Emma Brunskill	Associate Professor of Computer Science, Stanford University	Verified email at cs.stanford.edu
Georgios Theocharous	Adobe Research	Verified email at adobe.com
Bruno Castro da Silva	University of Massachusetts	Verified email at cs.umass.edu
Scott M. Jordan	Postdoctoral Fellow, University of Alberta	Verified email at ualberta.ca
George Konidaris	Brown	Verified email at cs.brown.edu
Scott Niekum	Associate Professor, University of Massachusetts Amherst	Verified email at cs.umass.edu
Stephen Giguere	University of Massachusetts	Verified email at cs.umass.edu
Yuriy Brun	Manning College of Information and Computer Sciences, University of Massachusetts Amherst	Verified email at cs.umass.edu
Antonie J. (Ton) van den Bogert	Professor of Mechanical Engineering, Cleveland State University	Verified email at csuohio.edu
Chris Nota	University of Massachusetts, Amherst	Verified email at cs.umass.edu
Michael Branicky	Professor of Electrical Engineering & Computer Science, University of Kansas	Verified email at ku.edu
Sarah Osentoski	Vinci4d	Verified email at vinci4d.ai
Erik Learned-Miller	Professor of Computer Science, University of Massachusetts Amherst	Verified email at cs.umass.edu
Blossom Metevier	University of Massachusetts Amherst	Verified email at umass.edu
Sridhar Mahadevan	Director, Data Science Lab, Adobe Research & Professor, University of Massachusetts, Amherst	Verified email at cs.umass.edu
Will Dabney	DeepMind	Verified email at google.com
Francisco M. Garcia	University of Massachusetts - Amherst	Verified email at cs.umass.edu
Robert Kirsch	Professor and Chair of Biomedical Engineering, Case Western Reserve University	Verified email at case.edu
Arthur Guez	Google DeepMind	Verified email at google.com
Rémi Munos	DeepMind	Verified email at inria.fr

Philip Thomas

University of Massachusetts Amherst

Verified email at cs.umass.edu

Artificial Intelligence, Reinforcement Learning, AI Safety


Title	Authors	Venue	Cited by	Year
Data-efficient off-policy policy evaluation for reinforcement learning	P Thomas, E Brunskill	International Conference on Machine Learning, 2139-2148, 2016	741	2016
Value function approximation in reinforcement learning using the Fourier basis	G Konidaris, S Osentoski, P Thomas	Proceedings of the AAAI conference on artificial intelligence 25 (1), 380-385, 2011	566	2011
High-confidence off-policy evaluation	P Thomas, G Theocharous, M Ghavamzadeh	Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015	307	2015
High confidence policy improvement	P Thomas, G Theocharous, M Ghavamzadeh	International Conference on Machine Learning, 2380-2388, 2015	216	2015
Ad recommendation systems for life-time value optimization	G Theocharous, PS Thomas, M Ghavamzadeh	Proceedings of the 24th international conference on world wide web, 1305-1310, 2015	193	2015
Preventing undesirable behavior of intelligent machines	P Thomas, B Castro da Silva, A Barto, S Giguere, Y Brun, E Brunskill	Science 366 (6468), 999-1004, 2019	190	2019
Learning action representations for reinforcement learning	Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas	International conference on machine learning, 941-950, 2019	183	2019
Increasing the action gap: New operators for reinforcement learning	MG Bellemare, G Ostrovski, A Guez, P Thomas, R Munos	Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016	169	2016
Bias in natural actor-critic algorithms	P Thomas	International conference on machine learning, 441-448, 2014	158	2014
Safe reinforcement learning	PS Thomas		116	2015
Is the policy gradient a gradient?	C Nota, PS Thomas	arXiv preprint arXiv:1906.07073, 2019	68	2019
Training an actor-critic reinforcement learning controller for arm movement using human-generated rewards	KM Jagodnik, PS Thomas, AJ van den Bogert, MS Branicky, RF Kirsch	IEEE Transactions on Neural Systems and Rehabilitation Engineering 25 (10 …, 2017	67	2017
Optimizing for the future in non-stationary mdps	Y Chandak, G Theocharous, S Shankar, M White, S Mahadevan, ...	International Conference on Machine Learning, 1414-1425, 2020	66	2020
Proximal reinforcement learning: A new theory of sequential decision making in primal-dual spaces	S Mahadevan, B Liu, P Thomas, W Dabney, S Giguere, N Jacek, I Gemp, ...	arXiv preprint arXiv:1405.6757, 2014	66	2014
Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing	P Thomas, G Theocharous, M Ghavamzadeh, I Durugkar, E Brunskill	Proceedings of the AAAI Conference on Artificial Intelligence 31 (2), 4740-4745, 2017	63	2017
Policy gradient methods for reinforcement learning with function approximation and action-dependent baselines	PS Thomas, E Brunskill	arXiv preprint arXiv:1706.06643, 2017	62	2017
Evaluating the performance of reinforcement learning algorithms	S Jordan, Y Chandak, D Cohen, M Zhang, P Thomas	International Conference on Machine Learning, 4962-4973, 2020	61	2020
Risk Quantification for Policy Deployment	PS Thomas, G Theocharous, M Ghavamzadeh	US Patent App. 14/552,047, 2016	55	2016
Importance Sampling for Fair Policy Selection	S Doroudi, PS Thomas, E Brunskill	Grantee Submission, 2017	54	2017
Offline contextual bandits with high probability fairness guarantees	B Metevier, S Giguere, S Brockman, A Kobren, Y Brun, E Brunskill, ...	Advances in neural information processing systems 32, 2019	53	2019
