1. Abbeel, P., & Ng, A. (2004). Apprenticeship learning via inverse reinforcement learning. International Conference on Machine Learning (ICML), Banff, Alberta, pp. 1-8.
4. Amato, C., Bernstein, D. S., & Zilberstein, S. (2010). Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs. Autonomous Agents and Multi-Agent Systems, 21(3), 293-320. doi:10.1007/s10458-009-9103-z
5. Bellman, R. (1957). A Markovian decision process. Indiana University Mathematics Journal, 6(4), 679-684. doi:10.1512/iumj.1957.6.06038
6. Bengio, Y., & Frasconi, P. (1996). Input/output HMMs for sequence processing. IEEE Transactions on Neural Networks, 7(5), 1231-1249. doi:10.1109/72.536317
7. Bernstein, D. S., Amato, C., Hansen, E. A., & Zilberstein, S. (2009). Policy iteration for decentralized control of Markov decision processes. Journal of Artificial Intelligence Research, 34, 89-132.
8. Braziunas, D., & Boutilier, C. (2004). Stochastic local search for POMDP controllers. National Conference on Artificial Intelligence (AAAI), San Jose, California, pp. 690-696.
9. Choi, J., & Kim, K.-E. (2011). Inverse reinforcement learning in partially observable domains. Journal of Machine Learning Research, 12, 691-730.
10. Delgado, K. V., Sanner, S., & Nunes de Barros, L. (2011). Efficient solutions to factored MDPs with imprecise transition probabilities. Artificial Intelligence, 175(9-10), 1498-1527. doi:10.1016/j.artint.2011.01.001
11. Dempster, A., Laird, N., & Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39(1), 1-38.
13. Elizalde, F., Sucar, L. E., Luque, M., Diez, F. J., & Reyes, A. (2008). Policy explanation in factored Markov decision processes. European Workshop on Probabilistic Graphical Models, pp. 97-104.
14. Fard, M. M., & Pineau, J. (2011). Non-deterministic policies in Markovian decision processes. Journal of Artificial Intelligence Research, 40, 1-24.
15. Feng, Z., Dearden, R., Meuleau, N., & Washington, R. (2004). Dynamic programming for structured continuous Markov decision problems. International Conference on Uncertainty in Artificial Intelligence (UAI), pp. 154-161.
17. Guestrin, C., Koller, D., Parr, R., & Venkataraman, S. (2003). Efficient solution algorithms for factored MDPs. Journal of Artificial Intelligence Research, 19, 399-468.
19. Hansen, E. A. (1997). An improved policy iteration algorithm for partially observable MDPs. Advances in Neural Information Processing Systems (NIPS), Denver, Colorado, pp. 1015-1021.
23. Hoey, J., St-Aubin, R., Hu, A., & Boutilier, C. (1999). SPUDD: Stochastic planning using decision diagrams. International Conference on Uncertainty in Artificial Intelligence (UAI), pp. 279-288.
25. Kaelbling, L. P., Littman, M., & Cassandra, A. R. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101, 99-134. doi:10.1016/S0004-3702(98)00023-X
26. Kearns, M., Mansour, Y., & Ng, A. Y. (1999). A sparse sampling algorithm for near-optimal planning in large Markov decision processes. International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, pp. 1324-1331.
28. Kim, D., Lee, J., Kim, K.-E., & Poupart, P. (2011). Point-based value iteration for constrained POMDPs. International Joint Conference on Artificial Intelligence (IJCAI), pp. 1968-1974.
30. Kveton, B., Hauskrecht, M., & Guestrin, C. (2006). Solving factored MDPs with hybrid state and action variables. Journal of Artificial Intelligence Research, 27, 153-201.
31. Lusena, C., Goldsmith, J., & Mundhenk, M. (2001). Nonapproximability results for partially observable Markov decision processes. Journal of Artificial Intelligence Research, 14, 83-103.
32. Meuleau, N., Benazera, E., Brafman, R. I., Hansen, E. A., & Mausam (2009). A heuristic search approach to planning with continuous resources in stochastic domains. Journal of Artificial Intelligence Research, 34, 27-59.
33. Meuleau, N., Kim, K.-E., Kaelbling, L. P., & Cassandra, A. R. (1999). Solving POMDPs by searching the space of finite policies. International Conference on Uncertainty in Artificial Intelligence (UAI), Stockholm, Sweden, pp. 417-426.
34. Neu, G., & Szepesvari, C. (2007). Apprenticeship learning using inverse reinforcement learning and gradient methods. International Conference on Uncertainty in Artificial Intelligence (UAI), Vancouver, Canada, pp. 295-302.
35. Ng, A., & Russell, S. (2000). Algorithms for inverse reinforcement learning. International Conference on Machine Learning (ICML), Stanford, California, pp. 663-670.
37. Papadimitriou, C. H., & Tsitsiklis, J. N. (1987). The complexity of Markov decision processes. Mathematics of Operations Research, 12(3), 441-450. doi:10.1287/moor.12.3.441
39. Pineau, J., Gordon, G., & Thrun, S. (2006). Anytime point-based approximations for large POMDPs. Journal of Artificial Intelligence Research, 27, 335-380.
40. Pineau, J., Gordon, G. J., & Thrun, S. (2003). Policy-contingent abstraction for robust robot control. International Conference on Uncertainty in Artificial Intelligence (UAI), pp. 477-484.
41. Piunovskiy, A. B., & Mao, X. (2000). Constrained Markovian decision processes: The dynamic programming approach. Operations Research Letters, 27(3), 119-126. doi:10.1016/S0167-6377(00)00039-0
42. Porta, J. M., Vlassis, N. A., Spaan, M. T. J., & Poupart, P. (2006). Point-based value iteration for continuous POMDPs. Journal of Machine Learning Research, 7, 2329-2367.
44. Poupart, P., Kim, K.-E., & Kim, D. (2011). Closing the gap: Improved bounds on optimal POMDP solutions. International Conference on Automated Planning and Scheduling (ICAPS), Freiburg, Germany.
45. Poupart, P., Lang, T., & Toussaint, M. (2011). Analyzing and escaping local optima in planning as inference for partially observable domains. European Conference on Machine Learning (ECML), Athens, Greece.
48. Regan, K., & Boutilier, C. (2011a). Eliciting additive reward functions for Markov decision processes. International Joint Conference on Artificial Intelligence (IJCAI), Barcelona, Spain, pp. 2159-2164.
49. Regan, K., & Boutilier, C. (2011b). Robust online optimization of reward-uncertain MDPs. International Joint Conference on Artificial Intelligence (IJCAI), Barcelona, Spain, pp. 2165-2171.
50. Ross, S., Pineau, J., Paquet, S., & Chaib-Draa, B. (2008). Online planning algorithms for POMDPs. Journal of Artificial Intelligence Research, 32, 663-704.
51. Sanner, S., Delgado, K. V., & de Barros, L. N. (2011). Symbolic dynamic programming for discrete and continuous state MDPs. International Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain.
53. Satia, J. K., & Lave, R. E., Jr. (1973). Markovian decision processes with uncertain transition probabilities. Operations Research, 21(3), 728-740. doi:10.1287/opre.21.3.728
56. Shani, G., Poupart, P., Brafman, R. I., & Shimony, S. E. (2008). Efficient ADD operations for point-based algorithms. International Conference on Automated Planning and Scheduling (ICAPS), pp. 330-337.
58. Sim, H. S., Kim, K.-E., Kim, J. H., Chang, D.-S., & Koo, M.-Y. (2008). Symbolic heuristic search value iteration for factored POMDPs. National Conference on Artificial Intelligence (AAAI), pp. 1088-1093.
59. Smallwood, R. D., & Sondik, E. J. (1973). The optimal control of partially observable Markov processes over a finite horizon. Operations Research, 21, 1071-1088. doi:10.1287/opre.21.5.1071
62. Sutton, R., Precup, D., & Singh, S. P. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1-2), 181-211. doi:10.1016/S0004-3702(99)00052-1
64. Toussaint, M., Harmeling, S., & Storkey, A. (2006). Probabilistic inference for solving (PO)MDPs. Technical Report EDI-INF-RR-0934, School of Informatics, University of Edinburgh.
65. White, C. C., III, & El-Deib, H. K. (1994). Markov decision processes with imprecise transition probabilities. Operations Research, 42(4), 739-749. doi:10.1287/opre.42.4.739
66. Zhang, N. L., & Liu, W. (1996). Planning in stochastic domains: Problem characteristics and approximation. Technical Report HKUST-CS96-31, Hong Kong University of Science and Technology.