Reinforcement Based U-Tree: A Novel Approach for Solving POMDP

Lei Zheng, Siu Yeung Cho, Chai Quek

Research output: Chapter in Book/Conference proceeding › Book Chapter › peer-review

1 Citation (Scopus)


Partially observable Markov decision processes (POMDPs) provide a mathematical framework for agent planning in stochastic and partially observable environments. The classic Bayesian optimal solution can be obtained by transforming the problem into a Markov decision process using belief states. However, because the belief state space is continuous, the problem is highly intractable. Many practical heuristic-based methods have been proposed, but most of them require complete prior knowledge of the environment. This article presents a memory-based reinforcement learning algorithm, namely Reinforcement based U-Tree, which is able not only to learn the state transitions from experience, but also to build the state model by itself from raw sensor inputs. The article describes an enhancement of the original U-Tree's state generation process that makes the generated model more compact, and demonstrates its performance on a car-driving task with 31,224 world states. The article also presents a modification to the statistical test for reward estimation, which allows the algorithm to be benchmarked against some model-based algorithms on a set of well-known POMDP problems.
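
For readers unfamiliar with the belief-state transformation mentioned in the abstract, the following is a minimal sketch of the standard Bayesian belief update, b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s). It is not the chapter's U-Tree algorithm. The function name `belief_update`, the table layouts `T` and `O`, and the two-state numbers are all assumptions made for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Standard Bayes-filter belief update for a discrete POMDP.

    Assumed (hypothetical) table layouts:
      T[a][s][s2] = P(s2 | s, a)   transition model
      O[a][s2][o] = P(o | s2, a)   observation model
    """
    n = len(belief)
    unnormalized = []
    for s2 in range(n):
        # Prediction step: probability of landing in s2 before observing.
        predicted = sum(T[action][s][s2] * belief[s] for s in range(n))
        # Correction step: weight by the likelihood of the observation.
        unnormalized.append(O[action][s2][observation] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# Toy two-state, one-action model with made-up probabilities.
T = {0: [[0.9, 0.1], [0.2, 0.8]]}
O = {0: [[0.7, 0.3], [0.4, 0.6]]}
b = belief_update([0.5, 0.5], action=0, observation=0, T=T, O=O)
```

The result `b` is again a probability vector over states. Because every such vector is a possible "state" of the transformed problem, the belief space is continuous even when the underlying state set is finite, which is the source of the intractability the abstract refers to.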

Original language: English
Title of host publication: Handbook on Decision Making
Subtitle of host publication: Vol 1: Techniques and Applications
Editors: Lakhmi Jain, Chee Peng Lim
Number of pages: 28
Publication status: Published - 2010
Externally published: Yes

Publication series

Name: Intelligent Systems Reference Library
ISSN (Print): 1868-4394
ISSN (Electronic): 1868-4408


Keywords

  • Dynamic programming
  • Markov decision processes
  • memory-based reinforcement learning
  • partially observable Markov decision processes
  • reinforcement learning

ASJC Scopus subject areas

  • General Computer Science
  • Information Systems and Management
  • Library and Information Sciences


