A unifying framework for computational reinforcement learning theory

Li, Lihong

doi:10.7282/T3S46S46

RUcore: Rutgers University Community Repository

Simple citation

Li, Lihong. A unifying framework for computational reinforcement learning theory. Retrieved from https://doi.org/doi:10.7282/T3S46S46

Description

Title: A unifying framework for computational reinforcement learning theory

Name: Li, Lihong (author); Littman, Michael (chair); Pazzani, Michael (internal member); Szegedy, Mario (internal member); Schapire, Robert (outside member); Rutgers University; Graduate School - New Brunswick

Other Date: 2009-10 (degree)

Subject: Computer Science, Computational learning theory, Reinforcement learning, Machine learning

Extent: xvii, 264 p. : ill.

Description: Computational learning theory studies mathematical models that allow one to formally analyze and compare the performance of supervised-learning algorithms, for example in terms of their sample complexity. While existing models such as PAC (Probably Approximately Correct) have played an influential role in understanding the nature of supervised learning, they have not been as successful in reinforcement learning (RL). Here, the fundamental barrier is the need for active exploration in sequential decision problems.
An RL agent tries to maximize long-term utility by exploiting its knowledge about the problem, but the agent has to acquire this knowledge itself by exploring the problem, which may reduce short-term utility. The need for active exploration is common in many problems in daily life, engineering, and the sciences. For example, a Backgammon program strives to make good moves to maximize the probability of winning a game, but sometimes it may try novel and possibly harmful moves to discover how the opponent reacts, in the hope of finding a better game-playing strategy. It has been known since the early days of RL that a good tradeoff between exploration and exploitation is critical for the agent to learn fast (i.e., to reach near-optimal strategies with a small sample complexity), but a general theoretical analysis of this tradeoff remained open until recently.
In this dissertation, we introduce a novel computational learning model called KWIK (Knows What It Knows), designed specifically for analyzing learning problems like RL, where active exploration can affect the training data the learner is exposed to. My thesis is that the KWIK learning model provides a flexible, modularized, and unifying way of creating and analyzing reinforcement-learning algorithms with provably efficient exploration. In particular, we show how the KWIK perspective can be used to unify the analysis of existing RL algorithms with polynomial sample complexity. It also facilitates the development of new algorithms with smaller sample complexity, which have demonstrated faster learning empirically in real-world problems. Furthermore, we provide an improved, matching sample-complexity lower bound, which suggests the optimality (in a sense) of one of the KWIK-based algorithms known as delayed Q-learning.

Note: Ph.D.

Note: Includes bibliographical references (p. 238-261)

Note: by Lihong Li

Genre: theses, ETD doctoral

Persistent URL: https://doi.org/doi:10.7282/T3S46S46

Language: eng

Collection: Graduate School - New Brunswick Electronic Theses and Dissertations

Organization Name: Rutgers, The State University of New Jersey

Rights: The author owns the copyright to this work
