FWF — Austrian Science Fund
Structure in Reinforcement Learning

Ronald Ortner (ORCID: 0000-0001-6033-2208)
  • Grant DOI 10.55776/J3259
  • Funding program Erwin Schrödinger
  • Status ended
  • Start January 1, 2012
  • End October 31, 2012
  • Funding amount € 28,825

Disciplines

Computer Sciences (50%); Mathematics (50%)

Keywords

    Reinforcement Learning, Regret, Markov decision processes, Computational Learning Theory

Abstract

Markov decision processes (MDPs) are a generic tool for modeling stochastic environments and have found various applications since their introduction in the 1950s by Richard Bellman. In the 1980s, artificial intelligence research discovered MDPs as models for learning optimal behavior in environments with "delayed feedback". While various algorithms for reinforcement learning in unknown MDPs have been developed, these methods have not achieved a breakthrough, in spite of some success stories such as Gerald Tesauro's backgammon algorithm. The major practical problem that prevents the use of reinforcement learning algorithms in many potential applications is that typical algorithms are not efficient in environments with large state spaces. While many real-world problems could in principle be handled by representing them as MDPs, such representations usually have a large state space or a large action space (and often both). Typical reinforcement learning algorithms are therefore too costly, as their complexity and regret (the total reward lost with respect to an optimal strategy) grow linearly or even polynomially with the number of states and actions. The reason is that, unlike humans, who can exploit symmetries and similarities in a learning problem, most reinforcement learning algorithms are not able to make use of the environment's structure.
The main focus of the proposed project lies on the investigation of similarity structures for MDPs and the development of algorithms that are able to exploit such structures. The availability of tools that can deal with structured environments will make reinforcement learning much more interesting for problem domains that are currently handled by heuristics, by task-specific expert knowledge, or not at all. Applications would then be restricted neither to toy problems nor to typical reinforcement learning domains like game playing. Instead, more general control problems in areas such as robotics and logistics would become accessible to reinforcement learning methods.
The proposed project will concentrate on two topics. First, similarity structures for state aggregation in MDPs shall be examined and, in a further step, exploited by adaptive online aggregation algorithms. Second, these aggregation techniques shall be applied to MDPs with continuous state space, a setting of particular importance for applications. In the design and analysis of algorithms, the application of suitable upper confidence bounds will play a key role. The project shall be conducted within the SequeL team of INRIA Lille, an interdisciplinary center for reinforcement learning. However, collaboration will not be confined to the SequeL group, as INRIA hosts other groups in neighboring fields such as optimization, statistics, and control theory, which may contribute to the success of the project.
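
The three notions named in the abstract (state aggregation, upper confidence bounds, and regret) can be illustrated with a minimal Python sketch. It is not the project's algorithm: the uniform discretization of a state space assumed to be [0, 1], the UCB1-style confidence width, and all function names (`aggregate`, `ucb_index`, `regret`) are assumptions chosen purely for illustration.

```python
import math
from typing import Sequence


def aggregate(state: float, n_bins: int) -> int:
    """Map a continuous state in [0, 1] to one of n_bins equal-width
    intervals (a fixed, non-adaptive aggregation; illustrative only)."""
    if not 0.0 <= state <= 1.0:
        raise ValueError("state must lie in [0, 1]")
    # The right endpoint 1.0 is assigned to the last interval.
    return min(int(state * n_bins), n_bins - 1)


def ucb_index(mean_reward: float, visits: int, total_steps: int) -> float:
    """Optimistic value: empirical mean plus a UCB1-style confidence width.
    Unvisited (aggregated state, action) pairs get an infinite index."""
    if visits == 0:
        return math.inf
    return mean_reward + math.sqrt(2.0 * math.log(total_steps) / visits)


def regret(optimal_mean_reward: float, rewards: Sequence[float]) -> float:
    """Total reward lost relative to an optimal strategy that earns
    optimal_mean_reward per step on average (assumed known here)."""
    return optimal_mean_reward * len(rewards) - sum(rewards)


if __name__ == "__main__":
    # Nearby states fall into the same aggregated state.
    print(aggregate(0.51, 4), aggregate(0.74, 4))
    # The confidence width shrinks as a pair is visited more often.
    print(ucb_index(0.5, 1, 100), ucb_index(0.5, 10, 100))
    # Regret of a short reward sequence against an optimum of 1.0 per step.
    print(regret(1.0, [0.5, 1.0]))
```

With a fixed grid, the number of aggregated states is chosen in advance; an adaptive scheme of the kind the abstract describes would instead refine or merge groups of states as data arrive.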

Research institution(s)
  • Inria Lille - Nord Europe - 100%

Research Output

  • 42 Citations
  • 2 Publications
Publications
  • 2012
    Title Regret Bounds for Restless Markov Bandits
    DOI 10.1007/978-3-642-34106-9_19
    Type Book Chapter
    Author Ortner R
    Publisher Springer Nature
    Pages 214-228
  • 2012
    Title Adaptive aggregation for reinforcement learning in average reward Markov decision processes
    DOI 10.1007/s10479-012-1064-y
    Type Journal Article
    Author Ortner R
    Journal Annals of Operations Research
    Pages 321-336

Contact

Austrian Science Fund (FWF)
Georg-Coch-Platz 2
(Entrance Wiesingerstraße 4)
1010 Vienna

office(at)fwf.ac.at
+43 1 505 67 40

© Österreichischer Wissenschaftsfonds FWF