Planning and Reinforcement Learning workshop at ICAPS 2021
December 17 · 5 min read

By Hector Palacios, Research Scientist.

The second edition of the Bridging the Gap Between AI Planning and Reinforcement Learning workshop was held during ICAPS 2021, the International Conference on Automated Planning and Scheduling. This joint workshop was organized for AI researchers who work at the intersection of AI Planning and Reinforcement Learning, with the goal of encouraging discussion and collaboration between the two communities. Both communities study sequential decision problems, but they differ in emphasis and methods, and each has little awareness of the other's specific issues, techniques, methodologies, and evaluation practices.

Before we jump into the workshop content, I’d like to provide a bit of background for anyone new to AI Planning or Reinforcement Learning for intelligent decision-making in the context of Enterprise AI research.

AI Planning vs. Reinforcement Learning for intelligent decision-making

On the one hand, the AI Planning community aims to create algorithms guaranteed to achieve defined goals in specific worlds. On the other hand, the Reinforcement Learning (RL) community seeks to develop learning algorithms that produce a policy with a low average error rate in future instances of a specific world, but with no guarantee for any individual instance, since a given instance could be an uncommon case that a "good" policy ignores.

Reinforcement Learning algorithms require that new instances come from the same world used for training: they emphasize obtaining effective policies at the expense of specializing in a particular fixed world. In contrast, AI Planning emphasizes flexibility, robustness, and adaptation to new instances and worlds at the expense of requiring a world description.
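To make the "same world" requirement concrete, here is a minimal sketch of tabular Q-learning on a toy five-cell corridor. The corridor, the reward, the hyperparameters, and every function name below are illustrative assumptions, not something presented at the workshop. The policy is trained with the goal at the right end and then reused, without retraining, in a corridor where the goal has moved to the left end.

```python
import random

N_CELLS = 5          # toy corridor with cells 0..4 (an assumption for illustration)
ACTIONS = (-1, +1)   # step left, step right


def step(state, action, goal):
    """Deterministic dynamics: walls clamp the move, reward 1 on reaching the goal."""
    nxt = min(max(state + action, 0), N_CELLS - 1)
    done = nxt == goal
    return nxt, (1.0 if done else 0.0), done


def q_learning(goal, episodes=2000, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy (off-policy)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice([s for s in range(N_CELLS) if s != goal])
        for _ in range(50):  # cap on episode length
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action, goal)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_rollout(q, start, goal, max_steps=20):
    """Follow the greedy policy; return the steps needed to reach the goal, or None."""
    state = start
    for steps in range(max_steps + 1):
        if state == goal:
            return steps
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, _ = step(state, action, goal)
    return None


q = q_learning(goal=4)                        # train with the goal at the right end
print(greedy_rollout(q, start=0, goal=4))     # world used for training
print(greedy_rollout(q, start=2, goal=0))     # changed world: goal moved to the left end
```

In the training world the greedy policy walks straight to the goal. In the changed world it keeps heading right and never arrives, so the rollout returns None. A planner given the new goal as part of its world description would not need retraining, which is the trade-off described above.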

The state of the art in Reinforcement Learning consists of algorithms that need applied research before they can be used in specific domains, sometimes with great success, as with AlphaGo, a superhuman Go player, or AlphaFold, an algorithm specialized in protein folding.

State-of-the-art algorithms in AI Planning can obtain plans with thousands of actions for unseen world descriptions. For some problems, AI Planning is particularly effective. For instance, the Logistics domain is one of the many benchmarks used for evaluating planning techniques. The planning community relies on a standard language for describing planning domains and problems: the Planning Domain Definition Language (PDDL). New domains are introduced in the international planning competitions.

You can view and solve an instance of Logistics using this example in the online tool planning.domains, where you can also browse other Logistics instances and import other domains.

The same Logistics example can be explored using Visual Studio Code. If you want to test RL algorithms in planning domains, both deterministic and with probabilistic effects, check out the PDDLGym repo. Documentation and educational material about planning are available at education.planning.domains and planning.wiki.
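For readers who prefer code to the online tools, the sketch below shows the basic idea of planning from a world description: actions are declared with preconditions and effects, and a generic search finds a plan for any instance of that description. This is a deliberately tiny, hypothetical Logistics-like world with one truck and one package. It is plain Python rather than PDDL, and it uses breadth-first search, whereas state-of-the-art planners rely on much stronger heuristic search. All names are assumptions made for illustration.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]      # facts that must hold before the action
    add: FrozenSet[str]      # facts the action makes true
    delete: FrozenSet[str]   # facts the action makes false


def make_actions(locations: Iterable[str]) -> List[Action]:
    """Ground the drive/load/unload actions for one truck and one package."""
    locations = list(locations)
    actions = []
    for a in locations:
        for b in locations:
            if a != b:
                actions.append(Action(f"drive {a} {b}",
                                      frozenset({f"truck-at {a}"}),
                                      frozenset({f"truck-at {b}"}),
                                      frozenset({f"truck-at {a}"})))
        actions.append(Action(f"load {a}",
                              frozenset({f"truck-at {a}", f"pkg-at {a}"}),
                              frozenset({"pkg-in-truck"}),
                              frozenset({f"pkg-at {a}"})))
        actions.append(Action(f"unload {a}",
                              frozenset({f"truck-at {a}", "pkg-in-truck"}),
                              frozenset({f"pkg-at {a}"}),
                              frozenset({"pkg-in-truck"})))
    return actions


def bfs_plan(init, goal, actions) -> Optional[List[str]]:
    """Breadth-first forward search; returns a shortest plan, or None if unsolvable."""
    start, goal = frozenset(init), frozenset(goal)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                       # every goal fact holds
            return plan
        for act in actions:
            if act.pre <= state:                # action is applicable
                nxt = (state - act.delete) | act.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [act.name]))
    return None


actions = make_actions(["depot", "office"])
plan = bfs_plan({"truck-at depot", "pkg-at office"}, {"pkg-at depot"}, actions)
print(plan)

# A new instance with an extra location needs no retraining, only a new description.
actions3 = make_actions(["depot", "office", "port"])
plan3 = bfs_plan({"truck-at port", "pkg-at office"}, {"pkg-at depot"}, actions3)
print(plan3)
```

The same `bfs_plan` function handles both instances because the world is supplied as data. The cost is that someone has to write that description down, which is the requirement AI Planning accepts in exchange for flexibility.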

Why do AI Planning and Reinforcement Learning matter for Enterprise AI?

First, while Reinforcement Learning methods are becoming state of the art in many AI use cases and applications, they do not scale as well as AI Planning when we account for generalization over a family of situations in the same domain, changes in the domain, and guarantees per instance. The relationship between Learning and Reasoning is at the core of this mismatch in scalability and calls for tighter integration to overcome the weaknesses of each family of methods.

Second, ServiceNow offers customers a unified platform for Enterprise AI. A team of human agents using the ServiceNow platform can execute multiple actions to achieve their goals, e.g., to resolve a new IT incident. In such a scenario, AI planning and Reinforcement Learning can provide AI methods to recommend sequences of actions that resolve such incidents. In contrast, standard Machine Learning methods might be unaware of the consequences of intermediate actions.

I hope that you have found this overview of AI Planning and Reinforcement Learning to be useful. Here’s more on the workshop.

Bridging the Gap Between AI Planning and Reinforcement Learning (PRL) workshop

The organizers of the second edition of PRL accepted 25 papers out of 35, featured 5 invited talks, and hosted 11 oral presentations of papers, a poster session, and 5 discussion sessions on topics that cut across the accepted papers. The recordings, papers, and posters are available on the PRL website. We are organizing a new edition of the PRL workshop. Please reach out if you would like to learn more or get involved.

Invited talks included:

Discussion session topics included:

  • Abstractions in Planning & RL.
  • Safe, Risk-sensitive, and Robust Planning and RL.
  • Domain Generalization in Planning and RL.

The papers converged around the following problems and techniques:

  • Problems considered:
      • Improving RL generalization in the same domain/world.
      • Using planning models to improve RL.
      • Safe RL and verifying RL guarantees.
  • Techniques used:
      • Planning and RL algorithms.
      • World simulators, some of them using a hidden planning model as ground truth.
      • Graph Neural Networks (GNNs).
      • Optimization under constraints.
      • Hierarchical representations.

Thank you to the co-organizers

We want to thank the co-organizers of the workshop: Hector Palacios from Element AI, a ServiceNow Company; Vicenç Gómez and Anders Jonsson from Universitat Pompeu Fabra in Barcelona, Spain; Scott Sanner from the University of Toronto; Andrey Kolobov from Microsoft Research in Redmond, WA, USA; and Alan Fern from Oregon State University.

Related content

In addition to being the co-chair of the PRL workshop, Hector Palacios participated in the main track of ICAPS 2021 with an invited talk titled “Planning for Controlling Business-to-business Applications” (recording).