Research on Multiagent Search




I initiated an agent research group at NTT and started creating search algorithms for multiagent systems. I worked on path-finding problems and constraint satisfaction problems, the two major search problems in AI.


I focused on extending realtime search algorithms for autonomous agents and for a multiagent world. Though realtime search provides an attractive framework for resource-bounded problem solving, the behavior of the problem solver is not rational enough for autonomous agents. I created algorithms so that the problem solver can utilize and improve previous experience, adapt to dynamically changing goals, and cooperatively solve problems with other problem solvers.


For constraint satisfaction problems, I worked with M. Yokoo and created a new field, the distributed constraint satisfaction problem, which formalizes various coordination processes among multiple agents. The original distributed constraint satisfaction algorithm was born in this group and became widely accepted in the multiagent community.


Controlling learning processes


Existing path planning algorithms can be divided into two classes: offline search such as A*, and realtime search such as Real-Time A* (RTA*) and Learning Real-Time A* (LRTA*). Offline search completely examines every possible path to the goal state before executing that path, while realtime search makes each decision in a constant time and commits its decision to the physical world. The problem solver eventually reaches the goal by repeating the cycle of planning and execution. Realtime search cannot guarantee to find an optimal solution, but can interleave planning and execution.
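The planning-and-execution cycle described above can be pictured as a single LRTA*-style trial; the graph, heuristic table, and function names below are illustrative, not taken from any of the cited papers:

```python
# A minimal LRTA* sketch on a small undirected graph (illustrative names).

def lrta_star_trial(graph, h, start, goal, max_steps=1000):
    """One trial of Learning Real-Time A*: repeatedly move to the neighbor
    minimizing edge cost + heuristic, updating h in place as we go."""
    state = start
    path = [state]
    for _ in range(max_steps):
        if state == goal:
            return path
        # Evaluate each neighbor by edge cost plus its heuristic estimate.
        best_next, best_f = None, float("inf")
        for nbr, cost in graph[state].items():
            f = cost + h.get(nbr, 0)
            if f < best_f:
                best_next, best_f = nbr, f
        # Learning step: raise this state's estimate to the best lookahead value.
        h[state] = max(h.get(state, 0), best_f)
        state = best_next
        path.append(state)
    return None

# A 4-node graph: a - b - d (goal) with unit costs, plus a costlier a - c - d route.
graph = {
    "a": {"b": 1, "c": 1},
    "b": {"a": 1, "d": 1},
    "c": {"a": 1, "d": 3},
    "d": {"b": 1, "c": 3},
}
h = {"a": 2, "b": 1, "c": 1, "d": 0}   # admissible initial estimates
path = lrta_star_trial(graph, h, "a", "d")
```

Each decision costs constant time per neighbor, and the heuristic values left in `h` after a trial make subsequent trials converge toward the optimal route.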


I first invented a way of controlling learning processes. The capability of learning is one of the salient features of realtime search. The major impediment, however, is the instability of the solution quality during convergence. Other problems are that (1) such algorithms try to converge to optimal solutions even after obtaining fairly good ones, and (2) they tend to move toward unexplored areas, thus failing to balance exploration and exploitation. I proposed and analyzed two new realtime search algorithms to stabilize the convergence process. Epsilon-search (weighted realtime search) allows suboptimal solutions with epsilon error to reduce the total amount of learning performed. Delta-search (realtime search with upper bounds) utilizes the upper bounds of estimated costs, which become available after the problem is solved once. Guided by the upper bounds, Delta-search can better control the tradeoff between exploration and exploitation.
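One way to picture the epsilon idea is a trial whose learning update only fires when the current estimate is off by more than a factor of (1 + epsilon); this is a simplified sketch under assumed names, not the algorithm as published:

```python
# Illustrative epsilon-search-style trial: tolerate bounded relative error
# in the estimates before spending learning effort. Names are assumptions.

def epsilon_lrta_trial(graph, h, start, goal, epsilon, max_steps=1000):
    state, path = start, [start]
    for _ in range(max_steps):
        if state == goal:
            return path
        best_next, best_f = None, float("inf")
        for nbr, cost in graph[state].items():
            f = cost + h.get(nbr, 0)
            if f < best_f:
                best_next, best_f = nbr, f
        # Update only when the estimate is off by more than (1 + epsilon):
        # accepting epsilon error cuts the total amount of learning.
        if h.get(state, 0) * (1 + epsilon) < best_f:
            h[state] = best_f
        state = best_next
        path.append(state)
    return None

graph = {"a": {"b": 1, "c": 1}, "b": {"a": 1, "d": 4},
         "c": {"a": 1, "d": 1}, "d": {"b": 4, "c": 1}}
h = {"a": 0, "b": 0, "c": 0, "d": 0}   # start with no knowledge
path = epsilon_lrta_trial(graph, h, "a", "d", epsilon=0.5)
```

With epsilon = 0, the update rule reduces to the standard LRTA* learning step; larger epsilon values stop learning earlier at suboptimal but bounded-error solutions.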


Moving target search


I then considered the case of heuristic search where the goal may change during the course of the search; for example, the goal may be a target that actively avoids the problem solver. I presented a moving target search algorithm (MTS) to solve this problem, and proved that if the average speed of the target is slower than that of the problem solver, the problem solver is guaranteed to eventually reach the target in a connected problem space.
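The pursuit setting can be sketched with an MTS-style rule: move to the neighbor minimizing one step plus a pairwise heuristic estimate, and raise that estimate when the lookahead exceeds it. The names, graph, and the target's fleeing schedule below are assumptions for illustration:

```python
# Sketch of MTS-style pursuit on a 6-node cycle. The solver keeps
# estimates h[(x, y)] of the distance from its position x to the target
# position y; all names and the flee schedule are illustrative.

def mts_step(graph, h, pursuer, target):
    """Move to the neighbor minimizing 1 + h, updating h on the way."""
    best_nbr, best_f = None, float("inf")
    for nbr in graph[pursuer]:
        f = 1 + h.get((nbr, target), 0)
        if f < best_f:
            best_nbr, best_f = nbr, f
    # Learning: the true distance is at least the best lookahead value.
    h[(pursuer, target)] = max(h.get((pursuer, target), 0), best_f)
    return best_nbr

graph = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}  # 6-cycle
h = {}                       # pairwise estimates, implicitly 0 (admissible)
pursuer, target, turns = 0, 3, 0
while pursuer != target and turns < 200:
    pursuer = mts_step(graph, h, pursuer, target)
    # The target flees only every second turn, so its average speed is
    # lower than the pursuer's -- the condition for guaranteed capture.
    if pursuer != target and turns % 2 == 0:
        target = (target + 1) % 6
    turns += 1
captured = (pursuer == target)
```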


The original MTS algorithm was constructed using the fewest operations necessary to guarantee its completeness, and hence is not very efficient. To improve its efficiency, I introduced ideas from the area of resource-bounded planning, including (1) commitment to goals and (2) deliberation for selecting plans. Experimental results demonstrate that the improved MTS is 10 to 20 times more efficient than the original MTS in uncertain situations.


Realtime bidirectional search


I also investigated realtime bidirectional search (RTBS) algorithms, where two problem solvers, starting from the initial and the goal states, physically move toward each other. To evaluate the RTBS performance, two kinds of algorithms are proposed and are compared to realtime unidirectional search. One is called centralized RTBS where a supervisor always selects the best action from all possible moves of the two problem solvers. The other is called decoupled RTBS where no supervisor exists and the two problem solvers independently select their next moves.


Experiments on mazes and n-puzzles show that (1) in clear situations decoupled RTBS performs better, while in uncertain situations centralized RTBS becomes more efficient, and that (2) RTBS is more efficient than realtime unidirectional search for 15- and 24-puzzles but not for randomly generated mazes. These results show that the selection of the problem solving organization amounts to the selection of the problem space, which determines the baseline of organizational efficiency; once a difficult problem space is selected, local coordination among the problem solvers rarely overcomes the deficit.
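The difference between the two organizations amounts to who applies the selection rule. A minimal sketch, with assumed names and a simple line-graph example, might look like:

```python
# Contrasting the two RTBS organizations on one decision cycle
# (illustrative names; not the published algorithms).

def candidate_moves(graph, h, solver, other):
    """Score each neighbor of `solver` by edge cost plus the estimated
    distance from that neighbor to the other solver's position."""
    return [(cost + h(nbr, other), nbr) for nbr, cost in graph[solver].items()]

def decoupled_step(graph, h, fwd, bwd):
    # Decoupled RTBS: no supervisor; each solver independently takes
    # its own best move based on the other's current position.
    new_fwd = min(candidate_moves(graph, h, fwd, bwd))[1]
    new_bwd = min(candidate_moves(graph, h, bwd, fwd))[1]
    return new_fwd, new_bwd

def centralized_step(graph, h, fwd, bwd):
    # Centralized RTBS: a supervisor selects the single best action
    # from all possible moves of the two solvers; only one solver moves.
    best_f = min(candidate_moves(graph, h, fwd, bwd))
    best_b = min(candidate_moves(graph, h, bwd, fwd))
    return (best_f[1], bwd) if best_f <= best_b else (fwd, best_b[1])

# A line graph 0 - 1 - 2 - 3 - 4 - 5 with unit edge costs.
graph = {i: {j: 1 for j in (i - 1, i + 1) if 0 <= j <= 5} for i in range(6)}
h = lambda a, b: abs(a - b)        # exact distances on a line
fwd, bwd = 0, 5
while abs(fwd - bwd) > 1:          # stop when the two frontiers meet
    fwd, bwd = decoupled_step(graph, h, fwd, bwd)
```

On this easy instance both organizations meet in the middle; the experimental claim above is that their relative merit flips as heuristic uncertainty grows.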


Organizational problem solving

I introduced a new problem solving paradigm, called organizational problem solving, where multiple agents cooperatively achieve a shared goal. In this paradigm, during a distributed problem solving process, the reorganization of multiple agents is triggered when the agents observe undesired performance degradation. This reflective feedback between distributed problem solving and organization self-design characterizes organizational problem solving, and enables agents to maintain their efficiency in a dynamically changing environment.


Distributed constraint satisfaction


Collecting all information about a problem from multiple agents incurs not only communication costs but also the cost of translating one's knowledge into an exchangeable format. To avoid the costs of centralizing all information in one agent, I developed a formalism called the distributed constraint satisfaction problem (distributed CSP) and algorithms for solving distributed CSPs. A distributed CSP is a constraint satisfaction problem in which variables and constraints are distributed among multiple agents. Various application problems in multiagent systems can be formalized as distributed CSPs. We developed a technique called asynchronous backtracking that allows agents to act asynchronously and concurrently based on their local knowledge without any global control, while the completeness of the algorithm is guaranteed.
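A heavily simplified, sequential simulation of the asynchronous backtracking message flow could look like the following; all names are illustrative, and the published algorithm exchanges and resolves nogoods far more generally:

```python
# Simplified asynchronous-backtracking-style message loop (illustrative).
# Agents are totally ordered by priority; each keeps an agent_view of
# higher-priority assignments, picks a consistent value, and sends "ok?"
# downstream; a "nogood" goes back up when no value is consistent.
from collections import deque

def solve(domains, constraints):
    """Agents 0..n-1 in priority order (0 highest). constraints[(i, j)]
    is a predicate on (value_i, value_j) with i < j."""
    n = len(domains)
    value = [d[0] for d in domains]       # each agent's current assignment
    view = [dict() for _ in range(n)]     # agent_view: higher-priority values
    nogoods = [set() for _ in range(n)]   # own values ruled out by nogoods
    queue = deque(("ok?", j, i, value[i])
                  for i in range(n) for j in range(i + 1, n))
    while queue:
        kind, to, frm, val = queue.popleft()
        if kind == "ok?":
            view[to][frm] = val
        else:                             # "nogood": our value `val` was rejected
            nogoods[to].add(val)
        # Re-choose a value consistent with the agent_view and nogoods.
        ok = [v for v in domains[to]
              if v not in nogoods[to]
              and all(constraints.get((i, to), lambda a, b: True)(view[to][i], v)
                      for i in view[to])]
        if ok:
            if value[to] != ok[0]:
                value[to] = ok[0]
                nogoods[to].clear()       # simplification: context has changed
                for j in range(to + 1, n):
                    queue.append(("ok?", j, to, value[to]))
        else:
            if not view[to]:
                return None               # highest-priority agent stuck: no solution
            # Send a nogood to the closest higher-priority agent in the view.
            culprit = max(view[to])
            queue.append(("nogood", culprit, to, view[to][culprit]))
    return value

# Three agents coloring a chain x0 != x1, x1 != x2 with two colors.
neq = lambda a, b: a != b
solution = solve([[0, 1], [0, 1], [0, 1]], {(0, 1): neq, (1, 2): neq})
```

The key property being simulated is that no agent ever waits for global synchronization: each reacts only to the messages it has received so far, yet the priority ordering keeps the search complete.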


1. Toru Ishida and Richard E. Korf. Moving-Target Search: A Real-Time Search for Changing Goals. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17, No. 6, pp. 609-619, 1995.

2. Toru Ishida. Real-Time Bidirectional Search: Coordinated Problem Solving in Uncertain Situations. IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 617-628, 1996.

3. Toru Ishida and Masashi Shimbo. Improving the Learning Efficiencies of Realtime Search. National Conference on Artificial Intelligence (AAAI-96), pp. 305-310, 1996.

4. Toru Ishida. Real-Time Search for Learning Autonomous Agents. Kluwer Academic Publishers, 1997.

5. Toru Ishida. Real-Time Search for Autonomous Agents and Multi-Agent Systems. Journal of Autonomous Agents and Multi-Agent Systems, Kluwer Academic Publishers, Vol. 1, No. 2, pp. 139-167, 1998.

6. Makoto Yokoo, Edmund H. Durfee, Toru Ishida, and Kazuhiro Kuwabara. The Distributed Constraint Satisfaction Problem: Formalization and Algorithms. IEEE Transactions on Knowledge and Data Engineering, Vol. 10, No. 5, pp. 673-685, 1998.

7. Makoto Yokoo and Toru Ishida. Search Algorithms for Agents. Gerhard Weiss, Ed., Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. MIT Press, pp. 165-199, 1999.

8. Masashi Shimbo and Toru Ishida. Controlling the Learning Process of Real-time Heuristic Search. Artificial Intelligence, 2003.