Training AI agents with Microsoft Project Bonsai

AnyLogic has joined forces with Microsoft to bring the deep reinforcement learning and machine teaching capabilities of Project Bonsai to practical business applications. Through this collaboration, we have developed an easy-to-use connector that lets you use AnyLogic models as simulators connected to the Bonsai platform. This novel use of business-oriented simulation models brings state-of-the-art adaptive control and deep reinforcement learning to real-world manufacturing and operations. Business analysts and engineers alike can now use advanced artificial intelligence without needing to become data scientists.

Reinforcement learning frames a problem as a Markov decision process in which an AI agent learns a control policy that aims to pick the best available action for each state of the system. It is best suited to systems that are stochastic and dynamic, where a reward-based learning approach can outperform traditional control methods.
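To make the idea of a learned control policy concrete, here is a minimal sketch in Python. It uses tabular Q-learning on a toy five-state Markov decision process, a much simpler technique than the deep reinforcement learning used by Project Bonsai, and it is not part of the AnyLogic connector. The environment, the function names, and all parameter values are assumptions made up for this illustration.

```python
import random

# Hypothetical toy Markov decision process: states 0..4 on a line.
# Action 0 moves left, action 1 moves right. Reaching state 4 ends the
# episode with a reward of 1; every other step gives a reward of 0.
N_STATES = 5
ACTIONS = (0, 1)
GOAL = N_STATES - 1


def step(state, action, rng):
    """Apply an action; 10% of the time the move is reversed (random dynamics)."""
    move = 1 if action == 1 else -1
    if rng.random() < 0.1:
        move = -move
    next_state = min(max(state + move, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice; ties between the two actions are broken at random."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.choice(ACTIONS)
    return 0 if q_row[0] > q_row[1] else 1


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values q[state][action] from simulated episodes."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action, rng)
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """The learned control policy: the highest-valued action for each state."""
    return [0 if row[0] > row[1] else 1 for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent is never told that moving right is the goal. It only sees states and rewards, and the policy emerges from repeated simulated episodes, which is the role a simulation model plays during training.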

Project Bonsai enables subject matter experts, even those with no AI background, to incorporate their expertise directly into an AI model and teach it how to solve real-world business problems. Machine teaching, combined with deep reinforcement learning, helps enterprises build AI models powerful enough to optimize and automate real-world systems.

As the market leader in simulation modeling for businesses, AnyLogic has a large user base among industry leaders in multiple domains who already use simulation for their most complex problems. Adding the deep reinforcement learning and machine teaching capabilities of Project Bonsai opens the way to new types of simulation-based solutions geared toward adaptive control policies.

Project Bonsai platform preview

Project Bonsai wrapper model for easy connection

To simplify the conversion of simulation models into learning environments (called “simulators” in Bonsai), you can use the provided wrapper model, a customized AnyLogic model with all connectivity requirements built in. By dragging and dropping the root agent into the wrapper, you can convert a regular AnyLogic model into a Bonsai-ready simulator.

The wrapper model helps you establish the connection with the Bonsai platform and add the interfaces needed for reinforcement learning (the observation and action spaces used in each episode step). The wrapper and its user guide, which explains the steps involved in preparing a Bonsai-ready simulator, can be downloaded from the link below.


Wrapper and User Guide
AnyLogic Bonsai platform workflow
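
The observation and action interfaces mentioned above can be pictured with a short Python sketch. This is not the AnyLogic wrapper (which is an AnyLogic model) and not the Bonsai API. The class, method names, and the toy queue dynamics are hypothetical, chosen only to show what a simulator exchanges with a training platform at each episode step.

```python
from dataclasses import dataclass


@dataclass
class ToyQueueSimulator:
    """Hypothetical learning environment: one queue served at an adjustable rate."""

    arrival_rate: float = 5.0  # items arriving per step (episode configuration)
    queue_length: float = 0.0  # items currently waiting
    steps: int = 0             # steps taken in the current episode

    def episode_start(self, config: dict) -> dict:
        """Reset the model with an episode configuration; return the first observation."""
        self.arrival_rate = float(config.get("arrival_rate", 5.0))
        self.queue_length = 0.0
        self.steps = 0
        return self.observation()

    def episode_step(self, action: dict) -> dict:
        """Apply one action, advance the model by one step, return the next observation."""
        service_rate = float(action["service_rate"])
        self.queue_length = max(
            0.0, self.queue_length + self.arrival_rate - service_rate
        )
        self.steps += 1
        return self.observation()

    def observation(self) -> dict:
        """Observation space: what the agent is allowed to see."""
        return {"queue_length": self.queue_length, "arrival_rate": self.arrival_rate}

    def halted(self, max_steps: int = 10) -> bool:
        """End the episode after a fixed number of steps."""
        return self.steps >= max_steps


def run_episode(sim, policy, config):
    """Drive one episode: observe, act, step, and collect the observations."""
    obs = sim.episode_start(config)
    history = [obs]
    while not sim.halted():
        obs = sim.episode_step(policy(obs))
        history.append(obs)
    return history


if __name__ == "__main__":
    # A fixed policy that always serves 4 items per step.
    fixed_policy = lambda obs: {"service_rate": 4.0}
    for obs in run_episode(ToyQueueSimulator(), fixed_policy, {"arrival_rate": 5.0}):
        print(obs)
```

During training, the fixed policy in this sketch would be replaced by the agent being trained, which chooses each action from the latest observation.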

Webinar: Deep Reinforcement Learning with AnyLogic and Microsoft Project Bonsai

This webinar introduces the key concepts and workflow for using Project Bonsai and AnyLogic simulation together.

The webinar follows AnyLogic and Microsoft joining forces to bring deep reinforcement learning and machine teaching to business applications. David Coe, Principal Program Manager on the Microsoft Project Bonsai team, and Arash Mahdavi, AnyLogic AI Program Lead, demonstrate the end-to-end workflow for training AI agents using simulation and for implementing AI control in a simulation.

Example models


The two AnyLogic example models below were refactored to serve as learning environments (simulators) for the Bonsai platform. They are already incorporated in the wrapper and are ready to be used for training. You can access these models from the links below. Comprehensive documentation is provided in their companion README.md files.

  • 01

    Activity Based Costing Analysis

    Activity Based Costing Analysis (ABCA) model
    A simple factory floor model in which the cost associated with product processing is calculated and analyzed using Activity-Based Costing (ABC). Each incoming product seizes some resources, is processed by a machine, is conveyed, and later releases the resources. The cost accumulated by a product is broken down into several categories for analysis and optimization. The goal is to reduce the cost per product while maintaining a high overall throughput.

  • 02

    Product Delivery

    Product Delivery model
    A supply chain includes three manufacturing facilities and fifteen distributors that order random amounts of a product every 2 to 10 days. Upon receiving an order from a distributor, a manufacturing facility waits until enough product has been produced to fulfill the order (if its current inventory is insufficient) and then sends a loaded truck to deliver it. The goal is to find the policy that results in the lowest cost per product while also keeping the average delivery time to a minimum. The policy does this by varying which manufacturing facilities are open, as well as the production rate and number of trucks at each facility.

Simulation for training and testing AI – Email Pack

AnyLogic simulation is the training and testing platform for AI in business. With AnyLogic general-purpose simulation, you can construct detailed and robust virtual environments for training and testing your AI models. Its multi-method simulation capabilities provide a comprehensive tool for use in machine learning. Already in use at leading companies across industries, this fully cloud-enabled platform with an open API is enhancing and accelerating AI development today. Find out more about this machine learning tool in our AI email pack and white paper!

AI pack and white paper