Training AI agents with Microsoft Project Bonsai

AnyLogic has joined forces with Microsoft to bring the deep reinforcement learning and machine teaching capabilities of Project Bonsai to practical business applications. Through this collaboration, we have developed an easy-to-use connector that allows you to use AnyLogic models as simulators connected to the Bonsai platform. This novel use of business-oriented simulation models brings state-of-the-art adaptive control and deep reinforcement learning to real-world manufacturing and operations. Business analysts and engineers alike can now use advanced artificial intelligence without having to become data scientists.


Reinforcement learning frames a problem as a Markov decision process in which an AI agent learns a control policy that picks the best possible action for a given state of the system. Ideally, the system is somewhat random and dynamic, which is where a reward-based learning approach has an advantage over traditional control methods.
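
As a rough illustration of this idea, the sketch below trains a tabular Q-learning agent on a tiny, noisy Markov decision process. It is not how Project Bonsai trains its agents (Bonsai uses deep reinforcement learning). The five-state chain, the slip probability, the rewards, and all function names are assumptions made up for this example.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
SLIP = 0.1            # assumed probability that a move goes the opposite way


def step(state, action, rng):
    """Apply one noisy transition and return (next_state, reward, done)."""
    if rng.random() < SLIP:
        action = 1 - action
    move = 1 if action == 1 else -1
    next_state = max(0, min(N_STATES - 1, state + move))
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # small cost per step, bonus at the goal
    return next_state, reward, done


def train(episodes=2000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Learn a table of action values with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action, rng)
            # Move the estimate toward reward plus discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the best-known action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns one action per non-terminal state. On this toy problem the learned policy should move right, toward the goal, in every state, even though individual moves are random.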

Project Bonsai enables subject matter experts, even those with no AI background, to incorporate their expertise directly into an AI model and teach it how to solve real-world business problems. Machine teaching, combined with deep reinforcement learning, helps enterprises to build AI models powerful enough to optimize and automate real-world systems.

As the market leader in simulation modeling for businesses, AnyLogic has a large user base among industry leaders in multiple domains who already use simulation for their most complex problems. Adding the deep reinforcement learning and machine teaching capabilities of Project Bonsai opens the way to new types of simulation-based solutions geared toward adaptive control policies.

Figure: Project Bonsai platform preview

Project Bonsai wrapper model for easy connection

To simplify the conversion of simulation models into learning environments (called “simulators” in the Project Bonsai platform), we have added a new experiment type, “RLExperiment”, to all versions of AnyLogic (Professional, Researcher, and the free Personal Learning Edition). You can use this RLExperiment to export a model with the built-in mechanisms it needs to communicate with the Project Bonsai platform.

The RLExperiment has the fields needed for communication between your simulation model and the Bonsai platform: the configuration and terminal condition of each episode, and the observation and action space used in each episode step.
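
To show how these four pieces fit together, here is a minimal, language-agnostic sketch of one training episode, written in Python. It is not the RLExperiment API or the Bonsai API. The `ToySimulator` class, its method names, the queue model, and the configuration keys are all hypothetical stand-ins chosen for illustration.

```python
import random


class ToySimulator:
    """Hypothetical stand-in for a wrapped simulation model."""

    def reset(self, config):
        # Configuration: set the starting conditions of the episode.
        self.arrival_rate = config["arrival_rate"]
        self.horizon = config["horizon"]
        self.queue = 0
        self.step_count = 0
        self.rng = random.Random(config.get("seed", 0))

    def observe(self):
        # Observation: what the agent sees at each episode step.
        return {"queue": self.queue, "step": self.step_count}

    def apply(self, action):
        # Action: the agent's decision, applied before the model advances.
        arrivals = sum(self.rng.random() < self.arrival_rate for _ in range(3))
        served = min(self.queue + arrivals, action["machines"])
        self.queue += arrivals - served
        self.step_count += 1

    def is_terminal(self):
        # Terminal condition: stop at the horizon or when the queue overflows.
        return self.step_count >= self.horizon or self.queue > 20


def run_episode(sim, config, policy):
    """Run one episode and return the list of (observation, action) pairs."""
    sim.reset(config)
    trajectory = []
    while not sim.is_terminal():
        observation = sim.observe()
        action = policy(observation)
        sim.apply(action)
        trajectory.append((observation, action))
    return trajectory
```

For example, `run_episode(ToySimulator(), {"arrival_rate": 0.5, "horizon": 10}, lambda obs: {"machines": 3})` runs a ten-step episode with a fixed policy. In an actual training setup the action at each step would come from the agent being trained rather than from a hard-coded function.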


Figure: AnyLogic Bonsai platform workflow

Example models


The two AnyLogic example models below were refactored to serve as learning environments (simulators) for the Bonsai platform. They already include the wrapper and are ready to be used for training. You can get access to these models from the links below. Comprehensive documentation is provided in their companion README.md files.

  • Activity Based Costing Analysis (ABCA) model

    A simplified factory floor model in which the cost associated with product processing is calculated and analyzed using Activity-Based Costing (ABC). Each incoming product seizes some resources, is processed by a machine, is conveyed, and later releases the resources. The cost accumulated by a product is broken down into several categories for analysis and optimization. The goal is to reduce the cost per product while maintaining a high overall throughput.

  • Product Delivery model

    A supply chain includes three manufacturing facilities and fifteen distributors that order random amounts of a product every 2 to 10 days. Upon receiving an order from a distributor, a manufacturing facility waits until enough product has been produced to fulfill the order (if its current inventory is not sufficient) and then sends a loaded truck to deliver it. The goal is to find the policy that results in the lowest cost per product while also keeping the average delivery time to a minimum. The policy does this by varying which centers are open, as well as the production rate and number of trucks at each center.

Simulation for training and testing AI – Email Pack

AnyLogic simulation is a training and testing platform for AI in business. With AnyLogic general-purpose simulation, you can construct detailed and robust virtual environments for training and testing your AI models. Its multi-method simulation capabilities provide a comprehensive tool for use in machine learning. Already in use at leading companies across industries, this fully cloud-enabled platform with an open API is enhancing and accelerating AI development today. Find out more about this machine learning tool in our AI email pack and white paper!
