Testing or embedding ML models

In this scenario, trained ML models are used from within simulated environments. In other words, an ML model is first trained outside of the AnyLogic ecosystem. Then, to evaluate or utilize its predictive abilities, it can be included as part of the simulation model.
Testing or embedding ML models requires a connection between a trained ML model and the simulation model at runtime. AnyLogic provides multiple connection options, regardless of the language the model was trained in, for both local and remote connections.

Use Cases

Case 1: ML models as an alternative for inputs that represent abstracted behavior in a model

One use of input parameters in a simulation model is to define approximated behavior based on the causal rules present in a real system (e.g., delay times, arrival rates, etc.). These are often modeled as univariate random variables or, in some scenarios, as a random vector with a multivariate probability distribution. ML models can be used as a substitute for these types of input parameters.
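As a minimal sketch of this idea, the snippet below contrasts a classically sampled input parameter (a delay time drawn from an exponential distribution) with a drop-in ML substitute. The interface, feature names, and the linear formula inside MlDelaySource are illustrative assumptions, not a trained model or an AnyLogic API.

```java
import java.util.Random;

// An input parameter (here, a delay time) can be supplied either by a fitted
// probability distribution or by a trained ML model behind the same interface.
interface DelayTimeSource {
    double nextDelayMinutes();
}

// Classical approach: sample a univariate distribution (exponential here).
class ExponentialDelaySource implements DelayTimeSource {
    private final Random rng = new Random(42);
    private final double meanMinutes;
    ExponentialDelaySource(double meanMinutes) { this.meanMinutes = meanMinutes; }
    public double nextDelayMinutes() {
        // Inverse-transform sampling of an exponential distribution.
        return -meanMinutes * Math.log(1.0 - rng.nextDouble());
    }
}

// ML-based substitute: a stand-in for a trained regression model that predicts
// the delay from observable features (queue length, hour of day, ...).
class MlDelaySource implements DelayTimeSource {
    // In a real model this would delegate to an embedded or remote predictor;
    // the linear formula below is a placeholder, not a trained model.
    private int queueLength = 3;
    private int hourOfDay = 14;
    public double nextDelayMinutes() {
        return 2.0 + 0.8 * queueLength + 0.1 * hourOfDay;
    }
}

public class Case1Demo {
    public static void main(String[] args) {
        DelayTimeSource classic = new ExponentialDelaySource(5.0);
        DelayTimeSource learned = new MlDelaySource();
        System.out.println("sampled delay:   " + classic.nextDelayMinutes());
        System.out.println("predicted delay: " + learned.nextDelayMinutes());
    }
}
```

Because both sources satisfy the same interface, the rest of the simulation logic does not need to change when the distribution is swapped for the learned predictor.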

Case 2: Using ML models to approximate the behavior of components in the simulated system

When certain components within a simulation model are exceedingly complex due to the level of detail they require, a trained ML model can be used as a substitute to approximate their behavior. This is similar to the previous case but concerns a specific component of the model that may not necessarily be abstractable with a single value (or distribution). One obvious example would be a physical device that is not readily modellable by generic simulation methods but for which real data exists that can easily be approximated with machine learning.

Case 3: Incorporating any existing, deployed ML models into the simulated environment for increased accuracy

A simulation model should replicate the rules of a real system – a fact that also applies to any existing embedded AI solutions in a given system. Rules and behaviors that are a direct result of a system’s deployed AI solutions should also be incorporated in the simulation. The most natural way of achieving this is to directly embed the AI solutions into the simulation.

Case 4: Testing the impact of an AI solution on the system’s overall performance before deployment

The objective of embedding AI components into a system is to improve overall system performance, not just the performance of the specific components being substituted by AI. It is reasonable to expect that deploying a well-trained AI solution will significantly improve the overall performance of the target system. However, any perturbation in a system has the potential to shift bottlenecks or cause other ripple effects. Testing a trained model on its own (e.g., using a sample of test data) does not verify that the performance of the modified system - as a whole - is sufficiently improved. Simulation models can be used as a virtual, risk-free environment to test the implications of incorporating AI into existing systems.

Case 5: Visualizing the math!

Data scientists are familiar with the problem of communicating the effect of their ML solutions to those who are not familiar with the nuances or implications involved (e.g., customers, managers, or decision makers). One purpose of simulation modeling software is to present a model's dynamic behavior in a fashion that is both visually appealing and easily comprehensible. As such, it can be used to visually show the performance of a simulated environment both with and without an ML solution.

Case 6: Testing RL policies in the original simulated environment

The ultimate objective of reinforcement learning is to learn a useful policy that can optimally control a system. Since the learning process (training) is done in a simulated environment, the same simulation model can also be used to test the learned policy! Assessing the performance of the learned policy would be a useful step before deploying it in the real system, a process that is generally referred to as “Sim-to-Real Transfer”.

Workflows and Tools

Testing or embedding ML models requires a connection between a trained ML model and a simulation model at runtime. AnyLogic provides three general options to connect with ML models, regardless of the language the model was trained in, covering both local and remote connections.

Communication via API calls

Remote communication (via API calls) with ML models hosted by ML/AutoML platforms.

Any trained ML model can be deployed in a way that allows it to be queried via an API. Almost all popular ML/AutoML platforms provide an easy mechanism to do this out-of-the-box. Within your simulation, simple code can be added to construct a request at the appropriate time and apply the returned prediction. This workflow is independent of the programming language the ML model was trained in.
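A rough sketch of what such code could look like in Java is shown below, using the standard java.net.http client available since Java 11. The endpoint URL, the JSON field names, and the response schema are assumptions for illustration; every serving platform defines its own.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Locale;

// Sketch: querying a remotely hosted ML model over HTTP at the appropriate
// moment in a simulation run. Endpoint and JSON schema are hypothetical.
public class MlApiClient {
    private static final String ENDPOINT = "http://localhost:8080/predict"; // hypothetical

    // Build the scoring request from the simulation's current state.
    static HttpRequest buildRequest(double queueLength, double hourOfDay) {
        String body = String.format(Locale.ROOT,
                "{\"queueLength\": %.1f, \"hourOfDay\": %.1f}", queueLength, hourOfDay);
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Naive extraction of a single numeric field from a JSON response;
    // a real model would use a JSON library such as Jackson or Gson.
    static double parsePrediction(String json) {
        String key = "\"prediction\":";
        int i = json.indexOf(key) + key.length();
        int j = json.indexOf('}', i);
        return Double.parseDouble(json.substring(i, j).trim());
    }

    public static void main(String[] args) {
        HttpRequest req = buildRequest(4.0, 14.0);
        // Sending is commented out so the sketch does not require a live endpoint:
        // HttpResponse<String> resp = HttpClient.newHttpClient()
        //         .send(req, HttpResponse.BodyHandlers.ofString());
        // double predicted = parsePrediction(resp.body());
        System.out.println(req.method() + " " + req.uri());
        System.out.println(parsePrediction("{\"prediction\": 6.25}"));
    }
}
```

The returned prediction would then be applied wherever the simulation needs it, e.g., as a delay time or a routing decision.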

ML model embedded into simulation environment

Natively embed a trained ML model into the simulation environment.

Local querying of a trained ML model is possible when the model was trained in Java (AnyLogic's native language) or when the training platform allows the ML model to be exported in a format usable from Java. For example, with H2O.ai Driverless AI, trained models can be downloaded as MOJOs (Model Objects, Optimized) - a scoring engine that can be deployed in any Java environment with the assistance of H2O's Java libraries.
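One way to structure this in a simulation is to hide the embedded model behind a small interface, so the surrounding logic can be written and tested before the exported model is wired in. The sketch below does this with a placeholder scorer; the commented-out class shows roughly where an H2O MOJO-backed implementation (via the h2o-genmodel library) would plug in, and its details should be taken from H2O's own documentation.

```java
// Embedding a locally exported model behind a small interface.
interface Scorer {
    double score(double[] features);
}

/*
// MOJO-backed implementation (requires the h2o-genmodel jar and a downloaded
// MOJO file; consult H2O's documentation for the exact API):
class MojoScorer implements Scorer {
    // Wraps hex.genmodel.MojoModel loaded from the exported file and
    // delegates score() to the model's prediction call.
}
*/

// Placeholder used until the real exported model is available; the intercept
// and weights below are illustrative, not trained values.
class StubScorer implements Scorer {
    public double score(double[] features) {
        double s = 1.0;                          // placeholder intercept
        for (double f : features) s += 0.5 * f;  // placeholder weights
        return s;
    }
}

public class EmbeddedModelDemo {
    public static void main(String[] args) {
        Scorer scorer = new StubScorer(); // swap in the MOJO-backed scorer later
        double predicted = scorer.score(new double[]{2.0, 4.0});
        System.out.println("predicted value: " + predicted);
    }
}
```

Keeping the scorer behind an interface also makes it easy to compare the simulation's behavior with and without the ML component, as described in Case 5.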

Access to ML model using Pypeline library

Usage of the Pypeline library for local access to a trained ML model based in Python.

If a trained ML model is accessible from a Python library, then Pypeline - a custom AnyLogic library - can be used to query it. Pypeline allows you to execute Python scripts with arguments or interactively run Python code from within your AnyLogic models, using a local installation of Python.
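The general shape of this workflow is that the Java side composes small Python statements, which Pypeline executes in a local Python process. The sketch below only builds those statement strings; the commented lines show roughly how they would be passed to Pypeline's PyCommunicator agent inside an AnyLogic model (the agent name, the pickle path, and the scikit-learn-style model API are all assumptions for illustration - refer to Pypeline's documentation for the actual calls).

```java
import java.util.Locale;

// Sketch of the Pypeline workflow: Java composes Python statements, and a
// local Python process executes them. Model file and API are hypothetical.
public class PypelineSketch {
    // One-time setup statement to load a (hypothetical) pickled Python model:
    static final String SETUP =
            "import pickle; model = pickle.load(open('model.pkl', 'rb'))";

    // Build the per-query Python expression from the simulation's current state.
    static String queryExpression(double x1, double x2) {
        return String.format(Locale.ROOT,
                "float(model.predict([[%f, %f]])[0])", x1, x2);
    }

    public static void main(String[] args) {
        // Inside an AnyLogic model with a Pypeline agent (here assumed to be
        // named pyCommunicator), the calls would look roughly like:
        //   pyCommunicator.run(SETUP);
        //   double y = pyCommunicator.runResults(Double.class, queryExpression(3.0, 14.0));
        System.out.println(SETUP);
        System.out.println(queryExpression(3.0, 14.0));
    }
}
```

Composing the query string from the simulation's state at runtime is what lets the model's current conditions (queue lengths, times, etc.) drive each prediction request.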