Synthetic data generation

Simulation models can be used to generate unlimited amounts of relevant, clean, structured, and labeled training data. When using a simulation model in this way, the basic workflow is to execute multirun simulation experiments (ideally with parallel simulation runs) and record the results in a format that ML algorithms can consume. AnyLogic and AnyLogic Cloud provide a variety of ways to execute the models and write the outputs to a repository of your choice.

Use Cases

Case 1: Test the efficacy of novel ML algorithms

ML researchers can use the simulation model as an engine for generating clean, noise-free, labeled data in any quantity to test the efficacy of novel ML algorithms.


Case 2: Enhancing real-world data with additional synthetic (simulated) data

Simulation models that are properly verified and validated can be used to generate relevant data for training data-hungry ML models, especially deep learning models.


Case 3: Proof-of-concept ML solutions prior to investing in gathering real data

For any business looking to future-proof its machine learning strategy, investing in mechanisms to expand and accelerate data collection is a major decision: any misstep could jeopardize the viability of its future data-centric solutions. Part of the dilemma is deciding, before the data has ever been put to use, what relevance, type, source, and velocity the collected data should have. Synthetic data generated from simulations can help data scientists test their hypotheses with proof-of-concept ML models prior to investing in data gathering methods and technologies.


Case 4: Approximating the simulation models with ML models

A metamodel, which is a simpler representation of or substitute for the simulation model itself, can be developed by training an ML model on the inputs and outputs of a simulation model. This is extremely useful in scenarios where analyzing the results of simulation experiments is computationally very expensive. ML models, and especially deep learning models, have shown a lot of promise in capturing the essence of non-linear dynamic systems. The resulting metamodel can be used for any type of experimentation that requires exploring a massive search space.
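As a minimal sketch of the metamodel idea, the Python example below trains a simple k-nearest-neighbour regressor on the inputs and outputs of a toy queue simulation. Everything here is an assumption made for illustration: the `simulate` function stands in for an expensive simulation run, `KNNMetamodel` is a hypothetical helper, and a real project would more likely use an ML library and a richer model such as a neural network.

```python
import random


def simulate(utilization, seed, n_customers=5000):
    """Toy stochastic single-server queue (service rate fixed at 1.0).

    Stands in for one expensive simulation run; returns the mean waiting time.
    """
    rng = random.Random(seed)
    clock = server_free_at = total_wait = 0.0
    for _ in range(n_customers):
        clock += rng.expovariate(utilization)   # next arrival time
        start = max(clock, server_free_at)      # wait if the server is busy
        total_wait += start - clock
        server_free_at = start + rng.expovariate(1.0)
    return total_wait / n_customers


class KNNMetamodel:
    """Hypothetical k-nearest-neighbour regressor over a single input."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []

    def fit(self, inputs, outputs):
        self.samples = list(zip(inputs, outputs))
        return self

    def predict(self, x):
        # Average the outputs of the k training points closest to x.
        nearest = sorted(self.samples, key=lambda s: abs(s[0] - x))[: self.k]
        return sum(y for _, y in nearest) / len(nearest)


# Generate training data by running the simulation over a grid of inputs.
train_inputs = [0.1 + 0.02 * i for i in range(36)]  # utilization 0.10 to 0.80
train_outputs = [simulate(u, seed=i) for i, u in enumerate(train_inputs)]

# Fit the metamodel, then query it without running the simulation again.
metamodel = KNNMetamodel(k=3).fit(train_inputs, train_outputs)
estimate = metamodel.predict(0.55)
```

Once fitted, `metamodel.predict` answers in a fraction of the time a full simulation run would take, which is what makes large search-space exploration feasible; the price is approximation error that should be checked against held-out simulation runs.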


Case 5: Deploying approximated simulation (trained ML models) on edge devices

An ML metamodel, developed from a simulation model, can serve as a light and portable version of the simulation. As such, it can be deployed on a growing number of AI deployment platforms, including edge devices. This approach provides a practical means of deploying simulation models on the back of deployment infrastructure built for AI solutions.

Workflows and Tools

When using a simulation model as an engine to generate synthetic data, the basic workflow is to execute multirun simulation experiments (ideally with parallel simulation runs) and record the results in a format that ML algorithms can consume. AnyLogic and AnyLogic Cloud provide a variety of ways to execute the models and write the outputs to a repository of your choice.
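The workflow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not AnyLogic's API: `run_simulation` is a toy queue model standing in for one run of a real simulation, and `run_experiment` and `to_csv` are hypothetical helper names. Each run becomes one CSV row of input parameters plus an output label, a layout most ML tooling can read directly.

```python
import csv
import io
import random
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_simulation(arrival_rate, service_rate, seed, n_customers=2000):
    """Toy single-server queue; a stand-in for one run of a real model."""
    rng = random.Random(seed)
    clock = 0.0           # arrival time of the current customer
    server_free_at = 0.0  # time at which the server next becomes idle
    total_wait = 0.0
    for _ in range(n_customers):
        clock += rng.expovariate(arrival_rate)
        start = max(clock, server_free_at)
        total_wait += start - clock
        server_free_at = start + rng.expovariate(service_rate)
    return {
        "arrival_rate": arrival_rate,
        "service_rate": service_rate,
        "seed": seed,
        "mean_wait": total_wait / n_customers,
    }


def run_experiment(arrival_rates, service_rates, replications, max_workers=4):
    """Run every parameter combination `replications` times via a worker pool."""
    jobs = [
        (a, s, seed)
        for a, s in product(arrival_rates, service_rates)
        for seed in range(replications)
    ]
    # Threads keep the sketch portable; CPU-bound pure-Python runs would need
    # processes (or the simulation tool's own parallel engine) for a speed-up.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: run_simulation(*job), jobs))


def to_csv(rows):
    """Serialize results as CSV text: one row per run, inputs then label."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = run_experiment([0.5, 0.8], [1.0, 1.2], replications=3)
csv_text = to_csv(rows)
```

Recording the seed alongside the inputs keeps every row reproducible, so a suspicious data point can be traced back to, and re-run as, a single simulation.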

Outputting to database connected to the model

Outputting to the AnyLogic built-in database or external databases connected to the model.

Each AnyLogic model comes with a built-in database that is suitable for high-velocity data transfers and supports exporting to Excel files. You can also connect directly to Excel and text files (locally or remotely) via AnyLogic's easy-to-use APIs. AnyLogic models can also connect to any relational database that supports JDBC.
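Once simulation outputs sit in a relational database, turning them into a training set is mostly a matter of one query. The Python sketch below illustrates this with an in-memory SQLite database standing in for whichever database the model writes to. The table name `run_results`, its columns, and the inserted rows are all made-up placeholders for illustration, not AnyLogic's schema or real results.

```python
import sqlite3

# In-memory SQLite database standing in for the model's output database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE run_results ("
    " run_id INTEGER PRIMARY KEY,"
    " arrival_rate REAL, service_rate REAL, mean_wait REAL)"
)
# Placeholder rows, shaped as a simulation experiment might have logged them.
conn.executemany(
    "INSERT INTO run_results (arrival_rate, service_rate, mean_wait)"
    " VALUES (?, ?, ?)",
    [(0.5, 1.0, 0.98), (0.8, 1.0, 3.91), (0.5, 1.2, 0.61)],
)
conn.commit()


def load_training_set(connection):
    """Return (features, labels) lists read from the run_results table."""
    cursor = connection.execute(
        "SELECT arrival_rate, service_rate, mean_wait"
        " FROM run_results ORDER BY run_id"
    )
    features, labels = [], []
    for arrival_rate, service_rate, mean_wait in cursor:
        features.append([arrival_rate, service_rate])  # model inputs
        labels.append(mean_wait)                       # simulated output
    return features, labels


features, labels = load_training_set(conn)
conn.close()
```

Keeping inputs and outputs in the same row of one table means the feature/label split is a column selection rather than a join, which simplifies the hand-off to ML code.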

AnyLogic Cloud for better scalability

Using AnyLogic Cloud to rapidly scale multirun simulation experiments and output generation.

AnyLogic Cloud gives you access to a scalable, server-based execution platform for streamlining the generation of outputs. In its graphical environment, you can set up experiments and later export the results to Excel or JSON. The same can also be done via API calls (in JavaScript, Python, and Java). Models in the Cloud support direct connection to any relational database, in addition to automated Excel and JSON export and a write-to-file mechanism via its APIs.