Sequencing the operations in a machine's queue can be a difficult problem, especially in complex manufacturing systems and under uncertain conditions.
To address it, intelligent, adaptive, and autonomous systems that combine machine learning and simulation have been developed to sequence operations under uncertainty in large manufacturing systems.
The related transportation network and maintenance optimization research is based on the recent case of the renewal of Vienna's oldest subway line, during which large segments of the line are successively closed for several months.
Reinforcement learning (RL), an area of machine learning, has proven suitable for selecting specific operations from queues or repair schedules. This paper presents reinforcement learning as a hyper-heuristic that dynamically adjusts simple priority-based sequencing rules.
Reinforcement learning and simulation
In manufacturing systems, simple sequencing rules assign a priority to the jobs waiting in the queue, based on given criteria, whenever a machine becomes free. For example, the jobs can be sorted by "shortest processing time" (SPT) or "earliest due date" (EDD), and the job with the highest priority is processed next.
This method of sequencing the jobs in front of a machine has long been known and is still applied in industry because of its simplicity.
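The two rules mentioned above reduce to sorting the queue by a single job attribute. A minimal sketch in Python, using a hypothetical `Job` structure (the attribute names are illustrative, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    processing_time: float  # expected time on the machine
    due_date: float         # absolute due date of the job

def next_job_spt(queue):
    """Shortest processing time (SPT): pick the job with the smallest processing time."""
    return min(queue, key=lambda j: j.processing_time)

def next_job_edd(queue):
    """Earliest due date (EDD): pick the job whose due date comes first."""
    return min(queue, key=lambda j: j.due_date)

queue = [Job("A", 5.0, 20.0), Job("B", 2.0, 30.0), Job("C", 8.0, 10.0)]
print(next_job_spt(queue).name)  # -> B (shortest processing time)
print(next_job_edd(queue).name)  # -> C (earliest due date)
```

Each rule looks at only one attribute, which is exactly why a single fixed rule can underperform when the shop-floor situation changes.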
For more complex scenarios, an approach is applied that combines simulation, computational power, hyper-heuristics, and data processing techniques.
Starting in an unknown environment, a reinforcement learning agent performs actions during the simulation, chosen from a set of available activities based on the current environment variables (the states); the mapping from states to actions is called the policy.
Through interaction with the environment and the resulting state change, the agent receives a reward for the action carried out. By trying different actions, the agent learns over time which actions yield a reward.
Using a discrete-event simulation realized in AnyLogic, a machine learning agent is trained as a hyper-heuristic in a flexible job shop manufacturing scenario.
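The hyper-heuristic idea can be sketched with tabular Q-learning, where the agent's actions are the sequencing rules themselves rather than individual jobs. This is a minimal illustration of the technique, not the paper's actual AnyLogic implementation; the `env_step` callback, the two-rule action set, and all parameter values are assumptions for the sketch:

```python
import random

RULES = ["SPT", "EDD"]  # actions: the sequencing rule applied at the next decision point

def q_learning_hyper_heuristic(env_step, states, episodes=500,
                               alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning over (state, rule) pairs.

    env_step(state, rule) must return (next_state, reward, done); it stands in
    for one decision point of the simulation, i.e. a machine becoming free.
    """
    q = {(s, r): 0.0 for s in states for r in RULES}
    for _ in range(episodes):
        state, done = states[0], False
        while not done:
            # epsilon-greedy: mostly exploit the best-known rule, sometimes explore
            if random.random() < epsilon:
                rule = random.choice(RULES)
            else:
                rule = max(RULES, key=lambda r: q[(state, r)])
            next_state, reward, done = env_step(state, rule)
            # standard Q-learning update toward reward plus discounted best follow-up value
            best_next = max(q[(next_state, r)] for r in RULES)
            q[(state, rule)] += alpha * (reward + gamma * best_next - q[(state, rule)])
            state = next_state
    return q
```

Plugged into a toy environment where, say, SPT pays off under high load and EDD under low load, the learned Q-table ends up recommending a different rule per state, which is the essence of dynamically adjusting the sequencing rule.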
Results
A reinforcement learning hyper-heuristic has been integrated into a complex flexible manufacturing job shop simulation with uncertainty to dynamically select and adjust sequencing rules.
The evaluation showed that dynamically selecting single-attribute sequencing rules can improve the key performance indicators of the system.