This repository contains the results of the following conference publication:
An Approach for Deep Reinforcement Learning for Production Program Planning in Value Streams
- Nikolai West
- Florian Hoffmann
- Lukas Schulte
- Victor Hernandez Moreno
- Jochen Deuse
The application of Reinforcement Learning (RL) methods offers potential for improvement in operational Production Program Planning. Numerous influences and domain-specific practices characterize the multi-dimensional planning paradigm. RL can support human planning personnel in the determination of optimal production parameters. This requires a suitable abstraction of the overall system by means of simulation and subsequent optimization by a self-learning agent. In this paper, the authors present an application example for sequence planning using RL. The case study includes a discrete-event simulation built with SimPy that is trained by a Duelling Deep-Q-Network implemented in PyTorch. Finally, the suitability of two reward functions is discussed. The authors provide the full case study via GitHub.
2021 ASIM 2nd Simulation in Production and Logistics (ASIM 2021)
- Published (available)
The repository contains two main files to perform the proposed approach for deep reinforcement learning for production program planning in manufacturing value streams.
1. Simulation of a simple manufacturing system (factory_simulation.py)
Contains Python code for a simple manufacturing simulation. This simulation models a factory with five stations and two products (A and B). It is implemented using SimPy and provides an interface that resembles an OpenAI Gym environment, making it suitable for Reinforcement Learning applications.
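To illustrate the kind of interface this refers to, here is a minimal sketch of a factory environment exposing Gym-style reset/step methods. The station loading rule, reward, and horizon are illustrative assumptions only; the actual factory_simulation.py is event-driven via SimPy and models the system described in the paper.

```python
import random


class FactoryEnvSketch:
    """Toy stand-in for the factory: five stations, two products (A, B).

    Hypothetical simplification: each step, the agent releases one
    product (0 = A, 1 = B); the reward penalizes the bottleneck station.
    """

    N_STATIONS = 5

    def __init__(self, horizon=20, seed=None):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Start a new episode; all stations begin idle."""
        self.t = 0
        self.station_load = [0.0] * self.N_STATIONS
        return tuple(self.station_load)

    def step(self, action):
        """Gym-like step: returns (observation, reward, done, info)."""
        assert action in (0, 1), "0 = product A, 1 = product B"
        # Toy rule: product A loads even-indexed stations more, B odd ones.
        for i in range(self.N_STATIONS):
            heavy = (i % 2) == action
            self.station_load[i] += self.rng.uniform(1, 3) if heavy else self.rng.uniform(0, 1)
        reward = -max(self.station_load)  # penalize the bottleneck station
        self.t += 1
        done = self.t >= self.horizon
        return tuple(self.station_load), reward, done, {}
```

An agent can then interact with it in the usual Gym loop: call reset() once, then step(action) until done is True.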
2. Deep reinforcement learning agent using DDQN (factory_agent.py)
Contains Python code that defines a reinforcement learning agent for optimizing manufacturing processes. The agent uses Q-learning to learn the best actions to take in a simulated factory environment, balancing exploration and exploitation through an epsilon-greedy strategy. It also includes functions for training the agent and making decisions to maximize rewards and improve manufacturing efficiency. The design of the simulation is chosen so as to allow one ideal scenario, outlined below. It is the agent's job to identify such a pattern using the DDQN.
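Two of the building blocks mentioned above can be sketched in a few lines of plain Python: the dueling head that combines a state-value stream with an advantage stream, and epsilon-greedy action selection. This is a didactic sketch, not the repository's PyTorch implementation in factory_agent.py; the function names are hypothetical.

```python
import random


def dueling_q(value, advantages):
    """Combine the value and advantage streams of a Dueling DQN head:
    Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a').
    Subtracting the mean advantage keeps the decomposition identifiable."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


def epsilon_greedy(q_values, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# Example: V(s) = 1.0, advantages for two actions
q = dueling_q(1.0, [2.0, 4.0])   # -> [0.0, 2.0]
action = epsilon_greedy(q, 0.1, random.Random(42))
```

During training, epsilon is typically annealed from near 1 toward a small floor, so the agent explores broadly at first and increasingly exploits its learned Q-estimates later.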
All results of the training are made available in the repository (/results). The folder contains data and plots for both reward functions (RF1 and RF2) as outlined in the paper.
- Clone the Repository: Clone this repository to your local machine
- Install Dependencies: Set up a new environment and install the required packages from requirements.txt, e.g. pip install -r requirements.txt
- Run the Project: You can now run the project. For example:
  - To run the simulation, use the following command: python factory_simulation.py
  - To train the agent, execute the training script: python train_agent.py
- Explore the Results: After running, you can explore the generated results, plots, or trained models based on the parameters you have set for the simulation (or agent).
We welcome contributions to this repository. If you have a feature request, bug report, or proposal, please open an issue. If you wish to contribute code, please open a pull request.