News Release

Towards a safe Society 5.0: Reinforcement learning pentesting agent training in realistic network environments

Researchers developed a realistic training framework for reinforcement learning agents that perform penetration testing (pentesting)

Peer-Reviewed Publication

Japan Advanced Institute of Science and Technology

Figure 1: Overview of the PenGym framework architecture.

Credit: Razvan Beuran from JAIST.

  • Researchers at the Japan Advanced Institute of Science and Technology (JAIST), in collaboration with KDDI Research, Inc. (hereafter KDDI Research), implemented PenGym, a framework for creating realistic training environments for reinforcement learning pentesting agents. It supports scenarios of varying complexity that use actual network hosts and security vulnerabilities.
  • Experimental results demonstrated the effectiveness of PenGym as a realistic training environment: PenGym-trained agents showed superior pentesting performance compared to simulation-trained agents.
  • Several optimizations keep PenGym's training duration reasonable compared to simulation, even though the agents execute actual actions on the network hosts, and the approach yields trained agents with a high degree of overall realism.

 

Ensuring the security of network systems and infrastructure is a critical aspect of cybersecurity, and penetration testing (pentesting) is an effective method for evaluating a network's security posture. In recent years, researchers have sought efficient ways to conduct pentesting automatically, since traditional manual methods are time-consuming. One approach uses reinforcement learning (RL) techniques to create automated agents that mimic the actions of human pentesters with greater speed, scale, and precision. Simulation environments have been the main way to train these RL agents. However, simulations rely heavily on predefined constants and probabilistic values for agent actions and environment states, so factors that were not modeled can cause them to diverge from real-world behavior, which reduces agent accuracy and performance. In addition, a simulated network may not accurately represent the configuration and topology of an actual network.

To address this “reality gap”, a team led by Associate Professor Razvan Beuran and his doctoral student Huynh Phuong Thanh Nguyen at the Japan Advanced Institute of Science and Technology (JAIST), working with researchers at KDDI Research as part of a joint project, designed and implemented PenGym, a realistic training framework for RL pentesting agents. PenGym enables RL agents to execute actual actions on realistic hosts in network environments. For this purpose, the framework contains an Action/State Module that implements a set of real pentesting actions through which the RL agents interact with the training environment. The training environment itself is based on the cyber range technology used for human cybersecurity training and is created automatically according to several pentesting scenarios. Several optimization techniques were implemented to reduce PenGym's execution time. Because actions are executed on real hosts, the framework eliminates the need for action modeling and represents network and security dynamics more accurately than simulation-based environments. The study was published in Computers & Security.
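The difference between a modeled action and an executed action can be illustrated with a minimal Python sketch. This is not PenGym's code or API: the function names, the 0.9 success probability, and the dictionary-based state are all assumptions made for illustration. The simulated action draws its outcome from a predefined probability, while the real action attempts an actual TCP connection and reports what the host did.

```python
import random
import socket
from typing import Dict, Optional, Tuple

# Hypothetical state representation: (host, port) -> whether a service was found.
State = Dict[Tuple[str, int], bool]


def simulated_service_scan(host: str, port: int, success_prob: float = 0.9,
                           rng: Optional[random.Random] = None) -> bool:
    """Simulation-style action: the outcome comes from a predefined
    probability (an assumed value), not from the target host."""
    rng = rng or random.Random()
    return rng.random() < success_prob


def real_service_scan(host: str, port: int, timeout: float = 0.5) -> bool:
    """Realistic-style action: attempt an actual TCP connection and
    report whether the port accepted it."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: no service observed.
        return False


def apply_scan_result(state: State, host: str, port: int, found: bool) -> State:
    """Return a copy of the agent's state updated with the observation."""
    new_state = dict(state)
    new_state[(host, port)] = found
    return new_state
```

With either action, an agent loop would call the scan function and pass the result to `apply_scan_result`. Only `real_service_scan` reflects the actual condition of the target, which is the property the release attributes to executing real actions. Such a check should only be run against hosts you are authorized to test.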

Using a real network environment in which actual pentesting actions can be executed, as in this research, yields promising results compared to simulated environments. In particular, the experiments demonstrated the effectiveness of PenGym as a realistic training environment for RL pentesting agents: PenGym-trained agents showed superior pentesting performance in real networks compared to simulation-trained agents.

Based on their experimental results, the researchers believe that their work could lead to changes in various network-related research areas, potentially replacing the traditional approach of building complex logical models to simulate network environments with more realistic methods. Realistic training environments can also be applied to other research areas. One important example is automated cyber defense using RL agents, which can enhance the protection mechanisms of real network infrastructure and contribute to the trustworthiness of Society 5.0. To support other researchers in this field, the team released PenGym as open source on GitHub.

 

###

 

Reference

Title of original paper:

PenGym: Realistic training environment for reinforcement learning pentesting agents

Authors:

Huynh Phuong Thanh Nguyen, Kento Hasegawa (KDDI Research), Kazuhide Fukushima (KDDI Research), Razvan Beuran

Journal:

Computers & Security

DOI:

10.1016/j.cose.2024.104140

 

PenGym source code URL: https://github.com/cyb3rlab/PenGym

 

 

About Japan Advanced Institute of Science and Technology, Japan

Founded in 1990 in Ishikawa Prefecture, the Japan Advanced Institute of Science and Technology (JAIST) was the first independent national graduate school in Japan. Now, after 30 years of steady progress, JAIST has become one of Japan’s top-ranking universities. JAIST has multiple satellite campuses and strives to foster capable leaders with a state-of-the-art education system where diversity is key; about 40% of its alumni are international students. The university has a unique style of graduate education based on a carefully designed coursework-oriented curriculum to ensure that its students have a solid foundation on which to carry out cutting-edge research. JAIST also works closely with both local and overseas communities by promoting industry–academia collaborative research.

 

Collaboration

This work was based on a joint-research project with KDDI Research, Inc.

