Researchers Propose Packet-Length-Adjustable Attention Model Based on Bytes Embedding for Smart Cybersecurity

Aug 20, 2019

In cybersecurity research, malicious traffic detection is attracting increasing attention because of its ability to detect attacks. However, almost all intrusion detection methods based on deep learning process data poorly as the data length increases.

Most intrusion detection methods handle only the header part of the traffic and omit valuable information in the payload. As a result, they cannot detect malicious traffic when a hacker hides the attack behavior in the payload.

Researchers from the Institute of Acoustics (IOA) of the Chinese Academy of Sciences proposed an attention model that can process network traffic flows of adjustable length to detect payload-based attacks. Furthermore, they designed a Flow-WGAN (Wasserstein Generative Adversarial Network) model that generates new network traffic data from the original data sets, both to augment the network packet data and to protect users' privacy.

The study was published online in the academic journal IEEE Access. 

In network traffic, different binary bytes have different meanings and are related to each other. One-hot encoding (OHE), however, does not reflect the association between them, so the researchers instead converted the bytes into byte vectors by means of a word-embedding method.
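To see why one-hot encoding carries no information about how bytes relate to each other, consider the minimal sketch below. The byte values are chosen purely for illustration (they are not taken from the study): every pair of distinct one-hot vectors is exactly the same distance apart, whether or not the bytes tend to occur together in real traffic.

```python
import math

def one_hot(byte_value, size=256):
    """Return a one-hot list of length `size` with a 1 at index `byte_value`."""
    vec = [0] * size
    vec[byte_value] = 1
    return vec

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Illustrative bytes: 0x47 ('G') and 0x45 ('E') are adjacent in an HTTP "GET",
# while 0x00 is an arbitrary unrelated byte.
d_related = euclidean(one_hot(0x47), one_hot(0x45))
d_unrelated = euclidean(one_hot(0x47), one_hot(0x00))

# Both distances are sqrt(2): one-hot vectors cannot tell the pairs apart.
print(d_related, d_unrelated)
```

A learned embedding, by contrast, is free to place bytes that share contexts closer together.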

In this study, the researchers treated the network flow as natural language and employed the word2vec algorithm to embed bytes into byte vectors. As a result, the distance between any two vectors could represent part of the semantic relationship between the two associated bytes.
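The idea of treating traffic as text can be sketched as follows. This is only an illustration under stated assumptions: the article does not say which word2vec variant or window size was used, so the skip-gram formulation, the `window` parameter, and the helper names `bytes_to_tokens` and `skipgram_pairs` are all hypothetical. Each byte becomes a "word", each packet a "sentence", and the (center, context) pairs are the training examples a skip-gram embedding would learn from.

```python
def bytes_to_tokens(packet: bytes):
    """Treat each byte as a 'word': a packet becomes a 'sentence' of hex tokens."""
    return [f"{b:02x}" for b in packet]

def skipgram_pairs(tokens, window=2):
    """Collect (center, context) pairs within `window` positions of each token."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

# A toy payload fragment, not real captured traffic.
tokens = bytes_to_tokens(b"GET /")
pairs = skipgram_pairs(tokens, window=1)
print(tokens)
print(pairs)
```

Bytes that repeatedly appear in each other's windows end up with similar vectors after training, which is what lets vector distance stand in for a partial semantic relationship.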

They proposed a hierarchical attention model that learns information from two levels of the network flow structure. The model automatically finds the parts that are critical to the classification task and, as a consequence, completes the task with better performance.
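A two-level attention scheme of this kind can be sketched in a few lines. The code below is a simplified illustration, not the researchers' architecture: it assumes the two levels are bytes within a packet and packets within a flow, uses plain dot-product scoring against fixed context vectors (a trained model would learn these and would likely use richer encoders), and all names and numbers are hypothetical. It does show how attention pooling copes with packets of different lengths.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_pool(vectors, context):
    """Weight each vector by softmax(dot(vector, context)); return (weighted sum, weights)."""
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]
    return pooled, weights

def flow_vector(flow, byte_context, packet_context):
    """Pool byte vectors into packet vectors, then packet vectors into one flow vector."""
    packet_vectors = [attention_pool(packet, byte_context)[0] for packet in flow]
    return attention_pool(packet_vectors, packet_context)

# Toy flow: two packets of different lengths, with 2-dimensional byte embeddings.
flow = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5]],
]
vec, packet_weights = flow_vector(flow, byte_context=[1.0, 0.0], packet_context=[0.0, 1.0])
print(vec, packet_weights)
```

The attention weights indicate which bytes and packets contributed most to the final vector, which is the sense in which such a model "finds the critical parts".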

The researchers then processed packet vectors in the same way as bytes. Inspired by natural language models, the bytes are embedded into vectors so that the model only needs to classify the vectors.

Furthermore, the researchers proposed Flow-WGAN to generate new data from the original data set. The generated network flow packets can simulate a new type of Internet application, which researchers can use to evaluate or improve the classifiers.
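For readers unfamiliar with Wasserstein GANs, the sketch below shows the standard WGAN training objectives in their simplest form. It is not the Flow-WGAN implementation: the article does not describe the network architecture or whether weight clipping or another constraint is used, so the clipping step follows the original WGAN recipe as an assumption, and the scores and weights are made-up numbers.

```python
def mean(xs):
    return sum(xs) / len(xs)

def critic_loss(real_scores, fake_scores):
    """The WGAN critic minimizes E[f(fake)] - E[f(real)]."""
    return mean(fake_scores) - mean(real_scores)

def generator_loss(fake_scores):
    """The generator minimizes -E[f(fake)], pushing fake scores upward."""
    return -mean(fake_scores)

def clip_weights(weights, c=0.01):
    """Clamp each critic weight to [-c, c] (original WGAN constraint; assumed here)."""
    return [max(-c, min(c, w)) for w in weights]

# Hypothetical critic outputs for a batch of real and generated flows.
real_scores = [0.9, 0.7]
fake_scores = [0.1, 0.3]

c_loss = critic_loss(real_scores, fake_scores)
g_loss = generator_loss(fake_scores)
clipped = clip_weights([0.5, -0.2, 0.005])
print(c_loss, g_loss, clipped)
```

A more negative critic loss means the critic separates real from generated flows more strongly; the generator improves by raising the critic's scores on its own output.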

The experiment showed that all classifiers had a higher FAR (false acceptance rate) when processing the generated network flow packets, and that the proposed model had a lower FAR than the HAST-IDS (hierarchical spatial-temporal features-based intrusion detection system) model.

 

Figure 1. The structure of the proposed hierarchical attention model (Image by IOA)

 

Figure 2. Flow-WGAN generating a new type of flow (Image by IOA)

Experiments on the ISCX-2012 and ISCX-2017 datasets showed that the proposed model achieved higher accuracy and a higher true positive rate (TPR) than four state-of-the-art deep learning methods. The proposed model also outperformed the existing HAST-IDS in detecting the generated packets. In addition, its training time was 30% less than that of HAST-IDS. This indicates that the proposed model can find the critical parts with the attention mechanism and converge faster.
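The evaluation metrics mentioned in this article can be computed from a confusion matrix as sketched below. The formulas are the conventional ones and are an assumption here, since the article does not spell them out; in particular, FAR is taken to be FP / (FP + TN). The counts are invented for illustration and are not results from the study.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute accuracy, TPR and FAR from confusion-matrix counts.

    tp: malicious flows flagged as malicious
    fp: benign flows flagged as malicious
    tn: benign flows passed as benign
    fn: malicious flows passed as benign
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    tpr = tp / (tp + fn)   # share of malicious flows that were caught
    far = fp / (fp + tn)   # share of benign flows wrongly flagged (assumed definition)
    return {"accuracy": accuracy, "tpr": tpr, "far": far}

# Made-up counts, for illustration only.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
print(metrics)
```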

 
Contact

ZHOU Wenjia

Institute of Acoustics

E-mail:

A Packet-Length-Adjustable Attention Model Based on Bytes Embedding Using Flow-WGAN for Smart Cybersecurity
