Choongin Lee, Jeonghan Bae, Heejo Lee
Korea University, Seoul, Republic of Korea
Abstract. Protocol reverse engineering is the process of extracting application-level protocol specifications. The specifications are a useful source of knowledge about network protocols and can be used for various purposes. Despite the successful results of prior works, their methods primarily result in the inference of a limited number of message types. We herein propose a novel approach that infers a minimized state machine while having a rich amount of information. The combined input of tokens extracted from the network protocol binary executables and network traces enables the inference of new message types and protocol behaviors which had not been found in previous works. In addition, we propose a state minimization algorithm that can be applied to real-time black-box inference. The experimental results show that our approach can infer the largest number of message types for file-transfer protocol (FTP) and simple mail-transfer protocol (SMTP) compared to eight prior arts. Moreover, we found unexpected behaviors in two protocol implementations using the inferred state machines.
Keywords: Protocol reverse engineering, State machine reconstruction, Automatic protocol analysis
The paper published in the IFIP SEC 2018 confeence proceedings by Springer Verlag