Sendner, Christoph; Chen, Huili; Fereidooni, Hossein; Petzi, Lukas; König, Jan; Stang, Jasper; Dmitrienko, Alexandra; Sadeghi, Ahmad-Reza; Koushanfar, Farinaz
Ethereum smart contracts are automated decentralized applications on the blockchain that describe the terms of the agreement between buyers and sellers, reducing the need for trusted intermediaries and arbitration. However, deploying smart contracts introduces new attack vectors into cryptocurrency systems. In particular, programming flaws in smart contracts have already been exploited, causing enormous financial losses. Hence, it is crucial to detect the various vulnerability types in contracts effectively and efficiently. Existing vulnerability detection methods are limited in scope, as they typically focus on one or a very small set of vulnerabilities. Moreover, extending them to new vulnerability types requires costly redesign.
In this work, we develop ESCORT, a deep learning-based vulnerability detection method that uses a common feature extractor to learn generic bytecode semantics of smart contracts and separate branches to learn the features of each vulnerability type. As a multi-label classifier, ESCORT can detect multiple vulnerabilities in a contract at once. Unlike prior detection methods, ESCORT can be easily extended to new vulnerability types with limited data via transfer learning: when a new vulnerability type emerges, ESCORT adds a new branch to the trained feature extractor and trains only that branch on the limited data. We evaluate ESCORT on a dataset of 3.61 million smart contracts and demonstrate that it achieves an average F1 score of 98% on six vulnerability types in initial training and an average F1 score of 96% on five additional vulnerability types in the transfer learning phase. To the best of our knowledge, ESCORT is the first deep learning-based framework that applies transfer learning to new vulnerability types with minimal model modification and re-training overhead. Compared with existing non-ML tools, ESCORT can be applied to contracts of arbitrary complexity and ensures 100% contract coverage. In addition, it enables concurrent detection of multiple vulnerability types within a single unified framework, thus avoiding the effort of setting up multiple tools and greatly reducing detection time. We will open-source our dataset and the data labeling toolchain to facilitate future research.
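The branch-per-vulnerability design described above can be illustrated with a small, dependency-free sketch. It is not the ESCORT implementation: the abstract does not specify the network, so `SharedExtractor` (a frozen table of per-byte vectors averaged over the bytecode), `Branch` (a logistic-regression head), and `MultiBranchDetector` are hypothetical stand-ins chosen only to show the mechanics. The point it tries to capture is that each vulnerability type gets its own head on top of shared features, and that adding a new type trains a new head while the extractor and the existing heads stay untouched.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class SharedExtractor:
    """Toy stand-in for the common feature extractor (hypothetical, not ESCORT's network)."""

    def __init__(self, dim=8, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # One fixed vector per byte value; treated here as already trained and frozen.
        self.table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(256)]

    def features(self, bytecode):
        # Average the per-byte vectors into one fixed-size feature vector.
        if not bytecode:
            return [0.0] * self.dim
        sums = [0.0] * self.dim
        for byte in bytecode:
            for i, value in enumerate(self.table[byte]):
                sums[i] += value
        return [s / len(bytecode) for s in sums]


class Branch:
    """One binary head per vulnerability type (logistic regression on shared features)."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def predict(self, feats):
        return sigmoid(sum(w * x for w, x in zip(self.w, feats)) + self.b)

    def fit(self, feature_rows, labels, epochs=200, lr=0.2):
        # Full-batch gradient descent on the logistic loss; only this head is updated.
        n = len(feature_rows)
        for _ in range(epochs):
            grad_w = [0.0] * len(self.w)
            grad_b = 0.0
            for feats, y in zip(feature_rows, labels):
                err = self.predict(feats) - y
                grad_b += err
                for i, x in enumerate(feats):
                    grad_w[i] += err * x
            self.w = [w - lr * g / n for w, g in zip(self.w, grad_w)]
            self.b -= lr * grad_b / n


class MultiBranchDetector:
    """Shared extractor plus one independently trained branch per vulnerability type."""

    def __init__(self, extractor):
        self.extractor = extractor
        self.branches = {}

    def add_branch(self, name, samples, labels):
        # Transfer-learning step: reuse the frozen features, train only the new head.
        rows = [self.extractor.features(s) for s in samples]
        branch = Branch(self.extractor.dim)
        branch.fit(rows, labels)
        self.branches[name] = branch

    def detect(self, bytecode):
        # Multi-label output: one independent probability per vulnerability type.
        feats = self.extractor.features(bytecode)
        return {name: br.predict(feats) for name, br in self.branches.items()}
```

In this sketch, a call such as `detector.add_branch("new_type", samples, labels)` with a handful of labeled bytecode samples corresponds to the extension step, and `detector.detect(bytecode)` returns one score per registered type in a single pass over the shared features. The branch names and sample data are placeholders; a real system would learn the extractor jointly with the initial branches before freezing it.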