Article Info

An Extensive Mining of Ethereum Contract Code by Adapting Relevant Feature Generators to Optimize The Performance of Anomaly Detection Using an Ensemble Model

Sabri Hisham, Mokhairi Makhtar, Azwa Abdul Aziz
dx.doi.org/10.17576/apjitm-2025-1401-15

Abstract

Blockchain 3.0 introduced decentralized applications (dApps), which have made smart contracts and Ethereum (ETH) more widely used in important sectors. However, the open nature of this technology brings high security risks, such as fraud and manipulation of smart contracts. Manual anomaly detection is inefficient because it involves analyzing complex, high-dimensional ETH data sets. Early detection using machine learning is an effective way to identify fraudulent activities. However, previous studies face challenges in selecting relevant features because most of them analyze only historical transactions and a limited set of feature components. This study therefore proposes generating relevant features from a combination of three data sets derived from contract source code extracted from Etherscan.io and labeled with binary classes (normal, Ponzi), namely ABI, opcode and contract transaction, to produce four combinations of hybrid features. The resulting relevant features are the input to the anomaly detection model, which is chosen according to the highest accuracy rate produced by six meta-classifiers combined through the soft voting method. The opcode+ABI+contract transaction feature component is the most relevant, with the highest accuracy rate of 96.86%. The model optimizes classification time, minimizes the misclassification rate, and obtains higher accuracy than both the results of other studies and the Boruta and Searching for Uncorrelated List of Variables (SULOV) techniques. The study contributes by determining the relevant features of hybrid feature components and by producing an ensemble-learning model that improves the performance of anomaly (Ponzi) detection.
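The soft voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the class labels as strings, and the six probability outputs are hypothetical and chosen only to show how averaging class probabilities can differ from a simple majority vote.

```python
from typing import Dict, List

# Binary classes named in the abstract; the string labels are an assumption.
CLASSES = ("normal", "ponzi")


def soft_vote(probabilities: List[Dict[str, float]]) -> Dict[str, float]:
    """Average the per-class probabilities reported by each classifier."""
    if not probabilities:
        raise ValueError("need at least one classifier output")
    n = len(probabilities)
    return {c: sum(p[c] for p in probabilities) / n for c in CLASSES}


def predict(probabilities: List[Dict[str, float]]) -> str:
    """Return the class with the highest averaged probability."""
    averaged = soft_vote(probabilities)
    return max(CLASSES, key=lambda c: averaged[c])


# Hypothetical outputs of six classifiers for a single contract.
# Four lean weakly towards "normal", two are confident it is "ponzi".
example = [
    {"normal": 0.55, "ponzi": 0.45},
    {"normal": 0.55, "ponzi": 0.45},
    {"normal": 0.55, "ponzi": 0.45},
    {"normal": 0.55, "ponzi": 0.45},
    {"normal": 0.10, "ponzi": 0.90},
    {"normal": 0.10, "ponzi": 0.90},
]

if __name__ == "__main__":
    print(soft_vote(example))
    print(predict(example))
```

In this made-up case a majority (hard) vote would label the contract normal, because four of six classifiers lean that way, while soft voting labels it Ponzi because the two confident classifiers outweigh the four weak ones in the averaged probabilities.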

Keywords

Ensemble learning, Ethereum, Ponzi scheme, anomaly detection.

Area