Researchers from Salus Security tested how well GPT-4 and other artificial intelligence (AI) systems detect common security vulnerabilities in smart contracts. The study aimed to assess GPT-4’s performance in parsing and auditing smart contract code.
The researchers used a dataset of 35 smart contracts, named the SolidiFI-benchmark vulnerability library, containing a total of 732 vulnerabilities. GPT-4 demonstrated proficiency in code parsing and providing vulnerability hints, making it a useful tool in smart contract auditing.
However, its limitations in vulnerability detection were evident, preventing it from entirely replacing professional auditing tools and experienced auditors.
The findings showed that when GPT-4 did flag a vulnerability, it was usually right, with precision above 80% in testing. However, it missed most of the vulnerabilities present: its recall rate was only 11%, which means a large number of false negatives. Precision measures the share of flagged issues that are real vulnerabilities, while recall measures the share of all real vulnerabilities that are actually found.
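To make the gap between the two metrics concrete, here is a minimal Python sketch of how precision and recall are computed from true positives, false positives, and false negatives. The counts are hypothetical, chosen only to land near the reported percentages; they are not figures from the Salus Security study, and the function names are illustrative.

```python
def precision(tp: int, fp: int) -> float:
    """Share of flagged issues that are real vulnerabilities."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of all real vulnerabilities that were flagged."""
    actual = tp + fn
    return tp / actual if actual else 0.0


# Hypothetical counts for illustration only (NOT taken from the study):
# 8 real vulnerabilities flagged, 2 false alarms, 64 real vulnerabilities missed.
tp, fp, fn = 8, 2, 64

print(f"precision = {precision(tp, fp):.2f}")  # 8 / (8 + 2)
print(f"recall    = {recall(tp, fn):.2f}")     # 8 / (8 + 64)
```

With these made-up counts, a tool can be right about most of what it reports while still missing the large majority of the vulnerabilities in the code, which is the pattern the researchers describe.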
The researchers concluded that GPT-4’s vulnerability detection capabilities are currently lacking, with an overall accuracy reaching only 33%. As a result, they recommend a combined approach, using GPT-4 alongside dedicated auditing tools and human expertise to enhance the accuracy and efficiency of smart contract audits.
In summary, while GPT-4 demonstrates potential in code parsing and providing hints for vulnerabilities in smart contracts, it falls short in comprehensive vulnerability detection. The study underscores the importance of a multi-faceted approach to smart contract auditing, leveraging both AI tools and traditional methods for optimal results.