Before listing tokens on its platform, Coinbase conducts an audit of their smart contracts. The exchange has tested ChatGPT's reliability in this area. The result: its performance is insufficient at this stage.
The buzz shows no sign of fading. For months, OpenAI's generative artificial intelligence tool has aroused curiosity and experimentation. ChatGPT is capable of surprising feats. But can it also serve as a decision-support tool for a cryptocurrency exchange?
That is what Coinbase's security teams sought to determine through a battery of tests. The exchange's Blockchain Security team shares its feedback on ChatGPT on its blog.
ChatGPT versus the blockchain security engineer
Composed of computer security engineers, this team audits the ERC-20/721 smart contracts of tokens under consideration for listing. This review yields a security score.
Could ChatGPT replace or assist an engineer in vulnerability analysis? To evaluate the capabilities of OpenAI's AI, Coinbase's cybersecurity experts first had to work out how to teach it to generate a score according to their internal methodology.
Queries are submitted to ChatGPT through a "prompt". Crafting a complex query therefore requires extensive work, a practice now known as prompt engineering. The prompt submitted by Coinbase is as follows:
"I want you to act as a blockchain security engineer. Your task is to identify security risks within a tokenized smart contract based on the risk associated with its functions. Here is our framework [+ risk framework]. Does the following smart contract have any of these risks?"
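In code, such a prompt would typically be assembled per contract and sent to OpenAI's chat API. The sketch below shows one plausible way to do this; the template wording follows the prompt quoted above, but the helper name and the framework/contract placeholders are assumptions, since Coinbase has not published its full risk framework or tooling.

```python
# Hypothetical sketch of assembling the audit prompt for OpenAI's chat API.
# The framework text and contract source are caller-supplied placeholders.

AUDIT_PROMPT_TEMPLATE = (
    "I want you to act as a blockchain security engineer. "
    "Your task is to identify security risks within a tokenized smart "
    "contract based on the risk associated with its functions. "
    "Here is our framework: {framework}\n"
    "Does the following smart contract have any of these risks?\n\n"
    "{contract_source}"
)

def build_audit_messages(framework: str, contract_source: str) -> list[dict]:
    """Assemble the chat messages for a single contract review."""
    return [
        {
            "role": "user",
            "content": AUDIT_PROMPT_TEMPLATE.format(
                framework=framework, contract_source=contract_source
            ),
        }
    ]

# The resulting list would then be passed to the chat completions endpoint,
# e.g. client.chat.completions.create(model="gpt-4", messages=messages).
```

Keeping the framework in the prompt itself, rather than fine-tuning a model, matches the "prompt engineering" approach the article describes.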
8 ChatGPT errors, 5 of them serious
The exchange therefore provided ChatGPT with its internal audit framework, then submitted the code of a smart contract so that it could perform its risk analysis. A total of 20 tokens were submitted to the machine, and the resulting scores were compared with those of an engineer.
In 12 cases, ChatGPT produced the same result as the manual review. In the remaining 8 cases, the AI made a mistake, and 5 of these failures underestimated the risk level. Underestimating a risk score is much more detrimental than overestimating it.
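The comparison above can be tallied with a few lines of code. In this sketch the per-token score pairs are invented for illustration; only the aggregate counts (20 tokens, 12 matches, 5 underestimates) come from Coinbase's write-up.

```python
# Illustrative tally of manual vs. ChatGPT risk scores (higher = riskier).
# The individual score pairs below are hypothetical.

def compare_reviews(pairs):
    """pairs: list of (manual_score, chatgpt_score) tuples."""
    matches = sum(1 for m, c in pairs if m == c)
    underestimates = sum(1 for m, c in pairs if c < m)  # AI rated risk too low
    overestimates = sum(1 for m, c in pairs if c > m)
    return matches, underestimates, overestimates

# 12 agreements, 5 underestimates, 3 overestimates, as in the article:
pairs = [(3, 3)] * 12 + [(4, 2)] * 5 + [(2, 4)] * 3
matches, under, over = compare_reviews(pairs)  # → (12, 5, 3)
# Agreement rate: 12 / 20 = 60%
```

The asymmetry matters for the evaluation: an underestimate could let a risky token through, whereas an overestimate only triggers extra manual review.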
“While the effectiveness of a ChatGPT review is remarkable, there are still some limitations that detract from the tool’s accuracy,” the security engineers judge.
There are several reasons for this assessment. ChatGPT cannot tell when it lacks the data needed to perform a security analysis. It also lacks robustness: the same question does not always produce the same answer.
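A common way to reduce (though not eliminate) this output variance, not discussed in Coinbase's post, is to pin the request parameters: a sampling temperature of 0 makes decoding greedy, and a dated model snapshot avoids silent model updates. The parameter names below are the real OpenAI chat API ones; the specific values are illustrative.

```python
# Hypothetical request parameters to make responses more repeatable.
request_params = {
    "model": "gpt-4-0613",  # dated snapshot rather than a floating alias
    "temperature": 0,       # greedy decoding: far less run-to-run variance
}
```

Even so, identical requests can still differ slightly between runs, which is part of why a fully automated pass/fail gate remains risky.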
ChatGPT still needs to improve
This is problematic for such a sensitive task. The same flaw in ChatGPT has been identified in other use cases, limiting its adoption in B2B environments. The inconsistency is due in part to the updates OpenAI pushes to the model.
“While ChatGPT shows promise in its ability to quickly assess smart contract risks, it does not meet the accuracy requirements necessary to be integrated into Coinbase’s security review processes,” the crypto exchange concludes.
ChatGPT is not entirely disqualified, however. While it cannot be used for automated review, the tool could serve as a “secondary quality control”: it would help engineers perform an additional pass and thus potentially catch risks that went unnoticed.
Artificial intelligence may one day earn a place on Coinbase’s Blockchain Security team after all. With this in mind, “ChatGPT prompts are saved for future use by engineers and there are plans to improve them over time.”