Exploring the connection between crypto and AI: opportunities and challenges


Over the years, many people have asked me the same question: what are the most useful connections between crypto and AI? It's a reasonable question: crypto and AI have been the two biggest deep (software) technology trends of the past decade, and it feels like there must be some link between the two.

On a surface level, it's easy to see how crypto and AI can work together: crypto's decentralization can counterbalance AI's centralization; AI is opaque, and crypto brings transparency; AI needs data, and blockchains are good for storing and tracking data; and so on. But over the years, whenever people asked me to go a level deeper and talk about specific applications, my disappointed answer has been: "yeah, there are a few things, but not that much."

In the past three years, this has started to change: AI has gotten far more powerful with the rise of modern LLMs, and crypto has gotten far more powerful not only through blockchain scaling solutions, but also through the rise of ZKPs, FHE, and (two-party and N-party) MPC. There are genuinely interesting ways to use AI inside blockchain ecosystems, or together with cryptography, but it is important to be careful about how the AI is used.

One particular challenge is that in cryptography, open source is the only way to make something truly secure, while in AI, an open model (or even open training data) greatly increases its vulnerability to adversarial machine learning attacks. This post will classify the different ways that crypto and AI could intersect, along with the prospects and challenges of each category.

A high-level summary of how crypto and AI intersect, from a uETH blog post. But what does it take for any of these intersections to actually work in a concrete application?

The four major categories of crypto + AI intersections

"AI" is a very broad concept: you can think of it as the set of algorithms that you create not by specifying them explicitly, but by stirring a big computational soup and applying some kind of optimization pressure that nudges the soup toward producing algorithms with the properties you want. This description should definitely not be taken dismissively: it covers the process that created us humans in the first place! But it does mean that AI systems share some common traits: they can be extremely powerful, and there are limits to our ability to know or understand what is going on inside them.

AI can be categorized in many ways. For the purposes of this post, which is about how AI interacts with blockchains (which have been described as a platform for creating "games"), I will use the following categories:

  • AI as a player in a game: AIs participating in mechanisms where the ultimate source of the incentives comes from a protocol with human inputs.
  • AI as an interface to the game: this has the highest potential, but also comes with risks. AIs can help users understand the crypto world around them and ensure that their actions (e.g. signed messages and transactions) match their intentions, so they don't get scammed or tricked.
  • AI as the rules of the game: the category to tread most carefully around. Blockchains, DAOs, and similar mechanisms calling into AIs directly; think "AI judges."
  • AI as the objective of the game: longer-term but intriguing: designing blockchains, DAOs, and similar mechanisms with the goal of constructing and maintaining an AI that could be used for other purposes, where the crypto bits either better incentivize training or prevent the AI from leaking private data or being misused.

Let’s look at these one by one.

AI as a player in a game

This category has existed for nearly a decade, at least since on-chain decentralized exchanges (DEXes) started seeing significant use. Any time there is an exchange, there is an opportunity to make money through arbitrage, and bots can do arbitrage much better than humans can. This use case has been around for a long time, even with AIs much simpler than what we have today, but ultimately it is a very real place where AI and crypto meet. More recently, we have seen MEV arbitrage bots frequently exploiting each other. Any blockchain application that involves auctions or trading will, almost automatically, end up with arbitrage bots in it.

AI arbitrage bots are the earliest example of this category, but I expect it to soon expand to many other applications. Meet AIOmen, a demo of a prediction market where AIs are the players:


Prediction markets have long been a holy grail of epistemics technology. I was excited about using prediction markets as an input for governance ("futarchy") back in 2014, and I played around with them extensively during the last election and more recently. But so far, prediction markets have not taken off all that much, and there is a commonly given list of reasons why: the largest participants are often irrational, people with the right knowledge are unwilling to take the time to bet unless a lot of money is at stake, markets are often thin, and so on.

One response is to point to the ongoing UX improvements at Polymarket and other new prediction markets, and hope that they will succeed where earlier iterations failed. After all, people are willing to bet tens of billions of dollars on sports, so why wouldn't they bet enough on US elections or LK99 for serious players to start showing up? But this argument must contend with the fact that earlier iterations have failed to reach that scale (at least relative to their supporters' hopes), so it seems like something new is needed to make prediction markets succeed. A different response is to point to one specific feature of prediction market ecosystems that we did not have in the 2010s but do have in the 2020s: the possibility of ubiquitous participation by AIs.

AIs are willing to work for less than $1 an hour and have the knowledge of an encyclopedia, and if that is not enough, they can even be integrated with real-time web search. If you create a market and put up a $50 liquidity subsidy, humans will not care enough to bid, but thousands of AIs will happily swarm over the question and make the best guess they can. The incentive to do a good job on any one question may be tiny, but the incentive to build an AI that makes good predictions in general can be very large. Note that you may not even need humans to adjudicate most questions: you could use a multi-round dispute system like Augur or Kleros, with AIs also participating in the earlier rounds. Humans would only need to respond in the rare cases where a series of escalations has taken place and large amounts of money have been committed on both sides.

This is a powerful primitive, because once you can make a "prediction market" work at such a microscopic scale, you can reuse it for many other kinds of questions (a toy sketch follows the list below):

  • Does this social media post follow [terms of use]?
  • What will happen to the price of stock X?
  • Does this account that’s texting me right now really belong to Elon Musk?
  • Is it okay to post this job on an online task marketplace?
  • If you go to https://examplefinance.network, is the dapp a scam?
  • Is 0x1b54….98c3 really the address of the ERC20 token “Casinu Inu”?
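
To make this concrete, here is a minimal toy sketch (not a real protocol) of AI agents swarming a micro-market. The `MicroQuestion`, `AIAgent`, and `settle` names are hypothetical, the "AI" is simulated by a fixed accuracy parameter, and settlement simply splits the liquidity subsidy among agents whose answer matches the resolved outcome; in a real system, the answer would come from an actual model and the resolution from a multi-round dispute process.

```python
import random
from dataclasses import dataclass

@dataclass
class MicroQuestion:
    text: str          # e.g. "Is the dapp at https://examplefinance.network a scam?"
    liquidity: float   # tiny subsidy, e.g. the $50 mentioned above

class AIAgent:
    """Hypothetical AI participant: answers micro-questions and earns from being right."""
    def __init__(self, name: str, accuracy: float):
        self.name = name
        self.accuracy = accuracy   # stand-in for model quality; a real agent would call an LLM
        self.balance = 0.0

    def answer(self, question: MicroQuestion) -> bool:
        # Simulated guess (a real agent would run a model, search the web, etc.).
        # In this demo the true outcome is True, so higher accuracy means more wins.
        return random.random() < self.accuracy

def settle(question: MicroQuestion, agents: list, resolved_outcome: bool):
    """Toy settlement: agents whose answer matches the resolved outcome split the subsidy.
    In practice the outcome would come from a multi-round dispute process (Augur/Kleros style)."""
    answers = {agent: agent.answer(question) for agent in agents}
    winners = [a for a, ans in answers.items() if ans == resolved_outcome]
    for a in winners:
        a.balance += question.liquidity / max(len(winners), 1)
    return answers

# Even a $50 subsidy is worth swarming for agents whose marginal cost is near zero.
q = MicroQuestion("Does this social media post follow the terms of use?", liquidity=50.0)
agents = [AIAgent(f"bot{i}", accuracy=0.8) for i in range(1000)]
settle(q, agents, resolved_outcome=True)
```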

You may notice that many of these ideas go in the direction of what I called "info defense" in my writings on "d/acc." Broadly defined, the question is: how do we help users distinguish true from false information and detect scams, without empowering a centralized authority that could then abuse that power? At the micro level, the answer can be "AI." But at the macro level, the question is: who builds the AI? AI is a reflection of the process that created it, and so it cannot avoid having biases. Hence, there is a need for a higher-level game that adjudicates how well the different AIs are doing, one in which the AIs can participate as players.

This use of AI, where AIs participate in a mechanism where they are ultimately rewarded or penalized (probabilistically) by an on-chain mechanism that gathers inputs from humans (distributed market-based RLHF?), is something I believe is really worth looking into. Now is the right time to explore use cases like this, because blockchain scaling is finally succeeding, making "micro-" anything viable on-chain when it often was not before.

A related category of applications goes in the direction of highly autonomous agents using blockchains to cooperate better, whether through payments or through using smart contracts to make credible commitments.

AI as an interface to the game

In my writings, I have mentioned the opportunity to build user-facing software that protects users' interests by interpreting and identifying dangers in the online world they are navigating. One already-existing example of this is Metamask's scam detection feature:

Another example is the simulation feature in the Rabby wallet, which shows the user the expected consequences of the transaction they are about to sign.

An earlier version of this post described this token as a scam trying to impersonate bitcoin. It is not; it is a memecoin. I apologize for the mistake.

These kinds of tools could be greatly amplified with AI. A modern LLM could understand that BITCOIN is not just a string of characters, but the name of a major cryptocurrency, one that is not an ERC20 token and that has a price far above $0.045, and so on. AI could give a much richer, human-friendly explanation of what kind of dapp you are interacting with, the consequences of the more complicated operations you are signing, and whether a particular token is genuine. Some projects are going all the way in this direction, such as the LangChain wallet, which uses AI as the primary interface. My own view is that pure AI interfaces are probably too risky at the moment, because they increase the risk of other kinds of errors, but AI complementing a more conventional interface is becoming very practical.
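
As a rough illustration of the idea (not any wallet's actual implementation), here is a hedged sketch of an AI-assisted transaction check. The `TokenTransfer` type, the `KNOWN_FACTS` table, and the `explain_and_flag` function are hypothetical stand-ins for what would really be an LLM call backed by curated token lists and live market data:

```python
from dataclasses import dataclass

@dataclass
class TokenTransfer:
    token_symbol: str
    token_address: str
    amount: float
    quoted_price_usd: float

# A tiny stand-in knowledge base; a real assistant would consult an LLM plus
# curated token registries and market data feeds.
KNOWN_FACTS = {
    "BITCOIN": {"is_erc20": False, "approx_price_usd": 40_000.0},
}

def explain_and_flag(tx: TokenTransfer) -> str:
    """Return a plain-language warning or summary for the transaction being signed."""
    facts = KNOWN_FACTS.get(tx.token_symbol.upper())
    if facts is None:
        return f"Unknown token {tx.token_symbol}: proceed with caution."
    if not facts["is_erc20"]:
        return (f"Warning: '{tx.token_symbol}' is not an ERC20 token, but this transaction "
                f"interacts with ERC20 contract {tx.token_address}. "
                f"Likely an impersonation or a memecoin.")
    if tx.quoted_price_usd < 0.01 * facts["approx_price_usd"]:
        return "Warning: quoted price is far below the real market price."
    return "No obvious red flags; here is a plain-language summary of the transaction..."

# Example: a token named BITCOIN priced at $0.045 gets flagged.
print(explain_and_flag(TokenTransfer("BITCOIN", "0x1234...abcd", 100, 0.045)))
```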

There is one particular risk worth mentioning. I will get into this more in the section on "AI as the rules of the game" below, but the general issue is adversarial machine learning: if a user has access to an AI assistant inside an open-source wallet, then the bad guys will have access to that AI assistant too, and they will have unlimited opportunities to optimize their scams so that they do not trigger the wallet's defenses. All modern AIs have bugs somewhere, and it is not too hard for an adversarial training process, even one with only limited access to the model, to find them.

This is where "AIs participating in on-chain micro-markets" works better: each individual AI is vulnerable to the same issues, but you are intentionally building an open ecosystem of dozens of people constantly iterating on and improving them. Furthermore, each individual AI is closed: the security of the system comes from the openness of the rules of the game, not from the internals of each player.

AI can help users understand what is going on in plain language, serve as a real-time tutor, and protect them from mistakes. But be cautious about pitting it directly against malicious misinformers and scammers.

AI as the rules of the game

This is, I think, the most dangerous use of AI, and the one where we need to tread the most carefully: AIs becoming part of the rules of the game. Plenty of people are excited about this; many in mainstream political circles are enthusiastic about "AI judges" (for example, see this article on the website of the "World Government Summit"), and there are analogous aspirations in blockchain applications. If a blockchain-based smart contract or a DAO needs to make a subjective decision, could you simply make an AI part of the contract or DAO to help enforce the rules? For example, is a particular submitted piece of work acceptable in a work-for-hire contract? What is the right interpretation of a natural-language constitution like the Optimism Law of Chains?

This is where adversarial machine learning becomes an extremely tough challenge. The basic two-sentence argument why:

If an AI model that plays a key role in a mechanism is closed, you can't verify its inner workings, and so it's no better than a centralized application. If the AI model is open, then an attacker can download it, simulate it locally, and design heavily optimized attacks to trick it, which they can then replay on the live network.

Example of an adversarial machine learning attack. Source: researchgate.net.

Readers of this blog (or denizens of the cryptoverse) may already be getting ahead of me: but wait! We have zero-knowledge proofs and other really cool forms of cryptography. Surely we can use cryptography to hide the model's inner workings so that attackers can't optimize attacks, while simultaneously proving that the model is being executed correctly and was constructed using a reasonable training process on a reasonable set of data!

Normally, this is exactly the type of thinking I advocate, both on this blog and in my other writings. But when it comes to AI-related computation, there are two major objections:

  • Cryptographic overhead: it is much less efficient to do something inside a SNARK (or MPC or ...) than to do it "in the clear." Given that AI is already very computationally intensive, is doing AI inside cryptographic black boxes even viable?
  • Black-box adversarial machine learning attacks: there are ways to optimize attacks against AI models even while knowing very little about their internal workings. And if you hide too much, you risk making it too easy for whoever chooses the training data to corrupt the model with poisoning attacks.

Both of these are deep rabbit holes, so let's go down each of them in turn.

Cryptographic overhead

Cryptographic gadgets, especially general-purpose ones like ZK-SNARKs and MPC, have a high overhead. An Ethereum block takes a few hundred milliseconds for a client to verify directly, but generating a ZK-SNARK proving the correctness of that block can take hours. The typical overhead of other cryptographic gadgets, like MPC, can be even worse. AI computation is already expensive: the most powerful LLMs can output words only slightly faster than human beings can read them, not to mention the often multi-million-dollar computational cost of training the models. The difference in quality between top-tier models and models that try to economize on training cost or parameter count is large. At first glance, this is a very good reason to be suspicious of the whole project of trying to add guarantees to AI by wrapping it in cryptography.

Fortunately, AI is a very specific type of computation, which makes it amenable to optimizations that more "unstructured" types of computation like ZK-EVMs cannot benefit from. Let us examine how an AI model is structured:

Typically, an AI model consists mostly of a series of matrix multiplications interspersed with per-element non-linear operations such as the ReLU function (y = max(x, 0)). Asymptotically, the matrix multiplications make up most of the work: multiplying two N×N matrices takes O(N^2.8) time, whereas the number of non-linear operations is much smaller. This is really convenient for cryptography, because many forms of cryptography can do linear operations (which matrix multiplications are, at least if you encrypt the model but not the inputs to it) almost "for free."
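
A minimal sketch of this structure, using a toy fully-connected model in numpy (the batch size, width, and depth are arbitrary assumptions, not any particular production model), shows how heavily the operation count skews toward matrix multiplications over elementwise non-linearities:

```python
import numpy as np

def relu(x):
    # Elementwise non-linearity: y = max(x, 0)
    return np.maximum(x, 0.0)

def forward(x, weights):
    """Toy MLP forward pass: alternating matrix multiplications and ReLUs."""
    for W in weights[:-1]:
        x = relu(x @ W)       # the matmul dominates; ReLU touches each element once
    return x @ weights[-1]    # final linear layer, no non-linearity

B, N, layers = 64, 512, 4                 # batch size, layer width, depth (arbitrary)
rng = np.random.default_rng(0)
weights = [rng.standard_normal((N, N)) for _ in range(layers)]
x = rng.standard_normal((B, N))
y = forward(x, weights)

# Rough operation counts for this toy model:
matmul_flops = layers * 2 * B * N * N     # ~2*B*N^2 multiply-adds per layer
relu_ops = (layers - 1) * B * N           # one comparison per activation
print(matmul_flops / relu_ops)            # matmul ops outnumber non-linear ops ~1000x here
```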


If you are a cryptographer, you may have already heard of a similar phenomenon in homomorphic encryption: performing additions on encrypted ciphertexts is really easy, but multiplications are incredibly hard, and we did not figure out any way of doing them with unlimited depth until 2009.

For ZK-SNARKs, an algorithm like this one from 2013 can prove matrix multiplications with less than 4x overhead. Unfortunately, the overhead on the non-linear layers is still significant, and the best implementations in practice show overhead of around 200x. But there is hope that this can be greatly reduced through further research; see this presentation from Ryan Cao for a recent approach based on GKR, and my own simplified explanation of how the main component of GKR works.

But for many applications, we do not just want to prove that an AI output was computed correctly; we also want to hide the model. There are naive approaches to this: you can split the model into layers and store each layer on a different set of servers, and hope that servers leaking some of their layers do not leak too much of the data. But there are also surprisingly effective specialized forms of multi-party computation.
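
Here is a deliberately naive sketch of the layer-splitting idea, with hypothetical `LayerServer` and `split_inference` names: each server holds only one layer's weights and only ever sees intermediate activations. Note that this naive version still leaks information through those activations; the specialized MPC protocols mentioned above are what make the approach genuinely private.

```python
import numpy as np

class LayerServer:
    """Hypothetical server that holds a single layer's weights and applies it."""
    def __init__(self, W):
        self._W = W                              # this server never sees the other layers

    def apply(self, activations):
        # One layer of the model: matmul followed by ReLU.
        return np.maximum(activations @ self._W, 0.0)

def split_inference(x, servers):
    """Chain the activations through each server in turn.
    No single server ever holds the full model, only its own layer."""
    for server in servers:
        x = server.apply(x)
    return x

N = 256
rng = np.random.default_rng(1)
servers = [LayerServer(rng.standard_normal((N, N))) for _ in range(3)]
output = split_inference(rng.standard_normal((1, N)), servers)
```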

A simplified diagram of one of these approaches, where the inputs are public but the model is private. It is possible to hide the model and the inputs too, but that gets a bit more complicated (see pages 8–9 of the paper).

In both cases, the moral of the story is the same: the heaviest part of an AI computation is matrix multiplications, for which it is possible to make very efficient ZK-SNARKs or MPCs (or even FHE), and so the total overhead of putting AI inside cryptographic boxes is surprisingly low. Generally, it is the non-linear layers that are the biggest bottleneck despite their smaller size; perhaps newer techniques like lookup arguments can help.

Black-box adversarial machine learning

Now, let's get to the other big problem: the attacks you can perform against a model even if its contents are kept private and you only have "API access" to it. Quoting a 2016 paper:

Many machine learning models are vulnerable to adversarial examples: inputs specially crafted to cause the model to produce an incorrect output. Adversarial examples that affect one model often affect another model, even if the two models have different architectures or were trained on different training sets, so long as both models were trained to perform the same task. An attacker may therefore train their own substitute model, craft adversarial examples against the substitute, and transfer them to a victim model, with very little information about the victim.

Use black-box access to a "target classifier" to train and refine your own locally stored "inferred classifier." Then, generate locally optimized attacks against the inferred classifier. It turns out these attacks will often also work against the original target classifier. Diagram source.

Potentially, you may even be able to create attacks knowing just the training data, with very limited or no access to the model you are trying to attack. As of 2023, these kinds of attacks continue to be a big problem.
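
To illustrate the transfer phenomenon in the simplest possible setting, here is a toy sketch in which both the "target" and the "substitute" are plain linear classifiers (a big simplification relative to real models, and the FGSM-style perturbation is only an analogy): the attacker queries only the target's labels, fits its own substitute, crafts a perturbation against the substitute, and the perturbation usually flips the target's decision too.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20

# The hidden "target classifier": the attacker can only query its labels (black box).
w_target = rng.standard_normal(d)
def target_label(x):
    return np.sign(x @ w_target)

# Step 1: attacker queries the black box on its own data and trains a substitute.
X = rng.standard_normal((500, d))
y = target_label(X)
w_sub, *_ = np.linalg.lstsq(X, y, rcond=None)        # crude substitute fit

# Step 2: craft an FGSM-style adversarial perturbation against the *substitute*.
x = rng.standard_normal(d)
eps = 0.5
x_adv = x - eps * np.sign(w_sub) * target_label(x)   # push against the predicted class

# Step 3: the attack usually transfers to the real target classifier.
print("original label:", target_label(x), " adversarial label:", target_label(x_adv))
```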

We need to do two things to stop these kinds of black-box threats in their tracks:

  • Really limit who or what can query the model and how much. Black boxes with unrestricted API access are not secure; black boxes with very restricted API access may be.
  • Hide the training data, while preserving confidence that the process used to create the training data is not corrupted.
The project that has gone the furthest on the first of these is perhaps Worldcoin, an earlier version of which (among other protocols) I analyze at length elsewhere. Worldcoin uses AI models extensively at the protocol level, to (i) convert iris scans into short "iris codes" that are easy to compare for similarity, and (ii) verify that the thing it is scanning is actually a human being. The main defense Worldcoin relies on is the fact that it does not let anyone simply call into the AI model: rather, it uses trusted hardware to ensure that the model only accepts inputs digitally signed by the orb's camera.
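
As a loose sketch of the general pattern (explicitly not Worldcoin's actual implementation), the gate below accepts only inputs carrying a valid device signature and enforces a per-caller query quota. A real deployment would use asymmetric signatures from trusted hardware; the shared-key HMAC here is a stand-in chosen for brevity, and all class and function names are hypothetical.

```python
import hmac, hashlib

# Stand-in for a signing key embedded in the capture device (e.g. a camera).
# A real system would use asymmetric signatures from trusted hardware, not a shared HMAC key.
DEVICE_KEY = b"device-secret-key"

def device_sign(image_bytes: bytes) -> str:
    return hmac.new(DEVICE_KEY, image_bytes, hashlib.sha256).hexdigest()

class GatedModel:
    """Only accepts device-signed inputs, and rate-limits queries per caller."""
    def __init__(self, max_queries_per_caller: int = 3):
        self.max_queries = max_queries_per_caller
        self.query_counts = {}

    def infer(self, caller: str, image_bytes: bytes, signature: str):
        if not hmac.compare_digest(signature, device_sign(image_bytes)):
            raise PermissionError("input was not produced by an authorized device")
        used = self.query_counts.get(caller, 0)
        if used >= self.max_queries:
            raise PermissionError("query quota exhausted for this caller")
        self.query_counts[caller] = used + 1
        return self._run_model(image_bytes)      # placeholder for the actual AI model

    def _run_model(self, image_bytes):
        return {"is_human": True}                # dummy output for the sketch

model = GatedModel()
img = b"\x00\x01fake-iris-scan"
print(model.infer("alice", img, device_sign(img)))
```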

This approach is not guaranteed to work: it turns out that you can make adversarial attacks against biometric AI in the form of physical patches or jewelry that you put on your face:

Put an extra thing on your face, and you can evade detection or even impersonate someone else. Source.

But the hope is that if you combine all these defenses (hiding the AI model itself, heavily limiting the number of queries, and requiring each query to be authenticated in some way), you can make adversarial attacks difficult enough that the system can be secure. In Worldcoin's case, strengthening these other defenses could also reduce its dependence on trusted hardware, decentralizing the project further.

And this gets us to the second part: how can we hide the training data? This is where "DAOs to democratically govern AI" might actually make sense: we can create an on-chain DAO that governs who is allowed to submit training data (and what attestations are required on the data itself), who is allowed to make queries and how many, and then use cryptographic techniques like MPC to protect the entire pipeline of creating and running the AI, from each user's training input all the way to the final output of each query. The same DAO could simultaneously satisfy the highly popular objective of compensating people for submitting data.
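
Here is a toy, non-authoritative sketch of what the governance-facing side of such a DAO might track: commitments (hashes) to training data from approved submitters, plus per-query reward accounting. The `TrainingDataRegistry` class and its policy checks are invented for illustration, and the actual protection of training and inference via MPC is entirely out of scope here.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class TrainingDataRegistry:
    """Toy stand-in for a registry a DAO might govern: it records commitments to
    training data and credits contributors whenever the model is queried."""
    approved_submitters: set = field(default_factory=set)
    commitments: dict = field(default_factory=dict)   # data hash -> contributor
    rewards: dict = field(default_factory=dict)       # contributor -> accrued reward

    def submit(self, contributor: str, data: bytes, proof_ok: bool):
        # The DAO decides who may submit and what attestations the data must carry.
        if contributor not in self.approved_submitters or not proof_ok:
            raise PermissionError("submission rejected by DAO policy")
        data_hash = hashlib.sha256(data).hexdigest()
        self.commitments[data_hash] = contributor     # only the commitment is recorded

    def record_query(self, fee: float):
        # Split the query fee among contributors (many fairer schemes are possible).
        if not self.commitments:
            return
        share = fee / len(self.commitments)
        for contributor in self.commitments.values():
            self.rewards[contributor] = self.rewards.get(contributor, 0.0) + share

registry = TrainingDataRegistry(approved_submitters={"alice"})
registry.submit("alice", b"some training example", proof_ok=True)
registry.record_query(fee=1.0)
```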

Again, it is important to stress that this plan is very ambitious, and there are several ways it could prove impractical:

  • Cryptographic overhead could still turn out too high for this kind of fully black-box architecture to be competitive with traditional closed "trust me" approaches.
  • It could turn out that there is no good way to make the training-data submission process decentralized and protected against poisoning attacks.
  • Multi-party computation gadgets could break their safety or privacy guarantees if participants collude, as has happened over and over again with cross-chain cryptocurrency bridges.
One reason I did not start this section with more big red warning labels saying "DON'T DO AI JUDGES, THAT'S DYSTOPIAN" is that our society already depends heavily on unaccountable centralized AI judges: the algorithms that determine which kinds of posts and political opinions get boosted and deboosted, or even deleted, on social media. I do think that expanding this trend even further is a bad idea at this stage, but I do not think the blockchain community experimenting more with AIs will be what makes things worse.

In fact, there are some pretty basic, low-risk ways that crypto technology can improve even these existing centralized systems. One simple technique is verified AI with delayed publication: when a social media site uses AI to rank posts, it could publish a ZK-SNARK proving the hash of the model that generated that ranking. The site could commit to revealing its AI models after, for example, a one-year delay. Once a model is revealed, users could check the hash to verify that the correct model was published, and the community could run tests on the model to verify its fairness. The publication delay would ensure that by the time the model is revealed, it is already outdated.
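
A minimal sketch of the commit-and-reveal part of this idea (the ZK-SNARK proving that a given ranking was produced by the committed model is not shown, and the function names are hypothetical):

```python
import hashlib, time

def model_commitment(weights_bytes: bytes) -> str:
    """Hash of the serialized model weights; the platform publishes this alongside rankings."""
    return hashlib.sha256(weights_bytes).hexdigest()

# At ranking time: the platform publishes the commitment (plus, ideally, a ZK-SNARK
# that the ranking was computed by a model with this hash, omitted here).
weights_v1 = b"serialized model weights"
published = {
    "model_hash": model_commitment(weights_v1),
    "reveal_not_before": time.time() + 365 * 24 * 3600,   # e.g. a one-year delay
}

# A year later: the platform reveals the weights, and anyone can re-hash and compare.
def verify_reveal(revealed_weights: bytes, record: dict) -> bool:
    return model_commitment(revealed_weights) == record["model_hash"]

print(verify_reveal(weights_v1, published))    # True if the revealed model matches the commitment
```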

So the question is not whether we can do better than the centralized world, but by how much. For the decentralized world, however, it is important to be careful: if someone builds, for example, a prediction market or a stablecoin that uses an AI oracle, and it turns out that the oracle is attackable, that is a huge amount of money that could disappear in an instant.

AI as the objective of the game

If the techniques described above for making a scalable, decentralized, private AI, whose contents are a black box not known by anyone, actually work, then they could also be used to create AIs with utility going beyond blockchains. The NEAR protocol team is making this a core objective of their ongoing work.

This needs to be done for two reasons:

  • If you could make "trustworthy black-box AIs" by running the training and inference process with some combination of blockchains and MPC, then many applications where users worry about the system being biased or cheating them could benefit. Many people want democratic governance of systemically important AIs that we will all depend on; cryptographic and blockchain-based techniques could be a path toward getting there.
  • From an AI safety perspective, this would be a way to create a decentralized AI that also has a natural kill switch, and which could restrict queries that seek to use the AI for malicious behavior.

It is also worth noting that "using crypto incentives to incentivize making better AI" can be done without going the full route of encrypting everything with cryptography: approaches like BitTensor fall into this category.

In conclusion

Now that both blockchains and AIs are becoming more powerful, there is a growing number of use cases at the intersection of the two. However, some of these use cases make much more sense and are much more robust than others. In general, the most immediately promising and easiest-to-get-right use cases are those in which the underlying mechanism continues to be designed roughly as before, but the individual players become AIs, allowing the mechanism to operate effectively at a much more micro scale.

The most challenging to get right are applications that attempt to use blockchains and cryptographic techniques to create a "singleton": a single decentralized trusted AI that some other application would rely on for some purpose. These applications have promise, both for functionality and for improving AI safety in a way that avoids the centralization risks of more mainstream approaches to that problem. But there are also many ways in which the underlying assumptions could fail; hence, it is worth treading carefully, especially when deploying these applications in high-value and high-risk contexts.

I look forward to seeing more attempts at constructive uses of AI in all of these areas, so we can see which of them are truly viable at scale.
