Function selectors are a way of identifying functions in EVM (Ethereum Virtual Machine) smart contracts. A function’s signature is hashed using the keccak256 algorithm, and the first 4 bytes of the resulting hash are used as the function selector. The selector identifies the function and, by extension, the parameters it takes, so that the caller can provide the correct information and the smart contract can call the correct function.
In this post I will go over how it is possible to change the function selectors of smart contracts after they have been compiled to bytecode. I will also go over some of the reasons someone might want to do this, and how it could potentially be used in real situations.
What is a Function Selector
Function selectors are the first four bytes of the Keccak-256 hash of a function’s signature. (Ethereum uses the original Keccak-256, which is often loosely called SHA-3 but differs from the finalized SHA-3 standard in its padding.) A function signature is made up of the function’s name and the types of its arguments. In Ethereum, the selector derived from this hash is used to identify which function is being called when a contract receives a message. This allows multiple functions with the same name but different argument types to exist within a contract and be distinguished from one another.
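For example, the function transfer(address sender, uint256 amount) has the canonical signature transfer(address,uint256), because parameter names and spaces are dropped before hashing. Hashing that signature with Keccak-256 produces a function selector of 0xa9059cbb. Similarly, withdraw(uint256 wad) has the signature withdraw(uint256), which is hashed to produce a function selector of 0x2e1a7d4d.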
Function selectors are important because they allow the caller of a contract to specify the exact function they want to call, along with the necessary arguments. This ensures that the correct function is called and that the contract has the necessary information to execute it.
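To make the derivation concrete, here is a minimal sketch in Python. Python’s built-in hashlib only offers the finalized SHA-3, which is not the same hash, so the sketch carries its own small Keccak-256 implementation. The helper names (rol64, keccak_f1600, keccak256, selector) are my own and not part of any library, and the code is meant for illustration rather than production use.

```python
MASK64 = (1 << 64) - 1
RATE_BYTES = 136  # Keccak-256 absorbs 1088 bits per block


def rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def keccak_f1600(state: bytearray) -> bytearray:
    """Apply the 24-round Keccak-f[1600] permutation to a 200-byte state."""
    lanes = [
        [int.from_bytes(state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8], "little")
         for y in range(5)]
        for x in range(5)
    ]
    lfsr = 1  # generates the round constants on the fly
    for _ in range(24):
        # theta: mix every column with its neighbours
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR the round constant into lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y): 8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (padding byte 0x01), the variant the EVM uses."""
    padded = bytearray(data) + bytearray(RATE_BYTES - len(data) % RATE_BYTES)
    padded[len(data)] ^= 0x01  # first padding bit
    padded[-1] ^= 0x80         # last padding bit
    state = bytearray(200)
    for offset in range(0, len(padded), RATE_BYTES):
        for i in range(RATE_BYTES):
            state[i] ^= padded[offset + i]
        state = keccak_f1600(state)
    return bytes(state[:32])


def selector(signature: str) -> str:
    """First four bytes of the hash of a canonical signature, as 0x-hex."""
    return "0x" + keccak256(signature.encode("ascii"))[:4].hex()


if __name__ == "__main__":
    for sig in ("transfer(address,uint256)", "withdraw(uint256)"):
        print(sig, selector(sig))
```

In practice you would normally rely on a maintained library for the hash. The hand-written permutation is only here to keep the sketch free of dependencies.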
You can find more information and examples of function selectors on the Solidity by Example website, which has a section dedicated to this topic.
My Usecase
I recently learned that it’s possible to modify function selectors in order to make it difficult to detect the use of unknown smart contracts. This can be useful for mitigating the risk of using unknown code, but it’s important to note that this approach is neither practical nor safe, because it relies on flying blind without access to the source code. However, for those who are interested in learning more about this technique and my personal use case, here’s how it works:
First, it’s important to understand that all bytecode is available on the blockchain network. This means that it’s possible to use already deployed smart contract bytecode for your own purposes, although this comes with its own set of risks. To make it less obvious that you’re using unknown code, you can create another smart contract that calls your first contract under different function hashes. This makes the reuse difficult to detect with tools like Etherscan.
However, there’s a catch: smart contract addresses are not the same as normal user signing addresses, so any function that uses a check like require _param1 == addr(_param1) will revert instantly. This means that you need to find a different way to modify the function selectors.
One approach relies on the fact that a compiled contract doesn’t care about the original names or parameters of its functions. The bytecode only contains the 4-byte hashes and compares them against the start of the call data, and because hashing is a one-way operation, nothing in the bytecode depends on the original signature. You can therefore change a hash in the bytecode to almost anything you want and call the function with the new hash. This will work for some hex values but not others, because the compiler applies a small optimization to function lookups. By using a decompiler like ethervm.io, you can see that for most smart contracts the compiler picks the function hash that is numerically between the others and uses it to decide whether to search the function selectors below it or above it. This cuts the search time roughly in half.
Once you understand this, you can change your function hashes, deploy your smart contracts, and hopefully be undetected. However, it’s important to keep in mind that this approach is not practical or safe, and is only recommended as a learning experience. As you gain more knowledge and experience in smart contract design and efficiency, you can move on to creating your own robust smart contracts.
Identifying Function Selectors
First, you will need the bytecode of your target contract that was used in the smart contract creation. There are two types of bytecode, usually named “bytecode” and “deployed bytecode.” If you are not familiar with the difference between the two, you can read about it in this Medium post: The difference between bytecode and deployed bytecode.
To obtain the function hashes, you can use a tool like ethervm.io. For this example, we will be using an unknown MEV Bot smart contract on the Ethereum network. Take the deployed bytecode from Etherscan and paste it into the text box at the bottom of ethervm.io. Alternatively, you can enter the contract address and hit “decompile.” This will give you a decompiled version of the contract.
Once you have the decompiled contract, you can identify the functions in the hex-compiled smart contract code, also known as the bytecode. In this example, we will change the 0x78e111f6 (Unknown) and 0x9c52a7f1 deny(address) functions.
Note: Using the ethervm.io decompiler requires “deployed bytecode” as input but for the rest of this post we will be working with “bytecode”.
Note: “Bytecode” can be found in the Input Data field of the transaction that created the smart contract, for example the MEV bot’s creation transaction.
Shortened Bytecode With Functions Highlighted 0x608060405234801561001057600080fd5b50604051602080610d628339810180604052602081101561003057600080fd5b5051336000908152602081905260409020600190556100578164010000000061005e810204565b5050610114565b3360009081526020819052604081205460011461007a57600080fd5b600160a060020a03821615156100f157604080517f08c379a000000000000000000000000000000000000000000000000000000000815260206004820152601f60248201527f64732d70726f78792d63616368652d616464726573732d726571756972656400604482015290519081900360640190fd5b5060018054600160a060020a038316600160a060020a0319909116178155919050565b610c3f806101236000396000f3fe6080604052600436106100a3576000357c010000000000000000000000000000000000000000000000000000000090048063
Note: 78e111f6 is used twice in the bytecode, and we will change both occurrences.
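If you would rather count selector occurrences programmatically than by eye, the following Python sketch walks the bytecode one opcode at a time and tallies every 4-byte constant pushed by a PUSH4 instruction (opcode 0x63). It assumes the selectors are embedded as PUSH4 immediates, which is what the example contract shows but is not guaranteed for every compiler version. The function name push4_immediates and the SAMPLE fragment are made up for illustration; the fragment is hand-assembled and is not taken from the MEV Bot.

```python
from collections import Counter

PUSH1, PUSH4, PUSH32 = 0x60, 0x63, 0x7F


def push4_immediates(code: bytes) -> Counter:
    """Count every 4-byte constant that a PUSH4 instruction pushes."""
    found = Counter()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if PUSH1 <= op <= PUSH32:
            size = op - PUSH1 + 1  # PUSHn carries n bytes of inline data
            if op == PUSH4 and pc + 1 + size <= len(code):
                found[code[pc + 1: pc + 5].hex()] += 1
            pc += 1 + size  # skip the inline data so it is not read as opcodes
        else:
            pc += 1
    return found


# Hypothetical dispatcher fragment, hand-assembled for illustration only.
SAMPLE = bytes.fromhex(
    "80" "6378e111f6" "11" "610040" "57"  # DUP1 PUSH4 GT PUSH2 JUMPI (pivot test)
    "80" "6378e111f6" "14" "610060" "57"  # DUP1 PUSH4 EQ PUSH2 JUMPI
    "80" "639c52a7f1" "14" "610080" "57"  # DUP1 PUSH4 EQ PUSH2 JUMPI
)

if __name__ == "__main__":
    for value, count in push4_immediates(SAMPLE).items():
        print(f"0x{value} appears {count} time(s)")
```

A linear walk like this can misread constructor arguments or metadata appended to the code, so treat its output as a list of candidates to confirm against the decompiler.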
Changing Function Selectors
We can change the 4-byte function selectors in our bytecode to any desired value, with one notable exception. Depending on how the bytecode is compiled, the compiler may optimize it in order to improve the speed at which functions are called. If the bytecode you are modifying has a small number of functions, you may be able to ignore this optimization. However, this is often not the case.
For example, the WETH9 contract uses an if (var0 == "function selector") operation to find the correct function selector, while our example MEV Bot contract uses an if (0x78e111f6 > var0) statement first and then an if (var0 == "function selector") check to find the function selector. This is because the MEV Bot contract was optimized to use less gas when looking up functions.
We must take this optimization into account when modifying our bytecode.
Note: Complicated and heavily optimized contracts may have nested if statements, which adds to the complexity.
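The two lookup styles can be modeled in a few lines of Python. This is only a rough model of the decompiled if statements, not the compiler’s actual output. The selector values come from this post, while the labels, the function names, and the assumption that the midpoint function itself sits in the upper branch are mine.

```python
from typing import Dict, Optional

Table = Dict[int, str]


def dispatch_linear(selector: int, table: Table) -> Optional[str]:
    """Compare the selector against every known value in turn."""
    for known, label in table.items():
        if selector == known:
            return label
    return None  # no match: a real contract would hit its fallback or revert


def dispatch_split(selector: int, pivot: int, lower: Table, upper: Table) -> Optional[str]:
    """First choose a branch with `pivot > selector`, then compare inside it."""
    branch = lower if pivot > selector else upper
    return dispatch_linear(selector, branch)


# Selector values come from the post; the labels are placeholders.
PIVOT = 0x78E111F6
LOWER = {0x1CFF79CD: "function_below_pivot"}
UPPER = {0x78E111F6: "pivot_function", 0x9C52A7F1: "deny(address)"}

if __name__ == "__main__":
    # An unpatched selector still resolves.
    print(dispatch_split(0x9C52A7F1, PIVOT, LOWER, UPPER))
    # Patched to a value that stays below the pivot: still reachable.
    print(dispatch_split(0x11111111, PIVOT, {0x11111111: "function_below_pivot"}, UPPER))
    # Patched to a value above the pivot: the lookup searches the wrong branch.
    print(dispatch_split(0xAABBCCDD, PIVOT, {0xAABBCCDD: "function_below_pivot"}, UPPER))
```

The last call returns None even though the patched value exists in the lower table, because the pivot comparison sends the lookup to the upper branch before any equality check runs.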
In our example, we need to pick a new selector that is either smaller or larger than the numerical midpoint selector, depending on whether the original selector was smaller or larger than the midpoint. For instance, if the midpoint is 0x78e111f6 and we want to replace 0x1cff79cd, we need to choose a hex number that is smaller than 0x78e111f6. If we want to replace 0x9c52a7f1, we need to choose a hex number that is larger than 0x78e111f6.
However, we can also change the midpoint selector itself. Keep in mind, though, that because the bytecode has already been compiled, changing the midpoint will not change the order of the functions or which functions sit above or below it. For example, if we change the midpoint to the largest possible value, 0xffffffff, we can choose any new selector for 0x1cff79cd, but there is no usable replacement for 0x9c52a7f1, because no 4-byte value is greater than 0xffffffff. Similarly, if we set the midpoint to 0x00000000, only functions that were originally bigger than the original midpoint would be changeable. It’s also worth noting that changing two or more selectors to the same value makes them collide, which can produce behavior that may not be desirable.
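These rules are easy to get wrong by hand, so here is a small Python sketch that checks a planned set of replacements against them. It assumes a single pivot comparison of the form pivot > selector, as in the example contract, and does not handle nested pivots. The function name check_replacements and its arguments are hypothetical.

```python
from typing import Dict, List


def check_replacements(pivot: int, selectors: List[int],
                       replacements: Dict[int, int]) -> List[str]:
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems: List[str] = []
    new_pivot = replacements.get(pivot, pivot)
    seen: Dict[int, int] = {}
    for old in selectors:
        new = replacements.get(old, old)
        if not 0 <= new <= 0xFFFFFFFF:
            problems.append(f"{new:#x} does not fit in 4 bytes")
        # The compiled branches are fixed: a function that was found below the
        # pivot must still be routed below the (possibly changed) pivot.
        if (old < pivot) != (new < new_pivot):
            problems.append(
                f"{old:#010x} -> {new:#010x} lands on the wrong side of {new_pivot:#010x}"
            )
        if new in seen:
            problems.append(f"{new:#010x} is used by both {seen[new]:#010x} and {old:#010x}")
        seen[new] = old
    return problems


PIVOT = 0x78E111F6
SELECTORS = [0x1CFF79CD, 0x78E111F6, 0x9C52A7F1]  # values from the post

if __name__ == "__main__":
    plans = [
        {0x1CFF79CD: 0x11111111},  # stays below the pivot
        {0x1CFF79CD: 0xAABBCCDD},  # jumps above the pivot
        {0x78E111F6: 0xFFFFFFFF},  # pivot moved to the maximum
        {0x78E111F6: 0x00000000},  # pivot moved to the minimum
    ]
    for plan in plans:
        print(plan, check_replacements(PIVOT, SELECTORS, plan) or "ok")
```

An unchanged selector is checked as well, because moving the pivot can strand a function that you never touched.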
Now that we have changed one or all of the function hashes in our smart contract, we can deploy it using a tool that allows us to send a transaction with the data field. Some popular options include ethers.js with Hardhat or 0xPhaze’s ABI Playground. Once the transaction is mined, you can use your contract with the updated function hashes.
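The edit that produces the data for that deployment transaction can be as simple as a length-preserving byte replacement. The Python sketch below swaps every PUSH4 occurrence of one selector for another and reports how many places it touched. It assumes the selector appears as a PUSH4 immediate; patch_selector and the FRAGMENT value are hypothetical names and data for illustration, not the real MEV Bot bytecode.

```python
from typing import Tuple

PUSH4 = 0x63


def strip_0x(value: str) -> str:
    """Drop an optional 0x prefix from a hex string."""
    return value[2:] if value.startswith("0x") else value


def patch_selector(bytecode_hex: str, old: str, new: str) -> Tuple[str, int]:
    """Replace every `PUSH4 old` with `PUSH4 new`; return (patched hex, count)."""
    code = bytes.fromhex(strip_0x(bytecode_hex))
    old_bytes = bytes.fromhex(strip_0x(old))
    new_bytes = bytes.fromhex(strip_0x(new))
    if len(old_bytes) != 4 or len(new_bytes) != 4:
        raise ValueError("selectors must be exactly 4 bytes")
    needle = bytes([PUSH4]) + old_bytes
    count = code.count(needle)
    # Same length in and out, so jump destinations in the code stay valid.
    patched = code.replace(needle, bytes([PUSH4]) + new_bytes)
    return "0x" + patched.hex(), count


# Hypothetical, hand-assembled dispatcher fragment (not real contract code).
FRAGMENT = (
    "0x"
    "806378e111f61161004057"
    "806378e111f61461006057"
    "80639c52a7f11461008057"
)

if __name__ == "__main__":
    patched, touched = patch_selector(FRAGMENT, "0x9c52a7f1", "0xa9059cbb")
    print(touched, patched)
```

Matching on the PUSH4 opcode plus the selector makes accidental hits unlikely but not impossible, so it is worth comparing the reported count with what the decompiler shows before deploying.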
You can call the functions in your contract by creating your own transaction data, or by using a developer-friendly tool such as Hardhat. Keep in mind that the functions will still perform the same tasks as before, but will now have different function selectors.
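Creating the transaction data by hand means concatenating the new selector with the ABI-encoded arguments. The Python sketch below covers only the two static types used in this post (address and uint256), each padded to 32 bytes. The helper names, the selector 0xdeadbeef, and the address are placeholders chosen for illustration.

```python
def strip_0x(value: str) -> str:
    """Drop an optional 0x prefix from a hex string."""
    return value[2:] if value.startswith("0x") else value


def encode_address(address: str) -> bytes:
    """Left-pad a 20-byte address to a 32-byte ABI word."""
    raw = bytes.fromhex(strip_0x(address))
    if len(raw) != 20:
        raise ValueError("an address must be 20 bytes")
    return raw.rjust(32, b"\x00")


def encode_uint256(value: int) -> bytes:
    """Encode an unsigned integer as a 32-byte big-endian ABI word."""
    return value.to_bytes(32, "big")


def build_calldata(selector: str, *encoded_args: bytes) -> str:
    """Concatenate a 4-byte selector with already-encoded static arguments."""
    selector_bytes = bytes.fromhex(strip_0x(selector))
    if len(selector_bytes) != 4:
        raise ValueError("a selector must be exactly 4 bytes")
    return "0x" + (selector_bytes + b"".join(encoded_args)).hex()


if __name__ == "__main__":
    # Suppose deny(address) was patched from 0x9c52a7f1 to the placeholder 0xdeadbeef.
    placeholder_address = "0x" + "11" * 20
    print(build_calldata("0xdeadbeef", encode_address(placeholder_address)))
```

Only the first four bytes differ from a call to the original contract; the argument encoding is untouched because the function body itself has not changed.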
Overall, changing the function hashes in a smart contract allows you to deploy an updated version of the contract, while still retaining its original functionality.
The main reason for changing a function selector in bytecode, which I detailed in the My Usecase section, is to obfuscate the functions and their purposes. This can help prevent others from easily identifying and interacting with your contract, and make it harder for them to attack it. For example, there are databases like the 4byte database that contain hashes of well-known functions from verified code and GitHub projects, which makes it easier for people to see what functions a contract uses and interact with them without having the source code. By changing the function selectors, you can make it more difficult for others to detect and access your contract while still using the same bytecode.
Note: An easier way to change function selectors if you have the source code is to change the function names before compilation.
One consequence of this knowledge is the potential for malicious use. Because we can change the function hash to any value we want, it is possible to change it to the hash of a common function found in databases. This could be a problem if you trust a contract solely based on the function name appearing in a service like Etherscan, MetaMask, or any other wallet provider. Etherscan will only look at the function hash in the transaction data field and assign the function name without verifying the contract. Additionally, the same trick can be carried out by creating a function with the same name and parameters in the contract’s source code. Anyone aware of this possibility should be able to avoid falling victim to it, but it is important to keep this potential vulnerability in mind for those who use and trust these tools.
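The labeling problem can be shown with a toy lookup in Python. This is a simplified model of a selector database, not how Etherscan or any wallet is actually implemented. The table contains only the three selectors mentioned in this post, and label_call is a hypothetical name.

```python
# A tiny stand-in for a public selector database.
KNOWN_SIGNATURES = {
    "a9059cbb": "transfer(address,uint256)",
    "2e1a7d4d": "withdraw(uint256)",
    "9c52a7f1": "deny(address)",
}


def label_call(calldata_hex: str) -> str:
    """Guess a function name from the first 4 bytes of the transaction data."""
    data = calldata_hex[2:] if calldata_hex.startswith("0x") else calldata_hex
    return KNOWN_SIGNATURES.get(data[:8].lower(), "unknown")


if __name__ == "__main__":
    argument = "00" * 12 + "11" * 20  # placeholder address argument
    # A call to the original deny(address) selector.
    print(label_call("0x9c52a7f1" + argument))
    # The same function after its selector was patched to a well-known value:
    # the label changes even though the code behind it is still deny(address).
    print(label_call("0xa9059cbb" + argument))
```

The lookup never inspects the contract, which is why a name shown next to a transaction says nothing about what the called code does.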
I hope you enjoyed this novel approach to modifying post-compiled EVM bytecode function selectors. I couldn’t find any other information on this topic online, so I decided to share what I’ve learned. If anyone else has discovered this, or if you have any questions or feedback, feel free to reach out to me. Thanks for reading!