Compiling and Smart Contracts: ABI Explained

Most smart contracts are developed in a high-level programming language. The most popular currently is Solidity, with Vyper hoping to take the throne in the near future.

However, the part of Ethereum that actually executes contracts can’t understand these high-level languages. It works with a much lower-level representation instead.

The Ethereum Virtual Machine (EVM)

Ethereum smart contracts are sets of programming instructions being run on all the nodes running a full Ethereum client. The part of Ethereum that runs the smart contract instructions is called the EVM. It’s a virtual machine not unlike Java’s JVM. The EVM reads a low-level representation of smart contracts called the Ethereum bytecode.

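The Ethereum bytecode is a sequence of opcodes, each encoded as a single byte (some, such as the PUSH family, are followed by immediate data). Each opcode performs one low-level operation: pushing a value onto the stack, reading or writing memory or storage, calling another contract and so on. For readability, bytecode is usually shown in an assembly-like form where each byte is replaced by its mnemonic.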
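To make the idea of "one opcode, one action" concrete, here is a toy interpreter for three opcodes. It is only a sketch: the run function and its simplified memory model (a map from offset to word) are made up for illustration, and a real EVM also tracks gas, storage, byte-addressed memory and much more. The opcode values themselves (0x01 ADD, 0x52 MSTORE, 0x60 PUSH1) are the real ones.

```javascript
// Toy interpreter for a handful of EVM opcodes (illustrative only).
const WORD = 1n << 256n; // EVM words are 256 bits wide

function run(bytecode) {
  const stack = [];
  const memory = new Map(); // offset -> word; real EVM memory is a byte array
  let pc = 0;
  while (pc < bytecode.length) {
    const op = bytecode[pc++];
    switch (op) {
      case 0x00: // STOP: halt execution
        return { stack, memory };
      case 0x01: // ADD: pop two words, push their sum modulo 2^256
        stack.push((stack.pop() + stack.pop()) % WORD);
        break;
      case 0x52: { // MSTORE: pop an offset, then a value, and store the value
        const offset = stack.pop();
        const value = stack.pop();
        memory.set(offset, value);
        break;
      }
      case 0x60: // PUSH1: push the byte that follows the opcode
        stack.push(BigInt(bytecode[pc++]));
        break;
      default:
        throw new Error('unsupported opcode 0x' + op.toString(16));
    }
  }
  return { stack, memory };
}

// PUSH1 0x80, PUSH1 0x40, MSTORE
const result = run(Uint8Array.from([0x60, 0x80, 0x60, 0x40, 0x52]));
console.log(result.memory.get(0x40n)); // 128n
```

The three instructions in the usage line are the same ones that open the compiled Greeter listing further down: they store the value 0x80 at memory offset 0x40.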

The question is, how do we go from this:

pragma solidity 0.4.24;

contract Greeter {

    function greet() public constant returns (string) {
        return "Hello";
    }

}

to this:

PUSH1 0x80 PUSH1 0x40 MSTORE PUSH1 0x4 CALLDATASIZE LT PUSH2 0x41 JUMPI PUSH1 0x0 CALLDATALOAD PUSH29 0x100000000000000000000000000000000000000000000000000000000 SWAP1 DIV PUSH4 0xFFFFFFFF AND DUP1 PUSH4 0xCFAE3217 EQ PUSH2 0x46 JUMPI JUMPDEST PUSH1 0x0 DUP1 REVERT JUMPDEST CALLVALUE DUP1 ISZERO PUSH2 0x52 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH2 0x5B PUSH2 0xD6 JUMP JUMPDEST PUSH1 0x40 MLOAD DUP1 DUP1 PUSH1 0x20 ADD DUP3 DUP2 SUB DUP3 MSTORE DUP4 DUP2 DUP2 MLOAD DUP2 MSTORE PUSH1 0x20 ADD SWAP2 POP DUP1 MLOAD SWAP1 PUSH1 0x20 ADD SWAP1 DUP1 DUP4 DUP4 PUSH1 0x0 JUMPDEST DUP4 DUP2 LT ISZERO PUSH2 0x9B JUMPI DUP1 DUP3 ADD MLOAD DUP2 DUP5 ADD MSTORE PUSH1 0x20 DUP2 ADD SWAP1 POP PUSH2 0x80 JUMP JUMPDEST POP POP POP POP SWAP1 POP SWAP1 DUP2 ADD SWAP1 PUSH1 0x1F AND DUP1 ISZERO PUSH2 0xC8 JUMPI DUP1 DUP3 SUB DUP1 MLOAD PUSH1 0x1 DUP4 PUSH1 0x20 SUB PUSH2 0x100 EXP SUB NOT AND DUP2 MSTORE PUSH1 0x20 ADD SWAP2 POP JUMPDEST POP SWAP3 POP POP POP PUSH1 0x40 MLOAD DUP1 SWAP2 SUB SWAP1 RETURN JUMPDEST PUSH1 0x60 PUSH1 0x40 DUP1 MLOAD SWAP1 DUP2 ADD PUSH1 0x40 MSTORE DUP1 PUSH1 0x5 DUP2 MSTORE PUSH1 0x20 ADD PUSH32 0x48656C6C6F000000000000000000000000000000000000000000000000000000 DUP2 MSTORE POP SWAP1 POP SWAP1 JUMP STOP LOG1 PUSH6 0x627A7A723058 KECCAK256 SLT 0xec 0xe 0xf5 0xf8 SLT 0xc7 0x2d STATICCALL ADDRESS SHR 0xdb COINBASE 0xb1 BALANCE 0xe8 0xf8 DUP14 0xda 0xad DUP13 LOG1 0x4c 0xb4 0x26 0xc2 DELEGATECALL PUSH7 0x8994D3E002900

Solidity Compiler

For now, we’ll be focusing on the Solidity compiler, but the same principles apply for Vyper or any other high-level language for the EVM.

First things first: install Node.js.

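After you’ve done this, go to your terminal and run this:

npm install -g solc@0.4.24

This will install solcjs, the JavaScript build of the Solidity compiler. The version is pinned because the contract below targets the 0.4.x series, and newer compiler releases reject that pragma. Now make an empty directory. In that directory create a file called SimpleToken.sol and put the following code: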

pragma solidity ^0.4.24;

contract SimpleToken {

    mapping(address => uint) private _balances;

    constructor() public {
        _balances[msg.sender] = 1000000;
    }

    function getBalance(address account) public constant returns (uint) {
        return _balances[account];
    }

    function transfer(address to, uint amount) public {
        require(_balances[msg.sender] >= amount);

        _balances[msg.sender] -= amount;
        _balances[to] += amount;
    }
}

This is about the simplest token contract possible, but it has several features that will be useful for this tutorial. They are:

  • public functions (getBalance and transfer)
  • a private state variable (_balances)
  • a constructor

After you’ve done this, run the newly installed compiler on your file. You do this by running the following:

solcjs SimpleToken.sol

You should get an output similar to this:

Invalid option selected, must specify either --bin or --abi

And your compilation should fail.

What just happened? What is bin and what is abi?

bin is the compiled bytecode itself, written out as a hexadecimal string. The opcodes no longer appear under names such as PUSH1, MSTORE or DELEGATECALL, but as their numeric byte values, which look like random digits when opened in a text editor.
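Going from the hex string back to mnemonics is a mechanical table lookup. The sketch below is a minimal disassembler with a small, hand-picked subset of the opcode table; the disassemble function and the OPCODES object are names invented for this example, and the input hex was reconstructed from the first mnemonics of the Greeter listing above.

```javascript
// Minimal disassembler covering only a few opcodes (not the full EVM table).
const OPCODES = {
  0x00: 'STOP', 0x01: 'ADD', 0x04: 'DIV', 0x10: 'LT', 0x14: 'EQ', 0x16: 'AND',
  0x35: 'CALLDATALOAD', 0x36: 'CALLDATASIZE', 0x52: 'MSTORE', 0x56: 'JUMP',
  0x57: 'JUMPI', 0x5b: 'JUMPDEST', 0x80: 'DUP1', 0x90: 'SWAP1',
  0xf3: 'RETURN', 0xfd: 'REVERT',
};

function disassemble(hex) {
  const code = Buffer.from(hex, 'hex');
  const out = [];
  for (let pc = 0; pc < code.length; pc++) {
    const op = code[pc];
    if (op >= 0x60 && op <= 0x7f) {
      // PUSH1..PUSH32 carry 1..32 bytes of immediate data after the opcode
      const size = op - 0x5f;
      const data = code.subarray(pc + 1, pc + 1 + size).toString('hex');
      out.push(`PUSH${size} 0x${data}`);
      pc += size; // skip the immediate data
    } else {
      out.push(OPCODES[op] || `UNKNOWN(0x${op.toString(16)})`);
    }
  }
  return out;
}

console.log(disassemble('60806040526004361061004157').join(' '));
// PUSH1 0x80 PUSH1 0x40 MSTORE PUSH1 0x04 CALLDATASIZE LT PUSH2 0x0041 JUMPI
```

The only subtlety is the PUSH family: the bytes after a PUSH are data, not opcodes, so the loop has to skip over them.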

ABI — Application Binary Interface

Once our bin output is deployed to the blockchain, the contract gets its own address and its bytecode is stored on chain under that address. But a large problem remains: how do we interpret the code?

There’s no way of knowing, from the bytecode alone, that the contract has the functions transfer(address,uint256) and getBalance(address). It’s even less clear whether these functions are public or private, or whether they modify state. The contract is deployed without context.

Calling such a contract would be next to impossible. We don’t know where each function is in the bytecode, which parameters it takes, or whether we’ll be allowed to call it at all. This is where the ABI comes into play.
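To see how little the bytecode gives away, we can scan it for the values it compares incoming calls against. The sketch below walks the opening bytes of the SimpleToken runtime code (the same bytes appear inside the deployment data in the web3.js example later in this article) and collects every 4-byte constant that is immediately followed by an EQ opcode. The findSelectors helper is hypothetical, and the "PUSH4 then EQ" pattern is an assumption about how this particular compiler version lays out its code, not a rule of the EVM.

```javascript
// Opening bytes of the SimpleToken runtime bytecode.
const dispatcher =
  '60806040526004361061004c57600035' +
  '7c01' + '00'.repeat(28) +   // PUSH29 2^224, used to isolate the top 4 bytes of the call data
  '900463ffffffff1680' +
  '63a9059cbb1461005157' +     // PUSH4 a9059cbb, EQ, PUSH2 0x0051, JUMPI
  '8063f8b2cb4f1461009e57' +   // DUP1, PUSH4 f8b2cb4f, EQ, PUSH2 0x009e, JUMPI
  '5b600080fd';                // JUMPDEST, PUSH1 0, DUP1, REVERT

function findSelectors(hex) {
  const code = Buffer.from(hex, 'hex');
  const selectors = [];
  for (let pc = 0; pc < code.length; pc++) {
    const op = code[pc];
    if (op < 0x60 || op > 0x7f) continue; // not a PUSH, so no immediate data to skip
    const size = op - 0x5f;
    // PUSH4 <value> directly followed by EQ (0x14) looks like a selector comparison
    if (size === 4 && code[pc + 5] === 0x14) {
      selectors.push(code.subarray(pc + 1, pc + 5).toString('hex'));
    }
    pc += size;
  }
  return selectors;
}

console.log(findSelectors(dispatcher)); // [ 'a9059cbb', 'f8b2cb4f' ]
```

We learn that the contract reacts to two 4-byte values, and nothing more: no function names, no parameter types, no hint of what comes back.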

The ABI is a JSON description of the deployed contract and its functions (solcjs writes it to a .abi file). It allows us to contextualize the contract and call its functions.

Let’s try running our solcjs once again. Run the following commands:

solcjs SimpleToken.sol --abi
solcjs SimpleToken.sol --bin

Your directory should now have a structure like this:

.
├── SimpleToken.sol
├── SimpleToken_sol_SimpleToken.abi
└── SimpleToken_sol_SimpleToken.bin

The SimpleToken_sol_SimpleToken.abi file should look like this:

[{
    "constant": false,
    "inputs": [{
        "name": "to",
        "type": "address"
    }, {
        "name": "amount",
        "type": "uint256"
    }],
    "name": "transfer",
    "outputs": [],
    "payable": false,
    "stateMutability": "nonpayable",
    "type": "function"
}, {
    "constant": true,
    "inputs": [{
        "name": "account",
        "type": "address"
    }],
    "name": "getBalance",
    "outputs": [{
        "name": "",
        "type": "uint256"
    }],
    "payable": false,
    "stateMutability": "view",
    "type": "function"
}, {
    "inputs": [],
    "payable": false,
    "stateMutability": "nonpayable",
    "type": "constructor"
}]

We can see that the file describes the functions of the contract. For each function it defines:

  • the name: the name of the function
  • the inputs: the names and types of the parameters
  • the outputs: the types of the return value(s)
  • the payability: whether you can send ether to it
  • the state mutability: whether the function is read-only or can modify state.
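Because the ABI is plain JSON, reading it programmatically takes only a few lines. The sketch below embeds the ABI shown above and prints a one-line summary per function; the describe helper and its output format are made up for this example.

```javascript
// The ABI produced by solcjs for SimpleToken, embedded as a literal.
const abi = [
  { constant: false, inputs: [{ name: 'to', type: 'address' }, { name: 'amount', type: 'uint256' }],
    name: 'transfer', outputs: [], payable: false, stateMutability: 'nonpayable', type: 'function' },
  { constant: true, inputs: [{ name: 'account', type: 'address' }],
    name: 'getBalance', outputs: [{ name: '', type: 'uint256' }],
    payable: false, stateMutability: 'view', type: 'function' },
  { inputs: [], payable: false, stateMutability: 'nonpayable', type: 'constructor' },
];

function describe(entries) {
  return entries
    .filter((entry) => entry.type === 'function') // skip the constructor
    .map((entry) => {
      // Canonical signature: the name plus the comma-separated input types
      const signature = `${entry.name}(${entry.inputs.map((i) => i.type).join(',')})`;
      const returns = entry.outputs.map((o) => o.type).join(',') || 'nothing';
      return `${signature} -> ${returns} [${entry.stateMutability}]`;
    });
}

console.log(describe(abi));
// [ 'transfer(address,uint256) -> nothing [nonpayable]',
//   'getBalance(address) -> uint256 [view]' ]
```

Note that the private _balances mapping doesn't show up anywhere: the ABI only describes what can be called from outside the contract.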

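This is all reasonably easy to understand from reading it. But the ABI also determines how a function is called. A function has no address of its own inside the contract. Instead, the caller sends data to the contract’s address, and the first four bytes of that data say which function is meant.

Knowing the name of the function is not enough; we also need to know these four bytes, which are called the function selector.

The selector is computed deterministically from the function’s signature, that is, its name and the types of its inputs (for example transfer(address,uint256)): it is the first four bytes of the Keccak-256 hash of that string. Payability, outputs and state mutability play no part in it. The encoded arguments are then appended after the selector. The details are described in the Contract ABI Specification in the Solidity documentation.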
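Concretely, the deterministic algorithm is this: take the function signature (the name plus the input types, such as transfer(address,uint256)), hash it with Keccak-256 and keep the first four bytes. Node.js doesn't ship Keccak-256 (its sha3-256 uses different padding), so the sketch below includes a compact, unoptimized implementation modelled on the reference algorithm. The function names are our own, and in practice you would take the hash from a library instead.

```javascript
// Keccak-256 sketch using BigInt lanes; written for clarity, not speed.
const MASK64 = (1n << 64n) - 1n;
const rol64 = (a, n) => ((a << (n % 64n)) | (a >> (64n - (n % 64n)))) & MASK64;

// The Keccak-f[1600] permutation over a 5x5 grid of 64-bit lanes (lanes[x][y]).
function keccakF1600(lanes) {
  let R = 1; // LFSR state that generates the round constants
  for (let round = 0; round < 24; round++) {
    // theta: mix column parities into the neighbouring columns
    const C = lanes.map((col) => col[0] ^ col[1] ^ col[2] ^ col[3] ^ col[4]);
    for (let x = 0; x < 5; x++) {
      const D = C[(x + 4) % 5] ^ rol64(C[(x + 1) % 5], 1n);
      for (let y = 0; y < 5; y++) lanes[x][y] ^= D;
    }
    // rho and pi: rotate each lane and move it to a new position
    let px = 1, py = 0, current = lanes[1][0];
    for (let t = 0; t < 24; t++) {
      [px, py] = [py, (2 * px + 3 * py) % 5];
      const next = lanes[px][py];
      lanes[px][py] = rol64(current, BigInt(((t + 1) * (t + 2)) / 2));
      current = next;
    }
    // chi: non-linear step applied along each row
    for (let row = 0; row < 5; row++) {
      const T = lanes.map((col) => col[row]);
      for (let i = 0; i < 5; i++) {
        lanes[i][row] = T[i] ^ (~T[(i + 1) % 5] & T[(i + 2) % 5]);
      }
    }
    // iota: XOR the round constant into lane (0, 0)
    for (let j = 0; j < 7; j++) {
      R = ((R << 1) ^ ((R >> 7) * 0x71)) % 256;
      if (R & 2) lanes[0][0] ^= 1n << ((1n << BigInt(j)) - 1n);
    }
  }
}

function keccak256(input) {
  const RATE = 136; // bytes absorbed per block for a 256-bit digest
  const padded = new Uint8Array(Math.ceil((input.length + 1) / RATE) * RATE);
  padded.set(input);
  padded[input.length] ^= 0x01; // original Keccak padding (SHA-3 uses 0x06 here)
  padded[padded.length - 1] ^= 0x80;

  const lanes = Array.from({ length: 5 }, () => Array(5).fill(0n));
  for (let off = 0; off < padded.length; off += RATE) {
    for (let i = 0; i < RATE / 8; i++) { // absorb 17 little-endian lanes per block
      let lane = 0n;
      for (let b = 7; b >= 0; b--) lane = (lane << 8n) | BigInt(padded[off + 8 * i + b]);
      lanes[i % 5][Math.floor(i / 5)] ^= lane;
    }
    keccakF1600(lanes);
  }
  let hex = ''; // squeeze: lanes (0,0)..(3,0), little-endian, form the 32-byte digest
  for (let i = 0; i < 4; i++) {
    let lane = lanes[i][0];
    for (let b = 0; b < 8; b++) {
      hex += Number(lane & 0xffn).toString(16).padStart(2, '0');
      lane >>= 8n;
    }
  }
  return hex;
}

// A selector is the first 4 bytes (8 hex characters) of the signature's hash.
const selector = (signature) => keccak256(Buffer.from(signature, 'utf8')).slice(0, 8);

console.log(selector('transfer(address,uint256)')); // a9059cbb
console.log(selector('getBalance(address)'));       // f8b2cb4f
```

Both values can be spotted in the compiled SimpleToken bytecode (look for 63a9059cbb and 63f8b2cb4f), which is how the contract recognises which function a call is meant for.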

Example

The ABI is the description of the contract interface. It contains no code and cannot be run by itself. The bytecode is the executable EVM code, but by itself it is without context.

To deploy a contract we need its bytecode; to call functions on a deployed contract we need its ABI and its address. Luckily for us, this is all abstracted away when we interact with smart contracts through one of the available libraries.

An example using the web3.js library (its older 0.x API) would look like this:

var simpletokenContract = web3.eth.contract([{"constant":false,"inputs":[{"name":"to","type":"address"},{"name":"amount","type":"uint256"}],"name":"transfer","outputs":[],"payable":false,"stateMutability":"nonpayable","type":"function"},{"constant":true,"inputs":[{"name":"account","type":"address"}],"name":"getBalance","outputs":[{"name":"","type":"uint256"}],"payable":false,"stateMutability":"view","type":"function"},{"inputs":[],"payable":false,"stateMutability":"nonpayable","type":"constructor"}]);
var simpletoken = simpletokenContract.new(
   {
     from: web3.eth.accounts[0],
     data: "0x608060405234801561001057600080fd5b50620f42406000803373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200190815260200160002081905550610252806100666000396000f30060806040526004361061004c576000357c0100000000000000000000000000000000000000000000000000000000900463ffffffff168063a9059cbb14610051578063f8b2cb4f1461009e575b600080fd5b34801561005d57600080fd5b5061009c600480360381019080803573ffffffffffffffffffffffffffffffffffffffff169060200190929190803590602001909291905050506100f5565b005b3480156100aa57600080fd5b506100df600480360381019080803573ffffffffffffffffffffffffffffffffffffffff1690602001909291905050506101de565b6040518082815260200191505060405180910390f35b806000803373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600020541015151561014257600080fd5b806000803373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200190815260200160002060008282540392505081905550806000808473ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600020600082825401925050819055505050565b60008060008373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff1681526020019081526020016000205490509190505600a165627a7a72305820c9da07d4976adbf00a4b5fe4e23330dbaf3cdcbfd4745eed78c702bf27d944060029",
     gas: '4700000'
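   }, function (e, contract){
    // On failure, contract is undefined, so check the error before touching it
    if (e) {
        console.log(e);
        return;
    }
    // web3.js 0.x calls this twice: first with only the transaction hash,
    // then again once the contract is mined and has an address
    if (typeof contract.address !== 'undefined') {
         console.log('Contract mined! address: ' + contract.address + ' transactionHash: ' + contract.transactionHash);
    }
 });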

First we defined simpletokenContract, which is a description of how the contract looks from the outside. We did this by passing it the ABI of SimpleToken.sol.

Then we created an instance of the contract simpletoken by calling the simpletokenContract.new(...) and passing into it the data of the contract (the executable code).

web3.js combined the two in the background and now has all the required information to call functions on our smart contract.
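What the library does in the background when we call a function is, roughly, build the call data from the ABI. Here is a hedged sketch of that step for transfer. The helper names and the placeholder recipient address are invented for this example, and the two selectors are copied from the bytecode above (63a9059cbb and 63f8b2cb4f) rather than computed.

```javascript
// 4-byte selectors, as they appear in the compiled SimpleToken bytecode.
const SELECTORS = { transfer: 'a9059cbb', getBalance: 'f8b2cb4f' };

// Every static argument is left-padded to a 32-byte (64 hex character) word.
const pad32 = (hex) => hex.padStart(64, '0');

function encodeAddress(address) {
  const hex = address.toLowerCase().replace(/^0x/, '');
  if (!/^[0-9a-f]{40}$/.test(hex)) throw new Error('expected a 20-byte hex address');
  return pad32(hex);
}

function encodeUint256(value) {
  const n = BigInt(value);
  if (n < 0n || n >= 1n << 256n) throw new Error('value does not fit in uint256');
  return pad32(n.toString(16));
}

// Call data = selector, followed by the encoded arguments in declaration order.
function encodeTransfer(to, amount) {
  return '0x' + SELECTORS.transfer + encodeAddress(to) + encodeUint256(amount);
}

const RECIPIENT = '0x' + '11'.repeat(20); // placeholder address, not a real account
const data = encodeTransfer(RECIPIENT, 1000);
console.log(data);
console.log(data.length); // 138
```

The resulting string is what ends up in the data field of the transaction sent to the contract’s address: 4 bytes of selector plus two 32-byte arguments.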

Conclusion

In this short overview of smart contract compilation, we explained the ABI and how smart contracts deployed on the Ethereum blockchain get invoked. While you’ll rarely have to work with it directly, it’s worth being aware of, as too much abstraction can lead to bugs.
