Introduction
The inimitable Chris Kim recently tweeted about data types in smart contracts, and it got me thinking about friction points for onboarding new developers.
Like Chris, I don't come from a computer science background.
I taught myself programming initially to automate parts of my job that were dull and repetitive.
I knew there had to be a way to get the computer to complete the tasks for me.
The first program I wrote was only five lines of Python, but it saved me countless hours of effort.
And from there I was hooked.
Python as a First Language
Python is a high-level, dynamically-typed language.
It's often taught in schools and universities because it has simple syntax and an incredible ecosystem of open-source libraries.
But there is a trade-off between ease of learning and depth of learning.
And I see many parallels for this in other areas of life.
Thoughts On Difficulty
If you learn to drive a manual car first, then driving an automatic is just easier than what you already know.
If you learn your first chords on an acoustic guitar, you'll find they're just easier to play on an electric.
Knowing that the inverse doesn't hold, you might be tempted to conclude that it's always better to learn the harder way first.
But most people give up in the first stage of learning.
The stage in which it dawns on you that your fingers might sooner drop off your hands than deliver you a sound worth hearing.
Time to Satisfaction
I've come to think of this as 'time to satisfaction'.
The amount of time it takes, when learning something new, to get your first dopamine hit.
The longer that takes, the more likely you are to quit.
Python gives you that satisfaction pretty quickly, and for that reason alone, I think it's a wonderful first language for most people.
Dynamic Typing
A data type is a set of values and a set of operations defined on those values.
Source: Princeton
Honestly, this is something I gave very little thought to when I started programming.
In Python, types belong to values rather than to variables, so the type of a variable is simply the type of whatever value it holds at runtime.
For example, x can be an integer initially, and then a string:
>>> x = 5
>>> print(type(x).__name__, f"{x = }")
int x = 5
>>> x = "hello"
>>> print(type(x).__name__, f"{x = }")
str x = 'hello'
This is called dynamic typing, and it's all fine and dandy until you encounter the dreaded TypeError:
>>> x ** 2
TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'
Languages like Haskell and Rust prevent these errors through static typing.
Static Type Checking
PEP 484 standardised type hints (sometimes called type annotations) for Python, and PEP 526 later added the syntax for annotating variables.
Type hints are optional, and the Python interpreter does not enforce them.
For example, if I declare x to be an integer, but assign it a string:
>>> x: int = "hello"
>>> print(type(x).__name__, f"{x = }")
str x = 'hello'
The interpreter still treats it as a string.
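The same thing happens with functions. Here is a minimal sketch (the function name double is my own, purely for illustration) showing that the hints are recorded on the function, but nothing checks them when it is called:

```python
from typing import get_type_hints


def double(n: int) -> int:
    """Hypothetical example function, annotated to take and return an int."""
    return n * 2


# The hints are stored on the function and can be inspected...
hints = get_type_hints(double)

# ...but the interpreter does not enforce them, so a str slips through
# and the * operator repeats the string instead of doubling a number.
result = double("ab")

print(hints)
print(result)
```

No error is raised at runtime. A static type checker is what would flag the call to double("ab").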
So why is this even useful?
Type hints enable the use of static type checkers like mypy, which can analyse your code and help catch bugs.
For example, if I run mypy on this file:
x: int = "hello"
It tells me:
Incompatible types in assignment (expression has type "str", variable has type "int")
And it can also infer types a lot of the time, without needing annotations:
x = "hello"
print(x ** 2)
Unsupported operand types for ** ("str" and "int") [operator]
In some ways, this is the best of both worlds.
You can use a static type checker if you want to, but it's not mandatory.
Many Python projects run mypy or similar tools in their CI/CD pipelines, to stop these kinds of bugs from reaching production.
So It's Like TypeScript...?
Not exactly.
The key difference between the two is that type annotations are valid Python syntax, which the interpreter accepts without enforcing.
So you can write them in your regular Python files and leave them there.
This enables a really powerful use case: runtime type checking based on type hints.
Type annotations are not valid in JavaScript, so there is no equivalent.
TypeScript parses .ts files, validates the code, and compiles it to JavaScript (without type annotations).
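To make the runtime type checking idea concrete, here is a rough sketch of a decorator that reads a function's type hints and checks the arguments on every call. The names enforce_types and square are my own, and this only handles plain classes like int and str. Treat it as an illustration of the idea, not as how any particular library does it.

```python
import functools
import inspect
from typing import Any, Callable, get_origin, get_type_hints


def enforce_types(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical decorator: check arguments against simple type hints."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes (int, str, ...) are checked in this sketch;
            # parameterised hints and unannotated arguments are left alone.
            is_plain_class = isinstance(expected, type) and get_origin(expected) is None
            if is_plain_class and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_types
def square(n: int) -> int:
    return n ** 2


print(square(4))  # 16

try:
    square("hello")
except TypeError as error:
    print(error)
```

With the decorator applied, the bad call fails at the function boundary with a clear message, rather than somewhere deeper in the code.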
What About Algorand Python?
Algorand Python compiles a statically-typed subset of Python to bytecode that runs on the Algorand Virtual Machine.
Static typing is not optional here - it's a fundamental part of the framework.
The compiler uses mypy to parse .py files into an abstract syntax tree, and eventually transforms it into bytecode.
This step is really important to understand.
Algorand Python code cannot currently be executed by the Python interpreter.
The constructs you interact with as a developer are interfaces, not objects.
These interfaces are defined using mypy stub files, which provide information to your IDE and type checker about classes, functions, variables, and types.
Let's walk through an example from the Algorand Python repository.
UInt64
UInt64 is one of the most important types you'll use in Algorand Python.
It represents one of the two AVM types: uint64 and bytes.
The stub file shows:
class UInt64:
    """A 64-bit unsigned integer, one of the primary data types on the AVM"""

    def __init__(self, value: int = 0, /) -> None:
        """A UInt64 can be initialized with a Python int literal, or an int variable
        declared at the module level"""
        ...
Remember, this file isn't executed by the Python interpreter.
All it's doing is providing information about the interface.
When we're writing a contract, the IDE should pop up with helpful hints based on this information.
If you want to know how UInt64 behaves, you can see its definition in the stub file.
For example, I might want to know what happens if I write:
x = UInt64(1)
if x:
    ...
In Python, the truth value of an object is defined in its __bool__ dunder method.
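If you haven't come across __bool__ before, here is a minimal sketch in ordinary Python. The Balance class is hypothetical and has nothing to do with Algorand Python. It just shows how an object decides its own truth value:

```python
class Balance:
    """Hypothetical plain-Python class, used only to illustrate __bool__."""

    def __init__(self, amount: int = 0) -> None:
        self.amount = amount

    def __bool__(self) -> bool:
        # Python calls this whenever the object is used in a boolean
        # context, such as `if balance:` or `bool(balance)`.
        return self.amount != 0


labels = ["truthy" if Balance(amount) else "falsy" for amount in (0, 1, 500)]
print(labels)  # ['falsy', 'truthy', 'truthy']
```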
The stub file tells me:
def __bool__(self) -> bool:
    """A UInt64 will evaluate to `False` if zero, and `True` otherwise"""
Bits, Bats, and Binary Cats
It's quite rare that you need to think about bits and bytes in Python.
The nasty ones and zeroes are abstracted away.
But those abstractions come at a cost, and most of them are far too computationally expensive to use inside a smart contract.
The AVM only has two fundamental types: uint64 and bytes.
Everything else is an alias or a wrapper around one of those two types.
If you want to develop smart contracts, you will need to understand the basics of binary, encoding, and decoding.
This RealPython article is a great starting point: Python Bitwise Operators.
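As a small taste, here is a sketch in ordinary Python (not Algorand Python) that encodes an integer as 8 big-endian bytes, the fixed width of a uint64, and then packs two smaller numbers into one 64-bit value with bitwise operators. The variable names are my own.

```python
value = 1_000_000

# Encode the integer as 8 bytes, most significant byte first (big-endian).
encoded = value.to_bytes(8, byteorder="big")
print(encoded.hex())  # 00000000000f4240

# Decode the bytes back into an integer.
decoded = int.from_bytes(encoded, byteorder="big")
print(decoded)  # 1000000

# Pack two 32-bit numbers into a single 64-bit number...
high, low = 7, 42
packed = (high << 32) | low

# ...and unpack them again with a shift and a mask.
unpacked_high = packed >> 32
unpacked_low = packed & 0xFFFF_FFFF
print(unpacked_high, unpacked_low)  # 7 42
```

Packing several small values into one integer like this is a common trick when every byte of storage has a cost.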
ARC-4
Algorand's ARC-4 defines a standard way to encode and decode ABI data types.
This makes it much easier to invoke smart contract methods, either from off-chain code or using cross-contract calls.
But it also gives us another set of data structures to work with inside a smart contract.
For example, the Struct type gives us a way to access items by name, rather than index:
from algopy import arc4

class Member(arc4.Struct, kw_only=True):
    name: arc4.String
    age: arc4.UInt64
Working with ARC-4 types is always less efficient than using the native uint64 or bytes types, but these structures can be incredibly useful.
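To get a feel for where that overhead comes from, here is a rough sketch in ordinary Python of how I understand the ARC-4 spec to encode a struct like Member. The helper functions are my own and are not part of algopy. A struct is encoded like a tuple: the static uint64 sits directly in the head, while the dynamic string is replaced in the head by a 2-byte offset and stored in the tail with a 2-byte length prefix.

```python
def encode_uint64(value: int) -> bytes:
    """Encode as 8 big-endian bytes."""
    return value.to_bytes(8, "big")


def encode_string(text: str) -> bytes:
    """Encode as a 2-byte length prefix followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    return len(data).to_bytes(2, "big") + data


def encode_member(name: str, age: int) -> bytes:
    """Hypothetical helper: encode (name, age) as a tuple with head and tail."""
    head_length = 2 + 8  # 2-byte offset for name + 8-byte age
    head = head_length.to_bytes(2, "big") + encode_uint64(age)
    tail = encode_string(name)
    return head + tail


def decode_member(encoded: bytes) -> tuple[str, int]:
    """Hypothetical helper: reverse encode_member."""
    offset = int.from_bytes(encoded[0:2], "big")
    age = int.from_bytes(encoded[2:10], "big")
    length = int.from_bytes(encoded[offset:offset + 2], "big")
    name = encoded[offset + 2:offset + 2 + length].decode("utf-8")
    return name, age


encoded_member = encode_member("Ada", 36)
print(encoded_member.hex())  # 000a00000000000000240003416461
print(decode_member(encoded_member))  # ('Ada', 36)
```

Reading the name back means following an offset and a length prefix, which is more work than reading a bare uint64.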
Algorand Python also provides a neat way to access the native equivalent of an ARC-4 type (where possible), using the native property:
x = arc4.UInt64(1)
x.native
Tips for Beginners
If you're just starting out with Python and are interested in using it for smart contract development, I would recommend a few things:
- Get familiar with type annotations and start using a static type checker in all your projects (mypy is the most popular). It's incredibly useful, and you'll need to understand the basics before developing with Algorand Python.
- Learn the basics of binary, encoding, and decoding. It's critical for smart contract development, and understanding blockchains more broadly.
- Try writing automated tests with pytest. These will help you verify that your smart contract is working as expected.
- Don't be afraid to learn another language! It's one of the best ways to improve your skills. Stricter languages like Haskell are difficult to learn, but they teach you important lessons about how to write reliable code, and how to use a type system to your advantage.
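If you've never written a test before, here is a minimal sketch of what pytest-style tests look like. The fits_uint64 helper and the file name test_uint64.py are hypothetical. pytest collects any function whose name starts with test_, and a plain assert is all you need inside it.

```python
# Imagine this saved as test_uint64.py (a hypothetical file name).

UINT64_MAX = 2**64 - 1


def fits_uint64(value: int) -> bool:
    """Hypothetical helper: can this int be stored as a uint64?"""
    return 0 <= value <= UINT64_MAX


def test_accepts_values_at_the_bounds() -> None:
    assert fits_uint64(0)
    assert fits_uint64(UINT64_MAX)


def test_rejects_values_out_of_range() -> None:
    assert not fits_uint64(-1)
    assert not fits_uint64(UINT64_MAX + 1)
```

Running pytest in the same directory would discover and run both tests.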