Helpful Bash Scripts For Working with Byte Arrays and Hex Representations (in Bitcoin)

I've been working heavily with raw Bitcoin transactions lately, mainly so I can understand what's going on at the protocol level, where a lot of the fun innovation is happening. In particular, I'm building a Pay To Script Hash (P2SH) M-of-N Multisig transaction implementation in Go, as a stepping stone to building a micropayments channel implementation in Go, and it's been a lot of fun.

When working on Bitcoin at the raw transaction level, you spend a lot of time working with byte arrays in binary, decimal or hex representation. Take a quick glance at Bitcoin's scripting language, Script, or an example raw transaction to see what I mean.

Working with byte arrays is not so bad after a while, but there are a few headaches. One productivity suck is constantly doing binary <-> decimal <-> hex conversions during debugging as you compare test transactions to protocol specifications. Another is doing byte counts as you push things onto the Bitcoin Script stack, since even one byte out of place can cause a transaction to be invalid or undecodable. To make my life easier, I ended up writing just a few short bash scripts that saved me hours of time, so I'd like to share them today with anybody doing similar things.

Hex to Decimal to Hex Conversions in Bash

Converting from hex to decimal representation in your head is far from difficult with a little practice, but gets pretty unwieldy if you're doing it constantly. Writing a little Go code or using Google is good too, but at some point, you're doing it so often you don't even want to spend that time being unproductive. Instead, simply add these functions to your ~/.bash_profile file:

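# Hex to Decimal Conversion Helper functions
h2d(){
  # bc only accepts uppercase hex digits, so normalize the input first
  local str
  str=$(echo "$*" | awk '{print toupper($0)}')
  echo "ibase=16; $str" | bc
}
d2h(){
  # bc prints hex digits in uppercase
  echo "obase=16; $*" | bc
}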

Don't forget to reload your ~/.bash_profile by restarting your terminal or by running:

$ source ~/.bash_profile

And now you can simply type:

$ h2d AF
>>> 175

$ h2d ab
>>> 171

$ d2h 233
>>> E9

Convert away! The h2d function will take uppercase or lowercase letters. Mine is just a small improvement on nixcraft's great article on this.
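If bc isn't installed on a machine, the same conversions can be sketched with nothing but the shell's printf, which treats a 0x-prefixed argument as a hex constant. This is a minimal sketch, not a replacement for the functions above: the names h2d_native and d2h_native are placeholders of my own, and unlike bc this approach is limited to values that fit in a 64-bit integer.

```shell
# bc-free variants of h2d/d2h (hypothetical names, for illustration only).
# Limited to values that fit in a signed 64-bit integer.
h2d_native() {
  # Strip an optional leading 0x, then let printf parse the hex constant
  printf '%d\n' "0x${1#0x}"
}
d2h_native() {
  # %X prints uppercase hex digits, matching bc's output style
  printf '%X\n' "$1"
}
```

$ h2d_native ab
>>> 171

$ d2h_native 233
>>> E9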

Character Counts (and Word Counts For the Fun of It) in Bash

Another common headache is byte counts. When you're pushing things onto the stack in Script, you specify the size of the data to be pushed, and then the bytes themselves. When you're debugging your output transactions, that means seeing a byte representation of the size (e.g. 8B to let you know you have 139 bytes coming up), and then checking that the next n bytes really are n bytes long. At first I was using Javascript Kit's nice and easy character count tool, but at some point I wanted something faster I could run from my terminal. So here it is:

# Character/Word Count Helper. Simply give string as first argument. E.g. chrcount "hello" >> 5
function chrcount(){
    # Quote the argument so whitespace is preserved; printf adds no trailing newline
    printf '%s' "$1" | wc -m
}
function wordcount(){
    printf '%s' "$1" | wc -w
}

Just add to ~/.bash_profile and source as before. Now we can simply run:

$ chrcount "04865c40293a680cb9c020e7b1e106d8c1916d3cef99aa431a56d253e69256dac09ef122b1a986818a7cb624532f062c1d1f8722084861c5c3291ccffef4ec6874"
>>> 130

And voila, we see that our pushed data is 130 characters long (or 65 bytes, since each byte is written as 2 hex characters), which is a valid length for an uncompressed public key to be pushed to the stack. The same check works as a sanity test on any fixed-length value, such as the inputs and outputs of cryptographic signature and hash functions.
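To skip the divide-by-two step, the count can be reported directly in bytes, along with the hex value you'd expect to see as the length in the raw transaction. This is only a sketch under my own assumptions: the name hexbytes is a placeholder, and it does no validation beyond rejecting an odd number of characters.

```shell
# hexbytes: hypothetical helper that prints the byte length of a hex string
# in decimal and in hex. Only checks that the character count is even.
hexbytes() {
  local hex="$1"
  local chars=${#hex}
  if [ $((chars % 2)) -ne 0 ]; then
    echo "odd number of hex characters: $chars" >&2
    return 1
  fi
  printf '%d bytes (0x%02X)\n' "$((chars / 2))" "$((chars / 2))"
}
```

$ hexbytes "04865c40293a680cb9c020e7b1e106d8c1916d3cef99aa431a56d253e69256dac09ef122b1a986818a7cb624532f062c1d1f8722084861c5c3291ccffef4ec6874"
>>> 65 bytes (0x41)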

Just for the hell of it, I implemented a word count too. Never know when that could come in handy (online applications anyone?):

$ wordcount "wow I never knew bash scripts could be so useful!"
>>> 10

Other Helpful Tools

I'll be documenting more of the challenges and solutions I've run into working with raw Bitcoin transactions, but just as a closing remark, here are a few other tools that have been very helpful:

  • SCADACore Hex Converter: Just type in any hex representation of little and big endian byte arrays, and this will give you back the decimal representation. Great for debugging larger byte arrays that represent a length or other amounts in your raw transaction.
  • Blockchain.info's and Coinb.in's raw Bitcoin transaction decoders: Great for decoding your hex transactions to see if you get what you expect. Warning though, these do not give much in the way of debugging information when your transaction fails to decode, and they will not test the validity of scripts or signatures either.
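For quick little endian checks without leaving the terminal, a rough equivalent of the hex converter above can be sketched in the shell. The function name le2d is a placeholder of my own, and the sketch assumes a plain hex string with no 0x prefix whose value fits in a signed 64-bit integer (enough for the 4-byte and 8-byte fields in a raw transaction).

```shell
# le2d: hypothetical helper that converts a little endian hex string to decimal.
# Assumes plain hex (no 0x prefix) and a value that fits in a signed 64-bit integer.
le2d() {
  local hex="$1" be="" rest
  if [ $(( ${#hex} % 2 )) -ne 0 ]; then
    echo "odd number of hex characters: ${#hex}" >&2
    return 1
  fi
  # Peel off one byte (2 hex characters) at a time and prepend it,
  # which reverses the byte order into big endian
  while [ -n "$hex" ]; do
    rest="${hex#??}"
    be="${hex%"$rest"}$be"
    hex="$rest"
  done
  printf '%d\n' "0x$be"
}
```

$ le2d e8030000
>>> 1000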

Hope that was helpful for anyone working with byte arrays in any context or specifically raw Bitcoin transactions. Feel free to tweet to @soroushjp if anything doesn't work the way it should or if I can help debug your raw transactions :)