Cryptocurrencies work using blockchains, which are basically growing lists of records, each chained to the previous one with cryptography. Those ‘records’ are the blocks in which transactions are recorded and validated, but transactions are not the only thing inside them.

Next, we will walk through the contents of a block in a blockchain, taking the Bitcoin blockchain as a reference. In this way, we can get a closer look at how it operates.

Before you start, you should keep in mind that this is an article for users who have at least mastered the basics of blockchain and cryptocurrency concepts. If you haven’t already, it’s best to start with more basic information.

Contents

  • Crypto hash
    • Hash as a one-way function
    • Properties of a secure hash function
    • About hash types
  • Merkle trees
    • Merkle root
  • Digital signatures
    • Process of a digital signature
    • Properties of a secure digital signature
  • Transactions
    • Unspent transaction outputs (UTXO)
    • Structure of a transaction
  • Full block
    • Nonce and mining
    • Block header
    • Other data

Crypto hash

A cryptographic hash function is an algorithm with certain properties that make it useful for protecting data. Unlike encryption, it uses no keys and is not meant to be reversed. It takes a message of any size and returns a single, fixed-length string (called a digest or just a hash), usually written in hexadecimal, regardless of the size of the original message. It serves to verify that a given digest does, in fact, correspond to that particular message (or transaction) and that the message was not modified before it was delivered. If any part of the original message changes, even a single period, the hash (digest) also changes radically.

For example, using an online tool to hash with the SHA-256 algorithm (which we will talk about later), we can enter the following message:

Bitcoin is the first cryptocurrency

And get the following result:

91D3081626672039C99F27323895D06B88376312706BF7AB035A662B6C5C0B1B

If you added words or changed even one period in the original message, the resulting hash would also change, although it would still be the same length (64 hexadecimal characters). Let’s see:

AE0B40E7FC912BCE189A06F3A8069776FB24DCCC493332F2D349F0A470DE1254

Notice that we only added a period at the end of the sentence. Even so, the result is completely different from the first. On the other hand, if the phrases were hashed again with the same algorithm but with another tool, their digests would remain the same: a particular input always produces the same result.
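The same experiment can be reproduced locally instead of with an online tool. The following is a minimal sketch using Python's standard hashlib module; the helper name sha256_hex is an assumption made for illustration, and the exact digests depend on the exact bytes hashed (encoding, trailing spaces or line breaks), so they may not match the strings shown above if the online tool treated the input differently.

```python
import hashlib

def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 uppercase hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest().upper()

original = "Bitcoin is the first cryptocurrency"
with_period = original + "."

digest_a = sha256_hex(original)
digest_b = sha256_hex(with_period)

print(digest_a)
print(digest_b)

# Count how many of the 64 hex positions differ between the two digests.
differing = sum(a != b for a, b in zip(digest_a, digest_b))
print(f"{differing} of 64 characters differ")
```

Running it twice prints the same two digests both times, which is the determinism described above.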

In this way, the integrity of a message can be checked safely: it is almost impossible to recover the original message from the digest, and any modification of the message would show up as a different digest. Functions like this are known as one-way functions. We can go deeper into this.

Hash as a one-way function

A one-way function, in mathematics, is defined as a function (a relation between the elements of two sets) that is easy to calculate but difficult to invert. Note that the word is “difficult”, not impossible. In fact, the existence of true one-way functions is still an unproven conjecture in computer science.

Hash functions, however, are made to be difficult enough to reverse. Only then can they be useful for cryptography, since reversing them would cost the attacker a counterproductive amount of time and resources.

Building a hash is a complex mathematical process, but one of the ingredients is modular arithmetic, which helps to make it ‘one-way’. Simply put, a modular operation produces the remainder of a division. So, for example, 10 mod 3 = 1, because 10 divided by 3 is 3 with a remainder of 1. In other words, 3 goes into 10 three times, leaving 1 over.

Now, let’s say that to build a toy hash we have a secret value (X) such that X mod 5 = 2. Only you would know the value of X, which, let’s say, is 27, because 27 divided by 5 is 5 with a remainder of 2. Suppose further that 5 is the data from a transaction you made and 2 is the resulting hash of that transaction. Although these last two values are public, they do not reveal that your secret value is 27, because infinitely many numbers also give a remainder of 2 when divided by 5. Your “X” could be 7, 52, 23390787 or something else: there is no way to tell which.
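The toy example can be checked in a few lines of Python. This is only a sketch of the modular idea, not a real hash function; the names MODULUS, DIGEST and toy_hash are assumptions introduced for illustration.

```python
MODULUS = 5  # the public "transaction data" in the toy example
DIGEST = 2   # the public "hash"

def toy_hash(x: int) -> int:
    """Toy one-way step: keep only the remainder of dividing x by MODULUS."""
    return x % MODULUS

# Every value below 60 that produces the same digest as the secret 27.
candidates = [x for x in range(60) if toy_hash(x) == DIGEST]
print(candidates)

# The other values mentioned in the text are indistinguishable from 27 too.
for guess in (7, 27, 52, 23390787):
    print(guess, toy_hash(guess))
```

Twelve of the first sixty integers already match, and the list keeps growing without end as the range grows, which is why the digest alone does not single out the secret value.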

In practice, this same principle is applied with more sophisticated algorithms and much larger amounts of data, so the difficulty of finding the source data increases enormously. That is what makes it safe to publish the hash of a piece of data without revealing the data itself.

Properties of a secure hash function

There are several types of hash functions, but all of them, to be safe, must have four main characteristics:

Computationally efficient

Hash functions are used in computers, so although it sounds a bit obvious, these computers must be able to perform the mathematical work required to create a hash in a very short period of time. If this were not the case, each process that involves issuing a hash would take too long and it would be impractical to use them. Currently, this is not a problem, as an average computer can complete the task in less than a second.

Deterministic

This means that the same message (input) must always produce the same digest (output) each time it is hashed. If the function produced a random result each time, it would be useless, since it could not be used to verify that a message is the original one. The point of a hash, for the case at hand, is to help confirm that a signed message is authentic without anyone needing access to the private key.

Preimage resistant

It means that, given only the output, it should be infeasible to find an input that produces it, so the digest should not reveal anything about the input. This is one reason why the digest always has the same length, regardless of the size of the message. Nor should it give any clue about the content of the message, so at the slightest change the resulting hash should be completely different.

Collision resistant

It should be infeasible to find two (or more) different inputs that produce the same output (digest). It is worth mentioning that no hash function is completely collision free: it is a matter of simple counting. The outputs have a fixed length, unlike the inputs, which can be of any size, so the number of possible outputs is finite while the number of inputs is not, and collisions must therefore exist. However, the probability of stumbling on one is extremely small, and the goal of any hash function is to make it as small as possible.
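To see why a finite output space makes collisions unavoidable, one can deliberately shrink the digest. The sketch below keeps only the first 16 bits of SHA-256 (a weakened toy hash, not something any real system uses) and searches for two different messages with the same truncated digest. The names tiny_hash and find_collision and the "message-N" inputs are assumptions made for illustration.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, bits: int = 16) -> int:
    """Keep only the first `bits` bits of SHA-256 (a deliberately weak toy hash)."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def find_collision(bits: int = 16):
    """Hash successive messages until two of them share a truncated digest."""
    seen = {}
    for i in count():
        message = f"message-{i}".encode()
        digest = tiny_hash(message, bits)
        if digest in seen:
            return seen[digest], message, digest
        seen[digest] = message

first, second, shared = find_collision()
print(first, second, hex(shared))
# The full 256-bit digests of the same two messages still differ.
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
```

With only 65,536 possible outputs a collision is guaranteed after at most 65,537 messages and usually appears after a few hundred. With the full 256 bits the same search is far beyond any realistic computing budget, which is the sense in which SHA-256 is called collision resistant.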

About hash types

There are numerous hashing algorithms in use on different platforms and for various purposes, from document authentication and password verification to digital signature verification and, of course, cryptocurrency mining. Among those that continue to be effective, we can mention BLAKE2, MD6, Streebog and, especially, the SHA (Secure Hash Algorithm) series.

The SHA-1 and SHA-2 families were designed by the US National Security Agency (NSA). SHA-2 includes SHA-256, widely used in the crypto world because it was the algorithm chosen by Satoshi Nakamoto for the Bitcoin blockchain, from which, in turn, other cryptocurrencies have been derived that have kept the same algorithm (such as Peercoin).

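SHA-256 produces a digest of 256 bits, written as 64 hexadecimal characters. On the other hand, we have SHA-3, part of the same SHA series but with a different internal structure: it is based on the Keccak algorithm, which was selected through a public competition rather than designed by the NSA. SHA-3 offers fixed digest sizes as well as companion functions (SHAKE) that produce digests of arbitrary size. Ethereum’s Ethash system uses Keccak, the algorithm on which SHA-3 is based.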
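The difference in digest sizes can be observed with Python's standard hashlib module, which ships SHA-256, the standardized SHA-3 and the SHAKE functions. This is a small sketch; the variable names are illustrative, and hashlib's sha3_256 is the standardized SHA-3, which is not byte-for-byte identical to the original Keccak variant, so it should not be taken as a reproduction of what Ethereum computes.

```python
import hashlib

message = b"Bitcoin is the first cryptocurrency"

sha2 = hashlib.sha256(message).hexdigest()        # fixed 256-bit digest
sha3 = hashlib.sha3_256(message).hexdigest()      # fixed 256-bit digest, different design
shake = hashlib.shake_128(message).hexdigest(20)  # the caller chooses the length: 20 bytes

print(len(sha2), sha2)
print(len(sha3), sha3)
print(len(shake), shake)
```

The first two digests have the same length but different contents, while the SHAKE output is exactly as long as requested.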

Merkle trees

Knowing what a hash is, it should be noted that a blockchain uses hashes to fingerprint its transactions, but these hexadecimal strings do not just “float” freely in a digital cloud. They are strictly ordered and summarized as the chain grows, providing a secure, fast, and lightweight method of verifying data. The Merkle tree is used for this purpose.

This concept, also known as a hash tree, is a tree data structure, that is, one that arranges values hierarchically from top to bottom, starting with a single “root” value that branches out into “leaf” values, like an upside-down tree. In a Merkle tree, these values are all hashes, and in a blockchain, those hashes come from transaction data.

The name of the tree comes from its inventor, Ralph Merkle, an American computer scientist who patented this structure in 1979. It should be noted that Merkle (67 years old as of 2019) is also credited as the inventor of cryptographic hashing and as one of the inventors of public key cryptography.

Merkle root

The main purpose of the hash tree is to create a Merkle root, that is, the root value that we mentioned before. In this case, you might think that the “leaf” values come from this value, but in reality, the root value is a summary of all the leaf values.

This digest is created by grouping the hashes of all the transactions into pairs and applying the cryptographic hash function again to each pair, producing a new digest that stands for both. If the number of entries is odd, the last one is duplicated and paired with its own copy to allow processing. The resulting digests are again grouped into pairs and the same technique is repeated until only a single hash is left as a summary of all those that went through this merging process. That is the Merkle root, and there is only one root per block in a blockchain.

Thus, for example, if a block contains 512 transactions, the Merkle tree would group their hashes into 256 pairs, producing 256 new hashes, which would then be reduced to 128, then to 64, from there to 32, then 16, 8, 4, 2 and finally one. A single hash is left to represent those 512 transactions, rather than loading the block header with 512 hashes.
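The pairing procedure described above can be sketched as follows. This is a simplified illustration, assuming double SHA-256 as the hash and ignoring the byte-order conventions of the real Bitcoin protocol; the function names and the "tx-N" placeholder transactions are assumptions, not real data.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin applies to transactions and blocks."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaf_hashes: list[bytes]) -> bytes:
    """Reduce a list of transaction hashes to a single root hash."""
    if not leaf_hashes:
        raise ValueError("a Merkle tree needs at least one leaf")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with its own copy
        # Hash each adjacent pair to build the next, half-sized level.
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

transactions = [f"tx-{i}".encode() for i in range(5)]  # placeholder transaction data
leaves = [sha256d(tx) for tx in transactions]
root = merkle_root(leaves)
print(root.hex())
```

Changing any one of the placeholder transactions changes its leaf hash, every hash on the path above it and, finally, the root, which is what lets a single 32-byte value vouch for the whole set.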

Let’s remember that each transaction in a blockchain has its own hash and, if we talk about hashes created with SHA-256, each one takes up 32 bytes. Going back to the example of 512 hashes, the block header would then have to hold (in addition to other data) 16,384 bytes when, using the Merkle root, it needs to hold only 32 bytes. The root commits to all the others mathematically, so it is not necessary to include them all.
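The numbers in this example are easy to verify. The short sketch below counts the hashes on each level of the tree and compares the two storage figures; level_sizes and HASH_SIZE are names introduced here for illustration.

```python
HASH_SIZE = 32  # bytes per SHA-256 digest

def level_sizes(leaf_count: int) -> list[int]:
    """Number of hashes on each level, from the leaves up to the root."""
    sizes = [leaf_count]
    while sizes[-1] > 1:
        sizes.append((sizes[-1] + 1) // 2)  # an odd level duplicates its last hash
    return sizes

sizes = level_sizes(512)
print(sizes)                  # [512, 256, 128, 64, 32, 16, 8, 4, 2, 1]
print(512 * HASH_SIZE)        # 16384 bytes for all the transaction hashes
print(sizes[-1] * HASH_SIZE)  # 32 bytes for the root alone
```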

As we can see, in the long term, with thousands and thousands of transactions carried out by users from all over the world, the space needed to store a complete blockchain could become a problem were it not for the Merkle root, which summarizes a large amount of data in a single hash.

Satoshi Nakamoto himself explains in the Bitcoin white paper that once the latest transaction in a coin is buried under enough blocks, the spent transactions before it can be discarded to save disk space. To achieve this without breaking the block’s hash, which secures the block and connects it to the rest of the blockchain, transactions are hashed in a Merkle tree and only the root is included in the block’s hash; old blocks can then be compacted by pruning branches of the tree, and the interior hashes do not need to be stored.

“A block header with no transactions would be about 80 bytes. If we suppose blocks are generated every 10 minutes, 80 bytes * 6 * 24 * 365 = 4.2MB per year. With computer systems typically selling with 2GB of RAM as of 2008, and Moore’s Law predicting current growth of 1.2GB per year, storage should not be a problem even if the block headers must be kept in memory.”

Satoshi Nakamoto, Bitcoin white paper
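The arithmetic in the quotation can be reproduced directly. The constant names below are assumptions introduced for readability; the figures themselves come from the quoted passage.

```python
HEADER_BYTES = 80     # approximate size of a block header without transactions
BLOCKS_PER_HOUR = 6   # one block every 10 minutes

bytes_per_year = HEADER_BYTES * BLOCKS_PER_HOUR * 24 * 365
print(bytes_per_year)              # 4204800
print(bytes_per_year / 1_000_000)  # 4.2048, that is, about 4.2 MB per year
```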

Digital signatures

Before defining digital signatures themselves, it is necessary to explore the concept of asymmetric or public key cryptography, as it is essential to how they work. It is a cryptographic system that generates for each user, through specific algorithms, two keys: a public one, which can be distributed to anyone without risk, and a private one, which should be known only by its owner. These keys are strings of characters of a certain length. To an average user their formats do not look very different: which one is public and which one is private is determined by the system used to create them.

Using this system, the sender can encrypt any message using the recipient’s public key. Once that message is encrypted with that public key, only the private key of that recipient can decrypt it, since both keys are mathematically related. In this sense, the public key can be compared to an email address, while the private key would be the password for that email.

In the case of cryptocurrencies, public key cryptography is used in every wallet to transfer funds. The public wallet addresses that we share to receive funds are a hashed version of the public key, while the twelve (or more) words that many wallets provide as a “seed” to recover the funds serve to derive the private key.

This is where digital signatures come in. A digital signature is basically the combination of a private key and a hash of the data to be signed (such as a transaction), which provides a unique digital identification that establishes the authenticity and integrity of the message without revealing the signer’s private key.

To achieve that purpose, a signature scheme typically uses three algorithms: one for the generation of a random private key, from which a corresponding public key is derived; another to produce the signature itself based on the private key and the data; and a last one that determines whether or not the message is authentic based on the data, the public key and the signature. Each transaction carried out in a blockchain needs, in addition to other requirements, the signature of its sender and the verification of that signature by the recipient and the network to become valid.

Process of a digital signature

In this example, we will focus on the creation and verification of the digital signature as such. It is important to mention that a cryptocurrency transaction involves other elements and that other data is validated as well, which we will mention later.

For now, let’s say that Alice has already generated her key pair (that is, she got a digital wallet) and wants to make a bitcoin transaction to Bob. For this transaction to be valid, it must be signed by Alice’s private key, and Bob must verify that this signature is authentic. The steps that the system follows to complete this process are as follows:

1. On Alice’s side, the transaction data is taken and hashed with the SHA-256 algorithm (since we are talking about Bitcoin), producing a digest of 64 hexadecimal characters.
2. The obtained hash is “signed” with Alice’s private key, resulting in two numbers known as r and s, which together take up between 71 and 73 bytes once encoded. That, more specifically, is the digital signature.
3. The transaction data, the digital signature, and Alice’s public key are then sent to Bob.
4. Since the transaction data was also received by Bob, the system on his side repeats the process of hashing it with SHA-256 to get the corresponding hash.
5. Using that hash, the signature, and Alice’s public key, the system on Bob’s side runs the verification algorithm. With ECDSA, the signature is not decrypted: the algorithm recomputes a value from these three inputs (without ever needing Alice’s private key) and compares it with r.
6. If the value from step 5 matches r, the signature is valid. If it does not, this indicates that someone tampered with the data or that Alice’s public key does not correspond to the private key that signed it. In that case the transaction is invalid, since it was modified in transit or does not come from the owner of the funds.
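The flow between Alice and Bob can be sketched with a bare-bones ECDSA over secp256k1, the curve Bitcoin uses, written with the Python standard library only. This is an educational sketch under several assumptions: it hashes the message with a single SHA-256, omits signature encoding, and is neither constant-time nor hardened in any way, so it must not be used to protect real funds. All function names and the sample message are illustrative.

```python
import hashlib
import secrets

# secp256k1 domain parameters: the curve y^2 = x^3 + 7 over the field of integers mod P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P

def scalar_mult(k, point):
    """Multiply a point by an integer with the double-and-add method."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def hash_to_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def generate_keys():
    """Algorithm 1: a random private key and the public key derived from it."""
    private_key = secrets.randbelow(N - 1) + 1
    return private_key, scalar_mult(private_key, G)

def sign(private_key, data: bytes):
    """Algorithm 2: produce the pair (r, s) from the private key and the data."""
    z = hash_to_int(data)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh secret number for every signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s

def verify(public_key, data: bytes, signature) -> bool:
    """Algorithm 3: check (r, s) using only the data and the public key."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = hash_to_int(data)
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G), scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r

alice_private, alice_public = generate_keys()
tx = b"Alice pays Bob 1 BTC"
signature = sign(alice_private, tx)
print(verify(alice_public, tx, signature))                       # True
print(verify(alice_public, b"Alice pays Bob 9 BTC", signature))  # False
```

Note that verify never receives the private key: Bob only needs the data, the signature and Alice's public key, and any change to the data makes the check fail.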

Properties of a secure digital signature

There are several algorithms for creating digital signatures. In the case of Bitcoin, the Elliptic Curve Digital Signature Algorithm (ECDSA) is used, which relies on the math of finite fields and elliptic curves to derive public keys from private ones. All algorithms, however, should offer the following properties to provide the necessary security between participants:

Authentication

Using a digital signature should assure the recipient that the message (or transaction) comes from a specific sender, whose identity can be verified beyond a written signature: with mathematics. The generated signature is based on precise data and is almost impossible to forge.

Integrity

This property guarantees that the data will reach the recipient intact, that is, that it will not be modified in any way during its transfer. In theory, a skilled attacker could modify the data in transit, but if this happens the signature would no longer match the data and, therefore, would no longer be valid.

Non-repudiation

The user who used his personal digital signature cannot deny that he did so. Non-repudiation is usually a legal concept: once a document is signed, the author should not be able to deny that he committed himself by signing it. Therefore, digital signatures are also binding and auditable. Among the jurisdictions that legally recognize them are the United States, Switzerland, Brazil, Mexico, India, Turkey and the European Union.

Transactions

Transactions are digitally signed sets of data that record transfers of funds between senders (known as inputs within those sets) and recipients (outputs). They are broadcast to the network and, as they are validated, they are grouped and ordered to form the blocks of a blockchain.

In Bitcoin (and many other cryptocurrencies), transactions must be authorized with digital signatures, but they are not encrypted, so it is possible to see the data they include through a block explorer.

Unspent transaction outputs (UTXO)

Before going further into the structure of a transaction in the blockchain, we need to know what a UTXO (Unspent Transaction Output) is. A UTXO is an output (of funds) that a user has received and can spend in the future by using it as an input of a new transaction to someone else. The total balance in any user’s wallet is made up of UTXOs of different sizes, or of a single one, part of which comes back as change when less than its full amount is spent.

To put it into perspective, we can think of UTXOs as the equivalent of individual coins or bills within the blockchain. A cash system only has bills and coins of certain amounts (the US dollar, for example, is currently issued in only seven denominations of bills), which are either combined to make up a new amount or handed over whole in the expectation of receiving change. In the same way, on the blockchain different UTXOs are combined to reach an amount, or a larger UTXO is spent and the difference is returned as change when a transfer is made.

So let’s say you have $100 worth of bitcoins in your wallet and you need to pay $45. As a common user you don’t see it with the naked eye, but those $100 may be made up of several UTXOs: maybe two of $50, or four of $25. In the latter case, when making the $45 payment, you would actually be sending two $25 UTXOs and receiving a new $5 UTXO as change.

Unlike cash, UTXOs can be of any amount, depending on the amount exchanged between participants. Another difference with cash is that, in addition to the funds we give for a payment or a purchase, we must also include a commission payment for those who maintain the network by helping to validate the transactions to include them in new blocks (the miners).

Thus, continuing with the previous example, the UTXO that you would actually receive back at the end would not be $5 but $4.50, since the total to pay would be $45.50 ($45 payment + $0.50 commission, for example). Depending on the cryptocurrency and on the circumstances at the time (because blockchains sometimes become congested), the amount paid in commission varies, although it is usually much less than a dollar.
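The payment in this example can be sketched as a small selection routine. This is an illustrative simplification, assuming a “largest first” selection strategy and amounts expressed in cents; real wallets use more elaborate strategies and work in satoshis. The function name spend is hypothetical.

```python
def spend(utxos: list[int], payment: int, fee: int):
    """Pick UTXOs (largest first) until payment + fee is covered.

    Returns the list of UTXOs used as inputs and the change that comes back.
    """
    needed = payment + fee
    selected, total = [], 0
    for value in sorted(utxos, reverse=True):
        if total >= needed:
            break
        selected.append(value)
        total += value
    if total < needed:
        raise ValueError("insufficient funds")
    return selected, total - needed

# Amounts in cents: four $25 UTXOs, a $45 payment and a $0.50 commission.
inputs, change = spend([2500, 2500, 2500, 2500], payment=4500, fee=50)
print(inputs, change)  # [2500, 2500] 450
```

Two $25 UTXOs are consumed in full and a new UTXO of $4.50 returns to the sender as change, while the remaining two $25 UTXOs are left untouched in the wallet.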

Since a single UTXO can hold any amount, a balance can end up split into many very small UTXOs, each of which takes up space when it is spent, and in the long run chain space can become a problem. Therefore, developers of digital wallets should keep UTXOs at efficient sizes that take up as little space as possible and allow better processing speeds. Thus, if your balance is $100, it is better to have it divided into two $50 UTXOs or four $25 UTXOs than into 100 $1 UTXOs. If we imagine UTXOs as bills in a physical wallet again, the problem of having too many low-value UTXOs in a single balance becomes clearer.

It is important to note that, rather than coins, the entire blockchain is a network of UTXOs waiting to be unlocked and sent to someone else as a new UTXO.

Structure of a transaction

To get an idea of the structure of a transaction, we can take a closer look at how a transaction is composed on the Bitcoin blockchain. It can be divided into three parts: the header, the inputs, and the outputs.

The header is made up of four parts: the hash of the transaction (its identifier, which is computed from its contents), the version number that tells nodes which rules to use to validate the transaction, the number of inputs and outputs, and a lock time, expressed either as a date or as a block height, that indicates the earliest moment at which the transaction may be added to the chain.

The inputs include the hash of the previous transaction that contains the available UTXO(s), an index into that transaction’s list of outputs to identify the one being spent by the new input, and the ScriptSig, a small unlocking “program” that supplies what is required to access the funds. The main requirement is a signature made with the private key of the owner of those funds.

The outputs, in turn, include the amount to be paid in satoshis and the ScriptPubKey, the counterpart of the ScriptSig, which is in charge of locking the funds to the recipient’s public key so that only the recipient can unlock them later with the corresponding private key.

Once the transaction is made, it is sent to the miners, who are in charge of validating it by, among other steps, checking both scripts against each other.
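The parts just described can be pictured as a small data structure. The sketch below is only a conceptual model, not Bitcoin's binary format: the class and field names are assumptions, and the scripts are shown as placeholder strings rather than real script bytes.

```python
from dataclasses import dataclass, field

@dataclass
class TxInput:
    previous_txid: str  # hash of the transaction holding the UTXO being spent
    output_index: int   # position of that UTXO in the previous transaction's outputs
    script_sig: str     # unlocking script

@dataclass
class TxOutput:
    value_satoshis: int  # amount locked in this output
    script_pubkey: str   # locking script tied to the recipient's public key

@dataclass
class Transaction:
    version: int
    inputs: list[TxInput] = field(default_factory=list)
    outputs: list[TxOutput] = field(default_factory=list)
    locktime: int = 0    # date or block height field

    def total_output(self) -> int:
        return sum(output.value_satoshis for output in self.outputs)

# Hypothetical placeholder values, for illustration only.
tx = Transaction(
    version=1,
    inputs=[TxInput(previous_txid="ab" * 32, output_index=0,
                    script_sig="<signature> <public key>")],
    outputs=[
        TxOutput(value_satoshis=45_000, script_pubkey="<lock to recipient>"),
        TxOutput(value_satoshis=4_500, script_pubkey="<lock change to sender>"),
    ],
)
print(len(tx.inputs), len(tx.outputs), tx.total_output())  # 1 2 49500
```

Each input points back to an earlier output, and each output becomes a new UTXO that a later transaction can reference in the same way.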

Full block

The block of a blockchain is a “container” of data of variable size. Most of this data consists of transactions (in Bitcoin, an average of 2,188 per block), which in turn rely on hashes, digital signatures and UTXOs, as we have already seen. In addition, a block has a header, which records the metadata of the block itself, that is, technical information about its composition and validation within the chain.

Soon we will see what makes up that header, but first it is necessary to go through another concept: the nonce.

Nonce and mining

We must start by stating that each block has a unique ID in the form of a hash. This is created by passing the block header twice through the SHA-256 algorithm (in the case of Bitcoin). Within that header is the hash of the previous block, so both blocks are automatically linked.

However, not just any hash makes the block valid: it has to be a very specific hash, starting with a run of zeros, since it must be equal to or below a certain value called the target: a (rather long) 256-bit number determined by the difficulty set by the system. Finding an acceptable hash to make a block valid is what is called cryptocurrency mining, and it is implemented with the intention of making modification of the hash chain almost impossible.

Now, how can that acceptable hash be found so that the block becomes valid? As we mentioned, that hash is supposed to come out of the block header. But there is a problem there, because all the data in that header is essential and cannot be modified: it would imply changing, for example, the amounts of a transaction. What if the resulting hash is above the target? To get a different hash, the data in the header would have to be changed in some way, but each piece of data is vital.

That’s where the nonce comes in: it’s an arbitrary number that is added to the block header as a little extra data, with no other purpose than to be changed over and over again by miners in order to find a valid hash. If the first nonce does not work, it is replaced with a new one, until a hash is found that satisfies the network difficulty condition (is at or below the target). Once this process is done, if the contents of the block were changed, its hash would change too; the network would easily notice thanks to the properties of the hash, and the block would be invalidated and rejected from the blockchain.

More than about any single attempt, mining is about luck, since the hash produced by each nonce is unpredictable. However, miners with better equipment will have more opportunities because, after all, the number and speed of their attempts increase their chances of finding a valid hash in the shortest possible time.
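The search for a valid nonce can be sketched as a simple loop. This is a toy model under stated assumptions: the header is a placeholder byte string, the target is far easier than any real network target, the nonces are tried in increasing order, and the digest is read as a big-endian number for readability (Bitcoin's actual byte conventions differ). The names block_hash, mine and TARGET are illustrative.

```python
import hashlib

def block_hash(header_without_nonce: bytes, nonce: int) -> int:
    """Double SHA-256 of the header plus a 4-byte nonce, read as an integer."""
    data = header_without_nonce + nonce.to_bytes(4, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")

def mine(header_without_nonce: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces one after another until the hash is at or below the target."""
    for nonce in range(max_nonce):
        candidate = block_hash(header_without_nonce, nonce)
        if candidate <= target:
            return nonce, candidate
    return None  # nonce space exhausted without finding a valid hash

# Toy target: any hash whose first three hexadecimal digits are zero is valid.
TARGET = 2**244 - 1

header = b"toy header: previous hash + merkle root + timestamp"  # placeholder data
nonce, value = mine(header, TARGET)
print(nonce)
print(f"{value:064x}")
```

With this target roughly one hash in 4,096 qualifies, so the loop finishes almost instantly. Lowering the target makes valid hashes rarer and the search proportionally longer, which is how the difficulty of mining is adjusted.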

Block header

Knowing all of the above, we can then understand how the block header is formed in a blockchain. It includes six fields:

Version: This is the number that indicates the version of the software rules in force at the time the block was mined. Computers use it to read the content of each block correctly. The Bitcoin software, for example, had gone through about 56 versions by 2019 in the Bitcoin Core client alone.

Hash of the previous block: a long hexadecimal string that begins with several zeros. In Bitcoin, it has 64 characters. We have already seen how it is formed.

Merkle root: As we explained before, all the transactions in the block are joined into a single hash, which is this root.

Timestamp: Indicates the approximate moment the block was mined. In Bitcoin, it is stored as the number of seconds elapsed since January 1, 1970 (Unix time).

Target: This is the 256-bit number that tells miners the value that a valid block hash must not exceed. In Bitcoin, it is stored in the header in a compact 4-byte form.

Nonce: The additional number that miners vary to find a valid hash for the block.
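The six fields can be packed into the 80 bytes mentioned in the white paper quotation. The sketch below uses Python's struct module and assumes the usual Bitcoin field widths (4-byte version, two 32-byte hashes, 4-byte timestamp, 4-byte compact target and 4-byte nonce, all little-endian). The function name pack_header and the sample values are placeholders, not data from a real block.

```python
import struct
import time

def pack_header(version: int, prev_hash: bytes, merkle_root: bytes,
                timestamp: int, bits: int, nonce: int) -> bytes:
    """Serialize the six header fields: 4 + 32 + 32 + 4 + 4 + 4 = 80 bytes."""
    return struct.pack("<L32s32sLLL", version, prev_hash, merkle_root,
                       timestamp, bits, nonce)

header = pack_header(
    version=1,
    prev_hash=bytes(32),         # placeholder: 32 zero bytes
    merkle_root=bytes(32),       # placeholder: 32 zero bytes
    timestamp=int(time.time()),  # seconds since January 1, 1970
    bits=0x1D00FFFF,             # example target in compact 4-byte form
    nonce=0,
)
print(len(header))  # 80
```

Because the header always has this fixed size, hashing it costs the same no matter how many transactions the block contains.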

Other data

In addition to the header and transactions, a block also contains a few other pieces of data. Let’s see:

Magic number: In programming, this is a constant number used to identify the format of a file or protocol. In the case of a blockchain, this number is used to identify where one block ends and the next begins. In Bitcoin it is always the same: 0xD9B4BEF9, and it takes up 4 bytes.

Block size: a number that gives the size of the block in bytes. That number itself takes up 4 bytes in Bitcoin.

Transaction counter: the number of transactions in the block, represented by a positive integer of variable length. In Bitcoin, it takes up from 1 to 9 bytes.
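The variable length of the transaction counter comes from a compact integer encoding. The sketch below follows the scheme commonly documented for Bitcoin (one byte for small values, otherwise a one-byte marker followed by 2, 4 or 8 bytes); the function name encode_varint is an assumption made for illustration.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in 1, 3, 5 or 9 bytes."""
    if n < 0:
        raise ValueError("the counter cannot be negative")
    if n < 0xFD:
        return n.to_bytes(1, "little")            # small values fit in a single byte
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")  # marker + 2 bytes
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")  # marker + 4 bytes
    return b"\xff" + n.to_bytes(8, "little")      # marker + 8 bytes

for count in (1, 252, 2188, 70_000, 5_000_000_000):
    encoded = encode_varint(count)
    print(count, len(encoded), encoded.hex())
```

A block with a couple of thousand transactions therefore spends 3 bytes on its counter, and the 9-byte form is only needed for values that do not fit in 4 bytes.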

In total and in order, a block then has five sections:

Magic number
Block size
Header (which in turn contains 6 fields)
Transaction counter
Transactions (can be thousands)

In this sense, we could say that a block is less a “block” as such and more a group of data chained to others with cryptography. In turn, these blocks of information (financial, in the case of cryptocurrencies) are chained sequentially without end, creating a practically unbreakable cryptographic chain.

About the Author

SAKHRI Mohamed

The blog of a computer enthusiast who shares news, tutorials, tips, online tools and software for Windows, macOS, Linux, Web designer and Video games.