Understanding Hashing in Crypto: How Data Becomes Immutable

Pomegra Learn

What is Cryptographic Hashing in Cryptocurrency?

Hashing is one of the three pillars of blockchain security, alongside public-key cryptography and digital signatures. A hash function takes any input—a transaction, a block, a message, a file—and produces a fixed-length output (a "hash") that represents that data. Change even one character in the input, and the hash becomes completely different. This property makes hashing the foundation of blockchain's immutability.

If you've ever seen a Bitcoin block hash like 0000000000000000000f9cf8c63e38f132eac19c8f7165577b7afff670000000, that's the output of a hash function applied to all the data in that block. It's also what miners are racing to find, what nodes use to verify blocks, and what ties blocks together in an unbreakable chain.

Quick definition: A cryptographic hash function is a mathematical algorithm that converts any input data into a fixed-length string of characters (the hash), designed so that reversing the process or finding two inputs that produce the same hash is computationally infeasible.

Key Takeaways

A hash function takes any input and produces a fixed-length output called a hash or digest
One-way function: You cannot reverse a hash to recover the original data—it's computationally infeasible
Deterministic: The same input always produces the same hash; different inputs produce different hashes in practice, because collisions, although they must exist, are infeasible to find
Avalanche effect: Changing one character in the input completely changes the hash
Collision-resistant: Finding two inputs that produce the same hash is practically impossible
Bitcoin uses SHA-256, producing 64-character hexadecimal hashes (256 bits)
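Ethereum uses Keccak-256 for hashing, the original Keccak design that later became SHA-3 (the finalized SHA3-256 standard uses different padding, so the two produce different digests)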
Hashes secure transactions, link blocks together, and enable proof-of-work mining

How Hash Functions Work

A cryptographic hash function is a mathematical algorithm that transforms data. Think of it as a digital fingerprint: effectively unique, fixed-size, and infeasible to forge.

Example: SHA-256 (Bitcoin's hash function)

Input: "Hello, World!" Output: dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f

Input: "Hello, World!" Output: dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f (identical)

Input: "Hello, World!!" Output: 315f5bdb76d078c43b8ac0064e4a0164612b1fce77c869345bfc94c75894edd3 (completely different—one character changed)

The same input always produces the same hash. But a single character change produces a completely different hash. This is the avalanche effect—a tiny input change creates a dramatic output change.
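
The behaviour above can be reproduced with a few lines of Python. This is a minimal sketch using the standard library's hashlib; the helper name sha256_hex is an illustrative choice, not part of any Bitcoin software.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

first = sha256_hex("Hello, World!")
again = sha256_hex("Hello, World!")
changed = sha256_hex("Hello, World!!")

print(first)
print(first == again)    # True: same input, same digest
print(first == changed)  # False: one extra character changes the digest
```

Note that the digest depends on the exact bytes hashed, so a different text encoding or a trailing newline would give a different result.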

Properties of Cryptographic Hash Functions

Secure hash functions have specific mathematical properties that make them useful for blockchain.

1. Deterministic

The same input always produces the same output. If you hash "Bitcoin" a million times, you get the same hash every time. This consistency allows everyone on the network to verify the same transactions produce the same hashes.

2. One-Way (Preimage Resistance)

Given a hash, you cannot compute the original input. If I show you dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f, you cannot figure out it came from "Hello, World!" by working backward. You'd have to try every possible input until you found one that hashes to that value—which would take longer than the universe has existed.

This one-way property is why hashes are useful for storing passwords (you never store the password itself, only a salted hash computed with a deliberately slow function such as scrypt or Argon2) and for securing data.

3. Fixed Output Size

Whether you hash one character or a gigabyte of data, the output is the same size. SHA-256 always produces a 256-bit output (64 hexadecimal characters). Keccak-256 also produces a 256-bit output. This fixed size makes hashes portable and uniform.
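
A quick way to see the fixed output size, sketched with Python's hashlib (the input sizes are arbitrary choices for illustration):

```python
import hashlib

# Hash a 1-byte input and a 1,000,000-byte input with SHA-256.
small = hashlib.sha256(b"a").digest()
large = hashlib.sha256(b"a" * 1_000_000).digest()

# Both digests are 32 bytes (256 bits), regardless of input size.
print(len(small), len(large))
```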

4. Avalanche Effect (Sensitivity)

A tiny change in the input produces a completely different hash. In a secure hash function, changing one bit of the input changes roughly half the bits of the output. This extreme sensitivity means you cannot approximate a hash—you either have the exact input, or your hash is wrong.
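
The "roughly half the bits" claim can be checked directly. The sketch below hashes two inputs that differ in a single bit (the letters n and o differ only in their lowest bit) and counts how many of the 256 output bits differ; the function name differing_bits is hypothetical.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where SHA-256(a) and SHA-256(b) differ."""
    digest_a = int.from_bytes(hashlib.sha256(a).digest(), "big")
    digest_b = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(digest_a ^ digest_b).count("1")

# b"Bitcoin" and b"Bitcoio" differ in exactly one input bit.
flipped = differing_bits(b"Bitcoin", b"Bitcoio")
print(flipped, "of 256 output bits changed")
```

For a well-behaved hash function the count should land near 128, with some variation from one input pair to the next.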

5. Collision Resistance

A collision occurs when two different inputs produce the same hash. Because inputs are unbounded and outputs are fixed-size, collisions must exist, but secure hash functions are designed so that finding one is computationally infeasible. No SHA-256 collision has ever been found, and the best known generic approach (a birthday attack) would take on the order of 2^128 hash evaluations, far beyond any realistic computing capacity.
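
To get a feel for why output size matters, it can help to weaken the hash on purpose. The sketch below keeps only the first 2 bytes (16 bits) of each SHA-256 digest and searches for two inputs that collide on that prefix. The truncation and the counter-based inputs are assumptions made purely for the demonstration; nothing here attacks full SHA-256.

```python
import hashlib

def truncated_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """Return only the first n_bytes of the SHA-256 digest (deliberately weak)."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_truncated_collision(n_bytes: int = 2):
    """Hash "0", "1", "2", ... until two inputs share a truncated digest."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode()
        prefix = truncated_hash(message, n_bytes)
        if prefix in seen:
            return seen[prefix], message, counter + 1
        seen[prefix] = message
        counter += 1

first_msg, second_msg, attempts = find_truncated_collision()
print(first_msg, second_msg, "collide after", attempts, "attempts")
```

With a 16-bit output a collision typically turns up after a few hundred attempts (around the square root of 65,536). The same birthday reasoning applied to 256 bits gives the 2^128 figure, which is why full-length digests are considered safe.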

6. Speed

Hash functions must be fast to compute. A single Bitcoin mining machine computes trillions of hashes per second, and every node hashes each block and transaction it verifies. If hashing were slow, the network would grind to a halt. General-purpose hash functions are optimized for speed.

Common Hash Functions in Cryptocurrency

SHA-256 (Bitcoin)

256-bit output (64 hexadecimal characters)
Designed by the NSA, approved by NIST
Bitcoin hashes transactions and blocks using SHA-256
Also used in proof-of-work mining and address generation
Widely considered highly secure; no collisions found

Keccak-256 (Ethereum)

256-bit output (64 hexadecimal characters)
Same output size as SHA-256 but a different internal design (a sponge construction)
Ethereum uses Keccak-256 for transaction hashing and smart contracts
Faster than SHA-256 on some hardware
Also secure; no practical collisions known

RIPEMD-160 (Bitcoin Addresses)

160-bit output (40 hexadecimal characters)
Used in Bitcoin address generation
Applied after SHA-256 (the public key is hashed with SHA-256, and that result is hashed again with RIPEMD-160)
Less common than SHA-256 or Keccak-256 but still secure

Scrypt and Argon2 (Password Hashing)

Deliberately slow hash functions
Used for password-based key derivation (turning a password into a usable key)
Designed to resist brute-force attacks by making each hash attempt expensive
Not used for blockchain hashing (too slow for mining)
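
As a rough illustration of a deliberately slow hash, the sketch below derives and verifies a password hash with scrypt through Python's hashlib.scrypt (available when Python is built against OpenSSL 1.1 or newer). The cost parameters n, r, and p and the helper names are assumptions chosen for the example, not recommendations from this article.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Derive a 32-byte key from a password with scrypt; returns (salt, key)."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    key = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,  # CPU/memory cost; higher means slower
        r=8,
        p=1,
        dklen=32,
    )
    return salt, key

def verify_password(password, salt, expected_key):
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)

salt, key = hash_password("correct horse")
print(verify_password("correct horse", salt, key))  # True
print(verify_password("wrong guess", salt, key))    # False
```

Each call takes noticeably longer than a plain SHA-256 hash, which is the point: an attacker guessing passwords pays that cost on every attempt.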

How Hashing Secures Blockchain

Blockchain's security comes from linking blocks together using hashes. Each block contains the hash of the previous block—like a chain of fingerprints.

The process:

Transactions are gathered. Miners or validators collect pending transactions.
Transactions are hashed. Each transaction is hashed individually, using double SHA-256 in Bitcoin or Keccak-256 in Ethereum.
Hashes are combined. Multiple transaction hashes are combined into a Merkle tree—a structure where hashes are paired, combined, and hashed again until a single root hash remains (the Merkle root).
Block header is created. In Bitcoin, the block header includes:
- A version number
- The previous block's hash
- The Merkle root (hash of all transactions)
- A timestamp
- Difficulty target
- A nonce (a number used in mining)
Block is hashed. The block header is hashed (in Bitcoin, with SHA-256 applied twice), producing the block hash.
Hash is checked against target. In Bitcoin, the block hash, read as a number, must not exceed the difficulty target. Miners adjust the nonce and rehash until they find a hash that meets the target.
Block is linked. The next block's header includes the previous block's hash, creating a chain. If someone tries to modify an old block, its hash changes, breaking the chain.
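
To make the header-hashing step concrete, the sketch below serializes an 80-byte Bitcoin-style header (a 4-byte version field, then the previous hash, Merkle root, timestamp, compact difficulty target, and nonce) and hashes it with double SHA-256. The field values are the widely published ones for Bitcoin's first block; the function names are illustrative and not taken from any Bitcoin implementation.

```python
import hashlib
import struct

def sha256d(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def serialize_header(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce):
    """Pack the six header fields into 80 bytes (integers little-endian)."""
    return (
        struct.pack("<I", version)
        + bytes.fromhex(prev_hash_hex)[::-1]    # hashes are stored byte-reversed
        + bytes.fromhex(merkle_root_hex)[::-1]
        + struct.pack("<III", timestamp, bits, nonce)
    )

header = serialize_header(
    version=1,
    prev_hash_hex="00" * 32,  # the first block has no predecessor
    merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,          # compact encoding of the difficulty target
    nonce=2083236893,
)

# Block hashes are conventionally displayed byte-reversed, as hex.
block_hash = sha256d(header)[::-1].hex()
print(len(header), block_hash)
```

Changing any field, even the nonce by one, yields an unrelated hash, which is what makes the header hash a commitment to everything the block contains.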

Why this creates immutability:

To modify a transaction in an old block, you'd need to recalculate that block's hash
This breaks the chain; the next block points to the old hash, not the new one
You'd have to modify the next block too, which breaks the chain again
This cascades: modifying one transaction requires modifying every subsequent block
To do this faster than the network can create new blocks, you'd need more computing power than the entire network combined (a 51% attack)
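
The cascade can be seen in a toy chain. The sketch below is a simplified model, assuming each block is just a dictionary holding the previous block's hash and some data, serialized as JSON; real blocks use the binary header format and proof of work, which are left out here.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents (toy serialization: sorted-key JSON)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def build_chain(payloads):
    """Create blocks where each one stores the hash of its predecessor."""
    chain = []
    prev = "0" * 64
    for payload in payloads:
        block = {"prev_hash": prev, "data": payload}
        chain.append(block)
        prev = block_hash(block)
    return chain

def first_broken_link(chain):
    """Return the index of the first block whose prev_hash no longer matches."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None

chain = build_chain(["Alice pays Bob 1", "Bob pays Carol 1", "Carol pays Dave 1"])
print(first_broken_link(chain))  # None: every link matches

chain[0]["data"] = "Alice pays Mallory 1"  # tamper with the oldest block
print(first_broken_link(chain))  # 1: block 1 still points at the old hash
```

Repairing the link at block 1 would change block 1's own hash and break block 2, and so on down the chain. In Bitcoin each repair also requires redoing that block's proof of work.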

Hash Rate and Proof of Work

In Bitcoin, hashing isn't just for security—it's the foundation of mining and consensus.

Mining process:

Miners collect pending transactions and create a block
They create a block header including a nonce, typically starting from an arbitrary value
They hash the block header: SHA-256(SHA-256(block_header)) = some hash
They check: is this hash at or below the target? (Hashes that small start with leading zeros: 0000abcd...)
If yes, they've found a valid block and broadcast it
If no, they increment the nonce and try again with the updated header
They repeat this, trillions of times per second on modern mining hardware, until they find a valid hash
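
The loop above can be sketched in a few lines. This is a toy miner under stated assumptions: the header is a placeholder byte string rather than a real 80-byte header, and the target is set so that only about 1 in 65,536 hashes qualifies, which keeps the search to a fraction of a second.

```python
import hashlib
import struct

def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash, read as a number, is <= target."""
    for nonce in range(max_nonce):
        candidate = header_prefix + struct.pack("<I", nonce)
        # Bitcoin interprets the digest as a little-endian 256-bit integer.
        value = int.from_bytes(sha256d(candidate), "little")
        if value <= target:
            return nonce, value
    return None

# Toy target: the top 16 bits of the 256-bit value must be zero.
target = (1 << 240) - 1
result = mine(b"toy block header", target)
nonce, value = result
print("nonce:", nonce)
print("hash :", format(value, "064x"))  # starts with at least four hex zeros
```

Halving the target doubles the expected number of attempts, which is how difficulty is tuned without changing the hash function.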

Difficulty adjustment:

Bitcoin's difficulty adjusts every 2,016 blocks (roughly two weeks) to keep the average block time near 10 minutes. If miners add more hashing power and blocks arrive faster, the target is lowered, so valid hashes need to be smaller (roughly, more leading zeros) and are harder to find.

This is why Bitcoin mining requires enormous electricity: miners are racing to hash billions of combinations until one produces a valid hash.
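
The retargeting rule can be written as a short calculation: the new target is the old target scaled by how long the last 2,016 blocks actually took compared with the intended two weeks, with the change limited to a factor of four in either direction. The sketch below works on the target as a plain integer and omits details such as the compact target encoding and the maximum allowed target.

```python
# Intended duration of 2,016 blocks at 10 minutes each, in seconds.
TARGET_TIMESPAN = 2016 * 10 * 60

def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by actual/intended time, clamped to a factor of 4."""
    clamped = min(max(actual_timespan, TARGET_TIMESPAN // 4), TARGET_TIMESPAN * 4)
    return old_target * clamped // TARGET_TIMESPAN

old_target = 1_000_000
print(retarget(old_target, TARGET_TIMESPAN))       # unchanged: blocks were on schedule
print(retarget(old_target, TARGET_TIMESPAN // 2))  # halved: blocks came twice as fast
print(retarget(old_target, TARGET_TIMESPAN * 10))  # capped at 4x the old target
```

A lower target means fewer hashes qualify, so difficulty rises when blocks arrive too quickly and falls when they arrive too slowly.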

Hash Functions and Privacy

Hashing plays a crucial role in cryptocurrency privacy, though with limitations.

One-way hashing for security:

Smart contracts often store only the hash of sensitive data on-chain (a commitment) rather than the data itself
The actual data is revealed only when needed and checked against the stored hash; note that deployed contract code itself is public
Passwords can be hashed and stored without revealing the original password

Limitations for privacy:

Bitcoin and Ethereum transactions are hashed and stored transparently
While the hash is one-way, the original transaction data is visible on the blockchain
Hashing the transaction doesn't hide its content—it just creates a fixed-size fingerprint
Privacy-focused cryptocurrencies (Monero, Zcash) use additional techniques beyond hashing

Merkle Trees and Block Verification

A Merkle tree is a tree of hashes where leaf nodes are transaction hashes, and each parent node is the hash of its children. The root of the tree (Merkle root) is a single hash that represents all transactions in a block.

Why Merkle trees matter:

Efficiency: A node can verify a transaction without downloading the entire block. It needs only the sibling hashes along the path from the transaction's leaf to the root.
Simplicity: Bitcoin SPV (Simplified Payment Verification) clients—light wallets on phones—can verify transactions by checking headers and a small branch of the Merkle tree.
Structure: Merkle trees organize transactions hierarchically, making verification fast and scalable.

Bitcoin example:

If a block has 2,000 transactions, a phone wallet can verify a specific transaction by downloading only about 11 hashes (log₂ of the transaction count, rounded up) instead of the entire block.
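
The sketch below builds a Bitcoin-style Merkle tree over 2,000 stand-in transaction hashes, extracts the proof for one of them, and verifies it against the root. It follows Bitcoin's convention of duplicating the last hash when a level has an odd number of entries; the placeholder transaction data and the function names are assumptions for illustration.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Double SHA-256, used for Bitcoin's Merkle tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root_and_proof(leaves, index):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left)."""
    level = list(leaves)
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        sibling = index ^ 1          # the other half of this node's pair
        proof.append((level[sibling], sibling < index))
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify_proof(leaf, proof, root):
    """Recompute the path from a leaf to the root using only sibling hashes."""
    current = leaf
    for sibling, sibling_is_left in proof:
        pair = sibling + current if sibling_is_left else current + sibling
        current = sha256d(pair)
    return current == root

# Placeholder transaction hashes; real ones would come from a block.
txids = [sha256d(f"tx-{i}".encode()) for i in range(2000)]
root, proof = merkle_root_and_proof(txids, index=1234)

print(len(proof))                              # 11 sibling hashes
print(verify_proof(txids[1234], proof, root))  # True
print(verify_proof(txids[0], proof, root))     # False: wrong leaf for this path
```

The verifier never sees the other 1,999 transactions, only 11 hashes of 32 bytes each plus the root from the block header.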

Real-World Examples

Example 1: Verifying Block Integrity

A Bitcoin node receives a block from a miner. It:

Reads all transactions in the block
Calculates the hash of each transaction
Combines them into a Merkle tree
Calculates the Merkle root
Hashes the block header (which includes the Merkle root)
Verifies the resulting hash matches the announced block hash

If every hash matches, the block is legitimate. If even one transaction was altered, the entire Merkle root changes, the block hash changes, and the node rejects it.

Example 2: Double-Spending Prevention

Alice tries to spend the same Bitcoin twice:

Transaction 1: Send 1 BTC to Bob, hash = a1b2c3d4...
Transaction 2: Send the same 1 BTC to Carol, hash = e5f6a7b8...

Both transactions are broadcast. Bitcoin nodes see that the second one tries to spend an input already spent by the first, so only one of them can be included in the chain and the other is rejected. The blockchain's immutability (maintained by hashes linking blocks) ensures there's no ambiguity about which transaction happened first.
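
The check a node performs can be modelled with a set of unspent outputs. The sketch below is a simplified illustration: transactions are plain dictionaries, outputs are identified by a (transaction id, output index) pair, and signatures and amounts are ignored. All names are hypothetical.

```python
def apply_transactions(unspent: set, transactions: list):
    """Accept each transaction only if all of its inputs are still unspent."""
    accepted, rejected = [], []
    for tx in transactions:
        inputs = tx["inputs"]
        no_duplicates = len(set(inputs)) == len(inputs)
        if no_duplicates and all(outpoint in unspent for outpoint in inputs):
            unspent.difference_update(inputs)  # inputs are now spent
            unspent.update((tx["id"], i) for i in range(tx["n_outputs"]))
            accepted.append(tx["id"])
        else:
            rejected.append(tx["id"])
    return accepted, rejected

# Alice owns one unspent output worth 1 BTC.
unspent = {("funding-tx", 0)}
to_bob = {"id": "tx1", "inputs": [("funding-tx", 0)], "n_outputs": 1}
to_carol = {"id": "tx2", "inputs": [("funding-tx", 0)], "n_outputs": 1}

accepted, rejected = apply_transactions(unspent, [to_bob, to_carol])
print(accepted, rejected)  # ['tx1'] ['tx2']
```

Which of the two conflicting transactions wins depends on which one is included in a block first; the hash-linked chain then fixes that order for everyone.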

Example 3: Light Client Verification

A phone wallet (light client) wants to verify a transaction with minimal data:

It downloads block headers (80 bytes each), not full blocks (1-2 MB each)
For a transaction it cares about, it requests only the Merkle path
It verifies the Merkle path hash up to the Merkle root
It confirms the Merkle root matches the block header
If the block headers are valid (sufficiently difficult hashes), the transaction is verified

This allows phones to verify transactions using kilobytes of data instead of gigabytes.

Common Mistakes to Avoid

Mistake 1: Thinking hashes encrypt data. Hashes are one-way functions; they don't encrypt. Encryption (with a key) is reversible; hashing is not. A hash proves data integrity and creates a fingerprint, but it doesn't hide the data. Anyone seeing the original transaction on the blockchain knows its contents.

Mistake 2: Assuming two inputs can never share a hash, or that a collision is a realistic risk. Collisions must exist in theory, because there are infinitely many inputs and only 2^256 (about 1.16 × 10^77) possible SHA-256 outputs, but they are practically impossible to find. The best generic attack needs on the order of 2^128 hash evaluations, which is infeasible with any foreseeable hardware.

Mistake 3: Thinking you can "guess" a hash. Hashes appear random. Given a target hash, you cannot compute an input that produces it any faster than trying every possible input. This is why mining is computationally expensive.

Mistake 4: Confusing hash functions with encryption. Hashing is one-way; encryption is two-way (reversible with a key). Bitcoin uses hashing for immutability; encryption is used elsewhere, such as protecting wallet files or private communication.

Mistake 5: Believing a hash that is almost right counts for something. Hashes are fixed-length, complete outputs, and because of the avalanche effect a near-match says nothing about how close the input is. You cannot "partially match" a hash. Either the hash is exactly correct, or it's wrong. There's no in-between.

FAQ

Why does Bitcoin hash blocks twice? Bitcoin applies SHA-256 twice (often written SHA-256d) when it computes transaction IDs, Merkle tree nodes, and block header hashes. The reason was never formally documented; a common explanation is that double hashing guards against length-extension attacks on SHA-256. Some protocols use single hashing; double-hashing is a Bitcoin convention.

Can quantum computers break SHA-256? Quantum computers could theoretically threaten certain cryptographic systems (like the elliptic-curve cryptography used for signatures), but SHA-256 is more resistant. The best known quantum attack on hashing, Grover's algorithm, gives only a quadratic speedup, which would leave SHA-256 with roughly 128 bits of preimage security. Even so, cryptocurrency researchers are preparing alternatives.

Why is Bitcoin's hash rate so high? Bitcoin's hash rate (the total computing power mining blocks) has reached hundreds of exahashes per second (one exahash is 10^18 hashes) because mining is profitable and difficulty increases with miner participation. Higher hash rate makes the network more secure (harder to 51% attack) but uses more electricity.

Is there a faster hash function than SHA-256? Yes. Functions such as BLAKE2 and BLAKE3 are typically faster in software, and Keccak-256 can be faster on some hardware. But speed isn't the only consideration: Bitcoin chose SHA-256 for its strength and maturity. Changing it would require consensus across thousands of nodes.

Can I use the same hash for two different purposes? Yes, but it's not best practice. If one hash value or construction serves multiple purposes, a weakness in one use can carry over to the others. In Bitcoin, different constructions are used for different purposes: double SHA-256 for blocks and transactions, SHA-256 followed by RIPEMD-160 for addresses.

What happens if someone finds a SHA-256 collision? It would be serious for Bitcoin and for much of the internet, since it would undermine the integrity guarantees that blocks, transactions, and many other systems rely on. But the best known generic attack requires on the order of 2^128 hash evaluations, so SHA-256 is considered secure for the foreseeable future.

Why can't I recover data from a hash? Because hash functions are one-way. They deliberately discard information about the input (compression). You cannot reverse the process mathematically. To "recover" data, you'd have to know what the input might be, hash it, and compare—essentially guessing.

Related Concepts

Public vs Private Keys in Crypto — How keys work alongside hashing in cryptographic systems
Digital Signatures for Beginners — How transaction data is hashed before signing
Crypto Addresses Explained — How addresses are derived using hash functions
What is a Crypto Wallet? — How wallets use hashing to derive addresses
Proof of Work Basics — How mining uses hashing to secure blocks
Seed Phrases Explained — How hashing converts seed phrases into keys

Summary

Cryptographic hashing is the process of converting any input data into a fixed-length output (hash) that acts as a digital fingerprint. Hash functions are one-way, deterministic, and collision-resistant—core properties that make them invaluable for blockchain security. Bitcoin uses SHA-256; Ethereum uses Keccak-256. Hashes link blocks together in an immutable chain, prevent double-spending, enable proof-of-work mining, and secure addresses. Understanding hashing is essential to understanding how blockchain remains tamper-proof: modifying an old transaction would require recalculating every subsequent hash, work that would require more computing power than the entire network combined. Hashing isn't encryption or privacy—it's a one-way fingerprint function that creates immutability through mathematics.

Next

Continue to What is a Crypto Wallet? to learn how wallets use keys and hashing to manage your cryptocurrency.
