
SHA-256 vs MD5: which hash algorithm should you use?

MD5 is everywhere — and broken for security. SHA-256 is the modern default. Here's what the difference actually means for your code, files, and passwords.

Harshil

If you've verified a downloaded file, stored a password, or deduplicated records, you've used a hash function. The question developers hit eventually: MD5 is fast, it's everywhere, and it's easy to use — so why does everyone say to use SHA-256 instead?

The short answer: MD5 is cryptographically broken for security-sensitive uses, and has been for over 20 years. SHA-256 is not. The longer answer — what "broken" actually means, what the practical consequences are, and when the choice genuinely matters — is what this guide covers.

What a cryptographic hash function does

A hash function takes any input — a string, a file, a blob of bytes — and produces a fixed-length output called a digest or hash. The properties that make it useful:

  • Deterministic — the same input always produces the same output. Run SHA-256 on the same file a million times; you get the same 64-character hex string every time.
  • One-way — you cannot recover the original input from the hash. There's no unhash() function.
  • Fixed output length — regardless of whether the input is 1 byte or 1 gigabyte, the output length is the same. MD5 always produces 128 bits (32 hex chars); SHA-256 always produces 256 bits (64 hex chars).
  • Avalanche effect — a small change in the input, even a single bit, produces a completely different output: on average about half of the output bits flip. Changing "hello" to "hellp" should produce a totally unrelated hash.
  • Collision resistant — it should be computationally infeasible to find two different inputs that produce the same hash. This is the property that MD5 loses.
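
The first four properties are easy to observe directly. The following is a minimal sketch using Python's standard hashlib module; the helper names sha256_hex and differing_bits are made up for this illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

# Deterministic: hashing the same bytes twice gives the same digest.
assert sha256_hex(b"hello") == sha256_hex(b"hello")

# Fixed length: 32 hex chars for MD5 and 64 for SHA-256, whatever the input size.
for payload in (b"a", b"a" * 1_000_000):
    print(len(hashlib.md5(payload).hexdigest()), len(sha256_hex(payload)))

# Avalanche: a one-character change should flip roughly half of the 256 output bits.
print(differing_bits(sha256_hex(b"hello"), sha256_hex(b"hellp")))
```

Collision resistance is the one property you cannot check this way, which is why it is judged by published cryptanalysis rather than by experiment.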

What a collision attack actually means

A hash collision happens when two different inputs produce the same output. Since hash functions compress arbitrary-length inputs to a fixed-length output, collisions must theoretically exist — by the pigeonhole principle, if you have more inputs than outputs, some inputs must share an output.

The security guarantee is that it should be computationally infeasible to find a collision on purpose. For a 128-bit hash like MD5, a generic birthday attack needs roughly 2^64 hash evaluations (the square root of the 2^128 possible outputs), which sounds large but is within reach of well-funded adversaries with modern hardware.
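
The square-root scaling of a birthday search is easier to believe after watching it happen on a deliberately tiny hash. The sketch below truncates SHA-256 to 32 bits (a toy digest invented for this example, not a real algorithm) and hashes distinct inputs until two of them share a truncated value. The function names are hypothetical.

```python
import hashlib
from itertools import count

def truncated_sha256(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of SHA-256(data) as an integer."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)

def find_collision(bits: int):
    """Birthday search: hash distinct inputs until two share a truncated digest."""
    seen = {}
    for i in count():
        message = str(i).encode()
        tag = truncated_sha256(message, bits)
        if tag in seen:
            # Two different messages, same truncated digest.
            return seen[tag], message, i + 1
        seen[tag] = message

first, second, attempts = find_collision(bits=32)
print(first, second, attempts)
```

With a 32-bit digest the search should finish after something on the order of 2^16 attempts rather than 2^32. The same square-root relationship is what turns MD5's 128-bit output into a 2^64 generic bound.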

Worse: researchers found structural weaknesses in MD5 that allow collisions to be found much faster than brute force. In 2004, a team led by Xiaoyun Wang announced the first full MD5 collisions, reportedly computed in about an hour on a compute cluster; Wang and Hongbo Yu published the method in 2005. By 2006, collisions could be found in under a minute on an ordinary laptop. By 2008, researchers used MD5 collisions to forge a rogue Certificate Authority certificate, a real attack that could have allowed them to impersonate any HTTPS website on the internet.

MD5 was never a NIST-approved hash function, and the IETF's RFC 6151 (2011) advises against using it wherever collision resistance is required. It's not "a little weaker than SHA-256." It's cryptographically broken for any use case where an adversary could benefit from finding a collision.

SHA-256: what the standard says

SHA-256 is part of the SHA-2 family, first published by NIST in 2001 and currently specified in FIPS 180-4 (2012, revised 2015). It produces a 256-bit (64 hex character) digest.

No practical collision attacks against SHA-256 have been published. The theoretical collision resistance is approximately 2^128 operations: even if every computer on Earth were running at full speed searching for a collision, it would take far longer than the age of the universe. For context: MD5 went from "theoretically safe" to "practically broken in hours" in about a decade of research. SHA-256 has been subject to intense cryptanalysis since the early 2000s without a practical break.

SHA-256 is approved by NIST for current cryptographic uses such as digital signatures, HMAC, and key derivation, and is required or recommended by TLS 1.3, code signing certificates, S3 request signatures (AWS Signature Version 4), HMAC authentication in most APIs, and blockchain proof-of-work systems such as Bitcoin.
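
As an illustration of the HMAC use case, here is a minimal sketch of signing and verifying a message with HMAC-SHA-256 using Python's standard hmac module. The key and message values are placeholders, and sign and verify are names chosen for this example rather than any particular API's scheme.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA-256 tag of message under key, as hex."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag_hex: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag_hex)

key = b"shared-secret"  # placeholder; real keys should be random and kept secret
tag = sign(key, b"amount=100")
print(verify(key, b"amount=100", tag), verify(key, b"amount=900", tag))  # True False
```

The constant-time comparison matters here because the tag is a secret-dependent value; a plain == can leak how many leading characters matched.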

Side-by-side comparison

| Property | MD5 | SHA-1 | SHA-256 |
|---|---|---|---|
| Output length | 128 bits (32 hex chars) | 160 bits (40 hex chars) | 256 bits (64 hex chars) |
| Speed (relative) | Fastest | Fast | Fast (slightly slower than MD5) |
| Collision resistance | Broken (practical attacks since 2004) | Broken (practical attack 2017) | Strong, no known practical attacks |
| NIST status | Not approved for crypto | Deprecated for crypto | Approved (FIPS 180-4) |
| TLS 1.3 support | Removed | Removed | Required |
| Certificates and code signing | Rejected | Rejected (browsers stopped trusting SHA-1 certificates in 2017) | Required |
| Password storage | Never appropriate | Never appropriate | Not appropriate (too fast) |

SHA-1: the middle ground you should also stop using

Between MD5 and SHA-256 sits SHA-1 (160-bit output). Many systems that migrated off MD5 stopped at SHA-1 — and that's not far enough.

In 2017, researchers at CWI Amsterdam and Google produced the first real-world SHA-1 collision, publishing two different PDF files with identical SHA-1 hashes. This was the SHAttered attack. The researchers estimated the collision required roughly 9.2 × 10^18 SHA-1 computations (about 2^63), which is expensive but within the reach of a well-funded adversary. Major browsers, which had already scheduled the deprecation, stopped trusting SHA-1 TLS certificates during 2017.

If you're auditing legacy code: find MD5 → replace with SHA-256. Find SHA-1 → also replace with SHA-256. Find SHA-256 → keep.

When MD5 is still fine

MD5 being "broken for cryptography" doesn't mean you need to rip it out of every system. For non-security uses, the practical implications of a collision are zero or close to it.

MD5 is acceptable for:

  • Detecting accidental file corruption — if a transmission error flips a few bits in a file, the MD5 will change. You're not worried about an adversary crafting a malicious file with the same MD5; you're checking for transmission noise. MD5 is fine here.
  • Cache keys and ETags — if two different strings of CSS produce the same MD5, the worst case is a client keeping a stale cached copy. That is a correctness nuisance, not a security issue.
  • Content-addressed deduplication — in a system that deduplicates files by hash, MD5 collisions mean a very rare false deduplication (two different files stored as one). For most systems, this is an operational edge case, not a vulnerability, provided the files don't come from untrusted uploaders who could craft a colliding pair on purpose.
  • Non-cryptographic IDs generated from content — if you're generating short identifiers and collision resistance against a motivated adversary doesn't matter, MD5 is still fast and convenient.
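
For the cache-key case, a sketch might look like the following. The function name etag_for is hypothetical, and the usedforsecurity=False flag (available in Python 3.9 and later) simply documents that the digest is not being relied on for security.

```python
import hashlib

def etag_for(body: bytes) -> str:
    """Build a quoted ETag value from the MD5 of a response body."""
    # usedforsecurity=False marks this as a non-cryptographic use of MD5.
    digest = hashlib.md5(body, usedforsecurity=False).hexdigest()
    return f'"{digest}"'

stylesheet_v1 = b"body { color: black; }"
stylesheet_v2 = b"body { color: navy; }"
print(etag_for(stylesheet_v1))
print(etag_for(stylesheet_v1) == etag_for(stylesheet_v2))  # False
```

If the content could come from someone with an incentive to craft collisions, swapping hashlib.md5 for hashlib.sha256 is a one-line change.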

MD5 is not acceptable for:

  • Any use where an adversary benefits from finding a collision — digital signatures, file integrity in security contexts, certificate fingerprints
  • Password storage (and neither is SHA-256 directly — more below)
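  • HMAC or authentication tokens in new designs (the collision attacks do not directly break HMAC-MD5, but there is no reason to pick it over HMAC-SHA-256)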
  • Anything described in a security standard or compliance requirement

The password hashing exception: why you shouldn't use SHA-256 for passwords either

Neither MD5 nor SHA-256 is appropriate for hashing passwords — even though SHA-256 is secure for other uses. The reason is the opposite problem: both are designed to be fast, and fast hashing is catastrophically bad for password security.

A modern GPU can compute billions of SHA-256 hashes per second. If an attacker steals a database of SHA-256 password hashes, they can try every word in a dictionary plus millions of variations in seconds. A 6-character password is crackable in minutes even with a salted SHA-256 hash.
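
The arithmetic behind that claim is short enough to check yourself. The sketch below assumes a guess rate of 10 billion SHA-256 hashes per second, which is an illustrative figure rather than a benchmark of any particular GPU, and a 95-character printable ASCII alphabet.

```python
def seconds_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length / guesses_per_second

ASSUMED_RATE = 10e9   # assumption: 10 billion SHA-256 guesses per second
PRINTABLE_ASCII = 95  # letters, digits, punctuation, and space

for length in (6, 8, 10):
    seconds = seconds_to_exhaust(PRINTABLE_ASCII, length, ASSUMED_RATE)
    print(f"{length} chars: {seconds:,.0f} s (~{seconds / 86_400:,.1f} days)")
```

Under these assumptions the full 6-character space (95^6, about 7.35 × 10^11 candidates) falls in a little over a minute. A salt does not change this: it defeats precomputed tables and forces the attacker to work on each hash separately, but every individual guess is still just as cheap.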

Password hashing requires algorithms specifically designed to be slow and memory-intensive:

| Algorithm | Notes |
|---|---|
| bcrypt | Well-established, widely supported, configurable cost factor (work factor). Good default. |
| Argon2id | Variant of Argon2, the winner of the Password Hashing Competition (2015). Memory-hard. Recommended by the OWASP Password Storage Cheat Sheet for new systems. |
| scrypt | Memory-hard, good for environments where bcrypt is unavailable. |
| PBKDF2 | Widely supported (including by FIPS-validated modules), but not memory-hard, so weaker than bcrypt/Argon2 at the same time cost. |

The pattern: SHA-256 for integrity, HMAC, and signatures. bcrypt or Argon2id for passwords.
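
To show what a deliberately slow, salted password hash looks like in code, here is a minimal sketch using scrypt, chosen only because it ships in Python's standard library (hashlib.scrypt); bcrypt and Argon2id need third-party packages. The cost parameters are assumptions for illustration and should be tuned for your hardware, and hash_password and check_password are hypothetical names.

```python
import hashlib
import hmac
import os

# Assumed cost parameters for illustration; tune them for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage alongside the user record."""
    salt = os.urandom(16)  # fresh random salt per password
    derived = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, derived

def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    derived = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(derived, expected)

salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))  # True
print(check_password("wrong guess", salt, stored))                   # False
```

Each call takes a noticeable fraction of a second and around 16 MiB of memory with these settings, which is the point: harmless for one login, very expensive for billions of guesses.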

Practical migration guide

Replacing MD5 checksums in build pipelines

If you're publishing a file with an MD5 checksum for users to verify:

# Before
md5sum myapp-2.1.0.tar.gz > myapp-2.1.0.tar.gz.md5

# After
sha256sum myapp-2.1.0.tar.gz > myapp-2.1.0.tar.gz.sha256

No data migration needed — just regenerate the checksums and update your download page.
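
On the consuming side, or in a build script without sha256sum available, the same check can be sketched in Python. The functions below are illustrative helpers, not part of any tool mentioned here; matches_published assumes the usual sha256sum line format of a hex digest followed by the filename.

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path: str, published_line: str) -> bool:
    """Compare a file against a line in sha256sum format: '<hex digest>  <filename>'."""
    published_hex = published_line.split()[0].lower()
    return sha256_file(path) == published_hex
```

A typical call would read the .sha256 file and pass its first line as published_line alongside the path of the downloaded archive.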

Replacing MD5 in database deduplication

If your system uses MD5 as a deduplication key in a database column, the migration path is:

  1. Add a new sha256_hash column
  2. Backfill it for all existing records
  3. Start writing both columns for new records
  4. Migrate application code to use the new column
  5. Drop the old column after validation

This is a standard column migration — no data is lost.
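
Steps 1 and 2 can be sketched against SQLite as follows. The table name files and its columns id, content, and md5_hash are assumptions made up for this example; the sketch also assumes content is never NULL and that the original bytes are still available to re-hash.

```python
import hashlib
import sqlite3

def backfill_sha256(conn: sqlite3.Connection, batch_size: int = 500) -> int:
    """Fill sha256_hash for rows that lack it, in batches; return rows updated."""
    updated = 0
    while True:
        rows = conn.execute(
            "SELECT id, content FROM files WHERE sha256_hash IS NULL LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return updated
        conn.executemany(
            "UPDATE files SET sha256_hash = ? WHERE id = ?",
            [(hashlib.sha256(content).hexdigest(), row_id) for row_id, content in rows],
        )
        conn.commit()
        updated += len(rows)

# Hypothetical legacy table keyed by MD5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, content BLOB, md5_hash TEXT)")
conn.executemany(
    "INSERT INTO files (content, md5_hash) VALUES (?, ?)",
    [(blob, hashlib.md5(blob).hexdigest()) for blob in (b"alpha", b"beta", b"gamma")],
)
conn.execute("ALTER TABLE files ADD COLUMN sha256_hash TEXT")  # step 1
print(backfill_sha256(conn, batch_size=2))                     # step 2
```

Batching keeps each transaction short, so the backfill can run alongside normal traffic and be resumed if it is interrupted.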

Replacing MD5 password hashes (legacy PHP apps)

Many early web applications (particularly PHP apps using md5($password)) stored passwords as raw, unsalted MD5. Migration:

  1. Add a new password_hash column alongside the old password_md5 column
  2. When a user successfully logs in (so you have their plaintext password): verify their password against the old MD5, then immediately re-hash with bcrypt and store it in the new column
  3. After a migration period, force a password reset for accounts that still have only an MD5 hash, then disable the MD5 fallback path
  4. Drop the old column

Never migrate the MD5 hashes directly to bcrypt — you'd be hashing an MD5 hash, not the original password. The migration must happen on the plaintext, which is only available at login time.
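
The login-time upgrade can be sketched as below. To keep the example runnable with only the standard library, scrypt stands in for bcrypt; the user dictionary with password_md5, password_hash, and salt keys is a hypothetical stand-in for a database row, and slow_hash and login are names invented for this sketch.

```python
import hashlib
import hmac
import os

def slow_hash(password: str, salt: bytes) -> bytes:
    """Stand-in for bcrypt: stdlib scrypt with assumed cost parameters."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32)

def login(user: dict, password: str) -> bool:
    """Verify a login and upgrade a legacy MD5 hash on success."""
    if user.get("password_hash") is not None:
        # Already migrated: only the slow hash is consulted.
        return hmac.compare_digest(slow_hash(password, user["salt"]), user["password_hash"])
    legacy = hashlib.md5(password.encode("utf-8")).hexdigest()
    if user.get("password_md5") is None or not hmac.compare_digest(legacy, user["password_md5"]):
        return False
    # The plaintext is in hand and verified, so re-hash it and retire the MD5 value.
    user["salt"] = os.urandom(16)
    user["password_hash"] = slow_hash(password, user["salt"])
    user["password_md5"] = None
    return True

user = {"password_md5": hashlib.md5(b"hunter2").hexdigest(), "password_hash": None, "salt": None}
print(login(user, "hunter2"), user["password_md5"])  # True None
```

Clearing the MD5 value as soon as the upgrade succeeds means a later database leak exposes only the slow hash for that account.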

How to generate and compare hashes in your browser

The Stax Hash Generator computes MD5, SHA-1, SHA-256, SHA-384, SHA-512, and others entirely client-side. Use cases:

  • Verify a downloaded file — generate SHA-256 of the local file, compare against the hash the project publishes alongside the download
  • Identify an algorithm from an existing hash — the hex length narrows it down: 32 chars suggests MD5, 40 SHA-1, 64 SHA-256, 96 SHA-384, 128 SHA-512 (other algorithms share these lengths; SHA3-256, for example, is also 64)
  • Generate deterministic content IDs — hash a string or file to get a consistent fingerprint
  • Test your own hashing implementation — verify that your code produces the same output as a known-good implementation

All computation happens locally. File contents and strings are never uploaded — relevant when you're hashing files that contain source code, credentials, certificates, or customer data.


Harshil writes about privacy-first tools, developer productivity, and the trade-offs between browser-based and uploaded utilities.


Sources & methodology

Last reviewed: 2026-05-14. Hash algorithm guidance is stable; verify NIST advisories for compliance-sensitive applications.

Harshil

Developer & Founder, stax.tools

Harshil is the developer behind stax.tools, building privacy-first tools that run entirely in your browser.
