Building a Bitcoin wallet is one of the best ways to deeply understand how Bitcoin works. Not a wrapper around an API — an actual wallet that derives keys, constructs transactions, manages UTXOs, and signs. Here's what I learned building Invebit, a cross-platform Bitcoin wallet with React Native, Chrome Extension, and Node.js backend.
HD Key Derivation: BIP-32/39/44
BIP-39: Entropy to Mnemonic to Seed
Everything starts with random entropy. BIP-39 defines a precise algorithm to convert raw randomness into human-readable words, then into a cryptographic seed:
Step 1 — Generate entropy (128-256 bits, must be a multiple of 32):
| Entropy Bits | Checksum Bits | Total Bits | Words |
|---|---|---|---|
| 128 | 4 | 132 | 12 |
| 160 | 5 | 165 | 15 |
| 192 | 6 | 198 | 18 |
| 224 | 7 | 231 | 21 |
| 256 | 8 | 264 | 24 |
Step 2 — Compute checksum: Take the first ENT/32 bits of SHA-256(entropy) and append to the entropy.
Step 3 — Split into 11-bit groups: Each 11-bit segment maps to a word in a 2048-word wordlist (index 0-2047). The wordlist is sorted to enable binary search and trie compression.
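Steps 1 to 3 can be sketched with Node's built-in crypto module. This is a minimal illustration rather than Invebit's actual code: the helper name `entropyToWordIndices` is hypothetical, and it returns wordlist indices instead of words so the 2048-entry list does not need to be embedded here.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: maps raw entropy to BIP-39 wordlist indices (0-2047).
function entropyToWordIndices(entropy: Uint8Array): number[] {
  const entBits = entropy.length * 8;
  if (entBits < 128 || entBits > 256 || entBits % 32 !== 0) {
    throw new Error("Entropy must be 128-256 bits and a multiple of 32");
  }
  const toBits = (bytes: Uint8Array): string =>
    Array.from(bytes, (b) => b.toString(2).padStart(8, "0")).join("");

  // Step 2: checksum = first ENT/32 bits of SHA-256(entropy)
  const hash = createHash("sha256").update(entropy).digest();
  const bits = toBits(entropy) + toBits(hash).slice(0, entBits / 32);

  // Step 3: every 11-bit group is one wordlist index
  const indices: number[] = [];
  for (let i = 0; i < bits.length; i += 11) {
    indices.push(parseInt(bits.slice(i, i + 11), 2));
  }
  return indices;
}
```

Looking up each index in the chosen wordlist then gives the mnemonic. With 16 zero bytes as input, the result is eleven 0s followed by 3, which in the English list is "abandon" eleven times and then "about".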
Step 4 — Derive seed via PBKDF2:
PBKDF2(
  password: mnemonic_sentence (UTF-8 NFKD normalized),
  salt: "mnemonic" + passphrase (UTF-8 NFKD normalized),
  iterations: 2048,
  PRF: HMAC-SHA512,
  output: 512 bits
)
The passphrase acts as a "25th word." Same mnemonic with different passphrase = completely different wallet. Every passphrase produces a valid seed — there's no way to tell if a passphrase is "correct" without deriving keys and checking for funds. This enables plausible deniability: a coerced user can reveal a decoy passphrase that opens an empty wallet.
Critical detail: The mnemonic-to-seed conversion is independent from mnemonic generation. This means you can use any wordlist or even raw sentences — the seed derivation doesn't care. The checksum only matters for validating that a mnemonic was generated correctly.
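The seed derivation maps directly onto `pbkdf2Sync` from Node's crypto module. A minimal sketch, assuming a Node.js runtime; the function name `mnemonicToSeed` is illustrative and no checksum validation is performed, which mirrors the point that seed derivation is independent of mnemonic generation.

```typescript
import { pbkdf2Sync } from "node:crypto";

// Illustrative helper: BIP-39 mnemonic + optional passphrase -> 64-byte seed.
function mnemonicToSeed(mnemonic: string, passphrase: string = ""): Buffer {
  // Both inputs are NFKD-normalized before being UTF-8 encoded
  const password = Buffer.from(mnemonic.normalize("NFKD"), "utf8");
  const salt = Buffer.from(("mnemonic" + passphrase).normalize("NFKD"), "utf8");
  // 2048 rounds of HMAC-SHA512, 64 bytes (512 bits) of output
  return pbkdf2Sync(password, salt, 2048, 64, "sha512");
}

const words =
  "abandon abandon abandon abandon abandon abandon " +
  "abandon abandon abandon abandon abandon about";
const mainSeed = mnemonicToSeed(words);
const decoySeed = mnemonicToSeed(words, "decoy passphrase");
console.log(mainSeed.equals(decoySeed)); // false: a different passphrase is a different wallet
```

Every passphrase yields some 64-byte seed, so the only way to tell a "wrong" passphrase from a right one is to derive keys and look for funds.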
BIP-32: Hierarchical Deterministic Derivation
From the 512-bit seed, BIP-32 derives a practically unlimited tree of key pairs using HMAC-SHA512 (every node can have up to 2^32 children).
Master key generation:
I = HMAC-SHA512(key = "Bitcoin seed", data = seed)
I_L (left 256 bits) = master private key (must be valid: non-zero and less than curve order n)
I_R (right 256 bits) = master chain code
An extended key is a key + chain code pair. The chain code adds entropy to derivation — without it, anyone with a child key could derive siblings.
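Master key generation is a single HMAC call. The sketch below assumes Node's crypto module; `masterKeyFromSeed` and `ExtendedPrivateKey` are illustrative names, not part of any library.

```typescript
import { createHmac } from "node:crypto";

// Order n of the secp256k1 curve
const CURVE_ORDER = BigInt(
  "0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141"
);

interface ExtendedPrivateKey {
  privateKey: Buffer; // 32 bytes
  chainCode: Buffer; // 32 bytes
}

function masterKeyFromSeed(seed: Uint8Array): ExtendedPrivateKey {
  const I = createHmac("sha512", "Bitcoin seed").update(seed).digest();
  const privateKey = I.subarray(0, 32); // I_L
  const chainCode = I.subarray(32); // I_R
  const k = BigInt("0x" + privateKey.toString("hex"));
  if (k === BigInt(0) || k >= CURVE_ORDER) {
    // Astronomically unlikely, but the spec says such a seed is invalid
    throw new Error("Invalid master key, use a different seed");
  }
  return { privateKey, chainCode };
}
```

Feeding in the 64-byte BIP-39 seed gives the root of the tree; the unit tests mentioned later compare this output against the test vectors published in the BIP-32 spec.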
Child key derivation (CKD) — the core algorithm:
For normal (non-hardened) child i (where i < 2^31):
I = HMAC-SHA512(
  key: parent_chain_code,
  data: serialize_point(parent_public_key) || serialize_32(i)
)
child_private_key = parse_256(I_L) + parent_private_key (mod n)
child_chain_code = I_R
For hardened child i (where i >= 2^31, written as i'):
I = HMAC-SHA512(
  key: parent_chain_code,
  data: 0x00 || serialize_256(parent_private_key) || serialize_32(i)
)
child_private_key = parse_256(I_L) + parent_private_key (mod n)
child_chain_code = I_R
Why hardened derivation matters: Normal derivation uses the parent public key as input. This means anyone with an extended public key (xpub) can derive all non-hardened child public keys — useful for watch-only wallets. But it also means: if a single non-hardened child private key leaks alongside the parent xpub, an attacker can compute the parent private key and derive every child key.
Hardened derivation uses the parent private key as input, breaking this chain. Rule: always use hardened derivation for account-level separation.
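Both derivation variants fit in one function that only differs in the HMAC input. This is a sketch under assumptions: it uses Node's `createECDH("secp256k1")` purely to compute the compressed public key, and the names `deriveChild`, `ExtendedKey`, and `HARDENED_OFFSET` are illustrative. A production wallet would normally rely on an audited library instead.

```typescript
import { createECDH, createHmac } from "node:crypto";

const HARDENED_OFFSET = 0x80000000; // 2^31
const N = BigInt(
  "0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141"
);

interface ExtendedKey {
  privateKey: Buffer;
  chainCode: Buffer;
}

// 33-byte compressed public key for a private key
function compressedPublicKey(privateKey: Buffer): Buffer {
  const ecdh = createECDH("secp256k1");
  ecdh.setPrivateKey(privateKey);
  return Buffer.from(ecdh.getPublicKey("hex", "compressed"), "hex");
}

function deriveChild(parent: ExtendedKey, index: number): ExtendedKey {
  const indexBytes = Buffer.alloc(4);
  indexBytes.writeUInt32BE(index, 0);
  // Hardened: 0x00 || parent private key. Normal: parent public key.
  const data =
    index >= HARDENED_OFFSET
      ? Buffer.concat([Buffer.from([0x00]), parent.privateKey, indexBytes])
      : Buffer.concat([compressedPublicKey(parent.privateKey), indexBytes]);
  const I = createHmac("sha512", parent.chainCode).update(data).digest();
  const IL = BigInt("0x" + I.subarray(0, 32).toString("hex"));
  const child = (IL + BigInt("0x" + parent.privateKey.toString("hex"))) % N;
  if (IL >= N || child === BigInt(0)) {
    throw new Error("Invalid child key, skip to the next index");
  }
  return {
    privateKey: Buffer.from(child.toString(16).padStart(64, "0"), "hex"),
    chainCode: I.subarray(32),
  };
}
```

Deriving `m/0'/1` is then `deriveChild(deriveChild(master, HARDENED_OFFSET + 0), 1)`. Only the normal branch can be reproduced from an xpub, because the hardened branch needs the parent private key as input.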
Extended key serialization (78 bytes, Base58Check encoded):
4 bytes: version (xpub: 0x0488B21E, xprv: 0x0488ADE4)
1 byte: depth (0x00 for master)
4 bytes: parent fingerprint (first 4 bytes of Hash160(parent_pubkey))
4 bytes: child number
32 bytes: chain code
33 bytes: key data (compressed public key OR 0x00 || private key)
This produces the xpub/xprv strings (111 characters) you see in wallet software.
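The 78-byte layout and the Base58Check step can be sketched as follows. The helper names (`base58Check`, `serializeXprv`) are hypothetical, and the Base58 encoder is a simple BigInt version chosen for readability, not speed.

```typescript
import { createHash } from "node:crypto";

const ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";
const sha256 = (data: Uint8Array): Buffer =>
  createHash("sha256").update(data).digest();

// Base58Check: payload || first 4 bytes of SHA-256(SHA-256(payload))
function base58Check(payload: Buffer): string {
  const bytes = Buffer.concat([payload, sha256(sha256(payload)).subarray(0, 4)]);
  let n = BigInt("0x" + bytes.toString("hex"));
  const base = BigInt(58);
  let out = "";
  while (n > BigInt(0)) {
    out = ALPHABET[Number(n % base)] + out;
    n = n / base;
  }
  // Each leading zero byte is encoded as a leading "1"
  for (const b of bytes) {
    if (b !== 0) break;
    out = "1" + out;
  }
  return out;
}

function serializeXprv(
  depth: number,
  parentFingerprint: Buffer, // 4 bytes, all zero for the master key
  childNumber: number,
  chainCode: Buffer, // 32 bytes
  privateKey: Buffer // 32 bytes
): string {
  const buf = Buffer.alloc(78);
  buf.writeUInt32BE(0x0488ade4, 0); // version: xprv
  buf.writeUInt8(depth, 4);
  parentFingerprint.copy(buf, 5);
  buf.writeUInt32BE(childNumber, 9);
  chainCode.copy(buf, 13);
  buf[45] = 0x00; // private keys are prefixed with 0x00 to fill 33 bytes
  privateKey.copy(buf, 46);
  return base58Check(buf);
}
```

For a master key the call is `serializeXprv(0, Buffer.alloc(4), 0, chainCode, privateKey)`. Serializing an xpub works the same way with version `0x0488B21E` and the 33-byte compressed public key as key data.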
BIP-44: Multi-Account Hierarchy
BIP-44 standardizes the derivation path into 5 levels:
m / purpose' / coin_type' / account' / change / address_index
m / 44' / 0' / 0' / 0 / 0
| Level | Value | Hardened | Purpose |
|---|---|---|---|
| purpose | 44' | Yes | BIP-44 compliance marker |
| coin_type | 0' (Bitcoin), 1' (testnet) | Yes | Prevents cross-chain address reuse |
| account | 0', 1', ... | Yes | Independent user accounts |
| change | 0 (external), 1 (internal) | No | Receiving vs change addresses |
| address_index | 0, 1, ... | No | Sequential address generation |
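A path string has to become a list of 32-bit child indices before it can drive derivation. A minimal parser sketch; `parsePath` and `bip44Path` are illustrative names, and only the `'` hardened marker is handled (some tools also accept `h`).

```typescript
const HARDENED = 0x80000000; // 2^31, added to an index to mark it hardened

// "m/44'/0'/0'/0/0" -> [44 + 2^31, 0 + 2^31, 0 + 2^31, 0, 0]
function parsePath(path: string): number[] {
  const parts = path.split("/");
  if (parts[0] !== "m") throw new Error("Path must start with m");
  return parts.slice(1).map((part) => {
    if (!/^\d+'?$/.test(part)) throw new Error(`Invalid path segment: ${part}`);
    const isHardened = part.endsWith("'");
    const n = Number(isHardened ? part.slice(0, -1) : part);
    if (n >= HARDENED) throw new Error(`Index out of range: ${part}`);
    return isHardened ? n + HARDENED : n;
  });
}

// Builds a BIP-44 Bitcoin mainnet path for an account, chain, and address index
function bip44Path(account: number, change: 0 | 1, addressIndex: number): string {
  return `m/44'/0'/${account}'/${change}/${addressIndex}`;
}

console.log(parsePath(bip44Path(0, 0, 0)));
```

The first three indices come out at or above 2^31 (hardened) and the last two below it, which matches the table: an account-level xpub is enough to derive every receiving and change address of that account.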
Account discovery algorithm: When importing a seed, scan account 0 first. Check its external chain addresses up to the gap limit (20 consecutive unused addresses). If account 0 has transactions, increment to account 1 and repeat. Stop when an account has no transaction history.
The gap limit of 20 means: never show a user more than 20 unused receiving addresses at once. If they skip addresses, the wallet import might not find their funds.
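The discovery loop can be sketched independently of any backend. Here the `hasHistory` callbacks are hypothetical stand-ins for an indexer lookup, and they are synchronous only to keep the example short; a real wallet would await a network call.

```typescript
// Scans one chain until `gapLimit` consecutive unused addresses are seen.
// Returns the highest used index, or -1 if the chain was never used.
function findLastUsedIndex(
  hasHistory: (addressIndex: number) => boolean,
  gapLimit: number = 20
): number {
  let lastUsed = -1;
  let consecutiveUnused = 0;
  for (let i = 0; consecutiveUnused < gapLimit; i++) {
    if (hasHistory(i)) {
      lastUsed = i;
      consecutiveUnused = 0;
    } else {
      consecutiveUnused++;
    }
  }
  return lastUsed;
}

// Walks accounts 0, 1, 2, ... and stops at the first one without history.
function discoverAccounts(
  hasHistory: (account: number, addressIndex: number) => boolean,
  gapLimit: number = 20
): number[] {
  const accounts: number[] = [];
  for (let account = 0; ; account++) {
    if (findLastUsedIndex((i) => hasHistory(account, i), gapLimit) < 0) break;
    accounts.push(account);
  }
  return accounts;
}
```

An address at index 25 is never found if indices 1 through 20 are all unused, which is exactly why the wallet should not hand out more than 20 unused addresses in a row.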
Key insight: Always generate a new address for each transaction. Address reuse hurts privacy (observers can link your transactions) and complicates UTXO management.
UTXO Management
This is where most wallet tutorials fall short. Bitcoin doesn't have "balances" — it has Unspent Transaction Outputs (UTXOs). When your wallet shows "0.5 BTC", it's actually summing up discrete UTXOs you control.
How UTXOs Work
Every Bitcoin transaction consumes inputs (references to previous UTXOs) and creates outputs (new UTXOs). Each output has:
- An amount in satoshis
- A locking script (scriptPubKey) that defines spending conditions
An input provides:
- A reference to a previous output (txid + vout index)
- An unlocking script (scriptSig/witness) proving authorization to spend
UTXOs are atomic — you can't spend half a UTXO. If you have a 1 BTC UTXO and want to send 0.3 BTC, the transaction consumes the entire 1 BTC UTXO and creates two outputs: 0.3 BTC to the recipient and ~0.6999 BTC back to yourself as change (minus fee).
Output Types We Supported
| Type | Script Pattern | vBytes per Input | Use Case |
|---|---|---|---|
| P2PKH | OP_DUP OP_HASH160 <hash> OP_EQUALVERIFY OP_CHECKSIG | ~148 | Legacy, widest compatibility |
| P2SH-P2WPKH | OP_HASH160 <hash> OP_EQUAL (wrapping SegWit) | ~91 | Nested SegWit, backward compatible |
| P2WPKH | OP_0 <20-byte-hash> | ~68 | Native SegWit, cheap and widely supported |
| P2TR | OP_1 <32-byte-key> | ~57.5 | Taproot, cheapest input, latest |
We defaulted to P2WPKH (native SegWit) — ~54% cheaper than P2PKH and supported by all modern wallets. P2TR (Taproot) is even cheaper but had limited ecosystem support when we launched.
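The "~54% cheaper" figure follows directly from the per-input sizes in the table. A small sketch that makes the arithmetic explicit; the constant and function names are illustrative, and the comparison covers input size only (output sizes differ by type as well).

```typescript
// Approximate vBytes per input, taken from the table above
const INPUT_VBYTES = {
  P2PKH: 148,
  "P2SH-P2WPKH": 91,
  P2WPKH: 68,
  P2TR: 57.5,
} as const;

type ScriptType = keyof typeof INPUT_VBYTES;

// Fraction of input fee saved by spending `to` instead of `from`
function inputSavings(from: ScriptType, to: ScriptType): number {
  return 1 - INPUT_VBYTES[to] / INPUT_VBYTES[from];
}

console.log(Math.round(inputSavings("P2PKH", "P2WPKH") * 100)); // 54
console.log(Math.round(inputSavings("P2PKH", "P2TR") * 100)); // 61
```

At a fixed fee rate the saving scales with the number of inputs, so it matters most for wallets that receive many small payments.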
Coin Selection
When constructing a transaction, you need to select which UTXOs to spend. This is a constrained optimization problem: minimize fees while avoiding dust creation.
The three algorithms I evaluated:
1. Largest-first (what we shipped):
interface UTXO {
  txid: string;
  vout: number;
  value: number; // satoshis
}

const DUST_LIMIT = 294; // satoshis, assuming a P2WPKH change output

function selectUtxos(
  utxos: UTXO[],
  targetAmount: number, // satoshis
  feeRate: number // sat/vByte
): { selected: UTXO[]; fee: number; change: number } {
  // Spend the largest UTXOs first to keep the input count low
  const sorted = [...utxos].sort((a, b) => b.value - a.value);
  const selected: UTXO[] = [];
  let total = 0;
  for (const utxo of sorted) {
    selected.push(utxo);
    total += utxo.value;
    // Fee for the inputs picked so far plus two outputs (recipient + change)
    const fee = estimateFee(selected.length, 2, feeRate);
    if (total >= targetAmount + fee) {
      const change = total - targetAmount - fee;
      if (change < DUST_LIMIT) {
        // Donate dust to the miner: cheaper than creating a change output
        return { selected, fee: fee + change, change: 0 };
      }
      return { selected, fee, change };
    }
  }
  throw new Error("Insufficient funds");
}
Pros: Few inputs (lower fees), simple. Cons: Creates small change UTXOs that accumulate over time.
2. Branch and bound (Bitcoin Core's approach): Searches for a combination that exactly matches target + fee, eliminating the change output entirely. Falls back to random selection if no exact match exists. Better long-term UTXO health, but computationally expensive for wallets with thousands of UTXOs.
3. Random selection: Pick UTXOs randomly until the target is met. Better privacy (no predictable spending pattern), but unpredictable fees and change sizes.
We shipped largest-first + periodic dust consolidation — a background job that combines small UTXOs into larger ones during low-fee periods (weekends, typically 1-5 sat/vByte).
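For comparison, the exact-match idea behind branch and bound can be sketched as a bounded depth-first search. This is a simplified illustration, not Bitcoin Core's implementation: it works on plain satoshi values, ignores per-input fees, and the names `findExactMatch`, `tolerance`, and `maxTries` are made up for the example.

```typescript
// Looks for a subset of values whose sum lands in [target, target + tolerance].
// `tolerance` stands in for the cost of creating and later spending change.
function findExactMatch(
  values: number[],
  target: number,
  tolerance: number,
  maxTries: number = 100_000
): number[] | null {
  const sorted = [...values].sort((a, b) => b - a);
  // remaining[i] = sum of sorted[i..], used to prune hopeless branches
  const remaining: number[] = new Array(sorted.length + 1).fill(0);
  for (let i = sorted.length - 1; i >= 0; i--) {
    remaining[i] = remaining[i + 1] + sorted[i];
  }
  const picked: number[] = [];
  let tries = 0;

  const search = (i: number, sum: number): number[] | null => {
    if (++tries > maxTries) return null; // give up, caller falls back
    if (sum > target + tolerance) return null; // overshoot: prune
    if (sum >= target) return [...picked]; // inside the window: done
    if (i >= sorted.length || sum + remaining[i] < target) return null;
    picked.push(sorted[i]); // branch 1: include sorted[i]
    const withCurrent = search(i + 1, sum + sorted[i]);
    if (withCurrent) return withCurrent;
    picked.pop(); // branch 2: skip sorted[i]
    return search(i + 1, sum);
  };
  return search(0, 0);
}
```

A `null` result means no changeless combination was found within the budget, at which point a wallet would fall back to another strategy such as the largest-first selection shown earlier.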
Fee Estimation
Fee estimation is critical for UX. Too low = stuck transaction (can be unconfirmed for days). Too high = wasted money.
function estimateFee(
  numInputs: number,
  numOutputs: number,
  feeRate: number // sat/vByte
): number {
  // Virtual size (vBytes) of a transaction with only P2WPKH inputs and outputs
  const overhead = 10.5; // version (4) + marker/flag (0.5) + locktime (4) + input count (1) + output count (1)
  const inputVBytes = numInputs * 68; // P2WPKH: ~68 vBytes per input
  const outputVBytes = numOutputs * 31; // P2WPKH: 31 vBytes per output
  const vSize = overhead + inputVBytes + outputVBytes;
  return Math.ceil(vSize * feeRate);
}
Fee sources: We pull estimates from multiple providers and take the median:
- mempool.space API: Real-time mempool-based estimates (fastest, economy, minimum)
- Bitcoin Core estimatesmartfee: Historical block-based estimates
- Blockstream/Esplora API: Backup source
Always show the user an estimated fee before signing. For Invebit, we offered three tiers: "Fast" (next block, ~10 min), "Normal" (within 3 blocks, ~30 min), "Economy" (within 6 blocks, ~1 hour).
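Taking the median across providers is simple, but failed or nonsensical responses need to be filtered out first. A sketch of that step; the function name `medianFeeRate` is illustrative and the provider calls themselves are left out.

```typescript
// Median of the usable fee-rate estimates (sat/vByte) from several providers.
// Non-finite or non-positive values, e.g. from a failed request, are ignored.
function medianFeeRate(estimates: number[]): number {
  const valid = estimates
    .filter((rate) => Number.isFinite(rate) && rate > 0)
    .sort((a, b) => a - b);
  if (valid.length === 0) throw new Error("No usable fee estimate");
  const mid = Math.floor(valid.length / 2);
  return valid.length % 2 === 1
    ? valid[mid]
    : (valid[mid - 1] + valid[mid]) / 2;
}

console.log(medianFeeRate([12, 10, 30])); // 12: the outlier 30 has no influence
```

The median is preferable to the mean here because one misbehaving provider cannot drag the result up or down.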
Dust Limit
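A UTXO is "dust" if it costs more to spend than it's worth. Under Bitcoin Core's default relay policy (a dust relay fee of 3 sat/vByte), the threshold is 546 satoshis for P2PKH outputs (540 for P2SH) and 294 satoshis for P2WPKH; nodes running the default policy will not relay transactions that create outputs below it. Creating dust outputs wastes block space and burdens the UTXO set.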
In our coin selection, if the change would be below the dust limit, we add it to the miner fee instead of creating a tiny UTXO that's uneconomical to spend later.
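The 546 and 294 satoshi figures are not arbitrary: Bitcoin Core's default policy multiplies the size of the output plus the size of the input needed to spend it by a dust relay fee of 3 sat/vByte. A sketch of that calculation, with illustrative constant and function names:

```typescript
const DUST_RELAY_FEE = 3; // sat/vByte, Bitcoin Core's default dust relay fee

// Serialized size of the output itself (8-byte amount + script length + script)
const OUTPUT_VBYTES = { P2PKH: 34, P2WPKH: 31 } as const;
// Size Bitcoin Core assumes for the input that later spends it
// (it rounds the witness discount down, hence 67 rather than ~68)
const SPEND_VBYTES = { P2PKH: 148, P2WPKH: 67 } as const;

type DustScriptType = keyof typeof OUTPUT_VBYTES;

function dustThreshold(type: DustScriptType): number {
  return (OUTPUT_VBYTES[type] + SPEND_VBYTES[type]) * DUST_RELAY_FEE;
}

function isDust(valueSats: number, type: DustScriptType): boolean {
  return valueSats < dustThreshold(type);
}

console.log(dustThreshold("P2PKH")); // 546
console.log(dustThreshold("P2WPKH")); // 294
```

Because the threshold is relay policy rather than a consensus rule, it can differ between nodes; a wallet may want a more conservative limit when fee rates are high.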
RGB Protocol: Assets on Bitcoin
RGB is a protocol for issuing and transferring assets on Bitcoin using client-side validation. Unlike Ethereum's ERC-20 tokens where all state lives on-chain, RGB stores state off-chain and commits only cryptographic proofs to Bitcoin transactions.
Three Core Concepts
1. Client-Side Validation: Parties validate only their relevant transaction history, not the entire global state. When Bob receives RGB tokens, he validates the chain of ownership from genesis to his transfer — he doesn't need to validate every transfer that ever happened in the contract.
2. Single-Use Seals: A cryptographic primitive that proves a message was published exactly once. In RGB, Bitcoin UTXOs serve as seals — spending a UTXO "closes" the seal and anchors the state change to Bitcoin's blockchain. This prevents double-spending without requiring a global ledger.
3. Deterministic Bitcoin Commitments (DBC): RGB data is committed to Bitcoin transactions via two schemes:
- Opret: 34-byte commitment in an OP_RETURN output, simple but visible
- Tapret: 64-byte commitment hidden in a taproot script path, invisible to chain analysis, no additional blockchain footprint
How Transfers Work
Alice wants to send 200 RGB tokens to Bob:
- Alice creates a state transition: New state assigns 200 tokens to Bob's UTXO (concealed — the UTXO is blinded so observers can't link it)
- Alice builds a witness transaction: Spends her UTXO (closing the single-use seal), embedding the RGB commitment via Opret or Tapret
- Alice creates a consignment: A package containing the full history from genesis to Bob's transfer — every state transition in the chain
- Alice sends the consignment to Bob (off-chain, directly)
- Bob validates: Verifies every transition in the consignment against Bitcoin's blockchain, confirming seals were closed correctly
- Bob's wallet stores the validated state: He now owns the tokens, anchored to his UTXO
Genesis (1000 tokens → Alice's UTXO_A)
│
├── Transition 1: 200 tokens → Bob's UTXO_B (committed via Tapret in tx spending UTXO_A)
│ └── Bob validates: genesis → transition 1 (checks UTXO_A was spent, commitment is valid)
│
└── Transition 2: 800 tokens → Alice's UTXO_C (change, same transaction)
Key difference from ERC-20: On Ethereum, the token contract maintains a global balanceOf mapping visible to everyone. In RGB, there's no on-chain state — an observer sees a normal Bitcoin transaction with no indication that an asset transfer occurred. The privacy comes from the fact that contract data exists only between the parties involved.
Multi-Protocol Commitments (MPC)
Multiple RGB contracts can batch their state transitions into a single Merkle tree, committed in one Bitcoin transaction. This means one transaction can carry state changes for hundreds of different contracts — massive scalability compared to one-contract-per-transaction models.
Why RGB Over Alternatives
| | Omni Layer | Liquid | RGB |
|---|---|---|---|
| State storage | On-chain (OP_RETURN) | Sidechain | Client-side |
| Consensus | Bitcoin miners | Federation (11/15 multisig) | Individual validation |
| Privacy | Transparent | Confidential transactions | Private by default |
| Scalability | Limited by block space | Sidechain throughput | Unbounded (off-chain) |
| Trust model | Bitcoin security | Federation trust | Bitcoin security + individual validation |
| Smart contracts | Limited | Simplicity (limited) | AluVM (Turing-equivalent) |
RGB's smart contracts run on AluVM — a register-based virtual machine that's Turing-equivalent but designed for formal verification. Simple contracts use Rust macros; complex ones use AluAssembly (with a higher-level language called Contractum in development).
Security Architecture
Building a wallet means handling private keys. The attack surface includes the device, the network, the supply chain, and the user.
Key Storage
| Platform | Storage | Encryption | Biometric Unlock |
|---|---|---|---|
| iOS | Keychain (Secure Enclave) | Hardware-backed AES-256 | Face ID / Touch ID |
| Android | Keystore (TEE/StrongBox) | Hardware-backed | Fingerprint / Face |
| Chrome Extension | chrome.storage.local | AES-256-GCM (app-level) | N/A |
Never store keys in plaintext, localStorage, or SharedPreferences. For the Chrome Extension, we implemented app-level encryption since Chrome's storage API doesn't provide hardware-backed encryption.
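App-level encryption of this kind usually means deriving a key from the user's password and sealing the secret with an authenticated cipher. The sketch below uses Node's crypto module so it runs standalone; inside a browser extension the same scheme would go through the Web Crypto API instead. The function names, the blob layout, and the choice of scrypt as the KDF are assumptions for illustration, not a description of Invebit's implementation.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hex-encoded fields, safe to store as JSON (e.g. in chrome.storage.local)
interface EncryptedBlob {
  salt: string;
  iv: string;
  tag: string;
  ciphertext: string;
}

function encryptSecret(secret: Uint8Array, password: string): EncryptedBlob {
  const salt = randomBytes(16);
  const iv = randomBytes(12); // 96-bit nonce, must never repeat for the same key
  const key = scryptSync(password, salt, 32); // 256-bit key from the password
  try {
    const cipher = createCipheriv("aes-256-gcm", key, iv);
    const ciphertext = Buffer.concat([cipher.update(secret), cipher.final()]);
    return {
      salt: salt.toString("hex"),
      iv: iv.toString("hex"),
      tag: cipher.getAuthTag().toString("hex"),
      ciphertext: ciphertext.toString("hex"),
    };
  } finally {
    key.fill(0);
  }
}

function decryptSecret(blob: EncryptedBlob, password: string): Buffer {
  const key = scryptSync(password, Buffer.from(blob.salt, "hex"), 32);
  try {
    const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(blob.iv, "hex"));
    decipher.setAuthTag(Buffer.from(blob.tag, "hex"));
    // final() throws if the password is wrong or the data was tampered with
    return Buffer.concat([
      decipher.update(Buffer.from(blob.ciphertext, "hex")),
      decipher.final(),
    ]);
  } finally {
    key.fill(0);
  }
}
```

GCM's authentication tag is what makes a wrong password or a modified blob fail loudly instead of returning garbage bytes that could be mistaken for a key.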
Signing Isolation
Transaction signing happens in an isolated context:
- Load private key from secure storage
- Deserialize the PSBT (Partially Signed Bitcoin Transaction)
- Sign all inputs
- Wipe the private key from memory immediately
- Return the signed transaction
We used PSBT (BIP-174) throughout — it separates transaction construction from signing, enabling hardware wallet support and multi-sig flows.
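The load, use, wipe sequence is easiest to enforce with a wrapper that owns the key's lifetime. A minimal sketch with hypothetical names; the `loadKey` and `use` callbacks stand in for secure-storage access and PSBT signing, which are not shown.

```typescript
// Runs `use` with a freshly loaded key and zeroes the key bytes afterwards,
// even if `use` throws.
function withEphemeralKey<T>(
  loadKey: () => Uint8Array,
  use: (key: Uint8Array) => T
): T {
  const key = loadKey();
  try {
    return use(key);
  } finally {
    key.fill(0); // overwrite the key material in place
  }
}
```

This only clears the one buffer the wrapper controls. JavaScript gives no guarantee about copies made by the runtime or by libraries, and keys held as strings cannot be wiped at all, so key material should stay in typed arrays from storage to signature.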
Supply Chain Security
Every dependency is a potential attack vector. Our approach:
- Pinned all dependency versions: no ^ or ~ in package.json for crypto packages
- Audited critical packages: bip39, bitcoinjs-lib, tiny-secp256k1, @noble/secp256k1
- Reproducible builds: same source always produces the same binary
- No post-install scripts for crypto dependencies
- Subresource Integrity (SRI) for the Chrome Extension
The 2024 xz backdoor and multiple npm supply chain attacks validated this paranoia.
Transaction Malleability
Before SegWit, the scriptSig (which holds the signatures) was part of the data hashed into the txid, but a signature cannot cover itself. A third party could re-encode a signature or tweak the unlocking script without invalidating it, and the txid would change. This breaks any system that tracks unconfirmed transactions by txid.
Mitigation: We use SegWit (P2WPKH) exclusively, which moves signature data to the witness and computes txid without it. For any remaining edge cases, we track transactions by the UTXOs they spend (inputs), not by txid.
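Tracking by spent inputs can be as simple as building a key from the outpoints. A sketch with illustrative names (`Outpoint`, `spendKey`); it assumes the wallet knows the full input list of every transaction it broadcasts.

```typescript
interface Outpoint {
  txid: string; // txid of the output being spent
  vout: number; // output index within that transaction
}

// Stable identifier for a pending transaction based on what it spends.
// A malleated copy has a different txid but spends the same outpoints.
function spendKey(inputs: Outpoint[]): string {
  return inputs
    .map((input) => `${input.txid}:${input.vout}`)
    .sort()
    .join("|");
}

const pending = new Map<string, { broadcastAt: number }>();
pending.set(
  spendKey([{ txid: "aa".repeat(32), vout: 1 }, { txid: "bb".repeat(32), vout: 0 }]),
  { broadcastAt: Date.now() }
);
```

When a confirmed transaction arrives, computing `spendKey` over its inputs and looking it up in `pending` matches it to the broadcast even if its txid is not the one that was sent.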
Address Verification
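A clipboard hijacker that swaps a copied Bitcoin address for one the attacker controls can redirect funds, and the replacement is often chosen to share its first or last characters with the original. Our approach: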
- Display the full address (never truncate in the send flow)
- Show a QR code for cross-device verification
- For large amounts: require the user to send a small test transaction first
Testing Strategy
Bitcoin transactions are irreversible. Our testing pipeline:
- Unit tests: Key derivation against BIP-32/39/44 test vectors (from the specs)
- Integration tests on testnet: 500+ automated transaction tests — send, receive, UTXO consolidation, fee estimation accuracy, edge cases (dust, max UTXO count)
- Regtest environment: Private Bitcoin network for deterministic testing — we control block production
- Fuzzing: Random inputs to transaction construction, PSBT parsing, and address validation
- Security audit: Two independent audits before mainnet launch, third audit 6 months after
Key Takeaways
- UTXOs are the hard part. Key derivation is well-documented with test vectors. UTXO management (coin selection, consolidation, fee estimation, dust handling, change output strategy) is where wallet quality differs.
- Hardened derivation isn't optional. The xpub + leaked child private key = full compromise attack is real. Use hardened derivation for account separation, always.
- Fee estimation is a product decision. The algorithm is simple; the UX is hard. Users don't understand sat/vByte; they understand "fast", "normal", "economy."
- RGB is production-ready for issuance, early for trading. The protocol is sound and the privacy properties are exceptional. But the tooling ecosystem (wallets, explorers, DEXes) is still maturing compared to EVM.
- Security is never "done." We did two independent security audits and still found issues in the third review. Budget for continuous security work, especially around dependency updates.
- Test on testnet obsessively. Mainnet bugs cost real money. We required 100% testnet pass rate for 48 hours before any mainnet deployment.