Ideal versus measured density matrices. Structured deviations reveal coherence suppression and population leakage rather than random noise.

TL;DR

Before running optimisation algorithms on quantum hardware, you need to know whether the system is actually behaving quantum-mechanically and where it fails.

I ran a Bell state certification on IBM superconducting hardware to establish three things:
(1) that the hardware exhibits genuinely non-classical correlations (CHSH S = 2.68),
(2) that entanglement is strong but imperfect (≈93% fidelity), and
(3) that the remaining error is dominated by gate/control effects rather than decoherence or readout.

The value wasn’t proving quantum advantage. It was establishing interpretability. This is the step that gates whether optimisation results mean anything at all.

Unfamiliar terms? Jump to the Key Terms section at the end.

Most conversations about quantum computing jump straight to advantage: when it will happen, what it will beat, and who will win. That skips a more basic question, one that matters if you actually plan to run code on hardware rather than speculate about it.

How do we know a quantum processor is behaving quantum-mechanically, and how well does it do so under real noise and operational constraints?

I answered that question by running Bell state experiments on IBM superconducting hardware. Not as a demonstration, but as a certification exercise, the kind you run when you want to understand a system's behaviour before asking it to do useful work.

This is not an argument for quantum advantage. It's the work you do before you're allowed to make one.

What I found, and why it matters

I structured the experiment around three escalating questions. Each one removes a different class of false positive.

1. Do we see genuine quantum correlations?

Yes. CHSH parameter S = 2.68 ± 0.02, well above the classical bound of 2.

This matters for one reason: no local classical model can reproduce this behaviour. The system is exhibiting non-classical correlations. That's necessary, but not sufficient.

2. How strong is the entanglement?

Correlation tells you something quantum is happening. It doesn't tell you how good the state is. State tomography does:

  • Fidelity ≈ 0.93 (after readout mitigation)

  • Concurrence ≈ 0.84 (strong entanglement)

  • Bell coherence ≈ 89% of ideal

The entanglement is real, measurable, and robust under shallow execution. What these numbers don't tell you is where the remaining ~7% of fidelity goes.
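For concreteness, here is a minimal sketch of how these figures of merit can be computed from a reconstructed two-qubit density matrix using qiskit.quantum_info. The matrix below is an illustrative placeholder, not the measured data, and the coherence line uses one common convention (the |00⟩⟨11| element relative to its ideal value of 0.5).

```python
import numpy as np
from qiskit.quantum_info import DensityMatrix, Statevector, state_fidelity, concurrence

# Ideal Bell state |Phi+> = (|00> + |11>) / sqrt(2)
phi_plus = Statevector(np.array([1, 0, 0, 1]) / np.sqrt(2))

# Illustrative reconstructed density matrix (placeholder values, not the article's data)
rho = DensityMatrix(np.array([
    [0.48, 0.00, 0.00, 0.43],
    [0.00, 0.02, 0.00, 0.00],
    [0.00, 0.00, 0.02, 0.00],
    [0.43, 0.00, 0.00, 0.48],
]))

print("fidelity:   ", state_fidelity(rho, phi_plus))   # overlap with the ideal state
print("concurrence:", concurrence(rho))                # entanglement monotone, 0..1
print("coherence:  ", abs(rho.data[0, 3]) / 0.5)       # |00><11| element vs its ideal value
```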

3. Where does fidelity degrade, and why?

This is where most demonstrations stop. I kept going. I decomposed the ~7% fidelity loss into three components:

  • 0.8% — readout error (measured directly, partially corrected)

  • 1.5% — decoherence (bounded using T₁/T₂ and the 3.26 μs circuit duration)

  • 5.7% — gate/control residual

Readout loss is calibrated. Decoherence is bounded from physics, not fitted. What remains is labelled as a residual: neither over-attributed nor hand-waved.

The density matrix comparison makes this concrete. Measured vs ideal shows diagonal leakage from |00⟩ and |11⟩ into |01⟩ and |10⟩, consistent with control and readout error. Off-diagonal coherence is suppressed, consistent with dephasing. No large imaginary components, so coherent miscalibration isn't the problem.

The agreement between predicted noise channels and observed degradation matters more than the fidelity number itself. It confirms the error model is physically sound, not retrofitted.

Why this experimental design

Each tier filters out a different failure mode.

Tier 1: Bell correlation check

Prepare |Φ⁺⟩ = (|00⟩ + |11⟩)/√2, measure in the computational basis.

For an ideal Bell state, almost everything lands in |00⟩ and |11⟩. On hardware, this is a sanity filter. If P(00) + P(11) is low, something fundamental is broken. If it's high, proceed, but remember that correlation alone doesn't prove entanglement.

Bell state measurement counts. The dominance of |00⟩ and |11⟩ confirms strong Bell correlations on real hardware.
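As a sketch, the Tier 1 check reduces to a two-gate circuit and a one-line statistic over the returned counts. The counts dictionary below is illustrative; backend execution is omitted.

```python
from qiskit import QuantumCircuit

def bell_circuit() -> QuantumCircuit:
    """Standard |Phi+> preparation: Hadamard then CNOT, measured in the computational basis."""
    qc = QuantumCircuit(2, 2)
    qc.h(0)
    qc.cx(0, 1)
    qc.measure([0, 1], [0, 1])
    return qc

def bell_correlation(counts: dict[str, int]) -> float:
    """Fraction of shots in |00> or |11>: the Tier 1 sanity filter."""
    shots = sum(counts.values())
    return (counts.get("00", 0) + counts.get("11", 0)) / shots

# Illustrative counts only (not the article's data):
print(bell_correlation({"00": 4890, "01": 110, "10": 95, "11": 4905}))  # ~0.98
```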

Tier 2: CHSH Bell inequality test

To rule out classical explanations, I ran a CHSH test: measure correlations along rotated axes. Classical systems satisfy S ≤ 2. Quantum mechanics allows up to 2√2 ≈ 2.83.

S = 2.68 ± 0.02 exceeds the classical bound. This violation can't be reproduced by any local classical model. At this point, non-classical correlations are confirmed.

But I still don't know how good the state is.

CHSH parameter S measured across repeated runs. The observed violation exceeds the classical bound (S ≤ 2), confirming non-classical correlations.
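In code, the CHSH statistic is just a signed combination of four correlators, one per pair of measurement settings. The correlator values below are placeholders chosen to illustrate the scale of the reported violation, not the raw data.

```python
def correlator(counts: dict[str, int]) -> float:
    """E = P(same outcome) - P(different outcome) for one pair of measurement settings."""
    shots = sum(counts.values())
    same = counts.get("00", 0) + counts.get("11", 0)
    diff = counts.get("01", 0) + counts.get("10", 0)
    return (same - diff) / shots

def chsh_s(e_ab: float, e_ab2: float, e_a2b: float, e_a2b2: float) -> float:
    """Standard CHSH combination; |S| <= 2 classically, <= 2*sqrt(2) quantum-mechanically."""
    return abs(e_ab - e_ab2 + e_a2b + e_a2b2)

# Illustrative correlators for the canonical CHSH settings (not the article's data):
print(chsh_s(0.67, -0.67, 0.67, 0.67))  # 2.68, above the classical bound of 2
```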

Tier 3: Full state tomography

Correlation tests answer whether entanglement exists. Tomography answers what state was prepared.

Reconstructing the full density matrix exposes coherence structure, noise signatures, and degradation mechanisms invisible in raw counts. This turns a demonstration into a diagnostic.
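The simplest reconstruction is linear inversion over the two-qubit Pauli basis; a sketch follows. Real pipelines usually add a maximum-likelihood or projection step so the result stays physical.

```python
import numpy as np
from itertools import product

# Single-qubit Pauli matrices
PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def linear_inversion(expectations: dict[str, float]) -> np.ndarray:
    """Reconstruct a two-qubit density matrix from measured Pauli expectation values.

    expectations maps labels like "XZ" to the measured <sigma_i (x) sigma_j>.
    rho = (1/4) * sum_ij <P_ij> P_ij, with <II> = 1 by normalisation.
    Linear inversion can return slightly unphysical (non-positive) matrices on
    noisy data, which is why production pipelines add a projection step.
    """
    rho = np.zeros((4, 4), dtype=complex)
    for a, b in product("IXYZ", repeat=2):
        label = a + b
        value = 1.0 if label == "II" else expectations.get(label, 0.0)
        rho += value * np.kron(PAULIS[a], PAULIS[b]) / 4
    return rho

# Ideal |Phi+> as a check: <XX> = 1, <YY> = -1, <ZZ> = 1, all other correlators 0.
rho_ideal = linear_inversion({"XX": 1.0, "YY": -1.0, "ZZ": 1.0})
print(np.round(rho_ideal.real, 3))  # 0.5 on the |00>/|11> diagonals and corners
```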

Building the circuit on real hardware

The Bell circuit is minimal: Hadamard on qubit 0, a controlled-NOT entangling gate to qubit 1, measurement.

On ibm_fez (156 qubits, T₁ ≈ 97 μs, T₂ ≈ 107 μs on the selected pair), this gets transpiled into native gates (rz / sx / ecr) and mapped to the device coupling graph.

Two things matter here:

  1. Transpilation determines circuit depth and duration

  2. Duration determines decoherence exposure

I tracked scheduled circuit timing (3.26 μs) explicitly rather than inferring it from gate counts. Each of the 16 measurement configurations required 10,000 shots: 160,000 shots in total per characterisation.

That scale matters physically, operationally, and financially.
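A sketch of what this step looks like with Qiskit is below. The backend handle assumes saved IBM Quantum credentials, and how the scheduled duration is exposed varies by Qiskit version, so treat the timing comment as indicative rather than exact.

```python
from qiskit import QuantumCircuit, transpile
from qiskit_ibm_runtime import QiskitRuntimeService

# Assumes saved IBM Quantum credentials; "ibm_fez" is the device discussed above.
backend = QiskitRuntimeService().backend("ibm_fez")

qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

# Map to native gates (rz / sx / ecr) and the device coupling graph; ALAP scheduling
# attaches timing information so decoherence exposure can be tracked explicitly.
qc_native = transpile(qc, backend=backend, optimization_level=3, scheduling_method="alap")

print(qc_native.count_ops())  # native gate breakdown after transpilation
print(qc_native.depth())      # transpiled depth, which drives duration
# Depending on Qiskit version, the scheduled duration (in units of backend dt)
# is available as qc_native.duration; multiply by backend.dt to get seconds.

# Shot budget: 16 tomography measurement configurations x 10,000 shots each.
print(f"shots per characterisation: {16 * 10_000:,}")
```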

Error attribution: where fidelity is actually lost

Instead of reporting one fidelity number, I broke down the loss:

1. Readout error: 0.8%

Measurement error was characterised using calibration matrices, then partially corrected. Raw fidelity ~0.92 improved to ~0.93 post-mitigation. Small, but measurable and correctable.
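A minimal sketch of the correction step, using a calibration (confusion) matrix and least-squares inversion. The matrix and counts are illustrative; production code typically uses a constrained solver so the corrected distribution stays non-negative by construction.

```python
import numpy as np

# Illustrative confusion matrix A, where A[i, j] = P(measure state i | prepared state j),
# estimated from calibration circuits that prepare each basis state. Values are made up.
A = np.array([
    [0.982, 0.010, 0.011, 0.001],
    [0.008, 0.979, 0.001, 0.012],
    [0.009, 0.001, 0.977, 0.010],
    [0.001, 0.010, 0.011, 0.977],
])

# Illustrative raw counts, ordered 00, 01, 10, 11
raw = np.array([4820, 180, 170, 4830], dtype=float)

# Least-squares inversion of the calibration matrix
corrected, *_ = np.linalg.lstsq(A, raw, rcond=None)
corrected = np.clip(corrected, 0, None)            # crude physicality fix
corrected *= raw.sum() / corrected.sum()           # renormalise to the original shot count
print(np.round(corrected))
```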

2. Decoherence bound: 1.5%

Using the scheduled duration and measured coherence times, I computed a physics-based bound on fidelity loss from relaxation and dephasing. This yields ~1.5%: a bound, not a fit, tied directly to measured timing.
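The exact noise model behind that figure isn't spelled out here, so the sketch below only illustrates the shape of such a bound: independent T₁ relaxation and T₂ dephasing on each qubit for an assumed exposure time. The result depends strongly on how much of the 3.26 μs scheduled duration the qubits actually spend exposed before readout, so don't expect it to reproduce the 1.5% exactly.

```python
import numpy as np

def bell_decoherence_fidelity(t_us: float, t1_us: float, t2_us: float) -> float:
    """Fidelity of |Phi+> after both qubits idle for t_us under T1 relaxation and T2 dephasing.

    The |00><11| coherence decays as exp(-t/T2) per qubit; the |1> population on each
    qubit relaxes as exp(-t/T1). Gate and readout errors are deliberately excluded so
    the number isolates decoherence alone.
    """
    p = np.exp(-t_us / t1_us)                          # probability an excited qubit has not relaxed
    coh = np.exp(-2 * t_us / t2_us)                    # decay of the two-qubit coherence term
    populations = 0.25 * (1 + (1 - p) ** 2 + p ** 2)   # overlap from the |00>/|11> populations
    return populations + 0.5 * coh

# T1/T2 from the selected ibm_fez pair; the exposure time is an assumed placeholder.
loss = 1 - bell_decoherence_fidelity(t_us=1.0, t1_us=97.0, t2_us=107.0)
print(f"decoherence-only fidelity loss: {loss:.3%}")
```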

3. Gate and control residual: 5.7%

After removing readout and bounding decoherence, what remains goes to:

  • Two-qubit gate infidelity

  • Control noise and crosstalk

  • Calibration drift between runs

  • Tomography rotation error

This aligns with current superconducting hardware specs. The point isn't perfect attribution; it's bounding what can't be blamed on the algorithm.

Fidelity loss decomposition. Readout error, decoherence bounds, and gate/control residuals are quantified separately.

Running hardware responsibly

Early in this work, a misconfigured tomography run consumed over $1,000 of quantum time before I caught it. No safety layer, just expensive mistakes.

A tomography experiment (16 bases, 10,000 shots each, multiple iterations) accumulates runtime fast. At US$1.60/second, jobs that complete in seconds cost hundreds of dollars. Jobs that loop can blow through budgets.

The fix was architectural:

  • Pre-flight validation (depth limits, qubit checks)

  • Runtime estimation (duration × cost/second)

  • Explicit confirmation before execution

  • Replay mode for post-hoc analysis

Quantum hardware is metered. Treating it like a local simulator isn't a learning phase; it's an engineering error. (Fortunately, I was on IBM's free allocation.)
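The sketch below shows the shape of that guard layer in plain Python; the class and function names are hypothetical, and the per-second rate is just the figure quoted above.

```python
from dataclasses import dataclass

@dataclass
class RunPlan:
    """Hypothetical pre-flight estimate for a batch of metered quantum jobs."""
    circuits: int               # e.g. 16 tomography measurement settings
    shots_per_circuit: int      # e.g. 10_000
    est_seconds: float          # rough wall-clock estimate for the whole batch
    usd_per_second: float = 1.60

    def estimated_cost(self) -> float:
        return self.est_seconds * self.usd_per_second

def confirm_or_abort(plan: RunPlan, budget_usd: float) -> None:
    """Refuse anything over budget; otherwise require explicit confirmation before submitting."""
    cost = plan.estimated_cost()
    print(f"{plan.circuits} circuits x {plan.shots_per_circuit:,} shots, "
          f"estimated ~${cost:,.2f} at ${plan.usd_per_second}/s")
    if cost > budget_usd:
        raise RuntimeError(f"estimated cost ${cost:,.2f} exceeds budget ${budget_usd:,.2f}")
    if input("Type 'run' to submit: ").strip().lower() != "run":
        raise SystemExit("aborted before submission")

# Example: a tomography batch that must stay under a $200 budget.
confirm_or_abort(RunPlan(circuits=16, shots_per_circuit=10_000, est_seconds=60.0), budget_usd=200.0)
```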

The certification harness

I wrapped everything (Bell correlations, CHSH violation, tomography, attribution) into one reproducible harness:

  • Repeatable execution

  • Persistent artifacts

  • Replay without re-running hardware

  • Structured, auditable reports

The report includes CHSH margins, fidelity (raw and mitigated), concurrence, Bell coherence, error attribution, and density matrix visualisations.
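As an illustration of the persistent-artifacts point, a report can be as simple as a structured file written alongside each run. The field names here are hypothetical, not the schema of the accompanying repository; the values are the headline numbers from this article.

```python
import json
from pathlib import Path

# Hypothetical report shape; field names are illustrative only.
report = {
    "backend": "ibm_fez",
    "chsh": {"S": 2.68, "uncertainty": 0.02, "classical_bound": 2.0},
    "fidelity": {"raw": 0.92, "mitigated": 0.93},
    "concurrence": 0.84,
    "bell_coherence_fraction": 0.89,
    "error_attribution": {"readout": 0.008, "decoherence_bound": 0.015, "gate_control_residual": 0.057},
    "scheduled_duration_us": 3.26,
    "shots_per_characterisation": 160_000,
}

# Persisting the report makes replay and audit possible without re-running hardware.
Path("bell_certification_report.json").write_text(json.dumps(report, indent=2))
```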

One-off results aren't evidence. Repeatable pipelines are.

What comes next

Now I know: the hardware is quantum, it behaves predictably under noise, and gate/control effects dominate degradation.

Next question: Can it estimate something physical via expectation values?

Bell states prove the hardware works. VQE (Variational Quantum Eigensolver) probes whether it's useful: where certification meets computation.

That's the next article.

This article is part of the "Quantum Optimisation Before Quantum Advantage" series. Code and artifacts are available in the accompanying repository.

Article setting up the series: Quantum Optimisation Before Quantum Advantage

First in the business problem series: Testing Quantum Optimisation on a Classic Problem

Key Terms

Bell State: Maximally entangled two-qubit state |Φ⁺⟩ = (|00⟩ + |11⟩)/√2. Standard certification target for quantum hardware.

CHSH Inequality: Test distinguishing quantum from classical correlations. Classical systems: S ≤ 2. Quantum mechanics allows up to 2√2 ≈ 2.83. Violations prove genuinely non-classical behaviour.

State Tomography: Protocol that reconstructs the full quantum state (density matrix) by measuring in multiple bases, revealing coherence structure and error signatures.

Fidelity: How closely a prepared state matches the target (0 = wrong, 1 = perfect). Quantifies total degradation from all noise sources.

Concurrence: Entanglement strength (0 = separable, 1 = maximally entangled). Independent of fidelity—low-fidelity states can still be strongly entangled.

Decoherence: Loss of quantum coherence over time. Characterised by T₁ (energy relaxation) and T₂ (phase coherence). Circuit duration relative to these determines information survival.

Readout Mitigation: Classical post-processing that corrects measurement errors using calibration data.

Error Attribution: Decomposing performance degradation into specific mechanisms (readout, decoherence, gate errors) rather than reporting aggregate "noise."
