Alcove

Science, culture, complexity

Tag: probability

What does a quantum Bayes’s rule look like?

Bayes’s rule is one of the most fundamental principles in probability and statistics. It allows us to update our beliefs in the face of new evidence. In its simplest form, the rule tells us how to revise the probability of a hypothesis once new data becomes available.
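
In symbols, the rule can be written as follows. The notation is the standard textbook one and is an assumption here, not something taken from the study: H stands for a hypothesis and D for the newly observed data.

```latex
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)},
\qquad
P(D) = \sum_{H'} P(D \mid H')\, P(H')
```

Here P(H) is the prior belief, P(D | H) is the likelihood of the data under the hypothesis, and P(H | D) is the revised belief after seeing the data.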

A standard way to teach it involves drawing coloured balls from a pouch: you start with some expectation (e.g. “there’s a 20% chance I’ll draw a blue ball”), then you update your belief depending on what you observe (“I’ve drawn a red ball, so the actual chance of drawing a blue ball is 10%”). While this example seems simple, the rule carries considerable weight: physicists and mathematicians have described it as the most consistent way to handle uncertainty in science, and it’s a central part of logic, decision theory, and indeed nearly every field of applied science.

There are two well-known ways of arriving at Bayes’s rule. One is the axiomatic route, which treats probability as a set of logical rules and shows that Bayesian updating is the only way to preserve consistency. The other is variational, which demands that updates should stay as close as possible to prior beliefs while remaining consistent with new data. This latter view is known as the principle of minimum change. It captures the intuition that learning should be conservative: we shouldn’t alter our beliefs more than is necessary. This principle explains why Bayesian methods have become so effective in practical statistical inference: because they balance a respect for new data with loyalty to old information.

A natural question arises here: can Bayes’s rule be extended into the quantum world?

Quantum theory can be thought of as a noncommutative extension of probability theory. While there are good reasons to expect there should be a quantum analogue of Bayes’s rule, the field has for a long time struggled to identify a unique and universally accepted version. Instead, there are several competing proposals. One of them stands out: the Petz transpose map. This is a mathematical transformation that appears in many areas of quantum information theory, particularly in quantum error correction and statistical sufficiency. Some scholars have even argued that it’s the “correct” quantum Bayes’s rule. Still, the situation remains unsettled.

In probability, the joint distribution is like a big table that lists the chances of every possible pair of events happening together. If you roll a die and flip a coin, the joint distribution specifies the probability of getting “heads and a 3”, “tails and a 5”, and so on. In this big table, you can also zoom out and just look at one part. For example, if you only care about the die, you can add up over all coin results to get the probability of each die face. Or if you only care about the coin, you can add up over all die results to get the probability of heads or tails. These zoomed-out views are called marginals.
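
The table and its zoomed-out views can be sketched in a few lines of Python. This is only an illustration: it assumes a fair coin and a fair die tossed independently, and the names `joint` and `marginal` are made up for the example.

```python
from fractions import Fraction
from itertools import product

coin = ["heads", "tails"]
die = [1, 2, 3, 4, 5, 6]

# Joint table: one entry per (coin, die) pair.
# Assumption: both are fair and independent, so every entry is 1/2 * 1/6.
joint = {(c, d): Fraction(1, 2) * Fraction(1, 6) for c, d in product(coin, die)}


def marginal(table, axis):
    """Sum the joint table over the other variable (axis 0 = coin, 1 = die)."""
    out = {}
    for outcome, p in table.items():
        key = outcome[axis]
        out[key] = out.get(key, Fraction(0)) + p
    return out


coin_marginal = marginal(joint, 0)  # zoomed-out view: coin only
die_marginal = marginal(joint, 1)   # zoomed-out view: die only

print(joint[("heads", 3)])     # 1/12
print(coin_marginal["heads"])  # 1/2
print(die_marginal[5])         # 1/6
```

Exact fractions are used instead of floats so that the marginals add up to exactly 1.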

The classical Bayes’s rule doesn’t just update the zoomed-out views but the whole table — i.e. the entire joint distribution — so the connection between the two events also remains consistent with the new evidence.

In the quantum version, the joint distribution isn’t a table of numbers but a mathematical object that records how the input and output of a quantum process are related. The point of the new study is that if you want a true quantum Bayes’s rule, you need to update that whole object, not just one part of it.

A new study by Ge Bai, Francesco Buscemi, and Valerio Scarani in Physical Review Letters has taken just this step. In particular, they’ve presented a quantum version of the principle of minimum change by showing that when the measure of change is chosen to be quantum fidelity — a widely used measure of similarity between states — this optimisation leads to a unique solution. Equally remarkably, this solution coincided with the Petz transpose map in many important cases. As a result, the researchers have built a strong bridge between classical Bayesian updating, the minimum change principle, and a central tool of quantum information.

The motivation for this new work isn’t only philosophical. If we’re to generalise Bayes’s rule to include quantum mechanics as well, we need to do so in a way that respects the structural constraints of quantum theory without breaking away from its classical roots.

The researchers began by recalling how the minimum change principle works in classical probability. Instead of updating only a single marginal distribution, the principle works at the level of the joint input-output distribution. Updating then becomes an optimisation problem, i.e. finding the new joint distribution that’s consistent with the new evidence but minimally different from the one that encoded the prior beliefs.
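
A minimal classical sketch of this optimisation follows. It assumes the measure of change is the Kullback-Leibler divergence, which is a choice made for this illustration and not necessarily the one used in the study. With that choice, the known solution is to rescale the joint table so its output marginal matches the new evidence while leaving the conditionals P(x | y) untouched (often called Jeffrey's rule). All numbers and names are hypothetical.

```python
import math

# Hypothetical prior joint distribution P(x, y) over input x and output y.
prior_joint = {
    ("x0", "y0"): 0.40, ("x0", "y1"): 0.10,
    ("x1", "y0"): 0.20, ("x1", "y1"): 0.30,
}
# New evidence: a revised distribution over the outputs (hypothetical).
new_output = {"y0": 0.25, "y1": 0.75}


def output_marginal(joint):
    """Sum the joint table over the inputs."""
    out = {}
    for (_, y), p in joint.items():
        out[y] = out.get(y, 0.0) + p
    return out


def minimal_update(joint, evidence):
    """Rescale each output column to match the evidence, keeping P(x | y)."""
    p_y = output_marginal(joint)
    return {(x, y): p * evidence[y] / p_y[y] for (x, y), p in joint.items()}


def kl(q, p):
    """Kullback-Leibler divergence D(q || p) between two joint tables."""
    return sum(qv * math.log(qv / p[k]) for k, qv in q.items() if qv > 0)


updated = minimal_update(prior_joint, new_output)

# Another table with the same output marginal, for comparison (hypothetical).
alternative = {
    ("x0", "y0"): 0.125, ("x1", "y0"): 0.125,
    ("x0", "y1"): 0.375, ("x1", "y1"): 0.375,
}

print(kl(updated, prior_joint) < kl(alternative, prior_joint))  # True
```

Both `updated` and `alternative` agree with the new evidence, but `updated` sits closer to the prior table, which is the sense in which the update is minimal.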

In ordinary probability, we talk about stochastic processes. These are rules that tell us how an input is turned into an output, with certain probabilities. For example if you put a coin into a vending machine, there might be a 90% chance you get a chips packet and a 10% chance you get nothing. This rule describes a stochastic process. This process can also be described with a joint distribution.
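
The vending machine can be written down as a small table, turned into a joint distribution, and then read backwards. In the sketch below, the 90%/10% split comes from the example above; the second input ("no coin") and the 50/50 prior over inputs are assumptions added so that the reverse process has something to infer.

```python
# Forward process: P(output | input).
forward = {
    "coin":    {"chips": 0.9, "nothing": 0.1},  # from the example above
    "no coin": {"chips": 0.0, "nothing": 1.0},  # hypothetical second input
}
# Hypothetical prior belief about which input was used.
prior = {"coin": 0.5, "no coin": 0.5}

# Joint distribution P(input, output) = P(input) * P(output | input).
joint = {
    (x, y): prior[x] * p
    for x, row in forward.items()
    for y, p in row.items()
}


def reverse_process(table):
    """Bayes's rule: P(input | output) = P(input, output) / P(output)."""
    p_y = {}
    for (_, y), p in table.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return {(x, y): p / p_y[y] for (x, y), p in table.items() if p_y[y] > 0}


reverse = reverse_process(joint)

# Seeing chips means a coin went in; seeing nothing makes a coin unlikely.
print(reverse[("coin", "chips")])
print(reverse[("coin", "nothing")])
```

With these made-up numbers, an empty tray leaves roughly a 9% chance that a coin was inserted.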

In quantum physics, however, it’s trickier. The inputs and outputs aren’t just numbers or events but quantum states, which are described by wavefunctions or density matrices. This makes the maths much more complex. The quantum counterparts of stochastic processes are called completely positive trace-preserving (CPTP) maps.

A CPTP map is the most general kind of physical evolution allowed: it takes a quantum state and transforms it into another quantum state. And in the course of doing so, it needs to follow two rules: it shouldn’t yield any negative probabilities and it should ensure the total probability adds up to 1. That is, your chance of getting a chips packet shouldn’t be –90% nor should it be 90% plus a 20% chance of getting nothing.
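
One standard way to write these two rules down uses Kraus operators. The symbols below are assumptions for this illustration: the map is called E, the state is the density matrix ρ, and the K_i are the Kraus operators.

```latex
\mathcal{E}(\rho) = \sum_i K_i\, \rho\, K_i^{\dagger},
\qquad
\sum_i K_i^{\dagger} K_i = I
```

Any map of the first form is completely positive (no negative probabilities), and the second condition guarantees that the trace, i.e. the total probability, stays equal to 1.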

These complications mean that, while the joint distribution in classical Bayesian updating is a simple table, the one in quantum theory is more sophisticated. It uses two mathematical tools in particular. One is purification, a way to embed a mixed quantum state into a larger ‘pure’ state so that mathematicians can keep track of correlations. The other is Choi operators, a standard way of representing a CPTP map as a big matrix that encodes all possible input-output behaviour at once.
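
For reference, the Choi operator of a map E is usually defined as below. The basis states |i⟩ of the input system and the ordering of the tensor factors (input first, output second) are conventions assumed here.

```latex
C_{\mathcal{E}} = \sum_{i,j} |i\rangle\langle j| \otimes \mathcal{E}\big(|i\rangle\langle j|\big),
\qquad
\mathcal{E} \text{ is CP} \iff C_{\mathcal{E}} \ge 0,
\qquad
\mathcal{E} \text{ is TP} \iff \operatorname{Tr}_{\mathrm{out}} C_{\mathcal{E}} = I_{\mathrm{in}}
```

So the two rules for a CPTP map become two checks on a single matrix.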

Together, these tools play the role of the joint distribution in the quantum setting: they record the whole picture of how inputs and outputs are related.

Now, how do you compare two processes, i.e. the actual forward process (input → output) and the guessed reverse process (output → input)?

In quantum mechanics, one of the best measures of similarity is fidelity. It’s a number between 0 and 1. 0 means two processes are completely different and 1 means they’re exactly the same.
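
For two density matrices ρ and σ, the fidelity is commonly defined as below. Some authors omit the final square, so treat the convention as an assumption. Applying it to processes rather than states presumably goes through their operator representations, which is a reading of the text and not something it spells out.

```latex
F(\rho, \sigma) = \left( \operatorname{Tr} \sqrt{ \sqrt{\rho}\, \sigma\, \sqrt{\rho} } \right)^{2},
\qquad
F(\rho, \sigma) = \Big( \sum_i \sqrt{p_i\, q_i} \Big)^{2}
\ \text{ when } \rho \text{ and } \sigma \text{ commute, with eigenvalues } p_i,\ q_i
```

In the commuting case the expression reduces to a classical overlap between two probability distributions.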

In this context, the researchers’ problem statement was this: given a forward process, what reverse process is closest to it?

To solve this, they looked over all possible reverse processes that obeyed the two rules, then they picked the one that maximised the fidelity, i.e. the CPTP map most similar to the forward process. This is the quantum version of applying the principle of minimum change.

In the course of this process, the researchers found that in natural conditions, the Petz transpose map emerges as the quantum Bayes’s rule.

In quantum mechanics, two objects (like matrices) commute if the order in which you apply them doesn’t matter. That is, A then B produces the same outcome as B then A. In physical terms, if two quantum states commute, they behave more like classical probabilities.
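
A small check makes the idea concrete. The sketch below uses plain Python lists as 2×2 matrices; the helper names `matmul` and `commute` are made up for this example, and the two diagonal states are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def commute(a, b):
    """True if applying a then b gives the same matrix as b then a."""
    return matmul(a, b) == matmul(b, a)


# Two Pauli matrices: a textbook pair that does not commute.
pauli_x = [[0, 1], [1, 0]]
pauli_z = [[1, 0], [0, -1]]

# Two diagonal density matrices (hypothetical states): these behave like
# ordinary probability distributions and always commute.
rho = [[Fraction(3, 4), 0], [0, Fraction(1, 4)]]
sigma = [[Fraction(1, 3), 0], [0, Fraction(2, 3)]]

print(commute(pauli_x, pauli_z))  # False
print(commute(rho, sigma))        # True
```

Integers and exact fractions are used so that the equality test is not affected by rounding.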

The researchers found that when the CPTP map that takes an input and produces an output, called the forward channel, commutes with the new state, the updating process is nothing but the Petz transpose map.
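
For reference, the Petz transpose map is usually written as below. The notation is an assumption for this illustration: E is the forward channel, E† its adjoint, γ the prior state, and σ the state the map acts on.

```latex
\mathcal{R}_{\gamma, \mathcal{E}}(\sigma)
= \gamma^{1/2}\, \mathcal{E}^{\dagger}\!\Big( \mathcal{E}(\gamma)^{-1/2}\, \sigma\, \mathcal{E}(\gamma)^{-1/2} \Big)\, \gamma^{1/2}
```

When everything commutes, this reduces to the classical Bayes's rule, P(x | y) = P(y | x) γ(x) / Σ_x′ P(y | x′) γ(x′).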

This is an important result for many reasons. Perhaps foremost is that it explains why the Petz map has shown up consistently across different parts of quantum information theory. It appears it isn’t just a useful tool but the natural consequence of the principle of minimum change applied in the quantum setting.

The study also highlighted instances where the Petz transpose map isn’t optimal, specifically when the commutativity condition fails. In these situations, the optimal updating process depends more intricately on the new evidence. This subtlety departs clearly from classical Bayesian logic because in the quantum case, the structure of non-commutativity forces updates to depend non-linearly on the evidence (i.e. the scope of updating can be disproportionate to changes in evidence).

Finally, the researchers have shown how their framework can recover special cases of practical importance. If some new evidence perfectly agrees with prior expectations, the forward and reverse processes become identical, mirroring the classical situation where Bayes’s rule simply reaffirms existing beliefs. Similarly, in contexts like quantum error correction, the Petz transpose map’s appearance is explained by its status as the optimal minimal-change reverse process.

But the broader significance of this work lies in the way it unifies different strands of quantum information theory under a single conceptual roof. By proving that the Petz transpose map can be derived from the principle of minimum change, the study has provided a principled justification for its widespread use, one that isn’t restricted to particular contexts. This fact has immediate consequences for quantum computing, where physicists are looking for ways to reverse the effects of noise on fragile quantum states. The Petz transpose map has long been known to do a good job of recovering information from these states after they’ve been affected by noise. Now that physicists know the map embodies the smallest update required to stay consistent with the observed outcomes, they may be able to design new recovery schemes that exploit the structure of minimal change more directly.

The study may also open doors to extending Bayesian networks into the quantum regime. In classical probability, a Bayesian network provides a structured way to represent cause-effect relationships. By adapting the minimum change framework, scientists may be able to develop ‘quantum Bayesian networks’ where the way one updates their expectations of a particular outcome respects the peculiar constraints of CPTP maps. This could have applications in quantum machine learning and in the study of quantum causal models.

There are some open questions as well. For instance, the researchers have noted that if measures of divergence other than fidelity are used, e.g. the Hilbert-Schmidt distance or quantum relative entropy, the resulting quantum Bayes’s rules may be different. This in turn indicates that there could be multiple valid updating rules, each suited to different contexts. Future research will need to map out these possibilities and determine which ones are most useful for particular applications.

In all, the study provides both a conceptual advance and a technical tool. Conceptually, it shows how the spirit of Bayesian updating can carry over into the quantum world; technically, it provides a rigorous derivation of when and why the Petz transpose map is the optimal quantum Bayes’s rule. Taken together, the study’s finding strengthens the bridge between classical and quantum reasoning and offers a deeper understanding of how information is updated in a world where uncertainty is baked into reality rather than being due to an observer’s ignorance.

2025.10.04
The hunt for supersymmetry: Reviewing the first run – 2

I’d linked to a preprint paper [PDF] on arXiv a couple days ago that had summarized the search for Supersymmetry (Susy) from the first run of the Large Hadron Collider (LHC). I’d written to one of the paper’s authors, Pascal Pralavorio at CERN, seeking some insights into his summary, but unfortunately he couldn’t reply by the time I’d published the post. He replied this morning and I’ve summed up his responses below.

Pascal says physicists trained their detectors for “the simplest extension of the Standard Model” using supersymmetric principles called the Minimal Supersymmetric Standard Model (MSSM), formulated in the early 1980s. This meant they were looking for a total of 35 particles. In the first run, the LHC operated at two different energies: first at 7 TeV (with an integrated luminosity of 5 fb^-1), then at 8 TeV (with 20 fb^-1; explainer here). The data was garnered from both the ATLAS and CMS detectors.

In all, they found nothing. As a result, as Pascal says, “When you find nothing, you don’t know if you are close or far from it!”

His paper has an interesting chart that summarized the results for the search for Susy from Run 1. It is actually a superimposition of two charts. One shows the different Standard Model processes (particle productions, particle decays, etc.) at different energies (200-1,600 GeV). The second shows the Susy processes that are thought to occur at these energies.

Cross sections of several SUSY production channels, superimposed with Standard Model processes at √s = 8 TeV. The right-hand axis indicates the number of events for 20 fb^-1.

The cross-section is a measure of the probability of an event type appearing during a proton-proton collision. What you can see from this plot is the ratio of probabilities. For example, stop-stop* (the top quark’s Susy partner particle and anti-particle, respectively) production with a mass of 400 GeV is 10¹⁰ (10 billion) times less probable than inclusive di-jet events (a Standard Model process). “In other words,” Pascal says, it is “very hard to find” a Susy process while Standard Model processes are on, but it is “possible for highly trained particle physics” to get there.
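
The link between a cross-section and a number of events is a simple multiplication: expected events = cross-section × integrated luminosity. The sketch below uses the 20 fb^-1 figure from the 8 TeV run; the two cross-section values are hypothetical and chosen only to reproduce the 10¹⁰ ratio, and detector acceptance and efficiency are ignored.

```python
INTEGRATED_LUMINOSITY_FB = 20.0  # fb^-1, the 8 TeV dataset
FB_PER_PB = 1000.0               # 1 picobarn = 1000 femtobarn


def expected_events(cross_section_pb, luminosity_fb=INTEGRATED_LUMINOSITY_FB):
    """Expected events = cross-section x integrated luminosity.

    Ignores detector acceptance and selection efficiency (a simplification).
    """
    return cross_section_pb * FB_PER_PB * luminosity_fb


# Hypothetical cross-sections, picked only to illustrate a 10^10 ratio.
susy_pb = 1.0
dijet_pb = susy_pb * 1e10

susy_events = expected_events(susy_pb)
dijet_events = expected_events(dijet_pb)

print(susy_events)                 # 20000.0
print(dijet_events / susy_events)  # ratio of the two rates
```

The ratio of event counts equals the ratio of cross-sections because the luminosity cancels, which is why the plot can be read as a comparison of probabilities.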

Of course, none of this means physicists aren’t open to the possibility of there being a theory (and corresponding particles out there) that even Susy mightn’t be able to explain. The most popular among such theories is the presence of a “possible extra spatial dimension” on top of the three that we already know. “We will of course continue to look for it and for supersymmetry in the second run.”

2014.05.03
The philosophies in physics

As a big week for physics comes up–a July 4 update by CERN on the search for the Higgs boson followed by ICHEP ’12 at Melbourne–I feel really anxious as a small-time proto-journalist and particle-physics-enthusiast. If CERN announces the discovery of evidence that rules out the existence of such a thing as the Higgs particle, not much will be lost apart from years of theoretical groundwork set in place for the post-Higgs universe. Physicists obeying the Standard Model will, to think the snowclone, scramble to their boards and come up with another hypothesis that explains mass-formation in quantum-mechanical terms.

For me… I don’t know what it means. Sure, I will have to unlearn the Higgs mechanism, which does make a lot of sense, and scour through the outpouring of scientific literature that will definitely follow to keep track of new directions and, more fascinatingly, new thought. The competing supertheories–loop quantum gravity (LQG) and string theory–will have to have their innards adjusted to make up for the change in the mechanism of mass-formation. Even then, their principal bone of contention will remain unchanged: whether there exists an absolute frame of reference. All this while, the universe, however, will have continued to witness the rise and fall of stars, galaxies and matter.

It is easier to consider the non-existence of the Higgs boson than its proven existence: the post-Higgs world is dark, riddled with problems more complex and, unsurprisingly, more philosophical. The two theories that dominated the first half of the previous century, quantum mechanics and special relativity, will still have to be reconciled. While special relativity holds causality and locality close to its heart, quantum mechanics’ tendency to violate the latter made it disagreeable at the philosophical level to A. Einstein (in a humorous and ironical turn, his attempts to illustrate this “anomaly” numerically opened up the field that further made acceptable the implications of quantum mechanics).

The theories’ impudent bickering continues with mathematical terms as well. While one prohibits travel at the speed of light, the other allows for the conclusive demonstration of superluminal communication. While one keeps all objects nailed to one place in space and time, the other allows for the occupation of multiple regions of space at a time. While one operates in a universe wherein gods don’t play with dice, the other can exist at all only if there are unseen powers that gamble on a secondly basis. If you ask me, I’d prefer one with no gods; I also have a strange feeling that that’s not a physics problem.

Speaking of causality, physicists of the Standard Model believe that the four fundamental forces–strong nuclear, weak nuclear, gravitational, and electromagnetic–cause everything that happens in this universe. However, they are at a loss to explain why the weak force is 10³²-times stronger than the gravitational force (even the finding of the Higgs boson won’t fix this–assuming the boson exists). An attempt to explain this anomaly exists in the name of supersymmetry (SUSY) or, together with the Standard Model, MSSM. If an entity in the (hypothetical) likeness of the Higgs boson cannot exist, then MSSM will also fall with it.

Taunting physicists everywhere all the way through this mesh of intense speculation, Werner Heisenberg’s tragic formulation remains indefatigable. In a universe in which the scale at which physics is born is only hypothetical, in which energy in its fundamental form is thought to be a result of probabilistic fluctuations in a quantum field, determinism plays a dominant role in determining the future as well as, in some ways, contradicting it. The quantum field, counter-intuitively, is antecedent to human intervention: Heisenberg postulated that physical quantities such as position and particle spin come in conjugate pairs, and that making a measurement of one quantity makes the other indeterminable. In other words, one cannot simultaneously know the position and momentum of a particle, or the spins of a particle around two different axes.
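
In symbols, the two examples read as below. These are the standard textbook forms; the notation (Δ for the standard deviation of a measurement, S for spin components, ħ for the reduced Planck constant) is assumed here.

```latex
\Delta x\, \Delta p \ \ge\ \frac{\hbar}{2},
\qquad
\Delta S_x\, \Delta S_y \ \ge\ \frac{\hbar}{2}\, \big| \langle S_z \rangle \big|
```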

To me, this seems like a problem of scale: humans are macroscopic in the sense that they can manipulate objects using the laws of classical mechanics and not the laws of quantum mechanics. However, a sense of scale is rendered incontextualizable when it is known that the dynamics of quantum mechanics affect the entire universe through a principle called the collapse postulate (i.e., collapse of the state vector): if I measure an observable physical property of a system that is in a particular state, I subject the entire system to collapse into a state that is described by the observable’s eigenstate. Even further, there exist many eigenstates for collapsing into; which eigenstate is “chosen” depends on its observation (this is an awfully close analogue to the anthropic principle).

xkcd #45

That reminds me. The greatest unsolved question in my opinion is whether the universe houses the brain or if the brain houses the universe. To be honest, I started writing this post without knowing how it would end: there were multiple eigenstates it could “collapse” into. That it would collapse into this particular one was unknown to me, too, and, in hindsight, there was no way I could have known about any aspect of its destiny. Having said that, consider the nature of the universe–and the brain/universe protogenesis problem–with the knowledge of deterministic causality and mensural antecedence: if the universe conceived the brain, the brain must inherit the characteristics of the universe, and therefore must not allow for free will.

Now, I’m faintly depressed. And yes, this eigenstate did exist in the possibility-space.

2012.07.01
Eigenstates of the human mind
1. Would a mind’s computing strength be determined by its ability to make sense of counter-intuitive principles (Type I) or by its ability to solve an increasing number of simple problems in a second (Type II)?
2. Would Type I and Type II strengths translate into the same computing strength?
3. Does either Type I or Type II metric possess a local inconsistency that prevents its state-function from being continuous at all points?
4. Does either Type I or Type II metric possess an inconsistency that manifests as probabilistic eigenstates?
2012.07.01
