Tuesday, December 9, 2014

D-Wave, Boss, D-Wave!



Some of my colleagues (and my followers on both social and anti-social media) have been asking me to comment on the D-Wave Systems (so-called) quantum computer, in light of all the seemingly contradictory press releases, scientific papers, and blog posts that have appeared over the years. This request is similar to asking me to assess and comment on the status of a pond filled with ravenous, blood-thirsty piranhas by stripping down to my Speedos, slathering myself with the blood of a fatted calf, and then diving into the pond and thrashing about. However, I can assure you that the sight of me wearing Speedos would keep even the most ravenous piranha — or perhaps even Scott Aaronson? — at bay.

Just kidding Scott! Love ya bro! (Let the flames begin….)

In my recent book, I do indeed have a section on the D-Wave phenomenon, but that was written in 2012 and perhaps needs a light-hearted update. When I was writing the book, my friend and colleague, and known Grand Poobah of fine D-Wave Systems hardware, Daniel Lidar, asked me if I wanted him to read over my section on D-Wave Systems computers. I told him, flat out, no. He pressed on and said, with a wry grin, “You don’t want to say something you might later regret.” I replied, “Dan, I can imagine many things I would like to have engraved on my tombstone, but one of them certainly would not be…

HERE LIES JONATHAN P. DOWLING —
HE NEVER SAID ANYTHING HE WOULD LATER REGRET!”

In addition, for reasons I will leave nebulous for the moment, in the past few months I have had to give myself a crash course on the status and performance of the D-Wave Systems machines, so perhaps I now have a clearer view of what is (or is not) going on.

Firstly, let me say that this vigorous and open debate about the D-Wave machines is good for science and is the hallmark of the scientific method at work. At times I wish the tone of the debate were more civil, but history is filled with uncivilized scientific debates, and perhaps civility is too much to wish for. Who can forget the baleful Boltzmann-Loschmidt debates on the reversibility question in statistical mechanics? How about the forceful Huxley-Wilberforce debate on evolution? (“So tell us, Dr. Huxley, is it from your father’s side of the family that you claim your descent from an ape, or is it from your mother’s?”) What about Cantor’s cantankerous feud with Kronecker over the nature of mathematical infinities?

Anybody? Anybody? (Tap, tap, tap....) Is this thing even on!?

Like scientific debates of yore, this D-Wave business is not without its polemics. I attended a workshop at Harvard in March where another invited speaker, a friend and colleague of mine, when asked what he thought about D-Wave, replied in no uncertain terms, “I think they’re a bunch of charlatans.” (“So tell us, D-Wave Two computer, is it from your father’s side of the family that you claim your descent from an abacus, or is it from your mother’s?”) On the other hand, friends and colleagues of mine like Colin Williams are pro (or at least not totally anti), while others, like Daniel Lidar, are steadfastly neutral. I can assure that particular Harvard-workshop-speaker colleague of mine that neither of them is (at least currently known to be) a charlatan.

I will try to give here an abridged history of what is going on, where I will fill in the gaps in my knowledge with the usual wild-assed conjectures. So far as I can tell, as usual, this all can be blamed on us theorists. In 2001, Ed Farhi at MIT and his collaborators published a paper in Science that proposed a new model for quantum computation called Adiabatic Quantum Computing (AQC). This model was radically different from the standard circuit model for quantum computing, which we’ll call Canonical Quantum Computing (CQC), developed by Deutsch and others.

The standard circuit-gate model describes how a universal quantum computer can be built up out of circuits of one- and two-qubit gates, in a way analogous to how a classical digital computer is built. It is called ‘universal’ because any quantum algorithm, such as Shor’s factoring algorithm, can be programmed into its circuitry. (By contrast, it is strongly believed that the factoring algorithm cannot be programmed into a classical computer’s circuitry without suffering an exponential slowdown in run time.) While the circuit-gate model is very likely more powerful than a classical computer for this problem, to factor large numbers of interest to, say, the National Security Agency (NSA), the circuit would have to contain billions or even trillions of qubits and gates, allowing for quantum error correction and all that. To date, the record number of qubits constructed in the lab in such a way (the ion trap quantum computer holds the record) is about 16, not a trillion. Every few years or so, the ion trappers carefully polish up their trap and add another qubit, but the pace is very slow.
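To give a flavor of what “circuits of one- and two-qubit gates” means in practice, here is a minimal numpy sketch (my own toy illustration, not anyone’s hardware): a Hadamard followed by a CNOT turns the two-qubit state |00⟩ into the entangled Bell state, which is about the smallest circuit that does anything interestingly quantum.

```python
import numpy as np

# One-qubit Hadamard, identity, and the two-qubit CNOT gate.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

state = np.array([1, 0, 0, 0], dtype=float)  # the state |00>
state = np.kron(H, I) @ state                # Hadamard on the first qubit
state = CNOT @ state                         # CNOT controlled on the first qubit
print(state)                                 # ~[0.707, 0, 0, 0.707]: (|00> + |11>)/sqrt(2)
```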

Adiabatic Quantum Computing (AQC) seemed completely different from the circuit-gate model, or Canonical Quantum Computing (CQC). In AQC, instead of building gates and circuits in the usual sense, one identifies a large physical quantum system, say a collection of interacting quantum spins, with a particular Hamiltonian structure. The problem to be solved is then “programmed” into the state of these spins. The system is then allowed to evolve adiabatically, by slowly tuning some parameter such as temperature or time, so that it settles into its ground state. If everything is set up properly, the spin state of the ground state of the Hamiltonian contains the answer to the computational question you programmed in at the higher temperature (or earlier time).
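A cartoon of the idea (the textbook interpolation, not a spec of any particular machine) is to evolve the system under a slowly changing Hamiltonian

    H(t) = (1 - t/T) H_B + (t/T) H_P,    0 <= t <= T,

where H_B is a simple “beginning” Hamiltonian whose ground state is easy to prepare and H_P is the “problem” Hamiltonian whose ground state encodes the answer. The adiabatic theorem then guarantees the system tracks the instantaneous ground state, provided the total run time T is long compared to the inverse square of the minimum energy gap encountered along the way.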

What set the quantum computing community off into a paroxysm was that Farhi and the gang claimed, in their original Science paper, that AQC could solve NP-complete (travelling-salesman-type) problems with an exponential speedup over a classical computer. This was an astounding claim, as it is widely and strongly believed that the CQC model gives no exponential speedup over a classical computer on NP-complete problems. And so, in a nutshell, Farhi and collaborators were claiming that the AQC approach was exponentially more powerful than CQC, at least on this one particular class of important optimization problems.

A vigorous scientific debate ensued, rotten tomatoes were thrown, and finally, in 2007, Dorit Aharonov and collaborators proved that AQC is computationally equivalent to ordinary circuit-gate quantum computing. That is, AQC is not more powerful than CQC. (They are polynomially isomorphic, hence both equally universal models of quantum computation.) So the claim that AQC would give an exponential speedup on NP-complete problems was very likely incorrect, since it is strongly believed that no such speedup occurs in the CQC model.

Got it?  

The situation reminds me of quantum mechanics in 1926, when you had Heisenberg’s matrix mechanics and Schrödinger’s wave mechanics, and nobody knew if they were the same theory, or gave different predictions, or what, until Dirac (and others) showed that they were transformable into each other and hence made exactly the same physical predictions. They were not two different theories of quantum mechanics but rather two different pictures of quantum mechanics. However, as is well known to the quantum mechanic on the street, some quantum theory problems are much easier to work out in the Heisenberg picture than in the Schrödinger picture (and vice versa). As a consequence, by the Gottesman-Knill theorem, the Heisenberg and Schrödinger pictures are not computationally equivalent even though they are physically equivalent. That is, a quantum computing circuit composed only of gates from the Clifford group, which looks like it would be impossible to simulate efficiently on a classical computer in the Schrödinger picture, is in fact efficiently simulatable in the Heisenberg picture.

Okay, so the AQC model of quantum computation is neither more nor less powerful than circuit-gate quantum computing, but because of its mapping onto fairly simple spin systems, it looks more like a design for a straightforward physics experiment than a design for a gigantic digital quantum computer. The open question is whether AQC is easier to implement than CQC. That is, instead of pursuing the circuit-gate paradigm, adding one ion qubit to your trap every few years until you hit a trillion of them, perhaps you could cook up some simple spin-like physical system with a whole lot of qubits (maybe hundreds but not trillions) interacting in a certain way, code in your problem, cool the whole thing down slowly, and read out your answer — build your quantum computer that way! That potential, I surmise, is how and why D-Wave Systems got involved in the quantum computing game.

In 2007 D-Wave Systems demonstrated a 16-qubit prototype system called the Orion and used it to solve some computational problems, such as a pattern-matching problem related to drug design and a Sudoku puzzle. In 2011 they unveiled their first commercial machine, the D-Wave One, a 128-qubit machine that sold for $1,000,000. They have sold precisely one of these, and that was to Lockheed-Martin, which then immediately handed it over to a research group at the University of Southern California, consisting of Daniel Lidar and collaborators, who operate the machine with complete autonomy from that company. (USC does provide the physical infrastructure for the machine and its operation, but that role is compartmentalized and kept separate from the research activities.) In 2013 D-Wave Systems produced the D-Wave Two, a 512-qubit machine, and they have sold precisely two of those. The first was also bought by Lockheed-Martin and is run by USC, and the second was bought by Google and is located at the Quantum Artificial Intelligence Lab at the NASA Ames Research Center at Moffett Field, California (near San Jose). This particular NASA machine has a front-end user interface, run on a Google proxy server, that allows the user to submit batch jobs to the D-Wave Two hardware.

So just what are these machines and what are they doing? This question is difficult to answer, as we are dealing not with a science experiment but with a product produced by an industrial concern. If scientists, say at a university, claim they see such and such a result in an experiment, then they are required to publish sufficient details about the experiment so that independent researchers at other universities can construct identical experiments and confirm or refute the result. But the D-Wave machines are industrial products, not science experiments, and details about their design, construction, performance, and even the software that runs them are proprietary. D-Wave does not want other companies to know all the details of how the machines are built.

So scientists, instead of constructing their own duplicate of the machine, are forced to use the D-Wave machine to test itself — to run tests or “experiments” treating the thing as a black box. One bit of front-end software that interfaces with the machine is in fact called “black box.” And for some years now the game has been to choose some computationally hard problem, run it on the D-Wave machine, compare that run time with the run time when the same problem is run on a fast classical computer, and then publish the results of the race. The results depend greatly on which problem you choose, and that is why there are so many conflicting reports appearing in the popular press. One day we hear that the D-Wave machine is more powerful than a classical computer, and the next we hear it is no more powerful. Well, this is because the term “powerful” is problem dependent. It makes little sense to say one machine is more powerful than another in general. It only makes sense to say one machine is more powerful than another when computing a specific problem or class of problems. When computer scientists say a quantum computer is more powerful than a classical one, they mean on a specific set of problems. Factoring, for example. For some problems, say computing the permanent of a matrix, a quantum computer is likely not any more powerful than a classical one.

Let’s take an analogy. Suppose in 1984, ten years before Shor discovered his factoring algorithm, an alien spacecraft lands on the roof of the physics building at the University of Colorado (where I was a grad student at the time), and out from the UFO emerges a luminous, one-eyed, green-skinned alien named Xiva the Decompiler of Words. Xiva hands me a pocket-sized device that she calls the Xblox, claims it is a universal quantum computer, and tells me to try it out. I decide to pit it against the nearby Cray X-MP supercomputer. If I first run the oracular ordered search problem (don’t ask), the Xblox will perform better than the X-MP by at most a factor of 3, and I would exclaim, “This quantum computer sucks!” If instead I randomly chose to try to factor a large number on the two machines, I’d find the Xblox has an exponentially shorter run time than the X-MP. I then would exclaim, “This quantum computer really rocks!” On other problems (NP-complete or unordered search) the Xblox would show only a quadratic improvement over the X-MP. The question of which computer has the better performance is meaningless unless you specify the particular problem on which they are computing.
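To put rough numbers on that moral, here is a back-of-the-envelope sketch in Python (my own toy scalings with made-up problem sizes; constant factors and hardware details ignored, and no real device benchmarked) of how the “speedup” depends entirely on which problem class you hand the machine:

```python
import math

# Textbook asymptotic scalings, classical versus quantum, for three problems.

def ordered_search(N):
    # Classical binary search takes ~log2(N) queries; the best known quantum
    # algorithm only shaves off a small constant factor (roughly 3x).
    classical = math.log2(N)
    return classical, classical / 3

def unordered_search(N):
    # Classical exhaustive search ~N queries; Grover's algorithm ~sqrt(N):
    # a quadratic, not exponential, speedup.
    return N, math.isqrt(N)

def factoring(bits):
    # Best known classical method (number field sieve) scales roughly as
    # exp(c * bits^(1/3) * (log bits)^(2/3)); Shor's algorithm ~ bits^3.
    c = 1.9
    classical = math.exp(c * bits ** (1 / 3) * math.log(bits) ** (2 / 3))
    return classical, bits ** 3

print("ordered search, N = 2^40:   ", ordered_search(2 ** 40))
print("unordered search, N = 2^40: ", unordered_search(2 ** 40))
print("factoring a 2048-bit number:", factoring(2048))
```

Same quantum computer, three wildly different verdicts, depending only on the problem.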

What is inside the D-Wave machines? I suspect that D-Wave had originally hoped to implement something like full-blown Adiabatic Quantum Computing on their machines. Since the 2007 result of Aharonov et al., we know that AQC is equivalent to ordinary QC, as discussed above, but AQC is perhaps easier to implement. Hence, if the D-Wave machine had turned out to be a general-purpose AQC machine, it would truly have been a universal quantum computer. This hope may have been the source of early (and somewhat naive) press reports from D-Wave that they were building a true universal quantum computer with entanglement and the whole nine yards.

The speed of hype may often greatly exceed the speed of write.

As mentioned above, AQC could be implemented by building a controllable lattice of coupled spin-half particles. However, given that a spin-half particle (a neutron, for example) is just a quantum two-level system, any array of coupled two-level systems will do. D-Wave decided to go with superconducting flux qubits. These are little micron-size loops of superconducting current-carrying wire in which all the electrons can be made to circulate counterclockwise (the qubit |0⟩ state) or all clockwise (the qubit |1⟩ state). If the little buggers are sufficiently cold and sufficiently protected from the environment, then they have a long enough coherence time that you can make a cat state, the superposition |0⟩ + |1⟩, which is then the starting point for generating massive amounts of entanglement across the lattice, via qubit-qubit interactions, and away you go to implement AQC (or even CQC). The particular Hamiltonian system D-Wave has chosen is something called the Ising spin model (ISM), whose Hamiltonian has a binary and tunable coupling between each pair of qubits.
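Written out (in the standard textbook form, which is my gloss and not a transcription of D-Wave’s documentation), that Hamiltonian looks like

    H_Ising = Σ_i h_i σ_i^z + Σ_{i<j} J_ij σ_i^z σ_j^z,

where the σ_i^z are Pauli-Z operators on the individual flux qubits, and the local fields h_i and couplings J_ij are the programmable knobs: “programming” the machine means choosing the h’s and J’s so that the ground state of H_Ising spells out the answer to your problem.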

This Hamiltonian shows up in the theory of quantum spin glasses, for example. There is one additional hardware implementation step. This Ising (pronounced “ee-sing,” not “icing”) Hamiltonian presupposes that, via the qubit-qubit coupling, any qubit (in, say, a square lattice) can be coupled to any other qubit in the lattice. That is not the case in the current D-Wave One and Two; only nearest neighbors are coupled. In addition, some of the qubits don’t work due to manufacturing issues, but you know in advance which of them are going to be offline, since D-Wave characterizes the chip and tells you.

To handle this hardware issue, an additional wiring diagram is needed to map the Ising problem (with arbitrary qubit-qubit coupling and all the qubits in play) onto the D-Wave hardware (with only nearest-neighbor coupling and where some of the qubits are no-shows). That wiring diagram is called a Chimera graph. (I suppose because, to the unaided eye, the graph looks like the fire-breathing hybrid of a goat, a lion, and a serpent.) It is important to note that each different computational problem will require a different Chimera mapping, which can be quite hard to work out for a large number of qubits, but the D-Wave black-box front end will help you find one if you are too dumb to figure it out on your own.
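For the graph-theoretically curious, here is a minimal Python sketch (my own illustrative reconstruction of the published unit-cell structure, using the networkx library, not anything from D-Wave’s tool chain) of the K_{4,4} unit cell out of which the Chimera graph is tiled:

```python
import networkx as nx

def chimera_unit_cell():
    """One Chimera unit cell: the complete bipartite graph K_{4,4}.

    Qubits 0-3 form one side of the cell and qubits 4-7 the other; every
    qubit on one side couples to every qubit on the other, but there are
    no couplings within a side. A full chip tiles many such cells and adds
    a sparse set of couplers between neighboring cells, which is why an
    arbitrary Ising problem has to be "embedded" before it will fit.
    """
    return nx.complete_bipartite_graph(4, 4)

cell = chimera_unit_cell()
print("qubits:", cell.number_of_nodes())    # 8
print("couplers:", cell.number_of_edges())  # 16
```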

So the idea is that you program your input as an initial state of this Hamiltonian (time t = 0), then slowly tune the time parameter t until the thing reaches its ground state (time t = T), and then read out your answer, and you have a type of AQC. The problem is that, for this protocol to work as universal AQC, the qubits must have coherence times at least as long as (and preferably longer than) the adiabatic run time T. This might be the case if the qubits were cold enough and protected from the environment well enough, but that is not the case. The qubits in the current D-Wave machines are quite noisy and have very short coherence times, due to temperature fluctuations, imperfections, and other technical noise. Hence the qubits are not coherent enough for long-range entanglement to spread across the lattice, full-blown AQC cannot yet be implemented, and so, by the Aharonov et al. proof, the machine is not universal for quantum computing. That is, the current D-Wave machines cannot solve all the problems a general quantum computer can solve. Well, if they can’t do that, what can they do?

There is a watered-down version of AQC called Simulated Quantum Annealing, and the good folks at D-Wave Systems claim their machines can do this. Annealing is a physical process, used for example in the manufacturing of swords, whereby the metal alloy of the sword is cooled very slowly into its ground state. The idea is that if you cool the sword too rapidly it will freeze into some state of energy higher than its ground state, a local minimum of the alloy structure instead of the global minimum. These higher-energy local minima produce defects that weaken the final product. Hence you want to cool it slowly to get a strong sword. The physics is that while the alloy is cooling, if it does find itself in a local minimum energy well, then, because it is still at a relatively high temperature, the thermal fluctuations will cause it to leap over the well’s barrier, allowing it to leave the local minimum and continue the search for the true minimum.

Computer scientists have emulated this physical process in an optimization procedure called simulated annealing. You are seeking the optimal state of some large computational problem, so you simulate the annealing process with a slowly adjusted control parameter (a fictitious temperature that is lowered over time) and you add in by hand random fluctuations to mimic the thermal fluctuations of the sword-making process. Simulated annealing often works much better than other optimization methods, such as steepest descent, at finding the true global minimum of a large problem.
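To make that concrete, here is a bare-bones simulated-annealing sketch (ordinary Python, run on a toy random Ising-style cost function of my own invention; nothing here is specific to D-Wave):

```python
import math
import random

def simulated_annealing(cost, n_bits, n_steps=20000, t_start=2.0, t_end=0.01):
    """Minimize cost(bits) over length-n_bits bit strings by simulated annealing."""
    bits = [random.randint(0, 1) for _ in range(n_bits)]
    current = cost(bits)
    best, best_cost = bits[:], current
    for step in range(n_steps):
        # Exponential cooling schedule: hot early (uphill moves are often
        # accepted, so we can escape local minima), cold late (we settle down).
        t = t_start * (t_end / t_start) ** (step / n_steps)
        i = random.randrange(n_bits)
        bits[i] ^= 1                     # propose a single bit flip
        proposed = cost(bits)
        if proposed <= current or random.random() < math.exp((current - proposed) / t):
            current = proposed           # accept the move
            if current < best_cost:
                best, best_cost = bits[:], current
        else:
            bits[i] ^= 1                 # reject the move: undo the flip
    return best, best_cost

# Toy usage: a random fully connected Ising-style cost function on 20 bits.
random.seed(0)
J = [[random.uniform(-1, 1) for _ in range(20)] for _ in range(20)]
ising_cost = lambda b: sum(J[i][j] * (2 * b[i] - 1) * (2 * b[j] - 1)
                           for i in range(20) for j in range(i + 1, 20))
print(simulated_annealing(ising_cost, 20))
```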

In Quantum Simulated Annealing (QSA) the bits are replaced with qubits and the thermal fluctuations are replaced with the quantum fluctuations inherent in any quantum system. Qualitatively, it is claimed that quantum simulated annealing is more efficient than classical simulated annealing because, instead of relying on fluctuations to drive the system up and over the potential barrier of the local minimum in which it is stuck, quantum mechanics also allows the system to tunnel through the barrier. For this reason, claims are made that quantum simulated annealing should find the global minimum of a system faster than classical simulated annealing. Quantum simulated annealing should be viewed as strictly contained in AQC, and hence it is not universal for quantum computing. What remains to be seen is whether it is more powerful than a classical computer on some specific problem.
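In equations (again the textbook cartoon, up to signs and conventions, and not a spec of D-Wave’s actual annealing schedule), the annealer interpolates between a transverse-field term and the Ising problem Hamiltonian from above,

    H(s) = A(s) Σ_i σ_i^x + B(s) H_Ising,    s = t/T running from 0 to 1,

where A(s) starts large and is turned down to zero while B(s) does the reverse; the transverse-field term Σ_i σ_i^x supplies the quantum fluctuations that do the tunneling.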

And so we come to the core idea. As far as I can tell, D-Wave Systems claims their machines implement quantum simulated annealing. The question then is whether this version of QSA, with noisy qubits, short coherence times, and no entanglement, is good for anything. To answer this question, D-Wave farms out time on their machines to various researchers to see if there are any computational problems, mapped into QSA, for which the D-Wave machines outperform a comparable classical machine. So, in addition to being problem dependent, the answer also depends on what you mean by a ‘comparable classical machine.’ It is due to these ambiguities that we have what appear to be contradictory publications and press releases on the power of the D-Wave machines.

I’m not going to go through the list of all the computational problems that have been tried on the D-Wave computers but will instead focus on the one that currently seems to show the most promise. This computational problem is known as QUBO. A few months ago I had never heard of QUBO, and I was shocked to find out, after I had, that the Q in QUBO does not stand for ‘quantum’ but rather ‘quadratic,’ as in Quadratic Unconstrained Binary Optimization.

Borzhemoi! This I knew from nothing!

Well, there was a steep learning curve, but without going into all the details, a QUBO problem is a type of optimization problem whose mathematical structure maps directly onto the Ising Hamiltonian that is native to the D-Wave hardware. It seems likely that if the D-Wave machine gives any computational advantage, and if that advantage is quantum in origin — the jury is still out on both these points — then it will only be on problems of the QUBO type. However, this is an interesting class of problems, whose applications seem to lie primarily in image recognition. In particular, certain types of synthetic-aperture-radar image processing, of NASA Earth Science interest, map onto the QUBO form.
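For the similarly uninitiated: a QUBO instance is just “given a real matrix Q, find the binary vector x that minimizes xᵀQx.” Here is a brute-force toy version (my own illustrative Python, feasible only for a handful of variables) that makes the definition, and its connection to the Ising form, concrete:

```python
import itertools
import numpy as np

def solve_qubo_brute_force(Q):
    """Minimize x^T Q x over binary vectors x by exhaustive search.

    Only sensible for roughly 20 variables or fewer; the whole point of
    annealers (quantum or otherwise) is the regime where 2^n is hopeless.
    """
    n = Q.shape[0]
    best_x, best_val = None, float("inf")
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        val = float(x @ Q @ x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# The substitution x_i = (1 + s_i)/2, with spins s_i = +/-1, turns x^T Q x
# into an Ising energy of the form sum_i h_i s_i + sum_{i<j} J_ij s_i s_j
# (plus an irrelevant constant), which is why QUBO is the natural source
# language for Ising-style hardware.

Q = np.array([[-1.0,  2.0,  0.0],
              [ 0.0, -1.0,  2.0],
              [ 0.0,  0.0, -1.0]])
print(solve_qubo_brute_force(Q))   # expect x = [1, 0, 1] with value -2.0
```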

If the D-Wave machine gives an advantage, it will be only on practical problems that can be mapped (with little computational overhead) to a QUBO problem. Whether that set of problems is the empty set is still unknown.

Here is a summary of the steps required to map a practical problem like image recognition onto the D-Wave machine:

1.)  Find a practical problem that has an efficient mapping to a QUBO problem and carry out that mapping. 

2.)  Next, the QUBO problem must be mapped to the specifics of the D-Wave circuitry via a Chimera graph, which is essentially a wiring diagram for the quantum computer. This is done by an intermediate mapping to a quantum Ising spin model, which is what the D-Wave machine was designed to solve in the generic AQC model. In this way, programming the D-Wave machine is akin to programming a classical computer in assembly or machine language, as was done decades ago, before the development of higher-level compilers and programming languages. That is why the programming process is somewhat convoluted to the uninitiated. To carry out the Chimera mapping for nontrivial problems, D-Wave provides the user access to a front-end server called “black box” that carries out a search for an optimal Chimera mapping.

3.)  Run the problem on the D-Wave machine. For the D-Wave Two purchased by Google and located at the NASA Ames Research Center, Google has developed a proxy server that interfaces with the actual D-Wave machine and allows the user to submit batch jobs to the quantum hardware.

So we can see that there is a lot of overhead between the physical problem to be solved and the final run on the machine. If that overhead is too large, any quantum advantage of the D-Wave machine may very well be swamped.

In particular, my colleagues ask, “What’s the deal with these announcements that come out every few weeks or so? Some claim to see a speedup and some claim they do not.” We are now prepared to answer this. Experiments on the D-Wave machine that test algorithms not of the QUBO form show no speedup. Some problems that do have an efficient mapping to QUBO seem, at least in a number of publications, to show some speedup. Whether you see a speedup depends on what problem you choose and how efficiently you can map that problem to a QUBO problem.

Let me be frank and say that it is still possible that the D-Wave machines will show no speedup of a quantum nature; the jury is out. If they are shown to give a quantum-enabled speedup, it will be only on a specific, narrow class of important problems that have an efficient QUBO implementation. This means that the current D-Wave machines are clearly not universal quantum computers but are potentially a very narrow-scope, special-purpose, possibly quantum, QUBO-problem-solving engine.

I should say that I’m in favor of a top-down approach like this competing, say, with the ion trappers who add a very high-fidelity qubit to their trap once every year or two. I think the competition and the associated discussions are good for the field. Let’s avoid calling each other charlatans, and instead let’s all sit down and try to figure out what is going on and see how we can reach our common goal — a universal quantum computer.