Pages

Friday, August 27, 2010

Quantum Algorithms A Different View

A view of quantum algorithms via linear algebra 


David Hilbert is famous, beyond famous. He is one of the greatest mathematicians of all time. I wish to quote him on theories:
A mathematical theory is not to be considered complete until you have made it so clear that you can explain it to the first man whom you meet on the street.

Algorithms Not Physics
My goal was to remove QA’s from any physical considerations. If this offends the quantum experts, then I am sorry. No I am not sorry. I think we need to do this to make progress.
We have done this decoupling almost completely with classical algorithms. When we explain a new classical algorithm we talk about high level operations—we do not talk about transistors and wires. Nor gates and memory cells. I would venture that few theorists could explain the details of how an L2 cache works, or how a modern arithmetic unit does pipelining, or in general how processors really work. The beautiful point is, it just does not matter for most of theory.
It is this ability to think at a high level for the creation and analysis of classical algorithms that, I believe, is the reason theory has been able to create such powerful algorithms. These algorithms are often described at a level way above any implementation details. I believe this is fundamental, and one of the cornerstones of why theory works.
Here are two classical algorithms: RSA Encryption and the Number Field Sieve (NFS). Both have changed the world, and were on Dick Karp’s list at the last Theory Day at Tech as great algorithms. Neither is described at a low level. For example, RSA is usually described in terms of operations on large integers—each of these operations requires millions of binary operations to perform. NFS is described also at a very high level. Without this ability I doubt that either algorithm would have been discovered.
I think that to make radical progress on QA’s we need to move away from a bit-level understanding and analysis and move to a high-level understanding. This is what I tried to supply in the previous post on quantum algorithms. If I failed in that goal, I apologize to you, but I believe that this goal is the right one. Perhaps someone else will be able to succeed in raising the level of detail—I hope so.
Geometry Not Quantum Mechanics
In that previous discussion I tried to make the assumptions needed to understand the Deutsch algorithm as simple as possible. This was to help a non-expert to “see” what was happening without any quantum mechanics details. I also believe that by stripping away tensor products, qubits, Hadamard Transforms, and all the rest we could lay bare what is really happening. I agree that none of these concepts are too difficult, but are they essential to make a person have a “feel” for what is happening with QA’s? I do not believe that they are.
I also hope that this view can be used by theory experts to make further progress on QA’s. If the details are pushed aside and all that is left is a simple question about unit vectors on the sphere, then I hope that we might be able to find new insights. I am working right now on doing this, and I believe that I may succeed. If QA are “just” questions about geometry, then I assert that we may be able to find new QA’s or prove new lower bounds. I still believe that is true.
Grover’s Algorithm
Lov Grover discovered one of the most important QA known in 1996. Or is that “invented?” The algorithm named after him is able to search a space of size {N} in cost {O(\sqrt N)}. Of course, this is much faster than any classical algorithm, which would take linear time, even for random algorithms.
Grover’s beautiful result has many applications. You can think of his algorithm as allowing the computation of an {N} sized “OR” in time square root of {N}. This is so general that it has had many applications in various areas of complexity. For example, it can be used to perform triangle detection faster than any known classical method—see the beautiful paper by Frederic Magniez, Miklos Santha, and Mario Szegedy.
It should be noted in this and many other applications there are nice interactions between Grover and standard methods. That is, the optimal results often follow from a “clever” use of Grover’s algorithm.
The Linear Algebra View
Let me explain Grover’s algorithm using just linear algebra. We begin in a state {s} of not knowing anything about the space of possible solutions—{s} is the all-{1} vector, but divided by {\sqrt{N}} to make it a unit vector. Let us suppose there are {k} solutions. We do not know {k} in advance. However, we will argue that it is enough to know {k} to within a factor of {2}. Then we will apply the standard idea of first trying {k = N/2}, then {k = N/4}, then {k = N/8} and so on, and this will not affect the order of the running time. So we may suppose {k} is known after all—and what we really do not know is the location {R \subseteq [N]} of the solutions. Our goal state is the vector {1_R} that has a {1} in those positions that belong to {R}, and {0} elsewhere, this time dividing by {\sqrt{k}} to make a unit vector {r = 1_R/\sqrt{k}}.
All states {q} that we reach along the way will be linear combinations of our initial ignorance state {s}and the unknown full-knowledge state {r}, that is
\displaystyle  q = as + br
where {|a|^2 + |b|^2 = 1} to preserve a unit vector. Initially we have {a = 1}{b = 0}. When we measure, the chance of getting the index of a solution is {|a|^2(k/N) + |b|^2}. Initially this is just {k/N}, which is just the chance of guessing a solution at random. If {|b|} is high, however, then we stand a good chance of getting a solution. So that’s the goal of Grover’s algorithm—to stay in the simple two-dimensional subspace {S} spanned by {s} and {r}, while moving from {s} close enough to {r}for a measurement to give a solution with high probability. We just need some operations that keep this subspace fixed but move points well enough inside it.
The first operation is an {N} by {N} matrix {A} that is the identity matrix—except that the diagonal entries corresponding to solutions are {-1} instead of {1}. That is, {A = I - 2\mathrm{diag}(1_R)}. This matrix is given as a “black box”—we are not allowed to inspect it. It really represents the verification predicate that an element is a solution. Since {A^2 = I}, clearly {A} is unitary. The second operation is simply the matrix
\displaystyle  B = \frac{2}{N}J - I
where {J} is the all {1} matrix. Each row has one entry of {(2/N) - 1} and {(N-1)} entries of {2/N}. Its norm is therefore
\displaystyle  (\frac{2}{N} - 1)^2 + (N-1)(\frac{2}{N})^2 = \frac{4}{N^2} - \frac{4}{N} + 1 + \frac{4}{N} - \frac{4}{N^2} = 1.
All pairs of different rows have dot product {2(2/N - 1)(2/N) + (N-2)(2/N)^2 = 8/N^2 - 4/N + 4/N - 8/N^2 = 0}. So {B} is also a unitary matrix. To verify that these operations conserve the subspace {S}, note that {Js = Ns}while {Jr = ks/\sqrt{k} = \sqrt{k}s}, and compute:
\displaystyle  \begin{array}{rcl}  As &=& (I - 2\mathrm{diag}(1_R))s = s - 2\sqrt{\frac{k}{N}}r \\  Ar &=& -r \\  Bs &=& \frac{2}{N}Js - s = 2s - s = s \\  Br &=& \frac{2}{N}Jr - r = 2\sqrt{\frac{k}{N}}s - r. \end{array}
The algorithm then simply computes
\displaystyle  q = (BA)^m s,
where {m \approx \sqrt{N/k}}. Then, it measures the vector {q}. The claim is with high probability the measurement yields some index {i} so that {A_{ii}=-1}.
Grover Analysis
The key here is to prove that the algorithm works: that is after applying the matrix {BA} about square root of {N} times the measurement yields a solution with high probability. Given our insight about preserving a two-dimensional subspace, this comes down to simple linear algebra on {2 \times 2} matrices. The algebra is the same as on Wikipedia’s page on Grover’s algorithm for the case {k = 1}, except we replace factors {\sqrt{1/N}} by {\sqrt{k/N}}.
We can see intuitively from the above action of {A} and {B} on {s} and {r} that {BA} effects a rotation through an angle of {\mathrm{arcsin}(2\sqrt{k/N})}, which is roughly {2\sqrt{k/N}} when {k} is small relative to {N}. Starting from {s} which is the point {(1,0)} in this co-ordinate system, it takes about {(\pi/2)(\sqrt{N/k}/2) = (\pi/4)\sqrt{N/k}} iterations to get near the point {(0,1)}. We get near within distance about {\sqrt{k/N}}, in fact, and the error when measuring is hence order-of the square of that, i.e. only order {k/N}. Note that as it iterates, the algorithm gets progressively closer to its goal—in the sense that if “Simon says: measure” at any time, its chance of finding a solution always improves.
Finally, as we noted above, if we do not know {k} then we make a “net” by guessing {k = N/2} then {k = N/4} then {k = N/8} and so on. When our “{k}” is off by a factor of 2 we will not get so near to the point {(0,1)}. However, there will be multiple iterations for values {k} in our net that are close enough to give a reasonable success probability. The full analysis involves a tradeoff between random guessing working well when {k} is large, versus the error {k/N} being small when {k} is small, and repeating some measurement trials to make the net a little finer than stepping down by factors of {2}. The details need to be worked out, but these details are not quantum mechanical—they are ordinary theory-of-algorithms details.
Open Problems
The goal here was not to supplant the current view of QA’s. Nor was it to not teach all the details to our students. The goal was both less and more important. It was to start to try to create a way to look at QA’s that was simpler, more accessible to many. And at the same time could lead us to discover new insights into QA’s. I believe that the more ways we can define and view a mathematical topic, the better is our understanding.
I hope that I helped in this regard. In any event thanks to all who read that discussion and this one. Thanks for being supportive—even those who disagree with this approach.

0 comments:

Post a Comment

Followers