Low rank approximation in simulations of quantum algorithms

doi:10.1016/j.jocs.2022.101561

Journal of Computational Science

Volume 59, March 2022, 101561

https://doi.org/10.1016/j.jocs.2022.101561

Highlights

• We discussed how to use low-rank CP decomposition to simulate quantum algorithms.
• We proposed two rank reduction algorithms to enable efficient simulation.
• We applied CP to QFT, phase estimation, Grover’s search, and quantum walks.

Abstract

Simulating quantum algorithms on classical computers is challenging when the system size, i.e., the number of qubits used in the quantum algorithm, is moderately large. However, some quantum algorithms and the corresponding quantum circuits can be simulated efficiently on a classical computer if the input quantum state is a low rank tensor and all intermediate states of the quantum algorithm can be represented or approximated by low rank tensors. In this paper, we examine the possibility of simulating a few quantum algorithms by using low-rank canonical polyadic (CP) decomposition to represent the input and all intermediate states of these algorithms. Two rank reduction algorithms are used to enable efficient simulation. We show that some of the algorithms preserve the low rank structure of the input state and can thus be efficiently simulated on a classical computer. However, the rank of the intermediate states in other quantum algorithms can increase rapidly, making efficient simulation more difficult. To some extent, such difficulty reflects the advantage or superiority of a quantum computer over a classical computer. As a result, understanding the low rank structure of a quantum algorithm allows us to identify algorithms that can benefit significantly from quantum computers.

Introduction

A quantum algorithm is often expressed as a unitary transformation $U$ applied to a quantum state $|\psi\rangle$. On a quantum computer, $|\psi\rangle$ can be efficiently encoded by $n$ qubits, effectively representing $2^{n}$ amplitudes simultaneously, and $U$ is implemented as a sequence of one- or two-qubit gates that are themselves 2 × 2 or 4 × 4 unitary transformations. To simulate a quantum algorithm on a classical computer, we can simply represent $|\psi\rangle$ as a vector in $\mathbb{C}^{2^{n}}$ and $U$ as a matrix in $\mathbb{C}^{2^{n} \times 2^{n}}$, and perform a matrix–vector multiplication $U|\psi\rangle$. However, for even a moderately large $n$, e.g., $n = 50$, the amount of memory required to store $U$ and $|\psi\rangle$ explicitly far exceeds what is available on many of today’s powerful supercomputers, thereby making the simulation infeasible [1], [2], [3], [4]. Fortunately, for many quantum algorithms, both $|\psi\rangle$ and $U$ have structure. In particular, $|\psi\rangle$ may have a low-rank tensor structure, and the quantum circuit representation of $U$ gives a decomposition of $U$ that can be written as

$$U = U^{(1)} U^{(2)} \cdots U^{(D)}, \tag{1.1}$$

where $U^{(i)}$ is a linear combination of Kronecker products of 2 × 2 matrices, many of which are identities, and $D$ is the depth of the circuit, which is typically bounded by a (low-degree) polynomial in $n$. As a result, if the low-rank structure of $|\psi\rangle$ can be preserved in the successive multiplication of the $U^{(i)}$’s with the input, we may be able to simulate the quantum algorithm efficiently for a relatively large $n$.
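To make the memory argument concrete, the short sketch below estimates the storage for a dense state vector and a dense unitary. It assumes double-precision complex amplitudes (16 bytes each); the function names are illustrative and not taken from the paper.

```python
def dense_state_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to store all 2**n amplitudes (assumed complex128, 16 bytes each)."""
    return (2 ** n_qubits) * bytes_per_amplitude


def dense_unitary_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to store a full 2**n x 2**n unitary matrix."""
    return (4 ** n_qubits) * bytes_per_amplitude


for n in (20, 30, 40, 50):
    gib = dense_state_bytes(n) / 2 ** 30
    print(f"n = {n:2d}: dense state vector needs {gib:,.3f} GiB")
```

Under this assumption, the state vector alone for $n = 50$ occupies $2^{54}$ bytes (16 PiB), and the dense matrix $U$ is larger by another factor of $2^{50}$.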

When $|\psi\rangle$ is viewed as an order-$n$ tensor, there are several ways to represent it efficiently. One of them is known as a canonical polyadic (CP) decomposition [5], [6], written as

$$|\psi\rangle = \sum_{i_{1}, \dots, i_{n} \in \{0, 1\}} \sum_{k = 1}^{R} A_{i_{1} k}^{(1)} A_{i_{2} k}^{(2)} \cdots A_{i_{n} k}^{(n)} \, |i_{1} i_{2} \dots i_{n}\rangle, \tag{1.2}$$

where $A^{(i)} \in \mathbb{C}^{2 \times R}$ and $R$ is known as the rank of the CP decomposition. The second representation is known as a matrix product state (MPS) [7] in the physics literature or a tensor train (TT) [8] in the numerical linear algebra literature, which is a special tensor network representation of a high-dimensional tensor [9]. In this representation, the quantum state can be written as

$$|\psi\rangle = \sum_{i_{1}, \dots, i_{n} \in \{0, 1\}} \sum_{k_{1}, \dots, k_{n-1}} A_{i_{1} k_{0} k_{1}}^{(1)} A_{i_{2} k_{1} k_{2}}^{(2)} \cdots A_{i_{n} k_{n-1} k_{n}}^{(n)} \, |i_{1} i_{2} \dots i_{n}\rangle,$$

where $A^{(j)}$ is a tensor of dimension $2 \times R_{j-1} \times R_{j}$, with $R_{0} = R_{n} = 1$. The rank of an MPS is often defined to be the maximum of $R_{j}$ for $j \in \{1, 2, \dots, n-1\}$. The memory requirements for CP and MPS representations of $|\psi\rangle$ are $O(Rn)$ and $O(R^{2} n)$, respectively. When $R$ is relatively small, these requirements are much lower than the $O(2^{n})$ requirement for storing $|\psi\rangle$ as a vector, which allows us to simulate a quantum algorithm with a relatively large $n$ on a classical computer that stores and manipulates $|\psi\rangle$ in these compact forms.
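As a minimal NumPy sketch of the CP form (1.2), the code below stores a GHZ state as $n$ factor matrices of shape $2 \times R$ with $R = 2$ and expands it to a dense vector only for checking. The helper names `cp_to_dense` and `apply_one_qubit_gate` are hypothetical and are not part of the Koala API; the ordering convention (qubit 1 as the most significant index) is also an assumption.

```python
import numpy as np


def cp_to_dense(factors):
    """Expand CP factors [A1, ..., An], each of shape (2, R), into a length-2**n vector."""
    state = np.zeros(2 ** len(factors), dtype=complex)
    for k in range(factors[0].shape[1]):
        term = np.ones(1, dtype=complex)
        for A in factors:
            term = np.kron(term, A[:, k])  # qubit 1 is the most significant index
        state += term
    return state


def apply_one_qubit_gate(factors, gate, qubit):
    """Apply a 2x2 gate to one qubit: only that qubit's factor changes, R stays the same."""
    new_factors = list(factors)
    new_factors[qubit] = gate @ factors[qubit]
    return new_factors


n = 10
# GHZ state (|0...0> + |1...1>)/sqrt(2) as a rank-2 CP decomposition.
ghz = [np.eye(2, dtype=complex) for _ in range(n)]
ghz[0] = ghz[0] / np.sqrt(2)  # absorb the normalization into the first factor

dense = cp_to_dense(ghz)
cp_entries = sum(A.size for A in ghz)  # 2 * R * n stored numbers
print("CP entries:", cp_entries, "dense entries:", dense.size)

hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
ghz_h = apply_one_qubit_gate(ghz, hadamard, qubit=0)
```

The one-qubit gate touches a single $2 \times R$ factor, which is why such gates never increase the CP rank.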

For several quantum algorithms, the rank of the CP or MPS representation of the input $|\psi\rangle$ is low. However, when the $U^{(i)}$’s are successively applied to $|\psi\rangle$, the rank of the intermediate tensors (the tensor representation of the intermediate states) can start to increase. When the rank of an intermediate tensor becomes too high, we may not be able to continue the simulation for a large $n$. One way to overcome this difficulty is to perform rank reductions on intermediate tensors when their ranks exceed a threshold. When a CP decomposition is used to represent $|\psi\rangle$, we can take, for example, (1.2) as the input and use the alternating least squares (ALS) [10], [11] algorithm to obtain an alternative CP decomposition that has a smaller $R$. The rank reduction of an MPS can be achieved by performing a sequence of truncated singular value decompositions (SVDs).
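The ALS idea can be sketched without ever forming the dense tensor: each factor of the low-rank approximation is updated by a small least-squares solve built from $R \times r$ and $r \times r$ Gram matrices. The code below is a minimal illustration under our own assumptions (random complex initialization, a fixed number of sweeps, a pseudo-inverse for the normal equations); it is not the authors' implementation, and `cp_als_reduce` is a hypothetical name.

```python
import numpy as np


def cp_to_dense(factors):
    """Expand CP factors (each of shape (2, R)) into a dense vector; used only for checking."""
    state = np.zeros(2 ** len(factors), dtype=complex)
    for k in range(factors[0].shape[1]):
        term = np.ones(1, dtype=complex)
        for A in factors:
            term = np.kron(term, A[:, k])
        state += term
    return state


def cp_als_reduce(B, target_rank, sweeps=5, seed=0):
    """Fit a CP decomposition of rank `target_rank` to the CP tensor with factors B."""
    rng = np.random.default_rng(seed)
    n, r = len(B), target_rank
    A = [rng.standard_normal((2, r)) + 1j * rng.standard_normal((2, r)) for _ in range(n)]
    for _ in range(sweeps):
        for j in range(n):
            gram_aa = np.ones((r, r), dtype=complex)
            gram_ba = np.ones((B[0].shape[1], r), dtype=complex)
            for i in range(n):
                if i != j:
                    gram_aa *= A[i].T @ A[i].conj()  # Hadamard product of Gram matrices
                    gram_ba *= B[i].T @ A[i].conj()
            # Least-squares update of factor j with all other factors held fixed.
            A[j] = B[j] @ gram_ba @ np.linalg.pinv(gram_aa)
    return A


# Rank-3 input whose three terms are scalar multiples of one product state.
rng = np.random.default_rng(1)
n = 6
base = [rng.standard_normal((2, 1)) + 1j * rng.standard_normal((2, 1)) for _ in range(n)]
B = [np.hstack([b, b, b]) for b in base]
B[0] = B[0] * np.array([1.0, 0.5j, -0.25])  # different scalar on each term

A = cp_als_reduce(B, target_rank=1)
exact = cp_to_dense(B)
rel_error = np.linalg.norm(cp_to_dense(A) - exact) / np.linalg.norm(exact)
print("relative truncation error:", rel_error)
```

In this constructed example the input has true rank 1, so the reduction is exact up to rounding; for inputs without such structure the same update leaves a nonzero truncation error.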

Performing rank reduction on intermediate tensors can introduce truncation error. For some quantum algorithms, this error is zero or small, thus not affecting the final outcome of the quantum algorithm. For other algorithms, the truncation error can be large, which results in significant deviation of the computed result from the exact solution. For a specific quantum algorithm, understanding whether the intermediate tensors can be accurately approximated through low rank truncation is valuable for assessing the difficulty of simulating the algorithm on a classical computer. We attempt to investigate such difficulty for a few well known quantum algorithms in this paper both analytically and numerically.

In this paper, we examine the use of low-rank approximation via CP decomposition to simulate several quantum algorithms. We choose to focus on using CP decomposition instead of MPS or general tensor networks to represent the input and intermediate tensors, because the (low) rank product structure of the input and intermediate tensors in the quantum algorithm is relatively easy to see and interpret in CP terms. Furthermore, some of the unitary operations, such as swapping two qubits, are relatively easy to implement for a CP decomposed tensor. The use of low-rank MPS and more general tensor networks in quantum circuit simulation can be found in [12], [13], [14], [15], [16].

The algorithms we examine include the quantum Fourier transform (QFT) [17] and quantum phase estimation [18], which are the building blocks of other quantum algorithms, Grover’s search algorithm [19], [20], and quantum walk [21], [22] algorithms, which are quantum extensions of classical random walks on graphs.

For both QFT and phase estimation, we show that we can accurately approximate the intermediate states by low-rank CP decomposition when the input states have special structures. For general input states, low-rank approximation can yield a large truncation error. For Grover’s search algorithm, we show analytically that the CP ranks of all the intermediate states are bounded by $a + 1$, where $a$ is the size of the marked set to be searched. Therefore, Grover’s algorithm can, in principle, always be simulated efficiently by using low-rank CP decomposition when the size of the marked set is small. For quantum walks, we show that accurate low-rank approximation is possible when the walk is performed on some graphs. However, rank reduction can be difficult when the walk is performed on a general graph.

We discuss two numerical algorithms for performing rank reduction for intermediate tensors produced in the simulation of the quantum circuit, CP-ALS and an alternative algorithm called direct elimination of scalar multiples (DESM). CP-ALS is a general and widely used algorithm for performing CP decomposition, but it may suffer from numerical issues when the initial amplitudes associated with some of the terms in CP decomposition are significantly smaller than those associated with other terms. In this case, the direct elimination of scalar multiples is more effective.
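The idea behind eliminating scalar multiples can be illustrated as follows: bring every rank-1 term into a canonical form (unit-norm columns with a fixed phase), then add up the weights of terms whose canonical columns coincide. This is our own minimal reading of the name DESM, written for illustration; the tolerance, the canonical form, and the function name `merge_scalar_multiples` are assumptions, and the paper's actual procedure may differ.

```python
import numpy as np


def cp_to_dense(factors):
    """Expand CP factors (each of shape (2, R)) into a dense vector; used only for checking."""
    state = np.zeros(2 ** len(factors), dtype=complex)
    for k in range(factors[0].shape[1]):
        term = np.ones(1, dtype=complex)
        for A in factors:
            term = np.kron(term, A[:, k])
        state += term
    return state


def merge_scalar_multiples(factors, tol=1e-10):
    """Merge CP terms that differ only by a scalar (illustrative sketch, assumed tolerance)."""
    n = len(factors)
    groups = []  # each entry: [accumulated weight, list of n canonical unit columns]
    for k in range(factors[0].shape[1]):
        weight = 1.0 + 0.0j
        columns = []
        for A in factors:
            col = A[:, k].astype(complex)
            norm = np.linalg.norm(col)
            if norm == 0.0:
                weight = 0.0
                break
            # Canonical form: unit norm, first non-negligible entry real and positive.
            pivot = col[np.flatnonzero(np.abs(col) > 1e-8 * norm)[0]]
            scale = norm * pivot / abs(pivot)
            columns.append(col / scale)
            weight *= scale
        if weight == 0.0:
            continue  # a zero column makes the whole term vanish
        for group in groups:
            if all(np.allclose(c, g, rtol=0.0, atol=tol) for c, g in zip(columns, group[1])):
                group[0] += weight
                break
        else:
            groups.append([weight, columns])
    kept = [g for g in groups if abs(g[0]) > tol]
    if not kept:
        return [np.zeros((2, 1), dtype=complex) for _ in range(n)]
    merged = []
    for i in range(n):
        cols = [(g[0] if i == 0 else 1.0) * g[1][i] for g in kept]
        merged.append(np.stack(cols, axis=1))
    return merged


rng = np.random.default_rng(0)
n = 5
u = [rng.standard_normal(2) + 1j * rng.standard_normal(2) for _ in range(n)]
v = [rng.standard_normal(2) + 1j * rng.standard_normal(2) for _ in range(n)]
factors = []
for i in range(n):
    # The second term equals 2j times the first, with the scalar hidden in two modes.
    hidden = (2.0 if i == 1 else 1j if i == 3 else 1.0) * u[i]
    factors.append(np.stack([u[i], hidden, v[i]], axis=1))

reduced = merge_scalar_multiples(factors)
print("rank before:", factors[0].shape[1], "rank after:", reduced[0].shape[1])
```

Because the merge compares normalized columns rather than minimizing a residual, it does not depend on the relative size of the weights, which is the situation in which the text notes that CP-ALS can struggle.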

We perform numerical experiments to test the feasibility of simulating these quantum algorithms using CP decomposition. Our results show that, by using CP decomposition and low rank representation/approximation, we can indeed simulate some quantum algorithms with a many-qubit input on a classical computer with high accuracy. Other quantum algorithms such as quantum walks on a general graph are more difficult to simulate, because the CP rank of the intermediate tensors grows rapidly as we move along the depth of the quantum circuit representation of the quantum algorithm.

In summary, this paper makes the following contributions.

• We provide detailed analysis for simulating several quantum algorithms using CP decomposition.
• We discuss the feasibility of two numerical algorithms for calculating the CP decomposition in the simulation of quantum algorithms. Our results show that CP-ALS is ineffective for some algorithms and that the direct elimination of scalar multiples (DESM) can be more effective.
• We numerically show that we can accurately simulate some quantum algorithms to be run on devices consisting of 60 qubits by using CP decomposition and low-rank representation/approximation.

This paper is organized as follows. In Section 2, we introduce the notations for quantum states, gates and circuits that are used throughout the paper. Section 3 provides the background of quantum algorithm simulations. We describe algorithms for constructing and updating low-rank CP decompositions of a tensor in Section 4. In Sections 5, 6, and 7, we examine the possibility of using low-rank approximations to simulate QFT and phase estimation, Grover’s search algorithm, and quantum walks, respectively. In Section 8, we compare the computational and memory costs of simulating different quantum algorithms using CP decomposition. In Section 9, we report numerical results that demonstrate the effectiveness of using low-rank approximation to simulate quantum algorithms.

Section snippets

Notations for quantum states, gates and circuits

Our analysis makes use of tensor algebra in both element-wise equations and specialized notation for tensor operations [23]. For vectors, lowercase Roman letters are used, e.g., $v$. For matrices and quantum gates, uppercase Roman letters are used, e.g., $M$. For tensors, calligraphic fonts are used, e.g., $\mathcal{T}$. An order-$n$ tensor corresponds to an $n$-dimensional array with dimensions $s_{1} \times \dots \times s_{n}$. In the following discussions, we assume that $s_{1} = \dots = s_{n} = 2$. Elements of tensors are denoted in subscripts, e.g., $\mathcal{T}_{ij}$

Simulation of quantum algorithms

Although tremendous progress has been made in the development of quantum computing hardware [25], [26], enormous engineering challenges remain in producing reliable quantum computers with a sufficient number of qubits required for solving practical problems. However, these challenges should not prevent us from developing quantum algorithms that can be deployed once reliable hardware becomes available. Our understanding of many quantum algorithms can be improved by simulating these algorithms on

Low-rank approximation in quantum algorithm simulation

In this section, we discuss two techniques for reducing the rank of a CP decomposition of the tensor in the context of quantum algorithm simulation. Before we describe the details of these techniques, we first outline the basic procedure of using low rank approximation in the simulation of a quantum algorithm represented by a quantum circuit (1.1) in Algorithm 1. Rank reduction techniques are used in Line 6 of the algorithm.
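The need for rank reduction inside the simulation loop can be seen from how a controlled gate acts on a CP state. Writing the gate as $|0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes G$, each of the two terms preserves the CP rank, so their sum has rank at most $2R$. The sketch below illustrates this with hypothetical helper names; it is not Algorithm 1 from the paper and omits the rank reduction step itself.

```python
import numpy as np

P0 = np.array([[1, 0], [0, 0]], dtype=complex)  # projector |0><0|
P1 = np.array([[0, 0], [0, 1]], dtype=complex)  # projector |1><1|


def cp_to_dense(factors):
    """Expand CP factors (each of shape (2, R)) into a dense vector; used only for checking."""
    state = np.zeros(2 ** len(factors), dtype=complex)
    for k in range(factors[0].shape[1]):
        term = np.ones(1, dtype=complex)
        for A in factors:
            term = np.kron(term, A[:, k])
        state += term
    return state


def apply_one_qubit_gate(factors, gate, qubit):
    """Apply a 2x2 gate to one qubit; the CP rank is unchanged."""
    new_factors = list(factors)
    new_factors[qubit] = gate @ factors[qubit]
    return new_factors


def apply_controlled_gate(factors, gate, control, target):
    """Apply P0 (x) I + P1 (x) gate; the output has twice as many CP terms as the input."""
    branch0 = apply_one_qubit_gate(factors, P0, control)
    branch1 = apply_one_qubit_gate(apply_one_qubit_gate(factors, P1, control), gate, target)
    return [np.hstack([a, b]) for a, b in zip(branch0, branch1)]


hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
pauli_x = np.array([[0, 1], [1, 0]], dtype=complex)

# Start from |00> (rank 1), apply H on qubit 0, then CNOT with control 0 and target 1.
state = [np.array([[1.0], [0.0]], dtype=complex) for _ in range(2)]
state = apply_one_qubit_gate(state, hadamard, qubit=0)
bell = apply_controlled_gate(state, pauli_x, control=0, target=1)
print("rank after H:", state[0].shape[1], "rank after CNOT:", bell[0].shape[1])
```

Without a reduction step, a circuit with $m$ such controlled gates could produce up to $2^{m} R$ terms, which is why the rank is checked against a threshold after each gate.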

We should note that for some quantum algorithms, the unitary

Quantum Fourier transform

The quantum Fourier transform (QFT) uses a special decomposition [17], [28] of the discrete Fourier transform $F^{(N)}$ defined by

$$F^{(N)} := \frac{1}{\sqrt{N}} \begin{bmatrix} \omega_{N}^{0} & \omega_{N}^{0} & \omega_{N}^{0} & \cdots & \omega_{N}^{0} \\ \omega_{N}^{0} & \omega_{N}^{1} & \omega_{N}^{2} & \cdots & \omega_{N}^{N-1} \\ \omega_{N}^{0} & \omega_{N}^{2} & \omega_{N}^{4} & \cdots & \omega_{N}^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \omega_{N}^{0} & \omega_{N}^{N-1} & \omega_{N}^{2(N-1)} & \cdots & \omega_{N}^{(N-1)(N-1)} \end{bmatrix} \in \mathbb{C}^{N \times N},$$

where $N = 2^{n}$, and the output is $y = F^{(N)} x$ for the input vector $x \in \mathbb{C}^{N}$. We show the quantum circuit for QFT in Fig. 1. As shown in the figure, an $n$-qubit QFT circuit consists of $n$ 1-qubit Hadamard gates, $\lfloor n/2 \rfloor$ SWAP gates, and $n - 1$ controlled unitary ($R_{i}$) gates. Without rank reduction,

Grover’s algorithm

Search is a common problem in information science. Grover’s algorithm [19] achieves a quadratic speed-up compared with classical search algorithms. We first examine the possibility of simulating Grover’s algorithm with only one marked item using the CP representation of the tensor, and then generalize the analysis to cases with multiple marked items.
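For a single marked item, the low-rank structure can be illustrated directly. The state stays in the span of two product states, the uniform superposition $|s\rangle = |{+}\rangle^{\otimes n}$ and the marked basis state $|x^{*}\rangle$, so a rank-2 CP form with fixed factors and two changing weights suffices. The sketch below tracks only those two weights; it assumes the standard oracle $I - 2|x^{*}\rangle\langle x^{*}|$ and diffusion operator $2|s\rangle\langle s| - I$, and is an illustration rather than the paper's implementation.

```python
import math


def grover_rank2_weights(n_qubits, iterations):
    """Track |psi> = alpha |s> + beta |x*> through Grover iterations (one marked item)."""
    overlap = 1.0 / math.sqrt(2 ** n_qubits)  # <s|x*>
    alpha, beta = 1.0, 0.0
    for _ in range(iterations):
        # Oracle I - 2|x*><x*| changes only the weight of |x*>.
        beta = -beta - 2.0 * alpha * overlap
        # Diffusion 2|s><s| - I mixes the |x*> weight into |s> and flips its sign.
        alpha, beta = alpha + 2.0 * beta * overlap, -beta
    return alpha, beta


n = 20
theta = math.asin(1.0 / math.sqrt(2 ** n))
k = int(math.pi / (4 * theta))  # usual choice for the number of iterations
alpha, beta = grover_rank2_weights(n, k)

# Amplitude of the marked item: <x*|psi> = alpha <x*|s> + beta.
success = (alpha / math.sqrt(2 ** n) + beta) ** 2
print(f"{k} iterations, success probability {success:.8f}")
```

The cost per iteration here is independent of $n$, although the number of iterations still grows as $\sqrt{2^{n}}$.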

Quantum walks

Quantum walks [32], [33] play an important role in the development of many quantum algorithms, including quantum search algorithms [34] and the quantum page rank algorithm [35]. A quantum walk operator is the quantum extension of a classical random walk operator that has been studied extensively in several scientific disciplines. A classical random walk is characterized by an $N \times N$ Markov chain stochastic matrix $P$ associated with a graph with $N$ vertices. There is an edge from the $j$th vertex to the $i$

Summary of computational cost

The use of low rank CP decomposition to represent the input and intermediate states in the simulation of a quantum algorithm allows us to significantly reduce the memory requirement of the simulation. If the rank of all intermediate states can be bounded by a small constant, then the memory requirement of the simulation is linear with respect to the number of qubits $n$. This is significantly less than the memory required to simulate a quantum algorithm directly, which is exponential with respect

Experimental results

In this section, we demonstrate the efficacy of using low rank approximation in the simulation of four quantum algorithms: QFT, phase estimation, Grover’s algorithm and quantum walks. We implemented our algorithms on top of an open-source Python library, “Koala” [14], which is a quantum circuit/state simulator and provides an interface to several numerical libraries, including NumPy [39] for CPU execution and CuPy [40] for GPU execution. All of our code is available at https://github.com/LinjianMa/koala

Conclusions

In this paper, we examined the possibility of using low-rank approximation via CP decomposition to simulate quantum algorithms on classical computers. The quantum algorithms we have considered include the quantum Fourier transform, phase estimation, Grover’s algorithm and quantum walks.

For QFT, we have shown that all the intermediate states within the QFT quantum circuit and the output of the transform are rank-1 when the input is a standard (computational) basis. The same observation holds for

CRediT authorship contribution statement

Linjian Ma: Conceptualization, Methodology, Software, Validation, Formal analysis, Writing – original draft, Writing – review & editing. Chao Yang: Conceptualization, Methodology, Formal analysis, Writing – original draft, Writing – review & editing, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported by the U.S. Department of Energy (DOE) under Contract No. DE-AC02-05CH11231, through the Office of Advanced Scientific Computing Research Accelerated Research (ASCR) for Quantum Computing Program, the Fundamental Algorithmic Research for Quantum Computing (FAR-QC) project and the SciDAC Program.

Linjian Ma is a Ph.D. student in the Department of Computer Science at University of Illinois at Urbana-Champaign (UIUC), advised by Edgar Solomonik. His research interests lie in the intersection of numerical algorithms, high performance computing, system and quantum simulation.

References (40)

Chen, Zhao-Yun et al. 64-Qubit quantum circuit simulation. Sci. Bull. (2018)
Schollwöck, Ulrich. The density-matrix renormalization group in the age of matrix product states. Ann. Phys. (2011)
Loke, T. et al. Efficient quantum circuits for Szegedy quantum walks. Ann. Phys. (2017)
Pednault, Edwin et al. Breaking the 49-qubit barrier in the simulation of quantum circuits (2017)
Pednault, Edwin et al. Leveraging secondary storage to simulate deep 54-qubit Sycamore circuits (2019)
Xin-Chuan Wu, Sheng Di, Emma Maitreyee Dasgupta, Franck Cappello, Hal Finkel, Yuri Alexeev, Frederic T Chong,...
Hitchcock, Frank L. The expression of a tensor or a polyadic as a sum of products. Stud. Appl. Math. (1927)
Harshman, Richard A. Foundations of the PARAFAC procedure: models and conditions for an explanatory multimodal factor analysis (1970)
Oseledets, Ivan V. Tensor-train decomposition. SIAM J. Sci. Comput. (2011)
Markov, Igor L. et al. Simulating quantum computation by contracting tensor networks. SIAM J. Comput. (2008)
Carroll, J. Douglas et al. Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika (1970)
Harshman, Richard A. Determination and proof of minimum uniqueness conditions for PARAFAC1. UCLA Working Pap. Phonetics (1972)
Zhou, Yiqing et al. What limits the simulation of quantum computers? (2020)
Gray, Johnnie et al. Hyper-optimized tensor network contraction (2020)
Yuchen Pang, Tianyi Hao, Annika Dugad, Yiqing Zhou, Edgar Solomonik, Efficient 2D Tensor Network Simulation of Quantum...
Guo, Chu et al. General-purpose quantum circuit simulator with projected entangled-pair states and the quantum supremacy frontier. Phys. Rev. Lett. (2019)
Chamon, Claudio et al. Virtual parallel computing and a search algorithm using matrix product states. Phys. Rev. Lett. (2012)
Coppersmith, Don. An approximate Fourier transform useful in quantum factoring (2002)
Cleve, Richard et al. Quantum algorithms revisited. Proc. R. Soc. London. Ser. A (1998)
Grover, Lov K. A fast quantum mechanical algorithm for database search

Chao Yang is a senior scientist in the Applied Mathematics and Computational Research Division at Lawrence Berkeley National Laboratory (LBNL). He received his Ph.D. from Rice University in 1998. He was a Householder fellow at the Oak Ridge National Laboratory from 1999 to 2000. He joined LBNL in 2000. His research interests include numerical linear algebra with applications in electronic structure calculations and quantum many-body problems, inverse problems, high performance computing and quantum computing. He is a member of SIAM.
