The idea is that both probability theory and quantum computing share the concept of a tensor. What's more, entanglement shows up in both fields. In this post, I would like to demonstrate this connection, and try to talk about quantum parallelism.
What is a tensor? By definition, a tensor is a multilinear function on some vector space \(\mathcal{V}\) over a field \(\mathbb{F}\); conversely, any multilinear function on \(\mathcal{V}\) over \(\mathbb{F}\) is a tensor. That's it!
The definition of multilinear function:
\[ \begin{aligned} & T (..., \alpha \ket{x_0} + \beta \ket{x_1}, ...) = \alpha T (..., \ket{x_0}, ...) + \beta T (..., \ket{x_1}, ...) \\ & \forall \ket{x_0}, \ket{x_1} \in \mathcal{V}\\ & \forall \alpha, \beta \in \mathbb{F} \end{aligned} \]So a tensor is not just an NDarray, a term the machine learning community habitually abuses. But a tensor can be represented by an NDarray, once a basis is given.
Take a multilinear binary function \(T(\ket{a}, \ket{b})\) as an example:
if \(\ket{a}\) and \(\ket{b}\) each live in a two-dimensional vector space, then each can be decomposed into a linear combination of basis vectors:
\[ \begin{aligned} & \ket{a} = a_0 \ket{0} + a_1 \ket{1} \\ & \ket{b} = b_0 \ket{0} + b_1 \ket{1} \end{aligned} \]We immediately get:
\[ T(\ket{a}, \ket{b}) = a_0 b_0 T(\ket{0}, \ket{0}) + a_0b_1 T(\ket{0}, \ket{1}) + a_1b_0 T(\ket{1}, \ket{0}) + a_1b_1 T(\ket{1}, \ket{1}) \]and this can be rearranged into a matrix multiplication form:
\[ \begin{aligned} T(\ket{a}, \ket{b}) = \; & a_0 b_0 T(\ket{0}, \ket{0}) + a_0b_1 T(\ket{0}, \ket{1}) \\ + \; & a_1b_0 T(\ket{1}, \ket{0}) + a_1b_1 T(\ket{1}, \ket{1}) \end{aligned} \]For better visualization, we rename the function applications \(T(\ket{0}, \ket{0})\), \(T(\ket{0}, \ket{1})\), ... to \(T_{00}\), \(T_{01}\), ...:
\[ T(\ket{a}, \ket{b}) = \begin{bmatrix} a_0 & a_1 \\ \end{bmatrix} \begin{bmatrix} T_{00} & T_{01} \\ T_{10} & T_{11} \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ \end{bmatrix} \]We can easily see that the matrix
\[ \begin{bmatrix} T_{00} & T_{01} \\ T_{10} & T_{11} \end{bmatrix} \]is the expansion of our tensor onto the computational basis.
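To make this concrete, here is a small NumPy sketch of the row-matrix-column evaluation above; the entries of \(T\), \(\ket{a}\) and \(\ket{b}\) are made up purely for illustration:

```python
import numpy as np

# The tensor, expanded on the computational basis, is a 2x2 NDarray
# of its values on basis vectors: T[i, j] = T(|i>, |j>).
T = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Coefficients of |a> and |b> in the same basis (arbitrary numbers).
a = np.array([0.6, 0.8])
b = np.array([0.5, 0.5])

# Evaluating the multilinear function is the row-matrix-column product.
value = a @ T @ b

# The same value, written as the explicit double sum over basis terms.
value_sum = sum(a[i] * b[j] * T[i, j] for i in range(2) for j in range(2))
assert np.isclose(value, value_sum)
```

Both expressions compute the same double sum \(\sum_{ij} a_i b_j T_{ij}\); the matrix product is just a convenient way to organize it.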
So in this two-variable case, a tensor can be evaluated as a matrix multiplication of a row vector, a 2D matrix and a column vector on the computational basis. This view generalizes: given more than 2 variables, a tensor becomes an NDarray combined with a multiply-broadcast and sum-reduce operation.
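The multiply-broadcast-then-sum-reduce view can be sketched with `np.einsum`; the rank-3 tensor and the three input vectors below are random, just to show the mechanics:

```python
import numpy as np

rng = np.random.default_rng(0)

# A rank-3 tensor on a 2-dimensional space: T[i, j, k] = T(|i>, |j>, |k>).
T = rng.normal(size=(2, 2, 2))

# Three input vectors, given by their coefficients in the basis.
a, b, c = rng.normal(size=(3, 2))

# Multiply-broadcast the coefficients against T, then sum-reduce
# over every index; einsum expresses exactly this contraction.
value = np.einsum('ijk,i,j,k->', T, a, b, c)

# The equivalent explicit triple sum over basis terms.
value_sum = sum(T[i, j, k] * a[i] * b[j] * c[k]
                for i in range(2) for j in range(2) for k in range(2))
assert np.isclose(value, value_sum)
```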
Conventionally, we say a tensor with \(N\) variables (like \(\ket{a}\), \(\ket{b}\) in the case above) is a rank-\(N\) tensor. We also say a rank-\(N\) tensor is a tensor with \(N\) indices (i.e. variables can also be referred to as indices).
A tensor can also be defined with some variables on the dual space, which means the joint domain is
\[ V \times ... \times V \times V^* \times ... \times V^* \](remember that our tensor is a function):
\[ T : V \times ... \times V \times V^* \times ... \times V^* \to \mathbb{F} \]Conventionally, we use lower indices for the primal space and upper indices for the dual space, which means our tensor can be written as
\[ T_{ijk...}^{lmn...} = T(\ket{i}\ket{j}\ket{k}...\bra{l}\bra{m}\bra{n}...) \]From the example above, we can easily expect that, given a change of basis (a rotation, for example), the specific entries of
\[ \begin{bmatrix} T_{00} & T_{01} \\ T_{10} & T_{11} \end{bmatrix} \]may change; however, the value of \(T(\ket{a}, \ket{b})\) for any \(\ket{a}\) and \(\ket{b}\) remains the same, which indicates that a tensor (as a multilinear function) is invariant under a change of basis of the vector space.
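This invariance is easy to check numerically. In the sketch below (random tensor and vectors, rotation angle chosen arbitrarily), the new basis vectors are the columns of a rotation matrix \(R\), so coefficient vectors transform by \(R^\top\) and the tensor's matrix transforms as \(R^\top T R\); the entries change but the evaluated value does not:

```python
import numpy as np

rng = np.random.default_rng(1)

# A rank-2 tensor and two vectors, as coefficient arrays in the old basis.
T = rng.normal(size=(2, 2))
a = rng.normal(size=2)
b = rng.normal(size=2)

# An orthogonal change of basis: a rotation by an arbitrary angle.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Coefficients in the new basis (R is orthogonal, so R^-1 = R^T).
a_new = R.T @ a
b_new = R.T @ b

# The tensor's matrix picks up one factor of R per index.
T_new = R.T @ T @ R

# The entries differ, but the value of T(a, b) is basis-independent.
assert not np.allclose(T, T_new)
assert np.isclose(a @ T @ b, a_new @ T_new @ b_new)
```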
Now let's turn to our main topic: understanding probability distributions and quantum computing through the concept of a tensor. Let's start with probability distributions.
For simplicity, we mainly focus on discrete probability distributions; continuous distributions are more complex and may not fit into our discussion.
... to be continued