Tensor

September 4, 2018 · 6 minute read

Reference:

什么是张量 (tensor)？ (What is a tensor?) - answer by 玟清 on Zhihu

1. Preliminaries: covector

Let $V$ be a vector space over a field $K$. The following four names:

linear functional
linear form
one-form
covector

all refer to the same object: a function $f : V \to K$ that satisfies linearity:

$f (v + w) = f (v) + f (w), \forall v, w \in V$
$f (a v) = a f (v), \forall v \in V, \forall a \in K$

Usually the field $K$ is $R$, so a covector is simply a function that takes a (column) vector as its argument and returns a real number.

In essence, we can also think of a covector as a (row) vector, because we can define $f (v) = w^{T} v = a \in R$. Going further, collecting all such functions $f$ gives the set $\mathrm{Hom}_{K} (V, K) = \{f : V \to K \mid f \text{ is linear}\}$, which forms a vector space over $K$ under pointwise addition and scalar multiplication. We call this vector space $V^{*}$; it is the (algebraic) dual space of $V$.

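Notes:

$\mathrm{Hom}$ stands for “homomorphism”: a map from one set, say $A$, into another, say $A^{'}$, such that the relations between elements of $A$ are preserved in $A^{'}$.
From another angle, a vector $v$ can be viewed as the linear map $g : K \to V$, $a \mapsto a v$, but studying this does not seem to lead anywhere useful here.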
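The "covector as a row vector" picture can be sketched in a few lines of plain Python. The helper names (`make_covector`, `add`, `scale`) and the sample numbers are illustrative assumptions, not part of the original post; the sketch only checks the two linearity conditions on one example.

```python
def make_covector(w):
    """Return the linear functional f(v) = w^T v for a fixed coefficient row w."""
    def f(v):
        return sum(w_i * v_i for w_i, v_i in zip(w, v))
    return f

def add(v, u):
    """Componentwise vector addition."""
    return [a + b for a, b in zip(v, u)]

def scale(a, v):
    """Scalar multiplication of a vector."""
    return [a * x for x in v]

# Hypothetical covector on R^3 and two sample vectors.
f = make_covector([1, -2, 3])
v, u = [4, 0, 1], [2, 5, -1]

# The two linearity conditions, checked on these samples.
assert f(add(v, u)) == f(v) + f(u)
assert f(scale(7, v)) == 7 * f(v)
```

Integers are used so that the equalities are exact; with floats one would compare up to a tolerance.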

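2. Preliminaries: Einstein summation convention

This is really just a shorthand. For example, $y = \sum_{i = 1}^{3} c_{i} x^{i} = c_{1} x^{1} + c_{2} x^{2} + c_{3} x^{3}$ is abbreviated to $y = c_{i} x^{i}$: an index that appears once as a subscript and once as a superscript is summed over. Note that $x^{i}$ here means “the $i$-th component”, not “the $i$-th power”.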

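Applied to vectors: given $v \in R^{n}$, we usually expand it as:

$$v = \sum_{i = 1}^{n} v^{i} e_{i}$$

where the $v^{i}$ are the components and the $e_{i}$ are the basis vectors, i.e.:

$e_{1} = [1 \; 0 \; 0 \; \dots \; 0]^{T} \in R^{n}$
$e_{2} = [0 \; 1 \; 0 \; \dots \; 0]^{T} \in R^{n}$
$\dots$
$e_{n} = [0 \; 0 \; 0 \; \dots \; 1]^{T} \in R^{n}$

With the Einstein summation convention this is abbreviated to:

$$v = v^{i} e_{i}$$

To distinguish it from a vector, a covector $w^{T}$ is usually written with the index positions swapped:

$$w^{T} = w_{i} e^{i}$$

where:

$e^{1} = [1 \; 0 \; 0 \; \dots \; 0]$
$e^{2} = [0 \; 1 \; 0 \; \dots \; 0]$
$\dots$
$e^{n} = [0 \; 0 \; 0 \; \dots \; 1]$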
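To make the shorthand concrete, here is a minimal sketch in plain Python that expands a vector in the standard basis and then contracts covector components against vector components. The function name `einsum_pair` and all numbers are assumptions made for illustration.

```python
def einsum_pair(lower, upper):
    """Contract one index: returns the sum over i of lower[i] * upper[i], i.e. c_i x^i."""
    assert len(lower) == len(upper)
    return sum(c * x for c, x in zip(lower, upper))

n = 3
# Standard basis e_1, ..., e_n of R^n, each stored as a list of length n.
basis = [[1 if j == i else 0 for j in range(n)] for i in range(n)]

v_components = [2, -1, 5]  # components of the vector v
# Expansion of v in the basis: sum over i of (component i) * (basis vector i).
v = [sum(v_components[i] * basis[i][k] for i in range(n)) for k in range(n)]

w_components = [3, 0, 4]   # components of a covector w^T
y = einsum_pair(w_components, v_components)  # w^T v written as a single contraction
```

In the standard basis the expansion simply returns the component list, which is why the basis is so often left implicit.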

3. Preliminaries: Banach space / vector space of continuous $n$-linear maps

Wikipedia: Banach space

… a Banach space (pronounced [ˈbanax]) is a complete normed vector space. Thus, a Banach space is a vector space with a metric that allows the computation of vector length and distance between vectors and is complete in the sense that a Cauchy sequence of vectors always converges to a well defined limit that is within the space.

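Let $V, K, \dots$ denote Banach spaces. We write $L^{n} (V_{1}, \dots, V_{n}; K)$ for the vector space of continuous $n$-linear maps $V_{1} \times \dots \times V_{n} \to K$. (Here $K$ is the target Banach space rather than the scalar field of section 1; in the cases of interest it is $R$.)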

Notes:

Here the $\times$ in $V_{i} \times V_{j}$ is the Cartesian product; that is, $f : V_{1} \times \dots \times V_{n} \to K$ is a function that takes an $n$-tuple of vectors and returns an element of $K$
- i.e. $f (v_{1}, \dots, v_{n}) = k$ where $v_{i} \in V_{i}, k \in K$
An $n$-tuple of covectors, say $(w_{1}^{T}, \dots, w_{n}^{T})$, gives one such $f : V_{1} \times \dots \times V_{n} \to R$, because we can define $f (v_{1}, \dots, v_{n}) = \prod_{i = 1}^{n} w_{i}^{T} v_{i} = a \in R$. Not every $n$-linear map has this product form; in finite dimensions a general one is a finite sum of such products.
With this picture, $L^{n} (V_{1}, \dots, V_{n}; R)$ can be thought of as the space generated by “$n$-tuples of covectors”. Combined with the last part of Digest of Essence of Linear Algebra: $n$-linear maps, added and scaled pointwise, satisfy the 8 axioms of vector addition and scaling, so each can be regarded as a generalized vector; in other words, $L^{n} (V_{1}, \dots, V_{n}; R)$ is itself a generalized vector space
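The notes above can be sketched as follows: a fixed tuple of covectors yields one $n$-linear map, and the pointwise sum of two such maps is again $n$-linear. The names `multilinear_from_covectors`, `f`, `g`, `h` and the sample data are hypothetical, and the checks cover only a couple of sample inputs, not a proof.

```python
def dot(w, v):
    """Pairing w^T v of a covector (row) with a vector (column)."""
    return sum(a * b for a, b in zip(w, v))

def multilinear_from_covectors(covectors):
    """Build f(v_1, ..., v_n) = product over i of (w_i^T v_i) for fixed covectors w_i."""
    def f(*vectors):
        assert len(vectors) == len(covectors)
        result = 1
        for w, v in zip(covectors, vectors):
            result *= dot(w, v)
        return result
    return f

# Two hypothetical bilinear maps on R^2 x R^2.
f = multilinear_from_covectors([[1, 2], [0, 3]])
g = multilinear_from_covectors([[1, 0], [1, 1]])

def h(*vectors):
    """Pointwise sum f + g: the vector-space addition on the space of bilinear maps."""
    return f(*vectors) + g(*vectors)

v1, v1b, v2 = [1, 1], [2, -1], [4, 5]

# Additivity of f in the first slot, with the second slot held fixed.
assert f([a + b for a, b in zip(v1, v1b)], v2) == f(v1, v2) + f(v1b, v2)
# Homogeneity of the sum h in the first slot.
assert h([3 * a for a in v1], v2) == 3 * h(v1, v2)
```

Note that `h` is in general not itself of the single-product form, which is why the space of $n$-linear maps is larger than the set of $n$-tuples of covectors.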

4. Tensor Space / Rank

For a vector space $V$ , we define:

$$T_{s}^{r} (V) = L^{r + s} (\underbrace{V^{*}, \dots, V^{*}}_{r}, \underbrace{V, \dots, V}_{s}; R)$$

Elements of $T_{s}^{r} (V)$ are called tensors on $V$ , contravariant of order $r$ and covariant of order $s$ ; or simply, of type $(r, s)$ .

Special cases:

$T_{0}^{0} (V) = R$
$T_{0}^{1} (V) = L (V^{*}; R) = V$
$T_{0}^{2} (V) = L (V^{*}, V^{*}; R) = L (V^{*}; V)$
$T_{1}^{0} (V) = L (V; R) = V^{*}$
$T_{2}^{0} (V) = L (V, V; R) = L (V; V^{*})$
$T_{1}^{1} (V) = L (V^{*}, V; R) = L (V; V) = L (V^{*}; V^{*})$

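Note: the value $r + s$ is called the rank (also called the order) of the tensor. From the special cases:

An order-0 tensor is a scalar
An order-1 tensor is a vector/covector
Among order-2 tensors, only $T_{1}^{1} (V) = L (V; V)$ is a matrix in the sense of a linear map from $V$ to $V$
- i.e. you can say that a matrix is an order-2 tensor, but not that every order-2 tensor is a matrix
- On closer inspection, every order-$2 n$ tensor in $T_{n}^{n} (V)$ is in fact a matrix; see the discussion in 6.2
  - of course, “matrix” here always means a 2-D real matrix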
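The identification $T_{1}^{1} (V) = L (V; V)$ can be illustrated with a small sketch: a matrix $A$ defines a map taking one covector and one vector to a real number, and evaluating that map on basis elements recovers the entries of $A$. The matrix `A` and the names `t`, `matvec`, `recovered` are assumptions chosen for illustration.

```python
def dot(w, v):
    """Pairing w^T v."""
    return sum(a * b for a, b in zip(w, v))

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[1, 2], [3, 4]]  # hypothetical linear map V -> V on V = R^2

def t(w, v):
    """The (1,1)-tensor attached to A: (covector w, vector v) -> w^T A v."""
    return dot(w, matvec(A, v))

# Standard basis; in R^2 the dual basis has the same coordinate lists.
e = [[1, 0], [0, 1]]

# Evaluating t on (e^i, e_j) gives back the entry A[i][j].
recovered = [[t(e[i], e[j]) for j in range(2)] for i in range(2)]
assert recovered == A
```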

5. Tensor

From the viewpoint of the tensor space, a tensor is simply a multilinear map $f : \underbrace{V^{*} \times \dots \times V^{*}}_{r} \times \underbrace{V \times \dots \times V}_{s} \to R$

Viewed as a function, the $V^{*}$ and $V$ arguments could be shuffled or interleaved; for convenience, the definition of a tensor fixes the order “all $V^{*}$ first, then all $V$”

An $(r + s)$-tuple of vectors/covectors $(p_{1}, \dots, p_{r}, q_{1}^{T}, \dots, q_{s}^{T})$ gives such an $f$, because we can define:

$$f (w_{1}^{T}, \dots, w_{r}^{T}, v_{1}, \dots, v_{s}) = \prod_{i = 1}^{r} w_{i}^{T} p_{i} \times \prod_{i = 1}^{s} q_{i}^{T} v_{i} = a \in R$$

A general tensor is a finite sum of maps of this form.
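For the smallest mixed case $r = s = 1$, the product formula above can be checked against the matrix $p q^{T}$. This is only a sketch with made-up values for `p`, `q`, `w`, `v`; the names are not from the original post.

```python
def dot(w, v):
    """Pairing w^T v."""
    return sum(a * b for a, b in zip(w, v))

p = [1, 2]    # vector paired with the covector argument
q = [3, -1]   # covector paired with the vector argument

def f(w, v):
    """Type (1,1) tensor built from the pair (p, q^T): f(w^T, v) = (w^T p) * (q^T v)."""
    return dot(w, p) * dot(q, v)

# The same tensor as a matrix: outer[i][j] = p[i] * q[j], i.e. p q^T.
outer = [[p_i * q_j for q_j in q] for p_i in p]

w, v = [2, 5], [1, 4]
# w^T (p q^T) v computed entry by entry.
via_matrix = sum(w[i] * outer[i][j] * v[j] for i in range(2) for j in range(2))

assert f(w, v) == via_matrix
```

The matrix `outer` has rank one; a general element of $T_{1}^{1} (V)$ is a sum of such rank-one pieces.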

6. Tensor Product Operator

For notational convenience we introduce the tensor product operator $\otimes$. It is used in two settings:

For two tensor spaces $\Theta_{1}$ and $\Theta_{2}$ over a vector space $V$, the tensor product $\Theta_{1} \otimes \Theta_{2}$ is again a tensor space over $V$
- Special case: if $V$ and $W$ are both vector spaces (i.e. $T_{0}^{1} (V)$) over a field $K$, then $V \otimes_{K} W$ is again a vector space over $K$
For two tensors $t_{1}$ and $t_{2}$, the tensor product $t_{1} \otimes t_{2}$ is again a tensor
- Special case: $\otimes$ can also be applied between vectors, covectors, and matrices

If $t_{1} \in \Theta_{1}, t_{2} \in \Theta_{2}$, then $t_{1} \otimes t_{2} \in \Theta_{1} \otimes \Theta_{2}$.

I think it is better not to focus on the computational details for now; grasping the overall rules of computation matters more.

6.1 Expressing $L$ with $\otimes$

The logic here takes a small detour:

Suppose $f : \underbrace{V^{*} \times \dots \times V^{*}}_{r} \times \underbrace{V \times \dots \times V}_{s} \to R$; then $f \in T_{s}^{r} (V) = L^{r + s} (\underbrace{V^{*}, \dots, V^{*}}_{r}, \underbrace{V, \dots, V}_{s}; R)$
Also, $f$ can be built from $(r + s)$-tuples of vectors/covectors $(p_{1}, \dots, p_{r}, q_{1}^{T}, \dots, q_{s}^{T})$
So, for finite-dimensional $V$, we can write $T_{s}^{r} (V) = \underbrace{V \otimes \dots \otimes V}_{r} \otimes \underbrace{V^{*} \otimes \dots \otimes V^{*}}_{s}$
- Note that the order of $V$ and $V^{*}$ here is the reverse of that in $L$, and matches that of the $(r + s)$-tuple
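One way to see why a space of multilinear maps can be written with $\otimes$ is to note that a bilinear map is fully determined by its values on pairs of basis vectors. The sketch below does this for a hypothetical bilinear form on $R^{2}$ (an element of $T_{2}^{0} (V) = V^{*} \otimes V^{*}$); the form `f` and the names `t`, `f_from_components` are assumptions for illustration.

```python
def f(u, v):
    """A hypothetical bilinear form on R^2, written out explicitly."""
    return 2 * u[0] * v[0] - u[0] * v[1] + 4 * u[1] * v[1]

e = [[1, 0], [0, 1]]  # standard basis of R^2

# Components t[i][j] = f(e_i, e_j): the coefficient of the basis element e^i (x) e^j.
t = [[f(e[i], e[j]) for j in range(2)] for i in range(2)]

def f_from_components(u, v):
    """Rebuild f from its components: sum over i, j of t[i][j] * u[i] * v[j]."""
    return sum(t[i][j] * u[i] * v[j] for i in range(2) for j in range(2))

u, v = [3, 1], [-2, 5]
assert f(u, v) == f_from_components(u, v)
```

With $\dim V = 2$ there are $2 \times 2 = 4$ components, matching $\dim (V^{*} \otimes V^{*}) = 4$.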

6.2 Rank of Tensor Product

Basic rule (see StackExchange: Understanding the definition of tensors as multilinear maps, by celtschk):

$$T_{s}^{r} (V) \otimes T_{s'}^{r'} (V) \to T_{s + s'}^{r + r'} (V)$$

$$(f_{1} \otimes f_{2}) (\underbrace{\kappa, \dots, \lambda, \mu, \dots, \nu}_{r + r'}, \underbrace{u, \dots, v, w, \dots, z}_{s + s'}) = f_{1} (\underbrace{\kappa, \dots, \lambda}_{r}, \underbrace{u, \dots, v}_{s}) \cdot f_{2} (\underbrace{\mu, \dots, \nu}_{r'}, \underbrace{w, \dots, z}_{s'})$$
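The rule above translates almost literally into code: the product map splits its covector arguments and its vector arguments between the two factors and multiplies the results. The following is a sketch under the assumption that tensors are represented as plain Python functions; `tensor_product` and the sample tensors `f1`, `f2` are hypothetical names.

```python
def dot(w, v):
    """Pairing w^T v."""
    return sum(a * b for a, b in zip(w, v))

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def tensor_product(f1, r1, s1, f2, r2, s2):
    """Tensor product of f1 (type (r1, s1)) and f2 (type (r2, s2)) as multilinear maps.

    The result takes r1 + r2 covectors followed by s1 + s2 vectors.
    """
    def product(*args):
        assert len(args) == r1 + r2 + s1 + s2
        covectors, vectors = args[:r1 + r2], args[r1 + r2:]
        return (f1(*covectors[:r1], *vectors[:s1])
                * f2(*covectors[r1:], *vectors[s1:]))
    return product

A = [[1, 2], [3, 4]]
c = [2, -1]

def f1(k, u):
    """Type (1,1): (covector k, vector u) -> k^T A u."""
    return dot(k, matvec(A, u))

def f2(w):
    """Type (0,1): vector w -> c^T w."""
    return dot(c, w)

f12 = tensor_product(f1, 1, 1, f2, 0, 1)  # type (1, 2)

k, u, w = [1, 0], [1, 1], [5, 3]
value = f12(k, u, w)  # equals f1(k, u) * f2(w)

# Linearity in the last slot: doubling w doubles the value.
assert f12(k, u, [2 * x for x in w]) == 2 * value
```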

For $T_{n}^{n} (V)$ we can take the discussion a step further:

Following the example in Wikipedia: Tensor product of linear maps, matrices correspond to tensors and the Kronecker product of matrices corresponds to the tensor product. The Kronecker product of two $2 \times 2$ matrices (viewed as elements of $T_{1}^{1} (V)$) is a $4 \times 4$ matrix, which should therefore be an element of $T_{2}^{2} (V)$. Extending this logic, every element of $T_{n}^{n} (V)$ is a matrix
As for the dimensions of this matrix, start from the dimension of $V$:
- If $\dim V = m$ (e.g. $V = R^{m}$), then $T_{1}^{1} (V)$ is (the set of) $m \times m$ matrices
  - they cannot be non-square, because $\dim V = \dim V^{*} = m$
- By the Kronecker product rule, $T_{2}^{2} (V)$ is then (the set of) $m^{2} \times m^{2}$ matrices

$$
\begin{aligned}
\begin{bmatrix} a_{1, 1} & a_{1, 2} \\ a_{2, 1} & a_{2, 2} \end{bmatrix} \otimes \begin{bmatrix} b_{1, 1} & b_{1, 2} \\ b_{2, 1} & b_{2, 2} \end{bmatrix}
&= \begin{bmatrix} a_{1, 1} \begin{bmatrix} b_{1, 1} & b_{1, 2} \\ b_{2, 1} & b_{2, 2} \end{bmatrix} & a_{1, 2} \begin{bmatrix} b_{1, 1} & b_{1, 2} \\ b_{2, 1} & b_{2, 2} \end{bmatrix} \\ a_{2, 1} \begin{bmatrix} b_{1, 1} & b_{1, 2} \\ b_{2, 1} & b_{2, 2} \end{bmatrix} & a_{2, 2} \begin{bmatrix} b_{1, 1} & b_{1, 2} \\ b_{2, 1} & b_{2, 2} \end{bmatrix} \end{bmatrix} \\
&= \begin{bmatrix} a_{1, 1} b_{1, 1} & a_{1, 1} b_{1, 2} & a_{1, 2} b_{1, 1} & a_{1, 2} b_{1, 2} \\ a_{1, 1} b_{2, 1} & a_{1, 1} b_{2, 2} & a_{1, 2} b_{2, 1} & a_{1, 2} b_{2, 2} \\ a_{2, 1} b_{1, 1} & a_{2, 1} b_{1, 2} & a_{2, 2} b_{1, 1} & a_{2, 2} b_{1, 2} \\ a_{2, 1} b_{2, 1} & a_{2, 1} b_{2, 2} & a_{2, 2} b_{2, 1} & a_{2, 2} b_{2, 2} \end{bmatrix}
\end{aligned}
$$
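The Kronecker product formula above is easy to reproduce in plain Python. The function `kron` below is a minimal sketch written for this note (it is not a library call), and the matrices `A` and `B` are arbitrary sample values.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows.

    Entry (i, j) of the result is A[i // p][j // q] * B[i % p][j % q],
    where B has p rows and q columns.
    """
    p, q = len(B), len(B[0])
    return [
        [A[i // p][j // q] * B[i % p][j % q] for j in range(len(A[0]) * q)]
        for i in range(len(A) * p)
    ]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
K = kron(A, B)  # a 4 x 4 matrix

# Block (1, 0) of K, i.e. rows 2-3 and columns 0-1, is A[1][0] * B.
block = [row[0:2] for row in K[2:4]]
assert block == [[3 * b for b in row] for row in B]
```

Reading `K` block by block is exactly the "matrix of matrices" view: each entry of `A` is replaced by a scaled copy of `B`.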

This leads to an important idea: a tensor is really a matrix of matrices; or, more broadly, a tensor is a meta-matrix, a matrix of something, where you are free to define what that something is. Continuing the example above:

$t \in T_{1}^{1} (V)$ is a $1 \times 1$ meta-matrix with a single element, but that element is an $m \times m$ matrix, so overall it is an $m \times m$ matrix
- you can also view it as $m \times m$ elements $s \in T_{0}^{0} (V)$
$t \in T_{2}^{2} (V)$ is an $m \times m$ meta-matrix, each element of which is an $s \in T_{1}^{1} (V)$, so overall it is an $m^{2} \times m^{2}$ matrix
- you can also view it as $m^{2} \times m^{2}$ elements $s^{'} \in T_{0}^{0} (V)$
And so on:
- $t \in T_{n}^{n} (V)$ is an $m^{n - 1} \times m^{n - 1}$ meta-matrix, each element of which is an $s \in T_{1}^{1} (V)$, so overall it is an $m^{n} \times m^{n}$ matrix
  - you can also view it as $m^{n} \times m^{n}$ elements $s^{'} \in T_{0}^{0} (V)$
- Or we can write $T_{n}^{n} (V) = (T_{1}^{1} (V))^{\otimes n}$, where the “$\otimes n$-th power” is defined by $V^{\otimes n} \equiv \underbrace{V \otimes \dots \otimes V}_{n}$
  - clearly $T_{n}^{n} (V) \otimes T_{0}^{0} (V) = T_{n}^{n} (V)$
Now consider the irregular cases:
- If $p > q$, then $t \in T_{q}^{p} (V)$ is an $m^{q} \times m^{q}$ meta-matrix, each element of which is an $s \in T_{0}^{p - q} (V)$
- If $p < q$, then $t \in T_{q}^{p} (V)$ is an $m^{p} \times m^{p}$ meta-matrix, each element of which is an $s \in T_{q - p}^{0} (V)$
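As a quick sanity check of the irregular cases, one can count scalar components, using the standard fact that a tensor of type $(p, q)$ on an $m$-dimensional space has $m^{p + q}$ components. For $p > q$ (the case $p < q$ is symmetric):

$$\underbrace{m^{q} \cdot m^{q}}_{\text{meta-matrix entries}} \cdot \underbrace{m^{p - q}}_{\text{components per entry}} = m^{p + q}, \qquad \text{e.g. } m = 2,\ (p, q) = (2, 1): \quad 2 \cdot 2 \cdot 2 = 8 = 2^{3}$$

So the meta-matrix picture accounts for every component exactly once.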

6.3 Why $T_{0}^{0} (V) = R$ is special

Strictly speaking, if $f_{1} : V \to X, f_{2} : W \to Y$, then:

$$f_{1} \otimes f_{2} : V \otimes W \to X \otimes Y$$

Viewed as a function:

$$(f_{1} \otimes f_{2}) (v \otimes w) = f_{1} (v) \otimes f_{2} (w)$$

But since we usually work with $R$, and $R$ is $T_{0}^{0} (V)$, we have $R \otimes R = T_{0 + 0}^{0 + 0} (V) = R$

So in the usual case $X = Y = R$, the product $f_{1} \otimes f_{2}$ is still a real-valued function $f : V \otimes W \to R$
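In coordinates, the identity $(f_{1} \otimes f_{2}) (v \otimes w) = f_{1} (v) \otimes f_{2} (w)$ becomes a statement about Kronecker products, and the real-valued case corresponds to $1 \times m$ matrices. The sketch below checks both on sample data; `kron`, `kron_vec`, `matvec` are small helpers written for this note, and all matrices and vectors are made-up values.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    p, q = len(B), len(B[0])
    return [
        [A[i // p][j // q] * B[i % p][j % q] for j in range(len(A[0]) * q)]
        for i in range(len(A) * p)
    ]

def kron_vec(v, w):
    """Coordinates of v (x) w, ordered to match kron."""
    return [a * b for a in v for b in w]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[1, 2], [3, 4]]   # f1 : V -> X
B = [[0, 1], [1, 1]]   # f2 : W -> Y
v, w = [1, -1], [2, 3]

# General case: (f1 (x) f2)(v (x) w) = f1(v) (x) f2(w).
lhs = matvec(kron(A, B), kron_vec(v, w))
rhs = kron_vec(matvec(A, v), matvec(B, w))
assert lhs == rhs

# Real-valued case X = Y = R: f1 and f2 are covectors (1 x 2 matrices),
# and the result is a single real number, the ordinary product f1(v) * f2(w).
c, d = [[1, 2]], [[3, 1]]
scalar = matvec(kron(c, d), kron_vec(v, w))
assert scalar == [matvec(c, v)[0] * matvec(d, w)[0]]
```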

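6.4 The Einstein summation convention is only needed when computing with tensors/tensor products as functions

I will not expand on this here; for reference, see:

One final remark: in engineering applications, the tensor product of the basis elements is often dropped from the result of a computation, keeping only the components of the tensor product. This is the same as writing a vector by its components alone and omitting the conventional basis $i, j, k, \dots$.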
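As an illustration of keeping only the components: for two vectors, the components of $v \otimes w$ relative to the basis elements $e_{i} \otimes e_{j}$ are just the products of the individual components, and the basis itself never appears in the computation. The values of `v` and `w` below are arbitrary, and the names are assumptions for this sketch.

```python
v = [1, 2]       # components of v in some basis of V
w = [3, 4, 5]    # components of w in some basis of W

# Component array of v (x) w: entry [i][j] is v[i] * w[j],
# the coefficient of the (omitted) basis element e_i (x) e_j.
components = [[v_i * w_j for w_j in w] for v_i in v]

# Flattening row by row gives the same numbers as one coordinate list
# of length len(v) * len(w).
flat = [x for row in components for x in row]
```

Only the $2 \times 3$ array of numbers is stored; which basis elements they multiply is left to convention.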
