Digest of Essence of Linear Algebra
Greatest thanks to 3Blue1Brown!
Chapter 1 - Vectors, what even are they?
The introduction of numbers as coordinates (by reference to the particular division scheme of the open one-dimensional continuum) is an act of violence. ―Hermann Weyl
Mathematicians’ perspective:
- A vector can be anything where there’s a sensible notion of adding two vectors and multiplying a vector by a number.
- I.e. anything that supports "adding two of them together" and "multiplying one by a number" can be called a vector.
- Linear algebra revolves around vector addition and scalar multiplication; the whole subject is built around these two basic operations.
To understand vectors geometrically, it's best to picture each vector as:
- An arrow inside a coordinate system, with its tail sitting at the origin. (Sorry, Hermann Weyl.)
- Although in principle, vectors can sit anywhere in space.
- A vector $\begin{bmatrix} x \\ y \end{bmatrix}$ gives instructions for how to get from its tail to its tip:
  - First walk along the 1st axis for $x$ units;
  - then walk along the 2nd axis for $y$ units.
1.1 Adding two vectors
Suppose we have $\vec{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ and $\vec{w} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}$. The sum $\vec{v} + \vec{w}$ means:

- First walk along the 1st axis for $v_1$ units;
- then walk along the 2nd axis for $v_2$ units;
- then walk along the 1st axis for $w_1$ units;
- then walk along the 2nd axis for $w_2$ units.

Or, equivalently:

- First walk along the 1st axis for $v_1 + w_1$ units;
- then walk along the 2nd axis for $v_2 + w_2$ units.

Ordinary addition of real numbers can be understood the same way. For $a + b$:

- First walk (along the only axis) for $a$ units;
- then walk for $b$ units;
- you would walk $a + b$ units in total.
1.2 Multiplying a vector by a number
Multiplying by $2$, i.e. $2\vec{v}$, means you stretch $\vec{v}$ out so that it's $2$ times as long as before:

- First walk along the 1st axis for $2 v_1$ units;
- then walk along the 2nd axis for $2 v_2$ units.

Multiplying by a negative number such as $-1.8$ means you first flip the vector and then stretch it out by a factor of $1.8$; multiplying by $\frac{1}{3}$ means you squash it down so that it's $\frac{1}{3}$ of the original length.
This process of 1) stretching or 2) squashing or 3) sometimes reversing the direction of a vector is called scaling. The number you use to scale the vector is called a scalar. In fact, throughout linear algebra, one of the main things numbers do is scale vectors.
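Here is a minimal NumPy sketch of these two operations, with example vectors I chose arbitrarily (not from the video):

```python
import numpy as np

v = np.array([1.0, 2.0])   # walk 1 unit along the 1st axis, then 2 along the 2nd
w = np.array([3.0, -1.0])

# Vector addition: follow v's instructions, then w's.
print(v + w)       # [4. 1.]

# Scalar multiplication: scaling by 2 doubles the length;
# a negative scalar also flips the direction.
print(2 * v)       # [2. 4.]
print(-1.5 * w)    # [-4.5  1.5]
```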
Chapter 2 - Linear combinations, span, and basis vectors
Mathematics requires a small dose, not of genius, but of imaginative freedom which, in a larger dose, would be insanity. ―Angus K. Rodgers
Unit vector: a vector that has a magnitude of one.
- $\hat{i}$: the unit vector in the $x$-direction
- $\hat{j}$: the unit vector in the $y$-direction
Think about each coordinate in a vector $\begin{bmatrix} x \\ y \end{bmatrix}$ as a scalar, so that $\begin{bmatrix} x \\ y \end{bmatrix} = x\hat{i} + y\hat{j}$.

- Together, $\hat{i}$ and $\hat{j}$ are called the basis vectors of the coordinate system.
- When you think about coordinates as scalars, the basis vectors are what those scalars actually scale.
What if we chose some arbitrary vectors as basis vectors? We could have done so and gotten a completely reasonable, new coordinate system.
- That is, given two vectors $\vec{v}$ and $\vec{w}$, we can imagine a coordinate system that uses $\vec{v}$ and $\vec{w}$ as its basis vectors; a pair of scalars $(a, b)$ then corresponds to an arrow in that coordinate system pointing to the point $a\vec{v} + b\vec{w}$.
Suppose we describe the same arrow using two different pairs of basis vectors: the coordinates we write down will generally be different, even though the arrow itself hasn't changed.
- Rule of thumb: any time we describe vectors numerically, it depends on an implicit choice of what basis vectors we’re using.
Any time you scale two vectors and add them like this, the result is called a linear combination of those two vectors.
- You can see “linear” this way: if you fix one scalar and let the other one change its value freely, the tips of the resulting vectors would draw a straight line.
- When your original vectors happen to line up (are collinear), the tip of the resulting vector is limited to that same line.
- If both vectors are zero, you’d just be stuck at the origin.
The span of $\vec{v}$ and $\vec{w}$ is the set of all their linear combinations $a\vec{v} + b\vec{w}$, where $a$ and $b$ range over all real numbers.
- The span of most 2-D vectors is all vectors in this 2-D space.
- When they line up, their span is all vectors whose tips sit on that line.
- The span of two vectors is basically a way of asking what all the possible vectors are that you can reach using only the two fundamental vector operations: vector addition and scalar multiplication.
Vectors vs Points
In general, if you're thinking about a vector on its own, think of it as an arrow. And if you're dealing with a collection of vectors, it's convenient to think of them all as points.
- The span of most 2-D vectors ends up being the entire infinite sheet of the 2-D space.
- When they line up (are collinear), their span is just that line.
In 3-D cases:
- Take 2 vectors in 3-D space that point in different directions. Their span is a flat sheet cutting through the origin of the 3-D space.
- Take 3 vectors:
  - If the 3rd vector happens to be sitting on the span of the first two, their span is still just that flat sheet.
    - In other words, adding a scaled 3rd vector to the linear combination doesn't really give you access to any new vectors.
  - Otherwise, their span is the whole of 3-D space.
    - You can picture it this way: as you scale the 3rd vector, it moves the span of the first two vectors along its own direction, sweeping the flat sheet through all of 3-D space.
In the case where the 3rd vector was already sitting on the span of the first two, or the case where two vectors happen to line up, at least one of these vectors is redundant, not adding anything to our span. Whenever this happens, you could remove one vector without reducing the span. We say that these vectors are linearly dependent.
- Another way of phrasing that would be to say that one of the vectors can be expressed as a linear combination of the others, since it's already in the span of the others.
On the other hand, if each vector really adds another dimension to the span, they are said to be linearly independent.
Technical definition of basis:
- The basis of a vector space is a set of linearly independent vectors that span the full space.
- $\hat{i}$ and $\hat{j}$ are called the basis vectors; together they form a basis of the 2-D space they span.
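A small NumPy sketch of linear combinations and a rank-based independence check; the example vectors are my own, chosen arbitrarily:

```python
import numpy as np

v = np.array([3.0, 1.0])
w = np.array([1.0, 2.0])

# A linear combination: scale each vector, then add.
a, b = 2.0, -1.0
print(a * v + b * w)                      # [5. 0.]

# Independence check: put the vectors in the columns of a matrix
# and compare its rank with the number of vectors.
M = np.column_stack([v, w])
print(np.linalg.matrix_rank(M))           # 2: v and w span the whole plane

u = 4 * v                                 # u lines up with v
print(np.linalg.matrix_rank(np.column_stack([v, u])))   # 1: linearly dependent
```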
Chapter 3 - Linear transformations and matrices
Unfortunately, no one can be told what The Matrix is. You have to see it for yourself. ―Morpheus
"Transformation" is essentially a fancy word for "function": it takes a vector as input and spits out another vector as output.
Visually speaking, a transformation is linear if it has two properties:
- all lines (all lines in the plane, not just the grid lines) must remain lines, without getting curved;
- and the origin must remain fixed in place
(A transformation that satisfies 1 but not 2 is called an affine transformation.)
In general, you should think of linear transformations as "keeping grid lines parallel and evenly spaced".
How would you describe one of these numerically? What formula could you give to the computer so that, given the coordinates of a vector, it gives you back the coordinates of where that vector lands (in the transformed space)? It turns out that you only need to record where the two basis vectors, $\hat{i}$ and $\hat{j}$, each land.
E.g. if $\hat{i}$ lands on $\begin{bmatrix} a \\ c \end{bmatrix}$ and $\hat{j}$ lands on $\begin{bmatrix} b \\ d \end{bmatrix}$, then the vector $\begin{bmatrix} x \\ y \end{bmatrix} = x\hat{i} + y\hat{j}$ lands on $x\begin{bmatrix} a \\ c \end{bmatrix} + y\begin{bmatrix} b \\ d \end{bmatrix}$.

So a two-dimensional linear transformation is completely described by just 4 numbers: the two coordinates where $\hat{i}$ lands and the two coordinates where $\hat{j}$ lands. We package them as the columns of a $2 \times 2$ matrix.

This also explains the rule for matrix-vector multiplication:

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = x\begin{bmatrix} a \\ c \end{bmatrix} + y\begin{bmatrix} b \\ d \end{bmatrix} = \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix}$$
Matrix-vector multiplication is just a way to compute what the linear transformation does to a given vector.
If the vectors that $\hat{i}$ and $\hat{j}$ land on (i.e. the columns of the matrix) are linearly dependent, the transformation squashes all of 2-D space onto the line where those two vectors sit, which is their 1-D span.
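A quick NumPy sketch of the "columns are landing spots" idea; the matrix entries are arbitrary example values:

```python
import numpy as np

# Columns of the matrix = where i-hat and j-hat land.
i_hat_lands = np.array([1.0, -2.0])
j_hat_lands = np.array([3.0, 0.0])
M = np.column_stack([i_hat_lands, j_hat_lands])

v = np.array([2.0, 1.0])   # the input vector: 2*i_hat + 1*j_hat

# Matrix-vector multiplication is the same linear combination of the columns.
print(M @ v)                                    # [ 5. -4.]
print(v[0] * i_hat_lands + v[1] * j_hat_lands)  # [ 5. -4.], identical
```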
Chapter 4 - Matrix multiplication as composition (of linear transformations)
Matrix multiplication describes the effect of applying one linear transformation and then another.
Note the order: in the product $M_2 M_1$, the transformation $M_1$ is applied first and then $M_2$; you read it right to left, just like function composition $f(g(x))$.

The product is computed the same way as above. In fact, the columns of $M_2 M_1$ are simply $M_2$ applied to each column of $M_1$, i.e. to wherever $\hat{i}$ and $\hat{j}$ land under $M_1$.
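A NumPy sketch of composition and of the fact that order matters; the two transformations are example values of my own (a 90° rotation and a shear):

```python
import numpy as np

rotation_90 = np.array([[0.0, -1.0],
                        [1.0,  0.0]])   # rotate 90 degrees counterclockwise
shear = np.array([[1.0, 1.0],
                  [0.0, 1.0]])          # shear that pushes j-hat to the right

v = np.array([1.0, 1.0])

# "First rotate, then shear" is the product shear @ rotation_90 (read right to left).
composition = shear @ rotation_90
print(composition @ v)                   # [0. 1.]
print(shear @ (rotation_90 @ v))         # [0. 1.], the same thing

# Order matters: composing the other way around is a different transformation.
print(np.allclose(shear @ rotation_90, rotation_90 @ shear))   # False
```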
Chapter 5 - The determinant
The purpose of computation is insight, not numbers. ―Richard Hamming
Among those linear transformations, some seem to stretch space out, while others squash it in. One thing that turns out to be pretty useful for understanding such a transformation is to measure exactly how much it stretches or squashes things. More specifically, to measure the factor by which the area of a given region increases or decreases.
For example, a transformation that scales $\hat{i}$ by $3$ and $\hat{j}$ by $2$ scales the area of the unit square, and of any other region, by a factor of $3 \times 2 = 6$.
This very special scaling factor is called the determinant of that transformation.
If the determinant of a 2-D transformation is 0, it squashes all of space onto a line, or even a single point. In that case, the area of any region becomes 0.
However, a determinant can in fact be negative. How could you scale an area by a negative amount? This has to do with the idea of orientation. Any transformation that turns a region over (imagine flipping a sheet of paper from its front to its back) is said to invert the orientation of space.
Another way to think about it is in terms of $\hat{i}$ and $\hat{j}$: initially $\hat{j}$ is to the left of $\hat{i}$; if after the transformation $\hat{j}$ ends up on the right of $\hat{i}$, the orientation of space has been inverted.
Whenever the orientation of space is inverted, the determinant is negative, but its absolute value still tells you the factor by which areas have been scaled.
If the determinant of a matrix is 0, the columns of that matrix (the transformed basis vectors) are linearly dependent.
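A NumPy sketch of these cases using `np.linalg.det`; the matrices are example values of my own:

```python
import numpy as np

# det > 1: areas get stretched (this matrix scales areas by 6).
print(np.linalg.det(np.array([[3.0, 0.0], [0.0, 2.0]])))   #  6.0

# det = 1: a shear slants the grid but preserves area.
print(np.linalg.det(np.array([[1.0, 1.0], [0.0, 1.0]])))   #  1.0

# det < 0: orientation is flipped; |det| is still the area scaling factor.
print(np.linalg.det(np.array([[0.0, 1.0], [1.0, 0.0]])))   # -1.0

# det = 0: the columns are linearly dependent, space gets squashed onto a line.
print(np.linalg.det(np.array([[2.0, 4.0], [1.0, 2.0]])))   #  0.0
```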
Chapter 6 - Inverse matrices, column space, rank and null space (kernel)
To ask the right question is harder than to answer it. ―Georg Cantor
The main reason that linear algebra is broadly applicable, and required for just about any technical discipline, is that it lets us solve certain systems of equations. By "systems of equations", I mean you have a list of variables and a list of equations relating them.

When every equation is linear in those variables, the list is called a linear system of equations.
To solve $A\vec{x} = \vec{v}$ is to look for a vector $\vec{x}$ which, after the transformation $A$ is applied, lands on $\vec{v}$.

Let's start with the case $\det(A) \neq 0$: space is not squashed into a lower dimension, so exactly one vector lands on $\vec{v}$, and you can find it by playing the transformation in reverse.

Applying this reverse transformation, the inverse $A^{-1}$, to $\vec{v}$ gives the solution $\vec{x} = A^{-1}\vec{v}$. $A^{-1}$ is the transformation with the property that $A^{-1}A$ is the identity, the transformation that does nothing.

But when $\det(A) = 0$, the transformation squashes space into a lower dimension and there is no inverse: you cannot unsquash a line back into a plane, because that would require sending a single vector to a whole line of vectors, which no function can do.

However, solutions can still exist when $\det(A) = 0$: a solution exists exactly when $\vec{v}$ happens to lie on the line (or plane) that space gets squashed onto.
You might notice that some of those zero-determinant cases feel a lot more restrictive than others (some drop the dimension by two, others by only one). When the output of a transformation is a line, i.e. 1-D, we say the transformation has a rank of 1. So the rank of a matrix is the number of dimensions in the output of the transformation.
The set of all possible outputs from a matrix, is called the column space of the matrix. In other words, the column space is the span of the columns of the matrix. So a more precise definition of rank would be that it’s the number of dimensions in the column space.
When its rank is equal to the number of columns, we call a matrix full rank.
Note that the zero vector is always included in the column space, since a linear transformation must keep the origin fixed in place. For a full rank transformation, the only vector that lands on the origin is the zero vector itself; for a matrix that isn't full rank, a whole line (or plane, etc.) of vectors gets squashed onto the origin.

Turning that around, we can define: the null space (or kernel) of a matrix $A$ is the set of all vectors that land on the origin, i.e. all $\vec{x}$ with $A\vec{x} = \vec{0}$.
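A NumPy sketch of solving a system, checking rank, and finding a null space via the SVD; all matrices are example values:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])       # det = 5, so an inverse exists
v = np.array([4.0, 7.0])

# Solve A x = v by "playing the transformation in reverse".
x = np.linalg.solve(A, v)        # preferred over forming np.linalg.inv(A) explicitly
print(np.allclose(A @ x, v))     # True

# This matrix squashes the plane onto a line, so its rank is 1.
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.matrix_rank(B))  # 1

# Its null space: directions that land on the origin
# (rows of Vt belonging to near-zero singular values).
_, s, Vt = np.linalg.svd(B)
print(Vt[s < 1e-10])             # one basis vector spanning the null space
```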
Chapter 6 Supplement - Nonsquare matrices as transformations between dimensions
First of all, only square matrices have determinants.
A $3 \times 2$ matrix represents a transformation from 2-D space into 3-D space: its two columns say where the two input basis vectors land, and each column has three coordinates, so the image of the plane is (at most) a plane through the origin of 3-D space.

A $2 \times 3$ matrix goes the other way: it maps 3-D space down to 2-D space, with three columns (one per input basis vector) of two coordinates each.
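A shape-level NumPy sketch of transformations between dimensions; the matrix entries are arbitrary:

```python
import numpy as np

# A 3x2 matrix: takes 2-D vectors to 3-D vectors.
# Its two columns are where the 2-D basis vectors land in 3-D.
M = np.array([[ 2.0, 0.0],
              [-1.0, 1.0],
              [-2.0, 1.0]])

v2 = np.array([3.0, 1.0])
print(M @ v2)                           # a 3-D vector on a plane through the origin

# A 2x3 matrix goes the other way: 3-D space is mapped down to 2-D.
N = np.array([[3.0, 1.0, 4.0],
              [1.0, 5.0, 9.0]])
print(N @ np.array([1.0, 0.0, 0.0]))    # where the 3-D basis vector i-hat lands in 2-D
```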
Chapter 7 - Dot products and duality
Let $\vec{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ and $\vec{w} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}$. Their dot product is $\vec{v} \cdot \vec{w} = v_1 w_1 + v_2 w_2$. Geometrically, it equals the length of the projection of $\vec{w}$ onto $\vec{v}$ times the length of $\vec{v}$ (negative when the projection points opposite to $\vec{v}$).

Duality: a $1 \times 2$ matrix $\begin{bmatrix} a & b \end{bmatrix}$ is a linear transformation from 2-D space to the number line, and applying it to a vector gives exactly the same number as taking the dot product with the vector $\begin{bmatrix} a \\ b \end{bmatrix}$; that vector is called the dual vector of the transformation.

Conversely, taking the dot product with some vector is the same as applying the 2D-to-1D linear transformation whose single row is that vector.
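A small NumPy check of this correspondence, with arbitrary example vectors:

```python
import numpy as np

v = np.array([2.0, 1.0])
w = np.array([3.0, 4.0])

# The dot product as a number...
print(np.dot(v, w))        # 10.0

# ...and as a 1x2 matrix (a 2D-to-1D linear transformation) applied to w.
row = v.reshape(1, 2)      # the transformation whose dual vector is v
print(row @ w)             # [10.], the same number
```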
Chapter 8 - Cross products
8.1 Standard introduction
Non-standard definition
Suppose $\vec{v}$ and $\vec{w}$ are 2-D vectors. Then $\vec{v} \times \vec{w}$ can be taken to be the signed area of the parallelogram they span, which is exactly the determinant of the matrix whose columns are $\vec{v}$ and $\vec{w}$; it is negative when the orientation of $\vec{v}$, $\vec{w}$ is flipped relative to that of $\hat{i}$, $\hat{j}$.
Standard definition
The cross product is not a number, but a vector.
Under this definition, two 2-D vectors cannot have a cross product (although you can still compute a scalar from the determinant, as above). So the cross product usually refers to the cross product of two 3-D vectors.
Rote-memorization formula:

$$\vec{v} \times \vec{w} = \det\begin{bmatrix} \hat{i} & v_1 & w_1 \\ \hat{j} & v_2 & w_2 \\ \hat{k} & v_3 & w_3 \end{bmatrix} = (v_2 w_3 - v_3 w_2)\hat{i} + (v_3 w_1 - v_1 w_3)\hat{j} + (v_1 w_2 - v_2 w_1)\hat{k}$$
Digression: the geometric meaning of the dot product vs the cross product
Suppose we fix the lengths of $\vec{v}$ and $\vec{w}$, e.g. take them both to be unit vectors (see the sketch below):

- The magnitude of the dot product $\vec{v} \cdot \vec{w}$ measures how close $\vec{v}$ and $\vec{w}$ are to being collinear: it is largest when they line up and $0$ when they are perpendicular.
- The length of the cross product $\vec{v} \times \vec{w}$ measures how close $\vec{v}$ and $\vec{w}$ are to being perpendicular: it is largest when they are perpendicular and $0$ when they are collinear.
8.2 Deeper understanding with linear transformations
Recall the duality from Chapter 7 - Dot products and duality: every linear transformation from space to the number line corresponds to a unique vector, its dual vector, and applying the transformation is the same as taking the dot product with that dual vector.
Our plan:
- Define a 3D-to-1D linear transformation in terms of $\vec{v}$ and $\vec{w}$;
- Find its dual vector;
- Show that this dual vector is $\vec{v} \times \vec{w}$.
We notice that the function

$$f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \det\begin{bmatrix} x & v_1 & w_1 \\ y & v_2 & w_2 \\ z & v_3 & w_3 \end{bmatrix}$$

is linear in $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$, so it is a 3D-to-1D linear transformation and must have a dual vector.

Question: what 3-D vector $\vec{p}$ has the property that $\vec{p} \cdot \begin{bmatrix} x \\ y \\ z \end{bmatrix}$ equals this determinant for every $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$?

Answer: computing the determinant gives $\vec{p} = \begin{bmatrix} v_2 w_3 - v_3 w_2 \\ v_3 w_1 - v_1 w_3 \\ v_1 w_2 - v_2 w_1 \end{bmatrix}$, which is exactly the rote-memorization formula for $\vec{v} \times \vec{w}$.

Because, geometrically, that determinant is the volume of the parallelepiped spanned by $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$, $\vec{v}$ and $\vec{w}$: the area of the parallelogram spanned by $\vec{v}$ and $\vec{w}$ times the component of $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ perpendicular to that parallelogram. Dotting with $\vec{p}$ computes the same thing only if $\vec{p}$ is perpendicular to both $\vec{v}$ and $\vec{w}$ and has length equal to the parallelogram's area, and that is precisely the geometric description of $\vec{v} \times \vec{w}$.
Chapter 9 - Change of basis
Mathematics is the art of giving the same name to different things. ―Henri Poincaré
A space has no intrinsic grid; every coordinate system is something we impose on it. If the same vector is described using a different set of basis vectors, its coordinates will be different, even though it is the same arrow sitting in space.
Suppose an alternative coordinate system uses basis vectors $\vec{b_1}$ and $\vec{b_2}$, and let $A$ be the change of basis matrix whose columns are $\vec{b_1}$ and $\vec{b_2}$ written in our coordinates. If a vector has coordinates $\begin{bmatrix} x' \\ y' \end{bmatrix}$ in the alternative system, then its coordinates in our system are $A\begin{bmatrix} x' \\ y' \end{bmatrix}$.

"Translating" the other way: given a vector $\vec{u}$ written in our coordinates, its coordinates in the alternative system are $A^{-1}\vec{u}$.
How to translate a matrix?
Suppose a transformation is described in our coordinate system by a matrix $M$ (say, a 90° rotation). To describe the same transformation in the alternative coordinate system: take a vector written in alternative coordinates, apply $A$ to translate it into our language, apply $M$, and then apply $A^{-1}$ to translate the result back. So the matrix is $A^{-1} M A$.

Similarly, a transformation described in the alternative system by a matrix $N$ is represented in our system by $A N A^{-1}$.
In general, whenever you see an expression like $A^{-1} M A$, it suggests a mathematical sort of empathy: the middle matrix represents the transformation as you see it, and the outer two matrices represent the shift of perspective into and out of another coordinate system.
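A NumPy sketch of translating a matrix; the alternative basis vectors and the rotation are example values:

```python
import numpy as np

# Alternative basis vectors, written in our coordinates.
b1 = np.array([2.0, 1.0])
b2 = np.array([-1.0, 1.0])
A = np.column_stack([b1, b2])        # change of basis matrix

# A 90-degree rotation, described in our coordinate system.
M = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# The same rotation, described in the alternative coordinate system.
M_alt = np.linalg.inv(A) @ M @ A
print(M_alt)

# Sanity check on one vector written in the alternative basis:
x_alt = np.array([1.0, 2.0])
lhs = A @ (M_alt @ x_alt)            # transform in their language, translate to ours
rhs = M @ (A @ x_alt)                # translate to ours, then transform
print(np.allclose(lhs, rhs))         # True
```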
Chapter 10 - Eigenvectors and eigenvalues
Consider a single vector and the line it spans. After a linear transformation, some vectors get knocked off the line they used to span, while others remain on their own span (only their length or direction changes, as if they were simply multiplied by a scalar).
All these special vectors that remain on their spans after a linear transformation are called the eigenvectors of the transformation. Each eigenvector has associated with it what's called an eigenvalue, which is just the factor by which it is stretched or squashed during the transformation.
With any linear transformation described by a matrix, you could understand what it’s doing by reading off the columns of this matrix as the landing spots for basis vectors. But often a better way to get at the heart of what the linear transformation actually does, less dependent on your particular coordinate system, is to find the eigenvectors and eigenvalues.
Example use: consider some 3-D rotation. If you can find an eigenvector for that rotation, you have found the axis of rotation. (A rotation only rotates, it doesn't stretch, so the eigenvalue is 1.)
Definition by formula: $\vec{v}$ is an eigenvector of $A$ with eigenvalue $\lambda$ if

$$A\vec{v} = \lambda\vec{v}$$

(Look how readable this formula is now: the matrix acting on $\vec{v}$ does nothing more than scale it by $\lambda$.)

Solution steps: rewrite the equation as $(A - \lambda I)\vec{v} = \vec{0}$ and look for a nonzero $\vec{v}$.

If $\det(A - \lambda I) \neq 0$, the only solution is $\vec{v} = \vec{0}$. So we need $\det(A - \lambda I) = 0$, i.e. the transformation $A - \lambda I$ must squash space into a lower dimension. Solving that equation gives the eigenvalues $\lambda$; plugging each one back in gives its eigenvectors.
A 2-D transformation doesn't have to have (real) eigenvectors. E.g. a 90° rotation knocks every vector off its own span; $\det(A - \lambda I) = 0$ then has only complex solutions ($\lambda = \pm i$).
A single eigenvalue can also come with more than one line of eigenvectors: if a transformation scales every vector by 2, its only eigenvalue is 2, but every nonzero vector in the plane is an eigenvector.
What if both basis vectors are eigenvectors? If $\hat{i}$ and $\hat{j}$ are both eigenvectors, the matrix is diagonal, and the entries on its diagonal are their eigenvalues.
Suppose we want to exploit this property to compute, say, a high power such as $M^{100}$: powers of a diagonal matrix are easy (just raise each diagonal entry to that power), so we switch to a coordinate system whose basis vectors are eigenvectors of $M$, an eigenbasis.

Take the eigenvectors, written in our coordinate system, as the columns of a change of basis matrix $A$. Then $A^{-1} M A$ represents the same transformation in the eigenbasis, so it is diagonal with the eigenvalues on its diagonal. Compute the power there, then translate back: $M^{100} = A\,(A^{-1} M A)^{100}\,A^{-1}$.

(If there aren't enough eigenvectors to span the full space, you can't assemble enough columns for such a change of basis matrix $A$, and the matrix can't be diagonalized this way.)
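A NumPy sketch using `np.linalg.eig`; the matrix is an arbitrary example with two real eigenvalues:

```python
import numpy as np

M = np.array([[3.0, 1.0],
              [0.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(M)     # columns of eigvecs are the eigenvectors
print(eigvals)                           # e.g. [3. 2.] (order is not guaranteed)

# Check M v = lambda v for the first eigenvector.
v = eigvecs[:, 0]
print(np.allclose(M @ v, eigvals[0] * v))     # True

# Diagonalize with the eigenbasis as change of basis matrix A, then take a power.
A = eigvecs
D = np.linalg.inv(A) @ M @ A                  # diagonal, eigenvalues on the diagonal
M_10 = A @ np.diag(np.diag(D) ** 10) @ np.linalg.inv(A)
print(np.allclose(M_10, np.linalg.matrix_power(M, 10)))   # True
```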
Chapter 11 - Abstract vector spaces
Linear transformations for functions
Formal definition of linearity:
- Additivity: $L(\vec{v} + \vec{w}) = L(\vec{v}) + L(\vec{w})$
- Scaling: $L(c\vec{v}) = c\,L(\vec{v})$
“keeping grid lines parallel and evenly spaced” is really just an illustration of what these 2 properties mean in the specific case of points in 2-D space.
Another example: the derivative is linear, since $\frac{d}{dx}\big(f(x) + g(x)\big) = \frac{d}{dx}f(x) + \frac{d}{dx}g(x)$ and $\frac{d}{dx}\big(c\,f(x)\big) = c\,\frac{d}{dx}f(x)$.
To really drill in the parallel, let's describe the derivative with a matrix. Let's limit ourselves to polynomials (i.e. we work in the space of all polynomials).
Define basis functions $b_0(x) = 1$, $b_1(x) = x$, $b_2(x) = x^2$, $b_3(x) = x^3$, and so on. A polynomial is then a (conceptually infinite) list of coordinates with respect to this basis, with only finitely many nonzero entries, and the derivative becomes the matrix

$$\frac{d}{dx} = \begin{bmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 2 & 0 & \cdots \\ 0 & 0 & 0 & 3 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$

Note that each column of this matrix is where one basis function lands: the first column is the derivative of $b_0(x) = 1$ (namely $0$), the second column is the derivative of $b_1(x) = x$ (namely $1 = 1 \cdot b_0$), and so on.
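A small NumPy sketch of the same idea on polynomials of degree at most 3 (the example polynomial is my own):

```python
import numpy as np

# Truncated derivative matrix in the basis 1, x, x^2, x^3:
# each column is the derivative of one basis function.
D = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])

# The polynomial 1 + 5x + 4x^2 + 2x^3 as a coordinate vector in that basis.
p = np.array([1.0, 5.0, 4.0, 2.0])

print(D @ p)    # [5. 8. 6. 0.]  ->  5 + 8x + 6x^2, the derivative of p
```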
| Linear algebra concepts | Alternative names when applied to functions |
|---|---|
| Linear transformations | Linear operators |
| Dot products | Inner products |
| Cross products | N/A |
| Eigenvectors | Eigenfunctions |
What is a vector?
As long as you're dealing with a set of objects where there's a reasonable notion of scaling and adding, whether that's a set of arrows in space, lists of numbers, functions, or whatever else you choose to define, all the tools developed in linear algebra regarding vectors, linear transformations and all that stuff can be applied.
These sets of vector-ish things (arrows, lists of numbers, functions, etc.) are called vector spaces.
There are 8 rules for vector addition and scaling:

1. $\vec{u} + (\vec{v} + \vec{w}) = (\vec{u} + \vec{v}) + \vec{w}$
2. $\vec{v} + \vec{w} = \vec{w} + \vec{v}$
3. There is a vector $\vec{0}$ such that $\vec{0} + \vec{v} = \vec{v}$ for all $\vec{v}$
4. For every $\vec{v}$ there is a $-\vec{v}$ such that $\vec{v} + (-\vec{v}) = \vec{0}$
5. $a(b\vec{v}) = (ab)\vec{v}$
6. $1\vec{v} = \vec{v}$
7. $a(\vec{v} + \vec{w}) = a\vec{v} + a\vec{w}$
8. $(a + b)\vec{v} = a\vec{v} + b\vec{v}$
These rules are called axioms. If a space satisfies these 8 axioms, it is a vector space. Axioms are not rules of nature, but an interface (think of a Java interface!).
The form that your vectors take doesn’t really matter as long as there is some notion of adding and scaling vectors that follow these axioms.
Abstractness is the price of generality.