Computer Vision 1

Matrices and Images

In MATLAB, a color (RGB) image is stored as a three-dimensional array of size $m\times n\times 3$.

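All of the image operations below are expressed in homogeneous coordinates, where a point $(x,y)$ is written as the vector $(x,y,1)^T$. In homogeneous coordinates, the multiplication works out so that the rightmost column of the matrix is a vector that gets added.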

Image Translation
Original position: $P=(x,y)$; new position: $P'=(x+t_x,y+t_y)$; translation offset: $t=(t_x,t_y)$. The translation matrix $T$ then satisfies
\[ \begin{bmatrix} x+t_x\\y+t_y\\1 \end{bmatrix}=P'=TP=\begin{bmatrix} 1&0&t_x\\ 0&1&t_y\\ 0&0&1 \end{bmatrix}\cdot \begin{bmatrix} x\\y\\1 \end{bmatrix} \]
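
As a minimal sketch of the translation above, the following plain-Python snippet applies $T$ to one point. The helper names (`mat_vec`, `translation_matrix`) and the sample values are illustrative assumptions, not part of the original notes.

```python
def mat_vec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a length-3 vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def translation_matrix(tx, ty):
    """Homogeneous 2D translation matrix T for the offset (tx, ty)."""
    return [[1, 0, tx],
            [0, 1, ty],
            [0, 0, 1]]

# Hypothetical example: move the point (2, 3) by t = (5, -1).
P = [2, 3, 1]                                  # (x, y) in homogeneous form
P_new = mat_vec(translation_matrix(5, -1), P)
print(P_new)                                   # [7, 2, 1]
```

The offset sits in the rightmost column of the matrix, and the trailing 1 in the point is what causes it to be added.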

Image Scaling
Original position: $P=(x,y)$; new position: $P'=(s_x x,s_y y)$; scale factors: $s=(s_x,s_y)$. The scaling matrix $S$ then satisfies
\[ \begin{bmatrix} s_x x\\s_y y\\1 \end{bmatrix}=P'=SP=\begin{bmatrix} s_x&0&0\\ 0&s_y&0\\ 0&0&1 \end{bmatrix}\cdot \begin{bmatrix} x\\y\\1 \end{bmatrix} \]
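
The scaling matrix can be tried out in the same way. This is a small sketch in plain Python; `scaling_matrix`, `mat_vec`, and the sample factors are assumed names and values chosen for illustration.

```python
def mat_vec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a length-3 vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def scaling_matrix(sx, sy):
    """Homogeneous 2D scaling matrix S with factors (sx, sy)."""
    return [[sx, 0, 0],
            [0, sy, 0],
            [0, 0, 1]]

# Hypothetical example: stretch x by 2 and shrink y by half.
P = [2, 3, 1]                                  # (x, y) in homogeneous form
P_scaled = mat_vec(scaling_matrix(2, 0.5), P)
print(P_scaled)                                # [4, 1.5, 1]
```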

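Because matrix multiplication is not commutative, the order of the operations matters: translating and then scaling generally gives a different result from scaling and then translating, i.e. $ST\neq TS$ in general.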
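
A quick numeric check makes the order dependence concrete. The sketch below multiplies one sample translation and one sample scaling in both orders; the matrices and the `mat_mul` helper are illustrative assumptions.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

T = [[1, 0, 3],    # translate by (3, 4)
     [0, 1, 4],
     [0, 0, 1]]
S = [[2, 0, 0],    # scale both axes by 2
     [0, 2, 0],
     [0, 0, 1]]

TS = mat_mul(T, S)  # applied to a point: scale first, then translate
ST = mat_mul(S, T)  # applied to a point: translate first, then scale

print(TS)           # [[2, 0, 3], [0, 2, 4], [0, 0, 1]]
print(ST)           # [[2, 0, 6], [0, 2, 8], [0, 0, 1]]
```

In $ST$ the offset is scaled along with the point, which is why the two rightmost columns differ.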

Image Rotation
Original position: $P=(x,y)$; new position: $P'=(r_x,r_y)$; rotation angle: $\theta$. The rotation matrix $R$ then satisfies
\[ \begin{bmatrix} r_x\\r_y\\1 \end{bmatrix}=P'=RP=\begin{bmatrix} \cos(\theta)&-\sin(\theta)&0\\ \sin(\theta)&\cos(\theta)&0\\ 0&0&1 \end{bmatrix}\cdot \begin{bmatrix} x\\y\\1 \end{bmatrix} \]
Written out, the rotation formulas are
\[ \begin{split} r_x &= \cos(\theta)x-\sin(\theta)y\\ r_y &= \sin(\theta)x+\cos(\theta)y \end{split} \]
Properties of the rotation matrix: its transpose produces a rotation in the opposite direction, so $R^T=R^{-1}$ and
\[ \begin{split} RR^T &=R^TR=I \\ \det(R)&=1 \end{split} \]
The rows of a rotation matrix are mutually perpendicular (orthogonal) and have unit length.
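
The rotation properties can be checked numerically. The following is a minimal sketch using only Python's `math` module; the helper names and the 30-degree angle are assumptions made for the example.

```python
import math

def rotation_matrix(theta):
    """Homogeneous 2D rotation matrix R for an angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def close(A, B, tol=1e-12):
    """True if two 3x3 matrices agree entry by entry within tol."""
    return all(math.isclose(A[i][j], B[i][j], abs_tol=tol)
               for i in range(3) for j in range(3))

theta = math.radians(30)            # hypothetical example angle
R = rotation_matrix(theta)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# R * R^T should be the identity, up to floating-point rounding.
is_orthogonal = close(mat_mul(R, transpose(R)), identity)

# The last row and column are (0, 0, 1), so det(R) reduces to the 2x2 block.
det_R = R[0][0] * R[1][1] - R[0][1] * R[1][0]

# The transpose equals the rotation by -theta, i.e. the opposite direction.
transpose_is_inverse = close(transpose(R), rotation_matrix(-theta))

print(is_orthogonal, det_R, transpose_is_inverse)
```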

Singular Value Decomposition (SVD)
SVD represents any matrix $A$ as a product of three matrices: $A=U\Sigma V^T$. In MATLAB, the function is $[U,S,V]=svd(A)$; note that it returns $V$, not $V^T$. For example, with $m=2$ and $n=3$ (entries rounded to two decimals):
\[ \begin{bmatrix} -0.39&-0.92\\ -0.92&0.39 \end{bmatrix}_{m\times m}\times\begin{bmatrix} 9.51&0&0\\0&0.77&0 \end{bmatrix}_{m\times n}\times\begin{bmatrix} -0.42&-0.57&-0.70\\0.81&0.11&-0.58\\0.41&-0.82&0.41 \end{bmatrix}_{n\times n}=\begin{bmatrix} 1&2&3\\4&5&6 \end{bmatrix}_{m\times n} \]
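
The factorization can be sanity-checked by multiplying the three printed factors back together. The sketch below does this in plain Python with the two-decimal values from the equation, so the product matches $A$ only approximately; `mat_mul` and `max_err` are illustrative names.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Rounded factors copied from the example; Vt stands for V^T.
U = [[-0.39, -0.92],
     [-0.92, 0.39]]
Sigma = [[9.51, 0.0, 0.0],
         [0.0, 0.77, 0.0]]
Vt = [[-0.42, -0.57, -0.70],
      [0.81, 0.11, -0.58],
      [0.41, -0.82, 0.41]]
A = [[1, 2, 3],
     [4, 5, 6]]

product = mat_mul(mat_mul(U, Sigma), Vt)

# Largest entry-wise deviation from A; it is nonzero only because the
# factors above are rounded to two decimals.
max_err = max(abs(product[i][j] - A[i][j])
              for i in range(2) for j in range(3))
print(max_err < 0.1)   # True
```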

The meaning of the SVD. First absorb the scaling factors (the singular values) into $U$ to obtain $[U\Sigma]_{m\times n}$, so that the product becomes
\[ \begin{bmatrix} -3.67 & -0.71 & 0 \\ -8.8 & 0.30 & 0 \end{bmatrix}\times \begin{bmatrix} -0.42 & -0.57 & -0.70 \\ 0.81 & 0.11 & -0.58 \\ 0.41 & -0.82 & 0.41 \end{bmatrix} \]

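The first column of $U\Sigma$ affects only the first row of $V^T$. That is, the first column $[U\Sigma]_1=\begin{bmatrix} -3.67 \\ -8.8\end{bmatrix}$ scales only the first row of $V^T$, $(V_1)^T=\begin{bmatrix} -0.42&-0.57&-0.70 \end{bmatrix}$. Likewise, the second column affects only the second row, and the third column of $U\Sigma$ is zero, so the third row of $V^T$ contributes nothing. The resulting contributions are
\[ \begin{split} A_1&=\begin{bmatrix} 1.6&2.1&2.6\\ 3.8&5.0&6.2 \end{bmatrix}=[U\Sigma]_1 \otimes (V_1)^T\\ A_2&=\begin{bmatrix} -0.6&-0.1&0.4 \\ 0.2&0&-0.2 \end{bmatrix}=[U\Sigma]_2 \otimes (V_2)^T \end{split} \]
where $\otimes$ denotes the Kronecker product (for a column vector and a row vector, this is the outer product). Clearly, $A=A_1+A_2$ up to rounding. More significantly, the matrix $A_1$, built from the first columns of $U$ and $V$ together with $\Sigma_1$ (the first singular value), is already not far from $A$. In general, we call these few column vectors of $U$ the principal components.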
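
To see how much of $A$ the first term already captures, the rank-one pieces can be rebuilt from the rounded numbers above. This is a sketch in plain Python; `outer`, `frobenius`, and `rel_err` are names assumed for the example, and the results carry the rounding error of the two-decimal inputs.

```python
import math

def outer(col, row):
    """Outer product of a column vector and a row vector."""
    return [[c * r for r in row] for c in col]

def frobenius(M):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in M for x in row))

# Rounded values copied from the text: columns of U*Sigma, rows of V^T.
US_1, US_2 = [-3.67, -8.8], [-0.71, 0.30]
V1_t = [-0.42, -0.57, -0.70]
V2_t = [0.81, 0.11, -0.58]
A = [[1, 2, 3],
     [4, 5, 6]]

A1 = outer(US_1, V1_t)   # contribution of the first singular value
A2 = outer(US_2, V2_t)   # contribution of the second singular value

# A1 + A2 should reproduce A up to the rounding of the inputs.
A_sum = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]

# Relative Frobenius error when A is approximated by A1 alone.
residual = [[a - b for a, b in zip(ra, r1)] for ra, r1 in zip(A, A1)]
rel_err = frobenius(residual) / frobenius(A)
print(rel_err < 0.1)     # True: A1 alone is within 10% of A
```

Keeping only the largest singular values and their vectors in this way is the idea behind low-rank approximation.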