讲一下numpy的矩阵特征值分解与奇异值分解

时间:2024-01-12 22:37:08

1、特征值分解

主要还是调包:

from numpy.linalg import eig

特征值分解:  A = P*B*PT  当然也可以写成 A = QT*B*Q  其中B为对角元为A的特征值的对角矩阵,P=QT

首先A得对称正定,然后才能在实数域上分解,

>>> A = np.random.randint(-10,10,(4,4))
>>> A
array([[ 6, 9, -10, -1],
[ 5, 9, 5, -5],
[ -8, 7, -4, 4],
[ -1, -9, 0, 6]]) >>> C = np.dot(A.T, A)
>>> C
array([[126, 52, -3, -69],
[ 52, 292, -73, -80],
[ -3, -73, 141, -31],
[-69, -80, -31, 78]]) >>> vals, vecs = eig(C)
>>> vals
array([357.33597086, 174.10172008, 8.84429957, 96.71800949])
>>> vecs
array([[-0.28738314, -0.51589436, -0.38221983, -0.71075449],
[-0.87487263, 0.12873861, -0.24968051, 0.39456798],
[ 0.2572149 , -0.69304313, -0.33950158, 0.58161018],
[ 0.29300052, 0.48679627, -0.82237845, -0.02955945]])

故使用时应先将特征值转换为矩阵:

>>> Lambda = np.diag(vals)
>>> Lambda
array([[357.33597086, 0. , 0. , 0. ],
[ 0. , 174.10172008, 0. , 0. ],
[ 0. , 0. , 8.84429957, 0. ],
[ 0. , 0. , 0. , 96.71800949]]) >>> np.dot(np.dot(vecs, Lambda), vecs.T) # 与C=A.T*A相等
array([[126., 52., -3., -69.],
[ 52., 292., -73., -80.],
[ -3., -73., 141., -31.],
[-69., -80., -31., 78.]]) >>> np.dot(np.dot(vecs.T, Lambda), vecs)
array([[171.65817919, 45.58778569, 53.20435074, 13.37512137],
[ 45.58778569, 125.15670964, 28.22684299, 134.91290105],
[ 53.20435074, 28.22684299, 129.48789571, 80.5284382 ],
[ 13.37512137, 134.91290105, 80.5284382 , 210.69721545]])

故验证了使用np中的eig分解为A=P*B*PT 而不是A=QT*B*Q,其中P=vecs,

即 C = vecs * np.diag(vals) * vecs.T # 这里简写*为矩阵乘法

然后再来看使用np中的eig分解出来的vec中行向量是特征向量还是列向量是特征向量,只需验证:A*vecs[0] = vals[0]*vecs[0]

>>> np.dot(C, vecs[0])
array([-12.84806258, -80.82266859, 6.66283128, 17.51094927])
>>> vals[0]*vecs[0]
array([-102.69233303, -184.34761071, -136.58089252, -253.97814676]) >>> np.dot(C, vecs[:,0])
array([-102.69233303, -312.62346098, 91.91213634, 104.69962583])
>>> vals[0]*vecs[:, 0]
array([-102.69233303, -312.62346098, 91.91213634, 104.69962583])

后者两个是相等的,故使用np中的eig分解出的vecs的列向量是特征向量。

然后我们可以验证P是单位正交矩阵:

>>> np.dot(vecs.T, vecs)
array([[ 1.00000000e+00, -7.13175042e-17, -2.45525952e-18,
2.75965773e-16],
[-7.13175042e-17, 1.00000000e+00, 2.49530948e-17,
-5.58839097e-16],
[-2.45525952e-18, 2.49530948e-17, 1.00000000e+00,
-7.85564967e-17],
[ 2.75965773e-16, -5.58839097e-16, -7.85564967e-17,
1.00000000e+00]]) >>> np.dot(vecs, vecs.T)
array([[ 1.00000000e+00, 2.97888865e-16, -2.68317972e-16,
1.69020590e-16],
[ 2.97888865e-16, 1.00000000e+00, -4.40952204e-18,
-6.24188690e-17],
[-2.68317972e-16, -4.40952204e-18, 1.00000000e+00,
-1.13726775e-17],
[ 1.69020590e-16, -6.24188690e-17, -1.13726775e-17,
1.00000000e+00]]) # 可以看到除对角元外其他都是非常小的数

PT*P = P*PT = E , PT=P-1。事实上,在求解P的过程中就使用了施密特正交化过程。

另一方面,我们从数学角度来看:

首先补充一些数学知识:可以看我另一篇文章:矩阵知识

A = P*B*P-1  ,其中B为对角元素为A的特征值的对角阵,P的列向量为特征值对应的特征向量(因为B每行乘以P每列)

2、奇异值分解

还是调包:

from numpy.linalg import svd

设任意矩阵A是m*n矩阵

奇异值分解:A = U*Σ*VT , 其中U为满足UTU=E的m阶(m*m)酉矩阵,Σ为对角线上为奇异值σi 其他元素为0的广义m*n对角阵,V为满足VTV=E的n阶(n*n)酉矩阵

a = np.random.randint(-10,10,(4, 3)).astype(float)

'''
array([[ -9., 3., -7.],
[ 4., -8., -1.],
[ -1., 6., -9.],
[ -4., -10., 2.]])
''' In [53]: u, s, vh = np.linalg.svd(a) # 这里vh为V的转置 In [55]: u.shape, s.shape, vh.shape
Out[55]: ((4, 4), (3,), (3, 3)) '''
In [63]: u
Out[63]:
array([[-0.53815289, 0.67354057, -0.13816841, -0.48748749],
[ 0.40133556, 0.1687729 , 0.78900752, -0.43348888],
[-0.59291924, 0.04174708, 0.59180987, 0.54448603],
[ 0.44471115, 0.71841213, -0.09020922, 0.52723647]]) In [64]: s
Out[64]: array([16.86106528, 11.07993065, 7.13719934]) In [65]: vh
Out[65]:
array([[ 0.31212695, -0.760911 , 0.56885078],
[-0.74929793, -0.56527432, -0.3449892 ],
[ 0.58406282, -0.31855829, -0.74658639]])
''' In [56]: np.allclose(a, np.dot(u[:, :3] * s, vh))
Out[56]: True # 将s转化为奇异值矩阵
In [60]: smat[:3, :3] = np.diag(s) In [61]: smat
Out[61]:
array([[16.86106528, 0. , 0. ],
[ 0. , 11.07993065, 0. ],
[ 0. , 0. , 7.13719934],
[ 0. , 0. , 0. ]]) # 验证分解的正确性
In [62]: np.allclose(a, np.dot(u, np.dot(smat, vh)))
Out[62]: True