Struct faiss::PCAMatrix

struct PCAMatrix : public faiss::LinearTransform

Applies principal component analysis to a set of vectors, with optional whitening and random rotation.

Public Functions

explicit PCAMatrix(int d_in = 0, int d_out = 0, float eigen_power = 0, bool random_rotation = false)

virtual void train(idx_t n, const float *x) override: trains on n vectors. If n < d_in, the eigenvector matrix is completed with 0s

void copy_from(const PCAMatrix &other): copies a pre-trained PCA matrix

void prepare_Ab(): called after mean, PCAMat and eigenvalues have been computed

virtual void apply_noalloc(idx_t n, const float *x, float *xt) const override: same as apply, but the result buffer is pre-allocated by the caller

void transform_transpose(idx_t n, const float *y, float *x) const: computes x = A^T * (y - b); this is the reverse transform if A has orthonormal rows

virtual void reverse_transform(idx_t n, const float *xt, float *x) const override: works only if is_orthonormal is true

void set_is_orthonormal(): computes A^T * A to set the is_orthonormal flag
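The relationship between apply, transform_transpose and the orthonormality check can be sketched in plain C++ without Faiss. This is a minimal illustration for a single vector, not the library's implementation: the helper names apply_one, transpose_one and rows_orthonormal are hypothetical, A is assumed to be stored row-major with d_out rows of d_in values, and the tolerance is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Faiss function): y = A * x + b for one vector.
// A is assumed row-major, d_out rows by d_in columns.
std::vector<float> apply_one(const std::vector<float>& A,
                             const std::vector<float>& b,
                             const std::vector<float>& x,
                             int d_in, int d_out) {
    assert(A.size() == static_cast<std::size_t>(d_in) * d_out);
    std::vector<float> y(d_out);
    for (int i = 0; i < d_out; i++) {
        float acc = b[i];
        for (int j = 0; j < d_in; j++) acc += A[i * d_in + j] * x[j];
        y[i] = acc;
    }
    return y;
}

// Hypothetical helper: x = A^T * (y - b) for one vector.
std::vector<float> transpose_one(const std::vector<float>& A,
                                 const std::vector<float>& b,
                                 const std::vector<float>& y,
                                 int d_in, int d_out) {
    assert(A.size() == static_cast<std::size_t>(d_in) * d_out);
    std::vector<float> x(d_in, 0.0f);
    for (int i = 0; i < d_out; i++) {
        float r = y[i] - b[i];  // remove the bias first
        for (int j = 0; j < d_in; j++) x[j] += A[i * d_in + j] * r;
    }
    return x;
}

// Hypothetical check: the rows of A are orthonormal when every pair of
// rows has dot product 1 (same row) or 0 (different rows).
bool rows_orthonormal(const std::vector<float>& A, int d_in, int d_out,
                      float tol = 1e-4f) {
    if (d_out > d_in) return false;  // more rows than dimensions
    for (int i = 0; i < d_out; i++) {
        for (int j = 0; j < d_out; j++) {
            float dot = 0.0f;
            for (int k = 0; k < d_in; k++) dot += A[i * d_in + k] * A[j * d_in + k];
            float expected = (i == j) ? 1.0f : 0.0f;
            if (std::fabs(dot - expected) > tol) return false;
        }
    }
    return true;
}
```

When A is square with orthonormal rows, transpose_one(apply_one(x)) returns x. When d_out < d_in, it returns only the projection of x onto the subspace spanned by the rows of A.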

void print_if_verbose(const char *name, const std::vector<double> &mat, int n, int d) const

virtual void check_identical(const VectorTransform &other) const override

float *apply(idx_t n, const float *x) const

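Applies the transformation and returns the result in a newly allocated array

Parameters:

n – number of vectors to transform
x – input vectors, size n * d_in

Returns:

output vectors, size n * d_out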

Public Members

float eigen_power

After the transformation, the components are multiplied by eigenvalues^eigen_power

= 0: no whitening; = -0.5: full whitening

float epsilon: value added to the eigenvalues to avoid division by 0 when whitening
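As a rough illustration of how eigen_power and epsilon interact, the sketch below computes one scale factor per output component. It assumes the factor is (eigenvalue + epsilon)^eigen_power, which is a reading of the two descriptions above rather than a confirmed formula; whitening_factors is a hypothetical helper, not part of Faiss.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Faiss function): per-component scale
// factors, assumed to be (eigenvalue + epsilon)^eigen_power.
std::vector<float> whitening_factors(const std::vector<float>& eigenvalues,
                                     float eigen_power,
                                     float epsilon) {
    assert(epsilon >= 0.0f);
    std::vector<float> factors(eigenvalues.size(), 1.0f);
    if (eigen_power == 0.0f) {
        return factors;  // no whitening: components are left unscaled
    }
    for (std::size_t i = 0; i < eigenvalues.size(); i++) {
        // epsilon keeps the base strictly positive when an eigenvalue is 0
        factors[i] = std::pow(eigenvalues[i] + epsilon, eigen_power);
    }
    return factors;
}
```

With eigen_power = -0.5 and epsilon = 0, a component with variance 4 is scaled by 0.5 and one with variance 0.25 by 2, so both end up with unit variance. A positive epsilon keeps the factor finite when an eigenvalue is exactly 0.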

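bool random_rotation: whether to apply a random rotation after PCA

size_t max_points_per_d: ratio between the number of training vectors and the dimension

int balanced_bins: tries to distribute the output eigenvectors over this many bins

std::vector<float> mean: mean, size d_in

std::vector<float> eigenvalues: eigenvalues of the covariance matrix (= squared singular values)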
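The parenthetical "squared singular values" refers to the standard link between the covariance matrix and the SVD of the centered data. In the worked relation below, X is the n x d_in matrix of mean-subtracted training vectors; the symbols X, U, Σ, V, σ_i, λ_i and the 1/n normalization are introduced here for illustration and are assumptions, so the eigenvalues stored by the library may differ by a constant factor.

```latex
X = U \Sigma V^{\top}
\quad\Longrightarrow\quad
C = \frac{1}{n} X^{\top} X
  = V \left( \frac{1}{n} \Sigma^{\top} \Sigma \right) V^{\top},
\qquad
\lambda_i = \frac{\sigma_i^{2}}{n}.
```

The columns of V are the principal directions, and each eigenvalue λ_i of C is the squared singular value σ_i² of X up to the normalization constant.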

std::vector<float> PCAMat: PCA matrix, size d_in * d_in

bool have_bias: whether to use a bias term

bool is_orthonormal: whether the matrix A is orthonormal (enables reverse_transform)

std::vector<float> A: transformation matrix, size d_out * d_in

std::vector<float> b: bias vector, size d_out

bool verbose

int d_in: input dimension

int d_out: output dimension

bool is_trained: set if the VectorTransform does not require training, or if training has already been done