# Naive Principal Component Analysis

Principal Component Analysis (PCA) is a form of unsupervised learning. It has a geometric interpretation: a principal component is a direction along which the data shows the largest variation.

The PCA algorithm first centres the data by demeaning it, then places an axis along the direction of largest variation, and repeats for each axis orthogonal to the previous ones. As a result, most of the variation lies along the axes and the covariance matrix of the transformed data is diagonal (i.e., each component represents a new, uncorrelated variable). In general, the first few components are the most relevant.
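The steps above can be sketched as a naive PCA via eigendecomposition of the covariance matrix. The toy data and its mixing matrix are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 200 samples, 3 correlated features (hypothetical example)
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.5, 0.0],
                                          [0.0, 1.0, 0.3],
                                          [0.0, 0.0, 0.2]])

# 1) Centre the data by demeaning each column
Xc = X - X.mean(axis=0)

# 2) Covariance matrix of the centred data
C = np.cov(Xc, rowvar=False)

# 3) Eigendecomposition; eigh is appropriate because C is symmetric
eigvals, eigvecs = np.linalg.eigh(C)

# 4) Sort components by decreasing eigenvalue (largest variation first)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 5) Project onto the principal axes; the covariance of the projected
#    data is diagonal, so the new variables are uncorrelated
Z = Xc @ eigvecs
print(np.round(np.cov(Z, rowvar=False), 6))  # off-diagonals ~ 0
```

The eigenvectors form the new orthogonal axes, and projecting the centred data onto them yields the uncorrelated components described above.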

The eigenvalues determine how much “stretch” occurs along their corresponding eigenvectors (the axes, or components). The larger an eigenvalue, the larger the variation along that component and, as a consequence, the higher its relevancy.
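One common way to quantify this relevancy is each eigenvalue's share of the total variance (the hypothetical eigenvalues below are chosen for illustration):

```python
import numpy as np

# Hypothetical eigenvalues of a covariance matrix, already sorted
eigvals = np.array([4.0, 1.5, 0.5])

# Each eigenvalue's share of total variance measures the
# relevancy of its component
explained = eigvals / eigvals.sum()
print(explained)
```

Here the first component carries two thirds of the total variance, so it is by far the most relevant.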

Following the algorithm above, PCA can be employed to conduct a model-free factor analysis, inferring the structure of portfolio returns instead of relying on established factor models.
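As a minimal sketch of this idea, assuming simulated returns driven by a single hypothetical market factor (the factor, betas, and noise levels are all made up), the first principal component should absorb most of the common variation:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical daily returns: 250 days x 5 assets, one common factor
market = rng.normal(0, 0.01, size=(250, 1))
betas = np.array([[0.8, 1.0, 1.2, 0.9, 1.1]])
returns = market @ betas + rng.normal(0, 0.002, size=(250, 5))

# Model-free factor analysis: eigendecompose the return covariance
R = returns - returns.mean(axis=0)
eigvals, _ = np.linalg.eigh(np.cov(R, rowvar=False))
eigvals = eigvals[::-1]  # descending order

# A dominant first eigenvalue suggests a single common factor
share = eigvals[0] / eigvals.sum()
print(f"first component explains {share:.0%} of variance")
```

With real portfolio data, the number of large eigenvalues hints at how many latent factors drive the returns, without committing to any named factor model.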