Principal Component Analysis

Dimension Reduction Techniques- Principal Component Analysis (PCA) is one of the popular and well-known dimension reduction techniques. In this article, we will discuss Principal Component Analysis.

Dimension Reduction: It is defined as the process of converting a data set with a large number of dimensions into a data set with fewer dimensions. It ensures that the converted data set conveys similar information concisely.

 

In the example shown in the figure (example 1), dimension reduction converts the data from 2 dimensions (x1 and x2) to 1 dimension (z1). In this example, x1 and x2 convey similar information, so keeping both adds redundancy and noise to a machine learning system without adding much new information. It is therefore better to use just one dimension, which also makes the data easier to explain.
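As a sketch of what this conversion looks like when the technique is linear (as PCA is), the single new dimension can be written as a weighted sum of the two original ones after the mean is subtracted. The weights w_1 and w_2 and the means \mu_1 and \mu_2 are notation introduced here for illustration and are not taken from the figure.

```latex
z_1 = w_1 (x_1 - \mu_1) + w_2 (x_2 - \mu_2), \qquad w_1^2 + w_2^2 = 1
```

Each point (x1, x2) is thus replaced by a single number z1, its coordinate along the direction (w_1, w_2).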

Dimension reduction offers several benefits such as:

  • Compresses the data and thus reduces storage space requirements
  • Reduces computation time, since fewer dimensions require less computation
  • Eliminates redundant features
  • Can improve model performance by removing noisy or redundant inputs

Properties of Principal Component Analysis:

  • Principal Component Analysis is a well-known dimension reduction technique
  • It transforms the variables into a new set of variables called principal components
  • These principal components are linear combinations of the original variables and are orthogonal to each other
  • The first principal component accounts for as much of the variation in the original data as possible
  • The second principal component captures as much of the remaining variance as possible while staying orthogonal to the first
  • A two-dimensional data set can have at most two principal components
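These properties can be checked numerically. The sketch below is illustrative only: the data set is hypothetical (x2 is built to be roughly twice x1 plus a little noise), the variable names are assumptions made for this example, and NumPy's `np.linalg.eigh` is used to obtain the principal directions from the covariance matrix.

```python
import numpy as np

# Hypothetical two-dimensional data set: x2 is roughly 2 * x1 plus small noise
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + 0.1 * rng.normal(size=200)
data = np.column_stack([x1, x2])

# Centre the data and compute its 2 x 2 covariance matrix
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)

# eigh returns eigenvalues in ascending order, so reverse both outputs
# to put the first principal component first
eigenvalues, eigenvectors = np.linalg.eigh(cov)
eigenvalues = eigenvalues[::-1]
eigenvectors = eigenvectors[:, ::-1]

first_pc = eigenvectors[:, 0]
second_pc = eigenvectors[:, 1]

# Orthogonality: the dot product of the two components should be ~0
dot = float(first_pc @ second_pc)

# Share of the total variance captured by each component
explained = eigenvalues / eigenvalues.sum()

print("dot product of components:", dot)
print("share of variance per component:", explained)
```

Because the data is two-dimensional, the covariance matrix is 2 x 2 and yields exactly two components. The first captures almost all of the variance here because x1 and x2 are strongly correlated by construction.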

PCA Algorithm- The steps involved in the PCA algorithm are as follows:

Step-01: Get data

Step-02: Compute the mean vector (µ)

Step-03: Subtract mean from the given data

Step-04: Calculate the covariance matrix

Step-05: Calculate the eigenvectors and eigenvalues of the covariance matrix

Step-06: Choose components and form a feature vector

Step-07: Derive the new data set
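The seven steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `pca`, the parameter `n_components`, and the sample data are assumptions made for this example. The covariance matrix here uses the sample (n - 1) normalisation, which the steps above do not specify.

```python
import numpy as np


def pca(data, n_components):
    """Reduce `data` (rows = samples, columns = features) to `n_components` dimensions."""
    # Step-02: compute the mean vector
    mean = data.mean(axis=0)

    # Step-03: subtract the mean from the given data
    centered = data - mean

    # Step-04: calculate the covariance matrix (features x features);
    # np.cov divides by n - 1 by default, which is an assumption here
    cov = np.cov(centered, rowvar=False)

    # Step-05: calculate eigenvalues and eigenvectors of the covariance matrix;
    # eigh is intended for symmetric matrices and returns ascending eigenvalues
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # Step-06: order components by decreasing eigenvalue and keep the top ones
    # as the columns of the feature vector
    order = np.argsort(eigenvalues)[::-1]
    feature_vector = eigenvectors[:, order[:n_components]]
    eigenvalues = eigenvalues[order]

    # Step-07: derive the new data set by projecting onto the chosen components
    transformed = centered @ feature_vector
    return transformed, feature_vector, eigenvalues


# Step-01: get data (a small hypothetical two-dimensional data set)
samples = np.array([
    [2.0, 1.0],
    [3.0, 5.0],
    [4.0, 3.0],
    [5.0, 6.0],
    [6.0, 7.0],
    [7.0, 8.0],
])

reduced, components, variances = pca(samples, n_components=1)
print("reduced shape:", reduced.shape)
print("share of variance per component:", variances / variances.sum())
```

Sorting by eigenvalue in Step-06 is what makes the first retained column the direction of greatest variance. Keeping fewer columns than the original number of features is where the actual reduction happens.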

Applications of Principal Component Analysis:

PCA is predominantly used as a dimensionality reduction technique in domains such as facial recognition, computer vision and image compression. It is also used for finding patterns in high-dimensional data in fields such as finance, data mining and bioinformatics.

Chemical engineering software that uses PCA includes Aspen ProMV and Aspen Inferential Qualities.
