An explanation of how the system is designed.
iPCA (interactive Principal Component Analysis) is a system with which users can interactively analyze data. It uses Principal Component Analysis (PCA), a mathematical technique widely used in many fields for factor and trend analysis, dimension reduction, etc. However, PCA is often considered a "black box" operation whose results are difficult to interpret and sometimes counter-intuitive to the user. To help the user better understand and utilize PCA, the system visualizes the results of principal component analysis using multiple coordinated views and a rich set of user interactions. Our design philosophy is to support analysis of multivariate datasets through extensive interaction with the PCA output.
The system consists of four views (A to D) and two control panels (E and F).
Projection View(A): Two principal components (by default, the first and second most dominant eigenvectors) are used to project data points onto a two-dimensional coordinate system.
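The projection described for the Projection View can be sketched as follows. This is a minimal illustration, not the iPCA implementation: it assumes NumPy, assumes the components come from an eigendecomposition of the covariance matrix of mean-centered data, and the function name `project_onto_components` and the sample data are hypothetical.

```python
import numpy as np

def project_onto_components(data, pc_x=0, pc_y=1):
    """Project rows of `data` onto two principal components.

    `pc_x` and `pc_y` are 0-based ranks (0 = most dominant component).
    Returns the 2-D coordinates and the eigenvalues in descending order.
    """
    X = np.asarray(data, dtype=float)
    centered = X - X.mean(axis=0)            # PCA works on mean-centered data
    cov = np.cov(centered, rowvar=False)     # columns are the data dimensions
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1]    # re-rank by dominance (descending)
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    basis = eigenvectors[:, [pc_x, pc_y]]    # the two chosen axes
    return centered @ basis, eigenvalues

# Hypothetical sample: six data points in three dimensions.
data = [
    [2.5, 2.4, 1.0],
    [0.5, 0.7, 0.9],
    [2.2, 2.9, 1.2],
    [1.9, 2.2, 0.8],
    [3.1, 3.0, 1.1],
    [2.3, 2.7, 1.0],
]
coords, eigenvalues = project_onto_components(data)
```

Each row of `coords` is the (X, Y) position of one data point in the two-dimensional coordinate system; the variance along the first axis equals the largest eigenvalue.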
Eigenvector View(B): In the Eigenvector View, data points are shown in the eigenspace. The calculated eigenvectors and their eigenvalues are displayed in a vertically projected parallel coordinates visualization, with eigenvectors ranked from top to bottom by dominance. The distances between eigenvectors in the parallel coordinates visualization vary with their eigenvalues, so that the eigenvectors are separated according to their mathematical weights.
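One way to picture the eigenvalue-dependent spacing is the sketch below. The exact layout rule used by iPCA is not stated here, so this assumes that the gap below each axis is proportional to that eigenvector's share of the total eigenvalue sum; the function name `axis_positions` is hypothetical.

```python
def axis_positions(eigenvalues, height=1.0):
    """Rank eigenvectors by dominance and place their axes top to bottom.

    Returns (order, positions): `order` holds the original indices sorted by
    descending eigenvalue, and `positions` holds the vertical offset of each
    ranked axis, with 0.0 at the top. The gap below an axis is proportional
    to its eigenvalue (an assumed layout rule).
    """
    order = sorted(range(len(eigenvalues)),
                   key=lambda i: eigenvalues[i], reverse=True)
    ranked = [eigenvalues[i] for i in order]
    total = sum(ranked)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    positions = []
    y = 0.0
    for value in ranked:
        positions.append(y)
        y += height * value / total   # dominant eigenvectors get more room
    return order, positions

order, positions = axis_positions([0.5, 3.0, 1.5])
```

With this rule, the most dominant eigenvector sits at the top and is followed by the widest gap, while weak eigenvectors are packed closely together near the bottom.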
Data View(C): The Data View is located below the Projection View and shows a parallel coordinates visualization of all data points in the original data dimensions. In this view, an auto-scaling function is applied to increase the readability of the data.
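The auto-scaling function is not specified further. A common choice for parallel coordinates, assumed here purely for illustration, is per-dimension min-max normalization, so that every axis spans the same visual range regardless of its units. The function name `auto_scale` is hypothetical.

```python
def auto_scale(rows):
    """Rescale each dimension (column) of `rows` to the range [0, 1].

    A constant column has no spread, so its values are placed at the
    middle of the axis (0.5) instead of dividing by zero.
    """
    scaled_columns = []
    for column in zip(*rows):                 # iterate over dimensions
        lo, hi = min(column), max(column)
        span = hi - lo
        scaled_columns.append(
            [(value - lo) / span if span else 0.5 for value in column]
        )
    return [list(row) for row in zip(*scaled_columns)]  # back to row layout

rows = [
    [1, 100, 5],
    [2, 300, 5],
    [3, 200, 5],
]
scaled = auto_scale(rows)
```

Without such scaling, a dimension with values in the hundreds would visually flatten a dimension with values between 1 and 3 when both share one vertical scale.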
Correlation View(D): Pearson-correlation coefficients and relationships between variables are represented in the Correlation View as a matrix of scatter plots and values. Since correlations between dimensions are symmetric, repetition is avoided by separating the matrix into three components: the diagonal, the bottom triangle, and the top triangle. The diagonal displays the name of the dimension as a text string. The bottom triangle shows the coefficient value between two dimensions with a color indicating positive (red), neutral (white), and negative (blue) correlations. The top triangle contains cells of scatter plots in which all data items are projected onto the two intersecting dimensions. The colors of the data items are the same as the colors used in the other three views so that clusters are easily identified.
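The bottom-triangle cells can be sketched as below: a Pearson coefficient for each pair of dimensions plus a color on a blue-white-red scale. The linear color ramp and the names `pearson`, `correlation_color`, and `bottom_triangle` are assumptions made for illustration; the actual color mapping in iPCA may differ.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_color(r):
    """Map r in [-1, 1] to RGB: blue (-1), white (0), red (+1)."""
    r = max(-1.0, min(1.0, r))                # guard against rounding drift
    fade = round(255 * (1 - abs(r)))          # weaker correlation is paler
    return (255, fade, fade) if r >= 0 else (fade, fade, 255)

def bottom_triangle(columns):
    """Return {(i, j): (r, color)} for every cell below the diagonal."""
    cells = {}
    for i in range(len(columns)):
        for j in range(i):                    # j < i: below the diagonal
            r = pearson(columns[i], columns[j])
            cells[(i, j)] = (r, correlation_color(r))
    return cells

# Hypothetical dimensions: the second rises with the first, the third falls.
columns = [[1, 2, 3, 4], [2, 4, 6, 8], [8, 6, 4, 2]]
cells = bottom_triangle(columns)
```

Because the correlation of dimension i with j equals that of j with i, only the cells with j < i need to be computed, which is what leaves the top triangle free for the scatter plots.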
Here are some screenshots.
Here is a short explanation of how to use the system.
|Data loading||- Click the "Browse..." button and select the data you want to load. Then click the "Start Loading" button located below the "Browse..." button.|
|Navigation (works only in the Projection View)||- Zooming: press the left mouse button to zoom in, or the right mouse button to zoom out.
- Panning: hold the middle mouse button and move the mouse.|
|Item selection (available in all views)||- Single item selection (Ctrl + left mouse click): useful for selecting an individual item.
- Range selection (Alt + left mouse button, then drag to create a region boundary): useful for selecting multiple items.|
|Changing the pre-selected principal components||By default, the first principal component (PC1) and the second principal component (PC2) are mapped to the X-axis and Y-axis of the Projection View, respectively.
In the options (located next to the Eigenvector View), the user can change the pre-selected PC1 and PC2. To change the selection from PC2 to PC3, first click the PC2 check box to deselect it, then click the PC3 check box. The Projection View is then redrawn with PC1 on the X-axis and PC3 on the Y-axis.|
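The check-box behavior described for changing principal components can be modeled roughly as follows. This is only a sketch of the interaction, under two assumptions not confirmed here: that exactly two components must be selected before the projection is drawn, and that the lower-numbered component is assigned to the X-axis. The names `toggle_component` and `axis_mapping` are hypothetical.

```python
def toggle_component(selected, index):
    """Return the selection after clicking the check box for PC `index`.

    Clicking a checked box deselects it. Clicking an unchecked box selects
    it, but only if fewer than two components are currently selected.
    """
    if index in selected:
        return [i for i in selected if i != index]
    if len(selected) >= 2:
        raise ValueError("deselect a component before selecting another")
    return sorted(selected + [index])

def axis_mapping(selected):
    """Map the two selected components to the Projection View axes."""
    if len(selected) != 2:
        raise ValueError("exactly two components must be selected")
    x, y = selected                 # assumed: lower-numbered PC goes on X
    return {"x": f"PC{x}", "y": f"PC{y}"}

selection = [1, 2]                                # default: PC1 and PC2
selection = toggle_component(selection, 2)        # click PC2 to deselect it
selection = toggle_component(selection, 3)        # click PC3 to select it
mapping = axis_mapping(selection)
```

After these two clicks, `mapping` assigns PC1 to the X-axis and PC3 to the Y-axis, matching the example in the table.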