PCA calculates values (matrices, really) from a dataset of samples, each of which has a value for every variable. In Coda, a table naturally represents the dataset: each sample is a row, each variable is a column.
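To make the row-and-column picture concrete, here is a rough Python sketch of how the first two principal components could be computed from such a table. It is not the pack's actual implementation: the function names and the sample data are hypothetical, the variables are assumed to be standardized before the analysis (the pack may or may not do this), and power iteration with deflation is just one simple way to obtain the components.

```python
import math

def standardize(rows):
    """Center every column and scale it to unit sample variance."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of rows that are already centered."""
    n, p = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
        for i in range(p)
    ]

def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    p = len(matrix)
    v = [1.0 / (k + 1) for k in range(p)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(v, mv)), v

def first_two_components(rows):
    """Return (scores, loadings, eigenvalues) for PC1 and PC2."""
    z = standardize(rows)
    m = covariance(z)
    p = len(m)
    loadings, eigenvalues = [], []
    for _ in range(2):
        value, vector = top_eigenpair(m)
        eigenvalues.append(value)
        loadings.append(vector)
        # Deflation: subtract the component just found before searching again.
        m = [
            [m[i][j] - value * vector[i] * vector[j] for j in range(p)]
            for i in range(p)
        ]
    # Each sample's value along a component is a weighted sum of its variables.
    scores = [[sum(a * b for a, b in zip(row, vec)) for vec in loadings] for row in z]
    return scores, loadings, eigenvalues

# Hypothetical dataset: one row per sample, one column per variable.
labels = ["A", "B", "C", "D", "E"]
data = [
    [2.0, 8.0, 1.0],
    [4.0, 6.0, 3.0],
    [6.0, 5.0, 2.0],
    [8.0, 3.0, 5.0],
    [10.0, 1.0, 4.0],
]
scores, loadings, eigenvalues = first_two_components(data)
for label, (pc1, pc2) in zip(labels, scores):
    print(f"{label}: PC1={pc1:+.2f}  PC2={pc2:+.2f}")
```

In this sketch, `scores` plays the role of the per-sample values and `loadings` the role of the per-variable weights. The sign of a component is arbitrary, so another implementation may return the same components with all signs flipped.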
This pack provides you with two sync tables:
- Principal Components, which gives you the values of the first two Principal Components for your dataset
- Loadings, which gives you the weight of each variable in each Principal Component

To use the Principal Components sync table
1. Drag the table to your document from the Pack tab on the right.
2. Press the button on the table.
3. In the Label entry field, select the column with the names of your samples (e.g. MovieReviews.MovieName, DrinkingHabits.CountryName, etc.).
4. In the Variable1 entry field, select the column with the values for your first variable (e.g. MovieReviews.Vanity, or DrinkingHabits.Spirits).
5. In the Variable2 entry field, select the column with the values for your second variable (e.g. MovieReviews.TheNewYorkTimes, or DrinkingHabits.Wine).
6. Add as many variables as needed (up to 6).
7. Once all variables have been added, press the button to finish.
8. The sync table returns a single column, “Principal Components”. Move your pointer over the column to display the pulldown menu showing a small jigsaw piece and choose “PCA Pack - Principal Components options”. In the dialog box that opens, click “Related columns”, then click each of the columns listed.
9. The sync table now shows all samples as rows, each with its own label and its value along the first Principal Component (column Pc1) and the second Principal Component (column Pc2).

Figure: Drinking Habits by Principal Components
The sync table can be displayed as a Scatter chart, with PC1 as the horizontal axis, PC2 as the vertical axis, and segmented by Label.

To use the Loadings sync table
1. Drag the table to your document from the Pack tab on the right.
2. Press the button on the table.
3. In the VariableNames entry field, enter the list of your variable names, e.g. =List("Spirits", "Wine", "Beer", "Life Expectancy", "Heart Disease Rate"). Alternatively, create a table with a Text column in which each row lists one variable name, and select this Text column in the VariableNames entry field.
4. In the Variable1 entry field, select the column with the values for your first variable (e.g. MovieReviews.Vanity, or DrinkingHabits.Spirits).
5. In the Variable2 entry field, select the column with the values for your second variable (e.g. MovieReviews.TheNewYorkTimes, or DrinkingHabits.Wine).
6. Add as many variables as needed (up to 6).
7. Once all variables have been added, press the button to finish.
8. The sync table returns a single column, “Loadings”. Move your pointer over the column to display the pulldown menu showing a small jigsaw piece and choose “PCA Pack - Loadings options”. In the dialog box that opens, click “Related columns”, then click each of the columns listed.
9. The sync table now shows all variables as rows, each with its own label and its weight in each principal component.

Figure: Drinking Habits: Loadings
The table reads as follows:
Principal Component 1 = 0.35 * Spirits - 0.45 * Wine + 0.07 * Beer - 0.58 * Life Expectancy + 0.58 * Heart Disease Rate
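Read this way, a sample's value along Principal Component 1 is a weighted sum of its variable values. The Python sketch below illustrates that arithmetic using the loadings quoted above. The sample values are hypothetical and are assumed to be already standardized; whether the pack standardizes its inputs is not stated here.

```python
# Loadings for Principal Component 1, copied from the example above.
pc1_loadings = {
    "Spirits": 0.35,
    "Wine": -0.45,
    "Beer": 0.07,
    "Life Expectancy": -0.58,
    "Heart Disease Rate": 0.58,
}

def pc1_score(sample):
    """Weighted sum of a sample's variable values, one weight per variable."""
    return sum(weight * sample[name] for name, weight in pc1_loadings.items())

# Hypothetical sample (one row of the dataset), assumed already standardized.
sample = {
    "Spirits": 1.0,
    "Wine": -1.0,
    "Beer": 0.0,
    "Life Expectancy": -0.5,
    "Heart Disease Rate": 0.5,
}

print(round(pc1_score(sample), 2))  # prints 1.38
```

A large positive weight (here Heart Disease Rate) pushes a sample toward the positive end of the component, and a large negative weight (here Life Expectancy) pushes it the other way.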
The last row of the sync table gives the percentage of the data explained by, respectively, the first PC, the first two PCs, the first three PCs, etc. You can retrieve the first of these values with this formula (replacing XXX with the corresponding name):
Format("{1}%", 100 * Loadings.Filter(Variable Name = "XXX Percentage Explained").Principal Component1)
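As an illustration of what such cumulative percentages mean, the Python sketch below assumes that “percentage explained” refers to the share of total variance carried by the leading components, which is the usual PCA convention but is not confirmed here for this pack. The function name and the variances are hypothetical.

```python
from itertools import accumulate

def cumulative_percentage_explained(eigenvalues):
    """Cumulative share of total variance for the first 1, 2, 3, ... components."""
    total = sum(eigenvalues)
    return [100.0 * running / total for running in accumulate(eigenvalues)]

# Hypothetical component variances (eigenvalues), largest first.
eigenvalues = [3.0, 1.0, 0.5, 0.3, 0.2]

print([round(p, 1) for p in cumulative_percentage_explained(eigenvalues)])
# prints [60.0, 80.0, 90.0, 96.0, 100.0]
```

With these made-up numbers, the first component alone would account for 60% and the first two together for 80%, which is the kind of reading the last row of the Loadings table supports.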
To display only the Loading values, you need to filter out the last row: in the Filter tab on the right, press the button and select “Variable Name” “does not contain” “Percentage Explained”.

To Use Two Datasets in a Doc
Coda allows only one instance of a sync table per doc, so the same sync table is used for all your datasets. To add a second dataset to the Principal Components or Loadings sync table:
1. On the sync table, press Options.
2. On the tab on the right, choose the PCA Pack.
3. Press the button there.
4. Select the data for your second analysis as you did for the first dataset.
5. Once done, give your dataset a name: select “Group” and, in the entry field for the Group criterion, enter a name of your choosing, e.g. “MovieReviews” or “Drinking”.
6. It’s advisable to also give a “Group” criterion to your first dataset.
7. To use the results of the PCA for your second dataset, create a view of the sync table and filter it by selecting “Group” “is equal to” the group name you chose before.

You can use more than two datasets by repeating the steps above.