Exploration of Dendrograms
Set up the following workflow.
- Load the good old data set on the human development index for different countries. Check that the data about geographical position (latitude, longitude, continent) and religion are among the meta variables and are thus not used in the computation of distances.
- Compute normalized Euclidean distances.
- Compute Hierarchical clustering (use Ward linkage). Annotate by country.
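For readers who want to see what the three steps above do numerically, here is a minimal sketch in Python. It is not the course workflow itself: the table `X` and the labels are made-up placeholders rather than the real HDI data, and "normalized" is assumed to mean scaling every column to zero mean and unit variance. NumPy and SciPy are assumed to be installed.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist

# Hypothetical stand-in for the HDI table: rows are countries, columns are
# socio-economic indicators. Meta variables (position, religion) are left out.
labels = ["A", "B", "C", "D", "E", "F"]
X = np.array([
    [82.0, 13.1, 45.0],
    [81.0, 12.7, 41.0],
    [80.5, 12.9, 43.0],
    [62.0, 5.2, 3.0],
    [60.5, 4.8, 2.5],
    [63.0, 5.9, 4.0],
])

# Assumed meaning of "normalized": z-score each column so that indicators
# measured on large scales do not dominate the distance.
scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Pairwise Euclidean distances in condensed form.
distances = pdist(scaled, metric="euclidean")

# Hierarchical clustering with Ward linkage.
tree = linkage(distances, method="ward")

# Compute the dendrogram layout without drawing it; "ivl" holds the leaf
# labels in the order they would appear along the dendrogram.
layout = dendrogram(tree, labels=labels, no_plot=True)
leaf_order = layout["ivl"]
print(leaf_order)
```

Dropping `no_plot=True` draws the tree, provided matplotlib is available. Leaves that end up next to each other under a low merge are the rows the clustering considers most alike.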
Now, let us explore what we have got.
- Observe the resulting dendrogram. Do parts of the "tree" correspond to geographical regions? (Sometimes they do, sometimes they don't. Just ... explore)
- It is particularly interesting to explore the cluster with developed countries, that is, Europe, North America, Australia and some others. How is it split into two groups? For instance, which countries that are traditionally (and geographically) in western Europe find themselves in the wrong cluster?
- Where is Cuba? Can you comment on why?
- Find the Russian Federation. Which other countries are similar with respect to development?
- Note, again, that this clustering is based not on any geographical or political data, but only on socio-economic data, mostly from 2015; Cuba is a good example. Bearing all this in mind: where do you find Ukraine? (Note: a country in one part of the clustering is not necessarily more similar to every country in that group than to any country in another group.)
- Instead of countries, use "Majority religion" for annotation. Countries that are better off seem to be Christian. Does this indicate that (a) Christianity spurs development, (b) smarter people turn towards Christianity, or (c) neither of these? Explain.
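The note above about group membership and similarity can be checked directly on the distances. The sketch below is only an illustration: the indicator values, the placeholder country names and the two-cluster grouping are all hypothetical, and normalization is again assumed to mean z-scoring each column. It lists the nearest neighbours of one country and marks which of them sit in another cluster.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical values of two made-up indicators for placeholder countries.
data = {
    "A": (0.0, 10.0),
    "B": (1.0, 11.0),
    "C": (2.4, 12.4),
    "D": (4.0, 14.0),
    "E": (4.6, 14.6),
    "F": (5.1, 15.1),
}

# Hypothetical grouping, as one might read it off a dendrogram cut in two.
cluster = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}


def standardize(rows):
    """Z-score each column so that no single indicator dominates."""
    columns = list(zip(*rows.values()))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]
    return {
        name: tuple((v - m) / s for v, m, s in zip(values, means, stdevs))
        for name, values in rows.items()
    }


def nearest(name, scaled, k=3):
    """Return the k countries closest to `name`, nearest first."""
    ranked = sorted(
        (dist(scaled[name], vec), other)
        for other, vec in scaled.items()
        if other != name
    )
    return [other for _, other in ranked[:k]]


scaled = standardize(data)
neighbours = nearest("C", scaled)
for other in neighbours:
    where = "same cluster" if cluster[other] == cluster["C"] else "other cluster"
    print(other, round(dist(scaled["C"], scaled[other]), 2), where)
```

In this toy data, country C is grouped with A and B, yet it is closer to D than to A. A dendrogram summarizes the distances; it does not replace them, so checking the nearest neighbours of a single country is a useful complement to reading the tree.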
Last modified: Monday, 14 March 2022, 10:54 AM