Group up
------------------

* **Topic:** k-means clustering, unsupervised learning, data grouping
* **Task A:**

  1. Implement the k-means clustering algorithm.
  2. Create random sets of data with, e.g., 4 centers, and try to find them.
     You may also try changing the dimensionality of the problem.

* **Task B:**

  1. Analyse the iris flower data with the k-means algorithm.
  2. Since we already know the species of each flower, the program prints
     a report on how well it did. Does k-means recognize the flowers as well
     as the neural network did?

* **Task C:**

  1. Get the questionnaire data and analyse it with the k-means algorithm.
  2. We have no idea how many categories describe this set of data or whether
     it can be described as clusters at all. What do you think? How many
     clusters would you use to split up this data set? How would you describe
     these groups?

* **Template:** `clustering.py `_
* **Data:** The iris set combines the training and test sets from the neural
  network task. The ``questions.txt`` set was collected from university
  teaching staff at a Finnish university in 2011. The original questionnaire
  contained 30 questions on what motivates teachers in their work.
  The questions were grouped into 5 categories, and a total score between
  -1 (highly demotivating) and 1 (highly motivating) was calculated for each
  category. The categories were (1) personal benefit, (2) importance of work,
  (3) received feedback, (4) available resources, and
  (5) development possibilities.

  - `iris-fulldata.csv `_
  - `questions.txt `_

* **Further reading:**

  - https://en.wikipedia.org/wiki/Cluster_analysis
  - https://en.wikipedia.org/wiki/K-means_clustering

clustering.py
###############

.. automodule:: clustering
   :members:
   :undoc-members: