Cluster Analysis Course


EduCBA

Summary

Price
£149 inc VAT
Or £49.67/mo. for 3 months...
Study method
Online
Course format
Video
Duration
2 hours · Self-paced
Access to content
Lifetime access
Qualification
No formal qualification
Additional info
  • Exam(s) / assessment(s) is included in price

Overview

What is Cluster Analysis?

Cluster analysis is a statistical technique for classifying objects into groups called clusters, so that objects in the same cluster are more similar to one another than to objects in other clusters. In simple terms, cluster analysis divides data into groups that are meaningful, useful, or both. Clustering serves two main purposes: clustering for understanding and clustering for utility.

Application of cluster analysis

  • Cluster analysis is used in many fields like machine learning, market research, pattern recognition, data analysis, information retrieval, image processing and data compression.
  • Cluster analysis can help the marketers to find out distinct groups of their customer base.
  • Cluster analysis is used in the field of biology to derive plant and animal taxonomies and to categorize genes with similar characteristics.
  • Cluster analysis is used in an earth observation database to group the houses in a city according to house type, value and location.
  • Clustering can also be used to segment documents on the web based on specific criteria.
  • In data mining, cluster analysis is used to gain in-depth understanding about the characteristics of data in each cluster.

Description

Cluster Analysis Course Description

Section 1: Introduction

Meaning of Cluster Analysis

The term cluster analysis covers a number of different algorithms and methods for grouping data and objects. It is an exploratory data analysis tool: it is used to discover structure in data without explaining why that structure exists. This section includes a brief introduction to cluster analysis, its history and its benefits.

Understanding of Cluster Analysis

In this section you will learn what makes a good clustering, that is, one that produces high-quality clusters, and how to measure the quality of a clustering. Other topics in this section are the major clustering approaches, the techniques of cluster analysis, and its basic concepts and algorithms.

Example of Cluster Analysis

Clustering is used in many aspects of daily life. In this chapter you will see illustrations and practical applications of cluster analysis in various fields. One example is based on a retail chain with stores across various locations. Another is based on market segmentation. Finally, a simple numerical example explains the objectives of cluster analysis. This section also gives an example from each field where cluster analysis is used, such as marketing, land use, biology, psychology, medicine and information retrieval.

Section 2: Types of Clustering

Hierarchical method of Clustering

Hierarchical clustering produces a set of nested clusters organized in the form of a tree. Several methods exist for deciding which clusters should be joined at each stage. There are two main types of hierarchical clustering: agglomerative and divisive. The agglomerative clustering algorithm is explained in detail, with an example, in this section.

The main methods of hierarchical clustering are also explained in brief in this section:

  • Nearest Neighbour Method (Single Linkage Method)
  • Furthest Neighbour Method (Complete Linkage Method)
  • Average Linkage Method (Between Groups)
  • Centroid Method
  • Ward’s Method

Single linkage clustering

The single linkage method is also known as the nearest neighbour method. It defines the distance between two clusters as the distance between their closest pair of members, which is needed as soon as a cluster contains more than one observation. The major topics included in this section are listed below:

  • Spanning tree
  • Contracting Space
  • Chaining
  • Dendrogram or tree diagram
  • Example of nearest neighbour method using diagrams
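
As a rough illustration of how nearest neighbour merging works, the sketch below repeatedly joins the two closest clusters and records each merge with its distance, which is the information a dendrogram displays. It is a minimal sketch, not the course's own material: the function names (`single_linkage`, `agglomerate`), the Euclidean distance and the four sample points are all assumptions made for the example.

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python 3.8+)

def single_linkage(a, b):
    """Distance between clusters a and b: distance of their closest pair."""
    return min(dist(p, q) for p in a for q in b)

def agglomerate(points):
    """Merge the two nearest clusters until one remains; return the merge history."""
    clusters = [[p] for p in points]   # start with every point as its own cluster
    history = []                       # one (cluster_a, cluster_b, distance) per merge
    while len(clusters) > 1:
        # find the pair of clusters with the smallest single linkage distance
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        d = single_linkage(clusters[i], clusters[j])
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

# Hypothetical sample data: two tight pairs that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
history = agglomerate(points)
for a, b, d in history:
    print(f"merge {a} + {b} at distance {d:.2f}")
```

The last merge happens at a much larger distance than the first two, which is the kind of jump that suggests stopping at two clusters.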

Linkage methods, Wards method

The single linkage method is explained in detail in the previous chapter. This section deals with the other two linkage methods, complete linkage and average linkage, and with the centroid method and Ward's method.

In the complete linkage method, the distance between two clusters is defined as the maximum distance between their members. The formula is explained in this section, and a detailed example is given to make it easy to understand.

In the average linkage method, the distance between two clusters is the average distance over all pairs made up of one member from each cluster. This method is explained in detail in this section with an example.

In the centroid method, the mean value of each variable is computed for each cluster, and the distance between these centroids is used to decide which clusters to merge. This method is also explained with an example.
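
The three definitions above can be compared side by side on a small example. This is only a sketch under stated assumptions: Euclidean distance is assumed, and the function names and the two sample clusters are made up for illustration.

```python
from math import dist
from statistics import mean

def complete_linkage(a, b):
    """Largest distance between a member of a and a member of b."""
    return max(dist(p, q) for p in a for q in b)

def average_linkage(a, b):
    """Mean distance over all pairs with one member from each cluster."""
    return mean(dist(p, q) for p in a for q in b)

def centroid(cluster):
    """Mean of each variable (coordinate) over the cluster's members."""
    return tuple(mean(coord) for coord in zip(*cluster))

def centroid_distance(a, b):
    """Distance between the two cluster centroids."""
    return dist(centroid(a), centroid(b))

# Hypothetical clusters: two vertical pairs, three units apart.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (3.0, 2.0)]

complete = complete_linkage(a, b)
average = average_linkage(a, b)
between_centroids = centroid_distance(a, b)
print(complete, average, between_centroids)
```

On this data the complete linkage distance is the largest of the three and the centroid distance the smallest, which shows that the choice of method changes how far apart the same two clusters appear.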

In Ward's method, each candidate pair of clusters is combined in turn and the sum of squared distances within each cluster is computed. The merge that gives the lowest total sum of squares is chosen. This method is widely used. This section contains examples of this method.
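
One way to picture Ward's criterion is to compute, for each candidate merge, how much the within-cluster sum of squares would grow, and then pick the smallest increase. The sketch below does this for three made-up clusters. The names `sum_of_squares` and `ward_cost` are hypothetical, and squared Euclidean distance to the cluster mean is assumed.

```python
from math import dist
from statistics import mean

def sum_of_squares(cluster):
    """Sum of squared distances from each member to the cluster mean."""
    center = tuple(mean(coord) for coord in zip(*cluster))
    return sum(dist(p, center) ** 2 for p in cluster)

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if a and b were merged."""
    return sum_of_squares(a + b) - sum_of_squares(a) - sum_of_squares(b)

# Hypothetical clusters: a and b are close together, c is far away.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (3.0, 2.0)]
c = [(10.0, 0.0)]

candidates = {"a+b": ward_cost(a, b), "a+c": ward_cost(a, c), "b+c": ward_cost(b, c)}
best = min(candidates, key=candidates.get)
print(candidates, "-> merge", best)
```

Merging a and b adds the least to the total sum of squares, so that is the merge this criterion would choose at this stage.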

k means clustering

K means clustering is a non-hierarchical clustering method. Under this method the desired number of clusters is specified beforehand, and the best solution for that number of clusters is chosen. The steps for carrying out K means clustering are described in this chapter.
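
The course text does not list the steps here, so the sketch below follows the standard textbook form of the algorithm: assign every point to its nearest centroid, recompute each centroid as the mean of its points, and repeat until the centroids stop changing. The function name `k_means`, the sample points and the starting centroids are assumptions for the example.

```python
from math import dist
from statistics import mean

def k_means(points, centroids, max_iter=100):
    """Standard K means loop from given starting centroids (K = len(centroids))."""
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(mean(coord) for coord in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:   # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data with two obvious groups; K = 2 is chosen beforehand.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, centroids=[(1.0, 1.0), (9.0, 8.0)])
print(centroids)
print(clusters)
```

The result depends on the starting centroids, which is why the choice of initial centroids is treated as a topic of its own.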

K means and Example of K means, difference between hierarchical and non hierarchical clustering

The important points of K means clustering are covered in this chapter, including the partitional clustering approach, centroids and the K means algorithm. The details of K means clustering are explained using the following points:

  • Initial Centroids
  • Closeness
  • Similarity measures
  • Convergence
  • Complexity of K means
  • Types of K means clustering – Sub optimal clustering and Optimal Clustering
  • Solutions to Initial Centroids problem
  • Evaluating K means cluster
  • Difference between Hierarchical Clustering and K means Clustering
  • Strengths of K means clustering
  • Limitations of K means clustering
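
A common way to evaluate a K means result is the sum of squared errors (SSE): the total squared distance from each point to the centroid of its cluster, where lower is better for a fixed number of clusters. The course may use a different measure, so the sketch below is an assumption, and the `sse` function and sample data are made up for the example.

```python
from math import dist
from statistics import mean

def sse(clusters):
    """Total squared distance from each point to its own cluster's centroid."""
    total = 0.0
    for cluster in clusters:
        if not cluster:            # an empty cluster contributes nothing
            continue
        center = tuple(mean(coord) for coord in zip(*cluster))
        total += sum(dist(p, center) ** 2 for p in cluster)
    return total

# Hypothetical data: the same six points grouped in two different ways.
low = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
high = [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]

sse_one_cluster = sse([low + high])    # everything in a single cluster
sse_two_clusters = sse([low, high])    # the two natural groups
print(sse_one_cluster, sse_two_clusters)
```

Splitting the data into its two natural groups reduces the SSE sharply. SSE keeps falling as more clusters are added, so it is best used to compare solutions with the same number of clusters or to look for the point where the improvement levels off.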

Example of K means no. of cluster, Statistical tests, Dendrogram, Scree plot

Because of the way it is computed, K means clustering is sometimes described as Analysis of Variance (ANOVA) in reverse. A physical fitness example is used to explain the K means clustering method. K means clustering is also explained with other examples using plots and graphs.

Dendrogram – When carrying out a hierarchical cluster analysis, the result can be represented as a tree diagram known as a dendrogram. The diagram shows which clusters were joined at each stage of the analysis and the distance between them at the time of joining. This helps in selecting the optimum number of clusters. An example of a dendrogram is given under this heading.

A scree plot displays the eigenvalue associated with each component, in descending order, against the component number. The pattern and properties of the scree plot in cluster analysis are discussed in this section.

Two step cluster analysis, Evaluation

Two step cluster analysis is used to reveal natural clusters within a data set. It runs a pre-clustering method first and then a hierarchical method. This section contains the following topics:

  • Algorithm of two step cluster analysis
  • The two steps of the two step cluster analysis
  • Case study – classifying motor vehicles using two step cluster analysis

Example for Listwise and Pairwise deletion of missing values, SPSS windows of output

Listwise and pairwise deletion are techniques for handling missing data. They are appropriate when data are missing completely at random. Listwise deletion drops an entire case if it has one or more missing values. Pairwise deletion tries to minimize the data loss caused by listwise deletion by dropping a case only from the calculations that involve its missing variables. Each approach has its own advantages and disadvantages. This section includes the following topics:

  • What is Listwise deletion
  • Example of Listwise deletion
  • What is Pairwise deletion
  • Example of Pairwise deletion

SPSS windows of output

In SPSS, cluster analysis can be found under Analyze → Classify. SPSS offers three methods of cluster analysis: Hierarchical, K means and Two step cluster. This section includes examples of performing cluster analysis in SPSS.

K means cluster theory, SPSS windows for k means

This section explains what the K means clustering method is, along with its history, algorithm, initialization methods, applications and description.

SPSS is a statistical software package that can be used to perform cluster analysis. The steps for conducting cluster analysis in SPSS are simple, and it lets you choose the variables on which the cluster analysis is performed. You can perform K means in SPSS by going to Analyze → Classify → K means cluster. The steps for performing K means cluster analysis in SPSS are given in this chapter. Screenshots are also provided for easy reference.
