Cluster Analysis


Cluster analysis is a method in mathematical statistics for studying how ‘like groups with like’: before further analysis, the individuals under study are divided into several types according to their indicators (variables). For example, in a study of coronary heart disease, k indicators are observed for each of n patients. Cluster analysis can then determine which type each of the n patients belongs to, so that similar patients can receive similar treatment. It can also classify the indicators themselves, identifying groups of indicators that describe different aspects of a patient’s condition and helping doctors understand that condition better. The basic goal of cluster analysis is to find natural categories of samples or variables. Clustering of samples is called Q-type clustering, and its purpose is to find commonalities between samples. Clustering of variables is called R-type clustering, and its purpose is to reduce the dimension of the indicators so that representative indicators can be selected.

Our Services

Our statistical experts will select the appropriate cluster analysis method based on the cost, operational feasibility and data type provided, making your tasks easier and more efficient.

Systematic clustering

There are two kinds of systematic (hierarchical) clustering methods: the aggregation method and the decomposition method. The aggregation method first treats each individual as its own class, merges the two most similar classes, recalculates the distances between the classes, and then merges the two most similar classes again, reducing the number of classes by one at each step until all individuals are grouped into a single class. The decomposition method works in the opposite direction: it first treats all individuals as one class, splits that class into two by separating the most dissimilar individuals, and adds one class at each step until every individual forms its own class. Note that every step of systematic clustering must recalculate the distance (or similarity coefficient) between the classes.
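The aggregation method can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the function names (`single_linkage`, `agglomerate`), the sample points, and the choice of single linkage (shortest pairwise Euclidean distance) as the between-class distance are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def single_linkage(a, b, points):
    """Distance between two classes: the smallest pairwise Euclidean distance."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points, n_classes=1):
    """Merge the two closest classes repeatedly until n_classes remain.

    Returns the final classes (lists of point indices) and the merge history.
    """
    classes = [[i] for i in range(len(points))]  # each individual starts as its own class
    history = []
    while len(classes) > n_classes:
        # Recompute the between-class distances and pick the closest pair.
        a, b = min(
            combinations(range(len(classes)), 2),
            key=lambda pair: single_linkage(classes[pair[0]], classes[pair[1]], points),
        )
        d = single_linkage(classes[a], classes[b], points)
        history.append((classes[a], classes[b], d))
        merged = sorted(classes[a] + classes[b])
        classes = [c for k, c in enumerate(classes) if k not in (a, b)] + [merged]
    return classes, history


# Hypothetical two-dimensional observations for five individuals.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (10.0, 0.0)]
classes, history = agglomerate(points, n_classes=3)
print(sorted(classes))  # → [[0, 1], [2, 3], [4]]
```

Recomputing every between-class distance at each step keeps the sketch short but is expensive for large samples, which is the computational burden that motivates dynamic clustering.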

Dynamic clustering

When the number of samples is large, systematic clustering requires a very large amount of computation. In addition, when systematic clustering is carried out by aggregation, a sample never changes class once it has been assigned to one, which is not flexible enough. Dynamic clustering was developed to make up for these shortcomings. It starts from a rough initial classification of the samples and then corrects it according to some principle until the classification is reasonable. The process of dynamic clustering is roughly as shown in the figure below.


Figure 1. Dynamic clustering process diagram

K-means clustering

The most common dynamic clustering method is k-means clustering. The steps are as follows.
(1) Suppose the samples are to be divided into k classes. Randomly select k samples as the initial condensation points (class centers). The initial centers of gravity are the observation vectors of these k samples, denoted X₁, X₂, …, X_k.
(2) Take each of the n samples in turn, denote its observation vector by Y, calculate the Euclidean distance between Y and each of X₁, X₂, …, X_k, assign the sample to the class whose center is nearest, and recalculate the center of gravity (the mean vector) of that class. Repeat until all n samples are classified; the new centers of gravity of the k classes are again denoted X₁, X₂, …, X_k.
(3) Repeat step (2) until the classification of all samples is the same as in the previous pass.
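The steps above can be sketched in Python as follows. Note one deliberate simplification: this is the common batch variant, which recomputes all centers of gravity after each full pass over the samples instead of after every single assignment. The function names (`mean_vector`, `k_means`), the `seed` and `max_iter` parameters, and the sample data are assumptions made for illustration.

```python
import random
from math import dist


def mean_vector(rows):
    """Center of gravity (component-wise mean) of a non-empty list of vectors."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def k_means(samples, k, seed=0, max_iter=100):
    """Batch k-means: returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(samples, k)  # step (1): k random samples as initial centers
    labels = None
    for _ in range(max_iter):
        # Step (2): assign every sample to the class with the nearest center.
        new_labels = [
            min(range(k), key=lambda j: dist(y, centers[j])) for y in samples
        ]
        if new_labels == labels:  # step (3): stop when no assignment changes
            break
        labels = new_labels
        for j in range(k):
            members = [y for y, lab in zip(samples, labels) if lab == j]
            if members:  # keep the old center if a class ends up empty
                centers[j] = mean_vector(members)
    return labels, centers


# Hypothetical observations forming two well-separated groups.
samples = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centers = k_means(samples, k=2)
print(labels)
print(centers)
```

Which class receives label 0 depends on the random initial centers, so only the grouping itself is meaningful, not the label values.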

Ordered sample clustering method

The systematic and dynamic clustering methods described above treat all samples equally, that is, any two samples may be placed in the same class. In scientific research there is another type of data in which the samples have a natural order, such as growth and development data ordered by age, or disease incidence data in chronological order. Such samples are called ordered samples. When ordered samples are classified, their order must be taken into account and cannot be disturbed, so each class consists of consecutive samples. The resulting method is called ordered sample clustering.
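The text does not name a specific algorithm for ordered samples. One possible approach, sketched below as an assumption, is to choose the cut points by dynamic programming so that the total within-class sum of squared deviations is as small as possible (the idea behind what is often called the optimal partition method). The function names and the data are hypothetical.

```python
def segment_cost(values, i, j):
    """Sum of squared deviations from the mean for values[i:j] (j exclusive)."""
    seg = values[i:j]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def ordered_partition(values, k):
    """Split an ordered sequence into k contiguous classes with minimal total cost.

    Returns the start index of each class and the total cost.
    """
    n = len(values)
    inf = float("inf")
    # best[m][j]: minimal cost of splitting values[:j] into m classes
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):  # i is where the last class starts
                cost = best[m - 1][i] + segment_cost(values, i, j)
                if cost < best[m][j]:
                    best[m][j] = cost
                    cut[m][j] = i
    # Trace the cut points back from the end of the sequence.
    starts, j = [], n
    for m in range(k, 0, -1):
        j = cut[m][j]
        starts.append(j)
    return starts[::-1], best[k][n]


# Hypothetical measurements listed in their natural (for example, chronological) order.
values = [1.0, 1.2, 0.9, 5.0, 5.3, 9.8, 10.1, 10.0]
starts, total = ordered_partition(values, k=3)
print(starts)  # → [0, 3, 5]
```

Because only contiguous runs are considered, the sample order is never disturbed: the classes here are the samples at positions 0 to 2, 3 to 4, and 5 to 7.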

Other clustering methods

The classification boundaries in real data are often unclear, so it is not always realistic to assign each sample or indicator to exactly one class. To address this, researchers have proposed fuzzy clustering, in which a sample or indicator can belong to several classes at the same time, and the classes are distinguished by its degree of membership in each. For example, a sample may belong to class A with a membership value of 0.3 and to class B with a membership value of 0.7, which is more realistic. Neural network clustering is another commonly used approach. It represents each cluster by an exemplar, a “typical” member of the class that does not necessarily correspond to a specific record or example. A new object is then assigned to the class whose exemplar is closest under some distance measure. Neural networks based on competitive learning and the self-organizing feature map (SOFM) are commonly used.
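To make the idea of membership degrees concrete, the sketch below computes memberships with the formula used in fuzzy c-means, one widely used fuzzy clustering algorithm. The text does not say which fuzzy method is meant, so this choice, the function name `memberships`, the fuzziness parameter `m`, and the class centers are assumptions for illustration only.

```python
from math import dist


def memberships(sample, centers, m=2.0):
    """Fuzzy c-means membership of one sample in each class (values sum to 1)."""
    d = [dist(sample, c) for c in centers]
    if 0.0 in d:  # the sample coincides with a center: full membership there
        hits = [1.0 if x == 0.0 else 0.0 for x in d]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # Closer centers get larger membership; m > 1 controls how soft the split is.
    return [1.0 / sum((di / dk) ** power for dk in d) for di in d]


# Hypothetical centers for class A and class B, and one sample between them.
center_a, center_b = (0.0, 0.0), (10.0, 0.0)
u = memberships((6.0, 0.0), [center_a, center_b])
print([round(x, 3) for x in u])  # → [0.308, 0.692]
```

The sample lies closer to the center of class B, so it receives a membership of about 0.69 in B and about 0.31 in A, similar to the 0.3 and 0.7 example in the text.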

We guarantee the confidentiality and security of our customers' data. We are committed to providing timely, high-quality deliverables, along with cost-effective, complete and concise reports.

If you are unable to find the specific service you are looking for, please feel free to contact us.

