Our Services
Our statistical experts will select the appropriate cluster analysis method based on cost, operational feasibility and the type of data provided, making your tasks easier and more efficient.
There are two kinds of systematic (hierarchical) clustering methods: the aggregation method and the decomposition method. The aggregation method first treats each individual as its own class, merges the two most similar classes, recomputes the distances between the classes, and then merges the two most similar classes again, reducing the number of classes one at a time until all individuals are grouped together. The decomposition method works in the opposite direction: it first treats all individuals as one class, splits off the most dissimilar individuals to form two classes, and adds one class at each step until every individual forms a class of its own. Note that each step of systematic clustering must recalculate the similarity measure (for example a distance or a correlation coefficient) between the classes.
When the number of samples is large, the computational cost of systematic clustering is also very large. In addition, with the aggregation method, once a sample has been assigned to a class it never changes, which is not flexible enough. Dynamic clustering was developed to make up for these shortcomings. It starts with a rough initial classification of the samples and then corrects it step by step according to some criterion until the classification is reasonable. The process of dynamic clustering is roughly as shown in the figure below.
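The aggregation procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `euclidean` and `agglomerate` are hypothetical, similarity is measured by Euclidean distance, and the distance between two classes is taken as the smallest distance between their members (single linkage). Other distance or linkage choices are equally valid in practice.

```python
def euclidean(p, q):
    """Euclidean distance between two points given as coordinate sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def agglomerate(points, n_classes=1):
    """Merge the two closest classes repeatedly until n_classes remain.

    Assumption: single linkage (class distance = closest pair of members).
    Returns the final classes (lists of point indices) and the merge history.
    """
    classes = [[i] for i in range(len(points))]  # every individual starts as a class
    merges = []

    def class_distance(a, b):
        return min(euclidean(points[i], points[j]) for i in a for j in b)

    while len(classes) > n_classes:
        best = None
        # Recompute all between-class distances and keep the closest pair.
        for x in range(len(classes)):
            for y in range(x + 1, len(classes)):
                d = class_distance(classes[x], classes[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((classes[x], classes[y], d))
        merged = classes[x] + classes[y]
        classes = [c for idx, c in enumerate(classes) if idx not in (x, y)]
        classes.append(merged)
    return classes, merges


points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
classes, merges = agglomerate(points, n_classes=2)
print(classes)
```

Because every pass recomputes all pairwise class distances, the cost grows quickly with the number of samples, which is the limitation that motivates dynamic clustering.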
Figure 1. Dynamic clustering process diagram
The most common dynamic clustering method is k-means clustering. The steps are as follows.
(1) To form k classes, randomly select k samples as the initial condensation points (seeds). The observation vectors of these k samples serve as the initial centers of gravity, recorded as X1, X2, ..., Xk.
(2) Take each of the n samples in sequence, denote its observation vector by Y, calculate the Euclidean distance between Y and each of X1, X2, ..., Xk, assign the sample to the class with the smallest distance, and recalculate the center of gravity of that class, called the mean vector. Repeat this until all n samples are classified; the new centers of gravity of the k classes are again recorded as X1, X2, ..., Xk.
(3) Repeat step (2) until the classification of all samples is the same as in the previous pass.
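The three steps above can be sketched as follows. This is a minimal illustration, not a production implementation: the function name `kmeans_sequential` and its parameters (`seed_indices`, `max_passes`, `rng_seed`) are hypothetical, and the sketch assumes that an empty class simply keeps its previous center. The center of gravity is updated immediately after each assignment, as in step (2).

```python
import random


def euclidean(p, q):
    """Euclidean distance between two coordinate sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def kmeans_sequential(samples, k, seed_indices=None, max_passes=100, rng_seed=0):
    """Sequential k-means: returns (labels, centers)."""
    n = len(samples)
    if seed_indices is None:
        # Step (1): pick k samples at random as the initial centers.
        seed_indices = random.Random(rng_seed).sample(range(n), k)
    centers = [list(samples[i]) for i in seed_indices]
    labels = [None] * n

    def mean_of(cls):
        members = [samples[i] for i in range(n) if labels[i] == cls]
        if not members:
            return centers[cls]  # assumption: an empty class keeps its old center
        dim = len(members[0])
        return [sum(m[d] for m in members) / len(members) for d in range(dim)]

    for _ in range(max_passes):
        changed = False
        for i, y in enumerate(samples):
            # Step (2): assign the sample to the class with the nearest center.
            nearest = min(range(k), key=lambda c: euclidean(y, centers[c]))
            if nearest != labels[i]:
                old = labels[i]
                labels[i] = nearest
                changed = True
                centers[nearest] = mean_of(nearest)
                if old is not None:
                    centers[old] = mean_of(old)
        # Step (3): stop when a full pass changes no assignment.
        if not changed:
            break
    return labels, centers


samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans_sequential(samples, k=2, seed_indices=[0, 3])
print(labels)
```

Passing `seed_indices` explicitly makes the run reproducible. With random seeds the result can depend on the starting points, so in practice the procedure is often run several times.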
The systematic clustering and dynamic clustering methods described above treat all samples equally, that is, any two samples may be classified into the same class. In scientific research there is another type of data in which the samples have a natural order, such as age in growth and development data or the chronological order of disease incidence. We call such samples ordered samples. When classifying ordered samples, their order must be taken into account and cannot be disturbed, so each class consists of consecutive samples. The resulting method is called ordered sample clustering.
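One way to respect the sample order is to allow only cuts between neighboring samples and to search for the best set of cuts. The sketch below assumes a specific criterion that the text above does not name: it minimizes the total within-class sum of squared deviations by dynamic programming, in the spirit of Fisher's optimal partition method. The function names `segment_cost` and `ordered_clusters` are hypothetical, and the example uses a single variable for simplicity.

```python
def segment_cost(values, i, j):
    """Sum of squared deviations from the mean for values[i:j]."""
    seg = values[i:j]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def ordered_clusters(values, k):
    """Split an ordered sequence into k contiguous classes (requires k <= len(values))."""
    n = len(values)
    inf = float("inf")
    # best[c][j]: lowest cost of splitting the first j values into c classes
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cost = best[c - 1][i] + segment_cost(values, i, j)
                if cost < best[c][j]:
                    best[c][j] = cost
                    cut[c][j] = i  # the last class starts at index i
    # Walk the stored cut points backwards to recover the classes.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return [values[i:j] for i, j in bounds]


measurements = [1.0, 1.2, 0.9, 5.0, 5.1, 9.8, 10.0]
print(ordered_clusters(measurements, 3))
```

The classes are always runs of consecutive samples, so the original order is never disturbed.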
The classification boundaries in actual data are often unclear, so it is not always practical to assign each sample or indicator completely to a single class. To solve this problem, some researchers have proposed fuzzy clustering, in which a sample or indicator belongs to all classes at the same time and the classes are distinguished by the degree of membership. For example, a sample may belong to class A with a membership value of 0.3 and to class B with a membership value of 0.7, which is more realistic.
In addition, neural network clustering is also commonly used. It describes each cluster by an exemplar that serves as a "typical" representative of the class and does not necessarily correspond to a specific record or example. Based on a distance measure, each object is assigned to the class whose exemplar is closest to it. Neural networks based on competitive learning and self-organizing feature map (SOFM) networks are commonly used.
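To make the degree-of-membership idea in fuzzy clustering concrete, the sketch below computes membership values for one sample given fixed class centers. It assumes the membership formula used in fuzzy c-means, with a fuzzifier `m` of 2, which is one common choice and is not specified in the text above. The function name `fuzzy_memberships` is hypothetical.

```python
def euclidean(p, q):
    """Euclidean distance between two coordinate sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def fuzzy_memberships(sample, centers, m=2.0):
    """Membership of one sample in each class; the values sum to 1.

    Assumption: fuzzy c-means rule, where membership is proportional to
    distance ** (-2 / (m - 1)) and m > 1 is the fuzzifier.
    """
    dists = [euclidean(sample, c) for c in centers]
    if any(d == 0 for d in dists):
        # The sample coincides with a center: it belongs fully to that class.
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    weights = [d ** -power for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


# A sample at 3 lies between class A (center 0) and class B (center 4).
print(fuzzy_memberships((3.0,), [(0.0,), (4.0,)]))
```

In this example the sample is three times farther from class A than from class B, so it receives a membership of about 0.1 in A and 0.9 in B rather than being forced entirely into B.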
We guarantee the confidentiality of our customers' data and treat sensitive data with care. We are committed to providing you with timely, high-quality deliverables. At the same time, we guarantee cost-effective, complete and concise reports.
If you are unable to find the specific service you are looking for, please feel free to contact us.