Gower's General Similarity Coefficient $s_{ij}$ compares two cases i and j, and is defined as follows:
$$ s_{ij} = \frac{\sum_k w_{ijk} \, s_{ijk}}{\sum_k w_{ijk}} $$
where $s_{ijk}$ denotes the contribution provided by the kth variable, and $w_{ijk}$ is the weight attached to that comparison.
It should be noted that the effect of the denominator $\sum_k w_{ijk}$ is to divide the sum of the similarity scores by the number of variables; or, if variable weights have been specified, by the sum of their weights.
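The weighted average described above can be sketched in a few lines of Python. This is only an illustration, not Clustan's implementation: the function name `gower_similarity` and its arguments are hypothetical, and the per-variable scores and weights are assumed to have been computed already.

```python
from typing import Sequence


def gower_similarity(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-variable scores s_ijk into s_ij using weights w_ijk.

    Hypothetical helper: returns sum(w * s) / sum(w) over all variables k.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        # No valid comparison between the two cases (assumed handling).
        raise ValueError("all weights are zero; similarity is undefined")
    return sum(w * s for s, w in zip(scores, weights)) / total_weight


# With unit weights the denominator is simply the number of variables.
print(gower_similarity([1.0, 0.5, 0.0], [1, 1, 1]))  # 0.5
# With differential weights the denominator is the sum of the weights.
print(gower_similarity([1.0, 0.5, 0.0], [2, 1, 1]))  # 0.625
```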
Ordinal and Continuous Variables
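For ordinal and continuous variables, the component of similarity is:
$$ s_{ijk} = 1 - \frac{|x_{ik} - x_{jk}|}{r_k} $$
where $r_k$ is the range of values for the kth variable.
For continuous variables $s_{ijk}$ ranges between 1, for identical values $x_{ik} = x_{jk}$, and 0, for the two extreme values $x_{max}$ and $x_{min}$, whose difference equals the range $r_k$.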
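As a rough illustration of the range-scaled score for a continuous variable, the sketch below computes 1 - |x_i - x_j| / r. The function name `continuous_score` and the sample ages are invented for the example, and raising an error for a zero range is an assumption rather than documented behaviour.

```python
def continuous_score(x_i: float, x_j: float, value_range: float) -> float:
    """Similarity component for one ordinal or continuous variable."""
    if value_range <= 0:
        # A constant variable has no spread to scale by (assumed handling).
        raise ValueError("value_range must be positive")
    return 1.0 - abs(x_i - x_j) / value_range


ages = [20.0, 35.0, 60.0]          # hypothetical values of one variable
r = max(ages) - min(ages)          # range of the variable: 40.0

print(continuous_score(20.0, 20.0, r))  # identical values -> 1.0
print(continuous_score(20.0, 60.0, r))  # the two extremes -> 0.0
print(continuous_score(20.0, 35.0, r))  # 1 - 15/40 = 0.625
```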
Binary Variables
For a binary variable (or dichotomous character), Gower defines the component of similarity and the weight according to the table below, where + denotes that attribute k is "present" and - denotes that attribute k is "absent".
Value of attribute k
Case i:   +  +  -  -
Case j:   +  -  +  -
s_ijk:    1  0  0  0
w_ijk:    1  1  1  0
If all your variables are binary, then Gower's General Similarity Coefficient is equivalent to Jaccard's Similarity Coefficient A/(A+B+C) since the negative matches scored in cell D are ignored.
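The equivalence with Jaccard can be checked numerically. The sketch below assumes the usual reading of the cells: A counts attributes present in both cases, B and C count the two kinds of mismatch, and D counts attributes absent from both. It also assumes that a joint absence receives zero weight, which is what "ignored" implies. All function names are hypothetical.

```python
from typing import Sequence, Tuple


def binary_score_and_weight(present_i: bool, present_j: bool) -> Tuple[float, float]:
    """Return (s_ijk, w_ijk) for one binary attribute."""
    if present_i and present_j:
        return 1.0, 1.0   # positive match
    if present_i or present_j:
        return 0.0, 1.0   # mismatch
    return 0.0, 0.0       # negative match: zero weight, so it is ignored


def gower_binary(case_i: Sequence[bool], case_j: Sequence[bool]) -> float:
    pairs = [binary_score_and_weight(a, b) for a, b in zip(case_i, case_j)]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no attribute is present in either case")
    return sum(s * w for s, w in pairs) / total_weight


def jaccard(case_i: Sequence[bool], case_j: Sequence[bool]) -> float:
    a = sum(1 for x, y in zip(case_i, case_j) if x and y)
    b_plus_c = sum(1 for x, y in zip(case_i, case_j) if x != y)
    if a + b_plus_c == 0:
        raise ValueError("no attribute is present in either case")
    return a / (a + b_plus_c)


case_i = [True, True, False, False, True]
case_j = [True, False, True, False, True]
print(gower_binary(case_i, case_j), jaccard(case_i, case_j))  # 0.5 0.5
```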
Differential Variable Weights
If the weight of any variable is zero, then the variable is effectively ignored for the calculation of proximities. Such variables are "masked" for clustering, but available for cluster profiling, to assist in the interpretation of a resulting cluster analysis.
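A small sketch of the masking effect, assuming the coefficient is the weighted mean of per-variable scores: a variable with zero weight contributes nothing to either the numerator or the denominator, so its score cannot change the result. The helper name `weighted_similarity` is hypothetical.

```python
from typing import Sequence


def weighted_similarity(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of per-variable similarity scores."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("all variables are masked")
    return sum(w * s for s, w in zip(scores, weights)) / total_weight


scores = [1.0, 0.5, 0.0]

# All three variables take part in the comparison.
print(weighted_similarity(scores, [1, 1, 1]))  # 0.5

# The third variable is masked: only the first two are averaged.
print(weighted_similarity(scores, [1, 1, 0]))  # 0.75

# Changing the masked variable's score makes no difference.
print(weighted_similarity([1.0, 0.5, 1.0], [1, 1, 0]))  # 0.75
```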
General Distance Coefficients
However, the clustering options available using Gower are restricted to those applicable to similarity measures, and not to dissimilarities. Thus, for example, you will not be able to optimize the Euclidean Sum of Squares without first transforming your proximities into distances. For details of the corresponding General Distance Coefficient, click here.
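The text does not say which transformation turns a similarity into a distance. One commonly used choice for similarities in the range 0 to 1 is d = sqrt(1 - s); the sketch below uses it purely as an assumption and it may differ from the General Distance Coefficient that Clustan implements. The function name is hypothetical.

```python
import math


def similarity_to_distance(s: float) -> float:
    """Convert a similarity in [0, 1] to a distance via sqrt(1 - s).

    Assumed transformation for illustration only; not taken from the source.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("similarity must lie between 0 and 1")
    return math.sqrt(1.0 - s)


print(similarity_to_distance(1.0))   # identical cases -> 0.0
print(similarity_to_distance(0.75))  # sqrt(0.25) = 0.5
print(similarity_to_distance(0.0))   # completely dissimilar -> 1.0
```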
Our implementation of Gower's General Similarity Coefficient is another example of the great flexibility provided in Clustan software. Mixed data types frequently occur in social surveys and databases, but you are unlikely to find that other software for cluster analysis or neural networks adequately caters for such practical diversity.
Gower's General Similarity Coefficient has been available in Clustan since 1984, and in ClustanGraphics since release 5 in 2001. A worked example of Gower's coefficient with psychiatric data is given here.
To order ClustanGraphics on-line click ORDER now