Auto Script Feature
We started in 1968 with a batch program running on an IBM 1620, and progressed in 1997 to a Windows version with a full graphical user interface. Now we have turned full circle, and will shortly provide an Auto Script feature for ClustanGraphics, to run a sequence of clustering steps as an automated batch program. The difference is that you can set up your script simply by running ClustanGraphics normally, then edit it for automatic running whenever new data becomes available - thus it meets an essential need among our data mining clients. Details here.
New Release - ClustanGraphics 5
ClustanGraphics5, our new release for Windows 95, 98, 2000, ME, XP and NT, was published in 2001. It was previewed at the 25th conference of the German Classification Society in Munich, March 2001, and
has been reviewed by several testers to whom we wish to express our grateful thanks. To preview some of its exciting new features, go to
If you have an interest in data mining or large survey analysis, spare the time to visit
To order ClustanGraphics5 on-line click
ClustanGraphics Primer: A Guide to Cluster Analysis
A 60-page ClustanGraphics Primer has been published to accompany ClustanGraphics5. Its purpose is to introduce cluster analysis to beginners and to serve as a user manual for ClustanGraphics5. A
copy is supplied free with ClustanGraphics5, and further copies can be ordered for use with network and site licenses.
Recent Publication - Interface '98
A paper presented at Interface '98, hosted by the University of Minnesota, has been published in Computing Science and Statistics, 30, 257-264. It describes our fast procedure for clustering large datasets, suitable for data mining and large survey applications. The title and abstract are as follows:
Efficient hierarchical cluster analysis for data mining and knowledge discovery
David Wishart
Abstract:
The paper compares hierarchical cluster analysis with decision trees for data mining and knowledge discovery applications. It is argued that "top-down" binary decision trees can force orthogonal
partitions on to data whose shape due to correlated variables might indicate that a non-orthogonal partition is more appropriate, whereas "bottom-up" hierarchical cluster analysis is better at recovering the
true shape. A fast algorithm is described for Ward's method, capable of constructing clustering trees for thousands of observations and therefore suitable for KDD applications. A hybrid clustering method is
proposed which combines the best features of Ward's method and single linkage (nearest neighbor) to resolve the shape of clusters having non-zero covariance. The use of an agglomerative tree for
identification is discussed, and the methods are illustrated by reference to the H-R diagram of visual stars. Finally, implementation for Windows is described. For a summary of the paper go to
Recent Publication - Springer '99
Our paper presented at the German Classification Society, TU-Dresden in March 1998 (GfKl '98) has been published in: Studies in Classification, Data Analysis and Knowledge Organization, Gaul, W., Locarek-Junge, H. (Eds), Classification in the Information Age, Springer, 1999, pp 268-275. The title and abstract are as follows:
ClustanGraphics3: Interactive Graphics for Cluster Analysis
David Wishart
Abstract
The methodology developed for optimally re-ordering a tree and proximity matrix and for cluster description is illustrated in the pages on Reorder Tree and Cluster Exemplars. Some examples of the Proteins case study used in this paper also appear in the Cluster Proximity Matrix and
Display Proximity Matrix pages, and an optimally ordered tree appears in ClustanGraphics Preview. Clustering Large Datasets illustrates the truncation of a hierarchical cluster analysis of 40,000 cases to 50 clusters.
Invited Paper - ISI '99
A paper on ClustanGraphics was recently presented at the 52nd Session of the International Statistical Institute, held in Helsinki, Finland, August 10-18, 1999. It was included in the theme: Statistical Aspects of Data Mining and Knowledge Discovery in Databases. The title and abstract are as follows:
Clustering Methods for Large Data Problems
David Wishart
Abstract
Some brief details of the general approach are at Clustering versus Decision Trees. A 4-page abstract can be downloaded from isi_99 [100k zip]. The abstract of this paper was published in the ISI Bulletin for 1999.
Classifying Single Malt Whiskies Using Cluster Analysis
David Wishart
Abstract:
Tasting notes in 10 recently published books on malt whisky and distillers' notes were coded and analysed for 84 single malt whiskies. Nearly 500 aromatic and taste descriptors were compiled and
grouped into a standard flavour profile of 12 categories: Body, Sweetness, Smoky, Medicinal, Tobacco, Honey, Spicy, Winey, Nutty, Malty, Fruity and Floral. A system of consensus coding was devised for
the panel of 10 authors, and the 84 malts were clustered into 10 groups according to their flavour profiles. The paper discusses possibilities for further refinement of the classification and applications in the areas
of product design, brand management and marketing.
Malt Whisky Tasting: The meeting concludes with a tasting of selected single malt whiskies representative of the clusters discussed in the presentation. These have been generously contributed by the distillers and producers of
Venues: This talk has been given at PADD '98 (Data Mining), Biometrics and British Classification Society, Scotch Malt Whisky Society, Royal Statistical Society, Royal Society of Arts, British Computer Society, Classification Society of North America, International Federation of Classification Societies, Spirit of Speyside Whisky Festival, WhiskyShip Switzerland, and at seminars organised by the Universities of St. Andrews, Napier and Edinburgh.
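
For readers who want to experiment with the kind of analysis described in the whisky abstract above, the following is a minimal Python sketch, not the ClustanGraphics implementation. The abstract does not say which clustering method or scoring scale was used, so two things here are assumptions: the sketch borrows Ward's method from the Interface '98 abstract, and it scores each of the 12 flavour categories from 0 to 4. The profiles themselves are random placeholder numbers, not the published tasting data, and the function name `ward_cluster` is hypothetical.

```python
import random

# The 12 flavour categories named in the abstract.
CATEGORIES = ["Body", "Sweetness", "Smoky", "Medicinal", "Tobacco", "Honey",
              "Spicy", "Winey", "Nutty", "Malty", "Fruity", "Floral"]


def ward_cluster(profiles, k):
    """Merge rows of `profiles` by Ward's criterion until k clusters remain.

    Naive O(n^3) illustration; returns a list of sorted row-index lists.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(row)) for i, row in enumerate(profiles)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                na, nb = len(ma), len(mb)
                sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in the error sum of squares if a and b were merged.
                cost = na * nb / (na + nb) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[a] = (ma + mb, centroid)
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(members) for members, _ in clusters]


# Placeholder data: 84 rows of made-up 0-4 scores (assumed scale, not real malts).
random.seed(0)
profiles = [[random.randint(0, 4) for _ in CATEGORIES] for _ in range(84)]

# Stop the agglomeration at 10 groups, as in the abstract.
groups = ward_cluster(profiles, 10)
print(len(groups), "groups; sizes:", sorted(len(g) for g in groups))
```

Stopping the merge loop at `k` clusters is the same idea as truncating a full hierarchical tree to a fixed number of clusters. For thousands of cases a faster algorithm than this pairwise search would be needed.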