Cluster attributes

This tool performs k-means clustering on a selected group of attributes associated with a vector file. Each of the input attributes must be numerical. If the attributes are measured on different scales, it is advisable to either standardize the data (i.e. convert to z-scores) or normalize it (i.e. rescale each attribute to range from 0 to 1). Either option can be specified as a pre-processing step to the cluster analysis. The user must also specify the number of clusters, the maximum number of iterations, and the minimum percent change, i.e. the percentage of cases in the data set that must change clusters between iterations for the iteration process to continue; once fewer cases change, iteration stops. Cluster centres are initialized with randomly selected records. The cluster assignment for each record is output to a new field named CLUSTER in the vector file's attribute table. Upon completion, the tool also outputs a chart illustrating the cluster assignment change between iterations and an HTML report of cluster characteristics.
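
The procedure described above can be illustrated with a minimal, standalone Python sketch. It is not the tool's actual implementation: the function names (rescale, nearest, k_means) are hypothetical, and several details are assumptions, namely the use of the population standard deviation for z-scores, squared Euclidean distance, a strict less-than test for the stopping rule, and leaving the centre of an empty cluster unchanged.

```python
import random

def rescale(rows, method):
    """Rescale each column: 'standardize' gives z-scores, 'normalize' gives 0..1."""
    out_cols = []
    for col in zip(*rows):
        if method == "standardize":
            mean = sum(col) / len(col)
            # Assumption: population standard deviation
            sd = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
            out_cols.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
        elif method == "normalize":
            lo, hi = min(col), max(col)
            out_cols.append([(v - lo) / (hi - lo) if hi > lo else 0.0 for v in col])
        else:
            out_cols.append(list(col))  # no rescaling
    return [list(r) for r in zip(*out_cols)]

def nearest(row, centres):
    """Index of the centre with the smallest squared Euclidean distance to row."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, c)) for c in centres]
    return dists.index(min(dists))

def k_means(rows, num_clusters, max_iterations, min_change_pct, seed=None):
    rng = random.Random(seed)
    # Cluster centres are initialized with randomly selected records
    centres = [list(r) for r in rng.sample(rows, num_clusters)]
    labels = [-1] * len(rows)
    for _ in range(max_iterations):
        new_labels = [nearest(r, centres) for r in rows]
        changed = sum(1 for old, new in zip(labels, new_labels) if old != new)
        labels = new_labels
        # Move each centre to the mean of the records assigned to it
        for k in range(num_clusters):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:  # assumption: an empty cluster keeps its old centre
                centres[k] = [sum(c) / len(members) for c in zip(*members)]
        # Stop once the percentage of records that changed cluster is too small
        if 100.0 * changed / len(rows) < min_change_pct:
            break
    return labels, centres

# Two attributes on very different scales, so rescale before clustering
data = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
        [8.0, 50.0], [8.3, 52.0], [7.9, 49.0]]
labels, centres = k_means(rescale(data, "normalize"), 2, 500, 0.5, seed=1)
print(labels)
```

Because the initial centres are chosen at random, the cluster numbering (and, for less clearly separated data, the final assignment) can differ between runs unless a seed is fixed, as in the example call.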

See Also:

Scripting:

The following is an example of a Python script using this tool:

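wd = pluginHost.getWorkingDirectory()
# Input string: the vector file followed by the attributes to cluster, separated by semicolons
inputData = wd + "neighbourhoods.shp" + ";" + "POPULATION" + ";" + "CRIME_RATE" + ";" + "INCOME"
# Convert the attributes to z-scores before clustering
rescalingMethod = "standardize"
numClusters = "10"
maxIterations = "500"
# Minimum percent change between iterations
minChange = "0.5"
args = [inputData, rescalingMethod, numClusters, maxIterations, minChange]
pluginHost.runPlugin("ClusterAttributes", args, False)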

This is a Groovy script also using this tool:

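def wd = pluginHost.getWorkingDirectory()
// Input string: the vector file followed by the attributes to cluster, separated by semicolons
def inputData = wd + "neighbourhoods.shp" + ";" + "POPULATION" + ";" + "CRIME_RATE" + ";" + "INCOME"
// Rescale the attributes to the range 0 to 1 before clustering
def rescalingMethod = "normalize"
def numClusters = "10"
def maxIterations = "500"
// Minimum percent change between iterations
def minChange = "0.5"
String[] args = [inputData, rescalingMethod, numClusters, maxIterations, minChange]
pluginHost.runPlugin("ClusterAttributes", args, false)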

Credits: