Distance (distance)
The following example demonstrates how to compute distances between all examples:
>>> from Orange.data import Table
>>> from Orange.distance import Euclidean
>>> iris = Table('iris')
>>> dist_matrix = Euclidean(iris)
>>> # Distance between first two examples
>>> dist_matrix.X[0, 1]
0.53851648
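As a sanity check, the value above can be reproduced by hand. The sketch below assumes that the first two rows of the iris data set are (5.1, 3.5, 1.4, 0.2) and (4.9, 3.0, 1.4, 0.2) and that no normalization is applied. The helper name `euclidean` is illustrative and not part of Orange.

```python
import math

# Assumed feature values of the first two iris examples
# (sepal length, sepal width, petal length, petal width).
first = [5.1, 3.5, 1.4, 0.2]
second = [4.9, 3.0, 1.4, 0.2]


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


d = euclidean(first, second)
print(round(d, 8))  # 0.53851648
```

Only the first two features differ between these rows, so the distance reduces to the square root of 0.2² + 0.5² = 0.29.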
The distance module is based on the popular scikit-learn and SciPy packages. It wraps the following distance metrics:
Orange.distance.Euclidean
Orange.distance.Manhattan
Orange.distance.Cosine
Orange.distance.Jaccard
Orange.distance.SpearmanR
Orange.distance.SpearmanRAbsolute
Orange.distance.PearsonR
Orange.distance.PearsonRAbsolute
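To give a feel for how some of these metrics differ, here is a minimal pure-Python sketch of the textbook Manhattan and cosine distances. The function names are hypothetical, and the Orange wrappers may differ in details such as normalization or the handling of missing values.

```python
import math


def manhattan(u, v):
    """Textbook Manhattan (city-block) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cosine_distance(u, v):
    """Textbook cosine distance: 1 minus the cosine of the angle between u and v.

    Assumes neither vector is all zeros.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Two illustrative feature vectors (assumed values, not read from a file).
a = [5.1, 3.5, 1.4, 0.2]
b = [4.9, 3.0, 1.4, 0.2]

print(manhattan(a, b))        # approximately 0.7
print(cosine_distance(a, b))  # small, since the vectors point in similar directions
```

Manhattan distance depends on the magnitude of the differences, while cosine distance depends only on the direction of the vectors, so scaling one vector changes the former but not the latter.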
All distances share a common interface through the __call__ method:

Distance.__call__(e1, e2=None, axis=1, impute=False)
Parameters:
e1 (Orange.data.Table or Orange.data.RowInstance or numpy.ndarray) – input data instances; distances are calculated between all pairs
e2 (Orange.data.Table or Orange.data.RowInstance or numpy.ndarray) – optional second argument for data instances; if provided, distances are calculated between each pair where the first item is from e1 and the second is from e2
axis (int) – if axis=1, distances are calculated between rows; if axis=0, distances are calculated between columns
impute (bool) – if impute=True, all NaN values in the matrix are replaced with 0
Returns: the matrix with distances between the given examples
Return type:  e1 (
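The parameter semantics described above can be illustrated with a small pure-Python stand-in. This is only a sketch: `pairwise_euclidean` is a hypothetical function working on nested lists, not Orange's implementation, and it returns a plain list of lists rather than Orange's distance matrix.

```python
import math


def pairwise_euclidean(e1, e2=None, axis=1, impute=False):
    """Hypothetical stand-in that mirrors the documented call semantics."""

    def prepare(data):
        rows = [[float(v) for v in row] for row in data]
        if impute:
            # impute=True: replace every NaN with 0
            rows = [[0.0 if math.isnan(v) else v for v in row] for row in rows]
        if axis == 0:
            # axis=0: compare columns, so transpose before measuring
            rows = [list(col) for col in zip(*rows)]
        return rows

    a = prepare(e1)
    # Without e2, distances are computed between all pairs within e1.
    b = a if e2 is None else prepare(e2)
    return [
        [math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v))) for v in b]
        for u in a
    ]


data = [[0, 0], [3, 4]]
print(pairwise_euclidean(data))                  # rows vs. rows
print(pairwise_euclidean(data, axis=0))          # columns vs. columns
print(pairwise_euclidean([[0, 0]], [[3, 4], [6, 8]]))  # e1 items vs. e2 items
```

With e2 omitted the result is square and symmetric; with e2 given, the result has one row per item of e1 and one column per item of e2.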