FluidMDS:
Classes (extension) | Libraries > FluidCorpusManipulation

FluidMDS : FluidModelObject : FluidDataObject : FluidServerObject : Object

Multidimensional Scaling
Source: FluidMDS.sc

Description

Dimensionality Reduction of a FluidDataSet Using Multidimensional Scaling

Multidimensional Scaling transforms a dataset to a lower number of dimensions while trying to preserve the distance relationships between the data points, so that even with fewer dimensions, the differences and similarities between points can still be observed and used effectively.

First, MDS computes a distance matrix by calculating the distance between every pair of points in the dataset. It then positions all the points in the lower number of dimensions (specified by numDimensions) and iteratively shifts them around until the distances between the points in the lower-dimensional space are as close as possible to the distances in the original space.
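The idea of matching two distance matrices can be sketched in plain sclang, without a server. This is only an illustration under stated assumptions, not FluidMDS's internal code: the points, the candidate 2-D layout, and the names dist, distMatrix and mismatch are all made up for the example, and the sum of squared differences is just one simple way to measure how well a layout preserves distances.

```supercollider
(
// Hypothetical data: four 3-D points and one candidate 2-D layout of them.
var high = [[0, 0, 0], [1, 0, 0], [0, 2, 0], [1, 2, 2]];
var low = [[0, 0], [1, 0], [0, 2], [1.5, 2.5]];
// Euclidean distance between two equal-length arrays.
var dist = { |a, b| (a - b).squared.sum.sqrt };
// Distance matrix: entry [i][j] is the distance between points i and j.
var distMatrix = { |points| points.collect { |a| points.collect { |b| dist.(a, b) } } };
var dHigh = distMatrix.(high);
var dLow = distMatrix.(low);
// Sum of squared differences between the two matrices (illustrative measure only).
// A smaller value means the 2-D layout preserves the original distances better.
var mismatch = (dHigh - dLow).flat.squared.sum;
"Original distance matrix:".postln;
dHigh.do { |row| row.round(0.001).postln };
"Mismatch of candidate layout: %".format(mismatch.round(0.001)).postln;
)
```

An iterative MDS procedure would keep adjusting the low-dimensional positions to shrink a mismatch of this kind.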

What makes this MDS implementation more flexible than some of the other dimensionality reduction algorithms in FluCoMa is that MDS allows for different measures of distance to be used (see list below).

Note that unlike the other dimensionality reduction algorithms, MDS does not have a fit or transform method, nor does it have the ability to transform data points in buffers. This is because the algorithm has to fit and transform in a single step, using only the data provided in the source DataSet. Incorporating new data points would therefore require re-fitting the model.

Manhattan Distance: The sum of the absolute differences between two points in each dimension. This is also called the Taxicab Metric. https://en.wikipedia.org/wiki/Taxicab_geometry

Euclidean Distance: The square root of the sum of the squared differences between two points in each dimension (Pythagorean Theorem). This metric is the default, as it is the most commonly used. https://en.wikipedia.org/wiki/Euclidean_distance

Squared Euclidean Distance: The square of the Euclidean Distance between points. This distance measure more strongly penalises larger distances, making them seem more distant, which may reveal more clustered points. https://en.wikipedia.org/wiki/Euclidean_distance#Squared_Euclidean_distance

Minkowski Max Distance: The distance between two points is reported as the largest difference between those two points in any one dimension. Also called the Chebyshev Distance or the Chessboard Distance. https://en.wikipedia.org/wiki/Chebyshev_distance

Minkowski Min Distance: The distance between two points is reported as the smallest difference between those two points in any one dimension.

Symmetric Kullback Leibler Divergence: Because the first part of this computation uses the logarithm of the values, using the Symmetric Kullback Leibler Divergence only makes sense with non-negative data. https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Symmetrised_divergence
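The measures listed above can be tried by hand on two points in plain sclang. This is a minimal sketch, not the code FluidMDS runs internally: the points a and b are made-up values, and the Symmetric Kullback Leibler line uses the textbook symmetrised formula, which is an assumption about the general definition and may differ in detail (for example in normalisation) from the FluCoMa implementation.

```supercollider
(
// Two hypothetical 3-D points (all values positive so the logarithm is defined).
var a = [1, 4, 2], b = [3, 1, 2];
// Per-dimension absolute differences: [2, 3, 0]
var diff = (a - b).abs;
("Manhattan: " ++ diff.sum).postln;                  // 2 + 3 + 0 = 5
("Euclidean: " ++ diff.squared.sum.sqrt).postln;     // square root of 13
("Squared Euclidean: " ++ diff.squared.sum).postln;  // 4 + 9 + 0 = 13
("Minkowski Max: " ++ diff.maxItem).postln;          // largest difference: 3
("Minkowski Min: " ++ diff.minItem).postln;          // smallest difference: 0
// Textbook symmetrised KL divergence: sum of (a - b) * log(a / b).
("Symmetric KL (textbook form): " ++ ((a - b) * (a / b).log).sum).postln;
)
```

Note how the Minkowski Min distance is 0 here even though the points differ, because they share the same value in one dimension.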

Read more about FluidMDS on the learn platform.

Class Methods

FluidMDS.new(server, numDimensions: 2, distanceMetric: 1)

Arguments:

server

The Server on which to construct this object

numDimensions

The number of dimensions to reduce to

Constraints

  • Minimum: 1

distanceMetric

The distance metric to use (integer 0-5)

0: Manhattan Distance
1: Euclidean Distance (default)
2: Squared Euclidean Distance
3: Minkowski Max Distance
4: Minkowski Min Distance
5: Symmetric Kullback Leibler Divergence

Undocumented class methods

FluidMDS.cosine

FluidMDS.euclidean

FluidMDS.kl

FluidMDS.manhattan

FluidMDS.max

FluidMDS.min

FluidMDS.sqeuclidean

Instance Methods

.numDimensions

.numDimensions = value

Property for numDimensions. See FluidMDS.new.

.distanceMetric

.distanceMetric = value

Property for distanceMetric. See FluidMDS.new.

.fitTransform(sourceDataSet, destDataSet, action)

Fit the model to a FluidDataSet and write the new projected data to a destination DataSet.

Arguments:

sourceDataSet

Source DataSet

destDataSet

Destination DataSet

action

A function to execute when the server has completed running fitTransform
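A minimal usage sketch of fitTransform follows. It assumes a recent FluCoMa release is installed, that s is the default Server, and that FluidDataSet can be filled with load and posted with print. The point identifiers and values are made up for illustration.

```supercollider
(
s.waitForBoot {
    var src = FluidDataSet(s);   // source DataSet (4 dimensions)
    var dest = FluidDataSet(s);  // destination DataSet (will hold 2 dimensions)
    var mds = FluidMDS(s, numDimensions: 2, distanceMetric: 1); // 1 = Euclidean
    // Hypothetical data: five 4-dimensional points with made-up values.
    var points = Dictionary.newFrom([
        \p0, [0.0, 0.1, 0.2, 0.3],
        \p1, [0.1, 0.1, 0.2, 0.4],
        \p2, [0.9, 0.8, 0.7, 0.6],
        \p3, [1.0, 0.9, 0.7, 0.5],
        \p4, [0.5, 0.5, 0.5, 0.5]
    ]);
    src.load(Dictionary.newFrom([\cols, 4, \data, points]), action: {
        // Fit and transform in one step; the action runs once the server has finished.
        mds.fitTransform(src, dest, action: {
            dest.print; // post the 2-D projection
        });
    });
};
)
```

Because there is no separate transform method, projecting additional points means adding them to the source DataSet and calling fitTransform again.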

Undocumented instance methods

.cols(action)

.fitTransformMsg(sourceDataSet, destDataSet)

.prGetParams

.read(filename, action)

.size(action)

.write(filename, action)

Examples

Comparing Distance Measures

Just looking at these plots won't really reveal the differences between these distance measures. The best way to see which might be best is to test them on your own data and listen to the musical differences they create!
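One way to start such a comparison is to run the same DataSet through each distance metric and inspect the results. The sketch below is an assumption-laden starting point rather than the original example code: it uses random, strictly positive made-up data (so that the Symmetric Kullback Leibler Divergence is applicable), assumes s is the default Server, and simply posts each projection. Since the server replies asynchronously, the posted results may arrive in a different order or interleaved.

```supercollider
(
s.waitForBoot {
    var src = FluidDataSet(s);
    // Metric names in the order of the distanceMetric integers 0-5.
    var names = ["Manhattan", "Euclidean", "Squared Euclidean",
        "Minkowski Max", "Minkowski Min", "Symmetric Kullback Leibler"];
    // Hypothetical data: 20 random 5-dimensional points, all values above zero.
    var data = Dictionary.new;
    20.do { |i| data.put("point-%".format(i), Array.fill(5, { 1.0.rand + 0.01 })) };
    src.load(Dictionary.newFrom([\cols, 5, \data, data]), action: {
        names.do { |name, metric|
            var dest = FluidDataSet(s); // one destination DataSet per metric
            FluidMDS(s, 2, metric).fitTransform(src, dest, action: {
                "--- % (distanceMetric: %) ---".format(name, metric).postln;
                dest.print;
            });
        };
    });
};
)
```

Replacing the random data with descriptors from your own sounds is what makes the comparison musically meaningful.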