dyconnmap.cluster package

Submodules

dyconnmap.cluster.cluster module

Base class for clustring algorithms

class dyconnmap.cluster.cluster.BaseCluster[source]

Bases: object

Base class for clustering alorithms.

Methods

encode(data[, metric, sort])

Employ a nearest-neighbor rule to encode the given data using the codebook.

encode(data, metric='euclidean', sort=True)[source]

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

dyconnmap.cluster.gng module

Growing NeuralGas

Growing Neural Gas (GNG) [Fritzke1995] is a dynamic neural network (as in adaptive) that learns topologies. Compared to Neural Gas, GNG provides the functionality of adding or purging the constructed graph of nodes and edges when certain criterion are met.

To do so, each node on the network stores a number of secondary information and statistics, such as, its learning vector, a local error, etc. Edge edge is assigned with a counter related to its age; so as older edges are pruned.

The convergence of the algorithm depends either on the maximum number of nodes of the graph, or an upper limit of elapsed iterations.

Briefly, the algorithm works as following:

  1. Create two nodes, with weights drawn randomly from the original distibution; connect these two nodes. Set the edge’s age to zero.

  2. Draw randomly a sample (\(\overrightarrow{x}\)) from the distibution.

  3. For each node (\(n\)) in the graph with associated weights \(\overrightarrow{w}\), we compute the euclidean distance from \(\overrightarrow{x}\): \(||\overrightarrow{n}_w - \overrightarrow{x}||^2\). Next, we find the two nodes closest \(\overrightarrow{x}\) with distances \(d_s\) and \(d_t\).

  4. The best matching unit (\(s\)) adjusts:

  1. its weights: \(\overrightarrow{s}_w \leftarrow \overrightarrow{s}_w + [e_w * (\overrightarrow{x} - \overrightarrow{s}_w)]\).

  2. its local error: \(s_{error} \leftarrow s_{error} + d_s\).

  1. Next, the nodes (\(N\)) adjacent to \(s\):

  1. update their weights: \(\overrightarrow{N}_w \leftarrow \overrightarrow{N}_w + [e_n * (\overrightarrow{x} - \overrightarrow{N}_w)]\).

  2. increase the age of the connecting edges by one.

  1. If the best and second mathing units (\(s\) and \(t\)) are connected, we reset the age of the connecting edge. Otherwise, we connect them.

  2. Regarding the pruning of the network. First we remove the edges with older than \(a_{max}\). In the seond pass, we remove any disconnected nodes.

  3. We check the iteration (\(iter\)), whether is a multiple of \(\lambda\) and if the maximum number of iteration has been reached; then we add a new node (\(q\)) in the graph:

  1. Let \(u\) denote the node with the highest error on the graph, and \(v\) its neighbor with the highest error.

  2. we disconnect \(u\) and \(v\)

  3. \(q\) is added between \(u\) and \(v\): \(\overrightarrow{q}_w \leftarrow \frac{ \overrightarrow{u}_w + \overrightarrow{v}_w }{2}\).

  4. connect \(q\) to \(u\), and \(q\) to \(v\)

  5. reduce the local errors of both \(u\) and \(v\): \(u_{error} \leftarrow \alpha * u_{error}\) and \(v_{error} \leftarrow \alpha * v_{error}\)

  6. define the local error \(q\): \(q_{error} \leftarrow u_{error}\)

  1. Adjust the error of each node (\(n\)) on the graph: \(n_{error} \leftarrow n_{error} - \beta * n_{error}\)

  2. Finally, increate the iteration (\(iter\)) and if any of the criterion is not satisfied, repeat from step #2.



Fritzke1995

Fritzke, B. (1995). A growing neural gas network learns topologies. In Advances in neural information processing systems (pp. 625-632).

class dyconnmap.cluster.gng.GrowingNeuralGas(n_max_protos=30, l=300, a_max=88, a=0.5, b=0.0005, iterations=10000, lrate=None, n_jobs=1, rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Growing Neural Gas

Parameters

n_max_protos : int

Maximum number of nodes.

l : int

Every iteration is checked if it is a multiple of this value.

a_max : int

Maximum age of edges.

a : float

Weights the local error of the nodes when adding a new node.

b : float

Weights the local error of all the nodes on the graph.

iterations : int

Total number of iterations.

lrate : list of length 2

The learning rates of the best matching unit and its neighbors.

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn)).

rng : object or None

An object of type numpy.random.RandomState.

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

Methods

encode(data[, metric, sort])

Employ a nearest-neighbor rule to encode the given data using the codebook.

fit(data)

Learn data, and construct a vector codebook.

encode(data, metric='euclidean', sort=True)

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

fit(data)[source]

Learn data, and construct a vector codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

Returns

self : object

The instance itself

dyconnmap.cluster.mng module

Merge Neural Gas

Merge Neural Gas (MNG) [Strickert2003] is similar to the original Neural Gas algorithm, but each node, has an additional context vector (\(c\)) associated; and the best matching unit is determined by a linear combination of both the weight and context vector (thus the merge), from the previous iteration.

Similar to Neural Gas, \(N\) nodes have their weights (\(w\)) and context vectors (\(c\)) randomly initialized. When the network is presented with a new input sequence \(v\), the distance and the corresponding rank of each node \(n\) is estimated.

  1. The distance is computed by: \(d_i(n) = (1 - \alpha) * ||v-w_i||^2 + ||c(n)-c_i||^2\) and the context by: \(c(n) = (1 - \beta) * w_{I_{n}-1} + \beta * c_{I_{n}-1}\).

  • \(\alpha\) is the balancing factor between \(w\) and \(c\)

  • \(\beta\) is the merging degree between two subsequent iterations

  • \(I_{n}-1\) denotes the previous iteration

  1. Each node is adapted with:

  1. \({\Delta}w_i = e_w(k) * h_{\lambda(k)} (r(d_i,d)) * (v-w_i)\) and its context by

  2. \({\Delta}c_i = e_w(k) * h_{\lambda(k)} (r(d_i,d)) * (c(n)-c_i)\)

  • \(r(d_i, d)\) denotes the rank of the \(i\)-th node

As with Neural Gas, \(h_{\lambda(k)}(r(d_i, d)) = exp(\frac{-r(d_i, d)}{\lambda(k)})\) represents the neighborhood ranking function. \(\lambda(k)\) denotes the influence of the neighbood: \({\lambda_0(\frac{\lambda_T}{\lambda_0})}^{(\frac{t}{T_{max}})}\).

Notes



Strickert2003

Strickert, M., & Hammer, B. (2003, September). Neural gas for sequences. In Proceedings of the Workshop on Self-Organizing Maps (WSOM’03) (pp. 53-57).

class dyconnmap.cluster.mng.MergeNeuralGas(n_protos=10, iterations=1024, merge_coeffs=None, epsilon=None, lrate=None, n_jobs=1, rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Merge Neural Gas

Parameters

n_protos : int

The number of prototypes

iterations : int

The maximum iterations

merge_coeffs : list of length 2

The merging coefficients

epsilon : list of length 2

The initial and final training rates.

lrate : list of length 2

The initial and final rearning rates.

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn)).

rng : object

An object of type numpy.random.RandomState.

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

distortion : float

The normalized distortion error

Methods

encode(data[, metric])

Employ a nearest-neighbor rule to encode the given data using the codebook.

fit(data)

param data

encode(data, metric='euclidean')[source]

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

fit(data)[source]
Parameters

data

Returns

dyconnmap.cluster.ng module

NeuralGas

Inspired by Self Organizing Maps (SOMs), Neural Gas (NG), an unsupervised adaptive algorithm coined by [Martinetz1991]. Neural Gas does not assume a preconstructed lattice thus the adaptation cannot be based on the distances between the neighbor neurons (like in SOMs) because be definition there are no neighbors.

The adaptation-convergence is driven by a stochastic gradient function with a soft-max adaptation rule that minimizes the average distortion error.

First, we construct a number of neurons ( \(N_\vec{w}\) ) with random weights ( \(\vec{w}\) ). Then we train the model by feeding it feature vectors sequentially drawn from the distribution \(P(t)\). When a new feature vector is presented to the model, we sort all neurons’ weights (\(N_\vec{w})\) based on their Euclidean distance from \(\vec{x}\). Then, the adaptation if done by:

\[\vec{w} \leftarrow \vec{w} + [ N_{\vec{w}} \cdot e(t)) \cdot h(k)) \cdot (\vec{x} - \vec{w}) ], \forall \vec{w} \in N_{\vec{w}}\]

where,

\[ \begin{align}\begin{aligned}h(t)=exp{ \frac{-k}{\sigma^2}(t) }\\\sigma^2 = {\lambda_i(\frac{\lambda_T}{\lambda_0})}^{(\frac{t}{T_{max}})}\\e(t) = {e_i(\frac{e_T}{e_0})}^{(\frac{t}{T_{max}})}\end{aligned}\end{align} \]

The parameter \(\lambda\), governs the initial and final learning rate, while the parameter \(e\) the training respectively.

After the presentation of a feature vector, increase the itaration counter \(t\) and repeat until all desired criteria are met, or \(t = T_{max}\).

With these prototypes, we can represent all the input feature vectors \(\vec{x}\) using a Nearest Neighbor rule. The quality of this encoding can measured by the normalized distortion error:

\[\frac{ \sum_{t=1}^T \left | \left | X(t) - X^\ast(t)) \right | \right |^2 }{ \sum_{t=1}^T \left | \left | X(t) - \overline{X}) \right | \right |^2 }\]

where

\[\overline{X}` = \frac{1}{T} \sum_{t=1}^T{X(t)}\]

\(T\) is the number of reference prototypes; in \(X\) the input patterns are stored; \(X^\ast\) contains the approximated patterns as produced by the Nearest Neighbor rule.

Notes

For faster convergence, we can also draw random weights from the given probability distribution \(P(t)\)



Martinetz1991

Martinetz, T., Schulten, K., et al. A “neural-gas” network learns topologies. University of Illinois at Urbana-Champaign, 1991.

Laskaris2004

Laskaris, N. A., Fotopoulos, S., & Ioannides, A. A. (2004). Mining information from event-related recordings. Signal Processing Magazine, IEEE, 21(3), 66-77.

class dyconnmap.cluster.ng.NeuralGas(n_protos=10, iterations=1024, epsilon=None, lrate=None, n_jobs=1, metric='euclidean', rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Neural Gas

Parameters

n_protos : int

The number of prototypes

iterations : int

The maximum iterations

epsilon : list of length 2

The initial and final training rates

lrate : list of length 2

The initial and final rearning rates

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn))

metric : string

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

rng : object or None

An object of type numpy.random.RandomState

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

distortion : float

The normalized distortion error

Notes

Slightly based on http://webloria.loria.fr/~rougier/downloads/ng.py

Methods

encode(data[, metric, sort])

Employ a nearest-neighbor rule to encode the given data using the codebook.

fit(data)

Learn data, and construct a vector codebook.

encode(data, metric='euclidean', sort=True)

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

fit(data)[source]

Learn data, and construct a vector codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

Returns

self : object

The instance itself

dyconnmap.cluster.rng module

Relational Neural Gas

Relational Neural Gas (RNG) [Hammer2007] is a variant of Neural Gas, that allows clustering and mining data from a pairwise similarity or dissimilarity matrix.



Hammer2007

Hammer, B., & Hasenfuss, A. (2007, September). Relational neural gas. In Annual Conference on Artificial Intelligence (pp. 190-204). Springer, Berlin, Heidelberg.

Hasenfuss2008

Hasenfuss, A., Hammer, B., & Rossi, F. (2008, July). Patch Relational Neural Gas–Clustering of Huge Dissimilarity Datasets. In IAPR Workshop on Artificial Neural Networks in Pattern Recognition (pp. 1-12). Springer, Berlin, Heidelberg.

class dyconnmap.cluster.rng.RelationalNeuralGas(n_protos=10, iterations=100, lrate=None, metric='euclidean', rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Relational Neural Gas

Parameters

n_protos : int

The number of prototypes

iterations : int

The maximum iterations

lrate : list of length 2

The initial and final rearning rates

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn))

metric : string

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

rng : object or None

An object of type numpy.random.RandomState

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

Methods

encode(data[, metric, sort])

Employ a nearest-neighbor rule to encode the given data using the codebook.

fit(data)

Fit

encode(data, metric='euclidean', sort=True)

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

fit(data)[source]

Fit

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used

dyconnmap.cluster.som module

Self Organizing Map

\(T\) is the number of reference prototypes; in \(X\) the input patterns are stored; \(X^\ast\) contains the approximated patterns as produced by the Nearest Neighbor rule.

Notes

For faster convergence, we can also draw random weights from the given probability distribution \(P(t)\)



Martinetz1991

Martinetz, T., Schulten, K., et al. A “neural-gas” network learns topologies. University of Illinois at Urbana-Champaign, 1991.

class dyconnmap.cluster.som.SOM(grid=(10, 10), iterations=1024, lrate=0.1, n_jobs=1, rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Self Organizing Map

Parameters

grid : list of length 2

The X and Y sizes of the grid

iterations : int

The maximum iterations

lrate : float

The initial rearning rate

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn))

rng : object or None

An object of type numpy.random.RandomState

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

Methods

encode(data[, metric, sort])

Employ a nearest-neighbor rule to encode the given data using the codebook.

findBMU(x, y)

fit(data)

encode(data, metric='euclidean', sort=True)

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

classmethod findBMU(x, y)[source]
fit(data)[source]

dyconnmap.cluster.umatrix module

UMatrix Visualization

dyconnmap.cluster.umatrix.umatrix(M)[source]

Module contents

class dyconnmap.cluster.NeuralGas(n_protos=10, iterations=1024, epsilon=None, lrate=None, n_jobs=1, metric='euclidean', rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Neural Gas

Parameters

n_protos : int

The number of prototypes

iterations : int

The maximum iterations

epsilon : list of length 2

The initial and final training rates

lrate : list of length 2

The initial and final rearning rates

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn))

metric : string

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

rng : object or None

An object of type numpy.random.RandomState

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

distortion : float

The normalized distortion error

Notes

Slightly based on http://webloria.loria.fr/~rougier/downloads/ng.py

Methods

encode(data[, metric, sort])

Employ a nearest-neighbor rule to encode the given data using the codebook.

fit(data)

Learn data, and construct a vector codebook.

encode(data, metric='euclidean', sort=True)

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

fit(data)[source]

Learn data, and construct a vector codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

Returns

self : object

The instance itself

class dyconnmap.cluster.MergeNeuralGas(n_protos=10, iterations=1024, merge_coeffs=None, epsilon=None, lrate=None, n_jobs=1, rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Merge Neural Gas

Parameters

n_protos : int

The number of prototypes

iterations : int

The maximum iterations

merge_coeffs : list of length 2

The merging coefficients

epsilon : list of length 2

The initial and final training rates.

lrate : list of length 2

The initial and final rearning rates.

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn)).

rng : object

An object of type numpy.random.RandomState.

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

distortion : float

The normalized distortion error

Methods

encode(data[, metric])

Employ a nearest-neighbor rule to encode the given data using the codebook.

fit(data)

param data

encode(data, metric='euclidean')[source]

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

fit(data)[source]
Parameters

data

Returns

class dyconnmap.cluster.RelationalNeuralGas(n_protos=10, iterations=100, lrate=None, metric='euclidean', rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Relational Neural Gas

Parameters

n_protos : int

The number of prototypes

iterations : int

The maximum iterations

lrate : list of length 2

The initial and final rearning rates

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn))

metric : string

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

rng : object or None

An object of type numpy.random.RandomState

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

Methods

encode(data[, metric, sort])

Employ a nearest-neighbor rule to encode the given data using the codebook.

fit(data)

Fit

encode(data, metric='euclidean', sort=True)

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

fit(data)[source]

Fit

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used

class dyconnmap.cluster.GrowingNeuralGas(n_max_protos=30, l=300, a_max=88, a=0.5, b=0.0005, iterations=10000, lrate=None, n_jobs=1, rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Growing Neural Gas

Parameters

n_max_protos : int

Maximum number of nodes.

l : int

Every iteration is checked if it is a multiple of this value.

a_max : int

Maximum age of edges.

a : float

Weights the local error of the nodes when adding a new node.

b : float

Weights the local error of all the nodes on the graph.

iterations : int

Total number of iterations.

lrate : list of length 2

The learning rates of the best matching unit and its neighbors.

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn)).

rng : object or None

An object of type numpy.random.RandomState.

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

Methods

encode(data[, metric, sort])

Employ a nearest-neighbor rule to encode the given data using the codebook.

fit(data)

Learn data, and construct a vector codebook.

encode(data, metric='euclidean', sort=True)

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

fit(data)[source]

Learn data, and construct a vector codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

Returns

self : object

The instance itself

class dyconnmap.cluster.SOM(grid=(10, 10), iterations=1024, lrate=0.1, n_jobs=1, rng=None)[source]

Bases: dyconnmap.cluster.cluster.BaseCluster

Self Organizing Map

Parameters

grid : list of length 2

The X and Y sizes of the grid

iterations : int

The maximum iterations

lrate : float

The initial rearning rate

n_jobs : int

Number of parallel jobs (will be passed to scikit-learn))

rng : object or None

An object of type numpy.random.RandomState

Attributes

protos : array-like, shape(n_protos, n_features)

The prototypical vectors

Methods

encode(data[, metric, sort])

Employ a nearest-neighbor rule to encode the given data using the codebook.

findBMU(x, y)

fit(data)

encode(data, metric='euclidean', sort=True)

Employ a nearest-neighbor rule to encode the given data using the codebook.

Parameters

data : real array-like, shape(n_samples, n_features)

Data matrix, each row represents a sample.

metric : string or None

One of the following valid options as defined for function http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html.

Valid options include:

  • euclidean

  • cityblock

  • l1

  • cosine

If None is passed, the matric used for learning the data will be used.

sort : boolean

Whether or not to sort the symbols using MDS first. Default True

Returns

encoded_data : real array-like, shape(n_samples, n_features)

data, as represented by the prototypes in codebook.

ts_symbols : list, shape(n_samples, 1)

A discrete symbolic time series

classmethod findBMU(x, y)[source]
fit(data)[source]
dyconnmap.cluster.umatrix(M)[source]