GOOD.utils.fast_pytorch_kmeans.kmeans

Classes

KMeans(n_clusters[, max_iter, tol, verbose, ...])

Kmeans clustering algorithm implemented with PyTorch

class GOOD.utils.fast_pytorch_kmeans.kmeans.KMeans(n_clusters, max_iter=300, tol=0.0001, verbose=0, mode='euclidean', init_method='kmeans++', minibatch=None, n_init=None, algorithm=None, device=None)[source]

Bases: object

Kmeans clustering algorithm implemented with PyTorch

Parameters
  • n_clusters – int, Number of clusters

  • max_iter – int, default: 300 Maximum number of iterations

  • tol – float, default: 0.0001 Tolerance

  • verbose – int, default: 0 Verbosity

  • mode – {‘euclidean’, ‘cosine’}, default: ‘euclidean’ Type of distance measure

  • init_method – {‘random’, ‘point’, ‘++’}, default: ‘kmeans++’ (as given in the class signature) Type of initialization

  • minibatch – {None, int}, default: None Batch size of the MinibatchKmeans algorithm; if None, perform the full KMeans algorithm

centroids

torch.Tensor, shape: [n_clusters, n_features] cluster centroids

static cos_sim(a, b)[source]

Compute cosine similarity of 2 sets of vectors

Parameters: a: torch.Tensor, shape: [m, n_features]

b: torch.Tensor, shape: [n, n_features]
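As an illustration of what a pairwise cosine similarity over two sets of vectors looks like, here is a minimal sketch in plain PyTorch. The function name `cos_sim_sketch` and the `eps` guard are assumptions made for this example; it is not the library's source and may differ from it in detail.

```python
import torch


def cos_sim_sketch(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Pairwise cosine similarity between rows of a [m, d] and rows of b [n, d].

    Hypothetical helper for illustration only.
    """
    # Scale every row to unit length; eps guards against division by zero.
    a_unit = torch.nn.functional.normalize(a, dim=-1, eps=1e-8)
    b_unit = torch.nn.functional.normalize(b, dim=-1, eps=1e-8)
    # Dot products of unit vectors are cosines; the result has shape [m, n].
    return a_unit @ b_unit.T


a = torch.tensor([[1.0, 0.0], [1.0, 1.0]])
b = torch.tensor([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
sim = cos_sim_sketch(a, b)  # shape [2, 3]
```

Entry `sim[i, j]` compares row `i` of `a` with row `j` of `b`, so vector length has no effect on the result.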

static euc_sim(a, b)[source]

Compute Euclidean similarity of 2 sets of vectors

Parameters: a: torch.Tensor, shape: [m, n_features]

b: torch.Tensor, shape: [n, n_features]
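One common way to express Euclidean distance as a similarity is the negated squared distance, so that a larger value means a closer pair. The sketch below assumes that convention; the page itself does not state the exact formula, and `euc_sim_sketch` is a hypothetical name.

```python
import torch


def euc_sim_sketch(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Negated squared Euclidean distance between rows of a [m, d] and b [n, d].

    Assumed convention for illustration: similarity = -||a_i - b_j||^2.
    """
    # Expand -||a_i - b_j||^2 = 2 * a_i . b_j - ||a_i||^2 - ||b_j||^2
    a_sq = (a ** 2).sum(dim=1)[:, None]  # shape [m, 1]
    b_sq = (b ** 2).sum(dim=1)[None, :]  # shape [1, n]
    return 2 * a @ b.T - a_sq - b_sq     # shape [m, n]


a = torch.tensor([[0.0, 0.0], [3.0, 4.0]])
b = torch.tensor([[0.0, 0.0], [3.0, 0.0]])
sim = euc_sim_sketch(a, b)
```

The expanded form needs only one matrix product, which avoids materializing all `m * n` difference vectors.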

fit(X, sample_weight=None, centroids=None)[source]

Perform kmeans clustering

Parameters: X: torch.Tensor, shape: [n_samples, n_features]

fit_predict(X, sample_weight=None, centroids=None)[source]

Combination of fit() and predict() methods. This is faster than calling fit() and predict() separately.

Parameters: X: torch.Tensor, shape: [n_samples, n_features]

centroids: {torch.Tensor, None}, default: None

If given, centroids are initialized with the given tensor; if None, centroids are randomly chosen from X.

Return: labels: torch.Tensor, shape: [n_samples]
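To make the fit-then-label flow concrete, the following is a minimal, self-contained sketch of full-batch k-means (Lloyd's algorithm) in plain PyTorch. It is not this class's implementation. The function name, the use of `tol` as a threshold on centroid movement, and the empty-cluster handling are assumptions made for illustration. Only the `centroids` behaviour (use the given tensor, else pick random rows of X) follows the description above.

```python
import torch


def kmeans_fit_predict_sketch(X, n_clusters, centroids=None, max_iter=300, tol=1e-4):
    """Hypothetical full-batch k-means; returns (labels, centroids)."""
    if centroids is None:
        # No initial centroids given: pick n_clusters distinct rows of X at random.
        idx = torch.randperm(X.shape[0])[:n_clusters]
        centroids = X[idx].clone()
    else:
        centroids = centroids.clone()

    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centroid.
        labels = torch.cdist(X, centroids).argmin(dim=1)
        # Update step: move each centroid to the mean of its members.
        new_centroids = centroids.clone()
        for k in range(n_clusters):
            members = X[labels == k]
            if members.shape[0] > 0:  # keep the old centroid if the cluster is empty
                new_centroids[k] = members.mean(dim=0)
        # Assumed stopping rule: stop once the centroids barely move.
        shift = (new_centroids - centroids).norm().item()
        centroids = new_centroids
        if shift <= tol:
            break

    # Final labels with respect to the converged centroids.
    labels = torch.cdist(X, centroids).argmin(dim=1)
    return labels, centroids


X = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
init = torch.tensor([[0.0, 0.0], [10.0, 10.0]])
labels, centers = kmeans_fit_predict_sketch(X, n_clusters=2, centroids=init)
```

Passing explicit initial centroids, as in the usage lines, makes the run deterministic; with `centroids=None` the result depends on the random draw.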

max_sim(a, b)[source]

Compute maximum similarity (or minimum distance) of each vector in a with all of the vectors in b

Parameters: a: torch.Tensor, shape: [m, n_features]

b: torch.Tensor, shape: [n, n_features]
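The idea of taking, for each vector in a, the best match among the vectors in b can be sketched as a row-wise maximum over a similarity matrix. The example below assumes the negated squared Euclidean distance as the similarity and returns both the best value and its index. The return format and the name `max_sim_sketch` are assumptions, and the real method may also split the work into batches.

```python
import torch


def max_sim_sketch(a: torch.Tensor, b: torch.Tensor):
    """For each row of a [m, d], find the most similar row of b [n, d].

    Hypothetical helper: similarity is taken to be -distance^2.
    """
    sim = -(torch.cdist(a, b) ** 2)   # shape [m, n]; larger means closer
    values, indices = sim.max(dim=1)  # best similarity and its column per row
    return values, indices


a = torch.tensor([[0.0, 0.0], [9.0, 9.0]])
b = torch.tensor([[1.0, 0.0], [10.0, 10.0]])
values, indices = max_sim_sketch(a, b)
```

With b holding the cluster centroids, `indices` is exactly the nearest-centroid label for each row of a.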

predict(X)[source]

Predict the closest cluster each sample in X belongs to

Parameters: X: torch.Tensor, shape: [n_samples, n_features]

Return: labels: torch.Tensor, shape: [n_samples]

remaining_memory()[source]

Get the remaining memory on the GPU
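For readers who want to query free GPU memory directly, one option is `torch.cuda.mem_get_info`, which returns the free and total bytes of a CUDA device. The sketch below is an assumption about how such a query could be written, not a description of how this method works internally; `remaining_gpu_memory_sketch` is a hypothetical name.

```python
import torch


def remaining_gpu_memory_sketch(device=None):
    """Return free GPU memory in bytes, or None when CUDA is unavailable."""
    if not torch.cuda.is_available():
        return None
    # mem_get_info returns a (free_bytes, total_bytes) tuple for the device.
    free_bytes, _total_bytes = torch.cuda.mem_get_info(device)
    return free_bytes


free = remaining_gpu_memory_sketch()
```

A value like this can be used to choose a batch size that keeps an `[m, n]` similarity matrix within the available memory.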