qlearnkit.algorithms.qkmeans package
Submodules
qlearnkit.algorithms.qkmeans.centroid_initialization module
- qlearnkit.algorithms.qkmeans.centroid_initialization.kmeans_plus_plus(X: numpy.ndarray, k: int, random_state: int = 42) → numpy.ndarray
Create cluster centroids using the k-means++ algorithm.
- Parameters
X – The dataset to be used for centroid initialization.
k – The desired number of clusters for which centroids are required.
random_state – Determines random number generation for centroid initialization.
- Returns
Collection of k centroids as a numpy ndarray.
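The k-means++ seeding idea can be illustrated with a short NumPy sketch. This is not qlearnkit's implementation; the function name `kmeans_plus_plus_sketch` and the demo data are assumptions made for illustration. The first centroid is drawn uniformly from the data, and each later one is drawn with probability proportional to its squared distance from the nearest centroid chosen so far.

```python
import numpy as np


def kmeans_plus_plus_sketch(X: np.ndarray, k: int, random_state: int = 42) -> np.ndarray:
    """Pick k rows of X as centroids using k-means++ seeding (illustrative only)."""
    rng = np.random.default_rng(random_state)
    n_samples = X.shape[0]
    # First centroid: chosen uniformly at random from the data.
    centroids = [X[rng.integers(n_samples)]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen centroid.
        diffs = X[:, None, :] - np.asarray(centroids)[None, :, :]
        d2 = np.min(np.sum(diffs ** 2, axis=2), axis=1)
        # Sample the next centroid with probability proportional to d2.
        probs = d2 / d2.sum()
        centroids.append(X[rng.choice(n_samples, p=probs)])
    return np.asarray(centroids)


# Hypothetical toy data: five distinct 2-D points.
X_demo = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]])
centroids_demo = kmeans_plus_plus_sketch(X_demo, k=3)
print(centroids_demo.shape)  # (3, 2)
```

Because an already chosen point has squared distance zero, it cannot be drawn again, so the sketch returns k distinct rows whenever the data contains at least k distinct points.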
- qlearnkit.algorithms.qkmeans.centroid_initialization.naive_sharding(X: numpy.ndarray, k: int) → numpy.ndarray
Create cluster centroids using the deterministic naive sharding algorithm.
- Parameters
X – The dataset to be used for centroid initialization.
k – The desired number of clusters for which centroids are required.
- Returns
Collection of k centroids as a numpy ndarray.
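Naive sharding, as the technique is commonly described, sorts the samples by the sum of their features, cuts the sorted data into k equal-sized shards, and uses each shard's mean as a centroid. The sketch below follows that description; whether qlearnkit implements exactly these steps is an assumption, and `naive_sharding_sketch` is a hypothetical name.

```python
import numpy as np


def naive_sharding_sketch(X: np.ndarray, k: int) -> np.ndarray:
    """Deterministic centroids: sort rows by feature sum, shard, average (illustrative only)."""
    # Composite value per sample: the sum of its features.
    order = np.argsort(X.sum(axis=1), kind="stable")
    # Split the sorted row indices into k nearly equal shards.
    shards = np.array_split(order, k)
    # Each centroid is the column-wise mean of one shard.
    return np.asarray([X[idx].mean(axis=0) for idx in shards])


# Hypothetical toy data with two obvious groups.
X_demo = np.array([[10.0, 10.0], [0.0, 0.0], [11.0, 11.0], [1.0, 1.0]])
shard_centroids = naive_sharding_sketch(X_demo, k=2)
print(shard_centroids)
```

No random numbers are involved, which is why this initializer takes no `random_state` argument.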
- qlearnkit.algorithms.qkmeans.centroid_initialization.random(X: numpy.ndarray, n_clusters: int, random_state: int = 42) → numpy.ndarray
Create random cluster centroids.
- Parameters
X – The dataset to be used for centroid initialization.
n_clusters – The desired number of clusters for which centroids are required.
random_state – Determines random number generation for centroid initialization.
- Returns
Collection of n_clusters centroids as a numpy ndarray.
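The source does not say whether random initialization samples existing rows of X or draws fresh points. The sketch below assumes the former and picks `n_clusters` distinct rows with a seeded generator; `random_centroids_sketch` is a hypothetical name, not part of the library.

```python
import numpy as np


def random_centroids_sketch(X: np.ndarray, n_clusters: int, random_state: int = 42) -> np.ndarray:
    """Pick n_clusters distinct rows of X at random (assumed behaviour, illustrative only)."""
    rng = np.random.default_rng(random_state)
    # Sample row indices without replacement so no centroid is repeated.
    idx = rng.choice(X.shape[0], size=n_clusters, replace=False)
    return X[idx]


# Hypothetical toy data: ten 2-D points.
X_demo = np.arange(20, dtype=float).reshape(10, 2)
random_centroids = random_centroids_sketch(X_demo, n_clusters=3)
print(random_centroids.shape)  # (3, 2)
```

Fixing `random_state` makes the selection reproducible across runs.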
qlearnkit.algorithms.qkmeans.qkmeans module
- class qlearnkit.algorithms.qkmeans.qkmeans.QKMeans(n_clusters: int = 6, quantum_instance: Optional[Union[qiskit.utils.quantum_instance.QuantumInstance, qiskit.providers.basebackend.BaseBackend, qiskit.providers.backend.Backend]] = None, *, init: Union[str, numpy.ndarray] = 'kmeans++', n_init: int = 1, max_iter: int = 30, tol: float = 0.0001, random_state: int = 42)
Bases: sklearn.base.ClusterMixin, qlearnkit.algorithms.quantum_estimator.QuantumEstimator
The Quantum K-Means algorithm for clustering.
Note
The naming conventions follow those of KMeans in sklearn.cluster.
Example
Cluster data from the Iris dataset.
    import numpy as np
    import matplotlib.pyplot as plt
    from qlearnkit.algorithms import QKMeans
    from qiskit import BasicAer
    from qiskit.utils import QuantumInstance, algorithm_globals
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    seed = 42
    algorithm_globals.random_seed = seed

    quantum_instance = QuantumInstance(BasicAer.get_backend('qasm_simulator'),
                                       shots=1024,
                                       optimization_level=1,
                                       seed_simulator=seed,
                                       seed_transpiler=seed)

    # Use iris data set for training and test data
    X, y = load_iris(return_X_y=True)

    # Keep the first two features and drop the samples of class 2.
    # The mask is computed once so that X and y are filtered consistently.
    num_features = 2
    mask = y != 2
    X = X[mask, :num_features]
    y = y[mask]

    qkmeans = QKMeans(n_clusters=3,
                      quantum_instance=quantum_instance)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=seed)
    qkmeans.fit(X_train)

    print(qkmeans.labels_)
    print(qkmeans.cluster_centers_)

    # Plot the results
    colors = ['blue', 'orange', 'green']
    for i in range(X_train.shape[0]):
        plt.scatter(X_train[i, 0], X_train[i, 1], color=colors[qkmeans.labels_[i]])
    plt.scatter(qkmeans.cluster_centers_[:, 0], qkmeans.cluster_centers_[:, 1], marker='*', c='g', s=150)
    plt.show()

    # Predict new points
    prediction = qkmeans.predict(X_test)
    print(prediction)

Output:

    [0 0 2 1 0 2 2 0 1 0 1 1 2 2 0 2 0 0 1 2 1 1 2 0 2 2 0 1 0 1 1 1 2 1 1 0 1 0 2 0 0 0 0 2 2 0 2 0 0 0 0 1 0 2 1 1 0 2 0 1 0 0 2 1 1 0 2 1 2 0 0 0 0 0 2 0 0 2 0 0]
    [[5.96216216 2.77027027]
     [4.74761905 3.02857143]
     [5.34090909 3.63181818]]
    [0 0 0 1 2 2 2 0 2 2 2 1 0 2 0 2 0 0 1 2]
- fit(X: numpy.ndarray, y: Optional[numpy.ndarray] = None)
Fits the model using X as the training dataset. y is ignored by the QKMeans algorithm and is accepted only for API consistency. Fitting partitions the training dataset into clusters.
- Parameters
X – training dataset
y – Ignored. Kept here for API consistency
- Returns
trained QKMeans object
- predict(X_test: numpy.ndarray) → numpy.ndarray
Predict the labels of the provided data.
- Parameters
X_test – New data to predict.
- Returns
Index of the cluster each sample belongs to.
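As a purely classical point of comparison, assigning a sample to a cluster amounts to finding its nearest centroid. The sketch below uses exact Euclidean distances in NumPy, whereas QKMeans obtains its distance estimates from a quantum circuit; the helper name and the three new points are hypothetical, and the centers are the ones printed in the example output.

```python
import numpy as np


def assign_to_nearest_centroid(X_new: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Return, for each row of X_new, the index of the closest centroid."""
    # Squared Euclidean distance from every sample to every centroid.
    d2 = np.sum((X_new[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    return np.argmin(d2, axis=1)


# Cluster centers taken from the example output.
centers = np.array([[5.96216216, 2.77027027],
                    [4.74761905, 3.02857143],
                    [5.34090909, 3.63181818]])
# Hypothetical new points, chosen for illustration only.
X_new = np.array([[6.0, 2.8], [5.0, 3.0], [5.4, 3.7]])
assigned = assign_to_nearest_centroid(X_new, centers)
print(assigned)  # [0 1 2]
```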
- score(X: numpy.ndarray, y: Optional[numpy.ndarray] = None, sample_weight: Optional[numpy.ndarray] = None) → float
Returns the Mean Silhouette Coefficient for all samples.
- Parameters
X – Array of features.
y – Ignored. Not used, present here for API consistency by convention.
sample_weight – Ignored. Not used, present here for API consistency by convention.
- Returns
Mean Silhouette Coefficient for all samples.
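For reference, the silhouette coefficient of one sample is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. The NumPy sketch below computes the mean over all samples under the assumption of Euclidean distances and at least two clusters; it is an illustration of the metric, not the code path used by `score`.

```python
import numpy as np


def mean_silhouette_sketch(X: np.ndarray, labels: np.ndarray) -> float:
    """Mean silhouette coefficient with Euclidean distances (illustrative only)."""
    # Pairwise Euclidean distances between all samples.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    cluster_ids = np.unique(labels)
    scores = []
    for i, own in enumerate(labels):
        same = labels == own
        n_same = int(same.sum())
        if n_same == 1:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = dists[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dists[i, labels == other].mean() for other in cluster_ids if other != own)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))


# Hypothetical toy data: two well-separated pairs of points.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels_demo = np.array([0, 0, 1, 1])
silhouette_demo = mean_silhouette_sketch(X_demo, labels_demo)
print(round(silhouette_demo, 4))
```

Values close to 1 indicate compact, well-separated clusters; values near 0 indicate overlapping clusters.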
qlearnkit.algorithms.qkmeans.qkmeans_circuit module
- qlearnkit.algorithms.qkmeans.qkmeans_circuit.construct_circuit(input_point: numpy.ndarray, centroids: numpy.ndarray, k: int) → qiskit.circuit.quantumcircuit.QuantumCircuit
Apply a Hadamard gate to the ancillary qubit and to the mapped data points. Encode the data points using U3 gates. Perform a controlled swap to entangle the state with the ancillary qubit. Apply another Hadamard gate to the ancillary qubit.
            ┌───┐                   ┌───┐
    |0anc>: ┤ H ├────────────■──────┤ H ├────────M
            └───┘            |      └───┘
            ┌───┐   ┌────┐   |
    |0>: ───┤ H ├───┤ U3 ├───X──────────
            └───┘   └────┘   |
            ┌───┐   ┌────┐   |
    |0>: ───┤ H ├───┤ U3 ├───X──────────
            └───┘   └────┘
- Parameters
input_point – Input point from which to calculate the distance.
centroids – Array of points representing the centroids to calculate the distance to.
k – Number of centroids.
- Returns
The quantum circuit created.
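The Hadamard, controlled swap, Hadamard sequence on the ancillary qubit has the shape of a swap test. For two single-qubit states |ψ⟩ and |φ⟩, the standard swap test measures the ancilla in |0⟩ with probability 1/2 + |⟨ψ|φ⟩|²/2, so more similar states give a probability closer to 1. The NumPy state-vector sketch below checks that relation. It does not use Qiskit and does not reproduce how qlearnkit maps features to U3 angles, which the source does not state; the function name `swap_test_p0` is hypothetical.

```python
import numpy as np

# Single-qubit Hadamard and the two-qubit SWAP gate.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=float)
# Controlled-SWAP with the ancilla as the most significant qubit:
# identity when the ancilla is |0>, SWAP when it is |1>.
CSWAP = np.block([[np.eye(4), np.zeros((4, 4))],
                  [np.zeros((4, 4)), SWAP]])
# Hadamard acting on the ancilla only.
H_ANC = np.kron(H, np.eye(4))


def swap_test_p0(psi: np.ndarray, phi: np.ndarray) -> float:
    """Probability of measuring the ancilla in |0> after a swap test."""
    psi = psi / np.linalg.norm(psi)
    phi = phi / np.linalg.norm(phi)
    # Initial state |0> (ancilla) tensor |psi> tensor |phi>.
    state = np.kron(np.array([1.0, 0.0]), np.kron(psi, phi))
    # Hadamard, controlled swap, Hadamard (applied right to left).
    state = H_ANC @ CSWAP @ H_ANC @ state
    # The first four amplitudes belong to ancilla = |0>.
    return float(np.sum(np.abs(state[:4]) ** 2))


zero = np.array([1.0, 0.0])
one = np.array([0.0, 1.0])
plus = np.array([1.0, 1.0])  # normalized inside swap_test_p0
p_same = swap_test_p0(zero, zero)   # identical states
p_orth = swap_test_p0(zero, one)    # orthogonal states
p_half = swap_test_p0(zero, plus)   # squared overlap of 0.5
print(p_same, p_orth, p_half)
```

On hardware or a shot-based simulator the probability is estimated from measurement counts, so the result carries sampling noise that shrinks as the number of shots grows.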
Module contents
- class qlearnkit.algorithms.qkmeans.QKMeans(n_clusters: int = 6, quantum_instance: Optional[Union[qiskit.utils.quantum_instance.QuantumInstance, qiskit.providers.basebackend.BaseBackend, qiskit.providers.backend.Backend]] = None, *, init: Union[str, numpy.ndarray] = 'kmeans++', n_init: int = 1, max_iter: int = 30, tol: float = 0.0001, random_state: int = 42)
The same class as qlearnkit.algorithms.qkmeans.qkmeans.QKMeans, documented in the qkmeans module section above.