lpproj is a Python implementation of Locality Preserving Projections, built to be compatible with scikit-learn.
This notebook contains a very short example showing the use of the code.
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
We'll use scikit-learn to create some blobs in 300 dimensions:
from sklearn.datasets import make_blobs
X, y = make_blobs(1000, n_features=300, centers=4,
                  cluster_std=8, random_state=42)
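As a quick sanity check (not part of the original notebook), the generated arrays can be inspected to confirm the number of samples, the number of features, and the cluster labels. This is only a sketch that repeats the call above so it can be run on its own:

```python
import numpy as np
from sklearn.datasets import make_blobs

# Same call as above, repeated so this cell is self-contained
X, y = make_blobs(1000, n_features=300, centers=4,
                  cluster_std=8, random_state=42)

print(X.shape)         # (n_samples, n_features)
print(np.unique(y))    # integer cluster labels
print(np.bincount(y))  # number of points in each cluster
```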
If we plot a few randomly chosen pairs of features (that is, axis-aligned projections of the data), we can see that the clusters overlap significantly along any "line-of-sight" into the high-dimensional data:
fig, ax = plt.subplots(2, 2, figsize=(10, 10))
rand = np.random.RandomState(42)
for axi in ax.flat:
    # scatter one randomly chosen feature against another
    axi.scatter(X[:, rand.randint(X.shape[1])],
                X[:, rand.randint(X.shape[1])], c=y);
We can find a projection that preserves the locality of the points using the LocalityPreservingProjection estimator; here we'll project the data into two dimensions:
from lpproj import LocalityPreservingProjection
lpp = LocalityPreservingProjection(n_components=2)
X_2D = lpp.fit_transform(X)
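To give a rough idea of what such a projection involves, here is a minimal NumPy sketch of the textbook LPP formulation: build a k-nearest-neighbor graph with weights W, form the degree matrix D and graph Laplacian L = D - W, and take the eigenvectors with the smallest eigenvalues of the generalized problem XᵀLX a = λ XᵀDX a. This is not the lpproj implementation. The function name `lpp_sketch`, the binary neighbor weights, the centering step, and the small regularization term are all assumptions made for illustration, and the demo data is synthetic.

```python
import numpy as np

def lpp_sketch(X, n_components=2, n_neighbors=5):
    """Illustrative LPP on a dense k-NN graph (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)          # assumption: center the data first
    n = X.shape[0]

    # Pairwise squared Euclidean distances; exclude self-matches
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    np.fill_diagonal(d2, np.inf)

    # Binary, symmetrized k-nearest-neighbor adjacency matrix
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)

    D = np.diag(W.sum(axis=1))      # degree matrix
    L = D - W                       # graph Laplacian

    A = X.T @ L @ X
    B = X.T @ D @ X
    # Tiny ridge term (assumption) to keep B safely positive definite
    B += 1e-9 * np.trace(B) / B.shape[0] * np.eye(B.shape[0])

    # Reduce A a = lambda B a to a standard symmetric eigenproblem
    C = np.linalg.cholesky(B)       # B = C C^T
    C_inv = np.linalg.inv(C)
    M = C_inv @ A @ C_inv.T
    M = (M + M.T) / 2               # remove round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(M)   # ascending eigenvalues

    # Map back and keep the directions with the smallest eigenvalues
    components = C_inv.T @ eigvecs[:, :n_components]
    return X @ components, components

# Small synthetic demo: 3 clusters in 20 dimensions
rng = np.random.RandomState(0)
demo_centers = rng.normal(scale=10, size=(3, 20))
demo_labels = rng.randint(3, size=150)
X_demo = demo_centers[demo_labels] + rng.normal(scale=2, size=(150, 20))

X_demo_2D, components = lpp_sketch(X_demo)
print(X_demo_2D.shape, components.shape)
```

The dense distance and adjacency matrices keep the sketch short but scale quadratically with the number of samples, so it is only meant for small inputs.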
Plotting this projection, we confirm that it has kept nearby points together, as represented by the distinct clusters visible in the projection:
plt.scatter(X_2D[:, 0], X_2D[:, 1], c=y)
plt.title("Projected from 300->2 dimensions");
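The visual impression can also be backed by a number. The sketch below computes, for each point, the fraction of its k nearest neighbors that share its label; values near 1 mean that neighborhoods are dominated by a single cluster. The helper `neighbor_label_agreement` is a hypothetical name introduced here, not part of lpproj or scikit-learn, and the demo uses synthetic 2D points rather than the notebook's data.

```python
import numpy as np

def neighbor_label_agreement(points, labels, k=10):
    """Mean fraction of each point's k nearest neighbors with the same label."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared distances; a point is not its own neighbor
    sq = np.sum(points ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * points @ points.T
    np.fill_diagonal(d2, np.inf)

    nearest = np.argsort(d2, axis=1)[:, :k]
    return float(np.mean(labels[nearest] == labels[:, None]))

# Synthetic check: two tight, far-apart clusters in 2D
rng = np.random.RandomState(0)
demo_points = np.vstack([rng.normal(0, 1, size=(50, 2)),
                         rng.normal(100, 1, size=(50, 2))])
demo_labels = np.repeat([0, 1], 50)
demo_score = neighbor_label_agreement(demo_points, demo_labels)
print(demo_score)
```

In the notebook, calling `neighbor_label_agreement(X_2D, y)` would give the corresponding score for the projection plotted above.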
For more information, see the Locality Preserving Projection website.