scikit-hyper
Machine learning for hyperspectral data in Python
- Simple tools for exploratory analysis of hyperspectral data
- Built on numpy, scipy, matplotlib and scikit-learn
- Simple to use, syntax similar to scikit-learn
About
This package builds upon the popular scikit-learn to provide an interface for performing machine learning on hyperspectral data. Many of the commonly used techniques in the analysis of hyperspectral data (PCA, ICA, clustering and classification) have been implemented and more will be added in the future.
scikit-hyper also provides two features which aim to make exploratory analysis easier:
- Process object (skhyper.process.Process): This class forms the core of scikit-hyper. It provides useful information about the hyperspectral data and makes machine learning on the data simple.
- Interactive hyperspectral viewer: A lightweight PyQt GUI that provides an interactive interface for viewing the hyperspectral data.
Please note that this package is currently in pre-release. The first general release will be v0.1.0.
Installation
To install using pip:
pip install scikit-hyper
The following packages are required:
- numpy
- scipy
- scikit-learn
- matplotlib
- seaborn
- PyQt5
- pyqtgraph
Features
Features implemented in scikit-hyper include:
- Classification (e.g. SVM, Naive Bayes)
- Clustering (KMeans)
- Decomposition (e.g. PCA, ICA, NMF)
- Hyperspectral viewer
- Tools (smoothing, normalization)
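As a rough illustration of what smoothing and normalization mean for a hyperspectral cube, the sketch below uses plain NumPy rather than scikit-hyper's own tools, whose function names are not shown here. The helpers smooth_spectra and normalize_spectra, the moving-average filter and the min-max scaling are all assumptions chosen for this example.

```python
import numpy as np


def smooth_spectra(cube, window=5):
    """Moving-average smoothing along the last (spectral) axis.

    Hypothetical helper for illustration; not part of scikit-hyper.
    """
    kernel = np.ones(window) / window
    # mode="same" keeps the spectral length unchanged; values near the
    # two ends are damped because the signal is zero-padded there.
    return np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="same"), -1, cube
    )


def normalize_spectra(cube):
    """Scale every spectrum to the range [0, 1] (min-max normalization).

    Hypothetical helper for illustration; not part of scikit-hyper.
    """
    lo = cube.min(axis=-1, keepdims=True)
    hi = cube.max(axis=-1, keepdims=True)
    # Guard against division by zero for completely flat spectra
    span = np.where(hi > lo, hi - lo, 1.0)
    return (cube - lo) / span


# Small random (x, y, spectral) cube
rng = np.random.default_rng(0)
cube = rng.random((20, 20, 128))

smoothed = smooth_spectra(cube, window=5)
normalized = normalize_spectra(smoothed)
```

Both helpers act on the last axis only, so the same code works for 3-d (x, y, spectral) and 4-d (x, y, z, spectral) arrays.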
Examples
Hyperspectral denoising
import numpy as np
from skhyper.process import Process
from skhyper.decomposition import PCA
# Generating a random 4-d dataset and creating a Process instance
test_data = np.random.rand(200, 200, 10, 1024)
X = Process(test_data, scale=True)
# To denoise the dataset using PCA, fit the model and transform
# the data in a single step with fit_transform().
# All the usual scikit-learn parameters are available
mdl = PCA()
mdl.fit_transform(X)
# The scree plot can be accessed by:
mdl.plot_statistics()
# Choosing the number of components to keep, we project back
# into the original space:
Xd = mdl.inverse_transform(n_components=200)
# Xd is another instance of Process, which contains the new
# denoised hyperspectral data
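For readers who want to see the idea behind PCA denoising without the Process wrapper, here is a minimal NumPy-only sketch. It is not how scikit-hyper is implemented. It assumes the spectral axis is last, builds a synthetic cube from two made-up spectral signatures, and keeps k = 2 components; all names in it are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic (x, y, spectral) cube: every pixel is a random mix of two
# Gaussian-shaped spectral signatures, plus additive noise.
n_x, n_y, n_spec = 30, 30, 64
axis = np.linspace(0, 1, n_spec)
signatures = np.stack([
    np.exp(-((axis - 0.3) / 0.05) ** 2),
    np.exp(-((axis - 0.7) / 0.10) ** 2),
])
abundances = rng.random((n_x, n_y, 2))
clean = abundances @ signatures                       # (n_x, n_y, n_spec)
noisy = clean + 0.05 * rng.standard_normal(clean.shape)

# Flatten the spatial axes so each row is one pixel's spectrum
flat = noisy.reshape(-1, n_spec)
mean = flat.mean(axis=0)

# PCA via SVD of the mean-centred data
U, S, Vt = np.linalg.svd(flat - mean, full_matrices=False)

# Fraction of variance explained per component (the scree plot values)
explained = S ** 2 / np.sum(S ** 2)

# Keep the first k components and project back into the original space
k = 2
denoised_flat = (U[:, :k] * S[:k]) @ Vt[:k] + mean
denoised = denoised_flat.reshape(noisy.shape)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
```

Because the synthetic signal lies in a two-dimensional subspace, discarding the remaining components removes mostly noise, so mse_denoised comes out smaller than mse_noisy. On real data the number of components to keep is usually read off the scree plot.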
Hyperspectral clustering
import numpy as np
from skhyper.process import Process
from skhyper.cluster import KMeans
# Generating a random 3-d dataset and creating a Process instance
test_data = np.random.rand(200, 200, 1024)
X = Process(test_data, scale=True)
# Again, all the usual scikit-learn parameters are available
mdl = KMeans(n_clusters=4)
mdl.fit(X)
# The outputs are:
# mdl.labels_ (a 2d/3d label image containing n_clusters distinct labels)
# mdl.image_components_ (a list of n_clusters image arrays)
# mdl.spec_components_ (a list of n_clusters spectral arrays)
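The same kind of clustering can be sketched with scikit-learn alone, which may help clarify what a label image and per-cluster spectra are. This is an illustration under stated assumptions, not scikit-hyper's implementation: the synthetic cube, the variable names label_image, cluster_masks and mean_spectra, and the way they are computed are all chosen for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_x, n_y, n_spec, n_clusters = 20, 20, 32, 4

# Synthetic (x, y, spectral) cube: four quadrants, each with a
# different constant spectral offset, plus a little noise.
cube = 0.05 * rng.standard_normal((n_x, n_y, n_spec))
cube[:10, 10:] += 1.0
cube[10:, :10] += 2.0
cube[10:, 10:] += 3.0

# scikit-learn expects (n_samples, n_features): one row per pixel
flat = cube.reshape(-1, n_spec)
km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)

# Reshape the per-pixel labels back into an image
label_image = km.labels_.reshape(n_x, n_y)

# One binary mask and one mean spectrum per cluster
cluster_masks = [label_image == k for k in range(n_clusters)]
mean_spectra = [flat[km.labels_ == k].mean(axis=0) for k in range(n_clusters)]
```

The reshape to (pixels, spectral points) and back is the step a wrapper around scikit-learn has to take care of for image-shaped data.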
Documentation
The docs are hosted here.
Where a scikit-hyper module wraps its scikit-learn counterpart, the API documentation includes the documentation of the corresponding scikit-learn module.
License
scikit-hyper is licensed under the OSI-approved BSD 3-Clause License.