brianwu-s / dimentionality_reduction
Unstructured high-dimensional data such as video, audio, text, and images has become a hot topic in data-mining research. However, high-dimensional data often incurs substantial computational cost and low training efficiency. Moreover, high dimensionality makes the data space sparse, so models are more likely to overfit. As a consequence, dimensionality reduction is commonly applied as a preprocessing step. In this report, we try nine different dimensionality reduction methods: selection by variance, Random Forest, PCA, kernel PCA, LDA, AE, VAE, t-SNE, and UMAP. We then compare the performance of the approaches across hyperparameter settings. Experiments on the AwA2 dataset show that LDA attains the most efficient result, with 0.93 accuracy at only 49 dimensions, while PCA with a sigmoid kernel reaches the best accuracy of 0.935 but only reduces the dimension to 1024.
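As a rough illustration of the comparison described above, the sketch below runs two of the listed reducers (LDA and sigmoid-kernel PCA) in front of a simple classifier. It is not the report's actual pipeline: it uses scikit-learn on synthetic data standing in for AwA2 features, and the component counts and classifier are placeholder choices. Note that LDA can project to at most `n_classes - 1` dimensions, which is why the report's AwA2 result (50 classes) lands at 49 dimensions.

```python
# Hedged sketch: LDA vs. sigmoid-kernel PCA as dimensionality reducers
# before a classifier. Synthetic data stands in for the AwA2 features
# used in the report; all hyperparameters here are illustrative.
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=600, n_features=100,
                           n_informative=30, n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# LDA: at most n_classes - 1 components (4 here; 49 for AwA2's 50 classes).
lda = make_pipeline(LinearDiscriminantAnalysis(n_components=4),
                    LogisticRegression(max_iter=1000))
lda.fit(X_tr, y_tr)
print("LDA accuracy:", lda.score(X_te, y_te))

# Kernel PCA with a sigmoid kernel; n_components is a free hyperparameter.
kpca = make_pipeline(KernelPCA(n_components=32, kernel="sigmoid"),
                     LogisticRegression(max_iter=1000))
kpca.fit(X_tr, y_tr)
print("Kernel PCA accuracy:", kpca.score(X_te, y_te))
```

Comparing methods this way (reducer plus a fixed downstream classifier) is what lets accuracy be traded off against the reduced dimension, as in the LDA-vs-kernel-PCA result above.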
License: MIT License