dmitryulyanov / multicore-tsne
Parallel t-SNE implementation with Python and Torch wrappers.
License: Other
I followed the instructions on this link to get a version of gcc that supports OpenMP,
but it looks like the install script isn't using gcc:
-- The C compiler identification is AppleClang 8.0.0.8000038
-- The CXX compiler identification is AppleClang 8.0.0.8000038
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
help?
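One thing that may help here: CMake generally takes its compilers from the CC and CXX environment variables the first time it configures a build directory. The sketch below only builds such an environment in Python. The compiler names gcc-7 and g++-7 and the helper name build_env_with_compiler are assumptions for illustration, not something the install script documents, and any stale build directory would need to be removed first because CMake caches the compiler it found.

```python
import os


def build_env_with_compiler(cc, cxx, base_env=None):
    """Return a copy of the environment with CC/CXX set to the given compilers."""
    env = dict(os.environ if base_env is None else base_env)
    env["CC"] = cc
    env["CXX"] = cxx
    return env


# Hypothetical compiler names; use whatever your gcc installation provides.
env = build_env_with_compiler(
    "gcc-7", "g++-7", base_env={"PATH": "/usr/local/bin:/usr/bin"}
)
print(env["CC"], env["CXX"])
```

The resulting dictionary can then be passed as the env argument of subprocess.run when launching the install command.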
I installed this.
It works inside the directory.
When I move it to dist-packages, it breaks.
It is on sys.path.
I'm running into this error:
OSError: cannot load library /home/vsilva/anaconda2/lib/python2.7/site-packages/MulticoreTSNE/libtsne_multicore.so: /home/vsilva/anaconda2/bin/../lib/libgomp.so.1: version `GOMP_4.0' not found (required by /home/vsilva/anaconda2/lib/python2.7/site-packages/MulticoreTSNE/libtsne_multicore.so). Additionally, ctypes.util.find_library() did not manage to locate a library called '/home/vsilva/anaconda2/lib/python2.7/site-packages/MulticoreTSNE/libtsne_multicore.so'
I've searched everywhere, but no one has a consistent answer. Has anyone else run into this?
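A quick way to narrow this down might be to check which libgomp copies on the machine actually carry the GOMP_4.0 version tag, since the message says the one under anaconda2 does not. The sketch below simply searches the raw bytes of a file for the tag, much like running strings on it. The helper name library_mentions_version is made up for this example, and a byte search is only a rough heuristic, not a proper ELF parse.

```python
import os


def library_mentions_version(path, version_tag):
    """Return True if the raw bytes of the file at `path` contain `version_tag`."""
    with open(path, "rb") as handle:
        return version_tag.encode("ascii") in handle.read()


# Path taken from the error message above; skipped if it does not exist here.
candidate = "/home/vsilva/anaconda2/bin/../lib/libgomp.so.1"
if os.path.exists(candidate):
    print(candidate, library_mentions_version(candidate, "GOMP_4.0"))
```

If the system copy of libgomp contains the tag and the Anaconda copy does not, the library was probably built against the newer one and is being loaded against the older one.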
Hello.
Thanks for your work.
I am trying to use your library, but I get this exception:
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.5/dist-packages/MulticoreTSNE/__init__.py", line 20, in run
self._target(*self._args)
TypeError: an integer is required
Hi
Is there any roadmap for allowing higher dimensionality? t-SNE can also be used to reduce the dimension of datasets, e.g. from 200 down to 10. Being able to do this with something much faster than sklearn would be really cool.
Thanks
Ian
Did you implement this algorithm just by adding OpenMP?
Did you make any changes to the bhtsne procedure itself?
I am unable to install using pip install . It gives "Running setup.py bdist_wheel for MulticoreTSNE ... error", then "CMake Error at CMakeLists.txt:1", "Failed to run MSBuild command", and finally a build failed error. I am using cmake version 3.11.0-rc4 and Microsoft .NET Framework v4.0.30319.
I am getting this error:
$ pip install .
Processing c:\users\deep chatterjee\multicore-tsne
Requirement already satisfied: numpy in e:\anaconda3\lib\site-packages (from MulticoreTSNE==0.1) (1.14.5)
Requirement already satisfied: cffi in e:\anaconda3\lib\site-packages (from MulticoreTSNE==0.1) (1.10.0)
Requirement already satisfied: pycparser in e:\anaconda3\lib\site-packages (from cffi->MulticoreTSNE==0.1) (2.18)
Building wheels for collected packages: MulticoreTSNE
Running setup.py bdist_wheel for MulticoreTSNE: started
Running setup.py bdist_wheel for MulticoreTSNE: finished with status 'error'
Complete output from command E:\Anaconda3\python.exe -u -c "import setuptools, tokenize;file='C:\Users\Public\Documents\Wondershare\CreatorTemp\pip- req-build-fc9af9iu\setup.py';f=getattr(tokenize, 'open', open)(file);code=f .read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" b dist_wheel -d C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-wheel-c0r7vy ex --python-tag cp36:
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-3.6
creating build\lib.win-amd64-3.6\MulticoreTSNE
copying MulticoreTSNE\__init__.py -> build\lib.win-amd64-3.6\MulticoreTSNE
creating build\lib.win-amd64-3.6\MulticoreTSNE\tests
copying MulticoreTSNE\tests\test_base.py -> build\lib.win-amd64-3.6\MulticoreTSNE\tests
copying MulticoreTSNE\tests\__init__.py -> build\lib.win-amd64-3.6\MulticoreTSNE\tests
running egg_info
creating MulticoreTSNE.egg-info
writing MulticoreTSNE.egg-info\PKG-INFO
writing dependency_links to MulticoreTSNE.egg-info\dependency_links.txt
writing requirements to MulticoreTSNE.egg-info\requires.txt
writing top-level names to MulticoreTSNE.egg-info\top_level.txt
writing manifest file 'MulticoreTSNE.egg-info\SOURCES.txt'
reading manifest file 'MulticoreTSNE.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'MulticoreTSNE.egg-info\SOURCES.txt'
running build_ext
cmake version 3.11.0-rc4
CMake suite maintained and supported by Kitware (kitware.com/cmake).
-- Building for: Visual Studio 10 2010
CMake Error at CMakeLists.txt:1 (PROJECT):
Failed to run MSBuild command:
C:/Windows/Microsoft.NET/Framework/v4.0.30319/MSBuild.exe
to get the value of VCTargetsPath:
Microsoft (R) Build Engine version 4.6.1055.0
[Microsoft .NET Framework, version 4.0.30319.42000]
Copyright (C) Microsoft Corporation. All rights reserved.
Build started 02-08-2018 17:04:52.
Project "C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build-f c9af9iu\build\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\VCTargetsPath.vcx proj" on node 1 (default targets).
C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build-fc9af9iu\b uild\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\VCTargetsPath.vcxproj(14,2 ): error MSB4019: The imported project "C:\Microsoft.Cpp.Default.props" was not found. Confirm that the path in the <Import> declaration is correct, and that th e file exists on disk.
Done Building Project "C:\Users\Public\Documents\Wondershare\CreatorTemp\p ip-req-build-fc9af9iu\build\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\VCT argetsPath.vcxproj" (default targets) -- FAILED.
Build FAILED.
"C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build-fc9af9iu\ build\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\VCTargetsPath.vcxproj" (d efault target) (1) ->
C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build-fc9af9iu \build\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\VCTargetsPath.vcxproj(14 ,2): error MSB4019: The imported project "C:\Microsoft.Cpp.Default.props" was no t found. Confirm that the path in the <Import> declaration is correct, and that the file exists on disk.
0 Warning(s)
1 Error(s)
Time Elapsed 00:00:00.06
Exit code: 1
-- Configuring incomplete, errors occurred!
See also "C:/Users/Public/Documents/Wondershare/CreatorTemp/pip-req-build-fc9a f9iu/build/temp.win-amd64-3.6/Release/CMakeFiles/CMakeOutput.log".
ERROR: Cannot generate Makefile. See above errors.
Failed building wheel for MulticoreTSNE
Running setup.py clean for MulticoreTSNE
Failed to build MulticoreTSNE
Installing collected packages: MulticoreTSNE
Running setup.py install for MulticoreTSNE: started
Running setup.py install for MulticoreTSNE: finished with status 'error'
Complete output from command E:\Anaconda3\python.exe -u -c "import setuptool s, tokenize;file='C:\Users\Public\Documents\Wondershare\CreatorTemp\pi p-req-build-fc9af9iu\setup.py';f=getattr(tokenize, 'open', open)(file);code =f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-r ni883bh\install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.6
creating build\lib.win-amd64-3.6\MulticoreTSNE
copying MulticoreTSNE\__init__.py -> build\lib.win-amd64-3.6\MulticoreTSNE
creating build\lib.win-amd64-3.6\MulticoreTSNE\tests
copying MulticoreTSNE\tests\test_base.py -> build\lib.win-amd64-3.6\MulticoreTSNE\tests
copying MulticoreTSNE\tests\__init__.py -> build\lib.win-amd64-3.6\MulticoreTSNE\tests
running egg_info
writing MulticoreTSNE.egg-info\PKG-INFO
writing dependency_links to MulticoreTSNE.egg-info\dependency_links.txt
writing requirements to MulticoreTSNE.egg-info\requires.txt
writing top-level names to MulticoreTSNE.egg-info\top_level.txt
reading manifest file 'MulticoreTSNE.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'MulticoreTSNE.egg-info\SOURCES.txt'
running build_ext
cmake version 3.11.0-rc4
CMake suite maintained and supported by Kitware (kitware.com/cmake).
-- Building for: Visual Studio 10 2010
CMake Error at CMakeLists.txt:1 (PROJECT):
Failed to run MSBuild command:
C:/Windows/Microsoft.NET/Framework/v4.0.30319/MSBuild.exe
to get the value of VCTargetsPath:
Microsoft (R) Build Engine version 4.6.1055.0
[Microsoft .NET Framework, version 4.0.30319.42000]
Copyright (C) Microsoft Corporation. All rights reserved.
Build started 02-08-2018 17:04:57.
Project "C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build -fc9af9iu\build\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\VCTargetsPath.v cxproj" on node 1 (default targets).
C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build-fc9af9iu \build\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\VCTargetsPath.vcxproj(14 ,2): error MSB4019: The imported project "C:\Microsoft.Cpp.Default.props" was no t found. Confirm that the path in the <Import> declaration is correct, and that the file exists on disk.
Done Building Project "C:\Users\Public\Documents\Wondershare\CreatorTemp \pip-req-build-fc9af9iu\build\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\V CTargetsPath.vcxproj" (default targets) -- FAILED.
Build FAILED.
"C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build-fc9af9i u\build\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\VCTargetsPath.vcxproj" (default target) (1) ->
C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build-fc9af9 iu\build\temp.win-amd64-3.6\Release\CMakeFiles\3.11.0-rc4\VCTargetsPath.vcxproj( 14,2): error MSB4019: The imported project "C:\Microsoft.Cpp.Default.props" was not found. Confirm that the path in the <Import> declaration is correct, and tha t the file exists on disk.
0 Warning(s)
1 Error(s)
Time Elapsed 00:00:00.04
Exit code: 1
-- Configuring incomplete, errors occurred!
See also "C:/Users/Public/Documents/Wondershare/CreatorTemp/pip-req-build-fc 9af9iu/build/temp.win-amd64-3.6/Release/CMakeFiles/CMakeOutput.log".
ERROR: Cannot generate Makefile. See above errors.
----------------------------------------
Command "E:\Anaconda3\python.exe -u -c "import setuptools, tokenize;file='C: \Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build-fc9af9iu\se tup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n' , '\n');f.close();exec(compile(code, file, 'exec'))" install --record C:\Use rs\Public\Documents\Wondershare\CreatorTemp\pip-record-rni883bh\install-record.t xt --single-version-externally-managed --compile" failed with error code 1 in C: \Users\Public\Documents\Wondershare\CreatorTemp\pip-req-build-fc9af9iu\
Thanks in advance
Hi Dmitry,
currently, the option random_state is ignored, so every t-SNE plot looks different. Would you consider setting a seed for the initialization, as described above, when random_state != None? If you want, I can make a pull request for that.
Cheers,
Alex
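To make the request concrete, here is a minimal sketch of what a seeded initialization could look like. It is not the library's code: the function name init_embedding, the Gaussian initialization and the 1e-4 scale are all assumptions chosen for illustration. The only point is that the same random_state yields the same starting coordinates, while None keeps the current non-deterministic behaviour.

```python
import random


def init_embedding(n_samples, n_components=2, random_state=None, scale=1e-4):
    """Return small Gaussian starting coordinates, reproducible for a fixed seed."""
    # random.Random(None) seeds from the OS, so unseeded runs still differ.
    rng = random.Random(random_state)
    return [
        [rng.gauss(0.0, scale) for _ in range(n_components)]
        for _ in range(n_samples)
    ]


first = init_embedding(5, random_state=42)
second = init_embedding(5, random_state=42)
print(first == second)
```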
Building on macOS 10.12.6. In link.txt, generated by the makefile, I needed to add the following flags to get it to link:
-lc++ -lstdc++
Hello
I get "segmentation fault (core dumped)" when I run multicore-tsne. One case where the crash occurs is when the input data contains lots of zeros. Is there a fix for this problem?
Thanks
from MulticoreTSNE import MulticoreTSNE as TSNE
import numpy as np
tsne = TSNE(n_jobs=40, perplexity=30)
tsne.fit(np.zeros([5,3]))
Squared euclidean distance cannot be used in VPTree search and thus sqrt() should be calculated for result:
sqrt() should be used in https://github.com/DmitryUlyanov/Multicore-TSNE/blob/master/multicore_tsne/vptree.h#L71
or https://github.com/DmitryUlyanov/Multicore-TSNE/blob/master/multicore_tsne/vptree.h#L206
Using squared euclidean distance in VPTree causes search to not find all k nearest points.
See lvdmaaten/bhtsne#41 (comment) and http://stevehanov.ca/blog/index.php?id=130
"It is worth repeating that you must use a distance metric that satisfies the triangle inequality. I spent a lot of time wondering why my VP tree was not working. It turns out that I had not bothered to find the square root in the distance calculation. This step is important to satisfy the requirements of a metric space, because if the straight line distance to a <= b+c, it does not necessarily follow that a² <= b² + c²."
Omitting sqrt in the VPTree search seems to improve performance only because the search no longer visits all the necessary branches of the tree. You can verify this by computing t-SNE with both metrics from the same initial coordinates: the outputs differ slightly. I did this in lvdmaaten/bhtsne#41 (comment).
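A small worked example of the point in the quote, using three collinear points of my own choosing (0, 1 and 2 on a line). With the true Euclidean distance the triangle inequality holds with equality, while with the squared distance it fails, which is exactly the property the VP tree pruning relies on.

```python
import math


def squared_euclidean(p, q):
    """Squared Euclidean distance; not a metric."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def euclidean(p, q):
    """Euclidean distance; a proper metric."""
    return math.sqrt(squared_euclidean(p, q))


def satisfies_triangle_inequality(dist, x, y, z):
    """Check dist(x, z) <= dist(x, y) + dist(y, z) for one triple of points."""
    return dist(x, z) <= dist(x, y) + dist(y, z)


x, y, z = (0.0,), (1.0,), (2.0,)
print(satisfies_triangle_inequality(euclidean, x, y, z))          # 2 <= 1 + 1
print(satisfies_triangle_inequality(squared_euclidean, x, y, z))  # 4 <= 1 + 1
```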
t-SNE is inherently randomized, but not to this extent. This implementation consistently produces different (much worse) results than the scikit-learn Barnes-Hut implementation.
Example on IRIS dataset:
Scikit-learn with default parameters and learning rate 100
Multicore T-SNE with default parameters and learning rate 100
The greater distance of the setosa cluster is also supported by general statistical properties of the dataset (and by other embedding algorithms), so the scikit-learn results are more consistent with the original manifold structure.
This is due to not having a .fit() method. Therefore you cannot encode categorical variables so easily.
This would be very nice to have!
In line tsne.cpp#L37, the vectors indices and distances will have fewer than K + 1 elements if K > N - 1.
This will cause erroneous out-of-bounds reads of the vectors in tsne.cpp#L388 and other loops.
There should be a check that K <= N - 1.
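Until such a check exists in the C++ code, a guard on the Python side could look roughly like the sketch below. It assumes K = int(3 * perplexity), which is how the reference bhtsne code picks the neighbour count; I have not confirmed that this fork does the same, and the function name check_neighbor_count is made up for the example.

```python
def check_neighbor_count(n_samples, perplexity):
    """Return K, or raise ValueError if K exceeds n_samples - 1.

    Assumes K = int(3 * perplexity), as in the reference bhtsne code.
    """
    k = int(3 * perplexity)
    if k > n_samples - 1:
        raise ValueError(
            "perplexity %s needs K=%d neighbours, but only %d are available"
            % (perplexity, k, n_samples - 1)
        )
    return k


print(check_neighbor_count(1000, 30))
try:
    check_neighbor_count(5, 30)  # 5 points with perplexity 30
except ValueError as error:
    print("rejected:", error)
```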
I am seeing this verbose message even when n_jobs=4:
Performing t-SNE using 1 cores.
Using no_dims = 2, perplexity = 30.000000, and theta = 0.500000
Computing input similarities...
Building tree...
I'm trying to install MulticoreTSNE into a Docker image built on top of the Jupyter Minimal distribution (Anaconda Python, etc.). Previously I was able to run this without a hitch. I've tested the following combinations:
cmake 3.18.2 = Build successful (this was the version dumped from my last working image)
cmake 3.20.1 = Build unsuccessful (error below)
Forcing cmake to downgrade to 3.18.2 seems to force a downgrade of Python from 3.9 to 3.8, so it's possible that is the source of the problem, but the report suggests it's cmake.
The error is:
#13 53.91 running build_ext
#13 53.91 cmake version 3.20.1
#13 53.91
#13 53.91 CMake suite maintained and supported by Kitware (kitware.com/cmake).
#13 53.91 CMake Error: Unknown argument --
#13 53.91 CMake Error: Run 'cmake --help' for all supported options.
#13 53.91
#13 53.91 ERROR: Cannot generate Makefile. See above errors.
Full context is:
> [4/5] RUN conda-env create -n ethos -f ./python.test.yml && conda clean --all --yes --force-pkgs-dirs && find /opt/conda/ -follow -type f -name '*.a' -delete && find /opt/conda/ -follow -type f -name '*.pyc' -delete && find /opt/conda/ -follow -type f -name '*.js.map' -delete && pip cache purge && rm -rf /home/jovyan/.cache/pip && rm ./python.test.yml:
#12 0.549 Collecting package metadata (repodata.json): ...working... done
#12 25.27 Solving environment: ...working... done
#12 29.94
#12 29.94 Downloading and Extracting Packages
libffi-3.3 | 51 KB | ########## | 100%
lz4-c-1.9.3 | 179 KB | ########## | 100%
libgomp-9.3.0 | 376 KB | ########## | 100%
libedit-3.1.20191231 | 121 KB | ########## | 100%
xz-5.2.5 | 343 KB | ########## | 100%
rhash-1.4.1 | 192 KB | ########## | 100%
krb5-1.17.2 | 1.4 MB | ########## | 100%
readline-8.1 | 295 KB | ########## | 100%
bzip2-1.0.8 | 484 KB | ########## | 100%
_openmp_mutex-4.5 | 22 KB | ########## | 100%
tzdata-2021a | 121 KB | ########## | 100%
libssh2-1.9.0 | 226 KB | ########## | 100%
certifi-2020.12.5 | 143 KB | ########## | 100%
sqlite-3.35.4 | 1.4 MB | ########## | 100%
ca-certificates-2020 | 137 KB | ########## | 100%
zstd-1.4.9 | 431 KB | ########## | 100%
libuv-1.41.0 | 1.0 MB | ########## | 100%
setuptools-49.6.0 | 943 KB | ########## | 100%
libstdcxx-ng-9.3.0 | 4.0 MB | ########## | 100%
libcurl-7.76.1 | 328 KB | ########## | 100%
cmake-3.20.1 | 14.7 MB | ########## | 100%
libgcc-ng-9.3.0 | 7.8 MB | ########## | 100%
python-3.9.2 | 27.3 MB | ########## | 100%
zlib-1.2.11 | 106 KB | ########## | 100%
pip-21.0.1 | 1.1 MB | ########## | 100%
ld_impl_linux-64-2.3 | 618 KB | ########## | 100%
tk-8.6.10 | 3.2 MB | ########## | 100%
openssl-1.1.1k | 2.1 MB | ########## | 100%
libnghttp2-1.43.0 | 808 KB | ########## | 100%
wheel-0.36.2 | 31 KB | ########## | 100%
python_abi-3.9 | 4 KB | ########## | 100%
libev-4.33 | 104 KB | ########## | 100%
_libgcc_mutex-0.1 | 3 KB | ########## | 100%
expat-2.3.0 | 168 KB | ########## | 100%
ncurses-6.2 | 985 KB | ########## | 100%
c-ares-1.17.1 | 109 KB | ########## | 100%
#12 44.17 Preparing transaction: ...working... done
#12 44.43 Verifying transaction: ...working... done
#12 45.95 Executing transaction: ...working... done
#12 47.59 Installing pip dependencies: ...working... Ran pip subprocess with arguments:
#12 52.31 ['/opt/conda/envs/ethos/bin/python', '-m', 'pip', 'install', '-U', '-r', '/home/jovyan/condaenv.4kzuzpkn.requirements.txt']
#12 52.31 Pip subprocess output:
#12 52.31 Collecting MulticoreTSNE
#12 52.31 Downloading MulticoreTSNE-0.1.tar.gz (20 kB)
#12 52.31 Collecting numpy
#12 52.31 Downloading numpy-1.20.2-cp39-cp39-manylinux2010_x86_64.whl (15.4 MB)
#12 52.31 Collecting cffi
#12 52.31 Downloading cffi-1.14.5-cp39-cp39-manylinux1_x86_64.whl (406 kB)
#12 52.31 Collecting pycparser
#12 52.31 Downloading pycparser-2.20-py2.py3-none-any.whl (112 kB)
#12 52.31 Building wheels for collected packages: MulticoreTSNE
#12 52.31 Building wheel for MulticoreTSNE (setup.py): started
#12 52.31 Building wheel for MulticoreTSNE (setup.py): finished with status 'error'
#12 52.31 Running setup.py clean for MulticoreTSNE
#12 52.31 Failed to build MulticoreTSNE
#12 52.31 Installing collected packages: pycparser, numpy, cffi, MulticoreTSNE
#12 52.31 Running setup.py install for MulticoreTSNE: started
#12 52.31 Running setup.py install for MulticoreTSNE: finished with status 'error'
#12 52.31
#12 52.31 failed
#12 52.31
#12 52.31
#12 52.31 ==> WARNING: A newer version of conda exists. <==
#12 52.31 current version: 4.10.0
#12 52.31 latest version: 4.10.1
#12 52.31
#12 52.31 Please update conda by running
#12 52.31
#12 52.31 $ conda update -n base conda
#12 52.31
#12 52.31
#12 52.31 Pip subprocess error:
#12 52.31 ERROR: Command errored out with exit status 1:
#12 52.31 command: /opt/conda/envs/ethos/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-110qdwqv/multicoretsne_4b5d168de5e04f6c894a77b8595839b9/setup.py'"'"'; __file__='"'"'/tmp/pip-install-110qdwqv/multicoretsne_4b5d168de5e04f6c894a77b8595839b9/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-gm7zmxgs
#12 52.31 cwd: /tmp/pip-install-110qdwqv/multicoretsne_4b5d168de5e04f6c894a77b8595839b9/
#12 52.31 Complete output (26 lines):
#12 52.31 running bdist_wheel
#12 52.31 running build
#12 52.31 running build_py
#12 52.31 creating build
#12 52.31 creating build/lib.linux-x86_64-3.9
#12 52.31 creating build/lib.linux-x86_64-3.9/MulticoreTSNE
#12 52.31 copying MulticoreTSNE/__init__.py -> build/lib.linux-x86_64-3.9/MulticoreTSNE
#12 52.31 creating build/lib.linux-x86_64-3.9/MulticoreTSNE/tests
#12 52.31 copying MulticoreTSNE/tests/test_base.py -> build/lib.linux-x86_64-3.9/MulticoreTSNE/tests
#12 52.31 copying MulticoreTSNE/tests/__init__.py -> build/lib.linux-x86_64-3.9/MulticoreTSNE/tests
#12 52.31 running egg_info
#12 52.31 writing MulticoreTSNE.egg-info/PKG-INFO
#12 52.31 writing dependency_links to MulticoreTSNE.egg-info/dependency_links.txt
#12 52.31 writing requirements to MulticoreTSNE.egg-info/requires.txt
#12 52.31 writing top-level names to MulticoreTSNE.egg-info/top_level.txt
#12 52.31 reading manifest file 'MulticoreTSNE.egg-info/SOURCES.txt'
#12 52.31 reading manifest template 'MANIFEST.in'
#12 52.31 writing manifest file 'MulticoreTSNE.egg-info/SOURCES.txt'
#12 52.31 running build_ext
#12 52.31 cmake version 3.20.1
#12 52.31
#12 52.31 CMake suite maintained and supported by Kitware (kitware.com/cmake).
#12 52.31 CMake Error: Unknown argument --
#12 52.31 CMake Error: Run 'cmake --help' for all supported options.
#12 52.31
#12 52.31 ERROR: Cannot generate Makefile. See above errors.
#12 52.31 ----------------------------------------
#12 52.31 ERROR: Failed building wheel for MulticoreTSNE
#12 52.31 ERROR: Command errored out with exit status 1:
#12 52.31 command: /opt/conda/envs/ethos/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-110qdwqv/multicoretsne_4b5d168de5e04f6c894a77b8595839b9/setup.py'"'"'; __file__='"'"'/tmp/pip-install-110qdwqv/multicoretsne_4b5d168de5e04f6c894a77b8595839b9/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-5jx__mkf/install-record.txt --single-version-externally-managed --compile --install-headers /opt/conda/envs/ethos/include/python3.9/MulticoreTSNE
#12 52.31 cwd: /tmp/pip-install-110qdwqv/multicoretsne_4b5d168de5e04f6c894a77b8595839b9/
#12 52.31 Complete output (26 lines):
#12 52.31 running install
#12 52.31 running build
#12 52.31 running build_py
#12 52.31 creating build
#12 52.31 creating build/lib.linux-x86_64-3.9
#12 52.31 creating build/lib.linux-x86_64-3.9/MulticoreTSNE
#12 52.31 copying MulticoreTSNE/__init__.py -> build/lib.linux-x86_64-3.9/MulticoreTSNE
#12 52.31 creating build/lib.linux-x86_64-3.9/MulticoreTSNE/tests
#12 52.31 copying MulticoreTSNE/tests/test_base.py -> build/lib.linux-x86_64-3.9/MulticoreTSNE/tests
#12 52.31 copying MulticoreTSNE/tests/__init__.py -> build/lib.linux-x86_64-3.9/MulticoreTSNE/tests
#12 52.31 running egg_info
#12 52.31 writing MulticoreTSNE.egg-info/PKG-INFO
#12 52.31 writing dependency_links to MulticoreTSNE.egg-info/dependency_links.txt
#12 52.31 writing requirements to MulticoreTSNE.egg-info/requires.txt
#12 52.31 writing top-level names to MulticoreTSNE.egg-info/top_level.txt
#12 52.31 reading manifest file 'MulticoreTSNE.egg-info/SOURCES.txt'
#12 52.31 reading manifest template 'MANIFEST.in'
#12 52.31 writing manifest file 'MulticoreTSNE.egg-info/SOURCES.txt'
#12 52.31 running build_ext
#12 52.31 cmake version 3.20.1
#12 52.31
#12 52.31 CMake suite maintained and supported by Kitware (kitware.com/cmake).
#12 52.31 CMake Error: Unknown argument --
#12 52.31 CMake Error: Run 'cmake --help' for all supported options.
#12 52.31
#12 52.31 ERROR: Cannot generate Makefile. See above errors.
#12 52.31 ----------------------------------------
#12 52.31 ERROR: Command errored out with exit status 1: /opt/conda/envs/ethos/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-110qdwqv/multicoretsne_4b5d168de5e04f6c894a77b8595839b9/setup.py'"'"'; __file__='"'"'/tmp/pip-install-110qdwqv/multicoretsne_4b5d168de5e04f6c894a77b8595839b9/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-5jx__mkf/install-record.txt --single-version-externally-managed --compile --install-headers /opt/conda/envs/ethos/include/python3.9/MulticoreTSNE Check the logs for full command output.
#12 52.31
#12 52.31
#12 52.31 CondaEnvException: Pip failed
#12 52.31
------
executor failed running [/bin/bash -c conda-env create -n ${env_nm} -f ./${yaml_nm} && conda clean --all --yes --force-pkgs-dirs && find /opt/conda/ -follow -type f -name '*.a' -delete && find /opt/conda/ -follow -type f -name '*.pyc' -delete && find /opt/conda/ -follow -type f -name '*.js.map' -delete && pip cache purge && rm -rf /home/$NB_USER/.cache/pip && rm ./${yaml_nm}]: exit code: 1
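For anyone who wants to fail early with a clearer message, the version line in the log above can be parsed and compared against the two data points from this report (3.18.2 built, 3.20.1 did not). This is only a sketch: the function name parse_cmake_version is made up, and where exactly between those versions the behaviour changed is not known from this report.

```python
import re


def parse_cmake_version(version_output):
    """Extract (major, minor, patch) from the text printed by `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no CMake version found in output")
    return tuple(int(part) for part in match.groups())


LAST_KNOWN_GOOD = (3, 18, 2)  # built successfully in this report

sample = "cmake version 3.20.1\n\nCMake suite maintained and supported by Kitware (kitware.com/cmake)."
version = parse_cmake_version(sample)
if version > LAST_KNOWN_GOOD:
    print("cmake %d.%d.%d is newer than the last version known to build" % version)
```

In a real check, the sample string would come from the standard output of running cmake --version through subprocess.run.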
Hello @DmitryUlyanov !
I have been trying to run TSNE on big datasets and so far it has been working great but I think I have reached your program's limit.
I have a huge dataset (3091356 x 1120), and as soon as I start to fit the data I get segmentation fault (core dumped).
I have the RAM required to run this. Is it possible that there is some sort of pointer or malloc error?
Thanks!
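Some back-of-the-envelope arithmetic for this case, in case it helps locate the problem. The sketch assumes the input is stored as 8-byte doubles and asks whether the total element count still fits in a signed 32-bit integer. Whether the library actually indexes its arrays with 32-bit integers is an assumption I have not verified; it is just one common way a run with enough RAM can still crash.

```python
INT32_MAX = 2**31 - 1


def describe_input(n_rows, n_cols, bytes_per_value=8):
    """Summarise element count, memory footprint and 32-bit index safety."""
    n_values = n_rows * n_cols
    return {
        "n_values": n_values,
        "gib": n_values * bytes_per_value / 2**30,
        "fits_int32": n_values <= INT32_MAX,
    }


info = describe_input(3091356, 1120)
print(info)
```

For this dataset the element count is larger than INT32_MAX, so any offset computed as row * n_cols in a 32-bit int would overflow.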
When I try to set n_components = 3
ts_2 = TSNE(n_components=3, n_jobs=4, perplexity=100, random_state=5, verbose=2)
This is the error message I get:
assert n_components == 2, 'n_components should be 2'
AssertionError: n_components should be 2
Hi @DmitryUlyanov, thanks for this great library! Just wanted to let you know that I created Ruby bindings for it. https://github.com/ankane/tsne
It looks like a deprecated function has been removed from matplotlib; the Python MNIST test example fails with this error.
Hello,
I'm trying to install MulticoreTSNE using pip install MulticoreTSNE, but I'm getting the following error:
/Users/jason/opt/miniconda3/lib/python3.8/site-packages/cmake/data/CMake.app/Contents/bin/cmake -E cmake_progress_start /private/var/folders/st/t3rpc2cn5m3c761j79yj7dm40000gn/T/pip-req-build-y8q7m_a5/build/temp.macosx-10.9-x86_64-3.8/CMakeFiles 0
installing to build/bdist.macosx-10.9-x86_64/wheel
running install
running install_lib
creating build/bdist.macosx-10.9-x86_64
creating build/bdist.macosx-10.9-x86_64/wheel
creating build/bdist.macosx-10.9-x86_64/wheel/MulticoreTSNE
copying build/lib.macosx-10.9-x86_64-3.8/MulticoreTSNE/libtsne_multicore.so -> build/bdist.macosx-10.9-x86_64/wheel/MulticoreTSNE
creating build/bdist.macosx-10.9-x86_64/wheel/MulticoreTSNE/tests
copying build/lib.macosx-10.9-x86_64-3.8/MulticoreTSNE/tests/__init__.py -> build/bdist.macosx-10.9-x86_64/wheel/MulticoreTSNE/tests
copying build/lib.macosx-10.9-x86_64-3.8/MulticoreTSNE/tests/test_base.py -> build/bdist.macosx-10.9-x86_64/wheel/MulticoreTSNE/tests
copying build/lib.macosx-10.9-x86_64-3.8/MulticoreTSNE/__init__.py -> build/bdist.macosx-10.9-x86_64/wheel/MulticoreTSNE
running install_egg_info
Copying MulticoreTSNE.egg-info to build/bdist.macosx-10.9-x86_64/wheel/MulticoreTSNE-0.1-py3.8.egg-info
running install_scripts
[WARNING] This wheel needs a higher macOS version than the version your Python interpreter is compiled against. To silence this warning, set MACOSX_DEPLOYMENT_TARGET to at least 11_0 or recreate these files with lower MACOSX_DEPLOYMENT_TARGET:
build/bdist.macosx-10.9-x86_64/wheel/MulticoreTSNE/libtsne_multicore.so
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/st/t3rpc2cn5m3c761j79yj7dm40000gn/T/pip-req-build-y8q7m_a5/setup.py", line 74, in <module>
setup(
File "/Users/jason/opt/miniconda3/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/Users/jason/opt/miniconda3/lib/python3.8/distutils/core.py", line 148, in setup
dist.run_commands()
File "/Users/jason/opt/miniconda3/lib/python3.8/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/Users/jason/opt/miniconda3/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/Users/jason/opt/miniconda3/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 328, in run
impl_tag, abi_tag, plat_tag = self.get_tag()
File "/Users/jason/opt/miniconda3/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 278, in get_tag
assert tag in supported_tags, "would build wheel with unsupported tag {}".format(tag)
AssertionError: would build wheel with unsupported tag ('cp38', 'cp38', 'macosx_11_0_x86_64')
----------------------------------------
ERROR: Failed building wheel for MulticoreTSNE
The version of python is 3.8.5, and the operating system is macOS Big Sur.
Any suggestions would be appreciated.
Hi.
Could you please try running your benchmark with scikit-learn 0.19.1?
Thanks!
Andy
Make sure you have cmake installed; otherwise the install will fail silently. This should be in the docs.
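A pre-flight check along these lines would turn the silent failure into an explicit one. This is just a sketch using the standard library's shutil.which; the helper name require_tool is made up and nothing like it exists in the project's setup script as far as I know.

```python
import shutil


def require_tool(name):
    """Return the full path of `name`, or raise RuntimeError if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError("%s not found on PATH; install it before building" % name)
    return path


try:
    print("using", require_tool("cmake"))
except RuntimeError as error:
    print(error)
```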
I downloaded and installed as per instructions, however on my 2017 MacBook Pro running High Sierra, the test script using MNIST always uses 1 core, regardless of the n_jobs parameter passed in.
I got a segmentation fault (core dumped) error.
This was at the start of the "Computing input similarities..." step.
Any ideas how to debug?
Thanks
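One low-effort first step for debugging is Python's built-in faulthandler module, which dumps the Python traceback when the process receives a fatal signal such as SIGSEGV. It will not show where inside the C++ library the crash happened (a debugger such as gdb is needed for that), but it does confirm which Python call triggered it. The helper name enable_crash_log and the log file name are made up for this sketch.

```python
import faulthandler
import os
import tempfile


def enable_crash_log(path):
    """Write Python tracebacks to `path` if the process hits a fatal signal."""
    handle = open(path, "w")
    faulthandler.enable(file=handle, all_threads=True)
    # The file object must stay open for as long as faulthandler is enabled.
    return handle


log_path = os.path.join(tempfile.gettempdir(), "tsne_crash.log")
crash_log = enable_crash_log(log_path)
print(faulthandler.is_enabled(), log_path)
```

Call this once at the top of the script, before fitting, and inspect the log file after a crash.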
I would like to request some form of status output.
For example, I have access to a machine with 40 cores and 100 GB of RAM and have been running Multicore-TSNE for a few days. It would be nice to get some output every now and again.
Hi,
I was trying to get TSNE running on Ubuntu in a Docker container. However, I am getting the error message below. Is there any way to get around this? Thanks!
user12@82cf25a0ccd7:~/Multicore-TSNE$ pip install .
File "/miniconda/lib/python3.5/site.py", line 176
file=sys.stderr)
^
SyntaxError: invalid syntax
Hi, is it possible to install this module through pip? It would greatly lower the barrier for new users.
OS: ubuntu
python version: Python 3.6.4 :: Anaconda, Inc.
downloading MNIST
downloaded
Traceback (most recent call last):
File "MulticoreTSNE/examples/test.py", line 81, in <module>
tsne = TSNE(n_jobs=int(args.n_jobs), verbose=1, n_components=args.n_components, random_state=660)
File "/home/marder/anaconda3/envs/mctsne/lib/python3.6/site-packages/MulticoreTSNE/__init__.py", line 63, in __init__
self.C = self.ffi.dlopen(path + "/libtsne_multicore.so")
File "/home/marder/anaconda3/envs/mctsne/lib/python3.6/site-packages/cffi/api.py", line 141, in dlopen
lib, function_cache = _make_ffi_library(self, name, flags)
File "/home/marder/anaconda3/envs/mctsne/lib/python3.6/site-packages/cffi/api.py", line 802, in _make_ffi_library
backendlib = _load_backend_lib(backend, libname, flags)
File "/home/marder/anaconda3/envs/mctsne/lib/python3.6/site-packages/cffi/api.py", line 797, in _load_backend_lib
raise OSError(msg)
OSError: cannot load library '/home/marder/anaconda3/envs/mctsne/lib/python3.6/site-packages/MulticoreTSNE/libtsne_multicore.so': /home/marder/anaconda3/envs/mctsne/lib/python3.6/site-packages/MulticoreTSNE/libtsne_multicore.so: undefined symbol: _ZNSt8ios_base4InitD1Ev. Additionally, ctypes.util.find_library() did not manage to locate a library called '/home/marder/anaconda3/envs/mctsne/lib/python3.6/site-packages/MulticoreTSNE/libtsne_multicore.so'
The algorithm does not seem to work properly if the target space is bigger than 2-dimensional. Is there a plan for extended functionality?
Hello,
I use Multicore-opt-TSNE on a large dataset (24,000,000 events and 18 parameters) on an Ubuntu server with 40 cores and 500 GB of RAM, with this command line:
python2 MulticoreTSNE/run/run_optsne.py --optsne --data Data.csv --outfile Data_tsne.csv --n_threads 40 --perp 50
and I get this error: Memory allocation failed!
Can you tell me whether changing something in my command line (my parameters) would give me a better chance of running the job, or what setup is needed to run Multicore-TSNE on data this large?
Best regards.
Quentin Barbier
I tried both pip install, as in the directions, and running the setup file directly, but it can't find my cmake.
jespinozlt-osx:Multicore-TSNE jespinoz$ ls
MANIFEST.in README.md mnist-tsne.png multicore_tsne python requirements.txt setup.py torch
jespinozlt-osx:Multicore-TSNE jespinoz$ which cmake
/Users/jespinoz/anaconda/bin/cmake
jespinozlt-osx:Multicore-TSNE jespinoz$ python setup.py install
running install
-- The C compiler identification is AppleClang 8.0.0.8000042
-- The CXX compiler identification is AppleClang 8.0.0.8000042
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Checking if C linker supports --verbose
-- Checking if C linker supports --verbose - no
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Checking if CXX linker supports --verbose
-- Checking if CXX linker supports --verbose - no
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Try OpenMP C flag = [-fopenmp=libomp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP C flag = [ ]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP C flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP C flag = [/openmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP C flag = [-Qopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP C flag = [-openmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP C flag = [-xopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP C flag = [+Oopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP C flag = [-qsmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP C flag = [-mp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [-fopenmp=libomp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [ ]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [/openmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [-Qopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [-openmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [-xopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [+Oopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [-qsmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
-- Try OpenMP CXX flag = [-mp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed
CMake Error at /Users/jespinoz/anaconda/share/cmake-3.6/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
Could NOT find OpenMP (missing: OpenMP_C_FLAGS OpenMP_CXX_FLAGS)
Call Stack (most recent call first):
/Users/jespinoz/anaconda/share/cmake-3.6/Modules/FindPackageHandleStandardArgs.cmake:388 (_FPHSA_FAILURE_MESSAGE)
/Users/jespinoz/anaconda/share/cmake-3.6/Modules/FindOpenMP.cmake:234 (find_package_handle_standard_args)
CMakeLists.txt:6 (FIND_PACKAGE)
-- Configuring incomplete, errors occurred!
See also "/Users/jespinoz/Multicore-TSNE/multicore_tsne/release/CMakeFiles/CMakeOutput.log".
See also "/Users/jespinoz/Multicore-TSNE/multicore_tsne/release/CMakeFiles/CMakeError.log".
cannot find cmake
Whilst the compilation and installation worked fine on Windows 8.1, running the code in Python results in
OSError: cannot load library \lib\site-packages\MulticoreTSNE/libtsne_multicore.so: error 0x7e
I guess that's because Windows expects a DLL rather than a .so library. Unfortunately, my CMake skills are not sufficient to adjust the current build instructions to also produce a .dll on Windows, so I'm hoping that someone can fix that.
Hello,
Both through the pip and git clone install I get this:
` ERROR: running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.7
creating build/lib.linux-x86_64-3.7/MulticoreTSNE
copying MulticoreTSNE/__init__.py -> build/lib.linux-x86_64-3.7/MulticoreTSNE
creating build/lib.linux-x86_64-3.7/MulticoreTSNE/tests
copying MulticoreTSNE/tests/__init__.py -> build/lib.linux-x86_64-3.7/MulticoreTSNE/tests
copying MulticoreTSNE/tests/test_base.py -> build/lib.linux-x86_64-3.7/MulticoreTSNE/tests
running egg_info
creating MulticoreTSNE.egg-info
writing MulticoreTSNE.egg-info/PKG-INFO
writing dependency_links to MulticoreTSNE.egg-info/dependency_links.txt
writing requirements to MulticoreTSNE.egg-info/requires.txt
writing top-level names to MulticoreTSNE.egg-info/top_level.txt
writing manifest file 'MulticoreTSNE.egg-info/SOURCES.txt'
reading manifest file 'MulticoreTSNE.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'MulticoreTSNE.egg-info/SOURCES.txt'
running build_ext
cmake version 3.14.4
CMake suite maintained and supported by Kitware (kitware.com/cmake).
-- The CXX compiler identification is unknown
CMake Error at CMakeLists.txt:1 (PROJECT):
No CMAKE_CXX_COMPILER could be found.
Tell CMake where to find the compiler by setting either the environment
variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
to the compiler, or to the compiler name if it is in the PATH.
-- Configuring incomplete, errors occurred!
See also "/tmp/pip-req-build-ztrw2075/build/temp.linux-x86_64-3.7/CMakeFiles/CMakeOutput.log".
See also "/tmp/pip-req-build-ztrw2075/build/temp.linux-x86_64-3.7/CMakeFiles/CMakeError.log".
ERROR: Cannot generate Makefile. See above errors`
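The CMake message above means that no C++ compiler was visible to the build. A quick way to check this from Python before retrying the install is sketched below. The candidate list and the helper name `find_cxx_compiler` are illustrative assumptions, and the check only loosely mirrors CMake's lookup (the `CXX` environment variable first, then `PATH`).

```python
import os
import shutil


def find_cxx_compiler(candidates=("c++", "g++", "clang++")):
    """Return the path of the first C++ compiler found, or None."""
    # CMake honours the CXX environment variable, so check it first.
    cxx = os.environ.get("CXX")
    if cxx:
        found = shutil.which(cxx)
        if found:
            return found
    # Fall back to common compiler names on PATH.
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    return None


compiler = find_cxx_compiler()
print(compiler or "no C++ compiler found; install one or set CXX")
```

If this prints no compiler, installing one through the system package manager, or pointing `CXX` at an existing one, is the usual next step before rerunning pip.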
Hello and thank you for creating this library!
Some students of mine got stuck using this library: they passed in data without imputing the NaN values, and the library simply crashed without any message indicating the nature of the error.
I will be making a pull request shortly to address this case and to add an exception that explains what needs to be fixed.
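Until such a check exists in the library, the input can be validated on the caller's side. The following is a minimal sketch of that idea, not the code of the pull request mentioned above; the helper name `check_finite` and the wording of the message are assumptions.

```python
import numpy as np


def check_finite(X):
    """Raise a descriptive error if X contains NaN or infinite values."""
    X = np.asarray(X, dtype=np.float64)
    bad = ~np.isfinite(X)
    if bad.any():
        raise ValueError(
            "Input contains %d NaN or infinite value(s); "
            "impute or drop them before fitting." % int(bad.sum())
        )
    return X


data = np.array([[1.0, 2.0], [np.nan, 4.0]])
try:
    check_finite(data)
except ValueError as err:
    print(err)
```

Calling `check_finite(X)` immediately before `fit_transform(X)` turns the silent crash into an ordinary Python exception.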
Hi!
Is it possible to add other metrics? Some data are not well suited to the default 'euclidean' one.
Many thanks!
I tried saving the model using sklearn joblib, but this error occurs: TypeError: can't pickle module objects.
Is this a Python pickle problem or a library issue? I did find that Python has had issues pickling multiprocessing instances: https://stackoverflow.com/questions/8804830/python-multiprocessing-pickling-error
Multicore-TSNE is the latest git master version, Python 3.6, Ubuntu 17.04
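This error is what pickle raises when an object holds a module (or a similar native handle) as an attribute. Whether the estimator here stores such a handle is an assumption; the sketch below uses a hypothetical `Wrapper` class, with the standard `math` module standing in for a loaded native library, to show the usual workaround of dropping the handle in `__getstate__` and re-acquiring it in `__setstate__`.

```python
import math
import pickle


class Wrapper:
    """Hypothetical estimator that keeps a module handle as an attribute."""

    def __init__(self, scale):
        self.scale = scale
        self._backend = math  # stands in for a loaded native library

    def __getstate__(self):
        # Drop the unpicklable handle; keep plain parameters only.
        state = self.__dict__.copy()
        del state["_backend"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._backend = math  # re-acquire the handle after loading


restored = pickle.loads(pickle.dumps(Wrapper(scale=2.0)))
print(restored.scale, restored._backend.sqrt(4.0))
```

Without the two special methods, `pickle.dumps` fails on the `_backend` attribute with a TypeError like the one reported above.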
This is a "forward" from RGLab/Rtsne.multicore#7
I observed that the results differ based on the number of threads specified.
In my application, which uses BH-SNE to create a 2D embedding followed by automated clustering using DBSCAN, I have replaced the single-threaded Rtsne call by a call to your multi-threaded Rtsne.multicore. This was nice and easy thanks to the similarity of both interfaces.
However, when I run the application, the results differ ever so slightly, as indicated below (just the first couple of points each time):
Using 1 thread
-4.3473001944841 -9.88816236259427
-0.264536173449281 2.26121958696939
-11.8037471711157 -1.23420653192463
18.5043209507443 -13.4638139443446
1.51823629529208 -27.2209786228982
8.44296382274354 11.5004388863181
17.0385503073606 -19.5842234534257
-1.80122124653633 -35.1542911986375
-14.9339466535662 11.4724805072396
-16.7179891732902 10.300907221322
Using 2 threads
-4.33102494052646 -9.94346771160292
-0.300330796745644 2.47627128482164
-14.4865548712467 3.83169546954971
18.0266761572745 -13.3481838170748
1.55009711170931 -27.3536683521347
8.57133969496983 11.704078885386
16.8146752705904 -19.4804761345993
-1.67702875389705 -35.6116919363096
-16.328562693303 10.9834569354747
-17.9212513482976 10.1738069116024
Using 3 threads
-4.15202535615338 -9.91628914440292
-0.266922842312901 2.30165398545058
-12.0458514750223 -1.26327092092668
18.3116039523395 -13.4472311793933
1.8728867702686 -27.0478452540983
8.21259960134093 11.338018514761
16.938103908809 -19.4664656504238
-1.51129210868152 -35.5926372619633
-15.7107052664802 10.622091607029
-16.9275577907434 10.5760540704756
Using 4 threads
-4.40493207317474 -10.2542865145978
-0.240311071414228 2.34386945654285
-11.613066543124 -1.22167721092907
17.978213066292 -13.6367838896947
1.68103298346623 -27.3950001130062
8.48320430773571 11.5841961868582
16.5975194709815 -19.6467988772466
-1.21063128661383 -35.6738754692542
-16.2962040171112 11.6000609166704
-16.4988660902924 10.7927849813962
The results using the same number of threads seem to be consistent between different runs, though, which is good at least :)
Using 1 thread - a second run
-4.3473001944841 -9.88816236259427
-0.264536173449281 2.26121958696939
-11.8037471711157 -1.23420653192463
18.5043209507443 -13.4638139443446
1.51823629529208 -27.2209786228982
8.44296382274354 11.5004388863181
17.0385503073606 -19.5842234534257
-1.80122124653633 -35.1542911986375
-14.9339466535662 11.4724805072396
-16.7179891732902 10.300907221322
And for all the points, computing the MD5SUM:
cat ./one_threads/one.bin.embedding.tsv | awk '{print $1,$2}' | gmd5sum
2410c2539be68ffe1f52d1be0f04bfac -
cat ./one_threads_old/one.bin.embedding.tsv | awk '{print $1,$2}' | gmd5sum
2410c2539be68ffe1f52d1be0f04bfac -
cat ./two_threads/two.bin.embedding.tsv | awk '{print $1,$2}' | gmd5sum
1f7dd4212d74b162420c79e619b3b91b -
cat ./three_threads/three.bin.embedding.tsv | awk '{print $1,$2}' | gmd5sum
f659b3527318c9545766fed14fc72daa -
cat ./four_threads/four.bin.embedding.tsv | awk '{print $1,$2}' | gmd5sum
0e7425b7acf3438d047fb1550bbd069f -
While the differences are hard to spot by eye in a 2D scatterplot, the automatic clustering is affected by them.
Your input is greatly appreciated!
I explored this further and here is a minimal working example:
library(Rtsne.multicore) # Load package
library(digest)
iris_unique <- unique(iris) # Remove duplicates
mat <- as.matrix(iris_unique[,1:4])
set.seed(42) # Sets seed for reproducibility
tsne_out1 <- Rtsne.multicore(mat, num_threads = 1) # Run TSNE
set.seed(42) # Sets seed for reproducibility
tsne_out1_2 <- Rtsne.multicore(mat, num_threads = 1) # Run TSNE
set.seed(42) # Sets seed for reproducibility
tsne_out2 <- Rtsne.multicore(mat, num_threads = 2) # Run TSNE
set.seed(42) # Sets seed for reproducibility
tsne_out2_2 <- Rtsne.multicore(mat, num_threads = 2) # Run TSNE
set.seed(42) # Sets seed for reproducibility
tsne_out3 <- Rtsne.multicore(mat, num_threads = 3) # Run TSNE
set.seed(42) # Sets seed for reproducibility
tsne_out3_2 <- Rtsne.multicore(mat, num_threads = 3) # Run TSNE
set.seed(42) # Sets seed for reproducibility
tsne_out4 <- Rtsne.multicore(mat, num_threads = 4) # Run TSNE
print(digest(tsne_out1))
print(digest(tsne_out1_2))
print(digest(tsne_out2))
print(digest(tsne_out2_2))
print(digest(tsne_out3))
print(digest(tsne_out3_2))
print(digest(tsne_out4))
and some demo output from Rstudio:
> source('~/.active-rstudio-document')
[1] "6adbcd6eb0106f49c7ac0a99eae369fc"
[1] "6adbcd6eb0106f49c7ac0a99eae369fc"
[1] "6269caaf71aca51ca57e2ead7425a14f"
[1] "6269caaf71aca51ca57e2ead7425a14f"
[1] "82974082989bc301349e03f3d9ee5c5b"
[1] "a8c779d9a4f54f2c14d84b624ffe9da9"
[1] "ccc0b4af068a4c2005504c0b1493e256"
> source('~/.active-rstudio-document')
[1] "6adbcd6eb0106f49c7ac0a99eae369fc"
[1] "6adbcd6eb0106f49c7ac0a99eae369fc"
[1] "6269caaf71aca51ca57e2ead7425a14f"
[1] "6269caaf71aca51ca57e2ead7425a14f"
[1] "b3479248cefc9b979521e13b25418223"
[1] "07dd9ce0d52e0cb0d1332f8d4849675c"
[1] "8b3a73318d64dd07f96ecdc2e06251d5"
As you can see, the results are consistent between different runs using the same number of threads (here for 1 or 2 threads) yet differ when using different numbers of threads.
Moreover, I am confused as to why the results for 3 threads and 4 threads are different between two runs, i.e., behave differently than 1 or 2 threads.
This is quite puzzling to me and your input is highly appreciated!
Best,
Cedric
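One plausible contributor to the thread-count dependence, offered here as an assumption rather than a diagnosis of this library, is that floating-point addition is not associative: a parallel reduction that splits a sum into per-thread partial sums adds the same numbers in a different order. The sketch below illustrates this in plain Python with a hypothetical `chunked_sum` helper; it does not by itself explain why the 3- and 4-thread runs also vary between runs.

```python
def chunked_sum(values, n_chunks):
    """Sum values by splitting them into n_chunks contiguous parts,
    summing each part, then adding the partial sums in order."""
    n = len(values)
    partials = []
    for i in range(n_chunks):
        part = 0.0
        for v in values[i * n // n_chunks:(i + 1) * n // n_chunks]:
            part += v
        partials.append(part)
    total = 0.0
    for p in partials:
        total += p
    return total


values = [0.1, 0.2, 0.3]
one = chunked_sum(values, 1)   # (0.1 + 0.2) + 0.3
two = chunked_sum(values, 2)   # 0.1 + (0.2 + 0.3)
print(one == two)              # False
print(abs(one - two) < 1e-15)  # True
```

Differences of this size are harmless in a single sum, but t-SNE's gradient descent runs for many iterations, so small discrepancies can grow into the visible shifts shown above.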
Would it be possible to include functionality to pass in a precomputed distance matrix?
This is great work! The speed is impressive!
Meanwhile, I notice that this version does not provide the KL divergence attribute, right? In Scikit-Learn you can easily get it from tsne.kl_divergence_
Also, it seems that the perplexity should be smaller than 1/3 of the number of data points. Is there any way to use a larger perplexity?
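The one-third limit comes from the Barnes-Hut approximation, which searches for 3 × perplexity nearest neighbors per point, so a dataset of n points needs n - 1 ≥ 3 × perplexity. That is the check used in the reference bhtsne code; whether this library enforces exactly the same inequality is an assumption. The helper names below are illustrative.

```python
def max_perplexity(n_points):
    """Largest perplexity satisfying n_points - 1 >= 3 * perplexity."""
    return (n_points - 1) / 3.0


def perplexity_ok(n_points, perplexity):
    """True if every point can have 3 * perplexity distinct neighbors."""
    return n_points - 1 >= 3 * perplexity


print(max_perplexity(151))     # 50.0
print(perplexity_ok(151, 50))  # True
print(perplexity_ok(151, 51))  # False
```

Under that constraint, a larger perplexity requires either more data points or an exact (non-Barnes-Hut) implementation.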
When I pull from today's trunk and attempt to install I get an error here.
/usr/bin/make -f CMakeFiles/tsne_multicore.dir/build.make CMakeFiles/tsne_multicore.dir/build
make[2]: Entering directory '/home/jorvis/git/Multicore-TSNE/build/temp.linux-x86_64-3.7'
[ 33%] Building CXX object CMakeFiles/tsne_multicore.dir/splittree.cpp.o
/usr/bin/c++ -Dtsne_multicore_EXPORTS -Wall -fopenmp -O3 -DNDEBUG -O3 -fPIC -ffast-math -funroll-loops -fPIC -o CMakeFiles
/tsne_multicore.dir/splittree.cpp.o -c /home/jorvis/git/Multicore-TSNE/multicore_tsne/splittree.cpp
/home/jorvis/git/Multicore-TSNE/multicore_tsne/splittree.cpp: In member function ‘void SplitTree::subdivide()’:
/home/jorvis/git/Multicore-TSNE/multicore_tsne/splittree.cpp:197:18: error: ‘mean_y’ was not declared in this scope
delete[] mean_y;
^
CMakeFiles/tsne_multicore.dir/build.make:65: recipe for target 'CMakeFiles/tsne_multicore.dir/splittree.cpp.o' failed
make[2]: *** [CMakeFiles/tsne_multicore.dir/splittree.cpp.o] Error 1
make[2]: Leaving directory '/home/jorvis/git/Multicore-TSNE/build/temp.linux-x86_64-3.7'
CMakeFiles/Makefile2:70: recipe for target 'CMakeFiles/tsne_multicore.dir/all' failed
make[1]: *** [CMakeFiles/tsne_multicore.dir/all] Error 2
make[1]: Leaving directory '/home/jorvis/git/Multicore-TSNE/build/temp.linux-x86_64-3.7'
Makefile:86: recipe for target 'all' failed
make: *** [all] Error 2
It looks like the failure might be happening in Multicore-TSNE/multicore_tsne/tsne.cpp, in the function evaluateError, where it is producing NaNs.
Willing to send $100USD in Bitcoin to the first person that can demonstrate a solution before I do.
I'd like to save the transformation at some range of intermediate iterations (or even every iteration if possible).
Something like this. Specifically for this sort of example animation.
Right now this is sort of possible by setting the random state and init and running from scratch each time with a different niter, but that's not exactly right.
I am using Python 3.7 and could not install MulticoreTSNE using conda or pip (pip install MulticoreTSNE), since it tries to downgrade a few installed packages, including Python itself (to 3.6.8). Below is the error message.
Should we expect MulticoreTSNE to be compatible with Python 3.7, or would you recommend installing 3.6.8? I would rather avoid the latter, since it means quite a bit of reloading.
Cheers
The following packages will be DOWNGRADED:
_ipyw_jlab_nb_ext~ 0.1.0-py37_0 --> 0.1.0-py36_0
louvain 0.6.1-py37h0a44026_2 --> 0.6.1-py36h0a44026_2
mkl-service 1.1.2-py37hfbe908c_5 --> 1.1.2-py36hfbe908c_5
navigator-updater 0.2.1-py37_0 --> 0.2.1-py36_0
pot 0.5.1-py37h1702cab_1000 --> 0.5.1-py36h1702cab_1000
pycairo 1.18.0-py37ha54c0a8_1000 --> 1.18.0-py36ha54c0a8_1000
pycurl 7.43.0.2-py37ha12b0ac_0 --> 7.43.0.2-py36ha12b0ac_0
pyqt 5.9.2-py37h655552a_2 --> 5.9.2-py36h655552a_2
pyreadr 0.1.9-py37h2573ce8_0 --> 0.1.9-py36h2573ce8_0
python 3.7.3-h359304d_0 --> 3.6.8-haf84260_0
python-igraph 0.7.1.post7-py37h01d97ff_0 --> 0.7.1.post7-py36h01d97ff_0
sphinxcontrib 1.0-py37_1 --> 1.0-py36_1
Wonderful work! The API is simply perfect. But I would like to know how to see the steps that have run, and how I can visualize every step as the run progresses.
Hi @DmitryUlyanov, thanks for this library. I was looking at the license and it looks like the copyright year and name aren't filled out.
Line 3 in 11b0cd7
In the source code, there are a few headers that mention:
* Created by Laurens van der Maaten.
* Copyright 2012, Delft University of Technology. All rights reserved.
*
* Multicore version by Dmitry Ulyanov, 2016. [email protected]
Also, readme mentions the license is inherited from bhtsne, but that repo uses original BSD license (4 clause) and this one uses BSD 3 Clause.
Can you provide some clarification on the licensing?
import numpy as np
print(np.log(np.nextafter(0, np.inf, dtype=np.float64)))
from MulticoreTSNE import MulticoreTSNE as TSNE
tsne = TSNE(n_jobs=6)
print(np.log(np.nextafter(0, np.inf, dtype=np.float64)))
Result:
# print(np.log(np.nextafter(0, np.inf, dtype=np.float64)))
-744.4400719213812
# from MulticoreTSNE import MulticoreTSNE as TSNE
# tsne = TSNE(n_jobs=6)
# print(np.log(np.nextafter(0, np.inf, dtype=np.float64)))
__main__:1: RuntimeWarning: divide by zero encountered in log
-inf
After calling tsne = TSNE(n_jobs=6), my numpy no longer works as intended.
How can I fix this?
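The value `np.nextafter(0, np.inf)` is a subnormal (denormal) float. Getting `-inf` from its logarithm suggests that the process has been switched into flush-to-zero mode, which can happen when a shared library built with `-ffast-math` is loaded. That this is the cause here is an assumption, not something confirmed in this thread. A small check, using only the standard library, shows whether subnormals still work in the current process; the function name is illustrative.

```python
import sys


def subnormals_enabled():
    """Return True if subnormal floats are preserved in this process."""
    smallest_normal = sys.float_info.min  # about 2.2e-308
    # Halving the smallest normal float yields a subnormal value,
    # which becomes 0.0 when flush-to-zero is active.
    return smallest_normal / 2 > 0.0


print(subnormals_enabled())  # True in a default Python process
```

Calling `subnormals_enabled()` once before and once after the import and the `TSNE(...)` call shows whether loading the library changes the floating-point mode.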
The title is pretty self-explanatory. I used your implementation successfully a few weeks ago and everything was perfect, but now that I have installed it on another machine, after a few iterations it starts spamming that particular error message. I have tried installing everything from scratch and nothing seems to work. I am using the same data as on the other machines.
Has something changed? It's pretty silly, but this is the only TSNE implementation I can find that won't take me a day per attempt.