faustomorales / pdqhash-python
Python bindings for Facebook's PDQ hash
License: MIT License
Recently, I tried to install pdqhash-python in a fresh virtual environment without installing any other dependencies first. I encountered an error when the install process attempted to execute pdqhash's setup.py.
❯ pip install pdqhash
Collecting pdqhash
Using cached pdqhash-0.2.1.tar.gz (638 kB)
ERROR: Command errored out with exit status 1:
command: /home/bodnarbm/.venv/pdqhash-python/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-kown38ok/pdqhash/setup.py'"'"'; __file__='"'"'/tmp/pip-install-kown38ok/pdqhash/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-kown38ok/pdqhash/pip-egg-info
cwd: /tmp/pip-install-kown38ok/pdqhash/
Complete output (5 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-kown38ok/pdqhash/setup.py", line 5, in <module>
import numpy
ModuleNotFoundError: No module named 'numpy'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Looking at the latest CI builds, I see that CI also fails initially because numpy is not installed, but pipenv then reattempts the install after installing the missing dependency.
I can also reproduce this inside some Docker Python containers, installing either from PyPI or from a local clone of the repo.
From PyPI:
❯ docker run -it python:3.6 pip install pdqhash
Collecting pdqhash
Downloading pdqhash-0.2.1.tar.gz (638 kB)
|████████████████████████████████| 638 kB 12.4 MB/s
ERROR: Command errored out with exit status 1:
command: /usr/local/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-2ops58sm/pdqhash/setup.py'"'"'; __file__='"'"'/tmp/pip-install-2ops58sm/pdqhash/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-g5ngyh5e
cwd: /tmp/pip-install-2ops58sm/pdqhash/
Complete output (5 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-2ops58sm/pdqhash/setup.py", line 5, in <module>
import numpy
ModuleNotFoundError: No module named 'numpy'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Local repo clone:
❯ docker run -v `pwd`:/opt/pdqhash-python -it python:3.6 pip install /opt/pdqhash-python
Processing /opt/pdqhash-python
ERROR: Command errored out with exit status 1:
command: /usr/local/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-x20n17b7/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-x20n17b7/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-vfnvcuxw
cwd: /tmp/pip-req-build-x20n17b7/
Complete output (5 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-req-build-x20n17b7/setup.py", line 5, in <module>
import numpy
ModuleNotFoundError: No module named 'numpy'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
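All three tracebacks above stop at the top-level `import numpy` in setup.py, so one possible workaround (an assumption, not something confirmed by the maintainers) is to make sure numpy is importable before installing pdqhash. A minimal pre-flight check using only the standard library might look like this; the helper name `missing_modules` is hypothetical:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # numpy is the module that setup.py imports in the tracebacks above.
    missing = missing_modules(["numpy"])
    if missing:
        print("Install these before pdqhash:", ", ".join(missing))
    else:
        print("numpy is importable; the setup.py import should succeed")
```

If numpy is reported missing, running `pip install numpy` before `pip install pdqhash` is the likely workaround for the 0.2.1 source distribution shown in these logs.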
This could create issues for anyone comparing hashes produced by different PDQ implementations.
The hashes in the test used to validate output are in reverse order (the order within each 16-bit section is correct).
facebook/ThreatExchange#319
This results in hashes like the following:
Expected:
e8ecb3355e3125c8e2ce3f30a0d4e84f8682b878b3c34cdbdb063278db27d992
Output:
d992db273278db064cdbb3c3b8788682e84fa0d43f30e2ce25c85e31b335e8ec
Hotfix that should work for users of the lib (still investigating):
import numpy as np
import pdqhash
hash_vector, quality = pdqhash.compute(image)
hash_vector = np.array(hash_vector).reshape(16, 16)[::-1, :].flatten()
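To make the relationship between the expected and actual hashes above concrete, here is a small sketch in plain Python that works on the hex strings rather than on the bit vector. The helper name `reverse_16bit_words` is hypothetical and not part of pdqhash. It reverses the order of the 16-bit words (4 hex digits each) while leaving each word intact, which appears to be the same transformation the hotfix applies to the bit vector:

```python
def reverse_16bit_words(hex_hash: str) -> str:
    """Reverse the order of 16-bit (4 hex digit) words, keeping each word intact."""
    if len(hex_hash) % 4 != 0:
        raise ValueError("hash length must be a multiple of 4 hex digits")
    words = [hex_hash[i:i + 4] for i in range(0, len(hex_hash), 4)]
    return "".join(reversed(words))


# The two hashes quoted in this issue.
expected = "e8ecb3355e3125c8e2ce3f30a0d4e84f8682b878b3c34cdbdb063278db27d992"
output = "d992db273278db064cdbb3c3b8788682e84fa0d43f30e2ce25c85e31b335e8ec"

# Reversing the word order of the reported output recovers the expected hash.
assert reverse_16bit_words(output) == expected
```

Applying the function twice returns the original string, so the same helper converts in either direction.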
Building pdqhash 0.2.2 fails in my environment with the following error.
Environment details:
OS: MacOS Monterey 12.5.1
Python: 3.10.6
➜ ~ pip3 inspect
...
"environment": {
"implementation_name": "cpython",
"implementation_version": "3.10.6",
"os_name": "posix",
"platform_machine": "arm64",
"platform_release": "21.6.0",
"platform_system": "Darwin",
"platform_version": "Darwin Kernel Version 21.6.0: Wed Aug 10 14:28:23 PDT 2022; root:xnu-8020.141.5~2/RELEASE_ARM64_T6000",
"python_full_version": "3.10.6",
"platform_python_implementation": "CPython",
"python_version": "3.10",
"sys_platform": "darwin"
}
Error message:
➜ ~ pip3 install pdqhash
Collecting pdqhash
Using cached pdqhash-0.2.2.tar.gz (638 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: pdqhash
Building wheel for pdqhash (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for pdqhash (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [32 lines of output]
Error in sitecustomize; set PYTHONVERBOSE for traceback:
AssertionError:
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-12-arm64-cpython-310
creating build/lib.macosx-12-arm64-cpython-310/tests
copying tests/__init__.py -> build/lib.macosx-12-arm64-cpython-310/tests
copying tests/test_compute.py -> build/lib.macosx-12-arm64-cpython-310/tests
creating build/lib.macosx-12-arm64-cpython-310/pdqhash
copying pdqhash/__init__.py -> build/lib.macosx-12-arm64-cpython-310/pdqhash
running egg_info
writing pdqhash.egg-info/PKG-INFO
writing dependency_links to pdqhash.egg-info/dependency_links.txt
writing top-level names to pdqhash.egg-info/top_level.txt
reading manifest file 'pdqhash.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '*.so' found under directory 'pdqhash'
warning: no previously-included files matching '*.dll' found under directory 'pdqhash'
warning: no previously-included files matching '*.cpp' found under directory 'pdqhash'
warning: no previously-included files matching '*.c' found under directory 'pdqhash'
writing manifest file 'pdqhash.egg-info/SOURCES.txt'
copying pdqhash/bindings.pyx -> build/lib.macosx-12-arm64-cpython-310/pdqhash
running build_ext
building 'pdqhash.bindings' extension
creating build/temp.macosx-12-arm64-cpython-310
creating build/temp.macosx-12-arm64-cpython-310/pdqhash
clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk -IThreatExchange -I/opt/homebrew/Cellar/python@3.10/3.10.6_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/numpy/core/include -I/opt/homebrew/Cellar/python@3.10/3.10.6_2/Frameworks/Python.framework/Versions/3.10/include/python3.10 -c pdqhash/bindings.cpp -o build/temp.macosx-12-arm64-cpython-310/pdqhash/bindings.o --std=c++11
clang: error: no such file or directory: 'pdqhash/bindings.cpp'
clang: error: no input files
error: command '/usr/bin/clang' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pdqhash
Failed to build pdqhash
ERROR: Could not build wheels for pdqhash, which is required to install pyproject.toml-based projects
Microsoft Windows [Version 10.0.19043.1889]
(c) Microsoft Corporation. All rights reserved.
C:\Users\ikena>pip install pdqhash
Defaulting to user installation because normal site-packages is not writeable
Collecting pdqhash
Downloading pdqhash-0.2.2.tar.gz (638 kB)
|████████████████████████████████| 638 kB 819 kB/s
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: pdqhash
Building wheel for pdqhash (pyproject.toml) ... error
ERROR: Command errored out with exit status 1:
command: 'c:\program files\python39\python.exe' 'C:\Users\ikena\AppData\Roaming\Python\Python39\site-packages\pip\_vendor\pep517\in_process\_in_process.py' build_wheel 'C:\Users\ikena\AppData\Local\Temp\tmpi6pt2pqf'
cwd: C:\Users\ikena\AppData\Local\Temp\pip-install-u_1i99om\pdqhash_5c052aba4bed420ea6c308716eb6c9b4
Complete output (77 lines):
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-39
creating build\lib.win-amd64-cpython-39\pdqhash
copying pdqhash\__init__.py -> build\lib.win-amd64-cpython-39\pdqhash
creating build\lib.win-amd64-cpython-39\tests
copying tests\test_compute.py -> build\lib.win-amd64-cpython-39\tests
copying tests\__init__.py -> build\lib.win-amd64-cpython-39\tests
running egg_info
writing pdqhash.egg-info\PKG-INFO
writing dependency_links to pdqhash.egg-info\dependency_links.txt
writing top-level names to pdqhash.egg-info\top_level.txt
reading manifest file 'pdqhash.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '*.so' found under directory 'pdqhash'
warning: no previously-included files matching '*.dll' found under directory 'pdqhash'
warning: no previously-included files matching '*.cpp' found under directory 'pdqhash'
warning: no previously-included files matching '*.c' found under directory 'pdqhash'
writing manifest file 'pdqhash.egg-info\SOURCES.txt'
copying pdqhash\bindings.pyx -> build\lib.win-amd64-cpython-39\pdqhash
running build_ext
cythoning pdqhash/bindings.pyx to pdqhash\bindings.cpp
warning: pdqhash\bindings.pyx:87:8: local variable 'quality' referenced before assignment
warning: pdqhash\bindings.pyx:90:41: local variable 'quality' referenced before assignment
warning: pdqhash\bindings.pyx:111:8: local variable 'quality' referenced before assignment
warning: pdqhash\bindings.pyx:112:50: local variable 'quality' referenced before assignment
warning: pdqhash\bindings.pyx:151:12: local variable 'quality' referenced before assignment
warning: pdqhash\bindings.pyx:162:7: local variable 'quality' referenced before assignment
warning: pdqhash\bindings.pyx:69:37: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:69:37: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:117:42: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:117:42: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:118:42: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:118:42: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:119:43: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:119:43: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:120:43: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:120:43: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:121:39: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:121:39: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:122:39: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:122:39: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:123:43: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:123:43: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:124:44: Not all members given for struct 'Hash256'
warning: pdqhash\bindings.pyx:124:44: Not all members given for struct 'Hash256'
building 'pdqhash.bindings' extension
creating build\temp.win-amd64-cpython-39
creating build\temp.win-amd64-cpython-39\Release
creating build\temp.win-amd64-cpython-39\Release\pdqhash
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.31103\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IThreatExchange -IC:\Users\ikena\AppData\Local\Temp\pip-build-env-ricrhsaz\overlay\Lib\site-packages\numpy\core\include "-Ic:\program files\python39\include" "-Ic:\program files\python39\Include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.31103\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.31103\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\cppwinrt" /EHsc /Tppdqhash\bindings.cpp /Fobuild\temp.win-amd64-cpython-39\Release\pdqhash\bindings.obj --std=c++11
cl : Command line warning D9002 : ignoring unknown option '--std=c++11'
bindings.cpp
C:\Users\ikena\AppData\Local\Temp\pip-build-env-ricrhsaz\overlay\Lib\site-packages\numpy\core\include\numpy\npy_1_7_deprecated_api.h(14) : Warning Msg: Using deprecated NumPy API, disable it with #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
ThreatExchange\pdq/cpp/common/pdqhashtypes.h(78): error C3861: '__builtin_popcount': identifier not found
ThreatExchange\pdq/cpp/common/pdqhashtypes.h(85): error C3861: '__builtin_popcount': identifier not found
ThreatExchange\pdq/cpp/common/pdqhashtypes.cpp(192): error C3861: 'random': identifier not found
ThreatExchange\pdq/cpp/hashing/pdqhashing.cpp(26): warning C4305: 'initializing': truncation from 'double' to 'float'
ThreatExchange\pdq/cpp/hashing/pdqhashing.cpp(27): warning C4305: 'initializing': truncation from 'double' to 'float'
ThreatExchange\pdq/cpp/hashing/pdqhashing.cpp(28): warning C4305: 'initializing': truncation from 'double' to 'float'
ThreatExchange\pdq/cpp/hashing/pdqhashing.cpp(294): warning C4244: 'initializing': conversion from 'float' to 'int', possible loss of data
ThreatExchange\pdq/cpp/hashing/pdqhashing.cpp(302): warning C4244: 'initializing': conversion from 'float' to 'int', possible loss of data
ThreatExchange\pdq/cpp/hashing/pdqhashing.cpp(533): warning C4244: 'initializing': conversion from 'double' to 'float', possible loss of data
ThreatExchange\pdq/cpp/hashing/pdqhashing.cpp(533): warning C4244: 'initializing': conversion from 'double' to 'const float', possible loss of data
ThreatExchange\pdq/cpp/hashing/pdqhashing.cpp(536): warning C4244: '=': conversion from 'double' to 'float', possible loss of data
ThreatExchange\pdq/cpp/downscaling/downscaling.cpp(18): warning C4305: 'initializing': truncation from 'double' to 'float'
ThreatExchange\pdq/cpp/downscaling/downscaling.cpp(19): warning C4305: 'initializing': truncation from 'double' to 'float'
ThreatExchange\pdq/cpp/downscaling/downscaling.cpp(20): warning C4305: 'initializing': truncation from 'double' to 'float'
pdqhash\bindings.cpp(2337): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
pdqhash\bindings.cpp(2346): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
pdqhash\bindings.cpp(2679): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
pdqhash\bindings.cpp(2688): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
pdqhash\bindings.cpp(3131): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
pdqhash\bindings.cpp(3140): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.31.31103\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
----------------------------------------
ERROR: Failed building wheel for pdqhash
Failed to build pdqhash
ERROR: Could not build wheels for pdqhash, which is required to install pyproject.toml-based projects
WARNING: You are using pip version 21.3.1; however, version 22.2.2 is available.
You should consider upgrading via the 'c:\program files\python39\python.exe -m pip install --upgrade pip' command.
C:\Users\ikena>
ERROR: Failed building wheel for pdqhash
Failed to build pdqhash
ERROR: Could not build wheels for pdqhash which use PEP 517 and cannot be installed directly