callidon / pyhdt
Read and query HDT documents with ease in Python
Home Page: https://callidon.github.io/pyHDT
License: MIT License
Is there a way of utilizing pyHDT to find triples that contain a given keyword? For instance, when working with the Semantic Web Dog Food HDT, I first load the data as below:
from hdt import HDTDocument
document = HDTDocument("swdf.hdt")
I am able to quickly find all the triples containing a given URI if I know that URI beforehand:
(triples, cardinality) = document.search_triples("", "", "http://data.semanticweb.org/person/christian-bizer")
However, there are scenarios where it would be useful to search the HDT for URIs containing a keyword or phrase of interest, such as christian-bizer. Is there a way of doing this with pyHDT?
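The question above can at least be approximated with a brute-force scan. The sketch below is not a built-in pyHDT feature: `triples_containing` is a hypothetical helper that filters any iterable of `(subject, predicate, object)` string tuples by substring, and the sample list stands in for real data so the block runs without an HDT file.

```python
from typing import Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]


def triples_containing(triples: Iterable[Triple], keyword: str) -> Iterator[Triple]:
    """Yield each triple whose subject, predicate, or object contains `keyword`.

    Hypothetical helper: a case-insensitive linear scan, not an indexed lookup.
    """
    needle = keyword.lower()
    for s, p, o in triples:
        if needle in s.lower() or needle in p.lower() or needle in o.lower():
            yield (s, p, o)


# Stand-in data (made up for illustration) instead of a real HDT file.
sample = [
    ("http://example.org/paper/1", "http://example.org/author",
     "http://data.semanticweb.org/person/christian-bizer"),
    ("http://data.semanticweb.org/person/christian-bizer",
     "http://example.org/name", '"Christian Bizer"'),
    ("http://example.org/paper/2", "http://example.org/author",
     "http://example.org/person/someone-else"),
]

matches = list(triples_containing(sample, "christian-bizer"))
for triple in matches:
    print(triple)
```

With a loaded document, the iterator returned by document.search_triples("", "", "") could be passed in place of `sample`, assuming it yields string tuples. The scan visits every triple, so it will be slow on large files.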
I'm trying to query over a large HDT file (LOD-a-lot, 28 billion triples), where the indexes are already generated.
When I load the HDT file, it does not return any error and it is relatively fast:
document = HDTDocument("data.hdt")
However, it seems that not all triples are loaded. Specifically, I expect the following command to return 28 billion triples, but it returns around 2.5 billion:
document.total_triples
I also verified from querying the data that indeed a number of triples were not loaded.
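One hypothesis worth ruling out, which nothing in this report confirms, is that the count passes through a 32-bit unsigned integer somewhere and wraps around. The sketch below is plain arithmetic and does not touch pyHDT; `wrapped_uint32` and `consistent_with_uint32_wrap` are hypothetical helper names, and the 28 billion figure is the rounded number from the report, not the exact triple count.

```python
UINT32_MODULUS = 2 ** 32  # number of distinct values in a 32-bit unsigned integer


def wrapped_uint32(count: int) -> int:
    """Return the value `count` would have after truncation to 32 unsigned bits."""
    return count % UINT32_MODULUS


def consistent_with_uint32_wrap(expected: int, reported: int) -> bool:
    """True if `reported` is exactly what a 32-bit wrap-around of `expected` gives."""
    return wrapped_uint32(expected) == reported


# Rounded figure from the report; the exact count is needed for a real check.
rounded_expected = 28_000_000_000
print(wrapped_uint32(rounded_expected))  # 2230196224
```

The rounded input gives roughly 2.2 billion, which is the same order of magnitude as the reported value. Calling `consistent_with_uint32_wrap` with the exact expected count and the exact value of document.total_triples would show whether the two match precisely; if they do not, the cause lies elsewhere.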
/dev/null here.
StdoutProgressListener to get some progress output from the HDTManager::loadXXX methods. In the same commit here.
Let me know if you find any of those useful and I'll send a PR. :)
In the newest commit, the search function is redefined.
If I have a graph containing
A,B,C
and I search for
X ? ?
I get all triples in the database as a result
A,B,C.
Under the 1.1.0 version and hdtSearch I get an empty set as the result.
I suspect that the resource to ID conversion results prior to searching might need to be checked.
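The suspicion above can be illustrated with a toy model. This is not pyHDT's or hdt-cpp's implementation: `ToyDictionary`, `search`, and the convention that ID 0 means "match anything" are assumptions made for this sketch. It shows why an unknown term has to be kept distinct from the wildcard during term-to-ID conversion, otherwise a pattern like X ? ? degrades into ? ? ?.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]
WILDCARD_ID = 0  # assumption for this toy model: ID 0 matches any term


class ToyDictionary:
    """Hypothetical term-to-ID mapping; real terms get IDs starting at 1."""

    def __init__(self, terms: Sequence[str]) -> None:
        self._ids: Dict[str, int] = {t: i for i, t in enumerate(terms, start=1)}

    def term_to_id(self, term: str) -> Optional[int]:
        if term == "":
            return WILDCARD_ID       # empty string is the wildcard
        return self._ids.get(term)   # None (not 0) when the term is unknown


def search(triples: List[Triple], dictionary: ToyDictionary,
           s: str, p: str, o: str) -> List[Triple]:
    pattern = [dictionary.term_to_id(t) for t in (s, p, o)]
    # The check in question: an unknown term can never match anything.
    if any(term_id is None for term_id in pattern):
        return []
    result = []
    for triple in triples:
        encoded = [dictionary.term_to_id(t) for t in triple]
        if all(want == WILDCARD_ID or want == have
               for want, have in zip(pattern, encoded)):
            result.append(triple)
    return result


graph = [("A", "B", "C")]
dictionary = ToyDictionary(["A", "B", "C"])
print(search(graph, dictionary, "X", "", ""))  # unknown subject
print(search(graph, dictionary, "A", "", ""))  # known subject
```

If `term_to_id` returned `WILDCARD_ID` for the unknown term X instead of `None`, the first query would return every triple, which is the behaviour described in this report.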
Hi, I am not able to import hdt.
I am not sure whether the prerequisite (gcc/clang with C++11 support) is met.
How can I check this on Windows?
How can I download it?
Logs:
cl : Command line warning D9002 : ignoring unknown option '-std=c++11'
triple_iterator.cpp
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -Iinclude -Iinclude -Iinclude/ -Ihdt-cpp-1.3.2/libhdt/include/ -Ihdt-cpp-1.3.2/libhdt/src/dictionary/ -Ihdt-cpp-1.3.2/libcds/include/ -Ihdt-cpp-1.3.2/libcds/src/static/bitsequence -Ihdt-cpp-1.3.2/libcds/src/static/coders -Ihdt-cpp-1.3.2/libcds/src/static/mapper -Ihdt-cpp-1.3.2/libcds/src/static/permutation -Ihdt-cpp-1.3.2/libcds/src/static/sequence -Ihdt-cpp-1.3.2/libcds/src/utils -Ic:\users\new\anaconda3\include -Ic:\users\new\anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\winrt" /EHsc /Tpsrc/tripleid_iterator.cpp /Fobuild\temp.win-amd64-3.7\Release\src/tripleid_iterator.obj -std=c++11
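The log shows a GCC/Clang-style flag, -std=c++11, being passed to Microsoft's cl.exe, which does not recognise it and reports warning D9002. cl.exe enables C++11 features by default and has no C++11-specific switch. Note that D9002 is only a warning, and the excerpt does not show the error that actually breaks the build. The sketch below shows one way a build script could choose flags per platform. It is not pyHDT's actual setup.py; `cxx11_compile_args` is a hypothetical helper, and it assumes that a Windows build uses MSVC.

```python
import sys
from typing import List


def cxx11_compile_args(platform: str = sys.platform) -> List[str]:
    """Return extra compiler flags that request C++11 on the given platform.

    Hypothetical helper. Assumes MSVC on Windows and GCC or Clang elsewhere.
    """
    if platform.startswith("win"):
        # cl.exe has no -std=c++11 switch; C++11 support is on by default.
        return []
    return ["-std=c++11"]


print(cxx11_compile_args("win32"))
print(cxx11_compile_args("linux"))
```

In a setuptools-based build, the returned list could be passed as the `extra_compile_args` argument of `setuptools.Extension`.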