
sparqlwrapper's Introduction


SPARQL Endpoint interface to Python


About

SPARQLWrapper is a simple Python wrapper around a SPARQL service to remotely execute your queries. It helps by creating the query invocation and, optionally, converting the result into a more manageable format.

Installation & Distribution

You can install SPARQLWrapper from PyPI:

$ pip install sparqlwrapper

You can install SPARQLWrapper from GitHub:

$ pip install git+https://github.com/rdflib/sparqlwrapper#egg=sparqlwrapper

You can install SPARQLWrapper from Debian:

$ sudo apt-get install python-sparqlwrapper

Note

Be aware that there could be a gap between the latest version of SPARQLWrapper and the version available as a Debian package.

Also, the source code of the package can be downloaded in .zip and .tar.gz formats from GitHub SPARQLWrapper releases. Documentation is included in the distribution.

How to use

You can use SPARQLWrapper either as a Python command line script or as a Python package.

Command Line Script

To use SPARQLWrapper as a command line script, you will need to install it; a command line script called rqw (spaRQl Wrapper) will then be available within the Python environment into which it is installed. Run $ rqw -h to see all the script's options.

Python package

Here is a series of examples of different queries executed via SPARQLWrapper as a Python package.

SELECT examples

Simple use of this module is as follows, where a live SPARQL endpoint is given and the JSON return format is used:

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper(
    "http://vocabs.ardc.edu.au/repository/api/sparql/"
    "csiro_international-chronostratigraphic-chart_geologic-time-scale-2020"
)
sparql.setReturnFormat(JSON)

# gets the first 3 geological ages
# from a Geological Timescale database,
# via a SPARQL endpoint
sparql.setQuery("""
    PREFIX gts: <http://resource.geosciml.org/ontology/timescale/gts#>

    SELECT *
    WHERE {
        ?a a gts:Age .
    }
    ORDER BY ?a
    LIMIT 3
    """
)

try:
    ret = sparql.queryAndConvert()

    for r in ret["results"]["bindings"]:
        print(r)
except Exception as e:
    print(e)

This should print out something like this:

{'a': {'type': 'uri', 'value': 'http://resource.geosciml.org/classifier/ics/ischart/Aalenian'}}
{'a': {'type': 'uri', 'value': 'http://resource.geosciml.org/classifier/ics/ischart/Aeronian'}}
{'a': {'type': 'uri', 'value': 'http://resource.geosciml.org/classifier/ics/ischart/Albian'}}

The above result is the response from the given endpoint, retrieved in JSON, and converted to a Python object, ret, which is then iterated over and printed.
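
Each binding in the converted result follows the SPARQL 1.1 Query Results JSON Format, so individual values can be pulled out by variable name. A minimal follow-on sketch, reusing the ret object from the example above (the variable name a comes from the query):

# each binding maps a variable name to a dict with "type" and "value" keys
for r in ret["results"]["bindings"]:
    print(r["a"]["value"])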

ASK example

This query gets a boolean response from DBpedia's SPARQL endpoint:

from SPARQLWrapper import SPARQLWrapper, XML

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    ASK WHERE { 
        <http://dbpedia.org/resource/Asturias> rdfs:label "Asturias"@es
    }    
""")
sparql.setReturnFormat(XML)
results = sparql.query().convert()
print(results.toxml())

You should see something like:

<?xml version="1.0" ?>
<sparql
    xmlns="http://www.w3.org/2005/sparql-results#"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/sw/DataAccess/rf1/result2.xsd">
<head/>
    <boolean>true</boolean>
</sparql>
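
If the endpoint can also serve the JSON results format, the boolean is available under the "boolean" key of the converted dictionary. A minimal sketch of the same ASK query using JSON (assuming the endpoint can return SPARQL JSON results, which DBpedia normally can):

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    ASK WHERE {
        <http://dbpedia.org/resource/Asturias> rdfs:label "Asturias"@es
    }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
print(results["boolean"])  # prints True or False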

CONSTRUCT example

CONSTRUCT queries return RDF, so queryAndConvert() here produces an RDFLib Graph object which is then serialized to the Turtle format for printing:

from SPARQLWrapper import SPARQLWrapper

sparql = SPARQLWrapper("http://dbpedia.org/sparql")

sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX sdo: <https://schema.org/>

    CONSTRUCT {
      ?lang a sdo:Language ;
      sdo:alternateName ?iso6391Code .
    }
    WHERE {
      ?lang a dbo:Language ;
      dbo:iso6391Code ?iso6391Code .
      FILTER (STRLEN(?iso6391Code)=2) # to filter out non-valid values
    }
    LIMIT 3
""")

results = sparql.queryAndConvert()
print(results.serialize())

Results from this query should look something like this:

@prefix schema: <https://schema.org/> .

<http://dbpedia.org/resource/Arabic> a schema:Language ;
    schema:alternateName "ar" .

<http://dbpedia.org/resource/Aragonese_language> a schema:Language ;
    schema:alternateName "an" .

<http://dbpedia.org/resource/Uruguayan_Spanish> a schema:Language ;
    schema:alternateName "es" .

DESCRIBE example

Like CONSTRUCT queries, DESCRIBE queries also produce RDF results, so this example produces an RDFLib Graph object which is then serialized into the JSON-LD format and printed:

from SPARQLWrapper import SPARQLWrapper

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("DESCRIBE <http://dbpedia.org/resource/Asturias>")

results = sparql.queryAndConvert()
print(results.serialize(format="json-ld"))

The result for this example is large but starts something like this:

[
    {
        "@id": "http://dbpedia.org/resource/Mazonovo",
        "http://dbpedia.org/ontology/subdivision": [
            {
                "@id": "http://dbpedia.org/resource/Asturias"
            }
    ],
...

SPARQL UPDATE example

UPDATE queries write changes to a SPARQL endpoint, so we can't easily show a working example here. However, if https://example.org/sparql really was a working SPARQL endpoint that allowed updates, the following code might work:

from SPARQLWrapper import SPARQLWrapper, POST, DIGEST

sparql = SPARQLWrapper("https://example.org/sparql")
sparql.setHTTPAuth(DIGEST)
sparql.setCredentials("some-login", "some-password")
sparql.setMethod(POST)

sparql.setQuery("""
    PREFIX dbp:  <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    WITH <http://example.graph>
    DELETE {
       dbp:Asturias rdfs:label "Asturies"@ast
    }
    WHERE {
       dbp:Asturias rdfs:label "Asturies"@ast
    }
    """
)

results = sparql.query()
print(results.response.read())

If the above code really worked, it would delete the triple dbp:Asturias rdfs:label "Asturies"@ast from the graph http://example.graph.
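
Inserting data works the same way. The following sketch shows the companion INSERT DATA operation, again under the assumption that https://example.org/sparql were a real endpoint accepting authenticated updates:

from SPARQLWrapper import SPARQLWrapper, POST, DIGEST

sparql = SPARQLWrapper("https://example.org/sparql")
sparql.setHTTPAuth(DIGEST)
sparql.setCredentials("some-login", "some-password")
sparql.setMethod(POST)

# INSERT DATA adds the triple to the named graph
sparql.setQuery("""
    PREFIX dbp:  <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    INSERT DATA {
        GRAPH <http://example.graph> {
            dbp:Asturias rdfs:label "Asturies"@ast
        }
    }
    """
)

results = sparql.query()
print(results.response.read())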

SPARQLWrapper2 example

There is also a SPARQLWrapper2 class that works with JSON SELECT results only and wraps the results to make processing of typical queries even simpler.

from SPARQLWrapper import SPARQLWrapper2

sparql = SPARQLWrapper2("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbp:  <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT ?label
    WHERE {
        dbp:Asturias rdfs:label ?label
    }
    LIMIT 3
    """
)

for result in sparql.query().bindings:
    print(f"{result['label'].lang}, {result['label'].value}")

The above should print out something like:

en, Asturias
ar, أشتورية
ca, Astúries

Return formats

The expected return formats differ per query type (SELECT, ASK, CONSTRUCT, DESCRIBE, ...).

Note

From the SPARQL specification, the response body of a successful query operation with a 2XX response is either:

  • SELECT and ASK: a SPARQL Results Document in XML, JSON, or CSV/TSV format.
  • DESCRIBE and CONSTRUCT: an RDF graph serialized, for example, in the RDF/XML syntax, or an equivalent RDF graph serialization.

Although the package does not contain a full SPARQL parser, it attempts to determine the query type when the query is set. This works in most cases, but the query type can also be set manually in case the detection goes wrong.
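
As a sketch of the manual override (this assumes that assigning to the queryType attribute after setQuery() is acceptable; the attribute is otherwise filled in by the automatic detection, so check the API documentation for your version):

from SPARQLWrapper import SPARQLWrapper, JSON, SELECT

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
# a leading comment mentioning CONSTRUCT has confused the detection in some versions
sparql.setQuery("""
    # CONSTRUCT is only mentioned in this comment
    SELECT ?s WHERE { ?s ?p ?o } LIMIT 1
""")
sparql.queryType = SELECT  # force the query type if the auto-detection got it wrong
sparql.setReturnFormat(JSON)
results = sparql.queryAndConvert()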

Automatic conversion of the results

To make processing somewhat easier, the package can do some conversions automatically from the return result. These are:

  • for XML, the xml.dom.minidom module is used to convert the result stream into a Python representation of a DOM tree.
  • for JSON, the json package is used to generate a Python dictionary.
  • for CSV or TSV, a simple string.
  • for RDF/XML and JSON-LD, the RDFLib package is used to convert the result into a Graph instance.
  • for RDF Turtle/N3, a simple string.

There are two ways to generate this conversion:

  • use ret.convert() on the result returned by sparql.query(), as in the code above
  • use sparql.queryAndConvert() to get the converted result right away, if the intermediate stream is not used

For example, in the code below:

try:
    sparql.setReturnFormat(SPARQLWrapper.JSON)
    ret = sparql.query()
    d = ret.convert()
except Exception as e:
    print(e)

the value of d is a Python dictionary of the query result, based on the SPARQL Query Results JSON Format.
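
Other return formats convert analogously. For instance, a minimal sketch requesting CSV (this assumes the CSV constant exported by recent SPARQLWrapper versions; depending on the version, the converted CSV result may be a str or bytes, hence the defensive decode):

from SPARQLWrapper import SPARQLWrapper, CSV

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("SELECT ?s WHERE { ?s ?p ?o } LIMIT 3")
sparql.setReturnFormat(CSV)

results = sparql.queryAndConvert()
if isinstance(results, bytes):  # some versions hand back raw bytes
    results = results.decode("utf-8")
print(results)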

Partial interpretation of the results

A further convenience is an extra, partial interpretation of the results, again covering most practical use cases. Based on the SPARQL Query Results JSON Format, the SPARQLWrapper.SmartWrapper.Bindings class can perform some simple steps in decoding the JSON return results. This result format is generated when SPARQLWrapper.SmartWrapper.SPARQLWrapper2 is used instead of SPARQLWrapper.Wrapper.SPARQLWrapper. Note that this relies on the JSON format only, i.e., it has to be checked whether the SPARQL service can return JSON or not.

Here is a simple piece of code that makes use of this feature:

from SPARQLWrapper import SPARQLWrapper2

sparql = SPARQLWrapper2("http://example.org/sparql")
sparql.setQuery("""
    SELECT ?subj ?prop
    WHERE {
        ?subj ?prop ?obj
    }
    """
)

try:
    ret = sparql.query()
    print(ret.variables)  # this is an array consisting of "subj" and "prop"
    for binding in ret.bindings:
        # each binding is a dictionary. Let us just print the results
        print(f"{binding['subj'].value}, {binding['subj'].type}")
        print(f"{binding['prop'].value}, {binding['prop'].type}")
except Exception as e:
    print(e)

To make this type of code even easier to write, the [] and in operators are also implemented on the result of SPARQLWrapper.SmartWrapper.Bindings. They can be used to check for and find a particular binding (i.e., a particular row in the return value). This feature becomes particularly useful when the OPTIONAL feature of SPARQL is used. For example:

from SPARQLWrapper import SPARQLWrapper2

sparql = SPARQLWrapper2("http://example.org/sparql")
sparql.setQuery("""
    SELECT ?subj ?obj ?opt
    WHERE {
        ?subj <http://a.b.c> ?obj .
        OPTIONAL {
            ?subj <http://d.e.f> ?opt
        }
    }
    """
)

try:
    ret = sparql.query()
    print(ret.variables)  # this is an array consisting of "subj", "obj", "opt"
    if ("subj", "prop", "opt") in ret:
        # there is at least one binding covering the optional "opt", too
        bindings = ret["subj", "obj", "opt"]
        # bindings is an array of dictionaries with the full bindings
        for b in bindings:
            subj = b["subj"].value
            o = b["obj"].value
            opt = b["opt"].value
            # do something nice with subj, o, and opt

    # another way of accessing the values of a single variable:
    # take all the bindings of the "subj"
    subjbind = ret.getValues("subj")  # an array of Value instances
    ...
except Exception as e:
    print(e)

GET or POST

By default, all SPARQL services are invoked using the HTTP GET verb. However, POST might be useful if the size of the query exceeds a reasonable limit; this can be set on the query instance.

Note that some combinations may not work yet with all SPARQL processors (e.g., there are implementations where POST + JSON return does not work). Hopefully, this problem will eventually disappear.
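
A minimal sketch of switching a query to POST (setMethod() is the same call used in the SPARQL UPDATE example above; setRequestMethod(POSTDIRECTLY) additionally sends the query as the request body rather than URL-encoded parameters, which not every endpoint accepts):

from SPARQLWrapper import SPARQLWrapper, JSON, POST, POSTDIRECTLY

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setMethod(POST)                 # use HTTP POST instead of the default GET
sparql.setRequestMethod(POSTDIRECTLY)  # post the query directly in the request body
sparql.setReturnFormat(JSON)
sparql.setQuery("SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")
results = sparql.queryAndConvert()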

SPARQL Endpoint Implementations

Introduction

From SPARQL 1.1 Specification:

The response body of a successful query operation with a 2XX response is either:

  • SELECT and ASK: a SPARQL Results Document in XML, JSON, or CSV/TSV format.
  • DESCRIBE and CONSTRUCT: an RDF graph serialized, for example, in the RDF/XML syntax, or an equivalent RDF graph serialization.

The fact is that the parameter key for choosing the output format is not defined by the specification: Virtuoso uses format, Fuseki uses output, Rasqal seems to use results, and so on. Also, in some cases HTTP Content Negotiation can or must be used.
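
When an endpoint needs such a vendor-specific parameter, it can be attached to the request. A hedged sketch (this assumes the addParameter() helper present in recent SPARQLWrapper versions; the output=json pair is just an example for a Fuseki-style endpoint):

from SPARQLWrapper import SPARQLWrapper

sparql = SPARQLWrapper("http://example.org/sparql")
sparql.setQuery("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
sparql.addParameter("output", "json")  # vendor-specific output-format key
results = sparql.query()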

ClioPatria

Website

The SWI-Prolog Semantic Web Server

Documentation

Search 'sparql' in http://cliopatria.swi-prolog.org/help/http.

Uses

Parameters and Content Negotiation.

Parameter key

format.

Parameter value

MUST be one of these values: rdf+xml, json, csv, application/sparql-results+xml or application/sparql-results+json.

Virtuoso

Website

OpenLink Virtuoso

Parameter key

format or output.

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

  • Parameter value, like directly: "text/html" (HTML), "text/x-html+tr" (HTML (Faceted Browsing Links)), "application/vnd.ms-excel", "application/sparql-results+xml" (XML), "application/sparql-results+json" (JSON), "application/javascript" (Javascript), "text/turtle" (Turtle), "application/rdf+xml" (RDF/XML), "text/plain" (N-Triples), "text/csv" (CSV), "text/tab-separated-values" (TSV)
  • Parameter value, like indirectly: "HTML" (alias text/html), "JSON" (alias application/sparql-results+json), "XML" (alias application/sparql-results+xml), "TURTLE" (alias text/rdf+n3), JavaScript (alias application/javascript) See http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSSparqlProtocol#Additional HTTP Response Formats -- SELECT
  • For a SELECT query type, the default return mimetype (if Accept: */* is sent) is application/sparql-results+xml
  • For an ASK query type, the default return mimetype (if Accept: */* is sent) is text/html
  • For a CONSTRUCT query type, the default return mimetype (if Accept: */* is sent) is text/turtle
  • For a DESCRIBE query type, the default return mimetype (if Accept: */* is sent) is text/turtle

Fuseki

Website

Fuseki

Uses

Parameters and Content Negotiation.

Parameter key

format or output (Fuseki 1, Fuseki 2).

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

Eclipse RDF4J

Website

Eclipse RDF4J (formerly known as OpenRDF Sesame)

Documentation

https://rdf4j.eclipse.org/documentation/rest-api/#the-query-operation, https://rdf4j.eclipse.org/documentation/rest-api/#content-types

Uses

Only content negotiation (no URL parameters).

Parameter

If an unexpected parameter is used, the server ignores it.

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

  • SELECT
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json (also application/json)
    • text/csv
    • text/tab-separated-values
    • Other values: application/x-binary-rdf-results-table
  • ASK
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json
    • Other values: text/boolean
    • Not supported: text/csv
    • Not supported: text/tab-separated-values
  • CONSTRUCT
    • application/rdf+xml
    • application/n-triples (DEFAULT if Accept: */* is sent)
    • text/turtle
    • text/n3
    • application/ld+json
    • Other acceptable values: application/n-quads, application/rdf+json, application/trig, application/trix, application/x-binary-rdf
    • text/plain (returns application/n-triples)
    • text/rdf+n3 (returns text/n3)
    • text/x-nquads (returns application/n-quads)
  • DESCRIBE
    • application/rdf+xml
    • application/n-triples (DEFAULT if Accept: */* is sent)
    • text/turtle
    • text/n3
    • application/ld+json
    • Other acceptable values: application/n-quads, application/rdf+json, application/trig, application/trix, application/x-binary-rdf
    • text/plain (returns application/n-triples)
    • text/rdf+n3 (returns text/n3)
    • text/x-nquads (returns application/n-quads)

RASQAL

Website

RASQAL

Documentation

http://librdf.org/rasqal/roqet.html

Parameter key

results.

JSON-LD (application/ld+json)

NOT supported.

Uses roqet as RDF query utility (see http://librdf.org/rasqal/roqet.html) For variable bindings, the values of FORMAT vary upon what Rasqal supports but include simple for a simple text format (default), xml for the SPARQL Query Results XML format, csv for SPARQL CSV, tsv for SPARQL TSV, rdfxml and turtle for RDF syntax formats, and json for a JSON version of the results.

For RDF graph results, the values of FORMAT are ntriples (N-Triples, default), rdfxml-abbrev (RDF/XML Abbreviated), rdfxml (RDF/XML), turtle (Turtle), json (RDF/JSON resource centric), json-triples (RDF/JSON triples) or rss-1.0 (RSS 1.0, also an RDF/XML syntax).

Marklogic

Website

Marklogic

Uses

Only content negotiation (no URL parameters).

JSON-LD (application/ld+json)

NOT supported.

You can use following methods to query triples:

  • SPARQL mode in Query Console. For details, see Querying Triples with SPARQL
  • XQuery using the semantics functions, and Search API, or a combination of XQuery and SPARQL. For details, see Querying Triples with XQuery or JavaScript.
  • HTTP via a SPARQL endpoint. For details, see Using Semantics with the REST Client API.

Formats are specified as part of the HTTP Accept headers of the REST request. When you query the SPARQL endpoint with REST Client APIs, you can specify the result output format (See https://docs.marklogic.com/guide/semantics/REST#id_54258. The response type format depends on the type of query and the MIME type in the HTTP Accept header.

This table describes the MIME types and Accept Header/Output formats (MIME type) for different types of SPARQL queries. (See https://docs.marklogic.com/guide/semantics/REST#id_54258 and https://docs.marklogic.com/guide/semantics/loading#id_70682)

  • SELECT
    • application/sparql-results+xml
    • application/sparql-results+json
    • text/html
    • text/csv
  • ASK queries return a boolean (true or false).
  • CONSTRUCT or DESCRIBE
    • application/n-triples
    • application/rdf+json
    • application/rdf+xml
    • text/turtle
    • text/n3
    • application/n-quads
    • application/trig

AllegroGraph

Website

AllegroGraph

Documentation

https://franz.com/agraph/support/documentation/current/http-protocol.html

Uses

Only content negotiation (no URL parameters).

Parameter

The server always looks at the Accept header of a request, and tries to generate a response in the format that the client asks for. If this fails, a 406 response is returned. When no Accept, or an Accept of */*, is specified, the server prefers text/plain, in order to make it easy to explore the interface from a web browser.

JSON-LD (application/ld+json)

NOT supported.

  • SELECT
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json (and application/json)
    • text/csv
    • text/tab-separated-values
    • OTHERS: application/sparql-results+ttl, text/integer, application/x-lisp-structured-expression, text/table, application/processed-csv, text/simple-csv, application/x-direct-upis
  • ASK
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json (and application/json)
    • Not supported: text/csv
    • Not supported: text/tab-separated-values
  • CONSTRUCT
    • application/rdf+xml (DEFAULT if Accept: */* is sent)
    • text/rdf+n3
    • OTHERS: text/integer, application/json, text/plain, text/x-nquads, application/trix, text/table, application/x-direct-upis
  • DESCRIBE
    • application/rdf+xml (DEFAULT if Accept: */* is sent)
    • text/rdf+n3

4store

Website

4store

Documentation

https://4store.danielknoell.de/trac/wiki/SparqlServer/

Uses

Parameters and Content Negotiation.

Parameter key

output.

Parameter value

an alias. If an unexpected alias is used, the server does not work properly.

JSON-LD (application/ld+json)

NOT supported.

  • SELECT
    • application/sparql-results+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json or application/json (alias json)
    • text/csv (alias csv)
    • text/tab-separated-values (alias tsv). Returns "text/plain" in GET.
    • Other values: text/plain, application/n-triples
  • ASK
    • application/sparql-results+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json or application/json (alias json)
    • text/csv (alias csv)
    • text/tab-separated-values (alias tsv). Returns "text/plain" in GET.
    • Other values: text/plain, application/n-triples
  • CONSTRUCT
    • application/rdf+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • text/turtle (alias "text")
  • DESCRIBE
    • application/rdf+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • text/turtle (alias "text")
Valid alias for SELECT and ASK

"json", "xml", csv", "tsv" (also "text" and "ascii")

Valid alias for DESCRIBE and CONSTRUCT

"xml", "text" (for turtle)

Blazegraph

Website

Blazegraph (Formerly known as Bigdata) & NanoSparqlServer

Documentation

https://wiki.blazegraph.com/wiki/index.php/REST_API#SPARQL_End_Point

Uses

Parameters and Content Negotiation.

Parameter key

format (available since version 1.4.0). Setting this parameter will override any Accept header that is present.

Parameter value

an alias. If an unexpected alias is used, the server does not work properly.

JSON-LD (application/ld+json)

NOT supported.

  • SELECT
    • application/sparql-results+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json or application/json (alias json)
    • text/csv
    • text/tab-separated-values
    • Other values: application/x-binary-rdf-results-table
  • ASK
    • application/sparql-results+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json or application/json (alias json)
  • CONSTRUCT
    • application/rdf+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • text/turtle (returns text/n3)
    • text/n3
  • DESCRIBE
    • application/rdf+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • text/turtle (returns text/n3)
    • text/n3
Valid alias for SELECT and ASK

"xml", "json"

Valid alias for DESCRIBE and CONSTRUCT

"xml", "json" (but it returns unexpected "application/sparql-results+json")

GraphDB

Website

GraphDB, formerly known as OWLIM (OWLIM-Lite, OWLIM-SE)

Documentation

https://graphdb.ontotext.com/documentation/free/

Uses

Only content negotiation (no URL parameters).

Note

If the Accept value is not within the expected ones, the server returns a 406 "No acceptable file format found."

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

  • SELECT
    • application/sparql-results+xml, application/xml (.srx file)
    • application/sparql-results+json, application/json (.srj file)
    • text/csv (DEFAULT if Accept: */* is sent)
    • text/tab-separated-values
  • ASK
    • application/sparql-results+xml, application/xml (.srx file)
    • application/sparql-results+json (DEFAULT if Accept: */* is sent), application/json (.srj file)
    • NOT supported: text/csv, text/tab-separated-values
  • CONSTRUCT
    • application/rdf+xml, application/xml (.rdf file)
    • text/turtle (.ttl file)
    • application/n-triples (.nt file) (DEFAULT if Accept: */* is sent)
    • text/n3, text/rdf+n3 (.n3 file)
    • application/ld+json (.jsonld file)
  • DESCRIBE
    • application/rdf+xml, application/xml (.rdf file)
    • text/turtle (.ttl file)
    • application/n-triples (.nt file) (DEFAULT if Accept: */* is sent)
    • text/n3, text/rdf+n3 (.n3 file)
    • application/ld+json (.jsonld file)

Stardog

Website

Stardog

Documentation

https://www.stardog.com/docs/#_http_headers_content_type_accept (looks outdated)

Uses

Only content negotiation (no URL parameters).

Parameter key

If an unexpected parameter is used, the server ignores it.

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

  • SELECT
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json
    • text/csv
    • text/tab-separated-values
    • Other values: application/x-binary-rdf-results-table
  • ASK
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json
    • Other values: text/boolean
    • Not supported: text/csv
    • Not supported: text/tab-separated-values
  • CONSTRUCT
    • application/rdf+xml
    • text/turtle (DEFAULT if Accept: */* is sent)
    • text/n3
    • application/ld+json
    • Other acceptable values: application/n-triples, application/x-turtle, application/trig, application/trix, application/n-quads
  • DESCRIBE
    • application/rdf+xml
    • text/turtle (DEFAULT if Accept: */* is sent)
    • text/n3
    • application/ld+json
    • Other acceptable values: application/n-triples, application/x-turtle, application/trig, application/trix, application/n-quads

Development

Requirements

The RDFLib package is used for RDF parsing.

This package is imported in a lazy fashion, i.e. only when needed. If the user never intends to use the RDF format, the RDFLib package is not imported and the user does not have to install it.

Source code

The source distribution contains:

  • SPARQLWrapper: the Python package. You should copy the directory somewhere into your PYTHONPATH. Alternatively, you can also run the distutils scripts: python setup.py install
  • test: some unit and integration tests. In order to run the tests, some packages have to be installed first, so please install the dev packages: pip install '.[dev]'
  • scripts: some scripts to run the package against some SPARQL endpoints.
  • docs: the documentation.

Community

Community support is available through the RDFLib developers' discussion group rdflib-dev. The archives from the old mailing list are still available.

Issues

Please report any issues on GitHub.

Documentation

The SPARQLWrapper documentation is available online.

Other interesting documents are the latest SPARQL 1.1 Specification (W3C Recommendation 21 March 2013) and the initial SPARQL Specification (W3C Recommendation 15 January 2008).

License

The SPARQLWrapper package is licensed under the W3C license.

Acknowledgement

The package was greatly inspired by Lee Feigenbaum's similar package for JavaScript.

Developers involved:

Organizations involved:

sparqlwrapper's People

Contributors

amarillion, aucampia, bcogrel, chrysn, cmarat, cottrell, danmichaelo, dayures, edwardbetts, eggplants, gromgull, hugovk, indeyets, joernhees, lamby, marcelometal, nicholascar, nicholsn, olberger, pandawill, satra, t0b3, trevorandersen, white-gecko, wikier


sparqlwrapper's Issues

packaging broken

Hi guys, seeing another pip install failure on our build server as of release 1.7.3

Mixlib::ShellOut::ShellCommandFailed

Expected process to exit with [0], but received '1'
---- Begin output of "bash" "/tmp/chef-script20151105-4556-o9zodr" ----
STDOUT: Collecting rdflib (from -r requirements.txt (line 1))
Downloading rdflib-4.2.1.tar.gz (889kB)
Collecting statsd (from -r requirements.txt (line 2))
Downloading statsd-3.2.1-py2.py3-none-any.whl
Collecting lshash==0.0.4dev (from -r requirements.txt (line 3))
Downloading lshash-0.0.4dev.tar.gz
Collecting knowsis.commons[stats] (from -r requirements.txt (line 4))
Downloading https://repo.fury.io/knowsis/
Collecting isodate (from rdflib->-r requirements.txt (line 1))
Downloading isodate-0.5.4.tar.gz
Collecting pyparsing (from rdflib->-r requirements.txt (line 1))
Downloading pyparsing-2.0.5-py2.py3-none-any.whl
Collecting SPARQLWrapper (from rdflib->-r requirements.txt (line 1))
Downloading SPARQLWrapper-1.7.3.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 20, in
File "/tmp/pip-build-uBrAyp/SPARQLWrapper/setup.py", line 45, in
requirements = list(parse_requirements('requirements.txt', session=PipSession()))
File "/home/knowsis/releases/7283cfc3f18343475889e947570c2f1a295f5c36/env/lib/python2.7/site-packages/pip/req/req_file.py", line 77, in parse_requirements
filename, comes_from=comes_from, session=session
File "/home/knowsis/releases/7283cfc3f18343475889e947570c2f1a295f5c36/env/lib/python2.7/site-packages/pip/download.py", line 416, in get_file_content
'Could not open requirements file: %s' % str(exc)
pip.exceptions.InstallationError: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'

QueryResult.convert() should be able to provide predictable result format

Currently, result of QueryResult.convert depends on the input type:

  • SPARQL_XML results in DOM object
  • SPARQL_JSON results in associative array (decoded json)
  • RDF_N3 results in string
  • RDF_XML and RDF_JSONLD result in rdflib.ConjunctiveGraph

It is low-level stuff, and while it works in a controlled environment (users of the library might arrive at a working workflow via trial and error), in real-world use cases the endpoint might give results in an unexpected format, which will be handled differently, and the user's application will just break because the result is of the wrong type.

convert() should have only 2 options for returned values (similar to what is listed in spec):

  1. SPARQL_* should return "SPARQL Results Document" in the form of object which gives access to metadata and provides iterator for getting rows. Both resources and literals should be provided as corresponding rdflib objects (I think #15 is just about this)
  2. RDF_* should return "RDF graph" in the form of rdflib.ConjunctiveGraph

So, in most cases, users of the library should never need to specify the desired response type. Whatever it is on the low level, sparqlwrapper should present it via proper high-level objects.

setCredentials is detached from endpoints

SPARQLWrapper takes endpoint and updateEndpoint URLs as parameters of constructor

Separately, it has a method named setCredentials which allows specifying a login/password used for Basic Authentication.

There are 2 problems with this:

  1. it is possible to have endpoints for queries and updates which require different credentials and current implementation doesn't provide good solution for this case
  2. endpoints might use different authentication schemes (Digest Authentication, OAuth, …)

I propose to separate 3 entities:

  • Set of classes which implement informal Authentication protocol (I wonder if we can reuse some existing package for this)
  • Endpoint class which implements SPARQL 1.1 Protocol which take Auth-object as parameter and takes care of HTTP part
  • Wrapper which takes 1 or 2 Endpoint-objects as parameters and provides user-friendly API

This approach will also allow us to add endpoint implementations which use custom triplestore-specific protocols as a bonus.

This is a large change, so should be implemented in 2.x branch.

SPARQL/Update query-types

Right now, 3 query-types for SPARQL/Update are defined: INSERT, DELETE and MODIFY, and the query-type is detected via a regular expression.

The problem is that there is no MODIFY keyword in the SPARQL/Update grammar, as a modify query is defined as ( 'WITH' iri )? ( DeleteClause InsertClause? | InsertClause ) UsingClause* 'WHERE' GroupGraphPattern.

On the other hand, there is a bunch of query-types unknown to SPARQLWrapper (LOAD, CLEAR, DROP, etc.)

setCredentials "Invalid header value" / "QueryBadFormed"

Using SPARQLWrapper 1.6.4 (pip).

When setCredentials() is used the resulting query request Authorization header seems to contain an erroneous trailing newline.

This results in an "Invalid header value" ValueError with Python 2.7, or a "QueryBadFormed" exception with Python 2.6.

The following fix to Wrapper.py seems to correct this behaviour:

# Replace this
request.add_header("Authorization", "Basic %s" % base64.encodestring(credentials.encode('utf-8')))
# With this
request.add_header("Authorization", "Basic %s" % base64.b64encode(credentials.encode('utf-8')))

Build system broken

There are several problems:

urs@speedy:~/p/RDFLib/sparqlwrapper$ pip install --user .
Unpacking /home/urs/p/RDFLib/sparqlwrapper
  Running setup.py (path:/tmp/pip-0_cPU0-build/setup.py) egg_info for package from file:///home/urs/p/RDFLib/sparqlwrapper
    SPARQLWrapper/Wrapper.py:101: RuntimeWarning: JSON-LD disabled because no suitable support has been found
      warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip-0_cPU0-build/setup.py", line 32, in <module>
        author = SPARQLWrapper__authors__,
    NameError: name 'SPARQLWrapper__authors__' is not defined
    Complete output from command python setup.py egg_info:
    SPARQLWrapper/Wrapper.py:101: RuntimeWarning: JSON-LD disabled because no suitable support has been found

  warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)

Traceback (most recent call last):

  File "<string>", line 17, in <module>

  File "/tmp/pip-0_cPU0-build/setup.py", line 32, in <module>

    author = SPARQLWrapper__authors__,

NameError: name 'SPARQLWrapper__authors__' is not defined

----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in /tmp/pip-0_cPU0-build
Storing debug log for failure in /home/urs/.pip/pip.log

There is a dot missing between SPARQLWrapper_ and __authors__.

Having fixed this, it works for Python 2, but running pip3 for Python 3 exposes another problem:

urs@speedy:~/p/RDFLib/sparqlwrapper$ pip3 install --user .
Unpacking /home/urs/p/RDFLib/sparqlwrapper
  Running setup.py (path:/tmp/pip-s1q1yjhc-build/setup.py) egg_info for package from file:///home/urs/p/RDFLib/sparqlwrapper
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip-s1q1yjhc-build/setup.py", line 24, in <module>
        import SPARQLWrapper
      File "/tmp/pip-s1q1yjhc-build/SPARQLWrapper/__init__.py", line 187, in <module>
        from Wrapper import SPARQLWrapper
    ImportError: No module named 'Wrapper'
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):

  File "<string>", line 17, in <module>

  File "/tmp/pip-s1q1yjhc-build/setup.py", line 24, in <module>

    import SPARQLWrapper

  File "/tmp/pip-s1q1yjhc-build/SPARQLWrapper/__init__.py", line 187, in <module>

    from Wrapper import SPARQLWrapper

ImportError: No module named 'Wrapper'

----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in /tmp/pip-s1q1yjhc-build
Storing debug log for failure in /home/urs/.pip/pip.log

It's probably a bad idea to import SPARQLWrapper before it has been transformed by 2to3, because setup.py is also run by Python 3. (It seems this particular error occurs because relative imports in Python 3 don't work without a leading dot anymore.)

allow for manual setting of Accept header or headers in general

as can be seen in http://stackoverflow.com/a/30306013/1423333 there are endpoints out there which don't understand the ",".join(_SPARQL_JSON) https://github.com/RDFLib/sparqlwrapper/blob/master/SPARQLWrapper/Wrapper.py#L451 Accept header, but only want "one" mime-type :(

in https://github.com/RDFLib/sparqlwrapper/blob/master/SPARQLWrapper/Wrapper.py#L506 we unconditionally add the Accept header.

I think it would be much better if one could manually set headers somehow and the _createRequest() method wouldn't override them if they're there already.

Could be done in one go when switching to the Requests lib (see #51)?

enhance warning for missing rdflib-jsonld

Currently a default install of SPARQLWrapper issues this warning for every user:

In [1]: from SPARQLWrapper import SPARQLWrapper, JSON
/usr/local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py:100: RuntimeWarning: JSON-LD disabled because no suitable support has been found
  warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)

Several issues with this:

  • seems weird to me... if i just installed something, then want to use it and the first thing it tells me is "Warning, i'm not complete" i get a bad feeling...
  • if missing jsonld support is worth a warning, why doesn't SPARQLWrapper depend on it so it's auto installed?
  • in any case the warning should tell me that i maybe want to install the rdflib-jsonld package... this might seem obvious to us, but newcomers might be quite confused without this hint.
  • if it's really optional (so we don't expect that a sane endpoint unasked replies with JSON-LD), could this warning maybe be delayed until someone tries to setReturnFormat(JSONLD)?

release notes?

Is there anywhere I can find a concise set of release notes for v1.7.0?

I got an error when using setCredentials

ValueError: Invalid header value 'Basic pokwepdokwo\n'

pokwepdokwo is the actual encoding of user:password that I also see in the Authorization field of the request header, for example in Chrome, so there is a wrongfully added \n at the end. Is this an error, or am I doing something wrong?

Unicode problems

There are still some unicode related issues in 1.6.2. I should have spotted them sooner,
sorry for that. The most problematic one is caused by urllib.urlencode
handling unicode objects the wrong way. I wasn't aware of this issue.

Personally, I would make sure that internally, you only have unicode objects,
i.e. that SPARQLWrapper.setQuery and similar methods convert str objects to
unicode objects. Then, decode them before applying urlencode and assembling
the HTTP request. I hope 2to3 is then able to apply the correct transformations.

Python 2.7.6 (default, Mar 22 2014, 15:40:47) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from SPARQLWrapper import SPARQLWrapper, XML, POST, GET, URLENCODED, POSTDIRECTLY
/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py:100: RuntimeWarning: JSON-LD disabled because no suitable support has been found
  warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)
>>> uquery = u'INSERT DATA { <urn:michel> <urn:says> "é" }'
>>> query = uquery.encode('UTF-8')
>>> uquery
u'INSERT DATA { <urn:michel> <urn:says> "\xe9" }'
>>> query
'INSERT DATA { <urn:michel> <urn:says> "\xc3\xa9" }'
>>> wrapper = SPARQLWrapper('http://localhost:3030/ukpp/sparql', 'http://localhost:3030/ukpp/update')

POSTDIRECTLY only works for unicode objects. Except for the unclear error
message, this is not necessarily wrong, because a SPARQL query is in Unicode
and the SPARQL protocol mandates UTF-8 as charset.

>>> wrapper.setMethod(POST)
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
<SPARQLWrapper.Wrapper.QueryResult object at 0x7f7513e5c450>
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(query)
>>> wrapper.query()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 515, in query
    return QueryResult(self._query())
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 483, in _query
    request = self._createRequest()
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 442, in _createRequest
    request.data = self.queryString.encode('UTF-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 39: ordinal not in range(128)

When using URLENCODED, it doesn't work with a unicode object, because for
some reason, urllib.urlencode can't handle unicode objects correctly.

>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 515, in query
    return QueryResult(self._query())
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 483, in _query
    request = self._createRequest()
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 448, in _createRequest
    request.data = urllib.urlencode(parameters, True)
  File "/usr/lib/python2.7/urllib.py", line 1357, in urlencode
    l.append(k + '=' + quote_plus(str(elt)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 39: ordinal not in range(128)
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(query)
>>> wrapper.query()
<SPARQLWrapper.Wrapper.QueryResult object at 0x7f7513e5c290>

The same test with Python 3:

Python 3.4.1 (default, Jul  6 2014, 20:01:46) 
[GCC 4.9.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from SPARQLWrapper import SPARQLWrapper, XML, POST, GET, URLENCODED, POSTDIRECTLY
/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py:100: RuntimeWarning: JSON-LD disabled because no suitable support has been found
  warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)
>>> uquery = 'INSERT DATA { <urn:michel> <urn:says> "é" }'
>>> query = uquery.encode('UTF-8')
>>> uquery
'INSERT DATA { <urn:michel> <urn:says> "é" }'
>>> query
b'INSERT DATA { <urn:michel> <urn:says> "\xc3\xa9" }'
>>> wrapper = SPARQLWrapper('http://localhost:3030/ukpp/sparql', 'http://localhost:3030/ukpp/update')
>>> wrapper.setMethod(POST)
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
<SPARQLWrapper.Wrapper.QueryResult object at 0x7f8587444048>
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(query)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 319, in setQuery
    self.queryType   = self._parseQueryType(query)
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 335, in _parseQueryType
    query = re.sub(re.compile("#.*?\n" ), "" , query) # remove all occurance singleline comments (issue #32)
  File "/usr/lib/python3.4/re.py", line 175, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: can't use a string pattern on a bytes-like object
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 515, in query
    return QueryResult(self._query())
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 485, in _query
    response = urlopener(request)
  File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 453, in open
    req = meth(req)
  File "/usr/lib/python3.4/urllib/request.py", line 1120, in do_request_
    raise TypeError(msg)
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(query)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 319, in setQuery
    self.queryType   = self._parseQueryType(query)
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 335, in _parseQueryType
    query = re.sub(re.compile("#.*?\n" ), "" , query) # remove all occurance singleline comments (issue #32)
  File "/usr/lib/python3.4/re.py", line 175, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: can't use a string pattern on a bytes-like object

SmartWrapper.Value should have __repr__

When using the SmartWrapper, I found it discouraging to get output that looked like this:

>>> results.bindings
[{u'o': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f610>,
  u'p': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f5d0>,
  u's': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f550>},
 {u'o': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f6d0>,
  u'p': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f690>,
  u's': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f650>}]

Defining __repr__ on SmartWrapper.Value helps a lot. Here's one implementation:

def __repr__(self):
    cls = self.__class__.__name__
    return "%s(%s:%r)" % (cls, self.type, self.value)

which provides enough transparency to see what the underlying value is but still makes it clear that you're looking at a wrapped Value.

python3: nosetests fails with inability to import Wrapper

Trying to package SPARQLWrapper 1.5.2 on Fedora 20; while the Python 2.7 flavour packages just fine, the Python 3 version of nosetests fails with the following error (replicated in a fresh virtualenv using Python 3.3 and running "nosetests-3.3" in the untarred SPARQLWrapper 1.5.2 source directory):

~/rpmbuild/BUILD/python3-python-SPARQLWrapper-1.5.2-1.fc20 ~/rpmbuild/BUILD/SPARQLWrapper-1.5.2
+ nosetests-3.3
E
======================================================================
ERROR: Failure: ImportError (No module named 'Wrapper')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python3.3/site-packages/nose/failure.py", line 38, in runTest
raise self.exc_val.with_traceback(self.tb)
File "/usr/lib/python3.3/site-packages/nose/loader.py", line 413, in loadTestsFromName
addr.filename, addr.module)
File "/usr/lib/python3.3/site-packages/nose/importer.py", line 47, in importFromPath
return self.importFromDir(dir_path, fqname)
File "/usr/lib/python3.3/site-packages/nose/importer.py", line 94, in importFromDir
mod = load_module(part_fqname, fh, filename, desc)
File "/usr/lib64/python3.3/imp.py", line 185, in load_module
return load_package(name, filename)
File "/usr/lib64/python3.3/imp.py", line 155, in load_package
return _bootstrap.SourceFileLoader(name, path).load_module(name)
File "", line 586, in _check_name_wrapper
File "", line 1024, in load_module
File "", line 1005, in load_module
File "", line 562, in module_for_loader_wrapper
File "", line 870, in _load_module
File "", line 313, in call_with_frames_removed
File "/home/makerpm/rpmbuild/BUILD/python3-python-SPARQLWrapper-1.5.2-1.fc20/SPARQLWrapper/_init
.py", line 185, in
from Wrapper import SPARQLWrapper, XML, JSON, TURTLE, N3, RDF, GET, POST, SELECT, CONSTRUCT, ASK, DESCRIBE
ImportError: No module named 'Wrapper'

New error in existing notebook

Hello,
I developed a notebook six months ago and it was working perfectly. Yesterday I have to run it again, getting this error.

QueryBadFormed: QueryBadFormed: a bad request has been sent to the endpoint, probably the sparql query is bad formed.

Response:
Could not properly handle "PREFIX rdf%3A %3Chttp%3A//www." in ARC2_SPARQLPlusParser

Heres is the code:

prefixes = """PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX dc: <http://purl.org/dc/terms/> 
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>  
PREFIX dcat-ext: <http://vocabularies.aginfra.eu/dcatext#> 
PREFIX schema: <http://schema.org/>"""

endpoint = "http://ring.ciard.net/sparql1" #put your endpoint here

sparql = SPARQLWrapper2(endpoint)

sparql.setQuery(prefixes + """
SELECT ?dataset ?description
WHERE {
    ?dataset rdf:type dcat:Dataset .
    ?dataset dc:description ?description .
} 
""")

#sparql.setReturnFormat(JSON)
ret = sparql.query()

Thanks a lot

Design considerations towards 2.x

It's not the right time, but just because the discussion at issue #36, I'd like to share this idea/methodology I had with the whole community.

The current SPARQLWrapper 1.x is pretty old, originally designed in 2007, and maintained since then with important evolution at different levels (SPARQL 1.1 Protocol, RDFLib 4.x, Python 3.x). So at some point we should start to think about a major version 2.x with a renewed API. And I'd like to follow a community-driven process that could satisfy everybody. That means not only the authors (@iherman, @dayures, @indeyets and myself) would push their ideas, but everybody who uses the library should feed this process.

Currently there is no date for such a milestone. This is just the starting point...

Fix epydoc warnings

Running epydoc (make doc) reveals some minor mistakes in the documentation that should be fixed.

SPARQLWrapper 1.6.3 doesn't work in py3

SPARQLWrapper/__init__.py is utf-8 encoded but setup.py tries to read it as ASCII, thus havoc ensues.

Downloading from URL https://pypi.python.org/packages/source/S/SPARQLWrapper/SPARQLWrapper-1.6.3.tar.gz#md5=c78f6de1f61570577632e29692a1b029 (from https://pypi.python.org/simple/SPARQLWrapper/)
  Running setup.py (path:/tmp/pip_build_root/SPARQLWrapper/setup.py) egg_info for package SPARQLWrapper
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip_build_root/SPARQLWrapper/setup.py", line 38, in <module>
        for line in open('SPARQLWrapper/__init__.py'):
      File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7672: ordinal not in range(128)
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):

  File "<string>", line 17, in <module>

  File "/tmp/pip_build_root/SPARQLWrapper/setup.py", line 38, in <module>

    for line in open('SPARQLWrapper/__init__.py'):

  File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode

    return codecs.ascii_decode(input, self.errors)[0]

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7672: ordinal not in range(128)

JSON-LD results support

SPARQL-endpoints have support for JSON-LD output for CONSTRUCT queries these days (list consists at least of Virtuoso and StarDog).

sparqlwrapper supports only N3/Turtle and XML.

[Q] release process/policy?

@wikier what is your approach to releases?

the changelog is already quite large, but code wasn't seriously tested (I guess I'll switch my efforts to the testing). If tests prove that code works, then this version will be suitable for my use-case.

Did you see the topic in the mailing list? (by the way: is it better to post questions here or there?) Do you think that is a good idea? If it is, should it be implemented in 1.6 or 1.7 would be better? (I suppose you use something like http://semver.org/ — correct me if I'm wrong)

nosetests3 fails with invalid syntax exception

Hi.

Here's the result of nosetests3 passed during a build script for Debian package of 1.6.0 :

$ nosetests3
E
======================================================================
ERROR: Failure: SyntaxError (invalid syntax (wrapper_test.py, line 246))
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/failure.py", line 39, in runTest
    raise self.exc_val.with_traceback(self.tb)
  File "/usr/lib/python3/dist-packages/nose/loader.py", line 414, in loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/python3/dist-packages/nose/importer.py", line 47, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/python3/dist-packages/nose/importer.py", line 94, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "/usr/lib/python3.3/imp.py", line 180, in load_module
    return load_source(name, filename, file)
  File "/usr/lib/python3.3/imp.py", line 119, in load_source
    _LoadSourceCompatibility(name, pathname, file).load_module(name)
  File "<frozen importlib._bootstrap>", line 584, in _check_name_wrapper
  File "<frozen importlib._bootstrap>", line 1022, in load_module
  File "<frozen importlib._bootstrap>", line 1003, in load_module
  File "<frozen importlib._bootstrap>", line 560, in module_for_loader_wrapper
  File "<frozen importlib._bootstrap>", line 853, in _load_module
  File "<frozen importlib._bootstrap>", line 980, in get_code
  File "<frozen importlib._bootstrap>", line 313, in _call_with_frames_removed
  File "/home/olivier/svn/svn.debian.org/python-modules/packages/git_sparqlwrapper/build-area/sparql-wrapper-python-1.6.0/.pybuild/pythonX.Y_3.3/build/test/wrapper_test.py", line 246
    except QueryBadFormed, e:
                         ^
SyntaxError: invalid syntax

----------------------------------------------------------------------
Ran 1 test in 0.045s

FAILED (errors=1)

Query type detection breaks on prefixes that contain '#' character

tried to run this query:

            PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
            PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
            PREFIX weather: <http://hal.zamia.org/weather/>
            PREFIX dbo:     <http://dbpedia.org/ontology/> 
            PREFIX dbr:     <http://dbpedia.org/resource/> 
            PREFIX dbp:     <http://dbpedia.org/property/> 
            PREFIX xml:     <http://www.w3.org/XML/1998/namespace> 
            PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#> 
        
       SELECT DISTINCT ?location ?cityid ?timezone ?label
       WHERE {
          ?location weather:cityid ?cityid .
          ?location weather:timezone ?timezone .
          ?location rdfs:label ?label .
       }

unfortunately the fix for bug #32

query = re.sub(re.compile("#.*?\n" ), "" , query) # remove all occurance singleline comments (issue #32)

will break prefixes and turn my query into this:

PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns            PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema            PREFIX weather: <http://hal.zamia.org/weather/>
            PREFIX dbo:     <http://dbpedia.org/ontology/> 
            PREFIX dbr:     <http://dbpedia.org/resource/> 
            PREFIX dbp:     <http://dbpedia.org/property/> 
            PREFIX xml:     <http://www.w3.org/XML/1998/namespace> 
            PREFIX xsd:     <http://www.w3.org/2001/XMLSchema        
       SELECT DISTINCT ?location ?cityid ?timezone ?label
       WHERE {
          ?location weather:cityid ?cityid .
          ?location weather:timezone ?timezone .
          ?location rdfs:label ?label .
       }

which is not only broken in itself but will actually send

self.pattern.search(query)

into what appears to be an endless loop

query type detection does not properly ignore comments in SPARQL

When a query string starts with a comment in which a keyword such as CONSTRUCT is commented out, the query_type is still set to CONSTRUCT.

For example, this query should have query_type='SELECT', but v1.6.1 detects it as CONSTRUCT.

#CONSTRUCT {?s ?p ?o} 
SELECT ?s ?p ?o
WHERE {?s ?p ?o}

rdflib.RDF and SPARQLWrapper.RDF aren't to be confused

I wonder whether there could be some way to unify SPARQLWrapper's RDF and rdflib's.

type(rdflib.RDF)
rdflib.namespace._RDFNamespace

type (SPARQLWrapper.RDF)
str

Maybe I'm the only one who would:
from rdflib import *
from SPARQLWrapper import *

Hope this helps.

Best regards,

query type detection does not work with non-newline queries

Unfortunately the pattern to remove single-line comments (introduced in issue #32) causes single-line queries' type to not be recognised. E.g. if one has the query

prefix whatever: <http://example.org/blah#> ASK { ... }

.. then the "#.*?\n" pattern used (to remove comments in _parseQueryType) prunes the query to:

prefix whatever: <http://example.org/blah#

(Off topic: We strip newlines & indentation whitespace before passing to sparqlwrapper so that Fuseki server query log output stays sane and to reduce overall query size.)

POSTDIRECTLY and unicode endpoint / unicode_literals cause UnicodeDecodeError in urllib2

This caught me a bit off-guard when using from __future__ import unicode_literals, which causes all string literals to be unicode strings by default; usually that is a good idea...

However, without much thinking, this led me to specify the endpoint as u'http://localhost:3030/db/sparql'.

Try running this against a local Fuseki server started with fuseki-server --mem /db:

import SPARQLWrapper as sw
sparql = sw.SPARQLWrapper(u'http://localhost:3030/db/sparql')
sparql.setReturnFormat(sw.JSON)
sparql.setMethod(sw.POST)
sparql.setRequestMethod(sw.POSTDIRECTLY)
q = u'select ?s where { ?s ?p <http://fr.dbpedia.org/resource/Réponse> . }'
sparql.setQuery(q)
res = sparql.queryAndConvert()

returns:

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-8-190f45fb195c> in <module>()
----> 1 res = sparql.queryAndConvert()
...
/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.pyc in _send_output(self, message_body)
    846         # between delayed ack and the Nagle algorithm.
    847         if isinstance(message_body, str):
--> 848             msg += message_body
    849             message_body = None
    850         self.send(msg)

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 93: ordinal not in range(128)

Just changing the endpoint URI in the second line to a byte string works fine.

Debugging this took me a while, but it seems the problem is caused in urllib2 & httplib when assembling the final request. Handing them a unicode URI will implicitly cast all headers into a unicode string. Then, finally, when the request body (which we utf-8 encode in https://github.com/RDFLib/sparqlwrapper/blob/master/SPARQLWrapper/Wrapper.py#L497) is meant to be attached, it tries to add u'header... ' + 'body...\xc3\xa9' (the '\xc3\xa9' is the u'é'.encode('utf-8') from the query), which results in a UnicodeDecodeError.

So currently SPARQLWrapper is quite relaxed about accepting an endpoint URI that is a unicode string in Python 2, while urllib2 then behaves in a weird way...

Question is: do we want to fix this? And if so how...

One option would be to run the endpoint URI and all headers that SPARQLWrapper ever takes through str() in Python 2.

Another would be to just fix the POSTDIRECTLY code, as these are the only cases where it causes a problem. Actually, it is a bit of a miracle that the other methods work at all: they just never have non-ASCII chars in the request body or the headers. The final request will still be a unicode string (BAD!) but luckily, when writing it to the file pointer (the network socket in this case), Python falls back to encoding it with ASCII, which in those lucky cases works.

I guess it's clear what I'd do... if someone else agrees, I'll make a pull request...
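For reference, the workaround mentioned in the report, spelled out (Python 2 only; only the endpoint line differs from the failing snippet):

import SPARQLWrapper as sw

# str() turns the ASCII-only unicode literal back into a byte string on
# Python 2, so urllib2 no longer coerces the whole request into unicode.
sparql = sw.SPARQLWrapper(str(u'http://localhost:3030/db/sparql'))
sparql.setReturnFormat(sw.JSON)
sparql.setMethod(sw.POST)
sparql.setRequestMethod(sw.POSTDIRECTLY)
sparql.setQuery(u'select ?s where { ?s ?p <http://fr.dbpedia.org/resource/Réponse> . }')
res = sparql.queryAndConvert()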

Manage long results

Hello,

Every query I do is truncated to 100 results (I am currently using SPARQLWrapper to query an OpenRDF database).
It would be great to overcome this; happy to help.

Thanks,
Miquel
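The 100-row cap is typically a server-side default rather than something SPARQLWrapper imposes; until it can be raised on the server, a common workaround is to page through the results with LIMIT/OFFSET. A minimal sketch, with a placeholder endpoint URL and page size:

from SPARQLWrapper import SPARQLWrapper, JSON

PAGE_SIZE = 100
sparql = SPARQLWrapper("http://localhost:8080/openrdf-sesame/repositories/example")
sparql.setReturnFormat(JSON)

offset = 0
rows = []
while True:
    sparql.setQuery(
        "SELECT ?s WHERE { ?s ?p ?o } ORDER BY ?s "
        "LIMIT %d OFFSET %d" % (PAGE_SIZE, offset)
    )
    page = sparql.query().convert()["results"]["bindings"]
    rows.extend(page)
    if len(page) < PAGE_SIZE:   # last (possibly partial) page reached
        break
    offset += PAGE_SIZE

print(len(rows))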

Authorization header bug when using Python 3

When using setCredentials('admin', 'admin') with Python 3, the authorization header subsequently sent is invalid. It is

'Authorization': "Basic b'YWRtaW46YWRtaW4='"

rather than

'Authorization': 'Basic YWRtaW46YWRtaW4='

The issue occurs because b64encode returns bytes in Python 3. A simple fix is to modify line 510 to add a call to decode():

# from
request.add_header("Authorization", "Basic %s" % base64.b64encode(credentials.encode('utf-8')))
# to
request.add_header("Authorization", "Basic %s" % base64.b64encode(credentials.encode('utf-8')).decode('utf-8'))
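A quick standalone demonstration of why the decode() call is needed (plain standard library on Python 3, nothing SPARQLWrapper-specific):

import base64

credentials = "admin:admin"
token = base64.b64encode(credentials.encode("utf-8"))

print("Basic %s" % token)                   # Basic b'YWRtaW46YWRtaW4='  (broken header value)
print("Basic %s" % token.decode("utf-8"))   # Basic YWRtaW46YWRtaW4=     (correct header value)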

Use -dev suffix for development versions

In the discussion around issue #40, @joernhees suggested on IRC that we could adopt the same workflow that rdflib core uses:

  • after a release we up the version but suffix it with -dev
  • in the release process the last commit removes the "-dev"

That conforms to http://semver.org, and all the package management tools will treat the "-something" suffix as a pre-release.

What do you think?

Error installing pycurl dependency

We are seeing this error in our deployments as of a few hours ago (around the 1.7.1 release). Pinning SPARQLWrapper==1.6.4 resolves the problem. We are running Python 2.7 and the latest rdflib.

Collecting pycurl>=7.19.5.1 (from SPARQLWrapper->rdflib->-r requirements.txt (line 1))
Using cached pycurl-7.19.5.1.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 20, in
File "/tmp/pip-build-2nJEva/pycurl/setup.py", line 634, in
ext = get_extension(split_extension_source=split_extension_source)
File "/tmp/pip-build-2nJEva/pycurl/setup.py", line 392, in get_extension
ext_config = ExtensionConfiguration()
File "/tmp/pip-build-2nJEva/pycurl/setup.py", line 65, in init
self.configure()
File "/tmp/pip-build-2nJEva/pycurl/setup.py", line 100, in configure_unix
raise ConfigurationError(msg)
main.ConfigurationError: Could not run curl-config: [Errno 2] No such file or directory

SPARQLWrapper > 1.7.1 seems to break upstream rdflib test

I noticed that some PRs on upstream rdflib broke in seemingly unrelated code... (see RDFLib/rdflib#550)

The error reported is this:

======================================================================
ERROR: testNamedGraphUpdate (test.test_sparqlupdatestore.TestSparql11)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/RDFLib/rdflib/test/test_sparqlupdatestore.py", line 237, in testNamedGraphUpdate
    for v in g.objects(michel, says):
  File "/home/travis/build/RDFLib/rdflib/rdflib/graph.py", line 634, in objects
    for s, p, o in self.triples((subject, predicate, None)):
  File "/home/travis/build/RDFLib/rdflib/rdflib/graph.py", line 424, in triples
    for (s, p, o), cg in self.__store.triples((s, p, o), context=self):
  File "/home/travis/build/RDFLib/rdflib/rdflib/plugins/stores/sparqlstore.py", line 414, in triples
    doc = ElementTree.parse(SPARQLWrapper.query(self).response)
  File "/opt/python/2.7.9/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/opt/python/2.7.9/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/opt/python/2.7.9/lib/python2.7/xml/etree/ElementTree.py", line 1640, in feed
    self._parser.Parse(data, 0)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 328-329: ordinal not in range(128)

As the PRs passed all tests locally and I still had SPARQLWrapper 1.6.4, I started bisecting... it seems that when depending on SPARQLWrapper > 1.7.1, that one test suddenly breaks.

Any ideas which of the changes in 1.7.1...1.7.2 could be the cause?

Ambiguous name: "addCustomParameter"

The name of the addCustomParameter method suggests that the parameter would be added to a list, while the implementation actually replaces an existing parameter with the same name.

The SPARQL specification allows multiple parameters with the same name to be given to the endpoint (for example, named graph URIs):

…and may include zero or more default graph URIs (parameter name: default-graph-uri) and named graph URIs (parameter name: named-graph-uri)

Changing the behaviour of addCustomParameter would break backwards compatibility, so the proper solution (sketched after the list) would be to:

  • introduce a setCustomParameter method that does what the current implementation does
  • introduce an appendCustomParameter method that adds more parameters with the same name
  • deprecate addCustomParameter
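
A minimal sketch of the proposed split, with a plain dict of lists standing in for whatever SPARQLWrapper keeps internally (the method names come from the proposal above and are not part of the current API):

class ParameterHolder:
    """Toy illustration of the proposed set/append split."""

    def __init__(self):
        self.parameters = {}          # name -> list of values

    def setCustomParameter(self, name, value):
        # current addCustomParameter behaviour: replace any existing value
        self.parameters[name] = [value]

    def appendCustomParameter(self, name, value):
        # proposed behaviour: keep previous values with the same name
        self.parameters.setdefault(name, []).append(value)

p = ParameterHolder()
p.appendCustomParameter("named-graph-uri", "http://example.org/graph1")
p.appendCustomParameter("named-graph-uri", "http://example.org/graph2")
print(p.parameters["named-graph-uri"])   # both graph URIs are kept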

Connection pooling

Right now, SPARQLWrapper uses urllib2, which doesn't reuse HTTP connections but opens and closes one for each request. When several queries need to be executed in a row, connection reuse could save several valuable milliseconds.

Switching to urllib3 would provide the means to solve this.
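
A minimal sketch of what connection reuse could look like with urllib3's PoolManager, bypassing SPARQLWrapper entirely just to illustrate the pooling (the endpoint URL is a placeholder):

import json
import urllib3

# One PoolManager keeps the underlying HTTP connection alive
# across requests instead of opening a new one per query.
http = urllib3.PoolManager()
endpoint = "http://localhost:3030/db/sparql"

queries = [
    "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5",
]

for q in queries:
    resp = http.request(
        "GET",
        endpoint,
        fields={"query": q},
        headers={"Accept": "application/sparql-results+json"},
    )
    print(json.loads(resp.data.decode("utf-8"))["results"]["bindings"])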

Choose Content-type for Update Requests

As far as I know, SPARQL update requests are always sent with Content-type application/x-www-form-urlencoded. It would be nice if one could choose to send such requests as application/sparql-update instead. This is useful if one has a SPARQL endpoint that does not work with one of these content types.

Personally, I would like to have this feature for the following reason: RDFLib's SPARQLStore currently has its own implementation of SPARQL Updates and allows the user to choose the content type. I would like to replace the custom implementation with SPARQLWrapper without breaking the API.
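
For reference, sending an update with that content type only requires POSTing the raw update string as the request body; a minimal sketch using the standard library (the endpoint URL and update string are placeholders):

from urllib.request import Request, urlopen

endpoint = "http://localhost:3030/db/update"
update = 'INSERT DATA { <http://example.org/s> <http://example.org/p> "o" }'

# The update itself is the body; the content type tells the server
# not to expect form-encoded parameters.
req = Request(
    endpoint,
    data=update.encode("utf-8"),
    headers={"Content-Type": "application/sparql-update"},
    method="POST",
)
with urlopen(req) as resp:
    print(resp.status)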

Unable to get query should be an exception

When running a query using an invalid endpoint I get this in stderr:

'Unable to get query: x in endpoint y'

The problem is that this isn't an exception, so I can't handle it at execution time.

If I get an empty result from a query, I can't think of a way to know whether the empty result was due to an invalid endpoint or because the resource didn't exist in a valid endpoint. When trying to get data from multiple DBpedia endpoints, being able to tell the difference is crucial.
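
Until the library raises a proper exception, one way to tell the two cases apart is to probe the endpoint yourself before trusting an empty result; a minimal sketch (endpoint_is_reachable is a hypothetical helper, not part of SPARQLWrapper):

from urllib.parse import urlencode
from urllib.request import urlopen

def endpoint_is_reachable(endpoint, timeout=10):
    """Send a trivial ASK query; False means the endpoint itself is the
    problem, so an empty SELECT result cannot be blamed on missing data."""
    url = endpoint + "?" + urlencode({"query": "ASK {}"})
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:   # covers URLError, HTTPError and socket timeouts
        return False

print(endpoint_is_reachable("http://dbpedia.org/sparql"))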
