
sparqlwrapper's Introduction


SPARQL Endpoint interface to Python


About

SPARQLWrapper is a simple Python wrapper around a SPARQL service to remotely execute your queries. It helps by creating the query invocation and, optionally, converting the result into a more manageable format.

Installation & Distribution

You can install SPARQLWrapper from PyPI:

$ pip install sparqlwrapper

You can install SPARQLWrapper from GitHub:

$ pip install git+https://github.com/rdflib/sparqlwrapper#egg=sparqlwrapper

You can install SPARQLWrapper from Debian:

$ sudo apt-get install python-sparqlwrapper

Note

Be aware that there could be a gap between the latest version of SPARQLWrapper and the version available as a Debian package.

Also, the source code of the package can be downloaded in .zip and .tar.gz formats from GitHub SPARQLWrapper releases. Documentation is included in the distribution.

How to use

You can use SPARQLWrapper either as a Python command line script or as a Python package.

Command Line Script

To use SPARQLWrapper as a command line script, you will need to install it; a command line script called rqw (spaRQl Wrapper) will then be available within the Python environment into which it is installed. Run $ rqw -h to see all the script's options.

Python package

Here is a series of examples of different queries executed via SPARQLWrapper as a Python package.

SELECT examples

Simple use of this module is as follows, where a live SPARQL endpoint is given and the JSON return format is used:

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper(
    "http://vocabs.ardc.edu.au/repository/api/sparql/"
    "csiro_international-chronostratigraphic-chart_geologic-time-scale-2020"
)
sparql.setReturnFormat(JSON)

# gets the first 3 geological ages
# from a Geological Timescale database,
# via a SPARQL endpoint
sparql.setQuery("""
    PREFIX gts: <http://resource.geosciml.org/ontology/timescale/gts#>

    SELECT *
    WHERE {
        ?a a gts:Age .
    }
    ORDER BY ?a
    LIMIT 3
    """
)

try:
    ret = sparql.queryAndConvert()

    for r in ret["results"]["bindings"]:
        print(r)
except Exception as e:
    print(e)

This should print out something like this:

{'a': {'type': 'uri', 'value': 'http://resource.geosciml.org/classifier/ics/ischart/Aalenian'}}
{'a': {'type': 'uri', 'value': 'http://resource.geosciml.org/classifier/ics/ischart/Aeronian'}}
{'a': {'type': 'uri', 'value': 'http://resource.geosciml.org/classifier/ics/ischart/Albian'}}

The above result is the response from the given endpoint, retrieved in JSON, and converted to a Python object, ret, which is then iterated over and printed.
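
Each binding in the converted result follows the SPARQL 1.1 Query Results JSON Format, so individual values can be pulled out by variable name. A minimal follow-on sketch, reusing the ret object from the example above (the variable name a comes from the query):

# each binding maps a variable name to a dict with "type" and "value" keys
for r in ret["results"]["bindings"]:
    print(r["a"]["value"])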

ASK example

This query gets a boolean response from DBpedia's SPARQL endpoint:

from SPARQLWrapper import SPARQLWrapper, XML

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    ASK WHERE { 
        <http://dbpedia.org/resource/Asturias> rdfs:label "Asturias"@es
    }    
""")
sparql.setReturnFormat(XML)
results = sparql.query().convert()
print(results.toxml())

You should see something like:

<?xml version="1.0" ?>
<sparql
    xmlns="http://www.w3.org/2005/sparql-results#"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/sw/DataAccess/rf1/result2.xsd">
<head/>
    <boolean>true</boolean>
</sparql>
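
If the endpoint can also serve the JSON results format, the boolean is available under the "boolean" key of the converted dictionary. A minimal sketch of the same ASK query using JSON (assuming the endpoint can return SPARQL JSON results, which DBpedia normally can):

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    ASK WHERE {
        <http://dbpedia.org/resource/Asturias> rdfs:label "Asturias"@es
    }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
print(results["boolean"])  # prints True or False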

CONSTRUCT example

CONSTRUCT queries return RDF, so queryAndConvert() here produces an RDFLib Graph object which is then serialized to the Turtle format for printing:

from SPARQLWrapper import SPARQLWrapper

sparql = SPARQLWrapper("http://dbpedia.org/sparql")

sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX sdo: <https://schema.org/>

    CONSTRUCT {
      ?lang a sdo:Language ;
      sdo:alternateName ?iso6391Code .
    }
    WHERE {
      ?lang a dbo:Language ;
      dbo:iso6391Code ?iso6391Code .
      FILTER (STRLEN(?iso6391Code)=2) # to filter out non-valid values
    }
    LIMIT 3
""")

results = sparql.queryAndConvert()
print(results.serialize())

Results from this query should look something like this:

@prefix schema: <https://schema.org/> .

<http://dbpedia.org/resource/Arabic> a schema:Language ;
    schema:alternateName "ar" .

<http://dbpedia.org/resource/Aragonese_language> a schema:Language ;
    schema:alternateName "an" .

<http://dbpedia.org/resource/Uruguayan_Spanish> a schema:Language ;
    schema:alternateName "es" .

DESCRIBE example

Like CONSTRUCT queries, DESCRIBE queries also produce RDF results, so this example produces an RDFLib Graph object which is then serialized into the JSON-LD format and printed:

from SPARQLWrapper import SPARQLWrapper

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("DESCRIBE <http://dbpedia.org/resource/Asturias>")

results = sparql.queryAndConvert()
print(results.serialize(format="json-ld"))

The result for this example is large but starts something like this:

[
    {
        "@id": "http://dbpedia.org/resource/Mazonovo",
        "http://dbpedia.org/ontology/subdivision": [
            {
                "@id": "http://dbpedia.org/resource/Asturias"
            }
    ],
...

SPARQL UPDATE example

UPDATE queries write changes to a SPARQL endpoint, so we can't easily show a working example here. However, if https://example.org/sparql really was a working SPARQL endpoint that allowed updates, the following code might work:

from SPARQLWrapper import SPARQLWrapper, POST, DIGEST

sparql = SPARQLWrapper("https://example.org/sparql")
sparql.setHTTPAuth(DIGEST)
sparql.setCredentials("some-login", "some-password")
sparql.setMethod(POST)

sparql.setQuery("""
    PREFIX dbp:  <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    WITH <http://example.graph>
    DELETE {
       dbp:Asturias rdfs:label "Asturies"@ast
    }
    WHERE {
       dbp:Asturias rdfs:label "Asturies"@ast
    }
    """
)

results = sparql.query()
print(results.response.read())

If the above code really worked, it would delete the triple dbp:Asturias rdfs:label "Asturies"@ast from the graph http://example.graph.
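
Inserting data works the same way. The following sketch shows the companion INSERT DATA operation, again under the assumption that https://example.org/sparql were a real endpoint accepting authenticated updates:

from SPARQLWrapper import SPARQLWrapper, POST, DIGEST

sparql = SPARQLWrapper("https://example.org/sparql")
sparql.setHTTPAuth(DIGEST)
sparql.setCredentials("some-login", "some-password")
sparql.setMethod(POST)

# INSERT DATA adds the triple to the named graph
sparql.setQuery("""
    PREFIX dbp:  <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    INSERT DATA {
        GRAPH <http://example.graph> {
            dbp:Asturias rdfs:label "Asturies"@ast
        }
    }
    """
)

results = sparql.query()
print(results.response.read())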

SPARQLWrapper2 example

There is also a SPARQLWrapper2 class that works with JSON SELECT results only and wraps the results to make processing of typical queries even simpler.

from SPARQLWrapper import SPARQLWrapper2

sparql = SPARQLWrapper2("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbp:  <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT ?label
    WHERE {
        dbp:Asturias rdfs:label ?label
    }
    LIMIT 3
    """
)

for result in sparql.query().bindings:
    print(f"{result['label'].lang}, {result['label'].value}")

The above should print out something like:

en, Asturias
ar, أشتورية
ca, Astúries

Return formats

The expected return formats differ per query type (SELECT, ASK, CONSTRUCT, DESCRIBE, ...).

Note

From the SPARQL specification, the response body of a successful query operation with a 2XX response is either:

  • SELECT and ASK: a SPARQL Results Document in XML, JSON, or CSV/TSV format.
  • DESCRIBE and CONSTRUCT: an RDF graph serialized, for example, in the RDF/XML syntax, or an equivalent RDF graph serialization.

Although the package does not contain a full SPARQL parser, it attempts to determine the query type when the query is set. This works in most cases, but the query type can also be set manually in case the detection goes wrong.
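
As a sketch of the manual override (this assumes that assigning to the queryType attribute after setQuery() is acceptable; the attribute is otherwise filled in by the automatic detection, so check the API documentation for your version):

from SPARQLWrapper import SPARQLWrapper, JSON, SELECT

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
# a leading comment mentioning CONSTRUCT has confused the detection in some versions
sparql.setQuery("""
    # CONSTRUCT is only mentioned in this comment
    SELECT ?s WHERE { ?s ?p ?o } LIMIT 1
""")
sparql.queryType = SELECT  # force the query type if the auto-detection got it wrong
sparql.setReturnFormat(JSON)
results = sparql.queryAndConvert()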

Automatic conversion of the results

To make processing somewhat easier, the package can do some conversions automatically from the return result. These are:

  • for XML, the xml.dom.minidom module is used to convert the result stream into a Python representation of a DOM tree.
  • for JSON, the json package is used to generate a Python dictionary.
  • for CSV or TSV, a simple string.
  • for RDF/XML and JSON-LD, the RDFLib package is used to convert the result into a Graph instance.
  • for RDF Turtle/N3, a simple string.

There are two ways to generate this conversion:

  • use ret.convert() on the result returned by sparql.query(), as in the code above
  • use sparql.queryAndConvert() to get the converted result right away, if the intermediate stream is not used

For example, in the code below:

try:
    sparql.setReturnFormat(SPARQLWrapper.JSON)
    ret = sparql.query()
    d = ret.convert()
except Exception as e:
    print(e)

the value of d is a Python dictionary of the query result, based on the SPARQL Query Results JSON Format.
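
Other return formats convert analogously. For instance, a minimal sketch requesting CSV (this assumes the CSV constant exported by recent SPARQLWrapper versions; depending on the version, the converted CSV result may be a str or bytes, hence the defensive decode):

from SPARQLWrapper import SPARQLWrapper, CSV

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("SELECT ?s WHERE { ?s ?p ?o } LIMIT 3")
sparql.setReturnFormat(CSV)

results = sparql.queryAndConvert()
if isinstance(results, bytes):  # some versions hand back raw bytes
    results = results.decode("utf-8")
print(results)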

Partial interpretation of the results

A further convenience is an extra, partial interpretation of the results, again covering most practical use cases. Based on the SPARQL Query Results JSON Format, the SPARQLWrapper.SmartWrapper.Bindings class can perform some simple steps in decoding the JSON return results. This result format is generated when SPARQLWrapper.SmartWrapper.SPARQLWrapper2 is used instead of SPARQLWrapper.Wrapper.SPARQLWrapper. Note that this relies on the JSON format only, i.e., it has to be checked whether the SPARQL service can return JSON or not.

Here is a simple piece of code that makes use of this feature:

from SPARQLWrapper import SPARQLWrapper2

sparql = SPARQLWrapper2("http://example.org/sparql")
sparql.setQuery("""
    SELECT ?subj ?prop
    WHERE {
        ?subj ?prop ?obj
    }
    """
)

try:
    ret = sparql.query()
    print(ret.variables)  # this is an array consisting of "subj" and "prop"
    for binding in ret.bindings:
        # each binding is a dictionary. Let us just print the results
        print(f"{binding['subj'].value}, {binding['subj'].type}")
        print(f"{binding['prop'].value}, {binding['prop'].type}")
except Exception as e:
    print(e)

To make this type of code even easier to write, the [] and in operators are also implemented on the result of SPARQLWrapper.SmartWrapper.Bindings. They can be used to check for and find a particular binding (i.e., a particular row in the return value). This feature becomes particularly useful when the OPTIONAL feature of SPARQL is used. For example:

from SPARQLWrapper import SPARQLWrapper2

sparql = SPARQLWrapper2("http://example.org/sparql")
sparql.setQuery("""
    SELECT ?subj ?obj ?opt
    WHERE {
        ?subj <http://a.b.c> ?obj .
        OPTIONAL {
            ?subj <http://d.e.f> ?opt
        }
    }
    """
)

try:
    ret = sparql.query()
    print(ret.variables)  # this is an array consisting of "subj", "obj", "opt"
    if ("subj", "prop", "opt") in ret:
        # there is at least one binding covering the optional "opt", too
        bindings = ret["subj", "obj", "opt"]
        # bindings is an array of dictionaries with the full bindings
        for b in bindings:
            subj = b["subj"].value
            o = b["obj"].value
            opt = b["opt"].value
            # do something nice with subj, o, and opt

    # another way of accessing the values of a single variable:
    # take all the bindings of the "subj"
    subjbind = ret.getValues("subj")  # an array of Value instances
    ...
except Exception as e:
    print(e)

GET or POST

By default, all SPARQL services are invoked using the HTTP GET verb. However, POST might be useful if the size of the query exceeds a reasonable limit; this can be set on the query instance.

Note that some combinations may not work yet with all SPARQL processors (e.g., there are implementations where POST + JSON return does not work). Hopefully, this problem will eventually disappear.
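
A minimal sketch of switching a query to POST (setMethod() is the same call used in the SPARQL UPDATE example above; setRequestMethod(POSTDIRECTLY) additionally sends the query as the request body rather than URL-encoded parameters, which not every endpoint accepts):

from SPARQLWrapper import SPARQLWrapper, JSON, POST, POSTDIRECTLY

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setMethod(POST)                 # use HTTP POST instead of the default GET
sparql.setRequestMethod(POSTDIRECTLY)  # post the query directly in the request body
sparql.setReturnFormat(JSON)
sparql.setQuery("SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")
results = sparql.queryAndConvert()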

SPARQL Endpoint Implementations

Introduction

From SPARQL 1.1 Specification:

The response body of a successful query operation with a 2XX response is either:

  • SELECT and ASK: a SPARQL Results Document in XML, JSON, or CSV/TSV format.
  • DESCRIBE and CONSTRUCT: an RDF graph serialized, for example, in the RDF/XML syntax, or an equivalent RDF graph serialization.

The fact is that the parameter key for choosing the output format is not defined by the specification: Virtuoso uses format, Fuseki uses output, Rasqal seems to use results, and so on. Also, in some cases HTTP Content Negotiation can or must be used.
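
When an endpoint needs such a vendor-specific parameter, it can be attached to the request. A hedged sketch (this assumes the addParameter() helper present in recent SPARQLWrapper versions; the output=json pair is just an example for a Fuseki-style endpoint):

from SPARQLWrapper import SPARQLWrapper

sparql = SPARQLWrapper("http://example.org/sparql")
sparql.setQuery("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
sparql.addParameter("output", "json")  # vendor-specific output-format key
results = sparql.query()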

ClioPatria

Website

The SWI-Prolog Semantic Web Server

Documentation

Search 'sparql' in http://cliopatria.swi-prolog.org/help/http.

Uses

Parameters and Content Negotiation.

Parameter key

format.

Parameter value

MUST be one of these values: rdf+xml, json, csv, application/sparql-results+xml or application/sparql-results+json.

Virtuoso

Website

OpenLink Virtuoso

Parameter key

format or output.

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

  • Parameter value, like directly: "text/html" (HTML), "text/x-html+tr" (HTML (Faceted Browsing Links)), "application/vnd.ms-excel", "application/sparql-results+xml" (XML), "application/sparql-results+json" (JSON), "application/javascript" (Javascript), "text/turtle" (Turtle), "application/rdf+xml" (RDF/XML), "text/plain" (N-Triples), "text/csv" (CSV), "text/tab-separated-values" (TSV)
  • Parameter value, like indirectly: "HTML" (alias text/html), "JSON" (alias application/sparql-results+json), "XML" (alias application/sparql-results+xml), "TURTLE" (alias text/rdf+n3), JavaScript (alias application/javascript) See http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSSparqlProtocol#Additional HTTP Response Formats -- SELECT
  • For a SELECT query type, the default return mimetype (if Accept: */* is sent) is application/sparql-results+xml
  • For an ASK query type, the default return mimetype (if Accept: */* is sent) is text/html
  • For a CONSTRUCT query type, the default return mimetype (if Accept: */* is sent) is text/turtle
  • For a DESCRIBE query type, the default return mimetype (if Accept: */* is sent) is text/turtle

Fuseki

Website

Fuseki

Uses

Parameters and Content Negotiation.

Parameter key

format or output (Fuseki 1, Fuseki 2).

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

Eclipse RDF4J

Website

Eclipse RDF4J (formerly known as OpenRDF Sesame)

Documentation

https://rdf4j.eclipse.org/documentation/rest-api/#the-query-operation, https://rdf4j.eclipse.org/documentation/rest-api/#content-types

Uses

Only content negotiation (no URL parameters).

Parameter

If an unexpected parameter is used, the server ignores it.

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

  • SELECT
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json (also application/json)
    • text/csv
    • text/tab-separated-values
    • Other values: application/x-binary-rdf-results-table
  • ASK
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json
    • Other values: text/boolean
    • Not supported: text/csv
    • Not supported: text/tab-separated-values
  • CONSTRUCT
    • application/rdf+xml
    • application/n-triples (DEFAULT if Accept: */* is sent)
    • text/turtle
    • text/n3
    • application/ld+json
    • Other acceptable values: application/n-quads, application/rdf+json, application/trig, application/trix, application/x-binary-rdf
    • text/plain (returns application/n-triples)
    • text/rdf+n3 (returns text/n3)
    • text/x-nquads (returns application/n-quads)
  • DESCRIBE
    • application/rdf+xml
    • application/n-triples (DEFAULT if Accept: */* is sent)
    • text/turtle
    • text/n3
    • application/ld+json
    • Other acceptable values: application/n-quads, application/rdf+json, application/trig, application/trix, application/x-binary-rdf
    • text/plain (returns application/n-triples)
    • text/rdf+n3 (returns text/n3)
    • text/x-nquads (returns application/n-quads)

RASQAL

Website

RASQAL

Documentation

http://librdf.org/rasqal/roqet.html

Parameter key

results.

JSON-LD (application/ld+json)

NOT supported.

Uses roqet as RDF query utility (see http://librdf.org/rasqal/roqet.html) For variable bindings, the values of FORMAT vary upon what Rasqal supports but include simple for a simple text format (default), xml for the SPARQL Query Results XML format, csv for SPARQL CSV, tsv for SPARQL TSV, rdfxml and turtle for RDF syntax formats, and json for a JSON version of the results.

For RDF graph results, the values of FORMAT are ntriples (N-Triples, default), rdfxml-abbrev (RDF/XML Abbreviated), rdfxml (RDF/XML), turtle (Turtle), json (RDF/JSON resource centric), json-triples (RDF/JSON triples) or rss-1.0 (RSS 1.0, also an RDF/XML syntax).

Marklogic

Website

Marklogic

Uses

Only content negotiation (no URL parameters).

JSON-LD (application/ld+json)

NOT supported.

You can use following methods to query triples:

  • SPARQL mode in Query Console. For details, see Querying Triples with SPARQL
  • XQuery using the semantics functions, and Search API, or a combination of XQuery and SPARQL. For details, see Querying Triples with XQuery or JavaScript.
  • HTTP via a SPARQL endpoint. For details, see Using Semantics with the REST Client API.

Formats are specified as part of the HTTP Accept headers of the REST request. When you query the SPARQL endpoint with REST Client APIs, you can specify the result output format (See https://docs.marklogic.com/guide/semantics/REST#id_54258. The response type format depends on the type of query and the MIME type in the HTTP Accept header.

This table describes the MIME types and Accept Header/Output formats (MIME type) for different types of SPARQL queries. (See https://docs.marklogic.com/guide/semantics/REST#id_54258 and https://docs.marklogic.com/guide/semantics/loading#id_70682)

  • SELECT
    • application/sparql-results+xml
    • application/sparql-results+json
    • text/html
    • text/csv
  • ASK queries return a boolean (true or false).
  • CONSTRUCT or DESCRIBE
    • application/n-triples
    • application/rdf+json
    • application/rdf+xml
    • text/turtle
    • text/n3
    • application/n-quads
    • application/trig

AllegroGraph

Website

AllegroGraph

Documentation

https://franz.com/agraph/support/documentation/current/http-protocol.html

Uses

Only content negotiation (no URL parameters).

Parameter

The server always looks at the Accept header of a request, and tries to generate a response in the format that the client asks for. If this fails, a 406 response is returned. When no Accept, or an Accept of */*, is specified, the server prefers text/plain, in order to make it easy to explore the interface from a web browser.

JSON-LD (application/ld+json)

NOT supported.

  • SELECT
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json (and application/json)
    • text/csv
    • text/tab-separated-values
    • OTHERS: application/sparql-results+ttl, text/integer, application/x-lisp-structured-expression, text/table, application/processed-csv, text/simple-csv, application/x-direct-upis
  • ASK
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json (and application/json)
    • Not supported: text/csv
    • Not supported: text/tab-separated-values
  • CONSTRUCT
    • application/rdf+xml (DEFAULT if Accept: */* is sent)
    • text/rdf+n3
    • OTHERS: text/integer, application/json, text/plain, text/x-nquads, application/trix, text/table, application/x-direct-upis
  • DESCRIBE
    • application/rdf+xml (DEFAULT if Accept: */* is sent)
    • text/rdf+n3

4store

Website

4store

Documentation

https://4store.danielknoell.de/trac/wiki/SparqlServer/

Uses

Parameters and Content Negotiation.

Parameter key

output.

Parameter value

an alias. If an unexpected alias is used, the server does not work properly.

JSON-LD (application/ld+json)

NOT supported.

  • SELECT
    • application/sparql-results+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json or application/json (alias json)
    • text/csv (alias csv)
    • text/tab-separated-values (alias tsv). Returns "text/plain" in GET.
    • Other values: text/plain, application/n-triples
  • ASK
    • application/sparql-results+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json or application/json (alias json)
    • text/csv (alias csv)
    • text/tab-separated-values (alias tsv). Returns "text/plain" in GET.
    • Other values: text/plain, application/n-triples
  • CONSTRUCT
    • application/rdf+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • text/turtle (alias "text")
  • DESCRIBE
    • application/rdf+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • text/turtle (alias "text")
Valid alias for SELECT and ASK

"json", "xml", csv", "tsv" (also "text" and "ascii")

Valid alias for DESCRIBE and CONSTRUCT

"xml", "text" (for turtle)

Blazegraph

Website

Blazegraph (Formerly known as Bigdata) & NanoSparqlServer

Documentation

https://wiki.blazegraph.com/wiki/index.php/REST_API#SPARQL_End_Point

Uses

Parameters and Content Negotiation.

Parameter key

format (available since version 1.4.0). Setting this parameter will override any Accept header that is present.

Parameter value

an alias. If an unexpected alias is used, the server does not work properly.

JSON-LD (application/ld+json)

NOT supported.

  • SELECT
    • application/sparql-results+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json or application/json (alias json)
    • text/csv
    • text/tab-separated-values
    • Other values: application/x-binary-rdf-results-table
  • ASK
    • application/sparql-results+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json or application/json (alias json)
  • CONSTRUCT
    • application/rdf+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • text/turtle (returns text/n3)
    • text/n3
  • DESCRIBE
    • application/rdf+xml (alias xml) (DEFAULT if Accept: */* is sent)
    • text/turtle (returns text/n3)
    • text/n3
Valid alias for SELECT and ASK

"xml", "json"

Valid alias for DESCRIBE and CONSTRUCT

"xml", "json" (but it returns unexpected "application/sparql-results+json")

GraphDB

Website

GraphDB, formerly known as OWLIM (OWLIM-Lite, OWLIM-SE)

Documentation

https://graphdb.ontotext.com/documentation/free/

Uses

Only content negotiation (no URL parameters).

Note

If the Accept value is not within the expected ones, the server returns a 406 "No acceptable file format found."

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

  • SELECT
    • application/sparql-results+xml, application/xml (.srx file)
    • application/sparql-results+json, application/json (.srj file)
    • text/csv (DEFAULT if Accept: */* is sent)
    • text/tab-separated-values
  • ASK
    • application/sparql-results+xml, application/xml (.srx file)
    • application/sparql-results+json (DEFAULT if Accept: */* is sent), application/json (.srj file)
    • NOT supported: text/csv, text/tab-separated-values
  • CONSTRUCT
    • application/rdf+xml, application/xml (.rdf file)
    • text/turtle (.ttl file)
    • application/n-triples (.nt file) (DEFAULT if Accept: */* is sent)
    • text/n3, text/rdf+n3 (.n3 file)
    • application/ld+json (.jsonld file)
  • DESCRIBE
    • application/rdf+xml, application/xml (.rdf file)
    • text/turtle (.ttl file)
    • application/n-triples (.nt file) (DEFAULT if Accept: */* is sent)
    • text/n3, text/rdf+n3 (.n3 file)
    • application/ld+json (.jsonld file)

Stardog

Website

Stardog

Documentation

https://www.stardog.com/docs/#_http_headers_content_type_accept (looks outdated)

Uses

Only content negotiation (no URL parameters).

Parameter key

If an unexpected parameter is used, the server ignores it.

JSON-LD (application/ld+json)

supported (in CONSTRUCT and DESCRIBE).

  • SELECT
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json
    • text/csv
    • text/tab-separated-values
    • Other values: application/x-binary-rdf-results-table
  • ASK
    • application/sparql-results+xml (DEFAULT if Accept: */* is sent)
    • application/sparql-results+json
    • Other values: text/boolean
    • Not supported: text/csv
    • Not supported: text/tab-separated-values
  • CONSTRUCT
    • application/rdf+xml
    • text/turtle (DEFAULT if Accept: */* is sent)
    • text/n3
    • application/ld+json
    • Other acceptable values: application/n-triples, application/x-turtle, application/trig, application/trix, application/n-quads
  • DESCRIBE
    • application/rdf+xml
    • text/turtle (DEFAULT if Accept: */* is sent)
    • text/n3
    • application/ld+json
    • Other acceptable values: application/n-triples, application/x-turtle, application/trig, application/trix, application/n-quads

Development

Requirements

The RDFLib package is used for RDF parsing.

This package is imported in a lazy fashion, i.e. only when needed. If the user never intends to use the RDF format, the RDFLib package is not imported and the user does not have to install it.

Source code

The source distribution contains:

  • SPARQLWrapper: the Python package. You should copy the directory somewhere into your PYTHONPATH. Alternatively, you can also run the distutils scripts: python setup.py install
  • test: some unit and integration tests. In order to run the tests, some packages have to be installed first, so please install the dev packages: pip install '.[dev]'
  • scripts: some scripts to run the package against some SPARQL endpoints.
  • docs: the documentation.

Community

Community support is available through the RDFLib developers' discussion group rdflib-dev. The archives from the old mailing list are still available.

Issues

Please report any issues on GitHub.

Documentation

The SPARQLWrapper documentation is available online.

Other interesting documents are the latest SPARQL 1.1 Specification (W3C Recommendation 21 March 2013) and the initial SPARQL Specification (W3C Recommendation 15 January 2008).

License

The SPARQLWrapper package is licensed under the W3C license.

Acknowledgement

The package was greatly inspired by Lee Feigenbaum's similar package for JavaScript.

Developers involved:

Organizations involved:

sparqlwrapper's People

Contributors

amarillion, aucampia, bcogrel, chrysn, cmarat, cottrell, danmichaelo, dayures, edwardbetts, eggplants, gromgull, hugovk, indeyets, joernhees, lamby, marcelometal, nicholascar, nicholsn, olberger, pandawill, satra, t0b3, trevorandersen, white-gecko, wikier


sparqlwrapper's Issues

packaging broken

Hi guys, seeing another pip install failure on our build server as of release 1.7.3

Mixlib::ShellOut::ShellCommandFailed

Expected process to exit with [0], but received '1'
---- Begin output of "bash" "/tmp/chef-script20151105-4556-o9zodr" ----
STDOUT: Collecting rdflib (from -r requirements.txt (line 1))
Downloading rdflib-4.2.1.tar.gz (889kB)
Collecting statsd (from -r requirements.txt (line 2))
Downloading statsd-3.2.1-py2.py3-none-any.whl
Collecting lshash==0.0.4dev (from -r requirements.txt (line 3))
Downloading lshash-0.0.4dev.tar.gz
Collecting knowsis.commons[stats] (from -r requirements.txt (line 4))
Downloading https://repo.fury.io/knowsis/
Collecting isodate (from rdflib->-r requirements.txt (line 1))
Downloading isodate-0.5.4.tar.gz
Collecting pyparsing (from rdflib->-r requirements.txt (line 1))
Downloading pyparsing-2.0.5-py2.py3-none-any.whl
Collecting SPARQLWrapper (from rdflib->-r requirements.txt (line 1))
Downloading SPARQLWrapper-1.7.3.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 20, in
File "/tmp/pip-build-uBrAyp/SPARQLWrapper/setup.py", line 45, in
requirements = list(parse_requirements('requirements.txt', session=PipSession()))
File "/home/knowsis/releases/7283cfc3f18343475889e947570c2f1a295f5c36/env/lib/python2.7/site-packages/pip/req/req_file.py", line 77, in parse_requirements
filename, comes_from=comes_from, session=session
File "/home/knowsis/releases/7283cfc3f18343475889e947570c2f1a295f5c36/env/lib/python2.7/site-packages/pip/download.py", line 416, in get_file_content
'Could not open requirements file: %s' % str(exc)
pip.exceptions.InstallationError: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'

QueryResult.convert() should be able to provide predictable result format

Currently, result of QueryResult.convert depends on the input type:

  • SPARQL_XML results in DOM object
  • SPARQL_JSON results in associative array (decoded json)
  • RDF_N3 results in string
  • RDF_XML and RDF_JSONLD result in rdflib.ConjunctiveGraph

It is low-level stuff, and while it works in a controlled environment (users of the library might arrive at a working workflow via trial and error), in real-world use cases the endpoint might give results in an unexpected format, which will be handled differently, and the user's application will just break because the result is of the wrong type.

convert() should have only 2 options for returned values (similar to what is listed in spec):

  1. SPARQL_* should return "SPARQL Results Document" in the form of object which gives access to metadata and provides iterator for getting rows. Both resources and literals should be provided as corresponding rdflib objects (I think #15 is just about this)
  2. RDF_* should return "RDF graph" in the form of rdflib.ConjunctiveGraph

So, in most cases, users of the library should never need to specify the desired response type. Whatever it is on the low level, sparqlwrapper should present it via proper high-level objects.

setCredentials is detached from endpoints

SPARQLWrapper takes endpoint and updateEndpoint URLs as parameters of constructor

Separately, it has a method named setCredentials which allows specifying a login/password used for Basic Authentication.

There are 2 problems with this:

  1. it is possible to have endpoints for queries and updates which require different credentials and current implementation doesn't provide good solution for this case
  2. endpoints might use different authentication schemes (Digest Authentication, OAuth, …)

I propose to separate 3 entities:

  • Set of classes which implement informal Authentication protocol (I wonder if we can reuse some existing package for this)
  • Endpoint class which implements SPARQL 1.1 Protocol which take Auth-object as parameter and takes care of HTTP part
  • Wrapper which takes 1 or 2 Endpoint-objects as parameters and provides user-friendly API

This approach will also allow us to add endpoint implementations which use custom triplestore-specific protocols as a bonus.

This is a large change, so should be implemented in 2.x branch.

SPARQL/Update query-types

Right now, 3 query-types for SPARQL/Update are defined: INSERT, DELETE and MODIFY, and the query-type is detected via a regular expression.

The problem is that there is no MODIFY keyword in the SPARQL/Update grammar, as a modify query is defined as ( 'WITH' iri )? ( DeleteClause InsertClause? | InsertClause ) UsingClause* 'WHERE' GroupGraphPattern.

On the other hand, there is a bunch of query-types unknown to SPARQLWrapper (LOAD, CLEAR, DROP, etc.)

setCredentials "Invalid header value" / "QueryBadFormed"

Using SPARQLWrapper 1.6.4 (pip).

When setCredentials() is used the resulting query request Authorization header seems to contain an erroneous trailing newline.

This results in an "Invalid header value" ValueError with Python 2.7, or a "QueryBadFormed" exception with Python 2.6.

The following fix to Wrapper.py seems to correct this behaviour:

# Replace this
request.add_header("Authorization", "Basic %s" % base64.encodestring(credentials.encode('utf-8')))
# With this
request.add_header("Authorization", "Basic %s" % base64.b64encode(credentials.encode('utf-8')))

Build system broken

There are several problems:

urs@speedy:~/p/RDFLib/sparqlwrapper$ pip install --user .
Unpacking /home/urs/p/RDFLib/sparqlwrapper
  Running setup.py (path:/tmp/pip-0_cPU0-build/setup.py) egg_info for package from file:///home/urs/p/RDFLib/sparqlwrapper
    SPARQLWrapper/Wrapper.py:101: RuntimeWarning: JSON-LD disabled because no suitable support has been found
      warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip-0_cPU0-build/setup.py", line 32, in <module>
        author = SPARQLWrapper__authors__,
    NameError: name 'SPARQLWrapper__authors__' is not defined
    Complete output from command python setup.py egg_info:
    SPARQLWrapper/Wrapper.py:101: RuntimeWarning: JSON-LD disabled because no suitable support has been found

  warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)

Traceback (most recent call last):

  File "<string>", line 17, in <module>

  File "/tmp/pip-0_cPU0-build/setup.py", line 32, in <module>

    author = SPARQLWrapper__authors__,

NameError: name 'SPARQLWrapper__authors__' is not defined

----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in /tmp/pip-0_cPU0-build
Storing debug log for failure in /home/urs/.pip/pip.log

There is a dot missing between SPARQLWrapper_ and __authors__.

Having fixed this, it works for Python 2, but running pip3 for Python 3 exposes another problem:

urs@speedy:~/p/RDFLib/sparqlwrapper$ pip3 install --user .
Unpacking /home/urs/p/RDFLib/sparqlwrapper
  Running setup.py (path:/tmp/pip-s1q1yjhc-build/setup.py) egg_info for package from file:///home/urs/p/RDFLib/sparqlwrapper
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip-s1q1yjhc-build/setup.py", line 24, in <module>
        import SPARQLWrapper
      File "/tmp/pip-s1q1yjhc-build/SPARQLWrapper/__init__.py", line 187, in <module>
        from Wrapper import SPARQLWrapper
    ImportError: No module named 'Wrapper'
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):

  File "<string>", line 17, in <module>

  File "/tmp/pip-s1q1yjhc-build/setup.py", line 24, in <module>

    import SPARQLWrapper

  File "/tmp/pip-s1q1yjhc-build/SPARQLWrapper/__init__.py", line 187, in <module>

    from Wrapper import SPARQLWrapper

ImportError: No module named 'Wrapper'

----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in /tmp/pip-s1q1yjhc-build
Storing debug log for failure in /home/urs/.pip/pip.log

It's probably a bad idea to import SPARQLWrapper before it has been transformed by 2to3, because setup.py is also run by Python 3. (It seems this particular error occurs because relative imports in Python 3 don't work without a leading dot anymore.)

allow for manual setting of Accept header or headers in general

as can be seen in http://stackoverflow.com/a/30306013/1423333 there are endpoints out there which don't understand the ",".join(_SPARQL_JSON) https://github.com/RDFLib/sparqlwrapper/blob/master/SPARQLWrapper/Wrapper.py#L451 Accept header, but only want "one" mime-type :(

in https://github.com/RDFLib/sparqlwrapper/blob/master/SPARQLWrapper/Wrapper.py#L506 we unconditionally add the Accept header.

I think it would be much better if one could manually set headers somehow and the _createRequest() method wouldn't override them if they're there already.

Could be done in one go when switching to the Requests lib (see #51)?

enhance warning for missing rdflib-jsonld

Currently a default install of SPARQLWrapper issues this warning for every user:

In [1]: from SPARQLWrapper import SPARQLWrapper, JSON
/usr/local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py:100: RuntimeWarning: JSON-LD disabled because no suitable support has been found
  warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)

Several issues with this:

  • seems weird to me... if i just installed something, then want to use it and the first thing it tells me is "Warning, i'm not complete" i get a bad feeling...
  • if missing jsonld support is worth a warning, why doesn't SPARQLWrapper depend on it so it's auto installed?
  • in any case the warning should tell me that i maybe want to install the rdflib-jsonld package... this might seem obvious to us, but newcomers might be quite confused without this hint.
  • if it's really optional (so we don't expect that a sane endpoint unasked replies with JSON-LD), could this warning maybe be delayed until someone tries to setReturnFormat(JSONLD)?

release notes?

Is there anywhere I can find a concise set of release notes for v1.7.0?

I got an error when using setCredentials

ValueError: Invalid header value 'Basic pokwepdokwo\n'

pokwepdokwo is the actual encoding of user:password that I also see in the Authorization field of the request header, for example in Chrome, so there is a wrongfully added \n at the end. Is this an error, or am I doing something wrong?

Unicode problems

There are still some unicode related issues in 1.6.2. I should have spotted them sooner,
sorry for that. The most problematic one is caused by urllib.urlencode
handling unicode objects the wrong way. I wasn't aware of this issue.

Personally, I would make sure that internally, you only have unicode objects,
i.e. that SPARQLWrapper.setQuery and similar methods convert str objects to
unicode objects. Then, decode them before applying urlencode and assembling
the HTTP request. I hope 2to3 is then able to apply the correct transformations.

Python 2.7.6 (default, Mar 22 2014, 15:40:47) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from SPARQLWrapper import SPARQLWrapper, XML, POST, GET, URLENCODED, POSTDIRECTLY
/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py:100: RuntimeWarning: JSON-LD disabled because no suitable support has been found
  warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)
>>> uquery = u'INSERT DATA { <urn:michel> <urn:says> "é" }'
>>> query = uquery.encode('UTF-8')
>>> uquery
u'INSERT DATA { <urn:michel> <urn:says> "\xe9" }'
>>> query
'INSERT DATA { <urn:michel> <urn:says> "\xc3\xa9" }'
>>> wrapper = SPARQLWrapper('http://localhost:3030/ukpp/sparql', 'http://localhost:3030/ukpp/update')

POSTDIRECTLY only works for unicode objects. Except for the unclear error
message, this is not necessarily wrong, because a SPARQL query is in Unicode
and the SPARQL protocol mandates UTF-8 as charset.

>>> wrapper.setMethod(POST)
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
<SPARQLWrapper.Wrapper.QueryResult object at 0x7f7513e5c450>
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(query)
>>> wrapper.query()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 515, in query
    return QueryResult(self._query())
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 483, in _query
    request = self._createRequest()
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 442, in _createRequest
    request.data = self.queryString.encode('UTF-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 39: ordinal not in range(128)

When using URLENCODED, it doesn't work with a unicode object, because for
some reason, urllib.urlencode can't handle unicode objects correctly.

>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 515, in query
    return QueryResult(self._query())
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 483, in _query
    request = self._createRequest()
  File "/home/urs/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 448, in _createRequest
    request.data = urllib.urlencode(parameters, True)
  File "/usr/lib/python2.7/urllib.py", line 1357, in urlencode
    l.append(k + '=' + quote_plus(str(elt)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 39: ordinal not in range(128)
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(query)
>>> wrapper.query()
<SPARQLWrapper.Wrapper.QueryResult object at 0x7f7513e5c290>

The same test with Python 3:

Python 3.4.1 (default, Jul  6 2014, 20:01:46) 
[GCC 4.9.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from SPARQLWrapper import SPARQLWrapper, XML, POST, GET, URLENCODED, POSTDIRECTLY
/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py:100: RuntimeWarning: JSON-LD disabled because no suitable support has been found
  warnings.warn("JSON-LD disabled because no suitable support has been found", RuntimeWarning)
>>> uquery = 'INSERT DATA { <urn:michel> <urn:says> "é" }'
>>> query = uquery.encode('UTF-8')
>>> uquery
'INSERT DATA { <urn:michel> <urn:says> "é" }'
>>> query
b'INSERT DATA { <urn:michel> <urn:says> "\xc3\xa9" }'
>>> wrapper = SPARQLWrapper('http://localhost:3030/ukpp/sparql', 'http://localhost:3030/ukpp/update')
>>> wrapper.setMethod(POST)
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
<SPARQLWrapper.Wrapper.QueryResult object at 0x7f8587444048>
>>> wrapper.setRequestMethod(POSTDIRECTLY)
>>> wrapper.setQuery(query)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 319, in setQuery
    self.queryType   = self._parseQueryType(query)
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 335, in _parseQueryType
    query = re.sub(re.compile("#.*?\n" ), "" , query) # remove all occurance singleline comments (issue #32)
  File "/usr/lib/python3.4/re.py", line 175, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: can't use a string pattern on a bytes-like object
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(uquery)
>>> wrapper.query()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 515, in query
    return QueryResult(self._query())
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 485, in _query
    response = urlopener(request)
  File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 453, in open
    req = meth(req)
  File "/usr/lib/python3.4/urllib/request.py", line 1120, in do_request_
    raise TypeError(msg)
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.
>>> wrapper.setRequestMethod(URLENCODED)
>>> wrapper.setQuery(query)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 319, in setQuery
    self.queryType   = self._parseQueryType(query)
  File "/home/urs/.local/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py", line 335, in _parseQueryType
    query = re.sub(re.compile("#.*?\n" ), "" , query) # remove all occurance singleline comments (issue #32)
  File "/usr/lib/python3.4/re.py", line 175, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: can't use a string pattern on a bytes-like object

SmartWrapper.Value should have __repr__

When using the SmartWrapper, I found it discouraging to get output that looked like this:

>>> results.bindings
[{u'o': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f610>,
  u'p': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f5d0>,
  u's': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f550>},
 {u'o': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f6d0>,
  u'p': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f690>,
  u's': <SPARQLWrapper.SmartWrapper.Value at 0x106b6f650>}]

Defining __repr__ on SmartWrapper.Value helps a lot. Here's one implementation:

def __repr__(self):
    cls = self.__class__.__name__
    return "%s(%s:%r)" % (cls, self.type, self.value)

which provides enough transparency to see what the underlying value is but still makes it clear that you're looking at a wrapped Value.

python3: nosetests fails with inability to import Wrapper

Trying to package SPARQLWrapper 1.5.2 on Fedora 20; while the Python 2.7 flavour packages just fine, the Python 3 version of nosetests fails with the following error (replicated in a fresh virtualenv using Python 3.3 and running "nosetests-3.3" in the untarred SPARQLWrapper 1.5.2 source directory):

~/rpmbuild/BUILD/python3-python-SPARQLWrapper-1.5.2-1.fc20 ~/rpmbuild/BUILD/SPARQLWrapper-1.5.2
+ nosetests-3.3
E
======================================================================
ERROR: Failure: ImportError (No module named 'Wrapper')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python3.3/site-packages/nose/failure.py", line 38, in runTest
raise self.exc_val.with_traceback(self.tb)
File "/usr/lib/python3.3/site-packages/nose/loader.py", line 413, in loadTestsFromName
addr.filename, addr.module)
File "/usr/lib/python3.3/site-packages/nose/importer.py", line 47, in importFromPath
return self.importFromDir(dir_path, fqname)
File "/usr/lib/python3.3/site-packages/nose/importer.py", line 94, in importFromDir
mod = load_module(part_fqname, fh, filename, desc)
File "/usr/lib64/python3.3/imp.py", line 185, in load_module
return load_package(name, filename)
File "/usr/lib64/python3.3/imp.py", line 155, in load_package
return _bootstrap.SourceFileLoader(name, path).load_module(name)
File "", line 586, in _check_name_wrapper
File "", line 1024, in load_module
File "", line 1005, in load_module
File "", line 562, in module_for_loader_wrapper
File "", line 870, in _load_module
File "", line 313, in call_with_frames_removed
File "/home/makerpm/rpmbuild/BUILD/python3-python-SPARQLWrapper-1.5.2-1.fc20/SPARQLWrapper/_init
.py", line 185, in
from Wrapper import SPARQLWrapper, XML, JSON, TURTLE, N3, RDF, GET, POST, SELECT, CONSTRUCT, ASK, DESCRIBE
ImportError: No module named 'Wrapper'

New error in existing notebook

Hello,
I developed a notebook six months ago and it was working perfectly. Yesterday I have to run it again, getting this error.

QueryBadFormed: QueryBadFormed: a bad request has been sent to the endpoint, probably the sparql query is bad formed.

Response:
Could not properly handle "PREFIX rdf%3A %3Chttp%3A//www." in ARC2_SPARQLPlusParser

Heres is the code:

prefixes = """PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX dc: <http://purl.org/dc/terms/> 
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>  
PREFIX dcat-ext: <http://vocabularies.aginfra.eu/dcatext#> 
PREFIX schema: <http://schema.org/>"""

endpoint = "http://ring.ciard.net/sparql1" #put your endpoint here

sparql = SPARQLWrapper2(endpoint)

sparql.setQuery(prefixes + """
SELECT ?dataset ?description
WHERE {
    ?dataset rdf:type dcat:Dataset .
    ?dataset dc:description ?description .
} 
""")

#sparql.setReturnFormat(JSON)
ret = sparql.query()

Thanks a lot

Design considerations towards 2.x

It's not the right time, but just because the discussion at issue #36, I'd like to share this idea/methodology I had with the whole community.

The current SPARQLWrapper 1.x is pretty old, originally designed in 2007, and maintained since then with important evolution at different levels (SPARQL 1.1 Protocol, RDFLib 4.x, Python 3.x). So at some point we should start to think about a major version 2.x with a renewed API. And I'd like to follow a community-driven process that could satisfy everybody. That means not only the authors (@iherman, @dayures, @indeyets and myself) would push their ideas, but everybody who uses the library should feed this process.

Currently there is no date for such a milestone. This is just the starting point...

Fix epydoc warnings

Running epydoc (make doc) reveals some minor mistakes in the documentation that should be fixed.

SPARQLWrapper 1.6.3 doesn't work in py3

SPARQLWrapper/__init__.py is utf-8 encoded but setup.py tries to read it as ASCII, thus havoc ensues.

Downloading from URL https://pypi.python.org/packages/source/S/SPARQLWrapper/SPARQLWrapper-1.6.3.tar.gz#md5=c78f6de1f61570577632e29692a1b029 (from https://pypi.python.org/simple/SPARQLWrapper/)
  Running setup.py (path:/tmp/pip_build_root/SPARQLWrapper/setup.py) egg_info for package SPARQLWrapper
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip_build_root/SPARQLWrapper/setup.py", line 38, in <module>
        for line in open('SPARQLWrapper/__init__.py'):
      File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7672: ordinal not in range(128)
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):

  File "<string>", line 17, in <module>

  File "/tmp/pip_build_root/SPARQLWrapper/setup.py", line 38, in <module>

    for line in open('SPARQLWrapper/__init__.py'):

  File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode

    return codecs.ascii_decode(input, self.errors)[0]

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7672: ordinal not in range(128)

JSON-LD results support

SPARQL-endpoints have support for JSON-LD output for CONSTRUCT queries these days (list consists at least of Virtuoso and StarDog).

sparqlwrapper supports only N3/Turtle and XML.

[Q] release process/policy?

@wikier what is your approach to releases?

the changelog is already quite large, but code wasn't seriously tested (I guess I'll switch my efforts to the testing). If tests prove that code works, then this version will be suitable for my use-case.

Did you see the topic in the mailing list? (by the way: is it better to post questions here or there?) Do you think that is a good idea? If it is, should it be implemented in 1.6 or 1.7 would be better? (I suppose you use something like http://semver.org/ — correct me if I'm wrong)

nosetests3 fails with invalid syntax exception

Hi.

Here's the result of nosetests3 passed during a build script for Debian package of 1.6.0 :

$ nosetests3
E
======================================================================
ERROR: Failure: SyntaxError (invalid syntax (wrapper_test.py, line 246))
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/failure.py", line 39, in runTest
    raise self.exc_val.with_traceback(self.tb)
  File "/usr/lib/python3/dist-packages/nose/loader.py", line 414, in loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/python3/dist-packages/nose/importer.py", line 47, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/python3/dist-packages/nose/importer.py", line 94, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "/usr/lib/python3.3/imp.py", line 180, in load_module
    return load_source(name, filename, file)
  File "/usr/lib/python3.3/imp.py", line 119, in load_source
    _LoadSourceCompatibility(name, pathname, file).load_module(name)
  File "<frozen importlib._bootstrap>", line 584, in _check_name_wrapper
  File "<frozen importlib._bootstrap>", line 1022, in load_module
  File "<frozen importlib._bootstrap>", line 1003, in load_module
  File "<frozen importlib._bootstrap>", line 560, in module_for_loader_wrapper
  File "<frozen importlib._bootstrap>", line 853, in _load_module
  File "<frozen importlib._bootstrap>", line 980, in get_code
  File "<frozen importlib._bootstrap>", line 313, in _call_with_frames_removed
  File "/home/olivier/svn/svn.debian.org/python-modules/packages/git_sparqlwrapper/build-area/sparql-wrapper-python-1.6.0/.pybuild/pythonX.Y_3.3/build/test/wrapper_test.py", line 246
    except QueryBadFormed, e:
                         ^
SyntaxError: invalid syntax

----------------------------------------------------------------------
Ran 1 test in 0.045s

FAILED (errors=1)

Query type detection breaks on prefixes that contain '#' character

tried to run this query:

            PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
            PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
            PREFIX weather: <http://hal.zamia.org/weather/>
            PREFIX dbo:     <http://dbpedia.org/ontology/> 
            PREFIX dbr:     <http://dbpedia.org/resource/> 
            PREFIX dbp:     <http://dbpedia.org/property/> 
            PREFIX xml:     <http://www.w3.org/XML/1998/namespace> 
            PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#> 
        
       SELECT DISTINCT ?location ?cityid ?timezone ?label
       WHERE {
          ?location weather:cityid ?cityid .
          ?location weather:timezone ?timezone .
          ?location rdfs:label ?label .
       }

unfortunately the fix for bug #32

query = re.sub(re.compile("#.*?\n" ), "" , query) # remove all occurance singleline comments (issue #32)

will break prefixes and turn my query into this:

PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns            PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema            PREFIX weather: <http://hal.zamia.org/weather/>
            PREFIX dbo:     <http://dbpedia.org/ontology/> 
            PREFIX dbr:     <http://dbpedia.org/resource/> 
            PREFIX dbp:     <http://dbpedia.org/property/> 
            PREFIX xml:     <http://www.w3.org/XML/1998/namespace> 
            PREFIX xsd:     <http://www.w3.org/2001/XMLSchema        
       SELECT DISTINCT ?location ?cityid ?timezone ?label
       WHERE {
          ?location weather:cityid ?cityid .
          ?location weather:timezone ?timezone .
          ?location rdfs:label ?label .
       }

which is not only broken in itself but will actually send

self.pattern.search(query)

into what appears to be an endless loop

query type detection does not properly ignore comments in SPARQL

When a query string starts with a comment in which a keyword such as CONSTRUCT is commented out, the query_type is still set to CONSTRUCT.

For example, this query should have query_type='SELECT', but v1.6.1 detects it as CONSTRUCT.

#CONSTRUCT {?s ?p ?o} 
SELECT ?s ?p ?o
WHERE {?s ?p ?o}

rdflib.RDF and SPARQLWrapper.RDF aren't to be confused

I wonder whether there could be some way to unify SPARQLWrapper's RDF and rdflib's.

type(rdflib.RDF)
rdflib.namespace._RDFNamespace

type (SPARQLWrapper.RDF)
str

Maybe I'm the only one who would:
from rdflib import *
from SPARQLWrapper import *

Hope this helps.

Best regards,

query type detection does not work with non-newline queries

Unfortunately the pattern to remove single-line comments (introduced in issue #32) causes single-line queries' type to not be recognised. E.g. if one has the query

prefix whatever: <http://example.org/blah#> ASK { ... }

.. then the "#.*?\n" pattern used (to remove comments in _parseQueryType) prunes the query to:

prefix whatever: <http://example.org/blah#

(Off topic: We strip newlines & indentation whitespace before passing to sparqlwrapper so that Fuseki server query log output stays sane and to reduce overall query size.)

POSTDIRECTLY and unicode endpoint / unicode_literals cause UnicodeDecodeError in urllib2

This caught me a bit off-guard when using from __future__ import unicode_literals, which causes all string literals to be unicode strings by default; usually that is a good idea...

However, without much thinking, this led me to specify the endpoint as u'http://localhost:3030/db/sparql'.

Try running this against a local Fuseki server started with fuseki-server --mem /db:

import SPARQLWrapper as sw
sparql = sw.SPARQLWrapper(u'http://localhost:3030/db/sparql')
sparql.setReturnFormat(sw.JSON)
sparql.setMethod(sw.POST)
sparql.setRequestMethod(sw.POSTDIRECTLY)
q = u'select ?s where { ?s ?p <http://fr.dbpedia.org/resource/Réponse> . }'
sparql.setQuery(q)
res = sparql.queryAndConvert()

returns:

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-8-190f45fb195c> in <module>()
----> 1 res = sparql.queryAndConvert()
...
/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.pyc in _send_output(self, message_body)
    846         # between delayed ack and the Nagle algorithm.
    847         if isinstance(message_body, str):
--> 848             msg += message_body
    849             message_body = None
    850         self.send(msg)

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 93: ordinal not in range(128)

Just changing the endpoint URI in the second line to a byte string works fine.

Debugging this took me a while, but it seems the problem is caused in urllib2 & httplib when assembling the final request. Handing them a unicode URI will implicitly cast all headers into a unicode string. Then, finally, when the request body (which we utf-8 encode in https://github.com/RDFLib/sparqlwrapper/blob/master/SPARQLWrapper/Wrapper.py#L497) is meant to be attached, it tries to add u'header... ' + 'body...\xc3\xa9' (the '\xc3\xa9' is the u'é'.encode('utf-8') from the query), which results in a UnicodeDecodeError.

So currently SPARQLWrapper is quite relaxed about accepting an endpoint URI that is a unicode string in Python 2, while urllib2 then behaves in a weird way...

Question is: do we want to fix this? And if so how...

One option would be to run the endpoint URI and all headers that SPARQLWrapper ever takes through str() in Python 2.

Another would be to just fix the POSTDIRECTLY code, as these are the only cases where it causes a problem. Actually, it is a bit of a miracle that the other methods work at all: they just never have non-ASCII chars in the request body or the headers. The final request will still be a unicode string (BAD!) but luckily, when writing it to the file pointer (the network socket in this case), Python falls back to encoding it with ASCII, which in those lucky cases works.

I guess it's clear what I'd do... if someone else agrees, I'll make a pull request...
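For reference, the workaround mentioned in the report, spelled out (Python 2 only; only the endpoint line differs from the failing snippet):

import SPARQLWrapper as sw

# str() turns the ASCII-only unicode literal back into a byte string on
# Python 2, so urllib2 no longer coerces the whole request into unicode.
sparql = sw.SPARQLWrapper(str(u'http://localhost:3030/db/sparql'))
sparql.setReturnFormat(sw.JSON)
sparql.setMethod(sw.POST)
sparql.setRequestMethod(sw.POSTDIRECTLY)
sparql.setQuery(u'select ?s where { ?s ?p <http://fr.dbpedia.org/resource/Réponse> . }')
res = sparql.queryAndConvert()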

Manage long results

Hello,

Every query I do is truncated to 100 results (I am currently using SPARQLWrapper to query an OpenRDF database).
It would be great to overcome this; happy to help.

Thanks,
Miquel
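The 100-row cap is typically a server-side default rather than something SPARQLWrapper imposes; until it can be raised on the server, a common workaround is to page through the results with LIMIT/OFFSET. A minimal sketch, with a placeholder endpoint URL and page size:

from SPARQLWrapper import SPARQLWrapper, JSON

PAGE_SIZE = 100
sparql = SPARQLWrapper("http://localhost:8080/openrdf-sesame/repositories/example")
sparql.setReturnFormat(JSON)

offset = 0
rows = []
while True:
    sparql.setQuery(
        "SELECT ?s WHERE { ?s ?p ?o } ORDER BY ?s "
        "LIMIT %d OFFSET %d" % (PAGE_SIZE, offset)
    )
    page = sparql.query().convert()["results"]["bindings"]
    rows.extend(page)
    if len(page) < PAGE_SIZE:   # last (possibly partial) page reached
        break
    offset += PAGE_SIZE

print(len(rows))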

Authorization header bug when using Python 3

When using setCredentials('admin', 'admin') with Python 3, the authorization header subsequently sent is invalid. It is

'Authorization': "Basic b'YWRtaW46YWRtaW4='"

rather than

'Authorization': 'Basic YWRtaW46YWRtaW4='

The issue occurs because b64encode returns bytes in Python 3. A simple fix is to modify line 510 to add a call to decode():

# from
request.add_header("Authorization", "Basic %s" % base64.b64encode(credentials.encode('utf-8')))
# to
request.add_header("Authorization", "Basic %s" % base64.b64encode(credentials.encode('utf-8')).decode('utf-8'))
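A quick standalone demonstration of why the decode() call is needed (plain standard library on Python 3, nothing SPARQLWrapper-specific):

import base64

credentials = "admin:admin"
token = base64.b64encode(credentials.encode("utf-8"))

print("Basic %s" % token)                   # Basic b'YWRtaW46YWRtaW4='  (broken header value)
print("Basic %s" % token.decode("utf-8"))   # Basic YWRtaW46YWRtaW4=     (correct header value)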

Use -dev suffix for development versions

In the discussion around issue #40, @joernhees suggested on IRC that we could adopt the same workflow that rdflib core uses:

  • after a release we up the version but suffix it with -dev
  • in the release process the last commit removes the "-dev"

That conforms to http://semver.org, and all the package management tools will treat the "-something" suffix as a pre-release.

What do you think?

Error installing pycurl dependency

We are seeing this error in our deployments as of a few hours ago (around the 1.7.1 release). Pinning SPARQLWrapper==1.6.4 resolves the problem. We are running Python 2.7 and the latest rdflib.

Collecting pycurl>=7.19.5.1 (from SPARQLWrapper->rdflib->-r requirements.txt (line 1))
Using cached pycurl-7.19.5.1.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 20, in
File "/tmp/pip-build-2nJEva/pycurl/setup.py", line 634, in
ext = get_extension(split_extension_source=split_extension_source)
File "/tmp/pip-build-2nJEva/pycurl/setup.py", line 392, in get_extension
ext_config = ExtensionConfiguration()
File "/tmp/pip-build-2nJEva/pycurl/setup.py", line 65, in init
self.configure()
File "/tmp/pip-build-2nJEva/pycurl/setup.py", line 100, in configure_unix
raise ConfigurationError(msg)
main.ConfigurationError: Could not run curl-config: [Errno 2] No such file or directory

SPARQLWrapper > 1.7.1 seems to break upstream rdflib test

I noticed that some PRs on upstream rdflib broke in seemingly unrelated code... (see RDFLib/rdflib#550)

The error reported is this:

======================================================================
ERROR: testNamedGraphUpdate (test.test_sparqlupdatestore.TestSparql11)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/RDFLib/rdflib/test/test_sparqlupdatestore.py", line 237, in testNamedGraphUpdate
    for v in g.objects(michel, says):
  File "/home/travis/build/RDFLib/rdflib/rdflib/graph.py", line 634, in objects
    for s, p, o in self.triples((subject, predicate, None)):
  File "/home/travis/build/RDFLib/rdflib/rdflib/graph.py", line 424, in triples
    for (s, p, o), cg in self.__store.triples((s, p, o), context=self):
  File "/home/travis/build/RDFLib/rdflib/rdflib/plugins/stores/sparqlstore.py", line 414, in triples
    doc = ElementTree.parse(SPARQLWrapper.query(self).response)
  File "/opt/python/2.7.9/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/opt/python/2.7.9/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/opt/python/2.7.9/lib/python2.7/xml/etree/ElementTree.py", line 1640, in feed
    self._parser.Parse(data, 0)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 328-329: ordinal not in range(128)

As the PRs passed all tests locally and I still had SPARQLWrapper 1.6.4, I started bisecting... it seems that when depending on SPARQLWrapper > 1.7.1, that one test suddenly breaks.

Any ideas which of the changes in 1.7.1...1.7.2 could be the cause?

Ambiguous name: "addCustomParameter"

The name of the addCustomParameter method suggests that the parameter would be added to a list, while the implementation actually replaces an existing parameter with the same name.

The SPARQL specification allows multiple parameters with the same name to be given to the endpoint (for example, named graph URIs):

…and may include zero or more default graph URIs (parameter name: default-graph-uri) and named graph URIs (parameter name: named-graph-uri)

Changing the behaviour of addCustomParameter would break backwards compatibility, so the proper solution (sketched after the list) would be to:

  • introduce a setCustomParameter method that does what the current implementation does
  • introduce an appendCustomParameter method that adds more parameters with the same name
  • deprecate addCustomParameter
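
A minimal sketch of the proposed split, with a plain dict of lists standing in for whatever SPARQLWrapper keeps internally (the method names come from the proposal above and are not part of the current API):

class ParameterHolder:
    """Toy illustration of the proposed set/append split."""

    def __init__(self):
        self.parameters = {}          # name -> list of values

    def setCustomParameter(self, name, value):
        # current addCustomParameter behaviour: replace any existing value
        self.parameters[name] = [value]

    def appendCustomParameter(self, name, value):
        # proposed behaviour: keep previous values with the same name
        self.parameters.setdefault(name, []).append(value)

p = ParameterHolder()
p.appendCustomParameter("named-graph-uri", "http://example.org/graph1")
p.appendCustomParameter("named-graph-uri", "http://example.org/graph2")
print(p.parameters["named-graph-uri"])   # both graph URIs are kept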

Connection pooling

Right now, SPARQLWrapper uses urllib2, which doesn't reuse HTTP connections but opens and closes one for each request. When several queries need to be executed in a row, connection reuse could save several valuable milliseconds.

Switching to urllib3 would provide the means to solve this.
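
A minimal sketch of what connection reuse could look like with urllib3's PoolManager, bypassing SPARQLWrapper entirely just to illustrate the pooling (the endpoint URL is a placeholder):

import json
import urllib3

# One PoolManager keeps the underlying HTTP connection alive
# across requests instead of opening a new one per query.
http = urllib3.PoolManager()
endpoint = "http://localhost:3030/db/sparql"

queries = [
    "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5",
]

for q in queries:
    resp = http.request(
        "GET",
        endpoint,
        fields={"query": q},
        headers={"Accept": "application/sparql-results+json"},
    )
    print(json.loads(resp.data.decode("utf-8"))["results"]["bindings"])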

Choose Content-type for Update Requests

As far as I know, SPARQL update requests are always sent with Content-type application/x-www-form-urlencoded. It would be nice if one could choose to send such requests as application/sparql-update instead. This is useful if one has a SPARQL endpoint that does not work with one of these content types.

Personally, I would like to have this feature for the following reason: RDFLib's SPARQLStore currently has its own implementation of SPARQL Updates and allows the user to choose the content type. I would like to replace the custom implementation with SPARQLWrapper without breaking the API.
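
For reference, sending an update with that content type only requires POSTing the raw update string as the request body; a minimal sketch using the standard library (the endpoint URL and update string are placeholders):

from urllib.request import Request, urlopen

endpoint = "http://localhost:3030/db/update"
update = 'INSERT DATA { <http://example.org/s> <http://example.org/p> "o" }'

# The update itself is the body; the content type tells the server
# not to expect form-encoded parameters.
req = Request(
    endpoint,
    data=update.encode("utf-8"),
    headers={"Content-Type": "application/sparql-update"},
    method="POST",
)
with urlopen(req) as resp:
    print(resp.status)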

Unable to get query should be an exception

When running a query using an invalid endpoint I get this in stderr:

'Unable to get query: x in endpoint y'

The problem is that this isn't an exception, so I can't handle it at execution time.

If I get an empty result from a query, I can't think of a way to know whether the empty result was due to an invalid endpoint or because the resource didn't exist in a valid endpoint. When trying to get data from multiple DBpedia endpoints, being able to tell the difference is crucial.
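
Until the library raises a proper exception, one way to tell the two cases apart is to probe the endpoint yourself before trusting an empty result; a minimal sketch (endpoint_is_reachable is a hypothetical helper, not part of SPARQLWrapper):

from urllib.parse import urlencode
from urllib.request import urlopen

def endpoint_is_reachable(endpoint, timeout=10):
    """Send a trivial ASK query; False means the endpoint itself is the
    problem, so an empty SELECT result cannot be blamed on missing data."""
    url = endpoint + "?" + urlencode({"query": "ASK {}"})
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:   # covers URLError, HTTPError and socket timeouts
        return False

print(endpoint_is_reachable("http://dbpedia.org/sparql"))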
