rdfscript's Introduction

RDFScript

A scripting language for creating RDF graphs.

  • Define, extend, and expand parameterisable templates for common patterns of RDF triples.
  • Share your RDFScript templates with collaborators, or import their templates for use in your own script.
  • Serialise as turtle, n3, rdf/xml, or easily extend RDFScript with a custom serialiser.
  • Perform complex manipulations of the graph by hooking into Python code defined as extensions.

Get Started

Dependencies

RDFScript requires Python 3.x.

Python package dependencies are listed in setup.py.

Install

  1. Download or clone the repository. git clone https://github.com/lgrozinger/rdfscript.git
  2. Navigate to the RDFScript directory. cd rdfscript
  3. Install using one of the following methods:
     a. As a non-root user, with setuptools. python setup.py install --user
        (setuptools can be installed from your package manager on most *nix systems, and is probably called python3-setuptools or similar.)
     b. With pip. python -m pip install rdflib lxml requests ply pathlib pysbolgraph

Example usage

Run the example in examples/templates.rdfsh

python run.py -s rdfxml examples/templates.rdfsh -o <output-file>

Run the REPL

python run.py -s rdfxml -o <output-file>

Display command line options, including available serialisations.

python run.py -h

The example files in examples/ are commented with some explanation of the language.

SBOL example

  1. Get the ShortBOL2 template libraries. In a suitable directory, run git clone https://github.com/lgrozinger/ShortBOL2
  2. In the rdfscript directory, run python run.py -s sbolxml <YOUR_SHORTBOL2_DIR>/example.rdfsh -o <output-file>
  3. <output-file> is now an SBOL file.

Contributing

dev is for the latest code that should work.

Extensions

Extensions provide a way to execute Python code to do additional, more complex processing of the RDF triples generated by expanding templates.

Extensions intended as built-ins should be added to the extensions package. 'Third party' extensions can also be added at the command line, as long as they are in the Python PATH.

For an example of an extension, see AtLeastOne in https://github.com/lgrozinger/rdfscript/blob/dev/extensions/cardinality.py

Language

Please branch off of dev for feature development, or work directly on dev for bug-fixes.

Testing

A test suite using the Python3 unittest package is under the test/ directory.

Run the tests from the project root using python -m unittest

Acknowledgements

The concept, design and implementation draw heavily on work by Matt Pocock on ShortBOL. Many thanks to Matt.

rdfscript's People

Contributors

lgrozinger, mattycrowther

rdfscript's Issues

Templates should be stored away from ordinary variables.

Right now, env.lookup() will return either the RDF term bound to the identifier, or the Template object bound to the identifier.

Suggest env.lookup() should only ever return RDF terms, and a new get_template() or similar be written for templates, so that identifiers which happen to be templates can be used for property names, variables etc.
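
One way to picture the suggestion is an environment that keeps two separate tables. The sketch below is not the project's Env; the class bodies, assign_template(), and the string stand-ins for RDF terms are assumptions made only for illustration.

```python
class Template:
    """Stand-in for the interpreter's Template class (hypothetical)."""

    def __init__(self, name):
        self.name = name


class Env:
    """Hypothetical environment with separate stores for terms and templates."""

    def __init__(self):
        self._symbols = {}    # identifier -> RDF term
        self._templates = {}  # identifier -> Template

    def assign(self, identifier, term):
        self._symbols[identifier] = term

    def assign_template(self, identifier, template):
        self._templates[identifier] = template

    def lookup(self, identifier):
        # Only ever returns an RDF term (or None), never a Template.
        return self._symbols.get(identifier)

    def get_template(self, identifier):
        return self._templates.get(identifier)


env = Env()
# The same identifier can now name both a term and a template.
env.assign("ex:part", "an RDF term")
env.assign_template("ex:part", Template("part"))
```

With two tables, an identifier that happens to name a template can still be looked up as a property name or variable without the caller having to check what kind of object came back.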

Syntax change for QNames

The syntax for QNames in 1.0 is <prefix>:<localname>, not <prefix>.<localname>.

This is because <expr> is a first-class, <toplevel> construct in 2.0, but not in 1.0, and QNames can be expressions.

This problem could be resolved (although it is perhaps not even a problem) by including some precedence rules in the parser.py grammar for <expr> and <instanceexp>.

Keep track of parameter position in templates.

Positional information needs to be kept in Parameter objects, so that arguments can be correctly assigned to parameters.

Parameter constructor takes a position argument, but the list is built backwards, which means the easy solutions are all hacky.
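
A possible way around the backwards list, sketched under assumptions: collect the names as the parser produces them, then reverse once and assign positions with enumerate. The Parameter class here is a stand-in with only a name and a position, and build_parameters() and bind_arguments() are hypothetical helpers, not functions from the codebase.

```python
class Parameter:
    """Stand-in for the Parameter class; only name and position are modelled."""

    def __init__(self, name, position):
        self.name = name
        self.position = position


def build_parameters(names_in_reverse):
    """Create Parameters from a list of names collected last-to-first."""
    ordered = list(reversed(names_in_reverse))
    return [Parameter(name, index) for index, name in enumerate(ordered)]


def bind_arguments(parameters, arguments):
    """Pair each argument with the parameter at the same position."""
    if len(parameters) != len(arguments):
        raise ValueError(
            "expected %d arguments, got %d" % (len(parameters), len(arguments))
        )
    by_position = sorted(parameters, key=lambda p: p.position)
    return {p.name: arg for p, arg in zip(by_position, arguments)}


# The template was written as (name, type), but the list arrived reversed.
params = build_parameters(["type", "name"])
bindings = bind_arguments(params, ["promoter1", "sbol.Promoter"])
```

Assigning positions after the list is complete avoids threading a counter through the grammar actions, at the cost of one extra pass over the parameters.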

Multi-line strings

It is not currently possible to parse multi-line strings using parser.py.

In 1.0, multi-line strings are delimited with '{' and '}'. Multi-line strings should be supported in future versions, but I prefer the Pythonic syntax, using either \ line continuation or triple quotes.

e.g.

string = "This is a "\
"multi-line string"

or ...

string = """This is a 
multi-line string"""

Type checking module

There are many places in the codebase where it is necessary to check what type of syntactic construct something is.

This happens often enough to justify a TypeCheck module offering functions like isX(thing) that return True if and only if thing is a valid X construct.
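
A minimal sketch of what such a module could look like. The Name, Template, and Expansion classes below are empty stand-ins, and the snake_case function names are an assumption following the isX(thing) pattern; the real constructs and naming may differ.

```python
class Name:
    """Stand-in for a name construct (hypothetical)."""


class Template:
    """Stand-in for a template construct (hypothetical)."""


class Expansion:
    """Stand-in for an expansion construct (hypothetical)."""


def is_name(thing):
    """Return True if and only if thing is a Name."""
    return isinstance(thing, Name)


def is_template(thing):
    """Return True if and only if thing is a Template."""
    return isinstance(thing, Template)


def is_expansion(thing):
    """Return True if and only if thing is an Expansion."""
    return isinstance(thing, Expansion)
```

Keeping the checks in one module means a change to how a construct is represented only needs to be reflected in one place.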

Wanted: error handling system

A graceful error handling system needs to be written and linked to Env, so that Env can signal error conditions such as wrong type, no such prefix etc.
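
One common approach, shown here only as a sketch, is a small exception hierarchy that Env raises and the top level catches. RDFScriptError, PrefixError, WrongTypeError, the location attribute, and the reduced Env with bind_prefix() and uri_for_prefix() are all hypothetical names, not part of the existing code.

```python
class RDFScriptError(Exception):
    """Base class for errors signalled by the interpreter (hypothetical)."""

    def __init__(self, message, location=None):
        super().__init__(message)
        self.location = location  # e.g. a (file, line) pair, if known


class PrefixError(RDFScriptError):
    """Raised when a prefix has not been bound."""


class WrongTypeError(RDFScriptError):
    """Raised when a value has an unexpected type."""


class Env:
    """Reduced stand-in for Env that only tracks prefixes."""

    def __init__(self):
        self._prefixes = {}

    def bind_prefix(self, prefix, uri):
        if not isinstance(uri, str):
            raise WrongTypeError("prefix URI must be a string")
        self._prefixes[prefix] = uri

    def uri_for_prefix(self, prefix):
        try:
            return self._prefixes[prefix]
        except KeyError:
            raise PrefixError("no such prefix: %s" % prefix) from None


def report(env, prefix):
    """Top-level handler: turn interpreter errors into a message."""
    try:
        return env.uri_for_prefix(prefix)
    except RDFScriptError as err:
        return "error: %s" % err
```

Because every error derives from one base class, the REPL or command-line driver can catch RDFScriptError in a single place and keep running.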

Env.assign() should check its arguments

Env.assign() should check that:

  • identifier is a URIRef, and does not already have an assignment, or handle this case in some defined way
  • value is really an rdflib.term object

What's with the QNames?

QNames, as far as I know, are an XML thing.

Should they really appear in the rdfscript syntax, given that it works with RDF graphs?

Perhaps renaming QName to Identifier would be clearer?

Tests wanted!

Test coverage is poor.

The only tests that exist are for the reader and the parser. There are no tests addressing, for example, how the RDF graph is constructed.

Proper path searching for imports

The import statement should search the path for the given import. At the moment only the directory containing the file being parsed is searched, and the code that does this is clunky.
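
A sketch of what searching a path could mean here, assuming an ordered list of directories and the .rdfsh extension used by the example files. The function find_import() and its parameters are hypothetical and do not describe the current Importer.

```python
from pathlib import Path


def find_import(name, search_path, extension=".rdfsh"):
    """Return the first matching file found on search_path, or None.

    name        -- the import target, with or without the extension
    search_path -- directories to try, in priority order
    """
    filename = name if name.endswith(extension) else name + extension
    for directory in search_path:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None
```

A caller might put the directory of the file being parsed first, followed by any user-supplied library directories, so that the current behaviour remains the default.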

ShortBOL1.0 is indentation and whitespace aware.

The syntax for ShortBOL2.0 will necessarily be different to that of ShortBOL1.0 if 2.0 does not keep track of indentation levels and certain whitespace.

An example of this is instance expressions in 1.0, where the indentation defines a block statement filled with property expressions.

A choice needs to be made whether the 2.0 syntax should mirror exactly that of 1.0, and if so, 2.0 must keep track of indents at a minimum.

Name evaluation should check that prefixes evaluate to URIs

The case in which a chained prefix evaluates to a non-URI value is not handled, for example:


@prefix p = <uri>
p.x = "value not  a uri"
p.x.y

This causes an error because p.x is not a URI.

The expected behaviour would be for p.x.y to evaluate to <urixy>

ShortBOL1.0 EBNF grammar wanted.

It would be useful to have an EBNF grammar that could be processed by PLY.yacc, even if just as an exercise to nail down the syntax of 1.0.

`import` doesn't work with '~'

Absolute imports don't work. The Importer import_file method should make a last attempt at absolute import if all else fails.
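
The last-attempt step could look something like the sketch below, which expands a leading '~' and accepts the result only if it is an absolute path to an existing file. resolve_absolute() is a hypothetical helper, not the Importer's import_file method.

```python
from pathlib import Path


def resolve_absolute(target):
    """Last-resort lookup: expand '~' and accept only absolute, existing files."""
    try:
        path = Path(target).expanduser()
    except RuntimeError:
        # pathlib raises RuntimeError when the home directory cannot be determined.
        return None
    if path.is_absolute() and path.is_file():
        return path
    return None
```

Returning None rather than raising lets the importer report a single "import not found" error after every strategy has been tried.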

pathlib not available for Python2

The Importer module relies on pathlib, which has only been in the standard library since Python 3.4.

Python 2 compatibility depends on an os.path-based implementation. With that in place, the project should support Python 2.7 at least.

`this` keyword or similar

It would be useful to be able to refer to the name of the expansion within templates and expansions, particularly if this could be passed over to extensions.

As the title says, similar to the this keyword in Java, self in Python etc.

Operators?

Should the language include operators on its data types?

For example:

  • compose (concatenate) two URIs? Split URIs?
  • arithmetic?
  • test for equality?

Extensions accessing internals

Giving extension authors access to the Env object of the interpreter is probably fine while no-one is writing their own extensions and the number of built-ins is small. However, before a first release, the Env should be covered up and extensions given access to some abstracted version, otherwise Env would have to be extremely stable.

So, a new abstraction on Env is needed that can remain stable for extensions, but allows Env itself to change.
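
As an illustration of the idea, a thin facade could expose only a few operations and keep the Env itself private. Everything below is assumed for the sketch: the stand-in Env with a plain list of triples, the ExtensionContext name, and its two methods.

```python
class Env:
    """Stand-in for the interpreter's Env; the real class holds far more state."""

    def __init__(self):
        self.triples = []
        self.internal_state = {}  # details extensions should not depend on


class ExtensionContext:
    """Hypothetical stable view of Env handed to extensions."""

    def __init__(self, env):
        self._env = env

    def triples(self):
        # Return a copy so extensions cannot mutate the store by accident.
        return list(self._env.triples)

    def add_triple(self, subject, predicate, obj):
        self._env.triples.append((subject, predicate, obj))


env = Env()
context = ExtensionContext(env)
context.add_triple("ex:a", "ex:knows", "ex:b")
snapshot = context.triples()
```

Extensions written against the facade keep working when Env's internals change, as long as the facade's few methods are maintained.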

@defaultPrefix is ugly

Suggest changing @defaultPrefix x to just @prefix x.

Also @prefix on its own could evaluate to the current default prefix.

Extensions

Suggestion to add an `@extension` pragma or similar, that will hook into a custom-written extension to do some processing of RDF triples.

Eg:

Identified(name, type) =>
  sbol.displayId = name
  @extension sbol.Identity

sbol.Identity might, for example, set persistentIdentity, identity and version with compliant URIs based on sbol.displayId.

List expression type

ShortBOL2 needs a list expression type so that multiple value property expressions can be implemented.

This is so that templates for things like SBOL2 ComponentDefinitions can have many sequence properties.

Indentation in the REPL

Since indentation can't be used in the REPL, and would be clunky anyway, users can't define templates or add properties to expansions at the command line.

Either:

  • Change the grammar in the REPL to allow for properties to be added.
  • Change the grammar so it no longer relies on indentation levels.
  • Change the REPL so it knows about line continuation and indentation.
