Kustvakt

Kustvakt is a user and policy management component for KorAP (Diewald et al., 2016). It manages user access to resources (i.e. corpus data), which are typically bound to licensing schemes. The licensing schemes of the IDS resources provided through KorAP (DeReKo) are complex and depend on the user's access location and purpose (Kupietz & Lüngen, 2014). To manage user access to resources, Kustvakt performs query rewriting with document restrictions (Bański et al., 2014).

Kustvakt acts as middleware in KorAP, binding together other components such as Koral (a query serializer) and Krill (a search component). As KorAP's API provider, it offers services, e.g. searching and retrieving annotation data of a match/hit, that can be used by KorAP clients such as Kalamar (a KorAP web user interface) and KorapSRU (the CLARIN FCS endpoint for KorAP).

Versions

  • Kustvakt lite version

    provides basic services including search, match info, statistics and annotation services, without user and policy management.

  • Kustvakt full version

    provides user and policy management and extended services, in addition to the basic services. This version requires a database (SQLite is provided) and an LDAP system (UnboundID InMemoryDirectoryServer is provided) for user authentication.

Recent changes on the project are described in the change logs (Changes files).

Setup

Prerequisites: JDK 11, Git, Maven 3

Clone the latest version of Kustvakt

git clone git@github.com:KorAP/Kustvakt.git

Kustvakt requires Krill and Koral. Please install both in your local Maven repository in the versions specified in Kustvakt/full/pom.xml. To package Kustvakt, change into the Kustvakt folder.

Packaging Kustvakt full version

cd full
mvn clean package

Packaging Kustvakt lite version

cd full
mvn package -P lite

The jar file is located in the target/ folder.

Running Kustvakt Server

java -jar target/Kustvakt-full-[version].jar    

runs the Kustvakt full version with the included example configuration file kustvakt.conf. See Customizing Kustvakt configuration.

Kustvakt full version requires a Krill index and LDAP configuration. By default, Kustvakt uses the sample-index located in the parent directory of the jar file and the embedded LDAP server example.

Running Kustvakt with a custom Spring XML configuration

Kustvakt can be run with an external Spring XML configuration file, e.g. test-config-icc.xml located in the data folder:

cd target/
java -jar Kustvakt-full-[version].jar --spring-config data/test-config-icc.xml 

Generating an OAuth2 super client

An OAuth2 super client is required to be able to use web services that require user authentication. Kustvakt can generate a super client automatically. See Setting Initial Super Client for User Authentication.

Web-services

All web-services including their usage examples are described in the wiki.

Some request examples:

  • search
curl 'http://localhost:8089/api/v1.0/search?q=Wasser&ql=poliqarp'
  • search public metadata
curl 'http://localhost:8089/api/v1.0/search?q=Wasser&ql=poliqarp&fields=textSigle,title,availability&access-rewrite-disabled=true'
  • match info
curl 'http://localhost:8089/api/v1.0/corpus/GOE/AGA/01784/p4145-4146?foundry=opennlp'
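
The search requests above can also be assembled programmatically. The following is a minimal Python sketch, assuming the server address and API version from the curl examples. The helper build_search_url and its parameter names are hypothetical and not part of Kustvakt. It only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Base URL and API version are copied from the curl examples above.
BASE_URL = "http://localhost:8089/api/v1.0"


def build_search_url(q, ql, fields=None, access_rewrite_disabled=False,
                     base_url=BASE_URL):
    """Build a search URL (hypothetical helper, not part of Kustvakt)."""
    params = [("q", q), ("ql", ql)]
    if fields:
        # Metadata fields are passed as one comma-separated value.
        params.append(("fields", ",".join(fields)))
    if access_rewrite_disabled:
        params.append(("access-rewrite-disabled", "true"))
    # Keep commas readable; other reserved characters are percent-encoded.
    return base_url + "/search?" + urlencode(params, safe=",")


print(build_search_url("Wasser", "poliqarp"))
print(build_search_url("Wasser", "poliqarp", fields=["textSigle", "title"],
                       access_rewrite_disabled=True))
```

Building the query string with urlencode avoids quoting mistakes when the query contains spaces or other reserved characters.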

Shutting down Kustvakt Server

Kustvakt server can be shut down by sending a POST request with a shutdown token. When Kustvakt server is started, a shutdown token is automatically generated and written to a shutdownToken file with the following format:

token=[shutdown-token]

A shutdown request can be sent as follows.

curl -H "Content-Type: application/x-www-form-urlencoded" \
  "http://localhost:8089/shutdown" -d @shutdownToken

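The same shutdown request can be prepared from a script. This is a minimal Python sketch, assuming the shutdownToken file format and the URL shown above. The helper names read_shutdown_token and build_shutdown_request are hypothetical. The request is only built, not sent.

```python
from urllib.request import Request


def read_shutdown_token(text):
    """Return the token from 'token=...' file content, or None if absent."""
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "token":
            return value
    return None


def build_shutdown_request(token, url="http://localhost:8089/shutdown"):
    """Build (but do not send) the POST request from the curl example."""
    # Like curl's -d @file, the body is the raw 'token=...' line.
    body = ("token=" + token).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


request = build_shutdown_request(read_shutdown_token("token=example\n"))
print(request.get_method(), request.full_url)
```

To use it against a running server, read the real shutdownToken file instead of the placeholder string and pass the request to urllib.request.urlopen.
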
Customizing Kustvakt configuration

Copy the default Kustvakt configuration file (e.g. full/src/main/resources/kustvakt.conf or lite/src/main/resources/kustvakt-lite.conf) to the same folder as the Kustvakt jar files (/target). Please do not change the name of the configuration file.

Setting Index Directory

Set krill.indexDir in the configuration file to the location of your Krill index, given as a path relative to the jar. For example, to use the sample index in Kustvakt's root directory:

krill.indexDir = ../../sample-index
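
To check where such a relative path points, it can be resolved by hand. The sketch below is illustrative Python, assuming the jar sits in Kustvakt/full/target and that the path is resolved against the jar's folder as stated above. The helper resolve_index_dir is hypothetical and is not how Kustvakt itself resolves the path.

```python
import posixpath


def resolve_index_dir(jar_dir, index_dir):
    """Resolve index_dir against the folder holding the jar (POSIX paths)."""
    if posixpath.isabs(index_dir):
        # Absolute paths are returned unchanged.
        return index_dir
    return posixpath.normpath(posixpath.join(jar_dir, index_dir))


print(resolve_index_dir("Kustvakt/full/target", "../../sample-index"))
# prints Kustvakt/sample-index
```

With the jar in Kustvakt/full/target, the value ../../sample-index therefore refers to the sample index in Kustvakt's root directory.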

Changing Kustvakt Server Port and Host

server.port = 8089
server.host = localhost

Setting Default Foundries

The following properties define the default foundries used for specific layers. For instance, in a rewrite, a default foundry may be added to a Koral query that lacks one.

default.foundry.partOfSpeech = tt
default.foundry.lemma = tt
default.foundry.orthography = opennlp
default.foundry.dependency = malt
default.foundry.constituent = corenlp
default.foundry.morphology = marmot
default.foundry.surface = base
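
The lookup these properties enable can be illustrated with a small Python sketch. It assumes the configuration consists of simple key = value lines as shown above and does not cover the full properties syntax. The helpers parse_properties and default_foundry are hypothetical and do not reproduce Kustvakt's actual rewrite logic.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines; skip blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def default_foundry(props, layer):
    """Return the configured default foundry for a layer, or None."""
    return props.get("default.foundry." + layer)


# Sample lines copied from the configuration above.
SAMPLE = """
default.foundry.partOfSpeech = tt
default.foundry.lemma = tt
default.foundry.orthography = opennlp
"""

props = parse_properties(SAMPLE)
print(default_foundry(props, "lemma"))        # tt
print(default_foundry(props, "orthography"))  # opennlp
```

A layer without a configured default yields None, so a caller can decide whether to leave the query unchanged.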

Advanced Setup

Advanced setup topics, such as LDAP configuration, setting up a test environment, database properties and mail configuration for email notifications, are described in the wiki.

License

Kustvakt is published under the BSD-2 License. It is developed as part of KorAP, the Corpus Analysis Platform at the Leibniz Institute for the German Language (IDS), member of the Leibniz Association.

Contributions

Contributions to Kustvakt are very welcome!

Ideally, contributions should be committed via the KorAP Gerrit server to facilitate code review (see Gerrit Code Review - A Quick Introduction). However, we are also happy to accept comments and pull requests via GitHub.

Please note that unless you explicitly state otherwise any contribution intentionally submitted for inclusion into Kustvakt shall – as Kustvakt itself – be under the BSD-2 License.

Publication

Diewald, Nils/Hanl, Michael/Margaretha, Eliza/Bingel, Joachim/Kupietz, Marc/Bański, Piotr/Witt, Andreas (2016): KorAP Architecture – Diving in the Deep Sea of Corpus Data. In: Calzolari, Nicoletta/Choukri, Khalid/Declerck, Thierry/Goggi, Sara/Grobelnik, Marko/Maegaard, Bente/Mariani, Joseph/Mazo, Helene/Moreno, Asuncion/Odijk, Jan/Piperidis, Stelios (eds.): Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia. Paris: European Language Resources Association (ELRA), 2016. pp. 3586-3591.

Bański, Piotr/Diewald, Nils/Hanl, Michael/Kupietz, Marc/Witt, Andreas (2014): Access Control by Query Rewriting. The Case of KorAP. In: Proceedings of the Ninth Conference on International Language Resources and Evaluation (LREC'14). European Language Resources Association (ELRA), 2014. pp. 3817-3822.

References

Kupietz, Marc/Lüngen, Harald (2014): Recent Developments in DeReKo. In: Calzolari, Nicoletta et al. (eds.): Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). Reykjavik: ELRA, pp. 2378-2385.
