
mod-source-record-storage

Copyright (C) 2018-2023 The Open Library Foundation

This software is distributed under the terms of the Apache License, Version 2.0. See the file "LICENSE" for more information.

Introduction

A FOLIO-compatible source record storage module.

Provides PostgreSQL-based storage to complement the data import module. Written in Java, it is based on the raml-module-builder (RMB) and uses Maven as its build system.

Compiling

Docker is now required to build mod-source-record-storage. docker-maven-plugin is used to create a Postgres container for running Liquibase scripts and generating jOOQ schema DAOs for type-safe SQL query building.

   mvn install

See that it says "BUILD SUCCESS" near the end.

Docker

Build the docker container with:

   docker build -t mod-source-record-storage .

Test that it runs with:

   docker run -t -i -p 8081:8081 mod-source-record-storage

Installing the module

Follow the guide in the Deploying Modules section of the Okapi Guide and Reference, which describes the process in detail.

First of all, you need a running Okapi instance. (Note that specifying an explicit 'okapiurl' might be needed.)

   cd .../okapi
   java -jar okapi-core/target/okapi-core-fat.jar dev

We need to declare the module to Okapi:

curl -w '\n' -X POST -D -   \
   -H "Content-type: application/json"   \
   -d @target/ModuleDescriptor.json \
   http://localhost:9130/_/proxy/modules

That ModuleDescriptor tells Okapi what the module is called, what services it provides, and how to deploy it.
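For reference, an abridged ModuleDescriptor has roughly the following shape; the interface version, permission name, and handler shown here are illustrative placeholders, not the module's actual descriptor:

    {
      "id": "mod-source-record-storage-x.y.z",
      "name": "Source record storage module",
      "provides": [
        {
          "id": "source-storage-records",
          "version": "3.0",
          "handlers": [
            {
              "methods": ["POST"],
              "pathPattern": "/source-storage/records",
              "permissionsRequired": ["source-storage.records.post"]
            }
          ]
        }
      ],
      "launchDescriptor": {
        "dockerImage": "mod-source-record-storage:x.y.z"
      }
    }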

Deploying the module

Next we need to deploy the module. There is a deployment descriptor in target/DeploymentDescriptor.json. It tells Okapi to start the module on 'localhost'.

Deploy it via Okapi discovery:

curl -w '\n' -D - -s \
  -X POST \
  -H "Content-type: application/json" \
  -d @target/DeploymentDescriptor.json  \
  http://localhost:9130/_/discovery/modules

Then we need to enable the module for the tenant:

curl -w '\n' -X POST -D -   \
    -H "Content-type: application/json"   \
    -d @target/TenantModuleDescriptor.json \
    http://localhost:9130/_/proxy/tenants/<tenant_name>/modules
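In its simplest form, the TenantModuleDescriptor just names the module to enable; the id must match the one declared in the ModuleDescriptor:

    {
      "id": "mod-source-record-storage-x.y.z"
    }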

Interaction with Kafka

There are several properties that should be set for modules that interact with Kafka: KAFKA_HOST, KAFKA_PORT, OKAPI_URL, and ENV (a unique environment ID). After setup, it is good to check the logs of all related modules for errors. Data import consumers and producers run in separate verticles that are set up in each module's RMB InitAPI, so that is the first place to check deploy/install logs.
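For example, when running the container directly, these could be passed as environment variables (the host, port, and ENV values below are placeholders):

    docker run -t -i -p 8081:8081 \
      -e KAFKA_HOST=10.0.2.15 \
      -e KAFKA_PORT=9092 \
      -e OKAPI_URL=http://10.0.2.15:9130 \
      -e ENV=folio-dev \
      mod-source-record-storage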

Environment variables that can be adjusted for this module and default values:

  • Relevant from the Iris release, module versions from 5.0.0:
    • "srs.kafka.ParsedMarcChunkConsumer.instancesNumber": 1
    • "srs.kafka.DataImportConsumer.instancesNumber": 1
    • "srs.kafka.ParsedRecordChunksKafkaHandler.maxDistributionNum": 100
    • "srs.kafka.DataImportConsumer.loadLimit": 5
    • "srs.kafka.DataImportConsumerVerticle.maxDistributionNum": 100
    • "srs.kafka.ParsedMarcChunkConsumer.loadLimit": 5
  • Relevant from the Juniper release, module versions from 5.1.0:
    • "srs.kafka.QuickMarcConsumer.instancesNumber": 1
    • "srs.kafka.QuickMarcKafkaHandler.maxDistributionNum": 100
  • Relevant from the Juniper release (module versions from 5.1.0) to the Kiwi release (module versions from 5.2.0):
    • "srs.kafka.cache.cleanup.interval.ms": 3600000
    • "srs.kafka.cache.expiration.time.hours": 3
  • Relevant from the Morning Glory release (module versions from 5.4.0):
    • "srs.cleanup.last.updated.days": 7
    • "srs.cleanup.limit": 100
    • "srs.cleanup.cron.expression": 0 0 0 * * ?
  • Relevant from the Orchid release, module versions from 5.6.0:
    • "srs.kafka.AuthorityLinkChunkKafkaHandler.maxDistributionNum": 100
    • "srs.kafka.AuthorityLinkChunkConsumer.loadLimit": 2
  • Relevant from the Poppy release, module versions from 5.7.0:
    • "srs.linking-rules-cache.expiration.time.hours": 12

Database schemas

The mod-source-record-storage module uses a relational approach and Liquibase to define database schemas.

Database schemas are described in Liquibase scripts using XML syntax. Every script file should contain only one "databaseChangeLog", consisting of at least one "changeSet" that describes the operations on tables. Scripts should be named using the following format: yyyy-MM-dd--hh-mm-schema_change_description, where
yyyy-MM-dd--hh-mm is the date of script creation;
schema_change_description is a short description of the change.

Each "changeset" should be uniquely identified by the "author" and "id" attributes. It is advised to use the Github username as "author" attribute. The "id" attribute value should be defined in the same format as the script file name.

If needed, the database schema name can be obtained via the Liquibase context property ${database.defaultSchemaName}.
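Putting these conventions together, a script such as 2019-08-14--14-00-create-example-table.xml might look like the following sketch (the table and column are placeholders, not a real schema change from this module):

    <databaseChangeLog
        xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
            http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.8.xsd">

      <!-- "id" mirrors the file name; "author" is the GitHub username -->
      <changeSet id="2019-08-14--14-00-create-example-table" author="folio-dev">
        <createTable tableName="example" schemaName="${database.defaultSchemaName}">
          <column name="id" type="uuid">
            <constraints primaryKey="true" nullable="false"/>
          </column>
        </createTable>
      </changeSet>

    </databaseChangeLog>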

Liquibase scripts are stored in the /resources/liquibase/ directory. Script files for the module and tenant schemas are stored separately, in /resources/liquibase/module/scripts and /resources/liquibase/tenant/scripts respectively.
To simplify the tracking of schema changes, tenant schema versioning is reflected in the directory structure:

/resources/liquibase
    /tenant/scripts
              /v-1.0.0
                  /2019-08-14--14-00-create-tenant-table.xml
              /v-2.0.0
                  /2019-09-03--11-00-change-id-column-type.xml
    /module/scripts
              /v-1.0.0
                  /2019-09-06--15-00-create-record-field-table.xml

Database redesign

The database has recently been redesigned to use a standard relational table design, with less use of JSONB columns and more use of foreign key constraints and default B-tree indexes optimized for single-value columns. The rationale was to improve the performance of data retrieval and data import. A significant change was the addition of a leader_record_status column on the records table, populated via a trigger on insert and update on the marc_records table. This makes it possible to query the status of a MARC record quickly, and also to filter on the leader record status values that indicate a record has been deleted.

Source Record Storage ER Diagram

During the redesign we opted to use jOOQ for type-safe, fluent SQL building. The jOOQ type-safe tables and resources are generated during the generate-sources Maven lifecycle phase using the vertx-jooq reactive Vert.x generator. The code is generated from the database metadata. For this to occur during the build, liquibase-maven-plugin is used to consume the Liquibase changelog and provision a temporary database started with embedded-postgresql-maven-plugin.

jOOQ also allows plain SQL strings, but their use is not recommended: the type-safe Java abstractions, including variable binding, eliminate SQL injection vulnerabilities.
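As an illustration, here is a minimal, self-contained jOOQ sketch (generic jOOQ usage, not the module's actual DAO code; in the module, field and table references come from the generated classes rather than plain strings):

    import org.jooq.DSLContext;
    import org.jooq.SQLDialect;
    import org.jooq.impl.DSL;

    import static org.jooq.impl.DSL.field;
    import static org.jooq.impl.DSL.table;

    public class JooqExample {
      public static void main(String[] args) {
        DSLContext dsl = DSL.using(SQLDialect.POSTGRES);
        // "d" is attached as a bind variable, never spliced into the SQL
        // string, which is what rules out SQL injection.
        String sql = dsl
            .select(field("id"), field("leader_record_status"))
            .from(table("records"))
            .where(field("leader_record_status").eq("d"))
            .getSQL();
        System.out.println(sql);
        // select id, leader_record_status from records where leader_record_status = ?
      }
    }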

REST Client for mod-source-record-storage

The module provides an RMB-generated client for its endpoints. The client is packaged into a lightweight jar.

Maven dependency

    <dependency>
      <groupId>org.folio</groupId>
      <artifactId>mod-source-record-storage-client</artifactId>
      <version>x.y.z</version>
      <type>jar</type>
    </dependency>

Where x.y.z is the version of mod-source-record-storage.

Usage

SourceStorageClient is generated by RMB and provides methods for all of the module's endpoints described in the RAML file:

    // create records client object with okapi url, tenant id and token
    SourceStorageRecordsClient client = new SourceStorageRecordsClient("localhost", "diku", "token");

Client methods work with RMB-generated data classes based on JSON schemas. The mod-source-record-storage-client jar contains only the RMB-generated DTOs and clients.

    // create new record entity
    Record record = new Record();
    record.setRecordType(Record.RecordType.MARC_BIB);
    record.setRawRecord(new RawRecord().withContent("content"));

Example of sending a request to mod-source-record-storage to create a new Record:

    // send request to mod-source-record-storage
    client.postSourceStorageRecords(null, record, response -> {
      // processing response
      if (response.statusCode() == 201) {
        System.out.println("Record is successfully created.");
      }
    });

Load sample data for module testing

To load sample data after module initialization, POST a testMarcRecordsCollection DTO to /source-storage/populate-test-marc-records:

{
  "rawRecords": [
    ...
  ]
}
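For example (the tenant, token, and payload file below are placeholders):

curl -w '\n' -X POST -D - \
  -H "Content-type: application/json" \
  -H "X-Okapi-Tenant: diku" \
  -H "X-Okapi-Token: token" \
  -d @testMarcRecordsCollection.json \
  http://localhost:9130/source-storage/populate-test-marc-records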

Issue tracker

See project MODSOURCE at the FOLIO issue tracker.
