benjamin-heasly / openhds-rest

RESTful service for Health and Demographic Surveillance

Java 100.00%

openhds-rest's People

munk reloadbrain rtmill

openhds-rest's Issues

Refactor ResourceLinkAssembler to avoid instanceof

ResourceLinkAssembler uses instanceof to add extra links for objects of type ExtIdIdentifiable. This is brittle design.

Refactor ResourceLinkAssembler around an interface, with separate implementations for UuidIdentifiable and ExtIdIdentifiable. Let the appropriate implementation get Autowired into each controller. Then dynamic dispatch will replace instanceof.
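
The refactoring above might look roughly like the following plain-Java sketch (Spring wiring omitted). All names other than UuidIdentifiable and ExtIdIdentifiable are hypothetical, as are the link formats, and the sketch assumes ExtIdIdentifiable extends UuidIdentifiable, which may not match the real domain classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the project's entity interfaces.
interface UuidIdentifiable { String getUuid(); }
interface ExtIdIdentifiable extends UuidIdentifiable { String getExtId(); }

// One assembler interface; each implementation handles its own entity type.
interface LinkAssembler<T extends UuidIdentifiable> {
    List<String> linksFor(T entity);
}

// Adds only the canonical "self" link, based on the uuid.
class UuidLinkAssembler<T extends UuidIdentifiable> implements LinkAssembler<T> {
    protected final String basePath;

    UuidLinkAssembler(String basePath) { this.basePath = basePath; }

    @Override
    public List<String> linksFor(T entity) {
        List<String> links = new ArrayList<>();
        links.add("self=" + basePath + "/" + entity.getUuid());
        return links;
    }
}

// Adds the extId link on top of the uuid links; no instanceof needed.
class ExtIdLinkAssembler<T extends ExtIdIdentifiable> extends UuidLinkAssembler<T> {
    ExtIdLinkAssembler(String basePath) { super(basePath); }

    @Override
    public List<String> linksFor(T entity) {
        List<String> links = super.linksFor(entity);
        links.add("external=" + basePath + "/external/" + entity.getExtId());
        return links;
    }
}

// Example entity, for illustration only.
class SampleLocation implements ExtIdIdentifiable {
    @Override public String getUuid() { return "u-1"; }
    @Override public String getExtId() { return "loc-1"; }
}
```

A controller for an ExtIdIdentifiable entity would receive an ExtIdLinkAssembler, and every other controller a UuidLinkAssembler, so the choice is made once at wiring time rather than per object.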

Add OpenHDS census entities

Add remaining OpenHDS census entities, besides Location.

Add HAL "Curies" for OpenHDS rels

We should make our API "Discoverable" by using curies, as described here ("API Discoverability", near the bottom):
http://stateless.co/hal_specification.html

This would mean having a resource(s) that serves documentation. This would mean having documentation at all.

Support bulk POST as well as bulk GET

I (Ben) would like to allow bulk POST of data for populating the system. This will make it easier to integrate with large existing datasets, for project bootstrapping or migration.

Bulk requests will only make sense for POST, not PUT, because it will handle multiple records at once.

As with bulk GETs, this should make use of the same XML and JSON message converter beans used by the rest of the application.

This should use the same conventions for representing streams/collections that the application currently uses for GET. For XML, the root element has a plural entity name and contains a list of elements representing single entity records:

<entities>
  <entity>…</entity>
  …
</entities>

For JSON there is an array at the root, containing an object element for each single entity record:

[{}, {}, …]

The implementation will need to use an XML or JSON stream parser to identify and buffer each whole record in memory. Then it can pass each record to the appropriate message converter bean for unmarshalling.

This approach will be similar to what clients have to do when they parse bulk data. For example, we used a similar approach for the openhds-tablet in Bioko.

The implementation might be a subclass of EntityIterator, perhaps named StreamParsingEntityIterator. The constructor could take an InputStream, an ObjectMapper, and a helper able to identify and buffer each whole record in memory, perhaps named StreamRecordParser.

It would be nice to implement this behavior in a Spring message converter. That way bulk inputs could be handled "magically" and REST controller methods could be implemented cleanly in terms of EntityIterator.

If this doesn't work out, we can implement this behavior explicitly with controller methods that accept the HTTP request and chew through it directly.
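
As a rough illustration of the "identify and buffer each whole record" step for JSON, here is a minimal sketch that needs only the JDK. JsonRecordParser is a hypothetical name standing in for the StreamRecordParser idea. It assumes the root is an array of objects, and it only tracks braces and string quoting, so it is not a full JSON parser. Each returned record could then be handed to an ObjectMapper or message converter.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: buffers one top-level JSON object at a time
// from a root-level JSON array, without reading the whole array first.
class JsonRecordParser {
    private final Reader reader;

    JsonRecordParser(Reader reader) { this.reader = reader; }

    // Returns the text of the next record, or null when the input is exhausted.
    String nextRecord() {
        StringBuilder record = new StringBuilder();
        int depth = 0;            // nesting depth of '{' ... '}'
        boolean inString = false; // inside a JSON string literal
        boolean escaped = false;  // previous char was a backslash in a string
        try {
            int c;
            while ((c = reader.read()) != -1) {
                char ch = (char) c;
                if (depth == 0 && ch != '{') {
                    continue; // skip '[', ',', ']' and whitespace between records
                }
                record.append(ch);
                if (inString) {
                    if (escaped) escaped = false;
                    else if (ch == '\\') escaped = true;
                    else if (ch == '"') inString = false;
                } else if (ch == '"') {
                    inString = true;
                } else if (ch == '{') {
                    depth++;
                } else if (ch == '}') {
                    depth--;
                    if (depth == 0) return record.toString();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null; // end of input (a trailing partial record is dropped)
    }

    // Demo helper only: collects every record, which defeats the streaming purpose.
    static List<String> parseAll(String json) {
        JsonRecordParser parser = new JsonRecordParser(new StringReader(json));
        List<String> records = new ArrayList<>();
        for (String r = parser.nextRecord(); r != null; r = parser.nextRecord()) {
            records.add(r);
        }
        return records;
    }
}
```

In a real EntityIterator subclass, next() would call nextRecord() lazily and unmarshal one record at a time, so memory use stays bounded by the largest single record.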

Work out database isolation for tests

Currently each integration test must clean up the databases to prevent leaking into other tests.

This is not the way.

We can use Spring DbUnit support to set up and tear down the database for each test. I'm holding off on this because it will be a good synergy with Wolfe's master's project.

Revise SampleDataGenerator to generate structured data sets of various sizes.

Some motivation:

Currently, SampleDataGenerator can generate one arbitrary data set, just enough to pass integration tests that require pre-existing data. The records created are ad-hoc and a bit messy.
SampleDataGenerator has become a large class which imports all of the repositories. Since it touches all entities, it's a hot spot for merge conflicts.
In order to generate valid data, SampleDataGenerator duplicates some service logic.
It would be useful to generate structured data sets of various sizes, to facilitate project bootstrapping, client development, and performance testing.
The SampleDataGenerator needs to know the order in which it's safe to create or delete entities.

So I'd like to revise the SampleDataGenerator.

Some goals:

Accept command line arguments that specify data set size:
- default is make sure essential data are in place (like the first User), but don't mess with existing data
- a size argument can specify "generate a data set with at least this many records"
Generate structured data in meaningful chunks, like whole-families.
Use the services instead of duplicating service logic.
Don't import every service. Break the problem into smaller chunks, perhaps LocationGenerator and FamilyGenerator
Don't hard-code the ordering of entity creation / deletion. Instead, let smaller chunks like LocationGenerator declare some ordering information which the SampleDataGenerator can obey.
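
One way the last goal could work is for each generator to declare which other generators it depends on, and for SampleDataGenerator to derive a safe order with a depth-first topological sort. The sketch below is only an illustration: GeneratorOrdering, the string-keyed dependency map, and UserGenerator are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical: derives a creation order from dependencies declared per generator.
class GeneratorOrdering {

    // dependsOn maps a generator name to the generators that must run before it.
    static List<String> creationOrder(Map<String, List<String>> dependsOn) {
        List<String> ordered = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        // TreeSet gives a stable starting order regardless of map implementation.
        for (String name : new TreeSet<>(dependsOn.keySet())) {
            visit(name, dependsOn, done, inProgress, ordered);
        }
        return ordered;
    }

    private static void visit(String name, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress, List<String> ordered) {
        if (done.contains(name)) return;
        if (!inProgress.add(name)) {
            throw new IllegalStateException("Cyclic dependency at " + name);
        }
        for (String dep : dependsOn.getOrDefault(name, Collections.emptyList())) {
            visit(dep, dependsOn, done, inProgress, ordered);
        }
        inProgress.remove(name);
        done.add(name);
        ordered.add(name); // all dependencies are already in the list
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("FamilyGenerator", Arrays.asList("LocationGenerator", "UserGenerator"));
        deps.put("LocationGenerator", Arrays.asList("UserGenerator"));
        deps.put("UserGenerator", Collections.emptyList());
        System.out.println(creationOrder(deps));
    }
}
```

Deletion can use the same list in reverse, so dependents are removed before the entities they reference.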

Add indexes to support expected queries

JPA 2.1 allows us to specify indexes on fields we want to query. We should add a bunch of these for queries on fields like extId and insertDate.

Here's an example:
http://stackoverflow.com/questions/17620405/the-annotation-index-is-disallowed-for-this-location

Add H5S links using extIds in addition to Uuids

Currently, we have H5S links that follow UUID references between entities.

Some entities also have extId ids. It would be useful to expose links based on extIds as well.

One caveat: extIds can change and won't always be unique. So extId links will have to be resolved like queries (0 or many results), rather than canonical locations for unique resources.

Factor LocationRestController PUT and POST behavior into superclass

LocationRestController implements proof of concept PUT and POST behavior.

We should implement most of this once, in the entity controller superclass.

Some controller subclasses may override to un-support these methods. More generally, we can protect these operations with authorizations at the service level.

Create JpaRepository type hierarchy that mirrors Services

I think it would make sense to have a repository hierarchy that mirrored the new service hierarchy where things like "findByExtId()" could be written once in the "AuditableCollectedRepository" and used by all entity specific repositories where finding by extId would be helpful.

Add a Dockerfile

With Spring Boot, our app config and deployment is pretty simple.

So it should be easy for us to write a Dockerfile that specifies a Docker image, which will make it really easy to deploy the app.

Version 1 is to write the Dockerfile with simple config like we use for testing. This may bundle-in dependencies like MySQL. Then set up DockerHub integration to trigger re-builds of the image whenever we commit to master.

Version 2 is to set up integration with our CI service (see #65) so that we can automatically deploy a container after each successful build.

Set up continuous integration

Set up a continuous build and deployment server. This will facilitate testing and collaboration.

It looks like people like Travis as a free, easy to use CI service:
http://stackshare.io/travis-ci

Version 1 is to set up the GitHub integration so that all pushes trigger the integration tests, using gradle test or similar.

Make site properties configurable

Instead of static site properties that can only be modified at deployment time, we should make site properties queryable and configurable through the REST API. This will facilitate integration with clients that need to discover the site properties.

This will also move towards my (Ben's) goal of deploying a vanilla instance of the openhds-rest server and doing all the site baseline/config/bootstrapping through the REST API. Down with obscure config files!

SiteProperty can be a simple UUID entity holding a name and a string value.

At startup, we can pre-populate the SiteProperty table with required properties found in the default site.properties file. These must be present for the application to work.

Then at runtime, the values can be customized and added to by a user with sufficient privileges.

Support queries by ExtId

For those entities that implement ExtIdentifiable, expose REST queries based on extId.

Note that extIds are not always unique, so extId queries must be allowed to return collections.

ProjectCodes should accommodate code look-up as well as set-based value constraints.

The existing openhds-server has two kinds of "code" configuration in two files:

codes.properties enables simple name -> value lookup
value-constraint.xml enables constraints of the form “is value x a member of set Y”?

The ProjectCodes model and service should be extended to accommodate both operations.

The ProjectCodes model currently has

codeName
codeValue

It should add

codeGroup
description

This model will accommodate all the data found in codes.properties and value-constraint.xml.

The ProjectCodes service should then expose operations of the form

lookup (codeName) -> value
predicate (codeGroup, value x) -> is x one of the codeValues in codeGroup?

The lookup assumes codeNames are still unique -- codeGroup is metadata, not a namespace.
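
The two operations could be sketched as follows, using an in-memory map in place of the real repository. The field names come from the model described above; the class names, method names, and sample codes are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Model with the two existing fields plus the two proposed ones.
class ProjectCode {
    final String codeName;
    final String codeValue;
    final String codeGroup;
    final String description;

    ProjectCode(String codeName, String codeValue, String codeGroup, String description) {
        this.codeName = codeName;
        this.codeValue = codeValue;
        this.codeGroup = codeGroup;
        this.description = description;
    }
}

// Hypothetical service: codeName is the unique key, codeGroup is metadata only.
class ProjectCodeService {
    private final Map<String, ProjectCode> byName = new LinkedHashMap<>();

    void add(ProjectCode code) {
        if (byName.putIfAbsent(code.codeName, code) != null) {
            throw new IllegalArgumentException("Duplicate codeName: " + code.codeName);
        }
    }

    // lookup (codeName) -> value
    Optional<String> lookup(String codeName) {
        return Optional.ofNullable(byName.get(codeName)).map(c -> c.codeValue);
    }

    // predicate (codeGroup, value x) -> is x one of the codeValues in codeGroup?
    boolean isValueInGroup(String codeGroup, String value) {
        return byName.values().stream()
                .anyMatch(c -> c.codeGroup.equals(codeGroup) && c.codeValue.equals(value));
    }
}
```

For example, with codes genderMale=M and genderFemale=F in a "gender" group, lookup("genderMale") yields "M" and isValueInGroup("gender", "X") is false.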

So in summary, adding codeGroups to the model and service operations will make ProjectCodes more expressive and useful.

Adding a description to each code will just be handy. It's a good idea from the existing openhds-server.

Salt and hash passwords for Users and FieldWorkers

Currently User and FieldWorker passwords are stored as plain text.

Incorporate bcrypt (or better?) to salt and hash passwords.

Make the "password" fields transient and ignored by the database.

Only persist the salted, hashed "passwordHash" fields.
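
To show the salt-then-hash flow without adding a dependency, the sketch below uses PBKDF2 from the JDK instead of bcrypt. This is a substitution for illustration, not a recommendation over bcrypt. PasswordHasher, the "salt:hash" storage format, and the iteration count are all assumptions.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper: only the returned string would be persisted as passwordHash.
class PasswordHasher {
    private static final int ITERATIONS = 100_000; // assumed work factor
    private static final int KEY_BITS = 256;
    private static final int SALT_BYTES = 16;

    // Returns "salt:hash", both Base64 encoded, with a fresh random salt.
    static String hash(String password) {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return encode(salt) + ":" + encode(derive(password, salt));
    }

    // Re-derives the hash with the stored salt and compares in constant time.
    static boolean matches(String password, String stored) {
        String[] parts = stored.split(":");
        byte[] salt = Base64.getDecoder().decode(parts[0]);
        byte[] expected = Base64.getDecoder().decode(parts[1]);
        return MessageDigest.isEqual(expected, derive(password, salt));
    }

    private static byte[] derive(String password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password.toCharArray(), salt, ITERATIONS, KEY_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        }
    }

    private static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }
}
```

Because the salt is random, hashing the same password twice gives different stored strings, while matches() still accepts the original password for either one.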

Add "raw" REST results

For large data transfers, we should add "raw" resource flavors.

These should be based on repository query methods that accept an insertDate range and return a Stream of results.

The results should be Marshalled incrementally and written directly to the response body.

The results should never be loaded into memory all at once.
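
A minimal sketch of the incremental writing, assuming JSON output and a per-record marshalling function. RawResultWriter and its method are hypothetical; in the application the Stream would come from a repository query and the Writer would wrap the response body.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.function.Function;
import java.util.stream.Stream;

// Hypothetical helper: marshals and writes one record at a time.
class RawResultWriter {

    static <T> void writeJsonArray(Stream<T> records, Function<T, String> marshaller, Writer out) {
        // try-with-resources closes the stream, releasing any underlying cursor.
        try (Stream<T> closing = records) {
            out.write("[");
            Iterator<T> it = closing.iterator();
            boolean first = true;
            while (it.hasNext()) {
                if (!first) {
                    out.write(",");
                }
                out.write(marshaller.apply(it.next())); // only this record is in memory
                first = false;
            }
            out.write("]");
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        writeJsonArray(Stream.of("a", "b"), s -> "{\"uuid\":\"" + s + "\"}", out);
        System.out.println(out);
    }
}
```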

Replace EntityControllerRegistry with Spring EntityLinks

I wrote an EntityControllerRegistry class which helps us map from entities to controllers. This makes it easy to build links for any given entity instance.

Turns out this is a solved problem with EntityLinks. So let's use it.

A nice thing about EntityLinks is that the mapping from entity class to controller class is done with controller-level annotations, not explicit code.

UuidRestController should support DELETE

Allow clients to make DELETE requests for single records.

Internally, this would mark the record as "voided" and not actually delete it.

As a consequence, GET responses should exclude voided records.

Add OpenHDS update entities

Migrate OpenHDS Update entities to this new project.

Grow tutorial code into something maintainable

This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/

The tutorial puts almost all the code and config in one java file.

Reorganize the code to be readable and maintainable, with familiar package names like
"domain" and "resource".

Refactor Domain Hierarchy?

Do we want to change the hierarchy of the OpenHDS Domain classes to

UuidIdentifiable -> Auditable -> AuditableCreated (No extId) -> AuditableCollected (Has extId)

This would mean that anything collected would have extIds and that the "ExtIdentifiable" type may be redundant.

This hierarchy also means that UuidIdentifiable could be a class and not an interface.

The design would be cleaner and more straightforward, but I feel like I am overlooking something major in terms of motivation for the current way it's designed, with UuidIdentifiable and ExtIdentifiable being separate interfaces.

I assumed the current design is the way it is because of the fact that there are 'type irregularities' like FieldWorker who has an extId but isn't collectable. I think it is desirable to have cases like this 'squashed' into the proposed hierarchy (i.e. making FieldWorker have a FieldWorkerId instead of an extId, etc).

We spoke about this in person - but I'd like to come to a solid conclusion on steps forward.

Replace Accounts and Bookmarks with OpenHDS domain

This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/

The tutorial deals with "Account" and "Bookmark" entities.

Replace these with the OpenHDS Location entity. This would be the first of many entities to be migrated from OpenHDS.

Create Super Tests for Services

Create a set of integration tests that test the fields shared among OpenHDS entities which can be extended for each specific entity.

This will allow us to write only 1 set of tests for the fundamental functionality of all the entities like get, get all, etc.

Entity specific implementations will come later as the domain documentation and specification is created with the services.

Add a category field for ErrorLogs

In Bioko we found it useful to query/filter for ErrorLogs by some category, like form type.

Add a category string to the ErrorLog model. Also expose this string as a query parameter in the ErrorLog resource.

Detect update conflicts? With record revision numbers?

Currently, changes to entities can be tracked by insertDateTime and lastModifiedDateTime. But update conflicts are not explicitly detected. Rather, the last update always wins.

Should the service layer attempt to detect revision conflicts between records? For example, it might use an optimistic locking design, in which clients submitting updates must also submit an expected timestamp or revision number as part of the update record. The update would only succeed if the expected value matched the current value in the server database.

Would lastModifiedDateTime timestamps be sufficient for detecting revision conflicts? I suspect these would lead to some racey edge cases. Should we then add an explicit revision number to each record?
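
To make the revision-number option concrete, here is a small in-memory sketch of optimistic locking. Every name in it is hypothetical. With JPA, a field annotated with @Version is the usual way to get this check from the persistence layer, but whether that fits this service layer is an open question.

```java
import java.util.HashMap;
import java.util.Map;

// Thrown when a client's expected revision is stale.
class RevisionConflictException extends RuntimeException {
    RevisionConflictException(String message) { super(message); }
}

// Immutable snapshot of a record at one revision.
class VersionedRecord {
    final String uuid;
    final String content;
    final long revision;

    VersionedRecord(String uuid, String content, long revision) {
        this.uuid = uuid;
        this.content = content;
        this.revision = revision;
    }
}

// Hypothetical store: an update succeeds only if the expected revision is current.
class VersionedStore {
    private final Map<String, VersionedRecord> records = new HashMap<>();

    synchronized VersionedRecord create(String uuid, String content) {
        VersionedRecord record = new VersionedRecord(uuid, content, 0);
        records.put(uuid, record);
        return record;
    }

    synchronized VersionedRecord update(String uuid, String content, long expectedRevision) {
        VersionedRecord current = records.get(uuid);
        if (current == null) {
            throw new IllegalArgumentException("No such record: " + uuid);
        }
        if (current.revision != expectedRevision) {
            throw new RevisionConflictException("Expected revision " + expectedRevision
                    + " but current revision is " + current.revision);
        }
        VersionedRecord next = new VersionedRecord(uuid, content, current.revision + 1);
        records.put(uuid, next);
        return next;
    }
}
```

An integer revision avoids the timestamp edge cases mentioned above, since two updates within the same clock tick still get distinct revisions.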

Work out correct CORS headers

This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/

For now I'm allowing all origins (Access-Control-Allow-Origin = *) and allowing default verbs from the tutorial.

Work out which origins and verbs we really want.

Add OpenHDS error module

Add the OpenHDS error logging module which logs errors and makes them available for query.

Implement Spring Validator(s)

Validation in the services will be handled by implementing the Spring Validator service as described here:
http://docs.spring.io/spring/docs/current/spring-framework-reference/html/validation.html

This will provide an obvious and convenient location for all validation to take place as well as a central area to catch exceptions and log them before throwing them.

Add Location Registration

Add the ability to PUT or POST a new Location. This should use a LocationRegistration DTO. This would be part of the "inbound" boundary of the application.

Make persistence configurable

This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/

The tutorial includes mysql as an explicit dependency.

Reconfigure to make the persistence configurable from the container.

HATEOAS link for each entity UUID

Generalize HATEOAS link building to apply to OpenHDS entities in a generic way. Each entity representation should contain:

actual content
"self" link
UUID links corresponding to "stubs" produced by our ShallowCopier

Swagger to allow exploration of the API

Swagger provides a UI for exploring REST APIs that works well with Spring Boot applications. As I understand it, it infers the various paths from the controllers and constructs examples for the request body. We use it at my work to provide easy, discoverable documentation for anyone who wants to integrate with a service, whether it's an internal team or an external vendor.

Replay Bioko data into openhds-rest

As a real-world test of openhds-rest, I (Ben) want to replay a large dataset from the Bioko CIMS project into an openhds-rest instance.

This will require an ETL tool.

The tool must read records from the existing Bioko CIMS form database, convert each record to a JSON entity registration, and POST it to the openhds-rest instance.

This would be a good test for bulk POSTs. See #68.

Pentaho Kettle is one possibility for this ETL work.

Talend Open Studio looks nicer to me.

Wire up Users with Spring Security UserDetailsService

This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/

The tutorial punts on security, creating Granted Authorities on the fly.

Reimplement security to use best practices. This probably means local database plus OAuth2.

Populate ProjectCodes repository at startup

When the server starts, it should read default project codes from a properties file, and add each code to the project code repository if it doesn't already exist.

This will ensure that required codes exist.

This should never over-write codes that have already been customized by a project.
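
The "add if absent, never overwrite" rule could be sketched like this, with a plain Map standing in for the project code repository. DefaultCodeLoader and the sample codes are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical startup helper: loads defaults without touching customized codes.
class DefaultCodeLoader {

    // Adds each default code only if its name is absent; returns how many were added.
    static int loadDefaults(String propertiesText, Map<String, String> repository) {
        Properties defaults = new Properties();
        try {
            defaults.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int added = 0;
        for (String name : defaults.stringPropertyNames()) {
            // putIfAbsent leaves any existing (possibly customized) value in place.
            if (repository.putIfAbsent(name, defaults.getProperty(name)) == null) {
                added++;
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Map<String, String> repository = new HashMap<>();
        repository.put("genderMale", "CUSTOM");
        int added = loadDefaults("genderMale=M\ngenderFemale=F\n", repository);
        System.out.println(added + " added, genderMale=" + repository.get("genderMale"));
    }
}
```

Running the loader on every startup is then safe, because repeated runs only fill in codes that are still missing.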

Work out container config

This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/

For now, I'm testing and deploying the "easy way" from IntelliJ, with an embedded Tomcat.

Work out production war-based deployment and container config.

Tests to verify correct/sensible rendering of HAL+Json, HAL+Xml

It took some fussing with message converters and dependencies to get good rendering of responses with HAL+Json and HAL+Xml. So we should add some integration tests that verify we are still getting good rendering.

Check for things like:

Paged responses include embedded collections with generic names like "locations", not implementation-specific names like "locationList".
Xml Links look like single elements with "rel" and "href" attributes (not nested elements)

English documentation of the REST API

The repository should include English prose documentation of the REST API.

This should include a broad overview, some expected usages, a summary of all current endpoints, and pictures.

Sample Registrations

Add /sampleRegistration resources for each entity.

These should serve up templates for clients that want to submit registrations, using JSON or XML as requested by the client.

These should include "flat" entities with all top-level fields represented. "uuid" fields that reference related entities should be filled in to reference the "UNKNOWN" entity records.

Factor out common test behaviors for entity controllers

Based on the current LocationRestControllerTest and UserRestControllerTest, factor out common behaviors to test for all RestControllers. These should include:

paged queries with a few parameters
single records at canonical location with correct and incorrect ids
PUT and POST new records with valid and invalid content and ids
PUT and POST update records with valid and invalid content and ids
correctly rejected unauthenticated user
correctly rejected unauthorized user
JSON input and output
XML input and output

And where appropriate:

single records at external id location with correct and incorrect ids

What "complex" operations should openhds-rest support

From a domain point of view, we need to know what operations openhds-rest needs to support.

Some simple operations, like create or update a single entity, seem obvious.

Others are more complex. For example, in the CIMS Bioko project, we had a concept of a "household", which incorporated all of the Census entities in a typical pattern, like Location 1:1 SocialGroup, and Relationships only recorded between household members and heads.

For Bioko, we had a dedicated endpoint that could register an Individual as a household "head" or "member", with many side effects following the household pattern. The "head" and "member" registrations are two examples of "complex" operations.

Should openhds-rest support these complex operations for household head and member? If so, should it explicitly model a "household"?

Are there other "complex" operations that openhds-rest should support?

Add "home" controller

Add a "home" controller that clients can hit first. This should return a simple greeting for content.

It should return links to each known controller. The "rels" should be plural entity names and the links should point to the base path for each corresponding controller.

It should also return a "self" link.

Use cases, documentation, and tests for essential functionality.

We would like to gather requirements related to essential functionality. These should be motivated by the health and demographics domain. They should have the flavor, "openhds-rest must be able to do at least such-and-such".

We should document these requirements in the repository. We should also write well-commented integration tests which play out each story and verify expected behavior.

Work out correct/sensible rendering of HAL as Json and Xml

Currently the application returns HAL-style Json and Xml which contains correct data and links.

But the text rendering looks odd to me. I want to figure out if this needs to be corrected, or if this is correctly obeying some HAL conventions.

For example, for paged data, the Json "_embedded" field contains a sub-field named after the run time type of a Java object, in this case locationList. This seems brittle. Shouldn't the sub-field have a well-known name that's independent of Java types?

_embedded: {
  locationList:
    0:  {
      uuid: "76ae08cb-a64d-44cc-9073-0212273c9ac3"
      insertBy: {
        uuid: "bb1bb44e-8b30-476e-ba2e-a7fdbfaa3a1e"
...

The same data in Xml is called "content" instead of "_embedded" and contains extra nested tags. Shouldn't the Xml resemble the Json more closely?

<content>
  <content>
    <uuid>76ae08cb-a64d-44cc-9073-0212273c9ac3</uuid>
    <insertBy>
      <uuid>bb1bb44e-8b30-476e-ba2e-a7fdbfaa3a1e</uuid>
...

Extract interface for ExtIdentifiable Entities

We should explicitly model ExtIds instead of leaving it up to each Entity class to declare extId fields on an ad-hoc basis.

This will help with H5S link building, making it easier to automatically add "rels" based on extIds.

It will also allow us to factor API queries by ExtId and put this controller logic in one place.

Add OpenHDS events module

Add the events module, which keeps track of system events and facilitates integration with external systems.

IT for HATEOAS link traversal

Write an integration test to prove that we can follow links from the home controller and access entity data.

The test may not hard-code or build any Urls! It may only look up Urls by parsing HAL Json and looking for well known "rels" like "self" and "locations".

For example:

get from the home controller
follow "locations" rel and get
follow "self" rel of the first location listed and get
follow "insertBy" rel to the location's user and get
make sure the user has a "self" rel.

Refactor findBy insert/void/modified service methods

Currently the service methods for findByInsertBy(), findByVoidBy(), findByModifiedBy() and their respective date methods all have very similar logic and heavy code duplication.

Will refactor the implementation beneath the public-facing methods so that common functionality is factored out and reused between the methods.

Add date range queries for Uuid Rest Controllers

Clients should be able to include a date range for paged queries. This should just be adding parameters to the current paged queries. It should not require a new endpoint.
