benjamin-heasly / openhds-rest
RESTful service for Health and Demographic Surveillance
ResourceLinkAssembler uses instanceof to add extra links for objects of type ExtIdIdentifiable. This is brittle design.
Refactor ResourceLinkAssembler around an interface with separate implementations for UuidIdentifiable and ExtIdIdentifiable. Let the appropriate implementation get Autowired into each controller. Then dynamic dispatch will replace instanceof.
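A minimal sketch of how the refactoring above might look, in plain Java without Spring. All names here (LinkAssembler, UuidLinkAssembler, ExtIdLinkAssembler, SampleLocation, the "/external/" path) are hypothetical, and it assumes ExtIdIdentifiable entities also carry a uuid. The point is only that each controller holds the assembler that fits its entity type, so no instanceof check is needed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the project's entity interfaces.
interface UuidIdentifiable {
    String getUuid();
}

interface ExtIdIdentifiable extends UuidIdentifiable {
    String getExtId();
}

// One assembler interface; each implementation knows its own entity type.
interface LinkAssembler<T extends UuidIdentifiable> {
    List<String> linksFor(T entity);
}

// Builds only the uuid-based "self" link.
class UuidLinkAssembler<T extends UuidIdentifiable> implements LinkAssembler<T> {
    protected final String basePath;

    UuidLinkAssembler(String basePath) {
        this.basePath = basePath;
    }

    @Override
    public List<String> linksFor(T entity) {
        List<String> links = new ArrayList<>();
        links.add("self=" + basePath + "/" + entity.getUuid());
        return links;
    }
}

// Adds an extId-based link on top of the uuid links; no instanceof required.
class ExtIdLinkAssembler<T extends ExtIdIdentifiable> extends UuidLinkAssembler<T> {
    ExtIdLinkAssembler(String basePath) {
        super(basePath);
    }

    @Override
    public List<String> linksFor(T entity) {
        List<String> links = super.linksFor(entity);
        links.add("byExtId=" + basePath + "/external/" + entity.getExtId());
        return links;
    }
}

// Hypothetical entity used only to exercise the sketch.
class SampleLocation implements ExtIdIdentifiable {
    private final String uuid;
    private final String extId;

    SampleLocation(String uuid, String extId) {
        this.uuid = uuid;
        this.extId = extId;
    }

    @Override
    public String getUuid() {
        return uuid;
    }

    @Override
    public String getExtId() {
        return extId;
    }
}
```

In a Spring setup, each implementation could be declared as a bean and injected into the matching controller, but that wiring is left out here.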
Add remaining OpenHDS census entities, besides Location.
We should make our API "Discoverable" by using curies, as described here (see "API Discoverability" near the bottom):
http://stateless.co/hal_specification.html
This would mean having a resource(s) that serves documentation. This would mean having documentation at all.
I (Ben) would like to allow bulk POST of data for populating the system. This will make it easier to integrate with large existing datasets, for project bootstrapping or migration.
Bulk requests will only make sense for POST, not PUT, because it will handle multiple records at once.
As with bulk GETs, this should make use of the same XML and JSON message converter beans used by the rest of the application.
This should use the same conventions for representing streams/collections that the application currently uses for GET. For XML, the root element has a plural entity name and contains a list of elements representing single entity records:
<entities>
<entity>…</entity>
…
</entities>
For JSON there is an array at the root, containing an object element for each single entity record:
[{}, {}, …]
The implementation will need to use an XML or JSON stream parser to identify and buffer each whole record in memory. Then it can pass each record to the appropriate message converter bean for unmarshalling.
This approach will be similar to what clients have to do when they parse bulk data. For example, we used a similar approach for the openhds-tablet in Bioko.
The implementation might be a subclass of EntityIterator, perhaps named StreamParsingEntityIterator. The constructor could take an InputStream, an ObjectMapper, and a helper able to identify and buffer each whole record in memory, perhaps named StreamRecordParser.
It would be nice to implement this behavior in a Spring message converter. That way bulk inputs could be handled "magically" and REST controller methods could be implemented cleanly in terms of EntityIterator.
If this doesn't work out, we can implement this behavior explicitly with controller methods that accept the HTTP request and chew through it directly.
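To make the "identify and buffer each whole record" step concrete, here is a rough sketch for the JSON case only, using just the JDK. JsonArrayRecordParser is a hypothetical name (a possible shape for the StreamRecordParser idea above), not existing project code. It assumes the root is an array of objects and yields the raw text of one object at a time, which could then be handed to the usual message converter or ObjectMapper. A real implementation would more likely sit on top of a proper streaming JSON parser.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper: yields the raw text of each top-level object in a JSON
// array, reading one character at a time so only one record is buffered at once.
class JsonArrayRecordParser implements Iterator<String> {
    private final Reader reader;
    private String next;

    JsonArrayRecordParser(Reader reader) {
        this.reader = reader;
        this.next = readRecord();
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public String next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        String current = next;
        next = readRecord();
        return current;
    }

    // Returns the next whole object, or null at end of input.
    // A truncated trailing record is dropped rather than returned.
    private String readRecord() {
        StringBuilder record = new StringBuilder();
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        try {
            int c;
            while ((c = reader.read()) != -1) {
                char ch = (char) c;
                if (depth == 0 && ch != '{') {
                    // Between records: skip '[', ',', ']' and whitespace.
                    continue;
                }
                record.append(ch);
                if (inString) {
                    // Braces inside string values must not affect nesting depth.
                    if (escaped) {
                        escaped = false;
                    } else if (ch == '\\') {
                        escaped = true;
                    } else if (ch == '"') {
                        inString = false;
                    }
                } else if (ch == '"') {
                    inString = true;
                } else if (ch == '{') {
                    depth++;
                } else if (ch == '}') {
                    depth--;
                    if (depth == 0) {
                        return record.toString();
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;
    }
}
```

Because this is an Iterator, it could back an EntityIterator-style wrapper that unmarshals each record string lazily as the controller consumes it.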
Currently each integration test must clean up the databases to prevent leaking into other tests.
This is not the way.
We can use Spring DbUnit support to set up and tear down the database for each test. I'm holding off on this because it will be a good synergy with Wolfe's master's project.
Some motivation:
So I'd like to revise the SampleDataGenerator.
Some goals:
JPA 2.1 allows us to specify indexes on fields we want to query. We should add a bunch of these for queries on fields like extId and insertDate.
Here's an example:
http://stackoverflow.com/questions/17620405/the-annotation-index-is-disallowed-for-this-location
Currently, we have H5S links that follow UUID references between entities.
Some entities also have extId ids. It would be useful to expose links based on extIds as well.
One caveat: extIds can change and won't always be unique. So extId links will have to be resolved like queries (0 or many results), rather than canonical locations for unique resources.
LocationRestController implements proof of concept PUT and POST behavior.
We should implement most of this once, in the entity controller superclass.
Some controller subclasses may override to un-support these methods. More generally, we can protect these operations with authorizations at the service level.
I think it would make sense to have a repository hierarchy that mirrored the new service hierarchy where things like "findByExtId()" could be written once in the "AuditableCollectedRepository" and used by all entity specific repositories where finding by extId would be helpful.
With Spring Boot, our app config and deployment is pretty simple.
So it should be easy for us to write a Dockerfile that specifies a Docker image, which will make it really easy to deploy the app.
Version 1 is to write the Dockerfile with simple config like we use for testing. This may bundle-in dependencies like MySQL. Then set up DockerHub integration to trigger re-builds of the image whenever we commit to master.
Version 2 is to set up integration with our CI service (see #65) so that we can automatically deploy a container after each successful build.
Set up a continuous build and deployment server. This will facilitate testing and collaboration.
It looks like people like Travis as a free, easy to use CI service:
http://stackshare.io/travis-ci
Version 1 is to set up the GitHub integration so that all pushes trigger the integration tests, using gradle test or similar.
Instead of static site properties that can only be modified at deployment time, we should make site properties queryable and configurable through the REST API. This will facilitate integration with clients that need to discover the site properties.
This will also move towards my (Ben's) goal of deploying a vanilla instance of the openhds-rest server and doing all the site baseline/config/bootstrapping through the REST API. Down with obscure config files!
SiteProperty can be a simple UUID entity with a name, string-value pair.
At startup, we can pre-populate the SiteProperty table with required properties found in the default site.properties file. These must be present for the application to work.
Then at runtime, the values can be customized and added to by a user with sufficient privileges.
For those entities that implement ExtIdentifiable, expose REST queries based on extId.
Note that extIds are not always unique, so extId queries must be allowed to return collections.
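As an illustration of the "query, not canonical location" semantics, here is a small in-memory sketch. ExtIdRecord and ExtIdIndex are hypothetical names, and a real version would be a repository query method instead of a list scan. The key property is the return type: a collection with zero, one, or many results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical record with a unique uuid and a possibly non-unique extId.
class ExtIdRecord {
    final String uuid;
    final String extId;

    ExtIdRecord(String uuid, String extId) {
        this.uuid = uuid;
        this.extId = extId;
    }
}

// Hypothetical in-memory stand-in for an entity repository.
class ExtIdIndex {
    private final List<ExtIdRecord> records = new ArrayList<>();

    void save(ExtIdRecord record) {
        records.add(record);
    }

    // Query semantics: an empty list for no match, several entries for
    // duplicates. Neither case is treated as an error.
    List<ExtIdRecord> findByExtId(String extId) {
        return records.stream()
                .filter(r -> r.extId.equals(extId))
                .collect(Collectors.toList());
    }
}
```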
The existing openhds-server has two kinds of "code" configuration in two files:
codes.properties enables simple name -> value lookup
value-constraint.xml enables constraints of the form "is value x a member of set Y?"
The ProjectCodes model and service should be extended to accommodate both operations.
The ProjectCodes model currently has
It should add
This model will accommodate all the data found in codes.properties and value-constraint.xml.
The ProjectCodes service should then expose operations of the form
The lookup assumes codeNames are still unique -- codeGroup is metadata, not a namespace.
So in summary, adding codeGroups to the model and service operations will make ProjectCodes more expressive and useful.
Adding a description to each code will just be handy. It's a good idea from the existing openhds-server.
Currently User and FieldWorker passwords are stored as plain text.
Incorporate bcrypt (or better?) to salt and hash passwords.
Make the "password" fields transient and ignored by the database.
Only persist the salted, hashed "passwordHash" fields.
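A sketch of the salt-and-hash flow, with one substitution named plainly: it uses PBKDF2 (PBKDF2WithHmacSHA256 from the JDK) instead of bcrypt, only so the example runs without extra dependencies. PasswordHasher, the iteration count, and the "salt:hash" storage format are assumptions for illustration. With bcrypt the shape would be similar: hash on write, verify on login, never store the plain text.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper; PBKDF2 stands in for bcrypt in this sketch.
class PasswordHasher {
    private static final int ITERATIONS = 100_000; // assumed work factor
    private static final int KEY_BITS = 256;

    // Returns "base64(salt):base64(hash)", the value to persist as passwordHash.
    static String hash(char[] password) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        Base64.Encoder encoder = Base64.getEncoder();
        return encoder.encodeToString(salt) + ":" + encoder.encodeToString(derive(password, salt));
    }

    // Re-derives the hash with the stored salt and compares in constant time.
    static boolean matches(char[] password, String stored) {
        String[] parts = stored.split(":");
        if (parts.length != 2) {
            return false;
        }
        byte[] salt = Base64.getDecoder().decode(parts[0]);
        byte[] expected = Base64.getDecoder().decode(parts[1]);
        return MessageDigest.isEqual(expected, derive(password, salt));
    }

    private static byte[] derive(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 not available", e);
        }
    }
}
```

The entity would then keep the plain "password" field out of the database and persist only the string returned by hash().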
For large data transfers, we should add "raw" resource flavors.
These should be based on repository query methods that accept an insertDate range and return a Stream of results.
The results should be Marshalled incrementally and written directly to the response body.
The results should never be loaded into memory all at once.
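Roughly, the incremental writing could look like the sketch below. StreamingJsonArrayWriter and its marshaller parameter are hypothetical. In the real application the Stream would come from a repository query over an insertDate range, the marshaller would be the existing JSON or XML converter, and the Writer would wrap the HTTP response body.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.function.Function;
import java.util.stream.Stream;

// Hypothetical helper: writes "[r1,r2,...]" one record at a time.
class StreamingJsonArrayWriter {

    static <T> void write(Stream<T> results, Function<T, String> marshaller, Writer out) {
        // try-with-resources closes the stream, releasing any underlying cursor.
        try (Stream<T> closing = results) {
            out.write("[");
            Iterator<T> records = closing.iterator();
            boolean first = true;
            while (records.hasNext()) {
                if (!first) {
                    out.write(",");
                }
                // Only the current record is marshalled and held in memory.
                out.write(marshaller.apply(records.next()));
                first = false;
            }
            out.write("]");
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```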
I wrote an EntityControllerRegistry class which helps us map from entities to controllers. This makes it easy to build links for any given entity instance.
Turns out this is a solved problem with EntityLinks. So let's use it.
A nice thing about EntityLinks is that the mapping from entity class to controller class is done with controller-level annotations, not explicit code.
Allow clients to make DELETE requests for single records.
Internally, this would mark the record as "voided" and not actually delete it.
As a consequence, GET responses should exclude voided records.
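A small in-memory sketch of the void-instead-of-delete behavior. VoidableRecord and VoidingStore are hypothetical names; the real logic would presumably live in the service or repository layer and also record who voided the record and when.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical record with a "voided" flag.
class VoidableRecord {
    final String uuid;
    boolean voided;

    VoidableRecord(String uuid) {
        this.uuid = uuid;
    }
}

// Hypothetical store: DELETE marks records as voided, GET skips voided records.
class VoidingStore {
    private final Map<String, VoidableRecord> byUuid = new LinkedHashMap<>();

    void save(VoidableRecord record) {
        byUuid.put(record.uuid, record);
    }

    // Returns true if a visible record was found and voided.
    boolean delete(String uuid) {
        VoidableRecord record = byUuid.get(uuid);
        if (record == null || record.voided) {
            return false;
        }
        record.voided = true;
        return true;
    }

    Optional<VoidableRecord> findOne(String uuid) {
        return Optional.ofNullable(byUuid.get(uuid)).filter(r -> !r.voided);
    }

    List<VoidableRecord> findAll() {
        return byUuid.values().stream()
                .filter(r -> !r.voided)
                .collect(Collectors.toList());
    }

    // Counts every stored row, including voided ones.
    int storedCount() {
        return byUuid.size();
    }
}
```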
Migrate OpenHDS Update entities to this new project.
This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/
The tutorial puts almost all the code and config in one java file.
Reorganize the code to be readable and maintainable, with familiar package names like "domain" and "resource".
Do we want to change the hierarchy of the OpenHDS Domain classes to
UuidIdentifiable -> Auditable -> AuditableCreated (No extId) -> AuditableCollected (Has extId)
This would mean that anything collected would have extIds and that the "ExtIdentifiable" type may be redundant.
This hierarchy also means that UuidIdentifiable could be a class and not an interface.
The design would be cleaner and more straightforward, but I feel like I am overlooking something major in terms of motivation for the current way it's designed, with UuidIdentifiable and ExtIdentifiable being separate interfaces.
I assumed the current design is the way it is because of the fact that there are 'type irregularities' like FieldWorker who has an extId but isn't collectable. I think it is desirable to have cases like this 'squashed' into the proposed hierarchy (i.e. making FieldWorker have a FieldWorkerId instead of an extId, etc).
We spoke about this in person - but I'd like to come to a solid conclusion on steps forward.
This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/
The tutorial deals with "Account" and "Bookmark" entities.
Replace these with the OpenHDS Location entity. This would be the first of many entities to be migrated from OpenHDS.
Create a set of integration tests that test the fields shared among OpenHDS entities which can be extended for each specific entity.
This will allow us to write only 1 set of tests for the fundamental functionality of all the entities like get, get all, etc.
Entity specific implementations will come later as the domain documentation and specification is created with the services.
In Bioko we found it useful to query/filter for ErrorLogs by some category, like form type.
Add a category string to the ErrorLog model. Also expose this string as a query parameter in the ErrorLog resource.
Currently, changes to entities can be tracked by insertDateTime and lastModifiedDateTime. But update conflicts are not explicitly detected. Rather, the last update always wins.
Should the service layer attempt to detect revision conflicts between records? For example, it might use an optimistic locking design, in which clients submitting updates must also submit an expected timestamp or revision number as part of the update record. The update would only succeed if the expected value matched the current value in the server database.
Would lastModifiedDateTime timestamps be sufficient for detecting revision conflicts? I suspect these would lead to some racy edge cases. Should we then add an explicit revision number to each record?
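To help weigh the question, here is one possible shape for the explicit revision number option. This is only a sketch under assumptions: VersionedRecord, VersionedStore, and StaleUpdateException are hypothetical names, and a JPA-based implementation might rely on a version column instead of hand-written checks.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable record carrying an explicit revision number.
class VersionedRecord {
    final String uuid;
    final String name;
    final long revision;

    VersionedRecord(String uuid, String name, long revision) {
        this.uuid = uuid;
        this.name = name;
        this.revision = revision;
    }
}

// Thrown when the client's expected revision is out of date.
class StaleUpdateException extends RuntimeException {
    StaleUpdateException(String message) {
        super(message);
    }
}

// Hypothetical store that rejects updates based on a stale revision.
class VersionedStore {
    private final Map<String, VersionedRecord> byUuid = new HashMap<>();

    synchronized VersionedRecord create(String uuid, String name) {
        VersionedRecord record = new VersionedRecord(uuid, name, 0L);
        byUuid.put(uuid, record);
        return record;
    }

    // The update succeeds only if expectedRevision matches the stored revision.
    synchronized VersionedRecord update(String uuid, String newName, long expectedRevision) {
        VersionedRecord current = byUuid.get(uuid);
        if (current == null) {
            throw new IllegalArgumentException("No record with uuid " + uuid);
        }
        if (current.revision != expectedRevision) {
            throw new StaleUpdateException("Expected revision " + expectedRevision
                    + " but current revision is " + current.revision);
        }
        VersionedRecord updated = new VersionedRecord(uuid, newName, current.revision + 1);
        byUuid.put(uuid, updated);
        return updated;
    }
}
```

With this design, two clients that both read revision 0 cannot both win: the second update is rejected and that client has to re-read before retrying.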
This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/
For now I'm allowing all origins (Access-Control-Allow-Origin = *) and allowing default verbs from the tutorial.
Work out which origins and verbs we really want.
Add the OpenHDS error logging module which logs errors and makes them available for query.
Validation in the services will be handled by implementing the Spring Validator service as described here:
http://docs.spring.io/spring/docs/current/spring-framework-reference/html/validation.html
This will provide an obvious and convenient location for all validation to take place as well as a central area to catch exceptions and log them before throwing them.
Add the ability to PUT or POST a new Location. This should use a LocationRegistration DTO. This would be part of the "inbound" boundary of the application.
This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/
The tutorial includes mysql as an explicit dependency.
Reconfigure to make the persistence configurable from the container.
Generalize HATEOAS link building to apply to OpenHDS entities in a generic way. Each entity representation should contain:
Swagger provides a UI for exploring REST APIs that works well with Spring Boot applications. As I understand it, it infers the various paths from the controllers and constructs examples for the request body. We use it at my work for providing easy, discoverable documentation for anyone who wants to integrate with a service, whether it's an internal team or an external vendor.
As a real-world test of openhds-rest, I (Ben) want to replay a large dataset from the Bioko CIMS project into an openhds-rest instance.
This will require an ETL tool.
The tool must read records from the existing Bioko CIMS form database, convert each record to a JSON entity registration, and POST it to the openhds-rest instance.
This would be a good test for bulk POSTs. See #68.
Pentaho Kettle is one possibility for this ETL work.
Talend Open Studio looks nicer to me.
This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/
The tutorial punts on security, creating Granted Authorities on the fly.
Reimplement security to use best practices. This probably means local database plus OAuth2.
When the server starts, it should read default project codes from a properties file, and add each code to the project code repository if it doesn't already exist.
This will ensure that required codes exist.
This should never over-write codes that have already been customized by a project.
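The add-if-absent rule can be sketched with java.util.Properties and Map.putIfAbsent. DefaultCodeLoader is a hypothetical name, and a plain Map stands in for the project code repository. The code names in the usage example are made up.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;

// Hypothetical startup helper: loads default codes without touching existing ones.
class DefaultCodeLoader {

    // Returns the number of codes that were newly added.
    static int loadDefaults(Reader defaults, Map<String, String> repository) {
        Properties props = new Properties();
        try {
            props.load(defaults);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int added = 0;
        for (String name : props.stringPropertyNames()) {
            // putIfAbsent leaves customized values alone and returns null
            // only when the code was not present before.
            if (repository.putIfAbsent(name, props.getProperty(name)) == null) {
                added++;
            }
        }
        return added;
    }
}
```

For example, if the repository already maps a code named "gender.male" to a customized value, loading defaults that contain "gender.male" and "gender.female" adds only the second one and keeps the customized value.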
This project is bootstrapped from a Spring REST tutorial: http://spring.io/guides/tutorials/bookmarks/
For now, I'm testing and deploying the "easy way" from IntelliJ, with an embedded Tomcat.
Work out production war-based deployment and container config.
It took some fussing with message converters and dependencies to get good rendering of responses with HAL+Json and HAL+Xml. So we should add some integration tests that verify we are still getting good rendering.
Check for things like:
The repository should include English prose documentation of the REST API.
This should include a broad overview, some expected usages, a summary of all current endpoints, and pictures.
Add /sampleRegistration resources for each entity.
These should serve up templates for clients that want to submit registrations, using JSON or XML as requested by the client.
These should include "flat" entities with all top-level fields represented. "uuid" fields that reference related entities should be filled in to reference the "UNKNOWN" entity records.
Based on the current LocationRestControllerTest and UserRestControllerTest, factor out common behaviors to test for all RestControllers. These should include:
And where appropriate:
From a domain point of view, we need to know what operations openhds-rest needs to support.
Some simple operations, like create or update a single entity, seem obvious.
Others are more complex. For example, in the CIMS Bioko project, we had a concept of a "household", which incorporated all of the Census entities in a typical pattern, like Location 1:1 SocialGroup, and Relationships only recorded between household members and heads.
For Bioko, we had a dedicated endpoint that could register an Individual as a household "head" or "member", with many side effects following the household pattern. The "head" and "member" registrations are two examples of "complex" operations.
Should openhds-rest support these complex operations for household head and member? If so, should it explicitly model a "household"?
Are there other "complex" operations that openhds-rest should support?
Add a "home" controller that clients can hit first. This should return a simple greeting for content.
It should return links to each known controller. The "rels" should be plural entity names and the links should point to the base path for each corresponding controller.
It should also return a "self" link.
We would like to gather requirements related to essential functionality. These should be motivated by the health and demographics domain. They should have the flavor, "openhds-rest must be able to do at least such-and-such".
We should document these requirements in the repository. We should also write well-commented integration tests which play out each story and verify expected behavior.
Currently the application returns HAL-style Json and Xml which contains correct data and links.
But the text rendering looks odd to me. I want to figure out if this needs to be corrected, or if this is correctly obeying some HAL conventions.
For example, for paged data, the Json "_embedded" field contains a sub-field named after the run time type of a Java object, in this case locationList. This seems brittle. Shouldn't the sub-field have a well-known name that's independent of Java types?
_embedded: {
locationList:
0: {
uuid: "76ae08cb-a64d-44cc-9073-0212273c9ac3"
insertBy: {
uuid: "bb1bb44e-8b30-476e-ba2e-a7fdbfaa3a1e"
...
The same data in Xml is called "content" instead of "_embedded" and contains extra nested tags. Shouldn't the Xml resemble the Json more closely?
<content>
<content>
<uuid>76ae08cb-a64d-44cc-9073-0212273c9ac3</uuid>
<insertBy>
<uuid>bb1bb44e-8b30-476e-ba2e-a7fdbfaa3a1e</uuid>
...
We should explicitly model ExtIds instead of leaving it up to each Entity class to declare extId fields on an ad-hoc basis.
This will help with H5S link building, making it easier to automatically add "rels" based on extIds.
It will also allow us to factor API queries by ExtId and put this controller logic in one place.
Add the events module, which keeps track of system events and facilitates integration with external systems.
Write an integration test to prove that we can follow links from the home controller and access entity data.
The test may not hard-code or build any Urls! It may only look up Urls by parsing HAL Json and looking for well known "rels" like "self" and "locations".
For example:
Currently the service methods for findByInsertBy(), findByVoidBy(), findByModifiedBy() and their respective date methods all have very similar logic and heavy code duplication.
Refactor the implementation beneath the public-facing methods so that common functionality is factored out and reused between the methods.
Clients should be able to include a date range for paged queries. This should just be adding parameters to the current paged queries. It should not require a new endpoint.