The kpler from mirekr

Prerequisites

Install Java 17
Install Maven
Install Docker

Local build and start up

Building service

mvn clean install

Starting service

java -jar target/kpler-*.jar

Swagger UI

To see available endpoints, you can navigate in your browser to Swagger UI

Requesting JSON and XML responses

By default, the service returns data in JSON format. To request XML instead, send the request header Accept: application/xml.
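As a rough illustration of the header described above, the following sketch builds a request with the JDK's built-in HTTP client. The class name AcceptHeaderSketch and the helper xmlRequest are hypothetical, and the URL is passed in by the caller because the exact query paths are listed in Swagger UI rather than here.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of the service code.
final class AcceptHeaderSketch {

    // Builds a GET request that asks the server for an XML representation.
    static HttpRequest xmlRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/xml")
                .GET()
                .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Leaving the Accept header out should give the default JSON response.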

Dockerisation

Once the app is built, you can create a Docker image by running the command: docker build --tag=kpler:latest .

Then you can start the dockerised version with docker run -p 8888:8080 kpler:latest

Then verify by browsing http://localhost:8888/swagger-ui/index.html

Happy dockering :)

Discussion and consideration

Logging and monitoring: The service uses a standard Logger to log to the console; it is up to the environment to decide where logs are hosted. Log hosting can be provided directly by the hosting provider (like CloudWatch on AWS) or logs can be monitored via Elasticsearch.

To log incoming requests and their responses, we annotate each method with @LoggableRequest; see the example in KplerController.java. The annotation is picked up by LoggingAspect, which logs the incoming parameters and the response.
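The idea of annotation-driven request logging can be sketched without any framework. The example below is not the repository's LoggingAspect: it swaps Spring AOP for a plain JDK dynamic proxy so that it stays self-contained, and the names LoggingSketch, PositionApi, withLogging and the in-memory LOG list are all hypothetical. Only the annotation name @LoggableRequest comes from the service.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LoggingSketch {

    // Marker annotation, kept at runtime so the interceptor can see it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface LoggableRequest {}

    // Hypothetical API: only ingest is marked for logging.
    interface PositionApi {
        @LoggableRequest
        String ingest(String correlationId);

        String health();
    }

    // Stand-in for a real logger so the behaviour is easy to inspect.
    static final List<String> LOG = new ArrayList<>();

    // Wraps target so that annotated methods log their arguments and result.
    @SuppressWarnings("unchecked")
    static <T> T withLogging(Class<T> api, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            boolean loggable = method.isAnnotationPresent(LoggableRequest.class);
            if (loggable) {
                LOG.add("request " + method.getName() + " " + Arrays.toString(args));
            }
            Object result = method.invoke(target, args);
            if (loggable) {
                LOG.add("response " + method.getName() + " " + result);
            }
            return result;
        };
        return (T) Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[] {api}, handler);
    }
}
```

Calls to ingest through the wrapped object add one request line and one response line to LOG, while health passes through without logging. An aspect does the same job declaratively, without wrapping each object by hand.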

For information about memory usage and request and response times, the service exposes Micrometer metrics in Prometheus format, which can be plugged into Grafana or a similar monitoring system. The raw data is available via URL; a section worth noting is "# TYPE http_server_requests_seconds summary", which shows the number of requests per endpoint, average times and HTTP response codes.

Correlation ids: Requests and responses should carry a "correlationId" field for traceability across systems. At the moment there are no downstream services, but to demonstrate the idea, correlationIds are used in the request/response of the ingest endpoint.

Upload performance:

Single Record upload / streaming: The service allows a single record to be pushed so that data is as up to date as possible. This relies on upstream services pushing data as it arrives. It is exposed via a standard REST endpoint; in the future we could add functionality to consume data from Kafka or any other messaging platform supported by upstream services.
Bulk upload: At the moment, the service implements a simple ingest endpoint (POST at http://localhost:8080/kpler) accepting a JSON body. As the data set grows, this might become too time-consuming; for example, when using API Gateway in AWS the maximum request timeout is around 30s, which could cause issues. This could be mitigated with async processing: we could introduce an endpoint POST /kpler/upload that returns an upload URL, for example a presigned URL for an AWS S3 bucket, depending on which technologies with callback capabilities are available.

Proposed request structure

  {
    "callback": {
      "url": string
    }
  }

Proposed response structure

  {
    "correlationId": string,
    "uploadUrl": string,
    "method": string
  }

Proposed callback response structure

  {
    "correlationId": string,
    "status": enum(OK, ERROR),
    "errors": {}
  }
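The three proposed structures above could be modelled as Java records. This is only a sketch of the proposal, not existing service code: the class name UploadContractSketch, the startUpload helper, the Map type chosen for "errors" and the "PUT" method value are all assumptions made for illustration.

```java
import java.util.Map;

final class UploadContractSketch {

    enum Status { OK, ERROR }

    // Request body for the proposed POST /kpler/upload.
    record Callback(String url) {}
    record UploadRequest(Callback callback) {}

    // Response telling the client where and how to upload the file.
    record UploadResponse(String correlationId, String uploadUrl, String method) {}

    // Body sent to the callback URL once processing has finished.
    // The shape of "errors" is not specified in the proposal; a map is an assumption.
    record CallbackResponse(String correlationId, Status status, Map<String, String> errors) {}

    // Hypothetical handler: validates the callback and echoes the upload target.
    // The presigned URL is supplied by the caller, since generating it depends on the storage used.
    static UploadResponse startUpload(UploadRequest request, String correlationId, String presignedUrl) {
        if (request.callback() == null
                || request.callback().url() == null
                || request.callback().url().isBlank()) {
            throw new IllegalArgumentException("callback.url is required");
        }
        return new UploadResponse(correlationId, presignedUrl, "PUT");
    }
}
```

The same correlationId appears in both the upload response and the callback response, which is what lets the client match the asynchronous result to its original request.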

Quality

Unit testing: The service code is unit tested. To ensure quality, JaCoCo reporting runs as part of mvn clean install and generates results in the /target/site/jacoco folder. The results can be viewed in a browser and asserted in deployment pipelines to ensure the code is tested.

Integration testing / BDD style: On top of the unit tests, there should be a set of integration tests making sure that components are autowired correctly and that endpoints can receive requests and return responses. These are implemented as ITs, with notes referencing Cucumber style; rewriting them in that style around scenarios would be an improvement. See KplerApplicationIT. Example:

  Scenario: Sunny upload and data query
  When Ingest endpoint is requested with correlationId "correlationId" and data
  |mmsi|timestamp|||||
  Then Service should process data and respond with correlationId "correlationId"
  Then I request full data set matching
  |data example||||
  Then I request data for mmsi
  |data example||||

Others

User Rate Limiting: The service itself doesn't enforce rate limiting. This is done via nginx, which gives greater flexibility and avoids having to implement the same limiting again in other services provided by the company. At the moment it has a simple setup limiting per IP address, but in the future this could be extended via Lua scripting to add checks and limits per logged-in user and their purchased plan.

To start nginx, run docker run -it -p 666:666 -v $(pwd)/nginx/nginx.local:/etc/nginx/nginx.conf nginx and then access the service via http://localhost:666

Data consistency / ids: On insert to the database, the id of each position record is generated as a combination of the mmsi, stationId and timestamp. This allows the same data to be processed multiple times without the overhead of checking whether the record exists. If the record exists, it is overwritten with the latest information.
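A minimal sketch of this idea, using an in-memory map in place of the database. The field types, the ":" separator in the id, the lat/lon fields and the names PositionStoreSketch and upsert are assumptions for illustration; only the three id components (mmsi, stationId, timestamp) come from the description above.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

final class PositionStoreSketch {

    // Hypothetical position record; the id is derived from its natural key.
    record Position(long mmsi, String stationId, Instant timestamp, double lat, double lon) {
        String id() {
            return mmsi + ":" + stationId + ":" + timestamp.toEpochMilli();
        }
    }

    // Stand-in for the database table, keyed by the derived id.
    private final Map<String, Position> byId = new HashMap<>();

    // Insert or overwrite in one step, with no existence check beforehand.
    void upsert(Position position) {
        byId.put(position.id(), position);
    }

    Position get(String id) {
        return byId.get(id);
    }

    int size() {
        return byId.size();
    }
}
```

Upserting the same mmsi, stationId and timestamp twice leaves a single record holding the values from the second call, so re-running an ingest is safe.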
