bptl's People

Contributors

annashamray, damm89, jordi-t, sergei-maertens, silviaamam

bptl's Issues

Implement Valid Sign callback URL

ValidSign supports webhooks: https://apidocs.validsign.nl/#operation/api.callback.get

BPTL should subscribe to the webhooks to receive events about packages created via the work unit. When the package is signed, it can then send a BPMN message back to the Camunda process to let the process continue.

  1. Implement an endpoint to handle ValidSign callbacks. The payload looks like:

    {
      "name": "PACKAGE_COMPLETE",
      "sessionUser": "...",
      "packageId": "......",
      "createdDate": "2017-05-02T20:17:58.408Z"
    }
  2. Implement a page to set up the ValidSign webhook subscription. It looks like the relevant event(s) are: PACKAGE_COMPLETE

Documentation used: https://apidocs.validsign.nl/validsign_integrator_guide.pdf, page 144
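A minimal sketch of parsing the callback payload before acting on it — the event set and the `parse_callback` helper are illustrative, not existing BPTL code; the field names come from the example payload above:

```python
from datetime import datetime
from typing import Optional

# events we act on; PACKAGE_COMPLETE is the one named in this issue
HANDLED_EVENTS = {"PACKAGE_COMPLETE"}


def parse_callback(payload: dict) -> Optional[dict]:
    """Return normalized event data, or None for events we don't handle."""
    if payload.get("name") not in HANDLED_EVENTS:
        return None
    return {
        "event": payload["name"],
        "package_id": payload["packageId"],
        # ValidSign sends ISO-8601 with a trailing Z
        "created": datetime.fromisoformat(payload["createdDate"].replace("Z", "+00:00")),
    }
```

The endpoint view would call this and, for a completed package, correlate a BPMN message back to the Camunda process instance tied to that package ID.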

Extract more error information from Camunda

Camunda often throws HTTP 500 errors if the process execution cannot continue because of mistakes in the process model.

Possibly the response body contains more information than what we currently extract and show. We must see if we can get more info and add it to the BPTL error information.
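One possible direction, sketched here: Camunda's REST API error responses typically carry a JSON body with "type" and "message" keys, which could be appended to the BPTL error information. The function name and exact keys are assumptions to verify against real responses:

```python
def extract_camunda_error(response) -> str:
    """Build a readable error string from a Camunda HTTP error response."""
    detail = f"HTTP {response.status_code}"
    try:
        body = response.json()
    except ValueError:
        # response body was not JSON; nothing extra to extract
        return detail
    exc_type = body.get("type", "")
    message = body.get("message", "")
    if exc_type or message:
        detail += f": {exc_type}: {message}"
    return detail
```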

Additional Xential tests

  • Webhook endpoint
  • Interactive URL endpoint
  • Document creation handler (sends message to Camunda after document creation)

Set up 'tasks'

Tasks are considered to be the smallest units that a worker will perform. A task performs a set of actions that make up a logical unit.

Camunda has the ability to define external tasks, which have topics. We expect different organizations to use different topics for the same thing, so this must be configurable on our end.

A Django app tasks with a Task model would need at least:

  • Task model with fields:

    • topic
    • connection to the relevant task in code
  • An initial ZGW task to create a zaak:

    • Create a Zaak in a/the configured Zaken API. Map some process variables to attributes for the Zaak object.
    • Set the initial status for the Zaak
    • (nth: set the initiator of the Zaak)
    • set the Zaak URL/UUID as a process variable
  • A secondary task is to set a particular status for a zaak

    • Retrieve the relevant zaak from process variables
    • Retrieve which status needs to be set from process variables (volgnummer or URL of statustype)
    • Make API call to set status
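The "connection to the relevant task in code" field could be a dotted path that is resolved at runtime, so topics stay configurable per organization. A rough sketch (the mapping would live on the Task model in practice; the dotted path shown is hypothetical):

```python
import importlib

# in the real app this mapping would be rows of the Task model
# (fields: topic, callback), editable via the Django admin
TASK_MAPPINGS = {
    "initial-zaak": "bptl.work_units.zgw.tasks.create_zaak",  # hypothetical path
}


def resolve_task(topic: str):
    """Look up and import the callable registered for a Camunda topic."""
    dotted_path = TASK_MAPPINGS[topic]
    module_path, func_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_path)
    return getattr(module, func_name)
```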

Implement a work unit to set a Rol for a zaak

There should be a work unit to set a rol for a zaak, mapping to the rol_create operation.

Input parameters for the task:

  • zaakUrl: the URL reference of the zaak to create a rol for
  • omschrijving: the roltype omschrijving, use this in combination with zaak.zaaktype to find the correct rolType reference
  • betrokkene: a JSON object containing the rest of the body for rol_create

In bptl.work_units.zgw.tasks.zaak.CreateZaakTask.create_rol you can see an existing implementation that gets you there 90% of the way.

zaakUrl and omschrijving should be used to complete the betrokkene body before POSTing it to the API.

Upstream issue: https://github.com/GemeenteUtrecht/ZGW/issues/518
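The steps above can be sketched as follows, assuming a generic ZGW client with retrieve/list/create helpers (the client interface is hypothetical; see CreateZaakTask.create_rol for the real pattern):

```python
def create_rol(client, zaak_url: str, omschrijving: str, betrokkene: dict) -> dict:
    """Create a rol for the zaak, resolving the roltype from the omschrijving."""
    zaak = client.retrieve("zaak", url=zaak_url)
    # fetch the roltypen for the zaak's zaaktype and match on omschrijving
    roltypen = client.list("roltype", query_params={"zaaktype": zaak["zaaktype"]})
    matching = [rt for rt in roltypen["results"] if rt["omschrijving"] == omschrijving]
    body = {
        "zaak": zaak_url,
        "roltype": matching[0]["url"],
        **betrokkene,  # rest of the rol_create body from the task variable
    }
    return client.create("rol", body)
```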

Implement a work unit to lock and unlock a DRC document

Upstream issue: https://github.com/GemeenteUtrecht/ZGW/issues/513

There should be BPTL tasks/work units to "freeze" a document by locking it, and a task to unlock it again afterwards.

Acceptance criteria:

  • There is a document-lock work unit
    • Input is a documenten API URL
    • Second input is the services variable with credentials
    • The result is that the document is locked in the API
    • The result variable is the lock ID
    • The work unit is documented properly and similar to the other tasks
  • There is a document-unlock work unit
    • Input is a documenten API URL and the lock ID
    • Third input is the services variable with credentials
    • The result is that the document is unlocked in the API
    • There are no result variables
    • The work unit is documented properly and similar to the other tasks
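A sketch of the two work units, assuming a Documenten API client that exposes the lock/unlock sub-actions (the client interface here is hypothetical):

```python
def lock_document(client, document_url: str) -> dict:
    """Lock the document; the lock ID is the result variable."""
    response = client.operation(
        "enkelvoudiginformatieobject_lock", url=f"{document_url}/lock", data={},
    )
    return {"lockId": response["lock"]}


def unlock_document(client, document_url: str, lock_id: str) -> dict:
    """Unlock a previously locked document; no result variables."""
    client.operation(
        "enkelvoudiginformatieobject_unlock",
        url=f"{document_url}/unlock",
        data={"lock": lock_id},
    )
    return {}
```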

Set up a mapper that can be used by tasks

Often, process variables will contain information that needs to be mapped to attributes of a Zaak, Status, Document...

It would be convenient to have a mapper interface where process variables can be specified that map directly into attributes.

In a second iteration, we can add transformations to this, such as adding a prefix or merging multiple variables into a single attribute.

Adding (a set of) mappers to a task would then first run the mappers as part of the task to set the correct variables.
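A first-iteration sketch of what such a mapper interface could look like, with the optional transform covering the second iteration (all names are illustrative, not an existing BPTL API):

```python
def apply_mapping(variables: dict, mapping: dict) -> dict:
    """Map process variables to target attributes.

    mapping: {target_attribute: source_variable} for a direct copy,
    or {target_attribute: (source_variable, transform)} for the
    second-iteration transformations.
    """
    result = {}
    for target, spec in mapping.items():
        if isinstance(spec, tuple):
            source, transform = spec
            result[target] = transform(variables[source])
        else:
            result[target] = variables[spec]
    return result
```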

Refactor tasks documentation

Currently task documentation is mostly visible via the /taskmappings endpoint, which poses a number of problems:

  • long list of configured tasks, lack of overview/grouping
  • only displays tasks that are connected to a topic, not all available tasks

One of the goals is to have BPTL documentation available for non-technical people so they get an idea of which generic building blocks are available. A grouping of the tasks per topic/domain (the python packages are already structured this way!) in the presentation layer would provide a first improvement.

Other improvements are:

  • documentation (aspects) in Dutch
  • more textual and less technical documentation

A new documentation section should facilitate this. We can add this to the existing setup and gradually replace existing pages.

Acceptance criteria:

  • A new entrypoint/page for documentation
  • All available work units' documentation can be viewed, even if not connected to a topic
  • Work units are grouped per domain (ZGW, Camunda, ValidSign...)
  • Documentation is extracted from docstrings - limited to general descriptions.

Nice to have:

Improved documentation parsing/system to document work units. A decorator-like approach could be viable, something like:

@register
@task.variable("zaakUrl", "URL reference to the zaak", required=True, type=str)
def example_task(task: BaseTask):
    do_stuff()
    return {}

but this could also be handled by function ArgSpecs or other formats (Sphinx arg docs for example).

Check for document build errors

After requesting that Xential build a document, the status of the build can be retrieved through a call to the endpoint
/document/{document_id}. The process should be configured to check this status. If the response reports that an error occurred, the process should be terminated.

To do:

Silent build: check template variables

When a silent document is built, all template variables need to be submitted to Xential when creating a ticket.
These are currently passed to BPTL as task variables, but no check is done to see if all of them are present.

To do
Request the template variables a template needs from Xential and check that they match those in the task variables. This will require the new Xential endpoint for retrieving template variables.
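The check itself is straightforward once the required variables are known; a sketch, assuming the new Xential endpoint supplies the required set (function name is illustrative):

```python
def check_template_variables(required: set, task_variables: dict) -> None:
    """Raise before ticket creation if template variables are missing."""
    missing = required - set(task_variables)
    if missing:
        raise ValueError(
            f"Missing template variables: {', '.join(sorted(missing))}"
        )
```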

SPIKE: investigate Xential + Documenten API integration

Introduction

Xential provides document creation tools, based on the principle of document templates combined with template variables as input. This templating action can be done in the background as a "silent" template or interactively by an end-user filling out the required fields/variables.

The Documenten API is well-known and part of the standard for API's voor Zaakgericht Werken.

High level overview of the (processing) flow

Combining all of this in BPTL should lead to the following flow:

  1. Provide a task with task variables as input:
    • template ID - which template in Xential to start
    • informatieobjecttype (& possible other required Documenten API fields) - so that we can persist the Xential result document in the Documenten API
    • Xential template variables - to be sent along when starting the Xential template
  2. Start the template (silent or interactive)
    • if it's an interactive template, Xential returns a URL for the end-user to interactively create the document
  3. Receive the result back from Xential - once the document is created (both silent & interactive). It looks like Xential can send back the created document at a particular URL using the webhook principle.
  4. Store the result in the documents API
  5. (optional) Signal back the process instance via a message event that the document is created (pretty much exactly what the ValidSign implementation does)

Main goals

  1. When you create a ticket in step 2, that ticket is only valid for a limited amount of time. So, template starting/ticket creation should ideally happen just-in-time, when the end-user actually initiates the document creation. Handling this gracefully to reduce friction for end-users is important.
  2. It's important that the resulting document from Xential ends up in the Documents API and possibly as a process variable (the URL of said document). Next process steps can then relate the document to the zaak, for example.
  3. If this cannot all be implemented/explored in the spike, we need an estimate of the amount of work to achieve this and the feasibility. If it can be implemented roughly (proof-of-concept), an estimate of the remaining amount of work to polish this is required.

Other

  1. The current, Alfresco-based implementation provides a view to list the available Xential templates. We need this too, with a direct connection to Xential. This has the lowest prio and may be considered part of the polishing work.

Technical hints

  • Use the BPTL database and Xential-specific models to store relevant state, e.g. which process instance corresponds with which build ID, mapping to the task so that the variables can be retrieved at a later stage... Look at the ValidSign implementation for inspiration.
  • For interactive templates, users need the Xential URL to complete the document. You don't necessarily have to put that URL in the process as variable - you can delay the actual template start by returning a BPTL-specific URL to include in the process variables. Once the user follows the BPTL URL, it can retrieve the state information from the database, actually start the template and redirect the user to Xential. Securing those URLs (without login-screen) is important but out of scope for the spike.

Admin view to list Xential templates

As an administrator, I need a custom admin view on /admin/xential/templates that fetches the templates from the Xential API endpoint and displays the name + UUID, so that I can copy the UUID from there.

Check for EnkelvoudigInformatieObject required fields

The BPTL work unit creates a document in the Documenten API after it has been built in Xential. In order to do this, it needs to provide the required fields for an EnkelvoudigInformatieObject to the API.

Currently these fields are expected to be present in the process variables of the task, but no explicit checks are present.

To do
Check that all the required fields have been passed to the task and return an error if they are not. This should happen before the process of building a ticket starts.

Issues with Xential API update

It seems that Xential updated the API this week, and the following issues were observed:

  1. In silent mode: the documents were no longer sent to the BPTL callback
  2. In interactive mode: building the document through the Xential interface gave this error:
Het document kan niet worden opgeslagen. Er is geen opslagpad gedefinieerd. ("The document cannot be saved. There is no storage path defined.")

java.lang.RuntimeException: nl.interactionnext.builder.BuilderException: nl.interactionnext.xutil.exceptions.LocalizedException: There is no storage path defined in the manager.
        at nl.interactionnext.builder.DocumentBuilder.doBuildJob(DocumentBuilder.java:255)
        at nl.interactionnext.builder.BuildJob$BuildJobCallable.call(BuildJob.java:53)
        at nl.interactionnext.builder.BuildJob$BuildJobCallable.call(BuildJob.java:35)
        at nl.interactionnext.xutil.concurrent.ThreadContextUtil.doContextualized(ThreadContextUtil.java:110)
        at nl.interactionnext.xutil.events.ThreadContextAwareCallableWrapper.call(ThreadContextAwareCallableWrapper.java:26)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at nl.interactionnext.xutil.work.v3.Job$1.run(Job.java:63)
        at nl.interactionnext.xutil.work.v3.JobExecutorThread.runJob(JobExecutorThread.java:89)
        at nl.interactionnext.xutil.work.v3.JobExecutorThread.run(JobExecutorThread.java:65)
Caused by: nl.interactionnext.builder.BuilderException: nl.interactionnext.xutil.exceptions.LocalizedException: There is no storage path defined in the manager.
        at nl.interactionnext.builder.AbstractBuilder.buildAndRegisterDocument(AbstractBuilder.java:99)
        at nl.interactionnext.builder.DocumentBuilder.doBuildJob(DocumentBuilder.java:180)
        ... 8 more
Caused by: nl.interactionnext.xutil.exceptions.LocalizedException: There is no storage path defined in the manager.
        at nl.interactionnext.dms.fs.filebaseddms.FileBasedDMSRegistration.getStoragePath(FileBasedDMSRegistration.java:171)
        at nl.interactionnext.dms.fs.filebaseddms.FileBasedDMSRegistration.sendFileToDMS(FileBasedDMSRegistration.java:101)
        at nl.interactionnext.dms.DMSRegistration.addFile(DMSRegistration.java:88)
        at nl.interactionnext.builder.DMSRegistrator.addMimeTypes(DMSRegistrator.java:130)
        at nl.interactionnext.builder.DMSRegistrator.registerDocument(DMSRegistrator.java:83)
        at nl.interactionnext.builder.AbstractDocument.register(AbstractDocument.java:46)
        at nl.interactionnext.builder.AbstractBuilder.buildAndRegisterDocument(AbstractBuilder.java:93)
        ... 9 more

Support passing identifying parameters for zaaktype & resultaattype

Currently, the create zaak work unit requires the zaaktype URL to create the zaak. This has a couple of drawbacks:

  • Whenever a new zaaktype becomes active, the process models need updating
  • When migrating between environments with different URLs (different subdomain for example, but also possibly migrating between test/acc), process models need to be updated

A similar problem exists for the setting of the zaak result where the resultaattype URL must be specified in the process model.

We can change the work units to accept parameters that can be used to look up these values instead, using the Catalogi API.

For the zaaktype, this means that we can use the combination of zaaktype.identificatie, catalogus.rsin and catalogus.domein. For resultaattype, we should be able to use resultaattype.omschrijving.

Tasks

Set up a task router

While implementing #8, we receive back tasks from different topics. While we can loop over all known topics and fetch and lock tasks like that, it's probably more robust to set up a task router based on topic that knows where to send tasks for further processing.

Hooking that into celery tasks/workers will then provide sufficient scaling options.

Introduce multiple queues

Celery's default configuration uses a single queue named celery, to which all jobs are submitted.

Since the introduction of long polling to Camunda in 6d03a3d, this introduces a risk: if the prefetch multiplier is misconfigured, tasks that should run instantly can be queued inside a worker behind a long-poll task (which may run for up to 10 minutes in its current config).

To prevent mistakes like that (and being able to ACK early) - it makes sense to have two queues:

  • a queue for long-poll jobs, and only those jobs
  • a queue for work that should be processed ASAP

This also means that workers must be spun up/assigned to a particular queue (by default they listen to all queues).

The desired configuration YAML in infrastructure should look like:

django_app_k8s_celery_workers:
    - queue: celery
      replicas: 3
    - queue: long-polling
      replicas: 2
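On the Django side this would roughly translate to the following celery routing settings (the long-poll task name is hypothetical; workers would then be started per queue, e.g. celery -A bptl worker -Q long-polling):

```python
# Default queue for everything that should be processed ASAP.
CELERY_TASK_DEFAULT_QUEUE = "celery"

# Route only the long-polling fetch-and-lock to its dedicated queue,
# so it can never block instant work inside a worker.
CELERY_TASK_ROUTES = {
    "bptl.camunda.tasks.long_poll_fetch_and_lock": {"queue": "long-polling"},
}
```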

Acceptance criteria:

  • The settings define a second queue dedicated to long-polling
  • The long-polling related celery tasks publish to that dedicated queue
  • Other tasks publish to the default celery queue
  • Workers can be spun up for a particular queue
  • Infrastructure update
  • Update documentation on queues

Rework the on-behalf-of credentials system

Current issues:

  • tokens are included in the process variables, where they can be read by anyone with access to the process instance
  • tokens expire after an hour (for ZGW at least), so if the process crashes and is retried past that, the tokens need to be updated with valid ones
  • each work unit needs to be connected with the relevant services and aliases need to be set. These aliases need to be communicated out of band to the applications setting the variables. This is bound to go wrong

Current direction of solution:

  1. Application currently handling the process sets a variable: getCredentialsUrl.
    This should be a URL that is impossible to guess, and ideally be based around a hash of some attribute that changes when accessed/when credentials have been obtained -> we want to prevent replay attacks. Possibly BPTL itself needs to be authorized to the application URL out of band with a static token.

  2. Whenever BPTL needs credentials, it makes a POST request:

    POST /${getCredentialsUrl} HTTP/1.1
    
    {"api": "https://example.com/api/v1/"}

    The application replies with the relevant headers required, e.g.:

    {"header": "Authorization", "value": "Bearer <some-jwt>", "api": "https://example.com/api/v1/"}
  3. BPTL uses those credentials to make the required API calls

  4. Whenever other applications take over for the process, they can set their own getCredentialsUrl process variable to indicate that their credentials should now be used (relevant for audit logs!).

This solves:

  • no need to use aliases
  • having the URL as process variable is not enough for a replay attack
  • "refreshing" credentials becomes straight-forward, as they are retrieved when they are needed (and generated on the server if relevant)
  • each unit knows which services it needs. BPTL can still configure which particular service to use for a topic if there are multiple services providing the same functionality but with different storage.
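The credential fetch in step 2 could be sketched as follows; the request/response shapes follow the example bodies above, and the session is injected (e.g. a requests.Session) rather than hard-coded:

```python
def fetch_credentials(session, get_credentials_url: str, api_root: str) -> dict:
    """Exchange the getCredentialsUrl for the headers to use against api_root.

    `session` is a requests.Session or any object with a compatible post().
    """
    response = session.post(get_credentials_url, json={"api": api_root})
    response.raise_for_status()
    body = response.json()
    # e.g. {"Authorization": "Bearer <some-jwt>"}
    return {body["header"]: body["value"]}
```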

Securing Xential BPTL endpoints

There are currently 2 endpoints:

  • The endpoint listening for the request from Xential notifying the creation of a new document.
  • The endpoint returned to a user used for triggering the start of interactive template building.

For endpoint 2, this can be done using a token, in a similar way to:
https://github.com/GemeenteUtrecht/zac-lite/blob/main/backend/src/zac_lite/user_tasks/tokens.py
https://github.com/GemeenteUtrecht/zac-lite/blob/main/backend/src/zac_lite/user_tasks/tests/test_token_invalidation.py
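A minimal sketch of such a token, following the zac-lite approach of hashing over state that changes when the token must stop working (here: whether the document has been built). SECRET and the state fields are illustrative; the real implementation would use Django's signing machinery:

```python
import hashlib
import hmac

SECRET = b"replace-me"  # in practice derived from Django's SECRET_KEY


def make_token(ticket_id: str, document_built: bool) -> str:
    # include mutable state so the token invalidates once the state changes
    value = f"{ticket_id}:{document_built}".encode()
    return hmac.new(SECRET, value, hashlib.sha256).hexdigest()


def check_token(token: str, ticket_id: str, document_built: bool) -> bool:
    return hmac.compare_digest(token, make_token(ticket_id, document_built))
```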

Xential & Camunda message correlation fails

20-May-2021 13:47:12.408 SEVERE [http-nio-8080-exec-23] org.camunda.commons.logging.BaseLogger.logError ENGINE-16004 Exception while closing command context: ENGINE-13031 Cannot correlate a message with name 'documentCreated' to a single execution. 2 executions match the correlation keys: CorrelationSet [businessKey=null, processInstanceId=db97972b-b971-11eb-9bee-2a3de7ed312a, processDefinitionId=null, correlationKeys=null, localCorrelationKeys=null, tenantId=null, isTenantIdSet=false]
	org.camunda.bpm.engine.MismatchingMessageCorrelationException: ENGINE-13031 Cannot correlate a message with name 'documentCreated' to a single execution. 2 executions match the correlation keys: CorrelationSet [businessKey=null, processInstanceId=db97972b-b971-11eb-9bee-2a3de7ed312a, processDefinitionId=null, correlationKeys=null, localCorrelationKeys=null, tenantId=null, isTenantIdSet=false]

Rework task input validation

Variables are more often structured than not (JSON objects rather than simple primitive strings/numbers), and can/should be validated using DRF serializers or an implementation built on top of them.

It's important that:

  • Validation is clear in pointing out the problem: variable missing, variable has an invalid value...
  • Errors are easily readable from the Camunda incident console
  • The kinds of errors that can be raised are properly documented

Probably a 1.1.0 feature; this shouldn't block a 1.0 release.
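To illustrate the direction without pulling in DRF here, a dependency-free stand-in for the validation layer (in BPTL this would be a DRF serializer; names and error wording are illustrative):

```python
def validate_variables(variables: dict, schema: dict) -> list:
    """Validate task variables against {name: expected_type}.

    Returns a list of readable errors, suitable for display in the
    Camunda incident console.
    """
    errors = []
    for name, expected in schema.items():
        if name not in variables:
            errors.append(f"Variable '{name}' is missing.")
        elif not isinstance(variables[name], expected):
            errors.append(
                f"Variable '{name}' has an invalid value: expected {expected.__name__}."
            )
    return errors
```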
