viaacode / borndigital
Borndigital (pre-)ingests digitally born items into MediaHaven
The validation fails and logs the following message.
The schema validation should either be disabled or updated to not throw these errors as often.
2018-12-14 10:40:54,438 [amqpReceiver.01] WARN org.mule.util.xmlsecurity.DefaultXMLSecureFactories - Can't configure XML entity expansion for Validator (com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorImpl), this could introduce XXE and BL vulnerabilities
2018-12-14 10:40:54,438 [amqpReceiver.01] WARN org.mule.util.xmlsecurity.DefaultXMLSecureFactories - org.xml.sax.SAXNotRecognizedException: Property 'http://javax.xml.XMLConstants/property/accessExternalStylesheet' is not recognized.
2018-12-14 10:40:54,440 [amqpReceiver.01] INFO org.mule.api.processor.LoggerMessageProcessor - ERROR: MESSAGE PAYLOAD:
Currently, some items are retried indefinitely, even though they failed in MAM/MH.
Example: pid=v40jt1g69h.
Commit bd2c76f probably fixes this, but review needed (@dietervanhoof).
Having a BOM in the sidecar breaks the code. Since we can't expect the BOM to be omitted, it would be wise to add support for this.
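One way to tolerate a BOM, shown as a minimal Python sketch rather than as Mule configuration: strip a leading UTF-8 BOM from the raw sidecar bytes before handing them to the XML parser. The function name parse_sidecar is hypothetical, and the sketch assumes the sidecar is UTF-8 encoded and available as bytes.

```python
import codecs
import xml.etree.ElementTree as ET


def parse_sidecar(raw: bytes) -> ET.Element:
    """Parse sidecar XML bytes, tolerating a leading UTF-8 BOM.

    Hypothetical helper: assumes the sidecar is UTF-8 encoded.
    """
    # Drop the three BOM bytes (EF BB BF) if they are present.
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    return ET.fromstring(raw)
```

Sidecars without a BOM pass through unchanged, so the same code path can serve both cases.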
In PI_SIP_DELIVERY_GENERIC_ESSENCE, the flowVar destinationPath (for the FXP-request) is being set as such: #["/" + flowVars.cp.toLowerCase() + "/TAPE-SHARE-EVENTS"] (see referenced line below), where flowVars.cp is the value that was posted in the AMQP-message to trigger borndigital.
The CP-key in the AMQP-message, in turn, is the value for the command-line argument to the specific node-watchfolder process that sent the message to the borndigital.input queue (see documentation here: https://github.com/viaacode/node_watchfolder#arguments).
In anticipation of the switch to a folder structure on OR-id instead of CP-name on MediaHaven's transport-servers (as is already the case on the FTP-servers), borndigital should look up the cp_name via or-id using the organisations_api, instead of relying on the value for the CP-key in the incoming message. The problem is that a different process posting a message on the borndigital.input queue might fill in a different value in the message's CP-key, which is the case for batch-intake (see here).
borndigital/src/main/app/deliveries.xml
Line 112 in be4abf4
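The proposed lookup could be sketched roughly as follows, in Python rather than in the flow's MVEL. The lookup is passed in as a plain callable because the actual organisations_api call is not shown in this issue; build_destination_path, lookup_cp_name and the example OR-id are all hypothetical.

```python
def build_destination_path(or_id, lookup_cp_name):
    """Build the FXP destination path from an OR-id.

    `lookup_cp_name` is a hypothetical callable standing in for a call
    to the organisations_api; it returns the CP name or None.
    """
    cp_name = lookup_cp_name(or_id)
    if not cp_name:
        # Fail loudly instead of building a path from an unknown CP.
        raise LookupError("no cp_name found for " + or_id)
    # Same shape as the current expression, but keyed on the OR-id.
    return "/" + cp_name.lower() + "/TAPE-SHARE-EVENTS"
```

For example, with a dictionary standing in for the API, build_destination_path("OR-example", {"OR-example": "ATV"}.get) yields "/atv/TAPE-SHARE-EVENTS".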
When BD transfers a sidecar to the destination host (tra-server), it requires the destination path to exist. If not, it fails (Failed to change working directory to <destination_path>. Ftp error: 550. Type: class java.io.IOException).
The risk for this happening is currently being mitigated by the fact that the FXP service is capable of creating the required destination paths. However, this can only work if FXP is not busy.
As an illustration: BD couldn't create the dir //atv/TAPE-SHARE-EVENTS and kept retrying. First retry on 2019-08-13 14:26, last retry (nr. 34705!!) on 2019-08-14 09:55 (after manual shutdown), i.e., it retried for over 19 hours!
The current "solution" is sub-optimal, to say the least.
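Independent of how the directory gets created, capping the number of retries would prevent a run like the one above. The following is a minimal Python sketch of bounded retries with exponential backoff; retry_with_backoff and its default values are assumptions for illustration, not settings taken from borndigital.

```python
import time


def retry_with_backoff(action, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `action` until it succeeds or `max_attempts` is reached.

    Hypothetical helper: the defaults are illustrative only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except IOError:
            if attempt == max_attempts:
                # Give up and surface the error so it can be alerted on.
                raise
            # Wait 1x, 2x, 4x, ... the base delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

Once the cap is reached the error propagates, so the item can be marked as failed instead of being retried thousands of times.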
A ticket for this issue exists: https://www.mulesoft.org/jira/browse/MULE-5192.
As per that ticket, it would seem that the latest version of Mule's FTP connector should be able to create directories on the fly: http://www.mulesoft.org/docs/site/current3/apidocs/org/mule/transport/sftp/SftpClient.html#createSftpDirIfNotExists(org.mule.api.endpoint.ImmutableEndpoint,%20java.lang.String)
Other solutions exist as well: http://www.javaroots.com/2014/09/mule-ftp-create-directory-if-not-exist.html.
Relevant code:
borndigital/src/main/app/deliveries.xml
Lines 197 to 203 in 9decb66
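The "create the directory if it does not exist" idea can be illustrated outside Mule with Python's standard ftplib. This is only a sketch of the technique, not the connector's behaviour: ensure_remote_dir is a hypothetical helper, and it assumes the server answers a failed directory change with a permanent (5xx) error, as in the 550 shown above.

```python
from ftplib import error_perm


def ensure_remote_dir(ftp, path):
    """Change into `path` on the server, creating missing segments.

    `ftp` is expected to behave like a connected ftplib.FTP instance.
    """
    ftp.cwd("/")
    for segment in [s for s in path.split("/") if s]:
        try:
            ftp.cwd(segment)
        except error_perm:
            # Typically a 550: the directory is missing, so create it.
            ftp.mkd(segment)
            ftp.cwd(segment)
```

Calling it before each upload is idempotent: existing directories are simply entered, missing ones are created first.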
BD chokes on empty sidecars. The error is produced on line:
borndigital/src/main/app/ingest.xml
Line 361 in 1ee3d7c
borndigital/src/main/app/ingest.xml
Line 340 in 1ee3d7c
--> check for empty payload and alert. (Full stacktrace below)
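The suggested guard could look roughly like this in Python; is_empty_sidecar is a hypothetical name, and the sketch treats a payload that is missing or contains only whitespace as empty.

```python
def is_empty_sidecar(payload):
    """Return True if the sidecar payload is missing or only whitespace.

    Works for both bytes and str payloads.
    """
    return payload is None or not payload.strip()
```

A flow would run this check before parsing and raise an alert instead of passing the empty payload on.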
For some content partners, multiple identical essences (and thus files with the same MD5) can exist within the organisation.
borndigital/src/main/app/ingest.xml
Line 420 in c0a11ed
Currently, the whole app hangs if the XPath expression fails. Instead: catch the exception, ack the message, and log the error in the DB.
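The catch, ack and log pattern can be sketched in Python as below. Everything here is illustrative: handle_message, the ack and log_error callables and the XPath passed in are hypothetical stand-ins for the AMQP acknowledgement, the DB logging and the real expression in ingest.xml.

```python
import xml.etree.ElementTree as ET


def handle_message(body, xpath, ack, log_error):
    """Evaluate `xpath` on the message body without ever blocking the queue.

    `ack` and `log_error` are hypothetical callables supplied by the caller.
    """
    try:
        node = ET.fromstring(body).find(xpath)
        if node is None:
            raise LookupError("no match for " + xpath)
        return node.text
    except (ET.ParseError, SyntaxError, LookupError) as exc:
        # Record the failure instead of letting it stall the consumer.
        log_error(str(exc))
        return None
    finally:
        # The message is acknowledged on success and on failure alike.
        ack()
```

Acknowledging in the finally clause ensures that a bad message is consumed once and recorded, rather than blocking everything queued behind it.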
Currently: borndigital-%i.log.
Longer logging retention for such an important flow is helpful, so update to, say: borndigital.log-%i
Lines:
When the sidecar is read, the file is removed.
If the borndigital flow then stops, the XML is gone before the package is transferred to the transport server.
Fix by deleting it after the flow is done
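The ordering proposed here, shown as a small Python sketch: read the sidecar, run the transfer, and delete the file only once the transfer has returned without error. process_sidecar and the transfer callable are hypothetical names.

```python
import os


def process_sidecar(path, transfer):
    """Read a sidecar, hand it to `transfer`, and delete it only on success.

    `transfer` is a hypothetical callable that raises on failure.
    """
    with open(path, "rb") as handle:
        payload = handle.read()
    # If this raises, the function exits here and the file is kept.
    transfer(payload)
    os.remove(path)
```

If the flow stops or the transfer fails, the sidecar is still on disk and the item can be picked up again.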
borndigital/src/main/app/mappings.xml
Line 552 in 359a0f3
Two separate references to the EDTF-validating service exist in this subflow: it should be refactored to have only one such instance that calls the EDTF-validation webservice.
Remove the abomination that is get_cp_id.xml. CP_id is set based on CP_name in an if-else structure with 178 choices...
Move to call to organisation_api? Pull in "snapshot" of all cp's via organisations_api?
Moreover, this sub-flow is, of course, not environment-aware, i.e., all CP-ids are PRD values.
This sub-flow, however, apparently is only being called in case of "custom cp's":
borndigital/src/main/app/ingest.xml
Lines 205 to 206 in be4abf4
In subflow 'metadata_corrections' XML-tags are being string-replaced, for example:
#[flowVars.mappedXml.replaceAll('<trefwoord>', '<Trefwoord>').replaceAll('</trefwoord>', '</Trefwoord>')]
Bad practice:
Subflow here:
borndigital/src/main/app/mappings.xml
Lines 523 to 526 in 1ee3d7c
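String replacement on tags is fragile: a pattern such as '<trefwoord>' does not match a self-closing <trefwoord/> or a tag that carries attributes. As an illustration of the alternative, here is a minimal Python sketch that renames elements through an XML parser; rename_elements is a hypothetical helper, not part of the flow.

```python
import xml.etree.ElementTree as ET


def rename_elements(xml_text, old, new):
    """Rename every element called `old` to `new` using a real XML parser."""
    root = ET.fromstring(xml_text)
    # Materialise the matches first, since tags change during the loop.
    for element in list(root.iter(old)):
        element.tag = new
    return ET.tostring(root, encoding="unicode")
```

Because the rename happens on the parsed tree, self-closing elements and elements with attributes are handled the same way as plain ones.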
The final log-message of the ingest flow could use more information. For example:
borndigital/src/main/app/ingest.xml
Line 232 in c0a11ed
Issue already fixed in https://github.com/viaacode/vrt_dailies/commit/c3cb1129e2df49e952272b3281eae0bc51e42ad6.
Offending lines:
borndigital/src/main/app/mappings.xml
Line 30 in 6ccd7a2
borndigital/src/main/app/mappings.xml
Line 237 in 6ccd7a2
Logger:
borndigital/src/main/app/mappings.xml
Line 540 in c2a01f2
This logger logs the entire XML payload (multiline): useful in debugging mode (i.e., TST or QAS) but not in PRD.
So:
Anypoint pointer: part of subflow "validateMetadataFromVIAAtoMAM"
Also, update logger description.
The following error is quite abundant in the borndigital log files:
2018-11-05 14:52:49,781 [amqpReceiver.02] ERROR org.mule.exception.CatchMessagingExceptionStrategy -
********************************************************************************
Message : Execution of the expression "payload.pid" failed. (org.mule.api.expression.ExpressionRuntimeException).
Element : /pollerFlow/processors/2/1/1/0/0/2 @ borndigital-v0.4.6:poller.xml:50
--------------------------------------------------------------------------------
Exception stack is:
Execution of the expression "payload.pid" failed. (org.mule.api.expression.ExpressionRuntimeException). (org.mule.api.MessagingException)
org.mule.mvel2.integration.impl.ClassImportResolverFactory.getVariableResolver(ClassImportResolverFactory.java:112)
org.mule.mvel2.optimizers.impl.refl.nodes.VariableAccessor.getValue(VariableAccessor.java:40)
org.mule.mvel2.optimizers.impl.refl.nodes.NullSafe$1.getValue(NullSafe.java:47)
org.mule.mvel2.optimizers.impl.refl.nodes.NullSafe.getValue(NullSafe.java:62)
org.mule.mvel2.optimizers.impl.refl.nodes.VariableAccessor.getValue(VariableAccessor.java:37)
org.mule.mvel2.ast.ASTNode.getReducedValueAccelerated(ASTNode.java:109)
org.mule.mvel2.MVELRuntime.execute(MVELRuntime.java:86)
org.mule.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
org.mule.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
org.mule.mvel2.MVEL.executeExpression(MVEL.java:953)
org.mule.el.mvel.MVELExpressionExecutor.execute(MVELExpressionExecutor.java:87)
org.mule.el.mvel.MVELExpressionLanguage.evaluateInternal(MVELExpressionLanguage.java:228)
org.mule.el.mvel.MVELExpressionLanguage.evaluate(MVELExpressionLanguage.java:163)
org.mule.el.mvel.MVELExpressionLanguage.evaluate(MVELExpressionLanguage.java:142)
org.mule.expression.DefaultExpressionManager.evaluate(DefaultExpressionManager.java:216)
org.mule.expression.DefaultExpressionManager.evaluate(DefaultExpressionManager.java:187)
org.mule.module.db.internal.resolver.param.DynamicParamValueResolver.resolveParams(DynamicParamValueResolver.java:42)
(156 more...)
(set debug level logging or '-Dmule.verbose.exceptions=true' for everything)
********************************************************************************
2018-11-05 14:52:49,781 [amqpReceiver.02] INFO org.mule.api.processor.LoggerMessageProcessor - Catch error
Root cause: it seems that, somehow, payload.pid is not there.
The lines "causing" (these lines are, of course, not the cause) this error are:
borndigital/src/main/app/poller.xml
Lines 50 to 54 in a5f239e
--> Research cause and implement solution.
If the dc_title field is lacking from the input XML, several fallback nodes can/will be used instead. However, when the dc_title node is present but empty (i.e., <dc_title/>) the output title field is empty as well.
Offending line here:
borndigital/src/main/app/mappings.xml
Line 48 in c0a11ed
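The intended behaviour, treating an empty dc_title like a missing one, can be sketched in Python as follows. The helper first_non_empty is hypothetical, and the fallback element name in the usage note is a placeholder, since the real fallback nodes are defined in mappings.xml.

```python
import xml.etree.ElementTree as ET


def first_non_empty(root, tags):
    """Return the text of the first listed child that has non-blank text."""
    for tag in tags:
        node = root.find(tag)
        # An absent node and an empty node (<dc_title/>) are treated alike.
        if node is not None and node.text and node.text.strip():
            return node.text.strip()
    return None
```

For input such as <r><dc_title/><fallback_title>X</fallback_title></r>, calling first_non_empty(root, ["dc_title", "fallback_title"]) returns "X" rather than an empty title.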