artefactual / archivematica

Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.

Home Page: http://www.archivematica.org

License: GNU Affero General Public License v3.0

Shell 0.28% Python 86.71% HTML 5.00% CSS 0.55% JavaScript 6.98% Dockerfile 0.18% Makefile 0.30%
digital-preservation archivematica


archivematica's Issues

Problem: metadata.json ingest breaks if null values are present

When ingesting a transfer with a metadata.json file, Archivematica converts the JSON to CSV. If any of the values in the JSON are null, the conversion to CSV fails, halting processing of that transfer.

The error that shows up in the logs is generated by:

https://github.com/artefactual/archivematica/blob/stable/1.6.x/src/MCPClient/lib/clientScripts/jsonMetadataToCSV.py#L71

and looks like:
DEBUG 2017-07-31 12:56:01 archivematica.mcp.server:taskStandard:check_request_status:95: Task a86940c5-250d-4982-be53-5cbf3abf3e95 finished! Result COMPLETE - {'stdOut': '', 'exitCode': 1, 'stdError': 'Traceback (most recent call last):\n File "/usr/lib/archivematica/MCPClient/clientScripts/jsonMetadataToCSV.py", line 129, in <module>\n sys.exit(main(sip_uuid, json_metadata))\n File "/usr/lib/archivematica/MCPClient/clientScripts/jsonMetadataToCSV.py", line 119, in main\n writer.writerow(object_to_row(fix_encoding(row), headers))\n File "/usr/lib/archivematica/MCPClient/clientScripts/jsonMetadataToCSV.py", line 80, in fix_encoding\n return {key.encode(\'utf-8\'): encode_item(value) for key, value in row.items()}\n File "/usr/lib/archivematica/MCPClient/clientScripts/jsonMetadataToCSV.py", line 80, in <dictcomp>\n return {key.encode(\'utf-8\'): encode_item(value) for key, value in row.items()}\n File "/usr/lib/archivematica/MCPClient/clientScripts/jsonMetadataToCSV.py", line 71, in encode_item\n return [i.encode(\'utf-8\') for i in item]\nAttributeError: \'NoneType\' object has no attribute \'encode\'\n'}

It should be possible to continue processing, ignoring the null value.
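A minimal null-tolerant version of the helpers could look like the following. The function names mirror the traceback above, but the None handling itself is an assumption about the desired fix (and this sketch targets Python 3, whereas the script in the traceback runs under Python 2):

```python
def encode_item(item):
    """Prepare a metadata value for CSV output, treating JSON null as empty."""
    if item is None:
        return ''  # assumption: a null JSON value becomes an empty CSV cell
    if isinstance(item, list):
        return [encode_item(i) for i in item]
    return str(item)


def fix_encoding(row):
    """Apply encode_item to every value in a metadata row."""
    return {key: encode_item(value) for key, value in row.items()}
```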

Problem: clamdscan can't connect to remote clamd server

The suggested solution is to modify the antivirus checking script so that it creates a configuration file on the fly, since clamdscan does not support configuration through environment variables.

  1. Create a temporary file, e.g. /tmp/tmp.Cc0IfaOF2F
  2. Write config values to it (e.g. TCPSocket 3310 and TCPAddr clamavd). The client script looks up the config values in MCPClient.
  3. Run clamdscan --config-file=/tmp/tmp.Cc0IfaOF2F
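In Python, the three steps above might be sketched like this; write_clamd_config and the default address/port are hypothetical stand-ins for values the client script would look up in MCPClient configuration:

```python
import subprocess
import tempfile


def write_clamd_config(tcp_addr, tcp_port):
    """Steps 1 and 2: write TCPAddr/TCPSocket values to a temporary file."""
    conf = tempfile.NamedTemporaryFile(mode='w', suffix='.conf', delete=False)
    conf.write('TCPAddr %s\nTCPSocket %d\n' % (tcp_addr, tcp_port))
    conf.close()
    return conf.name


def scan(path, tcp_addr='clamavd', tcp_port=3310):
    """Step 3: point clamdscan at the generated config file."""
    conf_path = write_clamd_config(tcp_addr, tcp_port)
    return subprocess.call(['clamdscan', '--config-file=' + conf_path, path])
```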

Problem: timeouts preventing transfers from starting

Changes introduced into the qa/1.x branch in #531 (updated in #565) have changed the way HTTP requests work. This is affecting the ability to start large transfers.

The communication between the dashboard and the storage service is synchronous: when a user clicks the 'start transfer' button, the dashboard sends an HTTP request to the storage service to copy files to the activeTransfers watched directory. The request blocks until the storage service completes the process of moving the files around. For large transfers this can take several minutes.

Ideally the storage service would be modified to support asynchronous calls (accept a callback with the original request, send 201, and call the callback when the work is completed). For the 1.7.0 release, it may be necessary to revert the #531 change.
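The asynchronous contract described above can be sketched with a thread; start_copy_async and its arguments are illustrative, not actual storage service API:

```python
import threading


def start_copy_async(copy_job, callback):
    """Accept the request and return immediately (the HTTP layer would send
    the 201/202 here), invoking the callback once the slow file move finishes."""
    def worker():
        callback(copy_job())  # notify the caller when the work is done
    threading.Thread(target=worker).start()
    return {'status': 202}  # accepted; copying continues in the background
```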

Problem: request_file_deletion returning Unicode instead of dict

In archivematicaCommon/lib/storageService.py. To reproduce, try to delete an AIP from the dashboard in qa/1.x. Still working on figuring out the source of this issue, but it appears that Requests' response.json() is returning Unicode/string JSON when we expect loaded/parsed JSON, i.e., a dict...
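A defensive workaround while the root cause is found, assuming the symptom is double-encoded JSON, could be:

```python
import json


def ensure_parsed(payload):
    """If response.json() hands back a JSON string instead of a dict
    (double-encoded JSON), decode it a second time."""
    if isinstance(payload, str):
        payload = json.loads(payload)
    return payload
```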

Problem: AIP download URL uses SS IP that may not be available to user

Dashboard is set up to access the storage service directly using the API, on an IP address or hostname configured in the Admin settings.

Dashboard also provides the user with a generated download URL for downloading the AIP from the storage service. This URL is generated from the above-mentioned configured IP address. However, in some setups that IP address may not be accessible to the user's browser.

One option would be to have two separate URL settings, one internal and one external. Another would be to give the user a download URL that points to Dashboard, and proxy the content from Storage Service (using something like https://goo.gl/l5bR1F) - this would remove the need to provide the user with an externally available storage URL.

Problem: this repo uses git submodules

We have other ways to manage dependencies, e.g. pip in Python, Bower in the frontend, etc. Git submodules would do the job, but they also introduce unnecessary complexity in the development workflow. The fewer solutions we combine, the better.

This is the current situation:

base64 is the only one left! It's a frontend asset that could be managed from Bower, checking bower_modules/ into the repo if you wish to avoid extra steps during deployment. The submodule was introduced here: 4d07f82. For some reason the symlink is not a symlink anymore; see https://github.com/artefactual/archivematica/blob/qa/1.x/src/dashboard/src/media/js/vendor/base64.js.

My suggestion is to vendor base64 or introduce bower_modules/.
Wait until #506 is merged!

Problem: AtoM DIP upload not working

The AtoM DIP upload feature is not working in qa/1.x

Some problems I found:

  • When using the web interface, the question "Upload DIP" -> "Upload DIP to AtoM" asks for the AtoM slug, but never finishes, and the question remains in the "Awaiting decision" state.
  • Using the mcp-rpc-cli tool, when selecting "Upload DIP to AtoM" for the "Upload DIP" question, there is a new question with only one choice (upload-qubit_v0.0):

<choicesAvailableForUnit>
  <UUID>3f857f46-6a10-472d-a564-77bd7534f58b</UUID>
  <unit>
    <type>DIP</type>
    <unitXML>
      <UUID>c8529576-a069-48fc-b056-8e260275e86c</UUID>
      <currentPath>%sharedPath%watchedDirectories/uploadDIP/atom4-c8529576-a069-48fc-b056-8e260275e86c/</currentPath>
    </unitXML>
  </unit>
  <choices>
    <choice>
      <chainAvailable>0</chainAvailable>
      <description>upload-qubit_v0.0</description>
    </choice>
  </choices>
</choicesAvailableForUnit>

  • The switch --rsync-command="None" is being passed to the script, which makes rsync fail with error 14
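One fix implied by the last point is to omit the switch entirely when no value is configured; build_upload_qubit_args is a hypothetical helper illustrating that:

```python
def build_upload_qubit_args(rsync_command=None):
    """Only pass --rsync-command when a real value is configured, instead of
    stringifying None into the literal switch --rsync-command="None"."""
    args = ['upload-qubit']
    if rsync_command:
        args.append('--rsync-command=%s' % rsync_command)
    return args
```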

Problem: libxslt1-devel can't be found in CentOS

Here in archivematica/src/dashboard/osdeps/CentOS-7.json in the list of packages is the line

{ "name": "libxslt1-devel", "state": "latest"},

While installing this version on archtest.denver.archivematica.org, I was unable to install the above; there didn't even seem to be such a package in the source cache, and indeed the slightly differently named libxslt-devel was installed instead. Not sure if the package name depends on the RH/CentOS variant, or if the above is a typo, but for the moment I have made a fork of qa/1.x, dev/issue-11231-CentOS-try, with the line modified to

{ "name": "libxslt-devel", "state": "latest"},

  • David H.

Docker integration (merge from Jisc fork)

Integrate work from JiscSD#1 into core Archivematica.

This creates Dockerfiles for Dashboard, MCP Server and MCP Client, along with supporting fixes in the code. Archivematica should be able to run either with or without Docker containers.

Also: JiscSD#10 and JiscSD#11 for configuring gunicorn from Dockerfile.

Possibly also a proper solution for JiscSD/rdss-archivematica#11 (is this docker related?)

See also artefactual/archivematica-storage-service#208

It would also be good to provide a docker-compose config in a separate repo so that all AM services can be built and run. (does this belong in deploy-pub or somewhere else?)

Problem: unable to run migrations in default install

When running the migrations step on a clean install, it fails because the autoslug module cannot be loaded.

I think this is caused by 7d0cef6, but I'm not sure if this is a bug in AM or in the AM deployment playbooks. @sevein, what do you think?

Full error:
TASK [external-roles/artefactual.archivematica-src : Run migrations (within dashboard virtualenv)] *** fatal: [santi]: FAILED! => {"changed": false, "cmd": "./manage.py migrate --noinput --settings=settings.common --pythonpath=/usr/lib/archivematica/archivematicaCommon", "failed": true, "msg": "\n:stderr: Traceback (most recent call last):\n File \"./manage.py\", line 10, in <module>\n execute_from_command_line(sys.argv)\n File \"/usr/share/python/archivematica-dashboard/local/lib/python2.7/site-packages/django/core/management/__init__.py\", line 354, in execute_from_command_line\n utility.execute()\n File \"/usr/share/python/archivematica-dashboard/local/lib/python2.7/site-packages/django/core/management/__init__.py\", line 328, in execute\n django.setup()\n File \"/usr/share/python/archivematica-dashboard/local/lib/python2.7/site-packages/django/__init__.py\", line 18, in setup\n apps.populate(settings.INSTALLED_APPS)\n File \"/usr/share/python/archivematica-dashboard/local/lib/python2.7/site-packages/django/apps/registry.py\", line 108, in populate\n app_config.import_models(all_models)\n File \"/usr/share/python/archivematica-dashboard/local/lib/python2.7/site-packages/django/apps/config.py\", line 198, in import_models\n self.models_module = import_module(models_module_name)\n File \"/usr/lib/python2.7/importlib/__init__.py\", line 37, in import_module\n __import__(name)\n File \"/opt/archivematica/archivematica/src/dashboard/src/fpr/models.py\", line 13, in <module>\n from autoslug import AutoSlugField\nImportError: No module named autoslug\n", "path": "/usr/share/python/archivematica-dashboard/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games", "state": "absent", "syspath": ["/tmp/ansible_KbYHSp", "/tmp/ansible_KbYHSp/ansible_modlib.zip", "/tmp/ansible_KbYHSp/ansible_modlib.zip", "/usr/lib/python2.7", "/usr/lib/python2.7/plat-x86_64-linux-gnu", "/usr/lib/python2.7/lib-tk", "/usr/lib/python2.7/lib-old", 
"/usr/lib/python2.7/lib-dynload", "/usr/local/lib/python2.7/dist-packages", "/usr/lib/python2.7/dist-packages", "/usr/lib/pymodules/python2.7"]}

Transfer browser error does not allow me to start a transfer

Installation details:

How to reproduce:

  • Open dashboard, set up storage service (fresh install)
  • In transfer tab, open web developer tools
  • Error seen:
TypeError: Cannot read property 'uuid' of undefined
    at source_location_browser.list.then._this2.source_locations (transfer_browser.js:56618)
    at processQueue (transfer_browser.js:16925)
    at transfer_browser.js:16941
    at Scope.$eval (transfer_browser.js:18224)
    at Scope.$digest (transfer_browser.js:18037)
    at Scope.$apply (transfer_browser.js:18332)
    at done (transfer_browser.js:12373)
    at completeRequest (transfer_browser.js:12575)
    at XMLHttpRequest.requestLoaded (transfer_browser.js:12508)
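The missing guard can be sketched as follows (in Python for consistency with the rest of this page; the real fix would live in the Angular code in transfer_browser.js). On a fresh install there may be no transfer source locations configured yet, so the first element must not be dereferenced unconditionally:

```python
def first_location_uuid(locations):
    """Return the uuid of the first source location, or None when the
    storage service has no transfer source locations configured yet."""
    if not locations:
        return None  # avoids the "'uuid' of undefined" TypeError
    return locations[0].get('uuid')
```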

Problem: Verify Checksum "completed successfully" even though not valid

Verify Transfer Checksum "completed successfully" even though the checksum is not valid

Background: testing in the demo-jisc environment for the Jisc RDSS project. I created a transfer with one file in it, which included an md5 checksum. The checksum was provided in the 'request preservation' message that we mock from the RDSS. This is sent using the msgCreator app to the Archivematica Channel Adaptor, which takes the checksum metadata (uuid, checksum type, checksum value) and creates a checksum file that is included in the transfer sent to Archivematica.

I had already tested with a transfer where the checksum was valid, so in this case, I wanted to test what happened when the checksum provided was incorrect (i.e. would not match a new checksum generated by the same file). I changed 1 digit in the checksum value to do this.

Expected Behaviour: When the transfer was processed, I expect the Verify Transfer Checksum to fail; for that failure to trigger the "Failed Transfer" micro service; and then stop processing of the transfer.

Actual Behaviour: The "Verify Transfer Checksum" indicated that it "completed successfully" and processing of the Transfer continued (straight through to ingest).

A key detail: when I intentionally corrupted the checksum value, I replaced a hex character with a 't' (not considering that the checksum digits must all be hex characters). When I re-ran the test with a hex character instead of a 't' (but still not the original value), the micro service failed as expected. Normally we expect the checksum to be fine but the verification to fail because the file is corrupted. In my scenario, the file was fine but the checksum provided with it was corrupted. So this scenario may not happen frequently.
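A cheap guard for this scenario would be to reject digests that are not well-formed hex before handing them to md5deep; looks_like_md5 is a hypothetical helper, not existing Archivematica code:

```python
import string


def looks_like_md5(value):
    """An MD5 digest is exactly 32 hex characters; anything else (such as a
    digest with a 't' in it) should be reported as an invalid checksum
    rather than silently skipped."""
    return len(value) == 32 and all(c in string.hexdigits for c in value)
```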

Output from the task is included below for reference. Discussed the defect with Jesús, who wrote the channel adaptor code. His initial assessment is that this has to do with the verify checksum micro service, and is not due to the way the channel adaptor creates a checksum file from metadata.

Output: Job Verify metadata directory checksums


Task 63716561-a312-4eb5-89d2-de615bc5a0f3

File UUID
Unknown
File name
test3 for checksum fail-a932afa6-e2db-4645-82df-529cc38cc3be
Client
31bf6692ac10_2
Exit code
0
Start time
6/27/2017, 6:15:25 PM
End time
6/27/2017, 6:15:25 PM
Created time
6/27/2017, 6:15:25 PM
Duration
< 1 seconds(s)
Module verifyMD5_v0.0
"%sharedPath%currentlyProcessing/test3 for checksum fail-a932afa6-e2db-4645-82df-529cc38cc3be/" "%date%" "%taskUUID%" "a932afa6-e2db-4645-82df-529cc38cc3be"
Standard streams

Standard output (stdout)
File Does not exist: /var/archivematica/sharedDirectory/currentlyProcessing/test3 for checksum fail-a932afa6-e2db-4645-82df-529cc38cc3be/metadata/checksum.sha1
File Does not exist: /var/archivematica/sharedDirectory/currentlyProcessing/test3 for checksum fail-a932afa6-e2db-4645-82df-529cc38cc3be/metadata/checksum.sha256
Standard error (stderr)
md5deep: /var/archivematica/sharedDirectory/currentlyProcessing/test3 for checksum fail-a932afa6-e2db-4645-82df-529cc38cc3be/metadata/checksum.md5: Unable to find any hashes in file, skipped.
md5deep: Unable to load any matching files.
Try md5deep -h for more information.
md5deep: /var/archivematica/sharedDirectory/currentlyProcessing/test3 for checksum fail-a932afa6-e2db-4645-82df-529cc38cc3be/metadata/checksum.md5: Unable to find any hashes in file, skipped.
md5deep: Unable to load any matching files.
Try md5deep -h for more information.

Problem: MCPClient runs client scripts in hard-coded environment

The archivematicaClient module runs the client scripts with an updated environment that differs from the inherited one.

Here's the snippet:

lib_paths = ['/usr/share/archivematica/dashboard/', '/usr/lib/archivematica/archivematicaCommon']
env_updates = {
    'PYTHONPATH': os.pathsep.join(lib_paths),
    'DJANGO_SETTINGS_MODULE': config.get('MCPClient', 'django_settings_module')
}

executeOrRun("command", command, sInput, printing=False, env_updates=env_updates)

Notice how MCPClient's path is not included in PYTHONPATH.

This brings a few issues:

  • Paths are hard-coded. We can't pass environment PYTHONPATH to client scripts, only to MCPClient.
  • Client scripts consume Dashboard's settings.common. This is somewhat necessary because MCPClient's settings are different and incompatible, e.g. some client scripts will break if you're not using Dashboard's settings specifically.
  • django_settings_module in clientConf.conf actually refers to the settings module in Dashboard, which is confusing.

Solution

Long term: stop importing Dashboard code (ha! main.models and shared database...) from client scripts.

Short term: ... TODO! :)
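A short-term sketch that keeps the current defaults but stops discarding an inherited PYTHONPATH; the function name and the fallback behaviour are assumptions, not the existing MCPClient code:

```python
import os


def build_env_updates(settings_module='settings.common'):
    """Build env_updates for executeOrRun, appending any PYTHONPATH inherited
    from MCPClient's own environment instead of dropping it."""
    lib_paths = ['/usr/share/archivematica/dashboard/',
                 '/usr/lib/archivematica/archivematicaCommon']
    inherited = os.environ.get('PYTHONPATH')
    if inherited:
        lib_paths.append(inherited)  # preserve the caller's paths
    return {
        'PYTHONPATH': os.pathsep.join(lib_paths),
        'DJANGO_SETTINGS_MODULE': settings_module,
    }
```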

Multi-process bagit.validate breaks AIP re-ingest

To re-create: using AM qa/1.x and SS qa/0.x, create an AIP and then attempt to do a metadata-only reingest on it. AM should spin its wheels for 5 seconds and then notify you that "Error re-ingesting package: An unknown error occurred." If you insert some logging lines before and after https://github.com/artefactual/archivematica-storage-service/blob/qa/0.x/storage_service/locations/models/package.py#L843, you will see that the log messages after that line do not get called: the call to bag.validate(processes=4) hangs indefinitely. If you replace that call with the original bag.validate() everything works as normal.

Note that the bag should be valid and validation should pass, and does pass when the processes kwarg is NOT passed in.

I have tried to recreate this issue in a simplified environment but have not had success. The following Python script tries multi-process BagIt validation (on a separate thread for good measure) using a bag created on the fly, as well as a preexisting bag called 'breaking-transfer' that you must place in the same directory as the script. Create a file called testbagit.py, copy this code into it, and then run python testbagit.py; you should see bagit behaving correctly:

import bagit
import os
import sys
import shutil
import time
from uuid import uuid4
from threading import Thread


DIR_TO_CREATE = 'bagittestdir'
REAL_BROKEN_BAG = 'breaking-transfer'
REAL_BROKEN_BAG_COPY = REAL_BROKEN_BAG + '-copy'


def create_test_dir():
    try:
        os.mkdir(DIR_TO_CREATE)
    except OSError:
        pass
    for _ in range(20):
        fname = str(uuid4()) + '.txt'
        fpath = os.path.join(DIR_TO_CREATE, fname)
        with open(fpath, 'w') as fd:
            fd.write('bagit test!')


def create_bag():
    create_test_dir()
    bag = bagit.make_bag(DIR_TO_CREATE, {'Contact-Name': 'John Kunze'})


def _validate_bag(bag_path):
    print 'starting to validate bag %s' % bag_path
    sys.stdout.flush()
    bag = bagit.Bag(bag_path)
    try:
        success = bag.validate(processes=4)
        print 'bag %s is valid!' % bag_path
    except bagit.BagValidationError as e:
        print 'got error on bag validate on %s' % bag_path
        print e
    print 'done validating bag %s' % bag_path
    sys.stdout.flush()


def validate_bag(bag_path):
    #_validate_bag(bag_path)
    t = Thread(target=_validate_bag, args=(bag_path,))
    t.start()
    t.join()


def validate_created_bag():
    validate_bag(DIR_TO_CREATE)


def invalidate_created_bag():
    path = os.path.join(DIR_TO_CREATE, 'data')
    fname = os.listdir(path)[0]
    fpath = os.path.join(path, fname)
    with open(fpath, 'w') as fd:
        fd.write('I changed this file!')


def destroy_test_dir():
    shutil.rmtree(DIR_TO_CREATE)


def destroy_broken_bag_copy():
    shutil.rmtree(REAL_BROKEN_BAG_COPY)


def copy_broken_bag():
    shutil.copytree(REAL_BROKEN_BAG, REAL_BROKEN_BAG_COPY)


def validate_broken_bag():
    validate_bag(REAL_BROKEN_BAG_COPY)


def experiment1():
    """Create a dir, make it into a bag, invalidate the bag by changing a file,
    attempt to validate it with multiple processes, then destroy it.
    """
    create_bag()
    invalidate_created_bag()
    validate_created_bag()
    destroy_test_dir()


def experiment2():
    """Copy an existing purportedly broken bag, attempt to validate it on a
    separate thread with multiple processes, then destroy the copy.
    """
    copy_broken_bag()
    validate_broken_bag()
    destroy_broken_bag_copy()


if __name__ == '__main__':
    experiment1()
    experiment2()

Does anyone have any idea what is causing this issue? Are you able to re-create it given my description above? I plan to create a SS PR that reverts the bag.validate(processes=4) calls to bag.validate(), but obviously I would prefer to keep the multi-process call and figure out the real cause of the problem...

FITS download error when installing via the automated Ubuntu GitHub install

I am using Ubuntu 16.04. I tried to install via the GitHub instructions:

https://www.archivematica.org/en/docs/archivematica-1.6/admin-manual/installation/installation/#automated-ubuntu-github-install

I have to rerun vagrant provision a lot, mostly due to what seem to be download errors, but as of last night and this morning I keep getting stuck at the FITS section, which will not let me proceed. I tried a different internet connection as well (hotspotting from my phone) and couldn't proceed either. I am able to manually download the file that seems to be the problem, but I don't know enough about VirtualBox/Vagrant to know how to fix this.

Here's the FITS error followed by the full terminal output:

failed: [am-local] (item={u'state': u'latest', u'name': u'fits'}) => {"cache_update_time": 1498202565, "cache_updated": false, "failed": true, "item": {"name": "fits", "state": "latest"}, "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--force-confdef\" -o \"Dpkg::Options::=--force-confold\"     install 'fits'' failed: E: Failed to fetch http://packages.archivematica.org/1.6.x/ubuntu-externals/pool/main/f/fits/fits_0.8.4-1~14.04_amd64.deb  GnuTLS recv error (-9): A TLS packet with unexpected length was received.\n\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://packages.archivematica.org/1.6.x/ubuntu-externals/pool/main/f/fits/fits_0.8.4-1~14.04_amd64.deb  GnuTLS recv error (-9): A TLS packet with unexpected length was received.\n\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://packages.archivematica.org/1.6.x/ubuntu-externals/pool/main/f/fits/fits_0.8.4-1~14.04_amd64.deb  GnuTLS recv error (-9): A TLS packet with unexpected length was received.", "", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following extra packages will be installed:\n  default-jdk default-jre default-jre-headless fonts-dejavu-extra\n  libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev\n  libsm-dev libx11-dev libx11-doc libxau-dev libxcb1-dev libxdmcp-dev\n  libxt-dev nailgun nailgun-client openjdk-7-jdk openjdk-7-jre\n  x11proto-core-dev x11proto-input-dev x11proto-kb-dev xorg-sgml-doctools\n  xtrans-dev\nSuggested packages:\n  libice-doc libsm-doc libxcb-doc libxt-doc openjdk-7-demo openjdk-7-source\n  visualvm\nThe following NEW packages will be installed:\n  default-jdk default-jre default-jre-headless fits fonts-dejavu-extra\n  
libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev\n  libsm-dev libx11-dev libx11-doc libxau-dev libxcb1-dev libxdmcp-dev\n  libxt-dev nailgun nailgun-client openjdk-7-jdk openjdk-7-jre\n  x11proto-core-dev x11proto-input-dev x11proto-kb-dev xorg-sgml-doctools\n  xtrans-dev\n0 upgraded, 25 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 68.9 MB/90.7 MB of archives.\nAfter this operation, 150 MB of additional disk space will be used.\nGet:1 http://packages.archivematica.org/1.6.x/ubuntu-externals/ trusty/main fits amd64 0.8.4-1~14.04 [68.9 MB]\nErr http://packages.archivematica.org/1.6.x/ubuntu-externals/ trusty/main fits amd64 0.8.4-1~14.04\n  GnuTLS recv error (-9): A TLS packet with unexpected length was received.\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following extra packages will be installed:", "  default-jdk default-jre default-jre-headless fonts-dejavu-extra", "  libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev", "  libsm-dev libx11-dev libx11-doc libxau-dev libxcb1-dev libxdmcp-dev", "  libxt-dev nailgun nailgun-client openjdk-7-jdk openjdk-7-jre", "  x11proto-core-dev x11proto-input-dev x11proto-kb-dev xorg-sgml-doctools", "  xtrans-dev", "Suggested packages:", "  libice-doc libsm-doc libxcb-doc libxt-doc openjdk-7-demo openjdk-7-source", "  visualvm", "The following NEW packages will be installed:", "  default-jdk default-jre default-jre-headless fits fonts-dejavu-extra", "  libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev", "  libsm-dev libx11-dev libx11-doc libxau-dev libxcb1-dev libxdmcp-dev", "  libxt-dev nailgun nailgun-client openjdk-7-jdk openjdk-7-jre", "  x11proto-core-dev x11proto-input-dev x11proto-kb-dev xorg-sgml-doctools", "  xtrans-dev", "0 upgraded, 25 newly installed, 0 to remove and 0 not upgraded.", "Need to get 68.9 MB/90.7 MB of archives.", "After this operation, 
150 MB of additional disk space will be used.", "Get:1 http://packages.archivematica.org/1.6.x/ubuntu-externals/ trusty/main fits amd64 0.8.4-1~14.04 [68.9 MB]", "Err http://packages.archivematica.org/1.6.x/ubuntu-externals/ trusty/main fits amd64 0.8.4-1~14.04", "  GnuTLS recv error (-9): A TLS packet with unexpected length was received."]}

~/deploy-pub/playbooks/archivematica$ vagrant provision 
==> am-local: Running provisioner: ansible...
    am-local: Running ansible-playbook...

PLAY [am-local] ****************************************************************

TASK [Gathering Facts] *********************************************************
ok: [am-local]

TASK [include_vars] ************************************************************
ok: [am-local]

TASK [Install packages for development convenience] ****************************
ok: [am-local] => (item=[u'fish'])

TASK [artefactual.elasticsearch : Install python-software-properties] **********
ok: [am-local]

TASK [artefactual.elasticsearch : Update repositories] *************************
ok: [am-local] => (item=ppa:webupd8team/java)

TASK [artefactual.elasticsearch : Accept Oracle license prior JDK installation] ***
changed: [am-local]

TASK [artefactual.elasticsearch : Install dependencies] ************************
ok: [am-local]

TASK [artefactual.elasticsearch : Install dependencies] ************************
ok: [am-local] => (item=[u'htop', u'ntp', u'unzip'])

TASK [artefactual.elasticsearch : Configuring user and group] ******************
ok: [am-local]

TASK [artefactual.elasticsearch : user] ****************************************
ok: [am-local]

TASK [artefactual.elasticsearch : command] *************************************
[DEPRECATION WARNING]: always_run is deprecated. Use check_mode = no instead..

This feature will be removed in version 2.4. Deprecation warnings can be 
disabled by setting deprecation_warnings=False in ansible.cfg.
changed: [am-local]

TASK [artefactual.elasticsearch : Download Elasticsearch deb] ******************
skipping: [am-local]

TASK [artefactual.elasticsearch : Uninstalling previous version if applicable] ***
skipping: [am-local]

TASK [artefactual.elasticsearch : file] ****************************************
skipping: [am-local]

TASK [artefactual.elasticsearch : Install Elasticsearch deb] *******************
skipping: [am-local]

TASK [artefactual.elasticsearch : file] ****************************************
ok: [am-local]

TASK [artefactual.elasticsearch : Configuring directories] *********************
ok: [am-local]

TASK [artefactual.elasticsearch : file] ****************************************
ok: [am-local]

TASK [artefactual.elasticsearch : file] ****************************************
ok: [am-local]

TASK [artefactual.elasticsearch : file] ****************************************
ok: [am-local]

TASK [artefactual.elasticsearch : file] ****************************************
changed: [am-local]

TASK [artefactual.elasticsearch : file] ****************************************
changed: [am-local]

TASK [artefactual.elasticsearch : Configuring open file limits] ****************
changed: [am-local]

TASK [artefactual.elasticsearch : lineinfile] **********************************
ok: [am-local]

TASK [artefactual.elasticsearch : lineinfile] **********************************
skipping: [am-local]

TASK [artefactual.elasticsearch : lineinfile] **********************************
ok: [am-local]

TASK [artefactual.elasticsearch : lineinfile] **********************************
ok: [am-local]

TASK [artefactual.elasticsearch : lineinfile] **********************************
ok: [am-local]

TASK [artefactual.elasticsearch : lineinfile] **********************************
ok: [am-local]

TASK [artefactual.elasticsearch : lineinfile] **********************************
ok: [am-local]

TASK [artefactual.elasticsearch : Installing AWS Plugin] ***********************
skipping: [am-local]

TASK [artefactual.elasticsearch : file] ****************************************
skipping: [am-local]

TASK [artefactual.elasticsearch : Installing Plugins by Name] ******************
skipping: [am-local] => (item=elasticsearch_plugins) 

TASK [artefactual.elasticsearch : Installing Plugins by URL] *******************
skipping: [am-local] => (item=elasticsearch_plugins) 

TASK [artefactual.elasticsearch : file] ****************************************
skipping: [am-local]

TASK [artefactual.elasticsearch : Installing Custom JARs] **********************
skipping: [am-local] => (item=elasticsearch_custom_jars) 

TASK [artefactual.elasticsearch : file] ****************************************
skipping: [am-local]

TASK [artefactual.elasticsearch : Installing Marvel Plugin] ********************
skipping: [am-local]

TASK [artefactual.elasticsearch : file] ****************************************
skipping: [am-local]

TASK [artefactual.elasticsearch : Configuring Elasticsearch Node] **************
ok: [am-local]

TASK [artefactual.elasticsearch : template] ************************************
ok: [am-local]

TASK [artefactual.elasticsearch : Ensure Elasticsearch is started on boot] *****
ok: [am-local]

TASK [artefactual.percona : Obtaining percona public key] **********************
ok: [am-local]

TASK [artefactual.percona : Adding percona repository] *************************
ok: [am-local]

TASK [artefactual.percona : Adding percona repository (use wily packages as xenial package N/A yet)] ***
skipping: [am-local]

TASK [artefactual.percona : Update apt cache] **********************************
ok: [am-local]

TASK [artefactual.percona : Install percona database server] *******************
ok: [am-local] => (item=[u'percona-server-server-5.5', u'percona-server-client-5.5', u'percona-toolkit', u'percona-xtrabackup', u'python-mysqldb'])

TASK [artefactual.percona : Adjust permissions of datadir] *********************
ok: [am-local]

TASK [artefactual.percona : Ensure that percona is running] ********************
ok: [am-local]

TASK [artefactual.percona : Ensure that percona is enabled] ********************
ok: [am-local]

TASK [artefactual.percona : Update the my.cnf] *********************************
ok: [am-local]

TASK [artefactual.percona : Set the root password] *****************************
ok: [am-local] => (item=am-local)
ok: [am-local] => (item=127.0.0.1)
ok: [am-local] => (item=::1)
ok: [am-local] => (item=localhost)

TASK [artefactual.percona : Set the root password] *****************************
skipping: [am-local] => (item=127.0.0.1) 
skipping: [am-local] => (item=::1) 
skipping: [am-local] => (item=localhost) 

TASK [artefactual.percona : Copy .my.cnf file into the root home folder] *******
ok: [am-local]

TASK [artefactual.percona : Ensure anonymous users are not in the database] ****
ok: [am-local] => (item=am-local)
ok: [am-local] => (item=localhost)

TASK [artefactual.percona : Remove the test database] **************************
ok: [am-local]

TASK [artefactual.percona : Make sure the MySQL databases are present] *********

TASK [artefactual.percona : Make sure the MySQL users are present] *************

TASK [artefactual.gearman : Install gearman packages] **************************
ok: [am-local] => (item=[u'gearman-job-server', u'gearman-tools'])

TASK [artefactual.gearman : Populate gearman defaults] *************************
ok: [am-local]

TASK [artefactual.gearman : Install upstart service] ***************************
ok: [am-local]

TASK [artefactual.gearman : Ensure that gearman is enabled and running] ********
ok: [am-local]

TASK [artefactual.clamav : Install packages required by role (Ubuntu)] *********
ok: [am-local] => (item=[u'git'])

TASK [artefactual.clamav : Install clamav packages (Ubuntu)] *******************
ok: [am-local] => (item=[u'clamav', u'clamav-daemon'])

TASK [artefactual.clamav : stat] ***********************************************
ok: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
ok: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
ok: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
ok: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
ok: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
ok: [am-local]

TASK [artefactual.clamav : set_fact] *******************************************
ok: [am-local]

TASK [artefactual.clamav : debug] **********************************************
skipping: [am-local]

TASK [artefactual.clamav : Get clamav signatures from artefactual-labs repo] ***
skipping: [am-local]

TASK [artefactual.clamav : Copy clamav signatures to /var/lib/clamav] **********
skipping: [am-local] => (item=bytecode.cvd) 
skipping: [am-local] => (item=daily.cvd) 
skipping: [am-local] => (item=main.cvd) 

TASK [artefactual.clamav : Set user ownership for clamav files (Ubuntu)] *******
skipping: [am-local]

TASK [artefactual.clamav : Set user ownership for clamav files (RedHat)] *******
skipping: [am-local]

TASK [artefactual.clamav : Enable services] ************************************
ok: [am-local] => (item=clamav-daemon)

TASK [artefactual.clamav : Enable epel repository] *****************************
skipping: [am-local]

TASK [artefactual.clamav : Install packages required by role (RedHat)] *********
skipping: [am-local] => (item=[]) 

TASK [artefactual.clamav : Install required packages (RedHat)] *****************
skipping: [am-local] => (item=[]) 

TASK [artefactual.clamav : Config /etc/freshclam.conf from template] ***********
skipping: [am-local]

TASK [artefactual.clamav : Config /etc/sysconfig/freshclam from template] ******
skipping: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
skipping: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
skipping: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
skipping: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
skipping: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
skipping: [am-local]

TASK [artefactual.clamav : stat] ***********************************************
skipping: [am-local]

TASK [artefactual.clamav : set_fact] *******************************************
skipping: [am-local]

TASK [artefactual.clamav : debug] **********************************************
skipping: [am-local]

TASK [artefactual.clamav : Get clamav signatures from artefactual-labs repo] ***
skipping: [am-local]

TASK [artefactual.clamav : Copy clamav signatures to /var/lib/clamav] **********
skipping: [am-local] => (item=bytecode.cvd) 
skipping: [am-local] => (item=daily.cvd) 
skipping: [am-local] => (item=main.cvd) 

TASK [artefactual.clamav : Set user ownership for clamav files (Ubuntu)] *******
skipping: [am-local]

TASK [artefactual.clamav : Set user ownership for clamav files (RedHat)] *******
skipping: [am-local]

TASK [artefactual.clamav : Config /etc/clamd.d/scan.conf from template] ********
skipping: [am-local]

TASK [artefactual.clamav : Symlink /etc/clamd.d/scan.conf to /etc/clamd.conf] ***
skipping: [am-local]

TASK [artefactual.clamav : Enable and start services] **************************
skipping: [am-local]

TASK [artefactual.clamav : Pause, give some time for service to get up] ********
Pausing for 10 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [am-local]

TASK [artefactual.clamav : check clamdscan is working] *************************
changed: [am-local]

TASK [artefactual.archivematica-src : Set fact: type of environment] ***********
ok: [am-local]

TASK [artefactual.archivematica-src : Set fact: environment vars] **************
ok: [am-local]

TASK [artefactual.archivematica-src : initialize systemd folder] ***************
ok: [am-local]

TASK [artefactual.archivematica-src : initialize systemd folder (RedHat)] ******
skipping: [am-local]

TASK [artefactual.archivematica-src : Install necessary packages required by this ansible role (RedHat)] ***
skipping: [am-local] => (item=[]) 

TASK [artefactual.archivematica-src : Install necessary packages required by this ansible role (ubuntu)] ***
ok: [am-local] => (item=[u'python-pycurl', u'git', u'python-mysqldb'])

TASK [artefactual.archivematica-src : Ensure pip is not installed from packages] ***
ok: [am-local] => (item=python-pip)
ok: [am-local] => (item=python2-pip)

TASK [artefactual.archivematica-src : Download get-pip.py] *********************
ok: [am-local]

TASK [artefactual.archivematica-src : Install pip with get-pip.py] *************
changed: [am-local]

TASK [artefactual.archivematica-src : Install virtualenv with pip] *************
ok: [am-local] => (item=virtualenv)

TASK [artefactual.archivematica-src : Create user archivematica] ***************
ok: [am-local]

TASK [artefactual.archivematica-src : Add archivematica user permissions in visudo file] ***
ok: [am-local]

TASK [artefactual.archivematica-src : Expand archivematica_src_dir] ************
ok: [am-local]

TASK [artefactual.archivematica-src : Create archivematica_src_dir] ************
ok: [am-local]

TASK [artefactual.archivematica-src : Checkout out archivematica-sampledata repository] ***
ok: [am-local]

TASK [artefactual.archivematica-src : Checkout archivematica-storage-service code] ***
ok: [am-local]

TASK [artefactual.archivematica-src : Ensure the source code is readable by all] ***
ok: [am-local]

TASK [artefactual.archivematica-src : include ss-osdeps.yml] *******************
included: /home/kieranjol/deploy-pub/playbooks/archivematica/roles/artefactual.archivematica-src/tasks/ss-osdeps.yml for am-local

TASK [artefactual.archivematica-src : set_fact] ********************************
ok: [am-local]

TASK [artefactual.archivematica-src : create temp directory in local machine to store osdeps files] ***
changed: [am-local -> 127.0.0.1]

TASK [artefactual.archivematica-src : stat for osdeps file] ********************
ok: [am-local]

TASK [artefactual.archivematica-src : set_fact osdeps] *************************
ok: [am-local]

TASK [artefactual.archivematica-src : fetch dependencies files from source code repository to local machine] ***
changed: [am-local]

TASK [artefactual.archivematica-src : include variables from retrieved dependencies files in namespace storage_service] ***
ok: [am-local]

TASK [artefactual.archivematica-src : Install storage service ext. package deps.] ***
ok: [am-local] => (item={u'state': u'latest', u'name': u'nginx'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'unar'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'rsync'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'python-dev'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'libxml2-dev'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'libxslt1-dev'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'libz-dev'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'libffi-dev'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'libssl-dev'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'gcc'})

TASK [artefactual.archivematica-src : Install storage service ext. package deps. 2] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Install storage service ext. package deps. 3] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : include ss-osdeps-fallback.yml] **********
skipping: [am-local]

TASK [artefactual.archivematica-src : error if no osdeps and no fallback] ******
skipping: [am-local]

TASK [artefactual.archivematica-src : stat] ************************************
ok: [am-local]

TASK [artefactual.archivematica-src : set_fact] ********************************
ok: [am-local]

TASK [artefactual.archivematica-src : set_fact] ********************************
skipping: [am-local]

TASK [artefactual.archivematica-src : Create virtualenv for archivematica-storage-service] ***
ok: [am-local]

TASK [artefactual.archivematica-src : Create virtualenv for archivematica-storage-service, pip install requirements] ***
ok: [am-local]

TASK [artefactual.archivematica-src : Work around to install pip deps commented out in old SS branches] ***
skipping: [am-local] => (item=python-swiftclient) 
skipping: [am-local] => (item=python-keystoneclient) 
skipping: [am-local] => (item=sword2) 
skipping: [am-local] => (item=pyopenssl) 
skipping: [am-local] => (item=ndg-httpsclient) 
skipping: [am-local] => (item=pyasn1) 

TASK [artefactual.archivematica-src : Create subdirectory for archivematica-storage-service source files] ***
ok: [am-local] => (item=/usr/lib/archivematica)

TASK [artefactual.archivematica-src : Create subdirectory for archivematica-storage-service database file] ***
ok: [am-local] => (item=/var/archivematica/storage-service)

TASK [artefactual.archivematica-src : Create subdirectories for archivematica-storage-service config] ***
ok: [am-local] => (item=/etc/archivematica)

TASK [artefactual.archivematica-src : Create archivematica-storage-service log directories] ***
ok: [am-local] => (item=/var/log/archivematica/storage-service)

TASK [artefactual.archivematica-src : Touch SS log files] **********************
changed: [am-local] => (item=storage_service.log)
changed: [am-local] => (item=storage_service_debug.log)

TASK [artefactual.archivematica-src : Copy archivematica-storage-service source files] ***
ok: [am-local] => (item={u'dest': u'/usr/lib/archivematica/storage-service', u'src': u'/vagrant/src/archivematica-storage-service/storage_service'})

TASK [artefactual.archivematica-src : Run SS django collectstatic] *************
ok: [am-local]

TASK [artefactual.archivematica-src : Check if celery or django-celery is in the pip requirements] ***
changed: [am-local]

TASK [artefactual.archivematica-src : set_fact] ********************************
skipping: [am-local]

TASK [artefactual.archivematica-src : set_fact] ********************************
ok: [am-local]

TASK [artefactual.archivematica-src : Install redis-server package if required] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Enable and start redis-server if not already running] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Template celery upstart file if required] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Start storage service celery worker] *****
skipping: [am-local]

TASK [artefactual.archivematica-src : Remove SS DB] ****************************
skipping: [am-local]

TASK [artefactual.archivematica-src : Run SS django manage syncdb] *************
skipping: [am-local]

TASK [artefactual.archivematica-src : Fake locations 0.4 migration] ************
skipping: [am-local]

TASK [artefactual.archivematica-src : Fake locations 0.5 migration] ************
skipping: [am-local]

TASK [artefactual.archivematica-src : Fake locations 0.7 migration] ************
skipping: [am-local]

TASK [artefactual.archivematica-src : migrate with --fake-initial] *************
skipping: [am-local]

TASK [artefactual.archivematica-src : Run SS django database migrations] *******
ok: [am-local]

TASK [artefactual.archivematica-src : Fix DB permissions] **********************
ok: [am-local]

TASK [artefactual.archivematica-src : Back create API keys for old users] ******
ok: [am-local]

TASK [artefactual.archivematica-src : set nginx sites-available config file] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : set uwsgi apps-available config file] ****
skipping: [am-local]

TASK [artefactual.archivematica-src : Remove Nginx default server] *************
skipping: [am-local] => (item=/etc/nginx/sites-available/default) 
skipping: [am-local] => (item=/etc/nginx/sites-available/default.conf) 
skipping: [am-local] => (item=/etc/nginx/sites-enabled/default) 
skipping: [am-local] => (item=/etc/nginx/sites-enabled/default.conf) 

TASK [artefactual.archivematica-src : Set up Nginx server] *********************
skipping: [am-local]

TASK [artefactual.archivematica-src : Set up uWSGI server] *********************
skipping: [am-local]

TASK [artefactual.archivematica-src : Enable services] *************************
skipping: [am-local] => (item=nginx) 
skipping: [am-local] => (item=uwsgi) 

TASK [artefactual.archivematica-src : set nginx sites-available config file] ***
ok: [am-local]

TASK [artefactual.archivematica-src : Template gunicorn configuration file] ****
ok: [am-local]

TASK [artefactual.archivematica-src : Config storage service gunicorn upstart (/etc/init)] ***
ok: [am-local]

TASK [artefactual.archivematica-src : Reload Upstart configuration] ************
changed: [am-local]

TASK [artefactual.archivematica-src : Add storage service gunicorn systemd (/etc/systemd)] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Add storage service gunicorn systemd env file (in /etc/default)] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Disable Nginx default server config] *****
ok: [am-local] => (item=/etc/nginx/sites-enabled/default)
ok: [am-local] => (item=/etc/nginx/sites-enabled/default.conf)

TASK [artefactual.archivematica-src : Set up Nginx server] *********************
ok: [am-local]

TASK [artefactual.archivematica-src : Template nginx sites-available ssl config file] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Remove non-ssl config from sites-enabled] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Add ssl config to sites-enabled] *********
skipping: [am-local]

TASK [artefactual.archivematica-src : Enable services (upstart)] ***************
changed: [am-local] => (item=archivematica-storage-service)
changed: [am-local] => (item=nginx)

TASK [artefactual.archivematica-src : Enable services (systemd)] ***************
skipping: [am-local] => (item=archivematica-storage-service) 
skipping: [am-local] => (item=nginx) 

TASK [artefactual.archivematica-src : Checkout Archivematica code] *************
ok: [am-local]

TASK [artefactual.archivematica-src : set_fact] ********************************
ok: [am-local]

TASK [artefactual.archivematica-src : create temp directory in local machine to store osdeps files] ***
ok: [am-local -> 127.0.0.1]

TASK [artefactual.archivematica-src : fetch dependencies files from source code repository to local machine] ***
changed: [am-local] => (item=archivematicaCommon)
changed: [am-local] => (item=MCPServer)
changed: [am-local] => (item=MCPClient)
changed: [am-local] => (item=dashboard)

TASK [artefactual.archivematica-src : command] *********************************
changed: [am-local -> 127.0.0.1]

TASK [artefactual.archivematica-src : set_fact] ********************************
ok: [am-local]

TASK [artefactual.archivematica-src : set_fact] ********************************
ok: [am-local]

TASK [artefactual.archivematica-src : debug] ***********************************
ok: [am-local] => {
    "msg": "comps is [ 'archivematicaCommon', 'MCPServer', 'MCPClient', 'dashboard' ], comps_osdeps is [u'archivematicaCommon', u'archivematica-storage-service', u'dashboard', u'MCPClient', u'MCPServer'], comps_no_osdeps is []"
}

TASK [artefactual.archivematica-src : include variables from retrieved dependencies files] ***
ok: [am-local] => (item=archivematicaCommon)
ok: [am-local] => (item=archivematica-storage-service)
ok: [am-local] => (item=dashboard)
ok: [am-local] => (item=MCPClient)
ok: [am-local] => (item=MCPServer)

TASK [artefactual.archivematica-src : include pipeline-osdeps-archivematicaCommon] ***
included: /home/kieranjol/deploy-pub/playbooks/archivematica/roles/artefactual.archivematica-src/tasks/pipeline-osdeps-archivematicaCommon.yml for am-local

TASK [artefactual.archivematica-src : Install archivematicaCommon ext. deps. repo keys] ***

TASK [artefactual.archivematica-src : Add archivematicaCommon ext. deps. repos (ubuntu)] ***

TASK [artefactual.archivematica-src : Install archivematicaCommon ext. package deps. (ubuntu)] ***
ok: [am-local] => (item={u'state': u'latest', u'name': u'libmysqlclient-dev'})

TASK [artefactual.archivematica-src : Add archivematicaCommon ext. deps. repos (RH/CentOS)] ***

TASK [artefactual.archivematica-src : Install archivematicaCommon ext. package deps. (RH/CentOS)] ***
skipping: [am-local] => (item={u'state': u'latest', u'name': u'libmysqlclient-dev'}) 

TASK [artefactual.archivematica-src : Install archivematicaCommon ext. package deps. (2)(RH/CentOS)] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Install archivematicaCommon ext. package deps. (3)(RH/CentOS)] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : include pipeline-osdeps-MCPServer] *******
included: /home/kieranjol/deploy-pub/playbooks/archivematica/roles/artefactual.archivematica-src/tasks/pipeline-osdeps-MCPServer.yml for am-local

TASK [artefactual.archivematica-src : Install MCPServer ext. deps. repo keys] ***

TASK [artefactual.archivematica-src : Add MCPServer ext. deps. repos] **********

TASK [artefactual.archivematica-src : Install MCPServer ext. package deps.] ****
ok: [am-local] => (item={u'state': u'latest', u'name': u'dbconfig-common'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'logapp'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'uuid'})

TASK [artefactual.archivematica-src : Add MCPServer ext. deps. repos (RH/CentOS)] ***

TASK [artefactual.archivematica-src : Install MCPServer ext. package deps. (RH/CentOS)] ***
skipping: [am-local] => (item={u'state': u'latest', u'name': u'dbconfig-common'}) 
skipping: [am-local] => (item={u'state': u'latest', u'name': u'logapp'}) 
skipping: [am-local] => (item={u'state': u'latest', u'name': u'uuid'}) 

TASK [artefactual.archivematica-src : Install MCPServer ext. package deps. (2) (RH/CentOS)] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : Install MCPServer ext. package deps. (3) (RH/CentOS)] ***
skipping: [am-local]

TASK [artefactual.archivematica-src : include pipeline-osdeps-MCPClient] *******
included: /home/kieranjol/deploy-pub/playbooks/archivematica/roles/artefactual.archivematica-src/tasks/pipeline-osdeps-MCPClient.yml for am-local

TASK [artefactual.archivematica-src : Install MCPClient ext. deps. repo keys] ***
ok: [am-local] => (item={u'url': u'https://packages.archivematica.org/GPG-KEY-archivematica', u'validate_certs': u'no', u'id': u'0x5236CA08'})

TASK [artefactual.archivematica-src : Add MCPClient ext. deps. repos] **********
ok: [am-local] => (item=deb [arch=amd64] http://packages.archivematica.org/1.6.x/ubuntu-externals trusty main)
ok: [am-local] => (item=deb http://archive.ubuntu.com/ubuntu/ trusty multiverse)
ok: [am-local] => (item=deb http://archive.ubuntu.com/ubuntu/ trusty-security universe)
ok: [am-local] => (item=deb http://archive.ubuntu.com/ubuntu/ trusty-updates multiverse)

TASK [artefactual.archivematica-src : Install MCPClient ext. package deps.] ****
ok: [am-local] => (item={u'state': u'latest', u'name': u'atool'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'bagit'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'bulk-extractor'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'ffmpeg'})
failed: [am-local] (item={u'state': u'latest', u'name': u'fits'}) => {"cache_update_time": 1498202565, "cache_updated": false, "failed": true, "item": {"name": "fits", "state": "latest"}, "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--force-confdef\" -o \"Dpkg::Options::=--force-confold\"     install 'fits'' failed: E: Failed to fetch http://packages.archivematica.org/1.6.x/ubuntu-externals/pool/main/f/fits/fits_0.8.4-1~14.04_amd64.deb  GnuTLS recv error (-9): A TLS packet with unexpected length was received.\n\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://packages.archivematica.org/1.6.x/ubuntu-externals/pool/main/f/fits/fits_0.8.4-1~14.04_amd64.deb  GnuTLS recv error (-9): A TLS packet with unexpected length was received.\n\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://packages.archivematica.org/1.6.x/ubuntu-externals/pool/main/f/fits/fits_0.8.4-1~14.04_amd64.deb  GnuTLS recv error (-9): A TLS packet with unexpected length was received.", "", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following extra packages will be installed:\n  default-jdk default-jre default-jre-headless fonts-dejavu-extra\n  libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev\n  libsm-dev libx11-dev libx11-doc libxau-dev libxcb1-dev libxdmcp-dev\n  libxt-dev nailgun nailgun-client openjdk-7-jdk openjdk-7-jre\n  x11proto-core-dev x11proto-input-dev x11proto-kb-dev xorg-sgml-doctools\n  xtrans-dev\nSuggested packages:\n  libice-doc libsm-doc libxcb-doc libxt-doc openjdk-7-demo openjdk-7-source\n  visualvm\nThe following NEW packages will be installed:\n  default-jdk default-jre default-jre-headless fits fonts-dejavu-extra\n  
libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev\n  libsm-dev libx11-dev libx11-doc libxau-dev libxcb1-dev libxdmcp-dev\n  libxt-dev nailgun nailgun-client openjdk-7-jdk openjdk-7-jre\n  x11proto-core-dev x11proto-input-dev x11proto-kb-dev xorg-sgml-doctools\n  xtrans-dev\n0 upgraded, 25 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 68.9 MB/90.7 MB of archives.\nAfter this operation, 150 MB of additional disk space will be used.\nGet:1 http://packages.archivematica.org/1.6.x/ubuntu-externals/ trusty/main fits amd64 0.8.4-1~14.04 [68.9 MB]\nErr http://packages.archivematica.org/1.6.x/ubuntu-externals/ trusty/main fits amd64 0.8.4-1~14.04\n  GnuTLS recv error (-9): A TLS packet with unexpected length was received.\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following extra packages will be installed:", "  default-jdk default-jre default-jre-headless fonts-dejavu-extra", "  libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev", "  libsm-dev libx11-dev libx11-doc libxau-dev libxcb1-dev libxdmcp-dev", "  libxt-dev nailgun nailgun-client openjdk-7-jdk openjdk-7-jre", "  x11proto-core-dev x11proto-input-dev x11proto-kb-dev xorg-sgml-doctools", "  xtrans-dev", "Suggested packages:", "  libice-doc libsm-doc libxcb-doc libxt-doc openjdk-7-demo openjdk-7-source", "  visualvm", "The following NEW packages will be installed:", "  default-jdk default-jre default-jre-headless fits fonts-dejavu-extra", "  libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev", "  libsm-dev libx11-dev libx11-doc libxau-dev libxcb1-dev libxdmcp-dev", "  libxt-dev nailgun nailgun-client openjdk-7-jdk openjdk-7-jre", "  x11proto-core-dev x11proto-input-dev x11proto-kb-dev xorg-sgml-doctools", "  xtrans-dev", "0 upgraded, 25 newly installed, 0 to remove and 0 not upgraded.", "Need to get 68.9 MB/90.7 MB of archives.", "After this operation, 
150 MB of additional disk space will be used.", "Get:1 http://packages.archivematica.org/1.6.x/ubuntu-externals/ trusty/main fits amd64 0.8.4-1~14.04 [68.9 MB]", "Err http://packages.archivematica.org/1.6.x/ubuntu-externals/ trusty/main fits amd64 0.8.4-1~14.04", "  GnuTLS recv error (-9): A TLS packet with unexpected length was received."]}
ok: [am-local] => (item={u'state': u'latest', u'name': u'gearman'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'imagemagick'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'inkscape'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'jhove'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'libimage-exiftool-perl'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'libxml2-utils'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'logapp'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'md5deep'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'mediainfo'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'nfs-common'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'openjdk-7-jre-headless'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'p7zip-full'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'pbzip2'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'postfix'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'readpst'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'rsync'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'siegfried'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'sleuthkit'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'tesseract-ocr'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'tika'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'tree'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'ufraw'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'unrar-free'})
ok: [am-local] => (item={u'state': u'latest', u'name': u'uuid'})
	to retry, use: --limit @/home/kieranjol/deploy-pub/playbooks/archivematica/singlenode.retry

PLAY RECAP *********************************************************************
am-local                   : ok=118  changed=15   unreachable=0    failed=1   

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

Problem: FIDO does not support pronom version 90

Archivematica 1.6.0 includes Fido v1.3.5, which by default supports PRONOM version 88.

In March 2017, The National Archives of the UK released PRONOM v90.

Today a new version of Fido (v1.3.6) was released, which supports PRONOM v90. Archivematica should be updated to use this new version of Fido.

Shibboleth integration (merge from Jisc fork)

Merge work from JiscSD#5 and JiscSD#20 into Archivematica core.

This enables Shibboleth authentication (optionally) to allow login from academic institutions. The Archivematica functionality should not concern itself with implementation of Shibboleth protocols - it will simply respond to authentication headers received from the web server and use those to create and configure users.
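A minimal sketch of that behaviour, assuming hypothetical attribute headers injected by the web server (the function and header names below are illustrative, not the Jisc implementation):

```python
# Sketch of header-driven user provisioning, assuming the web server (e.g.
# Apache with mod_shib) has already validated the Shibboleth session and
# injects attribute headers. Header names are illustrative; real deployments
# configure the attribute mapping on the web-server side.
def user_from_headers(headers):
    username = headers.get("HTTP_EPPN")
    if not username:
        return None  # request did not come through Shibboleth
    return {
        "username": username,
        "email": headers.get("HTTP_MAIL", ""),
        # Grant admin rights based on an entitlement attribute (assumption).
        "is_superuser": "admin" in headers.get("HTTP_ENTITLEMENT", ""),
    }
```

The Django side would then create or update the matching user record, with no Shibboleth protocol logic in Archivematica itself.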

It would be helpful to first merge #665 which provides some refactors around the welcome screen that are used here.

Related SS issue: artefactual/archivematica-storage-service#210

Problem: Archivematica does not send log events to stderr

MCPServer, MCPClient, the Dashboard and the client scripts are all set up with a handler that writes to disk (/var/log/archivematica). If we sent log events to stderr instead, they would be captured by upstart, journalctl, Docker, etc.

Client scripts use sys.stdout and sys.stderr directly, but most of them also have a leveled logger defined. That's a lot of streams! I think they should all use the logger, which should be set up to send events to the standard error stream. MCPClient can continue capturing stdout/stderr, which is relevant when a client script fails, etc.
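A minimal sketch of the proposed setup, using only the standard library (the logger name is illustrative, not Archivematica's actual configuration):

```python
import logging
import sys

# Replace the file handler writing under /var/log/archivematica with a
# StreamHandler on stderr, so upstart, journald or Docker capture the events.
logger = logging.getLogger("archivematica.mcp")  # illustrative name
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
logger.handlers = [handler]  # drop any previously attached file handlers
logger.setLevel(logging.INFO)

logger.info("job started")  # goes to stderr, not to a log file on disk
```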

Problem: Can't start large transfers

As discussed in this google group thread: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/archivematica-tech/jADMHfVVUHY/vFPt4wvdAAAJ

In the qa/1.x branch, this line:
https://github.com/artefactual/archivematica/blob/qa/1.x/src/archivematicaCommon/lib/storageService.py#L57

appears to be causing a problem. The problem occurs when a transfer is started in the dashboard (either by calling the start_transfer() REST API endpoint, or by clicking the 'start transfer' button in the transfer tab of the dashboard). This causes the dashboard to make a REST call to the storage service, asking it to copy files from a specified Transfer Source Location to the pipeline. For large transfers this can take a long time, potentially minutes or hours.

Having a timeout of 5 seconds, or even 120 seconds, causes an error like the one shown in the Google group thread.
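One hedged way to handle this with the requests library is a (connect, read) timeout tuple: fail fast when the Storage Service is unreachable, but place no limit on how long the copy itself may take. The values and function below are illustrative, not Archivematica's actual code:

```python
import requests

# Illustrative sketch only. A (connect, read) timeout tuple fails fast on
# connection problems while leaving the long-running copy unbounded; a read
# timeout of None means "wait as long as the server needs".
CONNECT_TIMEOUT = 10  # seconds to establish the connection (assumption)
READ_TIMEOUT = None   # the copy may legitimately take minutes or hours

def copy_files_to_pipeline(url, payload):
    # Hypothetical helper standing in for the storageService.py call.
    return requests.post(url, json=payload,
                         timeout=(CONNECT_TIMEOUT, READ_TIMEOUT))
```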

Problem: client script reads from MCPServer config

The case is very subtle.

There are a number of client scripts that import dicts.py (from archivematicaCommon), which reads from serverConfig.conf.

So if you deployed MCPClient, you also had to provision the serverConfig.conf file.

Problem: SS 'object count' is not always an int: ``ungettext`` bug

If a transfer source location contains over 5000 files, then filesystem_ajax/views.py will trigger a TypeError because the call to ungettext assumes that the value of the 'object count' attribute returned by the SS will always be an integer. In fact, in this case the value will be a string, i.e., '5000+'.
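Since ungettext only uses the number to pick a plural form, one defensive fix is to normalize the value before the call. A hypothetical helper, not the actual Archivematica patch:

```python
def parse_object_count(value):
    """Normalize the SS 'object count', which is an int for small locations
    but the string '5000+' past 5000 files. Hypothetical helper, not the
    actual Archivematica fix."""
    if isinstance(value, int):
        return value
    # Strip the trailing '+' so ungettext receives a real integer.
    return int(str(value).rstrip('+'))
```

Treating '5000+' as 5000 is safe here because the integer only drives plural-form selection in the translated message.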

Problem: AIP download is unnamed

In 2066135 we added the ability for the Dashboard to stream AIPs served by the Storage Service to the browser. We're using stream_file_from_storage_service, which gets the job done but does not set the filename. In Chrome that results in a file named "download"; in Firefox the file is named randomly.

In Firefox, the download never finishes.
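The naming part of the fix comes down to a Content-Disposition header on the streamed response. A framework-free sketch (the helper name is hypothetical; the real change would live in stream_file_from_storage_service):

```python
def content_disposition(filename):
    """Build the header value that names a streamed download. Hypothetical
    helper; without this header Chrome saves the stream as "download" and
    Firefox invents a random name."""
    # Quote the filename so spaces and punctuation survive intact.
    return 'attachment; filename="%s"' % filename

# The Dashboard response would then carry a header such as:
# Content-Disposition: attachment; filename="transfer-aip-1234.7z"
header_value = content_disposition("transfer-aip-1234.7z")
```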

Problem: Can't start DSpace Transfer

see https://projects.artefactual.com/issues/11224

When trying to use the start_transfer api endpoint in the dashboard rest api, if the transfer being started contains zip files, a 500 server error is reported. This doesn't happen when the transfer only contains folders.

DSpace transfers are zip files, so this bug makes it impossible to start a transfer of type 'dspace' through the API (i.e. from the automation tools).

Problem: quarantine delay is not reliable

When a transfer is sent to quarantine, the user is prompted to remove it from quarantine manually. By default, the transfer is also removed from quarantine automatically after 28 days, a delay that can be configured by the user in the processing configuration.

The purpose of the delay is to allow virus definitions to be updated before the virus scan runs.

There are two modules in MCP implementing the processing delay (1, 2).

It's done with a timer from the threading module, which doesn't seem to provide real guarantees. What would happen if the process were interrupted before the timer fires?

AMQP and Redis seem to offer primitives that allow implementing scheduling; Gearman doesn't seem to provide any.

Solution 1

Deprecate the quarantine delay functionality. Only the user would be able to remove a transfer from quarantine.

Solution 2

Update virus definitions before antivirus checking?

Solution 3

Implement delayed jobs using Redis or similar. When a new scheduled job comes in, MCP would persist it somewhere. In a loop, the pending tasks would be polled frequently and new jobs dispatched when due. The following library could be used for this purpose or as a reference: https://github.com/josiahcarlson/rpqueue/ (it uses Python + Redis).
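
The pattern just described (score each job with its due timestamp, then poll for everything that is due) can be sketched as follows. This is an in-memory stand-in using a heap; a real implementation would keep the same data in a Redis sorted set (ZADD / ZRANGEBYSCORE) so the schedule survives process restarts, unlike threading.Timer:

```python
import heapq
import time

class DelayedJobQueue:
    """In-memory stand-in for the Redis sorted-set scheduling pattern:
    jobs are stored with their due timestamp as the score, and a
    polling loop pops everything whose score is <= now."""

    def __init__(self):
        self._heap = []

    def schedule(self, job, delay_seconds, now=None):
        """Persist a job to run delay_seconds from now."""
        due = (now if now is not None else time.time()) + delay_seconds
        heapq.heappush(self._heap, (due, job))

    def due_jobs(self, now=None):
        """Pop and return every job whose due time has passed."""
        now = now if now is not None else time.time()
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[1])
        return ready
```

The poll loop in MCP would call due_jobs() periodically and dispatch each returned job; with Redis as the backing store, a restart of the process loses nothing.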

Remember that Redis is already used as a Gearman backend; we've tried this successfully in our local Docker developer setup. Adding Redis to our stack would also be beneficial for other purposes, like caching in Django.

Problem: dashboard reads config from clientConfig, serverConfig and dbsettings

IMO the dashboard should just take environment variables, but it currently loads config from pretty much everywhere it shouldn't.

Some examples:

$ git grep get_client_config_value
components/accounts/views.py:from components.helpers import get_client_config_value
components/accounts/views.py:    if get_client_config_value('kioskMode') == 'True':
components/administration/views.py:            'path': helpers.get_client_config_value('sharedDirectoryMounted')
components/archival_storage/views.py:        helpers.get_client_config_value('sharedDirectoryMounted'),
components/helpers.py:def get_client_config_value(field):

And...

$ git grep get_server_config_value
components/api/views.py:SHARED_DIRECTORY_ROOT = helpers.get_server_config_value('sharedDirectory')
components/api/views.py:        helpers.get_server_config_value('watchDirectoryPath'),
components/api/views.py:    shared_directory_path = helpers.get_server_config_value('sharedDirectory')
components/archival_storage/views.py:        shared_dir = helpers.get_server_config_value('sharedDirectory')
components/filesystem_ajax/views.py:SHARED_DIRECTORY_ROOT = helpers.get_server_config_value('sharedDirectory')
components/filesystem_ajax/views.py:    shared_dir = helpers.get_server_config_value('sharedDirectory')
components/filesystem_ajax/views.py:    shared_dir = os.path.realpath(helpers.get_server_config_value('sharedDirectory'))
components/helpers.py:def get_server_config_value(field):
components/helpers.py:        get_server_config_value('sharedDirectory'),
components/ingest/pair_matcher.py:    watch_dir = helpers.get_server_config_value('watchDirectoryPath')
components/ingest/views.py:        shared_dir = helpers.get_server_config_value('sharedDirectory')
components/ingest/views.py:        watched_dir = helpers.get_server_config_value('watchDirectoryPath')
components/ingest/views.py:    watched_dir = helpers.get_server_config_value('watchDirectoryPath')
components/ingest/views.py:    shared_directory_path = helpers.get_server_config_value('sharedDirectory')
installer/views.py:            shared_path = helpers.get_server_config_value('sharedDirectory')

This level of coupling is extremely undesirable. It's really hard at the moment to run the dashboard on a separate machine unless you pull in all these configuration files.

Configuration cleanup (merge from Jisc fork)

Integrate work from JiscSD#3 into core Archivematica

This makes improvements to the configuration such that config is no longer shared between components, and it can be configured either in a config file or through environment variables.

This was added to the fork after the Docker work (#663), but might be possible without merging that first.

It will also require consideration of JiscSD#9 - the work in the Jisc PR changes the way the whitelist is used in authentication.

Problem: 0002 migration uses loaddata to import a fixture

loaddata can only use the current model classes. During a migration, we should be using the historical models provided by the migration system - this avoids a number of issues, like making it possible to delete existing models later (e.g. #576 aims to delete workflow models).
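
A data migration that avoids loaddata can create the fixture rows through the historical models instead. Below is a hedged sketch: the app label, model name, primary key and field are illustrative, not Archivematica's actual fixture.

```python
def load_initial_data(apps, schema_editor):
    """Forward function for migrations.RunPython: use the historical
    model from the migration state, never a direct model import."""
    TaskConfig = apps.get_model("main", "TaskConfig")  # hypothetical model
    TaskConfig.objects.get_or_create(
        pk="00000000-0000-0000-0000-000000000000",  # hypothetical fixture row
        defaults={"description": "Example entry"},
    )

# In the Migration class:
# operations = [
#     migrations.RunPython(load_initial_data, migrations.RunPython.noop),
# ]
```

Because the function only ever sees the model as it existed at that point in migration history, the real model class can later be renamed or deleted without breaking migration 0002.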

Problem: dashboard manage.py error on import

In #642 an archivematicaCommon module import was added to manage.py, but this is producing an error since the module can't be found even though the module path is specified using --pythonpath:

(archivematica-dashboard) vagrant@am-local:/usr/share/archivematica/dashboard$ ./manage.py --pythonpath=/usr/lib/archivematica/archivematicaCommon
Traceback (most recent call last):
  File "./manage.py", line 9, in <module>
    import elasticSearchFunctions
ImportError: No module named elasticSearchFunctions

To work around this, you need to do something like:

(archivematica-dashboard) vagrant@am-local:/usr/share/archivematica/dashboard$ PYTHONPATH=/usr/lib/archivematica/archivematicaCommon ./manage.py

This problem is breaking deployments using https://github.com/artefactual-labs/ansible-archivematica-src.

Could an alternative to #642 be to use something like https://stackoverflow.com/a/37306165 instead of adding the import to the manage.py file?
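
One alternative is to extend sys.path at the top of manage.py itself, before the archivematicaCommon imports run; the --pythonpath flag doesn't help here because Django only processes it after manage.py's own top-level imports have executed. A sketch, assuming the install path used by the packages:

```python
import os
import sys

# Assumed install location of archivematicaCommon on package installs.
COMMON_PATH = "/usr/lib/archivematica/archivematicaCommon"

def add_common_to_path(path=COMMON_PATH):
    """Prepend the archivematicaCommon directory to sys.path if missing,
    so later top-level imports in manage.py can resolve."""
    if path not in sys.path:
        sys.path.insert(0, path)

add_common_to_path()
# import elasticSearchFunctions  # now resolvable, assuming the path exists
```

This keeps deployments working without requiring every caller of manage.py to export PYTHONPATH first.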

Problem: There is an (unused) duplicate 'watchedDirectories' folder

Inside the sharedDirectory structure, there is this folder:

https://github.com/artefactual/archivematica/tree/stable/1.6.x/src/MCPServer/share/sharedDirectoryStructure/watchedDirectories/watchedDirectories/system/autoProcessSIP

As far as I can tell it is unused and could be removed. It is confusing to have a folder with a path like 'watchedDirectories/watchedDirectories'. I am not sure if there is a reason for it, but I can't find anywhere it is used.

Problem: using symlinks breaks Windows dev environments

Archivematica cannot be deployed on Windows, but this PR from @minusdavid artefactual/deploy-pub#39 makes it possible to deploy a development environment on Windows, using vagrant to deploy to a linux vm.

That PR is working great, but there is a problem with checking out a git repo that contains symlinks into a windows filesystem (google it, lots of links). Windows doesn't properly support symlinks, and so checking out a repo with symlinks is difficult, ansible roles choke, you get weird git errors, etc.

In this repo, there are only a few symlinks being used - it would not be hard to remove them altogether. I think the only place left is in the osdeps folders. Removing those symlinks and creating duplicate files for now would allow osdeps to differ for each platform, which is fine, and would make developing in a Windows environment much easier, which is a bonus.

Problem: antivirus scanning is broken in standard installation

Only affecting qa/1.x (1.7 not released yet).

By standard I mean Ansible or package installs, where MCPClient and clamd run on the same machine.
In a36a9bf we added the ability to talk to a clamd instance over the network, but we also started passing files to clamd by reference, which doesn't always work (depending on how the permissions are set up).

This is being fixed in #696 (9198655), where we're adding an option so the user can decide whether the file is passed to the antivirus by reference or by value (streaming). It defaults to streaming because that doesn't require any special config. Passing by reference may be better in some situations, but it also requires a certain level of permission orchestration.
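
For reference, the streaming (pass-by-value) path corresponds to clamd's INSTREAM command: the file's bytes are sent over the socket as length-prefixed chunks, so clamd never needs filesystem access to the file and no permission orchestration is required. A minimal sketch of the framing (the chunk size is an arbitrary choice):

```python
import struct

def instream_chunks(data, chunk_size=8192):
    """Frame file bytes for clamd's INSTREAM command: each chunk is
    prefixed with its length as a 4-byte big-endian integer, and a
    zero-length chunk marks the end of the stream."""
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield struct.pack("!L", len(chunk)) + chunk
    yield struct.pack("!L", 0)  # zero-length terminator
```

A scanner would write each frame to the clamd socket after sending the INSTREAM command, then read back the scan verdict.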

Change head branch in repository from stable/1.6.x to qa/1.x (or master)

I've noticed that Github defaults to the stable/1.6.x branch and it's because the head branch for the repository is stable/1.6.x.

I think most projects I've worked on tend to have master as the HEAD branch. While archivematica doesn't have a master branch, perhaps it should be changed to qa/1.x?

I know when I cloned the repository to write a patch, I accidentally worked off stable/1.6.x rather than qa/1.x (or master) because it was the branch that was created during cloning.

MCPClient should check for exceptions when creating Gearman worker

When constructing the Gearman worker in startThread() in /usr/lib/archivematica/MCPClient/archivematicaClient.py, there should be exception checking.

If the config item MCPArchivematicaServer has an invalid value, the worker won't run, and Archivematica will be stuck thinking the task is executing indefinitely with no indication of what's happened in the user interface or the logs.
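
A hedged sketch of what that checking could look like: wrap the worker construction so a bad MCPArchivematicaServer value is logged and surfaced instead of failing silently. Here worker_factory stands in for gearman.GearmanWorker, and the retry count is an assumption:

```python
import logging

logger = logging.getLogger(__name__)

def start_worker(worker_factory, server_list, max_attempts=3):
    """Construct the Gearman worker defensively, logging every failure
    and raising once all attempts are exhausted so the problem is
    visible in the logs rather than leaving tasks stuck forever."""
    for attempt in range(1, max_attempts + 1):
        try:
            return worker_factory(server_list)
        except Exception:
            logger.exception(
                "Could not create Gearman worker for %s (attempt %d/%d)",
                server_list, attempt, max_attempts)
    raise RuntimeError(
        "Gearman worker could not be started; "
        "check MCPArchivematicaServer in the config")
```

startThread() would then call start_worker(gearman.GearmanWorker, server_list) and let the RuntimeError abort the thread noisily instead of hanging.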

Workflow fixes related to DIP Upload & Store

From PR #680:

Changes to the workflow and related client scripts so that the "Upload DIP" and "Store DIP" decision points make more sense and allow for preconfigured choices for "Store DIP" AFTER "Upload DIP". See RM issue 11195 and closed PR #672.

Problem: Archivematica uses sudo

Some parts of Archivematica, like this client script, use commands with sudo. Is this really necessary? Isn't it enough to give Archivematica ownership of the directories it needs full read/write access to (like the sharedDirectory)?
The Archivematica installer sets up limited sudo for the archivematica user. This has security implications and may not be allowed under the IT policy of some organizations. If sudo is not required, Archivematica should be refactored to not use it.
