ckanext-qa's Introduction

CKAN QA Extension (Quality Assurance)

The ckanext-qa extension will check each of your dataset resources in CKAN and give them an 'openness score' based on Tim Berners-Lee's five stars of openness (http://lab.linkeddata.deri.ie/2010/star-scheme-by-example).

The openness score is displayed as stars on the dataset and resource pages.

Stars on the dataset

Stars spelled out on the resource

It also provides a report that allows you to view the openness (star ratings) for a single publisher or across all of them:

Openness report (star ratings) for a publisher

Requirements

Before installing ckanext-qa, make sure that you have installed ckanext-archiver (v2.0+) and ckanext-report.

Known issues:

  • If the CKAN version is earlier than 2.3, QA and Archiver information will not display on the resource read page.

Installation

To install ckanext-qa, ensure you have previously installed ckanext-archiver (v2.0+) and ckanext-report and then:

  1. Activate your CKAN virtual environment, for example:

    . /usr/lib/ckan/default/bin/activate
  2. Install the ckanext-qa Python package into your virtual environment:

    pip install -e git+http://github.com/okfn/ckanext-qa.git#egg=ckanext-qa
  3. Install the qa dependencies:

    pip install -r ckanext-qa/requirements.txt
  4. Now create the database tables:

    paster --plugin=ckanext-qa qa init --config=production.ini
  5. Add qa to the ckan.plugins setting BEFORE archiver in your CKAN config file (by default the config file is located at /etc/ckan/default/production.ini).
  6. Restart CKAN. For example if you've deployed CKAN with Apache on Ubuntu:

    sudo service apache2 reload

Upgrade from version 0.1 to 2.x

NB You should upgrade ckanext-archiver and ckanext-qa from v0.1 to 2.x in one go. Upgrade ckanext-archiver first and then carry out the following:

  1. Activate your CKAN virtual environment, for example:

    . /usr/lib/ckan/default/bin/activate
  2. Upgrade the ckanext-qa Python package:

    cd ckanext-qa
    git pull
    python setup.py develop
  3. Create the new database tables:

    paster --plugin=ckanext-qa qa init --config=production.ini
  4. Install the normal and developer dependencies:

    pip install -r requirements.txt
    pip install -r dev-requirements.txt
  5. Migrate your database to the new QA tables:

    python ckanext/qa/bin/migrate_task_status.py --write production.ini
  6. (Re)start the paster celeryd2 run processes described for ckanext-archiver.

Configuration

You must make sure that the following is set in your CKAN config:

ckan.site_url = <URL to your CKAN instance>

Optionally you can configure a different set of scores to award each resource format:

qa.resource_format_openness_scores_json = <filepath>

(The default value is resource_format_openness_scores.json.)

Running

First, make sure that Celery is running for the priority and bulk queues. This is explained in the ckanext-archiver README:

[Using Archiver](https://github.com/ckan/ckanext-archiver#using-archiver)

QA is performed when a dataset/resource is archived, or you can run it manually using a paster command:

paster --plugin=ckanext-qa qa update [dataset] --config=production.ini

Here dataset is a CKAN dataset name or ID, or you can omit it to do the QA on all datasets.

For a full list of manual commands run:

paster --plugin=ckanext-qa qa --help

Once QA has run for a dataset, you will see the stars displayed on the dataset's web page. The detected file format is also available under qa, for the dataset and for each resource, when you call package_show for it.

You can get an overall picture by generating an Openness report:

paster --plugin=ckanext-report report generate openness --config=production.ini

And view it on your CKAN site at /report/openness.

Tests

To run the tests:

  1. Activate your CKAN virtual environment, for example:

    . /usr/lib/ckan/default/bin/activate
  2. If not done already, install the dev requirements:

    (pyenv)~/pyenv/src/ckan$ pip install -r ../ckanext-qa/dev-requirements.txt
  3. From the CKAN root directory (not the extension root) do:

    (pyenv)~/pyenv/src/ckan$ nosetests --ckan ../ckanext-qa/ckanext/qa/tests/ --with-pylons=../ckanext-qa/test-core.ini

If you get the error "MagicException: None", it may be because libmagic needs an update. Try:

sudo apt-get install libmagic1

Translations

To translate the plugin into a new language (e.g. "pl"), run python setup.py init_catalog -l pl.

To update the template file with new translatable strings added in the code or templates, run python setup.py extract_messages in the root plugin directory. Then run ./ckanext/qa/i18n/unique_pot.sh -v to strip other plugins' translations.

To update the translation files for locale "pl" with the new template, run python setup.py update_catalog -l pl.

Questions

The archiver info shows on the dataset/resource pages but the QA doesn't

You need to ensure that qa is listed BEFORE archiver in your ckan.plugins setting. Otherwise the template inheritance doesn't work, and the QA information is missing from the pages.
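The ordering rule above can be checked mechanically. The following is a minimal sketch, not part of ckanext-qa: the function name is hypothetical, and it assumes you pass in the whitespace-separated value of ckan.plugins as a string.

```python
def qa_listed_before_archiver(plugins_setting):
    """Return True if 'qa' appears before 'archiver' in a ckan.plugins value.

    Hypothetical helper for illustration; the argument is the raw
    whitespace-separated string taken from the CKAN config file.
    """
    plugins = plugins_setting.split()
    if "qa" not in plugins or "archiver" not in plugins:
        return False
    return plugins.index("qa") < plugins.index("archiver")


# 'stats' stands in for any other enabled plugin.
print(qa_listed_before_archiver("stats qa archiver"))
print(qa_listed_before_archiver("stats archiver qa"))
```

The first call reports a valid ordering and the second reports the ordering that produces the symptom described in this question.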

ckanext-qa's People

Contributors

amercader, aron, beazil, bzar, cain-ish, carlqlange, dennisrpnw, dvainio, icmurray, johnglover, johnmartin, krzysztofmadejski, morty, rossjones, rufuspollock, teajaymars, threeaims, tobes, vitorbaptista, wwitzel3, zharktas, zydio

ckanext-qa's Issues

Bad URL join in Celery task when site_url has a path component

Consider ckan.site_url = http://somehost/ckan in the CKAN configuration file. With this, when building api_url here, one gets the wrong URL http://somehost/api/action because the "ckan" path component gets dropped by urljoin. One solution would be to have a trailing / in the site_url configuration option, but this is apparently not recommended. So I guess some URL manipulation would be needed on the extension side.

Note that other extensions (such as archiver and datastorer) have the same problem.
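To make the behaviour concrete, here is a small sketch of the problem and one possible workaround. The helper join_site_url is hypothetical (it is not what the extension does), and the relative path "api/action" is an assumption about how api_url is built.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

site_url = "http://somehost/ckan"

# Without a trailing slash on the base, urljoin replaces the last
# path segment ("ckan") instead of appending to it.
broken = urljoin(site_url, "api/action")


def join_site_url(base, path):
    """Hypothetical helper: append path to base, keeping base's own path."""
    return base.rstrip("/") + "/" + path.lstrip("/")


fixed = join_site_url(site_url, "api/action")

print(broken)
print(fixed)
```

Normalising the slashes on the extension side means the site_url setting itself does not need a trailing slash.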

ckanext-qa does not show rating stars

After following all the indicated steps, my CKAN server shows the word "Openness" on the page of each dataset; however, no rating (stars) is shown. What could be causing this? My server has the latest version of CKAN installed from source. Thank you.

Possibility to show a dataset that is still not open

We wanted to use ckan to make public the inventory of datasets of a public organization (even the ones that are still not open). This way we would be transparent regarding existing datasets and be able to prioritize the opening process taking into account the interest and demand of citizens.

However, we are concerned with communicating this clearly. In this sense, this extension could address this showing not only the level of maturity of open datasets, but also that some datasets exist but are still not open (no stars or other visual solution). This could solve our concern.

Incorrect template tag prevents the QA rating from being displayed on the resource page

By default, when installing the QA extension, the rating at the resource level does not work (although /qa/ works fine).

After investigation, it appears that one of the template blocks is not consistent with the main CKAN template.

File: ckanext-qa/ckanext/qa/templates_extend/package/resource_read.html

The first block is declared as follows:

{% block resource_additional_information %}
  {{ super() }}

  {{ h.qa_stars(c.resource.id) }}
{% endblock %}

But in the master CKAN template (ckan/templates/package/resource_read.html), there is no such block. The only equivalent thing is

{% block resource_additional_information %}

qa error: 'update_package' object has no attribute 'get_logger'

I get this error in the terminal when running it.
I use CKAN 2.6.6.
The archiver/report works, but I get no stars in the openness section.

[2018-07-19 13:08:34,246: ERROR/MainProcess] Task qa.update_package[another-test-ef74] raised unexpected: AttributeError("'update_package' object has no attribute 'get_logger'",)
Traceback (most recent call last):
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/hit/ckan/lib/default/src/ckanext-qa/ckanext/qa/tasks.py", line 83, in update_package
    log = update_package.get_logger()
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/celery/local.py", line 143, in __getattr__
    return getattr(self._get_current_object(), name)
AttributeError: 'update_package' object has no attribute 'get_logger'
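The traceback shows that the task object provided by the installed Celery version has no get_logger method. One defensive pattern, sketched here with the standard library only, is to use get_logger when the task offers it and fall back to a plain logger otherwise. The function name and the logger name "ckanext.qa.tasks" are assumptions for illustration, not the extension's actual fix.

```python
import logging


def resolve_task_logger(task, fallback_name="ckanext.qa.tasks"):
    """Return task.get_logger() if the task provides it, else a stdlib logger.

    Hypothetical helper: 'task' is any object; the fallback logger name
    is an assumption chosen for this example.
    """
    get_logger = getattr(task, "get_logger", None)
    if callable(get_logger):
        return get_logger()
    return logging.getLogger(fallback_name)
```

With a helper like this, the line log = update_package.get_logger() in the traceback would become log = resolve_task_logger(update_package).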

old kombu dependency

Travis builds don't appear to have been working for a fair while, due to incompatibilities introduced by the removal of _uuid_generate_random (python 2.7.11, presumably the version Travis uses in its build env). This has been fixed since 3.1.30 in kombu; what components require kombu to be pinned at 2.1.3?

Upgrade Celery to 3.x

So tests are passing.

Following from #51 (comment)

@davidread wrote:

Upgrading celery is long overdue. I'd be v happy if you're happy to test a more recent version on ckanext-archiver & qa and we can change the suggested version.

Running with CKAN 2.3

Hi,

Does this extension work only with CKAN 1.5, or will it run with later versions as well? If so, what is the highest version it will run with?

Running with CKAN 2.3 throws errors:

Running $ paster qa update throws:
requests.exceptions.MissingSchema: Invalid URL u'api/action/current_package_list_with_resources': No schema supplied

Traceback (most recent call last):
  File "/usr/lib/ckan/default/bin/paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
    exit_code = runner.run(args)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
    result = self.command()
  File "/home/krzysiek/WebServer/ckan/lib/default/src/ckanext-qa/ckanext/qa/commands.py", line 73, in command
    for package in self._package_list():
  File "/home/krzysiek/WebServer/ckan/lib/default/src/ckanext-qa/ckanext/qa/commands.py", line 136, in _package_list
    response = self.make_post(url, {'page': page, 'limit': limit})
  File "/home/krzysiek/WebServer/ckan/lib/default/src/ckanext-qa/ckanext/qa/commands.py", line 109, in make_post
    return requests.post(url, data=json.dumps(data), headers=headers)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/api.py", line 87, in post
    return request('post', url, data=data, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/sessions.py", line 276, in request
    prep = req.prepare()
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/models.py", line 221, in prepare
    p.prepare_url(self.url, self.params)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/models.py", line 287, in prepare_url
    raise MissingSchema("Invalid URL %r: No schema supplied" % url)
requests.exceptions.MissingSchema: Invalid URL u'api/action/current_package_list_with_resources': No schema supplied

Running $ paster qa update some_package throws:
requests.exceptions.MissingSchema: Invalid URL u'api/action/package_show': No schema supplied

Traceback (most recent call last):
  File "/usr/lib/ckan/default/bin/paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
    exit_code = runner.run(args)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
    result = self.command()
  File "/home/krzysiek/WebServer/ckan/lib/default/src/ckanext-qa/ckanext/qa/commands.py", line 73, in command
    for package in self._package_list():
  File "/home/krzysiek/WebServer/ckan/lib/default/src/ckanext-qa/ckanext/qa/commands.py", line 126, in _package_list
    response = self.make_post(url, data)
  File "/home/krzysiek/WebServer/ckan/lib/default/src/ckanext-qa/ckanext/qa/commands.py", line 109, in make_post
    return requests.post(url, data=json.dumps(data), headers=headers)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/api.py", line 87, in post
    return request('post', url, data=data, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/sessions.py", line 276, in request
    prep = req.prepare()
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/models.py", line 221, in prepare
    p.prepare_url(self.url, self.params)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/models.py", line 287, in prepare_url
    raise MissingSchema("Invalid URL %r: No schema supplied" % url)
requests.exceptions.MissingSchema: Invalid URL u'api/action/package_show': No schema supplied
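Both tracebacks end with a URL that has no scheme or host ('api/action/...'), which is what one would see if the site URL used to build it were empty. That cause is a guess, not something the report confirms. The sketch below shows how such a URL could be validated before the request is made; build_api_url is a hypothetical helper, not the extension's code.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def build_api_url(site_url, action):
    """Hypothetical helper: build an action API URL, failing early on a bad base."""
    parsed = urlparse(site_url or "")
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(
            "ckan.site_url must be an absolute URL, got %r" % (site_url,)
        )
    return site_url.rstrip("/") + "/api/action/" + action


print(build_api_url("http://somehost/ckan", "package_show"))
```

Failing with a message that names ckan.site_url points at the configuration rather than at the requests library.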

from six import viewkeys: cannot import name viewkeys

Thrown while calling celery: paster --plugin=ckanext-archiver celeryd2 run all
Runs on my local machine (six==1.10.0), but not on the server that was set up earlier (six==1.7.3)
Call stack: QA -> messytables -> html5lib.

Quickfix: pip install six==1.10.0

Diagnosis with pipdeptree -p six -r:

  1. Local
six==1.10.0
  - bleach==1.4.2 [requires: six]
  - html5lib==0.9999999 [requires: six] (seven 9s)
    - bleach==1.4.2 [requires: html5lib>=0.999]
    - messytables==0.15.2 [requires: html5lib]
      - ckanext-qa==2.0 [requires: messytables~=0.15.2]
  - PasteScript==2.0.2 [requires: six]
    - Pylons==0.9.7 [requires: PasteScript>=1.7.3]
  - pip-tools==1.1.2 [requires: six]
  - sqlalchemy-migrate==0.9.1 [requires: six>=1.4.1]
  2. Server
six==1.7.3
  - html5lib==0.999999999 [requires: six]  (nine 9s)
    - messytables==0.15.2 [requires: html5lib]
      - ckanext-qa==2.0 [requires: messytables>=0.15]
  - sqlalchemy-migrate==0.9.1 [requires: six>=1.4.1]

Conclusions:

  1. Off-topic: what is the 0.9999999 version of html5lib? I guess if versions were text-based it would be final-final-final.
  2. Some googling points to html5lib/html5lib-python#298, which was merged in Oct 2016, but no release has been made since then, so it's good to pin the six version here.
  3. The fix belongs in messytables (if not html5lib), but until it's fixed there it's good to bump it here as well.

Full stack trace:

Traceback (most recent call last):
[...]
  File "/home/ckan/.virtualenvs/ckan/src/ckanext-qa/ckanext/qa/tasks.py", line 17, in <module>
    from ckanext.qa.sniff_format import sniff_file_format
  File "/home/ckan/.virtualenvs/ckan/src/ckanext-qa/ckanext/qa/sniff_format.py", line 10, in <module>
    import messytables
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/messytables/__init__.py", line 21, in <module>
    from messytables.html import HTMLTableSet, HTMLRowSet
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/messytables/html.py", line 4, in <module>
    import html5lib
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/html5lib/__init__.py", line 16, in <module>
    from .html5parser import HTMLParser, parse, parseFragment
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/html5lib/html5parser.py", line 2, in <module>
    from six import with_metaclass, viewkeys, PY3
ImportError: cannot import name viewkeys
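A quick way to confirm this kind of mismatch before starting Celery is to check whether the installed module exposes the names that a dependency imports. The sketch below uses only the standard library; missing_names is a hypothetical helper, and the names passed for six are taken from the import line in the stack trace above.

```python
import importlib


def missing_names(module_name, names):
    """Return the names that module_name does not expose.

    Hypothetical helper: if the module cannot be imported at all,
    every requested name is reported as missing.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return list(names)
    return [name for name in names if not hasattr(module, name)]


# html5lib's html5parser imports these three names from six.
print(missing_names("six", ["with_metaclass", "viewkeys", "PY3"]))
```

An empty list means the installed six provides everything html5lib asks for; on the affected server one would expect viewkeys to be listed.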

Improve/start versioning

Define 2.0 as a first version and start creating new versions after fixes and improvements.

Tasks

  • Start using GitHub tags for each version.
  • Start a changelog file with info about CKAN version covered + bugs + fixes

Errors and no qa results

I've encountered the following error from a Celery task while performing QA. Do you have any idea what could cause it?

[2015-02-03 16:11:33,059: ERROR/MainProcess] Task qa.update[507f23c7-9eca-4f03-bd83-10616cad1ee7] raised exception: KeyError('package',)
Traceback (most recent call last):
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/celery/execute/trace.py", line 47, in trace
    return cls(states.SUCCESS, retval=fun(*args, **kwargs))
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/celery/app/task/__init__.py", line 247, in __call__
    return self.run(*args, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/celery/app/__init__.py", line 175, in run
    return fun(*args, **kwargs)
  File "/home/krzysiek/WebServer/ckan/lib/default/src/ckanext-qa/ckanext/qa/tasks.py", line 122, in update
    data['package'], data['position'],
KeyError: 'package'

remove dependency on ckan.model.ResourceGroup

ckan.model.ResourceGroup was removed from CKAN somewhere in mid 2014. CKAN 2.3a (Nov 2014) runs fine with latest ckanext-qa (qa update and showing resource ratings), but crashes on links under /qa due to reports.py depending on the removed ckan.model.ResourceGroup.

Relevant stack trace (nb. my CKAN lives in /mnt/ckan/ rather than /usr/lib/ckan):

[Mon Nov 24 17:35:34.501070 2014] [:error] [pid 6779] [remote 10.6.20.100:388]   c.organisations = organisations_with_broken_resource_links_by_name()
[Mon Nov 24 17:35:34.501072 2014] [:error] [pid 6779] [remote 10.6.20.100:388] File '/mnt/ckan/default/src/ckanext-qa/ckanext/qa/reports.py', line 150 in organisations_with_broken_resource_links_by_name
[Mon Nov 24 17:35:34.501075 2014] [:error] [pid 6779] [remote 10.6.20.100:388]   result = _get_broken_resource_links().keys()
[Mon Nov 24 17:35:34.501078 2014] [:error] [pid 6779] [remote 10.6.20.100:388] File '/mnt/ckan/default/src/ckanext-qa/ckanext/qa/reports.py', line 165 in _get_broken_resource_links
[Mon Nov 24 17:35:34.501081 2014] [:error] [pid 6779] [remote 10.6.20.100:388]   .join(model.ResourceGroup, model.Package.id == model.ResourceGroup.package_id)\\
[Mon Nov 24 17:35:34.501087 2014] [:error] [pid 6779] [remote 10.6.20.100:388] AttributeError: 'module' object has no attribute 'ResourceGroup'

Exception thrown when url 404

[2015-10-26 11:48:29,819: ERROR/PoolWorker-2] qa.update[3e3cc069-23c7-4470-875e-4633c7f552db]: Exception occurred during QA update: AttributeError: 'Response' object has no attribute 'error'
[2015-10-26 11:48:29,820: INFO/PoolWorker-2] Starting new HTTPS connection (1): danepubliczne.gov.pl
[2015-10-26 11:48:29,861: ERROR/MainProcess] Task qa.update[3e3cc069-23c7-4470-875e-4633c7f552db] raised exception: AttributeError("'Response' object has no attribute 'error'",)
Traceback (most recent call last):
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/celery/execute/trace.py", line 47, in trace
    return cls(states.SUCCESS, retval=fun(*args, **kwargs))
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/celery/app/task/__init__.py", line 247, in __call__
    return self.run(*args, **kwargs)
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/celery/app/__init__.py", line 175, in run
    return fun(*args, **kwargs)
  File "/home/ckan/.virtualenvs/ckan/src/ckanext-qa/ckanext/qa/tasks.py", line 162, in update
    _update_resource(context, data, log)
  File "/home/ckan/.virtualenvs/ckan/src/ckanext-qa/ckanext/qa/tasks.py", line 104, in _update_resource
    % (res.status_code, content, context, resource, res, res.error, post_data, api_url))
AttributeError: 'Response' object has no attribute 'error'
Traceback (most recent call last):
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/celery/execute/trace.py", line 47, in trace
    return cls(states.SUCCESS, retval=fun(*args, **kwargs))
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/celery/app/task/__init__.py", line 247, in __call__
    return self.run(*args, **kwargs)
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/celery/app/__init__.py", line 175, in run
    return fun(*args, **kwargs)
  File "/home/ckan/.virtualenvs/ckan/src/ckanext-qa/ckanext/qa/tasks.py", line 162, in update
    _update_resource(context, data, log)
  File "/home/ckan/.virtualenvs/ckan/src/ckanext-qa/ckanext/qa/tasks.py", line 104, in _update_resource
    % (res.status_code, content, context, resource, res, res.error, post_data, api_url))
AttributeError: 'Response' object has no attribute 'error'
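The failure happens while formatting the error message: the code reads res.error, which the Response object in use does not have. A requests Response does expose status_code and reason, so a message could be built from those instead. The sketch below is an illustration with a hypothetical helper name; it reads the attributes defensively so that it works with any response-like object.

```python
def describe_failed_response(res, api_url):
    """Hypothetical helper: summarise a failed HTTP response without res.error."""
    status = getattr(res, "status_code", "unknown")
    reason = getattr(res, "reason", None) or "no reason given"
    return "API request to %s failed: %s (%s)" % (api_url, status, reason)


class FakeResponse(object):
    """Stand-in for a response object, used only for this example."""
    status_code = 404
    reason = "Not Found"


print(describe_failed_response(FakeResponse(), "http://somehost/api/action"))
```

Because the message no longer touches a missing attribute, the original 404 is reported instead of being masked by an AttributeError.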

No rating stars in ckan 2.8

Dear contributors,
I have ckan 2.8 with dcat and dcatapit plugins installed.
I can archive using ckanext-archiver with job workers, but I get no rating stars when I run QA with the following command:

paster --plugin=ckanext-qa qa update --config=/etc/ckan/default/production.ini

Also no records in qa table.

Any suggestions please?

Thanks in advance,
Marco

ImportError: cannot import name __version__

Got this on another server while installing QA. No idea why this happens or how to debug it :/.

There is an __init__.py file with version specified.

(ckan) ckan@host:~/src/ckanext-qa$ python setup.py develop
Traceback (most recent call last):
  File "setup.py", line 2, in <module>
    from ckanext.qa import __version__
ImportError: cannot import name __version__

'thread._local' object has no attribute 'host'

When running paster qa update dataset_id I'm getting the following error. I have no idea how to tackle it.

My config file:

[server:main]
use = egg:Paste#http
host = 0.0.0.0
port = 5000

so the host is defined.

Stacktrace:

[2017-03-14 16:41:04,269: ERROR/MainProcess] Task qa.update_package[zbior-103f] raised exception: AttributeError("'thread._local' object has no attribute 'host'",)
Traceback (most recent call last):
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/celery/execute/trace.py", line 47, in trace
    return cls(states.SUCCESS, retval=fun(*args, **kwargs))
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/celery/app/task/__init__.py", line 247, in __call__
    return self.run(*args, **kwargs)
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/celery/app/__init__.py", line 175, in run
    return fun(*args, **kwargs)
  File "/home/ckan/.virtualenvs/ckan/src/ckanext-qa/ckanext/qa/tasks.py", line 68, in update_package
    update_package_(package_id, log)
  File "/home/ckan/.virtualenvs/ckan/src/ckanext-qa/ckanext/qa/tasks.py", line 93, in update_package_
    _update_search_index(package.id, log)
  File "/home/ckan/.virtualenvs/ckan/src/ckanext-qa/ckanext/qa/tasks.py", line 419, in _update_search_index
    package = toolkit.get_action('package_show')(context_, {'id': package_id})
  File "/home/ckan/.virtualenvs/ckan/src/ckan/ckan/logic/__init__.py", line 424, in wrapped
    result = _action(context, data_dict, **kw)
  File "/home/ckan/.virtualenvs/ckan/src/ckan/ckan/logic/action/get.py", line 931, in package_show
    package_dict = model_dictize.package_dictize(pkg, context)
  File "/home/ckan/.virtualenvs/ckan/src/ckan/ckan/lib/dictization/model_dictize.py", line 216, in package_dictize
    result_dict["resources"] = resource_list_dictize(result, context)
  File "/home/ckan/.virtualenvs/ckan/src/ckan/ckan/lib/dictization/model_dictize.py", line 66, in resource_list_dictize
    resource_dict = resource_dictize(res, context)
  File "/home/ckan/.virtualenvs/ckan/src/ckan/ckan/lib/dictization/model_dictize.py", line 126, in resource_dictize
    qualified=True)
  File "/home/ckan/.virtualenvs/ckan/src/ckan/ckan/lib/helpers.py", line 142, in url_for
    my_url = _routes_default_url_for(*args, **kw)
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/routes/util.py", line 257, in url_for
    host = config.host
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/routes/__init__.py", line 14, in __getattr__
    return getattr(self.__shared_state, name)
AttributeError: 'thread._local' object has no attribute 'host'

How ckanext-cloudstorage and ckanext-qa could work together?

Following up on the discussion from ckan/ckanext-archiver#36 (comment): how can QA work on resources stored in the cloud by a data portal provider (so that it's part of the infrastructure) without "archiving" them on the local application server?

@TkTech: "ckanext-cloudstorage can populate the mimetype and hash fields, which are core fields on a resource, as it's uploaded."

@KrzysztofMadejski: Some features (like hash and libmagic) would need to be run on the storage server.

QA rating without downloading files

I would like to suggest that the extension be fundamentally redesigned so that it is not necessary to download complete files.

Downloading complete files is not ideal. Some files are several gigabytes in size. For other resources, serving them requires a lot of computing time in the source system.

Therefore, I suggest different levels to determine the file format:

  1. Trust the format specification made by the user.
  2. Do an HTTP HEAD request and trust the webserver's answer.
  3. Download a few bytes of the resource and use file magic numbers.
  4. Download the complete file and do the analysis.
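The four levels above could be sketched as a single fallback chain. This is only an illustration of the proposal, not how ckanext-qa works today: the function name, the small lookup tables, and the returned level labels are all assumptions, and the caller is expected to supply the HEAD response's Content-Type and the first few bytes itself.

```python
# Tiny illustrative tables; a real implementation would need far more entries.
CONTENT_TYPES = {
    "text/csv": "CSV",
    "application/json": "JSON",
    "application/pdf": "PDF",
}
MAGIC_PREFIXES = [
    (b"%PDF", "PDF"),
    (b"PK\x03\x04", "ZIP"),
    (b"GIF8", "GIF"),
]


def guess_format(declared_format=None, content_type=None, first_bytes=b""):
    """Return (format, level) using the cheapest evidence available."""
    # Level 1: trust the format specified by the user.
    if declared_format:
        return declared_format.strip().upper(), "declared"
    # Level 2: trust the Content-Type from an HTTP HEAD request.
    if content_type:
        mime = content_type.split(";")[0].strip().lower()
        if mime in CONTENT_TYPES:
            return CONTENT_TYPES[mime], "http-head"
    # Level 3: compare the first bytes against known magic numbers.
    for prefix, fmt in MAGIC_PREFIXES:
        if first_bytes.startswith(prefix):
            return fmt, "magic"
    # Level 4: nothing cheaper worked, so a full download would be needed.
    return None, "full-download-needed"


print(guess_format(content_type="application/json; charset=utf-8"))
print(guess_format(first_bytes=b"%PDF-1.4"))
```

Returning the level alongside the format lets a caller decide how much to trust the result, for example by scoring a user-declared format more cautiously than one confirmed by magic numbers.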

qa mistakes json for txt

QA isn't detecting the JSON format of the JSON files we upload; it mistakes them for .txt, which it later mistakes for CSV.

I'll put the error down here; any help is appreciated.

2019-04-11 16:51:29,678 INFO [ckanext.qa] Sniffing file format of: /var/local/ckan/default/resources/d03/232/17-3c0b-4bbe-ac30-d4c67ef30199
2019-04-11 16:51:29,684 INFO [ckanext.qa] Magic detects file as: text/plain
2019-04-11 16:51:29,686 INFO [ckanext.qa] Mimetype translates to filetype: TXT
2019-04-11 16:51:29,692 INFO [ckanext.qa] Not JSON - 0 matches
2019-04-11 16:51:29,763 INFO [ckanext.qa] Is CSV because 1.6 cells per row (56 cells, 36 rows)
2019-04-11 16:51:29,764 ERROR [ckanext.qa] Unexpected error while calculating openness score AttributeError: 'NoneType' object has no attribute 'ugettext'
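The log shows the file being classified by pattern matches ("Not JSON - 0 matches") before falling through to the CSV check. A stricter alternative, sketched here with the standard library, is simply to try parsing the content as JSON. This is not the extension's sniffing logic; looks_like_json is a hypothetical helper and assumes the file is small enough to read into memory.

```python
import json


def looks_like_json(raw_bytes):
    """Return True if raw_bytes decodes as UTF-8 and parses as a JSON object or array."""
    try:
        text = raw_bytes.decode("utf-8-sig")  # also strips a leading BOM
    except UnicodeDecodeError:
        return False
    stripped = text.lstrip()
    # Only objects and arrays count; a bare number or string is
    # valid JSON but is not what a data file normally contains.
    if not stripped or stripped[0] not in "{[":
        return False
    try:
        json.loads(stripped)
    except ValueError:
        return False
    return True


print(looks_like_json(b'{"name": "example", "rows": [1, 2, 3]}'))
print(looks_like_json(b"a,b\n1,2\n"))
```

Running a check like this before the CSV heuristic would keep well-formed JSON from being counted as rows of cells.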

PluginNotFoundException raised for other activated plugins when running a QA celery task

When I add a dataset, in order to get the QA stars I have to deactivate these plugins in production.ini: "pdf_view geo_view geojson_view officedocs_view federgob". Then I run "paster --plugin=ckanext-qa qa update --config=/etc/ckan/default/production.ini" and activate the plugins again.

If I don't do this, an error appears and I can't see the datasets. What's the problem? Is there any solution?

Thanks.

Celery queues in stock CKAN?

By default CKAN creates a single Celery queue, while ckanext-qa and ckanext-archiver use a queue named bulk if no queues are given in the paster command. Do these extensions actually work with stock CKAN?

In opendata.fi we cherry-picked a CKAN commit from data.gov.uk that enabled defining multiple queues (https://github.com/yhteentoimivuuspalvelut/ckan/commit/85ef19c73faa7e841e9e0b94ec1044dbb6fee389), but we do not actually want to support this method in our new instances.

Celery is deprecated in future CKAN versions, but we need to get these extensions working for now. So is this just a configuration error on our end, or should these work with CKAN 2.5.2?

Import Error: from kombu import virtual

(ckan)ckan@km:~/src/ckanext-qa$ paster --plugin=ckanext-qa qa update --config=/etc/ckan/prod.ini 
Traceback (most recent call last):
[...]
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/kombu/transport/__init__.py", line 76, in _get_transport_cls
    __import__(transport_module_name)
  File "/home/ckan/.virtualenvs/ckan/local/lib/python2.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 10, in <module>
    from kombu import virtual
ImportError: cannot import name virtual
(ckan)ckan@km:~/src/ckanext-qa$ pip freeze | grep kombu
kombu==2.1.8
kombu-sqlalchemy==1.1.0

Can be related to #22 and other kombu-related issues

Is ckanext-report optional or not?

The plugin fails to initialize if there is no ckanext-report, so it seems to be required, but at the same time https://github.com/ckan/ckanext-qa/blob/master/README.rst mentions report as Optional.

So what should be the case? It's imported in so many places that I assume it's required.

README also mentions "It also provides a report that allows you to view the openness (stars ratings) across a publisher or across them all." I couldn't find any controller in ckanext-qa for that. Is that part of the report? What's the named route for it?

Make ckanext-qa translatable

Missing i18n in many templates in templates_new.

Commit will follow along with babel configuration:

cd ckanext-qa
python setup.py extract_messages
python setup.py update_catalog -l LANG
# or #python setup.py init_catalog -l LANG

# Translate i18n/LANG/LC_MESSAGES/ckanext-qa.po

Using QA in automatic manner

I'm wondering whether I have to run paster qa update manually, or whether it will run automatically for a new resource if archiver's celeryd is running.

Some errors

Trying to "paster qa update" a dataset:

ckanext-qa/ckanext/qa/commands.py", line 129, in _package_list
(id, url, response.error))
AttributeError: 'Response' object has no attribute 'error'

Suppose you fix this; then you'll get an error trying to get the dataset via the API, because I think you use a POST to get the dataset, and the API says you get it with ?id=id-of-the-dataset appended to the URL, I guess only via GET.

HelperError: relative_url_for has not been defined in openness.html line 40

I get an internal server error when I hit my /report/openness link and the error log has the following in it:

File '/usr/lib/ckan/default/src/ckanext-qa/ckanext/qa/templates/report/openness.html', line 40 in top-level template code
  <td>{{ h.link_to(row['organization_title'], h.relative_url_for(organization=row['organization_name'])) }}</td>
File '/usr/lib/ckan/default/lib/python2.7/site-packages/jinja2/environment.py', line 412 in getattr
  return obj[attribute]
File '/usr/lib/ckan/default/src/ckan/ckan/lib/helpers.py', line 62 in __getitem__
  key=key
HelperError: Helper 'relative_url_for' has not been defined.

I am running 2.6.0 on Ubuntu 14.04 and have the latest versions of the QA, Report and Archiver plugins installed.

Support for CKAN 2.10

To support CKAN version 2.10, it is necessary to update the event trigger from after_show to after_dataset_show in:

def after_show(self, context, pkg_dict):

This modification aligns with the deprecations and changes outlined in the CKAN Changelog for Version 2.10:

There might be other potential deprecations and adjustments related to CKAN 2.10 and the transition from Python 2 to Python 3. There's also an existing PR, currently not including the changes outlined above.
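One way to support both the old and the new hook name during a transition is to implement the new method and keep the old name as an alias. The sketch below uses a plain class so that it runs without CKAN installed; the class name and the method body are placeholders, and a real plugin would implement CKAN's IPackageController interface instead.

```python
class QAPluginSketch(object):
    """Illustrative stand-in for the plugin class; not the real implementation."""

    def after_dataset_show(self, context, pkg_dict):
        # Placeholder body: the real hook would add QA information
        # to pkg_dict here before returning it.
        return pkg_dict

    # Keep the pre-2.10 name working by pointing it at the same function.
    after_show = after_dataset_show


plugin = QAPluginSketch()
dataset = {"id": "example"}
print(plugin.after_show({}, dataset) is plugin.after_dataset_show({}, dataset))
```

The alias keeps a single implementation, so older CKAN versions that call after_show and CKAN 2.10 calling after_dataset_show get identical behaviour.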

Resources and packages will have a set of openness keys stored in their extra properties

"Resources and packages will have a set of openness keys stored in their extra properties."

That's what the docs say, but internally this plugin uses model.TaskStatus, which seems quite suboptimal (having to query task_status_show three times in each resource view).

I wonder why you took this path?

I'm planning to put those scores in resource extra fields to be able to filter datasets based on openness: DanePubliczneGovPl/ckanext-danepubliczne#74

Error: The action 'qa_dataset_openness_show' is already implemented in 'qa'

When I add a single dataset to the queue and then run celeryd, the work completes fine.

However, when I add all datasets and then run
paster --plugin=ckanext-archiver celeryd run --queue=bulk --config=ckan.ini

the process throws exceptions after the first batch of 8 tasks has completed.

The exception:

[2016-01-19 13:02:17,503: ERROR/MainProcess] Task qa.update_package[all-island-housing-tenure-sa-ead6] raised exception: Exception("The action 'qa_dataset_openness_show' is already implemented in 'qa'",)
Traceback (most recent call last):
  File "/home/co/ckan/local/lib/python2.7/site-packages/celery/execute/trace.py", line 47, in trace
    return cls(states.SUCCESS, retval=fun(*args, **kwargs))
  File "/home/co/ckan/local/lib/python2.7/site-packages/celery/app/task/__init__.py", line 247, in __call__
    return self.run(*args, **kwargs)
  File "/home/co/ckan/local/lib/python2.7/site-packages/celery/app/__init__.py", line 175, in run
    return fun(*args, **kwargs)
  File "/src/ckanext-qa/ckanext/qa/tasks.py", line 66, in update_package
    load_config(ckan_ini_filepath)
  File "/src/ckanext-qa/ckanext/qa/tasks.py", line 38, in load_config
    conf.local_conf)
  File "/src/ckan/ckan/config/environment.py", line 232, in load_environment
    p.load_all(config)
  File "/src/ckan/ckan/plugins/core.py", line 134, in load_all
    load(*plugins)
  File "/src/ckan/ckan/plugins/core.py", line 167, in load
    plugins_update()
  File "/src/ckan/ckan/plugins/core.py", line 116, in plugins_update
    environment.update_config()
  File "/src/ckan/ckan/config/environment.py", line 369, in update_config
    logic.get_action('get_site_user')({'ignore_auth': True}, None)
  File "/src/ckan/ckan/logic/__init__.py", line 394, in get_action
    resolved_action_plugins[name]
Exception: The action 'qa_dataset_openness_show' is already implemented in 'qa'

Any clues?

paster qa update fails with ckanapi error on datasets without resources

(datacats)me@IP:/mnt/projects/datacats/private/ckanext-qa⟫ datacats paster qa update
2015-11-17 04:10:07,343 ERROR [ckanext.qa] Failed to get package list with resources from url 'http://internal-data.dpaw.wa.gov.au/api/action/current_package_list_with_resources': Forbidden
Traceback (most recent call last):
  File "/usr/lib/ckan/bin/paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "/usr/lib/ckan/local/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/lib/ckan/local/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
    exit_code = runner.run(args)
  File "/usr/lib/ckan/local/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
    result = self.command()
  File "/project/ckanext-qa/ckanext/qa/commands.py", line 73, in command
    for package in self._package_list():
  File "/project/ckanext-qa/ckanext/qa/commands.py", line 141, in _package_list
    raise CkanApiError(err)
ckanext.qa.commands.CkanApiError: Failed to get package list with resources from url 'http://internal-data.dpaw.wa.gov.au/api/action/current_package_list_with_resources': Forbidden
