reanahub / reana-commons
REANA common utilities and schemas
Home Page: https://reana-commons.readthedocs.io/
License: MIT License
On the master branch:
$ flake8 reana_commons
reana_commons/errors.py:25:9: F523 '...'.format(...) has unused arguments at position(s): 0
reana_commons/utils.py:216:8: E713 test for membership should be 'not in'
reana_commons/k8s/secrets.py:53:9: F841 local variable 'api_e' is assigned to but never used
reana_commons/k8s/secrets.py:116:9: F841 local variable 'e' is assigned to but never used
reana_commons/k8s/secrets.py:121:9: F841 local variable 'e' is assigned to but never used
The class in
reana-commons/reana_commons/api_client.py
Line 63 in e720eaa
has an __init__ method that passes reana-job-controller as the service argument to BaseAPIClient, since its methods are designed to work only with this service.
We are currently keeping a list of available CVMFS repositories, but since cvmfs-csi v2 it is possible to easily automount any repository (provided it is configured in the cluster).
In particular, consider the software.igwn.org
repository, which is currently not present in the list but does not need any additional configuration.
Reported by @lukasheinrich
/Users/lukasheinrich/Code/pmssm/recastenv/lib/python3.7/site-packages/urllib3/connectionpool.py:988:
InsecureRequestWarning: Unverified HTTPS request is being made to host 'reana.cern.ch'. Adding certificate verification is strongly advised.
See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
Stems from reanahub/reana-workflow-controller#363
Kubernetes starts the workflow-engine pod -> the workflow engine itself sets the status to running when it starts its execution (adding this to the factory create_workflow_engine_command so that all workflow engines behave the same).
Make sure all the engines set the workflow status to running by extending the factory:
reana-commons/reana_commons/utils.py
Line 248 in 732ca38
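The approach described above could be sketched as follows; the `RunStatus` enum values and the publisher's `publish_workflow_status` signature are illustrative assumptions, not the actual reana-commons API:

```python
from enum import Enum


class RunStatus(Enum):
    """Simplified stand-in for the reana-db workflow run statuses."""
    running = 1
    finished = 2


def create_workflow_engine_command(engine_run, publisher):
    """Wrap an engine's run callable so every engine first reports 'running'.

    ``publisher`` is assumed to expose
    ``publish_workflow_status(workflow_uuid, status)``.
    """
    def run_workflow_engine(workflow_uuid, *args, **kwargs):
        # Every engine behaves the same: status goes to 'running' first.
        publisher.publish_workflow_status(workflow_uuid, RunStatus.running)
        return engine_run(workflow_uuid, *args, **kwargs)

    return run_workflow_engine
```

Since each engine receives its run command from the same factory, no engine can forget to report the running state.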
This issue is most probably responsible for reanahub/reana-workflow-engine-yadage#202. It also relates to the reana-db and reana-workflow-controller repositories. The error starts in job-status-consumer when job-status messages are processed under high load on a local cluster installation.
All logs below are from job-status-consumer. The error happened with workflow 3580ae6d-f381-479b-babc-3a5033a70605 when the state of workflow 91d19f41-fd8b-4f2b-af79-e6a2ab18e491 was changing to finished.
Logs before error:
...
2021-10-06 14:54:48,128 | root | MainThread | INFO | [x] Received workflow_uuid: 3580ae6d-f381-479b-babc-3a5033a70605 status: RunStatus.running
...
Error:
...
2021-10-06 14:56:35,497 | root | MainThread | INFO | [x] Received workflow_uuid: 91d19f41-fd8b-4f2b-af79-e6a2ab18e491 status: RunStatus.finished
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_23.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_36.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_19.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_30.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_08.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_25.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_34.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_12.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_01.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_37.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_40.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_27.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_07.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_00.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_09.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_11.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_10.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_17.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_20.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_06.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_24.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_26.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_38.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_33.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_35.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_22.png': No such file or directory
du: cannot access '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows/3580ae6d-f381-479b-babc-3a5033a70605/_yadage/adage/track/dag_15.png': No such file or directory
2021-10-06 14:56:35,607 | root | MainThread | ERROR | Unexpected error while processing workflow: Command '['du', '-s', '-b', '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows']' returned non-zero exit status 1.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/reana_workflow_controller/consumer.py", line 100, in on_message
_update_workflow_status(workflow, next_status, logs)
File "/usr/local/lib/python3.8/site-packages/reana_workflow_controller/consumer.py", line 138, in _update_workflow_status
Workflow.update_workflow_status(Session, workflow.id_, status, logs, None)
File "/code/modules/reana-db/reana_db/models.py", line 658, in update_workflow_status
raise e
File "/code/modules/reana-db/reana_db/models.py", line 653, in update_workflow_status
workflow.status = status
File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/attributes.py", line 279, in __set__
self.impl.set(
File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/attributes.py", line 872, in set
value = self.fire_replace_event(
File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/attributes.py", line 880, in fire_replace_event
value = fn(
File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/events.py", line 2174, in wrap
fn(target, *arg)
File "/code/modules/reana-db/reana_db/models.py", line 715, in workflow_status_change_listener
_update_disk_quota(workflow)
File "/code/modules/reana-db/reana_db/models.py", line 687, in _update_disk_quota
update_users_disk_quota(user=workflow.owner)
File "/code/modules/reana-db/reana_db/utils.py", line 242, in update_users_disk_quota
disk_usage_bytes = get_disk_usage_or_zero(workspace_path)
File "/code/modules/reana-db/reana_db/utils.py", line 253, in get_disk_usage_or_zero
disk_bytes = get_disk_usage(workspace_path, summarize=True)
File "/code/modules/reana-commons/reana_commons/utils.py", line 306, in get_disk_usage
disk_usage_info = get_disk_usage_info_paths(directory, command, name_filter)
File "/code/modules/reana-commons/reana_commons/utils.py", line 271, in get_disk_usage_info_paths
disk_usage_info = subprocess.check_output(command).decode().split()
File "/usr/local/lib/python3.8/subprocess.py", line 415, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/local/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['du', '-s', '-b', '/var/reana/users/00000000-0000-0000-0000-000000000000/workflows']' returned non-zero exit status 1.
2021-10-06 14:56:35,684 | root | MainThread | INFO | [x] Received workflow_uuid: e643fb12-a75e-46ad-9135-b2b0362b2695 status: RunStatus.finished
...
After error:
...
2021-10-06 14:56:40,413 | root | MainThread | INFO | [x] Received workflow_uuid: 3580ae6d-f381-479b-babc-3a5033a70605 status: RunStatus.running
2021-10-06 14:56:40,559 | root | MainThread | INFO | [x] Received workflow_uuid: 3580ae6d-f381-479b-babc-3a5033a70605 status: RunStatus.finished
...
Side note: the error is caught in job-status-consumer, but because the try/catch block has a huge scope the _delete_pod... function is never reached and we lose the message, hence the run-batch pod stays in the NotReady state.
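The side-note fix could look like the sketch below, with a simplified handler: keeping the try/except scope narrow lets the clean-up run and the message be consumed even when the status update fails. All names here are illustrative stand-ins, not the actual consumer API:

```python
import logging


def on_message(workflow_uuid, next_status, update_status, delete_pod):
    """Handle a job-status message with a narrow try/except scope.

    ``update_status`` and ``delete_pod`` are callables standing in for the
    real status-update and pod clean-up steps.
    """
    try:
        # Only the failure-prone step (e.g. the disk-usage update that can
        # fail when workspace files disappear mid-run) is guarded.
        update_status(workflow_uuid, next_status)
    except Exception:
        # Log and carry on instead of aborting the whole message handler.
        logging.exception("Could not update status of workflow %s", workflow_uuid)
    # Clean-up sits outside the narrow block, so it is always reached.
    delete_pod(workflow_uuid)
```

With this shape, a failing `du` invocation no longer leaves the run-batch pod in NotReady.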
After introducing REANA_RUNTIME_KUBERNETES_KEEP_ALIVE_JOBS_WITH_STATUSES, REANA instances that set it to failed or failed,finished will encounter a problem of jobs being queued forever. Steps to reproduce:
~/reana $ git diff
diff --git a/helm/configurations/values-dev.yaml b/helm/configurations/values-dev.yaml
index ce48bf8..da418df 100644
--- a/helm/configurations/values-dev.yaml
+++ b/helm/configurations/values-dev.yaml
@@ -3,10 +3,11 @@
components:
reana_server:
image: reanahub/reana-server
+ environment:
+ REANA_MAX_CONCURRENT_BATCH_WORKFLOWS: 2
reana_workflow_controller:
image: reanahub/reana-workflow-controller
+ environment:
+ REANA_RUNTIME_KUBERNETES_KEEP_ALIVE_JOBS_WITH_STATUSES: failed
reana_workflow_engine_cwl:
image: reanahub/reana-workflow-engine-cwl
reana_workflow_engine_yadage:
~/reana-demo-worldpopulation $ git diff
diff --git a/reana.yaml b/reana.yaml
index e3db858..bd95d23 100644
--- a/reana.yaml
+++ b/reana.yaml
@@ -16,7 +16,7 @@ workflow:
steps:
- environment: 'reanahub/reana-env-jupyter:1.0.0'
commands:
- - mkdir -p results && papermill ${notebook} /dev/null -p input_file ${input_file} -p output_file ${output_file} -p region ${region} -p year_min ${year_min} -p year_max ${year_max}
+ - wrong
outputs:
files:
- results/plot.png
~/reana-demo-worldpopulation $ reana-client run
...
workflow.1 has been queued
~/reana-demo-worldpopulation $ reana-client run
...
workflow.2 has been queued
run-batch will be kept in NotReady status (because REANA_RUNTIME_KUBERNETES_KEEP_ALIVE_JOBS_WITH_STATUSES=failed):
~/reana-demo-worldpopulation $ kubectl get pods | grep run-batch
reana-run-batch-3908943c-4700-4a45-94f4-74d3e1fb54a1-wg7rx 1/2 NotReady 0 11m
reana-run-batch-5ceb36b7-3bbf-43b9-8e6e-f57a8ebf12f4-xvzvr 1/2 NotReady 0 11m
~/reana-demo-worldpopulation $ reana-client run
...
workflow.3 has been queued
The count returned by check_running_reana_batch_workflows_count will surpass REANA_MAX_CONCURRENT_BATCH_WORKFLOWS=2:
~/reana-demo-worldpopulation $ kubectl logs reana-server-85fb4bd9bf-2tvjz scheduler | tail
2021-03-01 09:20:49,361 | root | MainThread | ERROR | Requeueing workflow workflow.3 ...
2021-03-01 09:20:49,387 | root | MainThread | INFO | REANA not ready to run workflow workflow.3, requeueing ...
2021-03-01 09:20:49,400 | root | MainThread | ERROR | Requeueing workflow workflow.3 ...
2021-03-01 09:20:49,428 | root | MainThread | INFO | REANA not ready to run workflow workflow.3, requeueing ...
2021-03-01 09:20:49,442 | root | MainThread | ERROR | Requeueing workflow workflow.3 ...
2021-03-01 09:20:49,473 | root | MainThread | INFO | REANA not ready to run workflow workflow.3, requeueing ...
2021-03-01 09:20:49,496 | root | MainThread | ERROR | Requeueing workflow workflow.3 ...
2021-03-01 09:20:49,533 | root | MainThread | INFO | REANA not ready to run workflow workflow.3, requeueing ...
2021-03-01 09:20:49,571 | root | MainThread | ERROR | Requeueing workflow workflow.3 ...
2021-03-01 09:20:49,743 | root | MainThread | INFO | REANA not ready to run workflow workflow.3, requeueing ...
This happens because the check counts workflows with reana_workflow_mode=batch which are running.

We are seeing the following warning in the builds:
...
reana-commons 0.5.0.dev20190408 has requirement jsonschema[format]<2.7,>=2.6.0, but you'll have jsonschema 3.0.1 which is incompatible.
...
We should remove the upper boundary.
See #17 (comment).
We could improve reporting about apparently failed message publishing. Currently, the researchers are seeing:
2021-09-27 08:46:08,690 | root | MainThread | ERROR | Error while publishing channel disconnected
2021-09-27 08:46:08,691 | root | MainThread | INFO | Retry in 0 seconds.
2021-09-27 08:46:08,750 | root | MainThread | INFO | Workflow 4a0cb177-db8d-4cb2-9622-314790060ab7 finished. Files available at ...
The above ERROR is not fatal, because under the hood we retry publishing the message several times. So it would be more appropriate to emit a WARNING here, something like:
2021-09-27 08:46:08,690 | root | MainThread | WARNING | Error while publishing channel disconnected, retry 1 of 3...
2021-09-27 08:46:18,690 | root | MainThread | WARNING | Error while publishing channel disconnected, retry 2 of 3...
2021-09-27 08:46:18,690 | root | MainThread | WARNING | Error while publishing channel disconnected, retry succeeded.
2021-09-27 08:46:08,750 | root | MainThread | INFO | Workflow 4a0cb177-db8d-4cb2-9622-314790060ab7 finished. Files available at ...
We should emit the ERROR only when the process fails three times for good.
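The proposed behaviour could be sketched with an illustrative helper like the one below (the real publisher's retry machinery may be structured differently):

```python
import logging
import time


def publish_with_retries(publish, max_retries=3, sleep=time.sleep):
    """Call ``publish`` with retries: WARNING for intermediate failures,
    ERROR (and re-raise) only once all retries are exhausted.

    ``publish`` is any zero-argument callable; ``sleep`` is injectable so
    tests can skip the waiting.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return publish()
        except Exception as error:
            if attempt == max_retries:
                logging.error(
                    "Error while publishing: %s, giving up after %d retries.",
                    error, max_retries,
                )
                raise
            logging.warning(
                "Error while publishing: %s, retry %d of %d ...",
                error, attempt, max_retries,
            )
            sleep(10)
```

This way researchers only see an ERROR when the message was actually lost.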
Originated in reanahub/reana-workflow-engine-yadage#202
reana-commons/reana_commons/publisher.py
Lines 82 to 83 in 535511f
Currently REANA-Workflow-Controller and REANA-Server hold DB fixtures which should be extracted to have a common code base.
For inspiration we have pytest-invenio.
Following this recipe, r-d-r-roofit returns [(2, 67108864.0)], but there shouldn't be any parallelization there.
This is due to the all rule not being properly filtered out here:
reana-commons/reana_commons/snakemake.py
Line 157 in e9d6987
The expected value is [(1, 67108864.0)] instead, 67108864.0 bytes being 64 MiB.
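The intended filtering could be sketched like this, using plain dicts to stand in for Snakemake job objects (the field names and helper are illustrative, not the actual reana_commons/snakemake.py code):

```python
def parallel_jobs_and_memory(jobs):
    """Return [(number of parallel jobs, max memory in bytes)].

    The default Snakemake 'all' target rule only collects outputs, so it
    should be skipped: it adds no real parallelisation.
    """
    real_jobs = [job for job in jobs if job["rule"] != "all"]
    max_memory = max(job.get("memory", 0) for job in real_jobs)
    return [(len(real_jobs), max_memory)]
```

With the `all` rule excluded, a single-step workflow such as r-d-r-roofit yields `[(1, 67108864.0)]` as expected.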
Following the discussion in #415, we should consider updating the JSON Schemas from draft-4 to a newer version, such as draft 2020-12. This update offers several advantages: for example, newer drafts support if/then conditionals, which can be particularly useful in complex scenarios or when validating specifications for different workflow engines.
The migration process should be straightforward, mainly involving the replacement of id with $id and definitions with $defs. Also, I don't see how the migration could be backwards-incompatible, since I think we only use the schema for the REANA specification validator in reana-commons.
Note that despite being referred to as "drafts," JSON Schema versions are production-ready, as explained here.
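The mechanical part of the migration could be sketched as below, together with an if/then excerpt showing the kind of conditional the newer drafts allow (the `snakefile` requirement is a made-up example, not the actual REANA schema):

```python
# Example of an if/then conditional available in newer drafts: require a
# Snakemake-specific field only when the workflow type is 'snakemake'.
WORKFLOW_SCHEMA_EXCERPT = {
    "if": {"properties": {"type": {"const": "snakemake"}}},
    "then": {"required": ["snakefile"]},
}


def migrate_schema_keywords(schema):
    """Rename draft-4 keywords to their 2020-12 equivalents at the top level.

    A real migration would also rewrite nested '$ref' targets; this sketch
    only covers the keyword renames mentioned above.
    """
    migrated = dict(schema)
    migrated["$schema"] = "https://json-schema.org/draft/2020-12/schema"
    if "id" in migrated:
        migrated["$id"] = migrated.pop("id")
    if "definitions" in migrated:
        migrated["$defs"] = migrated.pop("definitions")
    return migrated
```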
Currently, any component that uses REANA-Commons will have failing tests, e.g. this build in RWC, because of:
...
fs.errors.CreateFailed: root path '/var/reana' does not exist
...
This happens because even though we create a temporary directory and configure the Flask app to take it into account in tests, the SHARED_VOLUME_PATH inside REANA-Commons remains unchanged, and REANA-Commons's get_disk_usage relies on it directly.
Possible solutions:
- Override SHARED_VOLUME_PATH through an environment variable.
- Pass the path to get_disk_usage as a new parameter. This way it would be the responsibility of the clients to load dynamic config (e.g. RWC), but get_disk_usage is also called from R-DB (a non-Flask component, therefore with no dynamic config).
- Extract RWC's FS fixture to reuse it across components:
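The first two options could be sketched as follows; the function names and the simplified path resolution are illustrative, not the actual get_disk_usage implementation:

```python
import os

# Default shared volume path currently hard-coded in REANA-Commons.
DEFAULT_SHARED_VOLUME_PATH = "/var/reana"


def get_shared_volume_path():
    """Option 1: allow overriding the path via an environment variable."""
    return os.getenv("SHARED_VOLUME_PATH", DEFAULT_SHARED_VOLUME_PATH)


def resolve_workspace_path(workspace, shared_volume_path=None):
    """Option 2: accept the root path as an explicit parameter.

    Test clients (e.g. RWC) can pass their temporary directory; this stands
    in for the path resolution done inside get_disk_usage.
    """
    root = shared_volume_path or get_shared_volume_path()
    return os.path.join(root, workspace.lstrip("/"))
```

Either way, tests no longer depend on /var/reana existing on the host.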
When receiving an email sent via send_email, the To: header shows Undisclosed recipients:;.
To solve this, it might be sufficient to add From: and To: headers to the email message.
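A minimal sketch using the standard library's email package (the addresses and helper name are placeholders):

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build an email with explicit From: and To: headers so that mail
    clients do not display 'Undisclosed recipients:;'."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg
```

The resulting message can then be handed to `smtplib.SMTP.send_message`, which reads the recipients from these headers.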
Tests are not passing with ubuntu-20.04 because of a directory hash mismatch (calculate_hash_of_dir is used to calculate it). The error can be found here. With ubuntu-18.04 it works fine.
Originally posted by @audrium in #237 (comment)
Since it wasn't happening before, can you reproduce it locally? I would have guessed it would be somehow related to reanahub/reana-workflow-controller@4701718, but it isn't, because if the directory didn't exist it would return -1, not a different hash. Some very wild guesses:
Originally posted by @diegodelemos in #237 (comment)
Currently we have a hard-coded sleep time in check_connection_to_job_controller():
$ rg -C 3 'sleep\('
reana_commons/utils.py
394- break
395- except Exception:
396- pass
397: time.sleep(10)
398- retry_counter += 1
399- else:
400- logging.error("Job controller is not reachable.", exc_info=True)
We should make it configurable, or at the very least introduce a variable in reana_commons/config.py so that these hard-coded values can be spotted easily in a centralised place.
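The centralised configuration could look like the sketch below; the variable names are assumptions for illustration, not existing reana_commons/config.py entries:

```python
import os

# Seconds to sleep between attempts to reach reana-job-controller,
# overridable via the environment instead of hard-coding time.sleep(10).
REANA_JOB_CONTROLLER_CONNECTION_CHECK_SLEEP = float(
    os.getenv("REANA_JOB_CONTROLLER_CONNECTION_CHECK_SLEEP", "10")
)

# Maximum number of connection attempts before giving up.
REANA_JOB_CONTROLLER_CONNECTION_MAX_RETRIES = int(
    os.getenv("REANA_JOB_CONTROLLER_CONNECTION_MAX_RETRIES", "5")
)
```

check_connection_to_job_controller() would then import these constants instead of embedding the literals.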
Fix ReadTheDocs build:
The logic to load REANA Yadage specifications exists both in r-client and in r-w-e-yadage. The idea is to move this logic to r-commons and import it from there.
Tests are failing due to a new version of bravado-core (6.1.1).
In particular, this is the error when running ./run-tests.sh --check-pytest
:
pkg_resources.UnknownExtra: jsonschema 3.2.0 has no such extra feature 'format-nongpl'
bravado-core 6.1.1 changed its jsonschema dependency from jsonschema[format]>=2.5.1 to jsonschema[format-nongpl]>=2.5.1 (see Yelp/bravado-core@83afc79).
However, the extra of jsonschema was renamed from format_nongpl
to format-nongpl
only in version 4.9.0 (see python-jsonschema/jsonschema@438c8fb), and reana-commons is pinning jsonschema[format]>=3.0.1,<4.0.0
.
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2825, in requires
deps.extend(dm[safe_extra(ext)])
KeyError: 'format-nongpl'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "setup.py", line 64, in <module>
setup(
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/setuptools/__init__.py", line 103, in setup
return distutils.core.setup(**attrs)
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/setuptools/dist.py", line 963, in run_command
super().run_command(command)
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/runner/work/reana/reana/.eggs/pytest_runner-6.0.1-py3.8.egg/ptr/__init__.py", line 196, in run
installed_dists = self.install_dists(dist)
File "/home/runner/work/reana/reana/.eggs/pytest_runner-6.0.1-py3.8.egg/ptr/__init__.py", line 147, in install_dists
orig.test.install_dists(dist), self.install_extra_dists(dist)
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/setuptools/command/test.py", line 194, in install_dists
tr_d = dist.fetch_build_eggs(dist.tests_require or [])
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/setuptools/dist.py", line 636, in fetch_build_eggs
return _fetch_build_eggs(self, requires)
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/setuptools/installer.py", line 38, in _fetch_build_eggs
resolved_dists = pkg_resources.working_set.resolve(
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pkg_resources/__init__.py", line 834, in resolve
new_requirements = dist.requires(req.extras)[::-1]
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2827, in requires
raise UnknownExtra(
pkg_resources.UnknownExtra: jsonschema 3.2.0 has no such extra feature 'format-nongpl'
Rename all variables using *_uuid
to *_id
.
We can centralize the creation of OpenAPI clients (from reana-server and reana-client) by moving them to reana-commons. The same could be done for the OpenAPI tests (like in job-controller and workflow-controller).
Note there are two tests for the OpenAPI spec: one validates the schema, and one verifies that the schema is not outdated.
Depends on #284
Introduce a Snakemake operational option to enable the generation of HTML reports.
...
inputs:
files:
- code/helloworld.py
- data/names.txt
directories:
- workflow/snakemake
parameters:
input: workflow/snakemake/inputs.yaml
options:
report: myreport.html
...
Currently, if there are dangling images in the Kubernetes node(s), the check_predefined_conditions call will fail when we call list_nodes because of kubernetes-client/python#895. The exact reason is described in kubernetes-client/python#895 (comment); there has been a fix attempt, but the issue hasn't been resolved yet.
The traceback when this happens:
$ kubectl logs reana-server-xxxxx-yyy scheduler --previous
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/models/v1_container_image.py", line 75, in names
raise ValueError("Invalid value for `names`, must not be `None`") # noqa: E501
ValueError: Invalid value for `names`, must not be `None`
Note the usage of --previous: because the process exits, you won't see any logs if you do kubectl logs reana-server-xxxxx-yyy, as you would see the logs of the new pod.
Since we cannot control this, we should be more resilient and avoid blocking the whole system for this reason.
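Such a resilience guard could be sketched as below, with `list_nodes` abstracted as a plain callable rather than the real Kubernetes client call:

```python
import logging


def get_schedulable_nodes(list_nodes):
    """Tolerate the kubernetes-client deserialisation bug
    (kubernetes-client/python#895) instead of crashing the scheduler.

    ``list_nodes`` is a zero-argument callable standing in for the real
    Kubernetes API call. Returns None when the node list cannot be fetched.
    """
    try:
        return list_nodes()
    except ValueError:
        # Dangling images make the client raise "Invalid value for `names`,
        # must not be `None`"; treat the node list as unknown rather than
        # blocking the whole system.
        logging.warning("Could not list cluster nodes, skipping node checks.")
        return None
```

Callers would then skip the node-based checks when `None` is returned instead of letting the pod crash.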
Check Python package dependency troubles such as kombu/celery described in #102.
(stems from #102 (comment))
Table user should be renamed to something like user_ or users so that it doesn't clash with PostgreSQL's built-in user table.
We installed Reana in our cluster in a specific namespace (called reana
in our case).
When trying to access CVMFS within a workflow, the PVC gets created in the default
namespace, and we think this is the reason why the volume is not mounted correctly and the workflow fails.
We thought about two different workarounds:
This would mean that every time a user specifies
resources:
cvmfs:
- fcc.cern.ch
a csi-cvmfs-<exp_name>
PVC gets created in the Reana release namespace.
This would be achieved by:
import copy

REANA_CVMFS_PVC_TEMPLATE = {
    "metadata": {"name": "", "namespace": ""},
    "spec": {
        "accessModes": ["ReadOnlyMany"],
        "storageClassName": "",
        "resources": {"requests": {"storage": "1G"}},
    },
}
"""CVMFS persistent volume claim template."""

REANA_CVMFS_SC_TEMPLATE = {
    "metadata": {"name": "", "namespace": ""},
    "provisioner": "cvmfs.csi.cern.ch",
    "parameters": {"repository": ""},
}
"""CVMFS storage class template."""

def render_cvmfs_pvc(cvmfs_volume):
    """Render REANA_CVMFS_PVC_TEMPLATE."""
    name = CVMFS_REPOSITORIES[cvmfs_volume]  # existing reana-commons mapping
    # Deep-copy so that rendering does not mutate the template's nested dicts.
    rendered_template = copy.deepcopy(REANA_CVMFS_PVC_TEMPLATE)
    rendered_template["metadata"]["name"] = "csi-cvmfs-{}-pvc".format(name)
    rendered_template["metadata"]["namespace"] = "reana"
    rendered_template["spec"]["storageClassName"] = "csi-cvmfs-{}".format(name)
    return rendered_template

def render_cvmfs_sc(cvmfs_volume):
    """Render REANA_CVMFS_SC_TEMPLATE."""
    name = CVMFS_REPOSITORIES[cvmfs_volume]
    rendered_template = copy.deepcopy(REANA_CVMFS_SC_TEMPLATE)
    rendered_template["metadata"]["name"] = "csi-cvmfs-{}".format(name)
    rendered_template["metadata"]["namespace"] = "reana"
    rendered_template["parameters"]["repository"] = cvmfs_volume
    return rendered_template
Or, more generally, setting rendered_template["metadata"]["namespace"] so that it comes from the Helm {{ .Release.Namespace }} value.
Adding a feature to the reana Helm chart so that CVMFS is accessible by default, if wanted and specified, on all the nodes of the cluster where REANA runs, as is done on our JupyterHub:
Here is a link to a GH Actions run with Python 2.7: https://github.com/audrium/reana-commons/runs/1347467608
It fails executing tests because the pathlib library, which is used in the test suite, is not supported in Python 2.7.
👋 Hi. I'd like to bump the yadage
version
Line 35 in 0fa088a
to v0.21.0
so that it enforces a new lower bound on packtivity
that properly handles jqlang
v1.6
and v1.7
(c.f. yadage/yadage#132). However, yadage
v0.21.0
is Python 3.8+ now that Python 3.7 is officially EOL, but reana-commons
still supports Python 3.6
Line 95 in 0fa088a
reana-commons/.github/workflows/ci.yml
Lines 119 to 120 in 0fa088a
I think from past discussions that reana tries to support EOL Python for some time given operations requirements (maybe I have this wrong?), but do you have a projected timeline for when a dependency with a requires-python of >=3.8 could be used?
Hi. 👋 For reana-client
v0.8.0
the install_requires
requires reana-commons[yadage,snakemake]>=0.8.0
install_requires = [
"click>=7",
"cwltool==3.1.20210628163208",
"jsonpointer>=2.0",
"reana-commons[yadage,snakemake]>=0.8.0,<0.9.0",
"tablib>=0.12.1,<0.13",
"werkzeug>=0.14.1",
]
and reana-commons
v0.8.0
has all the dependencies for the 'yadage'
extra pinned.
Line 36 in 3525623
So there is no way to install reana-client>=0.8.0
at the moment without also requiring these pinned versions.
This is problematic as adage v0.10.2 and yadage-schemas v0.10.7 were made as part of a collection of patch releases so that a patch release v0.1.9 of recast-atlas could be made. However, these pinned versions mean that there is no compatible way for pip to solve an install that asks for both a new recast-atlas and reana-client; c.f. recast-hep/recast-atlas#86.
Can the dependencies for the yadage
extra please be changed to lower bounds only?
$ git diff
diff --git a/setup.py b/setup.py
index 2198edb..77900b0 100755
--- a/setup.py
+++ b/setup.py
@@ -33,7 +33,7 @@ extras_require = {
"docs": ["Sphinx>=1.4.4", "sphinx-rtd-theme>=0.1.9",],
"tests": tests_require,
"kubernetes": ["kubernetes>=11.0.0,<12.0.0",],
- "yadage": ["adage==0.10.1", "yadage==0.20.1", "yadage-schemas==0.10.6",],
+ "yadage": ["adage>=0.10.1", "yadage>=0.20.1", "yadage-schemas>=0.10.6",],
"snakemake": [get_snakemake_pkg()],
"snakemake_reports": [get_snakemake_pkg("[reports]")],
}
(cc @lukasheinrich)
As a workaround, we're just using load_json
to make it work.
Assess if any operational option can be used for Snakemake workflows and adapt the load function accordingly.
Release checklist CodiMD.
The workflow engines should publish human-readable statuses taken from reana-db; e.g. for the CWL engine, publishing should change from:
publisher.publish_workflow_status(workflow_uuid, 2)
to
publisher.publish_workflow_status(workflow_uuid, WorkflowStatus.finished)
Hello,
On March 12, kubernetes released versions 10.1.0 to 11.0.0:
https://pypi.org/project/kubernetes/#history
These are now installed by pip rather than 10.0.1. The problem is that we use PyYAML 5.3.1, while kubernetes 10.1.0 requires pyyaml~=3.12 (see below).
Should we update tag 0.6.0 to use 10.0.1 explicitly here?
https://github.com/reanahub/reana-commons/blob/v0.6.0/setup.py#L32
======
Collecting kubernetes==10.0.1
Using cached kubernetes-10.0.1-py2.py3-none-any.whl (1.5 MB)
Collecting pyyaml>=3.12
======
But when I try 10.1.0 instead, I get:
====
Collecting kubernetes==10.1.0
Using cached kubernetes-10.1.0.tar.gz (689 kB)
Collecting pyyaml~=3.12
====
Currently, we retrieve the workflow workspace disk usage and serve it through reana-client list -v.
We should now do the same for all the workflows of a user, so that we know how much disk space each user takes.
For reasons that I haven't had time to diagnose yet, installs that limit the upper bound on PyYAML, like
Line 68 in 2a61fff
are causing install failures:
$ docker run --rm -ti python:3.11 /bin/bash
root@f6f2606748d9:/# python -m venv venv && . venv/bin/activate
(venv) root@f6f2606748d9:/# python -m pip --quiet install --upgrade pip setuptools wheel
(venv) root@f6f2606748d9:/# python -m pip list
Package Version
---------- -------
pip 23.2
setuptools 68.0.0
wheel 0.40.0
(venv) root@f6f2606748d9:/# python -m pip install --upgrade 'PyYAML>=5.1,<6.0'
Collecting PyYAML<6.0,>=5.1
Downloading PyYAML-5.4.1.tar.gz (175 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 175.1/175.1 kB 1.5 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [68 lines of output]
/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/config/setupcfg.py:293: _DeprecatedConfig: Deprecated config in `setup.cfg`
!!
********************************************************************************
The license_file parameter is deprecated, use license_files instead.
By 2023-Oct-30, you need to update your project and remove deprecated calls
or your builds will no longer be supported.
See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
********************************************************************************
!!
parsed = self.parsers.get(option_name, lambda x: x)(value)
running egg_info
writing lib3/PyYAML.egg-info/PKG-INFO
writing dependency_links to lib3/PyYAML.egg-info/dependency_links.txt
writing top-level names to lib3/PyYAML.egg-info/top_level.txt
Traceback (most recent call last):
File "/venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
return hook(config_settings)
^^^^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 341, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=['wheel'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 323, in _get_build_requires
self.run_setup()
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 338, in run_setup
exec(code, locals())
File "<string>", line 271, in <module>
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/command/egg_info.py", line 314, in run
self.find_sources()
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/command/egg_info.py", line 322, in find_sources
mm.run()
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/command/egg_info.py", line 551, in run
self.add_defaults()
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/command/egg_info.py", line 589, in add_defaults
sdist.add_defaults(self)
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/command/sdist.py", line 104, in add_defaults
super().add_defaults()
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/sdist.py", line 251, in add_defaults
self._add_defaults_ext()
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/sdist.py", line 336, in _add_defaults_ext
self.filelist.extend(build_ext.get_source_files())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<string>", line 201, in get_source_files
File "/tmp/pip-build-env-6ud0jyad/overlay/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 107, in __getattr__
raise AttributeError(attr)
AttributeError: cython_sources
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
(venv) root@f6f2606748d9:/#
While this is fundamentally a problem with PyYAML and (I think?) setuptools/Cython, it is only being noticed because there is an upper bound on PyYAML that was introduced in ca4fef4 in PR #308.
Is this upper bound still needed? If I remove the upper bound on my fork
diff --git a/setup.py b/setup.py
index eb8ba53..e29832f 100755
--- a/setup.py
+++ b/setup.py
@@ -65,7 +65,7 @@ install_requires = [
"jsonschema[format]>=3.0.1,<4.0.0",
"kombu>=4.6",
"mock>=3.0,<4",
- "PyYAML>=5.1,<6.0",
+ "PyYAML>=5.1",
"Werkzeug>=0.14.1",
"wcmatch>=8.3,<8.5",
]
the CI runs without error.
cc @mvidalgarcia (the author of PR #308).
Recursive globbing with the ** wildcard is currently not supported in workspace.glob. To support it, Path.match can be replaced by the wcmatch library.
Currently, we have snakemake>=6.5.3,<6.6.0 installed (setup.py, line 29 in 3ca6146), while the latest version at the current date is 6.8.0.
TODO:
- r-d-cms-h4l
- Release a reana-commons version with reana-dev git-create-release-commit -c .
- Run reana-dev git-upgrade-shared-modules -c . --use-latest-known-tag and pip-compile in r-w-e-snakemake to update requirements.txt.
In the Condor-Slurm-Demo sprint we will implement the first prototype to submit jobs to two new job backends (HTCondor and Slurm) using Serial workflows. In order to give users the possibility to choose which job backend to run their jobs on, we should extend the serial specification syntax.
To illustrate, we can take the reana-demo-root6-roofit reana.yaml and modify it:
version: 0.5.0
inputs:
files:
- code/gendata.C
- code/fitdata.C
parameters:
events: 20000
data: results/data.root
plot: results/plot.png
workflow:
type: serial
specification:
steps:
+ - name: mkdir-results
+ description: Create results directory.
+ environment: 'reanahub/reana-env-root6'
+ backend:
+ name: Kubernetes
commands:
- mkdir -p results
+ outputs:
+ - results/
+ - name: gendata
+ description: Generate data depending on the number of `events`.
+ environment: 'reanahub/reana-env-root6'
+ backend:
+ name: HTCondor
commands:
- root -b -q 'code/gendata.C(${events},"${data}")' | tee gendata.log
+ inputs:
+ - code/gendata.C
+ outputs:
+ - gendata.log
+ - results/data.root
+ - name: fitdata
+ description: Fit data and generate final plot.
+ environment: 'reanahub/reana-env-root6'
+ backend:
+ name: Slurm
commands:
- root -b -q 'code/fitdata.C("${data}","${plot}")' | tee fitdata.log
+ inputs:
+ - code/fitdata.C
+ outputs:
+ - fitdata.log
+ - results/plot.png
outputs:
files:
- results/plot.png
Note that workflow.specification.steps[].name and workflow.specification.steps[].description are optional, but workflow.specification.steps[].inputs and workflow.specification.steps[].outputs would be really helpful, since they would make pushing/pulling inputs/outputs to/from the job workspaces easier (reanahub/reana-job-controller#143, reanahub/reana-job-controller#142).
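A minimal sketch of how an engine could read the proposed per-step backend field, falling back to Kubernetes when no backend is declared. The SUPPORTED_BACKENDS constant and the helper name are hypothetical, not actual reana-commons API.

```python
# Hypothetical helper: pick the job backend declared in a serial step.
SUPPORTED_BACKENDS = {"Kubernetes", "HTCondor", "Slurm"}  # illustrative set


def get_step_backend(step, default="Kubernetes"):
    """Return the backend name of a step dict, falling back to the default."""
    backend = step.get("backend", {}).get("name", default)
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError("Unknown job backend: {}".format(backend))
    return backend


get_step_backend({"backend": {"name": "HTCondor"}})  # "HTCondor"
get_step_backend({"name": "mkdir-results"})          # falls back to "Kubernetes"
```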
Introduce an initial package structure similar to reana-client.
The latest reana-commons leads to a kombu/celery incompatibility issue which makes e.g. r-j-controller fail.
How to reproduce:
$ mkvirtualenv rjc
$ pip install --no-cache-dir .
...
Collecting celery<4.3,>=4.1.0 (from reana-commons==0.5.0.dev20190321)
...
Collecting kombu<5.0,>=4.2.0 (from reana-commons==0.5.0.dev20190321)
...
celery 4.2.2 has requirement kombu<4.4,>=4.2.0, but you'll have kombu 4.4.0 which is incompatible.
$ pip freeze | grep -E '(kombu|celery)'
celery==4.2.2
kombu==4.4.0
Note the warning phrase about celery/kombu version incompatibility. We are using:
$ git grep -E '(celery|kombu)' setup.py
setup.py: 'celery>=4.1.0,<4.3',
setup.py: 'kombu>=4.2.0,<5.0',
Apparently pip does not resolve appropriate versions well.
Let us amend the version requirements to get rid of this problem, which will also help other REANA components.
It would be also good to add a working test for kombu/celery to avoid this happening in the future.
(P.S. We could even consider a sort of "integration test" capturing the pip install output and looking for the phrase "which is incompatible". This might help in catching other problems of this kind in a general fashion, here and in the other REANA components.)
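The "integration test" idea above could be sketched as follows: run pip in a subprocess and scan its combined output for the resolver warning. Function names are hypothetical; the warning phrase is taken verbatim from the pip output quoted in this issue.

```python
import subprocess
import sys


def has_incompatibility(pip_output):
    """Return True if pip reported a dependency conflict in its output."""
    return "which is incompatible" in pip_output


def install_resolves_cleanly(package="."):
    """Hypothetical smoke test: pip-install a package and check for conflicts."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "install", "--no-cache-dir", package],
        capture_output=True,
        text=True,
    )
    return not has_incompatibility(result.stdout + result.stderr)
```

Splitting the string check out of the subprocess call keeps the warning detection unit-testable without actually installing anything.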
Currently, JobStatus (enumeration of possible job statuses) and RunStatus (enumeration of possible workflow statuses) reside in the reana-db repo. But the values that represent job and workflow statuses are used by the workflow engines and, as of now, are duplicated in each of them. The engines do not depend on reana-db, so they cannot import those enums.
What do you think about moving the JobStatus and RunStatus enums from reana-db to the reana-commons repo? How would it affect the future maintenance of the engines and of the reana-db and reana-commons repositories?
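For illustration, the shared enums could look roughly like this once moved to reana-commons. The member lists below are deliberately abbreviated; the real enums in reana-db define more states, and the point is only that a single importable definition replaces per-engine duplicates.

```python
from enum import Enum


class RunStatus(Enum):
    """Sketch of a shared workflow-status enum (members illustrative)."""
    created = 0
    running = 1
    finished = 2
    failed = 3


class JobStatus(Enum):
    """Sketch of a shared job-status enum (members illustrative)."""
    created = 0
    running = 1
    finished = 2
    failed = 3
```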
Originated in this PR
This goes together with the removal of the organization from reana-server
and reana-workflow-controller
REST API.
Right now, the BaseConsumer class is prepared to handle only one action. It is possible that in the near future we will have to consume from multiple queues and therefore perform different actions accordingly.
There are at least two options, that I can see now, to implement this:
1. Follow how Celery uses kombu.ConsumerMixin, which can be seen here. Basically, the ConsumerMixin class takes a mapping between the different types of messages and the different actions, so depending on the message type a concrete function is called.
2. Use the ConsumerMixin as a consumer runner which contains different ConcreteConsumers for different use cases like JobStatus, WorkflowLogs or FilePullStatus. There might be a limitation here, since by default ConsumerMixin uses the kombu.Consumer class (see here), so we might need to reimplement it.
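A bare-bones sketch of the first option, without the kombu machinery: one consumer holding a mapping from message type to handler, dispatching in on_message. Class and handler names are hypothetical, not the actual BaseConsumer API.

```python
class MultiActionConsumer:
    """Sketch: dispatch incoming messages to handlers keyed by message type."""

    def __init__(self):
        self._handlers = {}

    def register(self, message_type, handler):
        """Map a message type (e.g. "job-status") to a callable."""
        self._handlers[message_type] = handler

    def on_message(self, body, message=None):
        """Look up the handler for the body's type and invoke it."""
        handler = self._handlers.get(body.get("type"))
        if handler is None:
            raise KeyError("No handler for message type: {}".format(body.get("type")))
        return handler(body)
```

In a kombu-based implementation, on_message would be wired up as the consumer callback and would also acknowledge the message once the handler succeeds.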
Stemmed from #6. We should centralize the command to generate specs which, right now, is replicated in different components, i.e. here.
Depends on reanahub/reana-job-controller#106, since the decision taken influences this initialisation. For instance, if we decide to use marshmallow, we should be able to dynamically provide marshmallow models to the OpenAPI generator, i.e.:
def build_openapi_spec(publish, marshmallow_schemas):
    ...
    for name, schema in marshmallow_schemas:
        spec.definition(name, schema=schema)