
btx's Introduction

The documentation stored in this repository is built and hosted by Read the Docs.

Contributing

  1. Find or create an open issue you'd like to work on.

  2. Fork this repository and clone a copy locally (git clone ssh://git@github.com/<user>/lcls-users.git). Set up lcls-users as the upstream: git remote add upstream https://github.com/lcls-users/lcls-users.git.

  3. Create a new feature branch for your work:

    • Option 1: In the issue's right sidebar under "Development", click Create a branch. If the issue already has a linked branch or pull request, select it and click Create a branch. A drop-down menu should let you create the branch on your own fork of the repository.

    • Option 2: Manually git checkout -b <new-branch-name> and be sure to reference the issue number when writing the pull request (e.g. closes #NN). Note that you can create a pull request right away and mark it as a draft using the right sidebar under "Reviewers".

  4. Run the steps below to preview locally and make edits. Then commit your changes (git commit).

  5. Push your changes (git push -u origin <new-branch-name>) and go to your lcls-users repo on GitHub to create or update your pull request.

    • Depending on tests, review comments, etc., you may want to keep committing changes to your feature branch and pushing them. Changes you push will be reflected in the pull request and visible to your reviewers.

      Once the pull request is merged, however, these updates no longer have any effect.

  6. Sync your main branch from upstream. This fetches a local copy of all the changes that have been merged to main. It's generally useful any time you want to start work on a new feature branch.

    git checkout main
    git pull upstream main

Previewing The Site Locally

MkDocs is the static site generator used by Read the Docs to build the website. Individual pages are Markdown files, and the website configuration is managed with a YAML file.

MkDocs can be used locally to build and view the website. It can be installed using the pip or conda package managers.

$ pip3 install mkdocs # for pip
$ conda install -c conda-forge mkdocs # for conda

MkDocs also includes a development server that lets you view the website in real time as you work on it. Run the following command from the directory containing the mkdocs.yml configuration file:

$ mkdocs serve

By default the website is served on port 8000 and can be accessed from your browser at localhost:8000 (http://127.0.0.1:8000).

A note on relative links

Due to how the site is built and documents are linked, only true relative links can be used when linking between internal pages. That is, to link to another page on the website, use a relative path to the source file, including its .md extension, rather than the intended output URL as you may be accustomed to. For example, if you are working on a page that resides in the before directory and would like to link to another page, other_page.md, you can do so like:

  • [other page link](other_page.md), if the other page is also in the before directory.
  • Or, [other page link](../during/other_page.md) if it is in another directory, such as during.

This allows the links to resolve properly. Links of the form [other page link](/before/other_page/) will NOT work when the website is published.

btx's People

Contributors: calhep, fredericpoitevin, gadorlhiac, pegiaco, russell-marasigan, sbotha89, thomasbnon, yilaili

btx's Issues

Automatic monitoring node assignment to OM config through Airflow

From @valmar:
We can work on these files, because they currently contain "default" nodes. For example, run_om.sh for cxi has the following nodes: daq-mfx-mon02,daq-mfx-mon03,daq-mfx-mon04,daq-mfx-mon05

These are usually OK! I can make an empty template and we could fill them in with the correct nodes. Or we could just use these if they are usually OK!

The main problem I see is this: in order to get the available nodes, we must run wherepsana but we must run it on the DAQ machine (say cxi-daq), which might be a problem

One thing we could do is have this information "somewhere" on the network where AirFlow can get it, like the queue as we discussed yesterday

Originally posted by @fredericpoitevin in #42 (comment)

Update geom header when fetching from mrxv

This hack was used:

In [1]: from btx.misc.metrology import modify_crystfel_header
In [2]: modify_crystfel_header("r0000.geom", "r0001.geom")

Essentially, we always want the header to be CrystFEL-compatible, with the lines that CrystFEL expects uncommented. This is currently not done when just fetching from mrxv; it only happens in the behenate task, which won't always be run.
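
For context, here is a minimal sketch of the kind of transformation this hack performs, assuming the fetched geometry file carries the CrystFEL header lines commented out with a leading ";". The key set below is purely illustrative; the real logic lives in btx.misc.metrology.modify_crystfel_header.

# Illustrative sketch only: uncomment header keys that CrystFEL expects.
EXPECTED_KEYS = ("clen", "photon_energy", "adu_per_eV", "res")  # assumed key set

def uncomment_crystfel_header(in_geom, out_geom, keys=EXPECTED_KEYS):
    with open(in_geom) as fin, open(out_geom, "w") as fout:
        for line in fin:
            stripped = line.lstrip(";").lstrip()
            # Uncomment the line if it defines one of the expected header keys.
            if line.startswith(";") and stripped.split("=")[0].strip() in keys:
                fout.write(stripped)
            else:
                fout.write(line)

uncomment_crystfel_header("r0000.geom", "r0001.geom")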

Automatic DAG definition from YAML file.

That sounds good! As an alternative, I wasn't sure if we were going to have another entry in the yaml that defined which of the tasks would be run and in which order.

@apeck12 that's a nice feature to have in the future! This would require something that interprets the YAML and writes the DAG definition file, something like:

write_dag_from_yaml(config, dag_file)

where config would be https://github.com/lcls-users/btx/blob/main/tutorial/det_distance.yaml and dag_file would be https://github.com/lcls-users/btx/blob/main/dags/det_distance.py

Definitely something we should implement!

Originally posted by @fredericpoitevin in #34 (comment)
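
A minimal sketch of what such a translator could look like, assuming the YAML gains a hypothetical top-level dag entry listing task names in execution order and that each task maps onto a JIDSlurmOperator. The import path, template, and config keys below are assumptions for illustration, not the actual btx implementation.

import yaml

DAG_TEMPLATE = '''from datetime import datetime
from airflow import DAG
from plugins.jid import JIDSlurmOperator  # assumed import path

dag = DAG("{name}", start_date=datetime(2022, 1, 1), schedule_interval=None)

{tasks}

{chain}
'''

def write_dag_from_yaml(config_file, dag_file):
    """Translate a hypothetical 'dag: [task1, task2, ...]' entry into a DAG definition file."""
    with open(config_file) as f:
        config = yaml.safe_load(f)
    task_names = config["dag"]  # assumed entry listing the tasks in order
    tasks = "\n".join(f'{t} = JIDSlurmOperator(task_id="{t}", dag=dag)' for t in task_names)
    chain = " >> ".join(task_names)  # linear ordering, exactly as written in the YAML
    with open(dag_file, "w") as f:
        f.write(DAG_TEMPLATE.format(name=config.get("name", "btx_dag"), tasks=tasks, chain=chain))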

`run_diagnostics` DAG

We list the tasks that define each DAG. We consider a task completed when it has been run successfully by hand on the FFB nodes, and a DAG completed when it has been run successfully through the eLog.

  • image_stats
  • display

NERSC remote analysis

Write a short note on how to log in to NERSC (using sshproxy so the connection lasts 24 h) and how to use NoMachine.

SFX Milestone - Plan

Experiment Timeline

DAGs

We list below the DAGs in order of appearance during the experiment:

  • Before the experiment:
  • After collecting a run:
  • After collecting several runs for a given sample:
  • After the experiment:

Directory structure

Processing

Processing is performed in the scratch/ folder, with one sub-directory per "task".

/cds/data/psdm/${instrument}/${experiment}/scratch/
|
|____ mask/
|     |____ r${RUN}.npy
|
|____ geom/
|     |____ r${RUN}.geom
|     |____ figs/
|           |____ r${RUN}.png
|
|____ powder/
|     |____ r${RUN}_{avg/max/std}.npy
|     |____ figs/
|           |____ powder_r${RUN}.png
|           |____ stats_r${RUN}.png
|
|____ index/
      |____ r${RUN}_${TAG}.stream
      |____ figs/
      |     |____ cell_${TAG}.png
      |     |____ peakogram_${TAG}.png
      |
      |____ r${RUN}/
            |____ *.cxi (+.lst)
            |____ peakfinding.summary

Result display

Results are organized in the summary/ folder, with one sub-directory per "DAG".

/cds/data/psdm/${instrument}/${experiment}/stats/summary/
|
|____ update_metrology/
|     |____ r${RUN}/
|           |____ report.html
|
|____ summarize_sample/
      |____ ${TAG}/
            |____ report.html

Check for gain artifacts

Generate a task that checks for gain artifacts. Sabine's suggestion: examine the radial profiles of ~3000 random patterns from a run and look for discontinuities in the individual image radial profiles. It may also be useful to generate an average radial profile from the misses, and to flag images that don't match the expected radial profile.
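
A rough sketch of the kind of check this task could perform, assuming assembled 2D images and a simple relative-deviation criterion; the threshold and function names are illustrative only.

import numpy as np

def radial_profile(image, center):
    """Azimuthally averaged intensity as a function of integer radius (pixels)."""
    y, x = np.indices(image.shape)
    r = np.hypot(x - center[0], y - center[1]).astype(int)
    tbin = np.bincount(r.ravel(), weights=image.ravel())
    nr = np.bincount(r.ravel())
    return tbin / np.maximum(nr, 1)

def flag_gain_artifacts(images, center, rel_tol=0.2):
    """Flag images whose radial profile deviates from the run average by more than rel_tol."""
    profiles = np.array([radial_profile(img, center) for img in images])
    mean_profile = profiles.mean(axis=0)
    dev = np.abs(profiles - mean_profile) / np.maximum(mean_profile, 1e-6)
    return np.where(dev.max(axis=1) > rel_tol)[0]  # indices of suspicious images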

`update_mask` DAG

We list the tasks that define each DAG. We consider a task completed when it has been run successfully by hand on the FFB nodes, and a DAG completed when it has been run successfully through the eLog.

  • build_mask: #72
$ . elog_submit.sh -c ../tutorial/mfxlv4920.yaml -t build_mask -n 1
build_mask:
  thresholds: -10 2000
  n_images: 10
  n_edge: 2
  combine: True

This pulls a collection of random images from the run and masks pixels whose intensities fall outside the user-defined threshold values. The mask can be combined with a previous mask, and a set width of border pixels can be masked as well, depending on the task arguments in the config (a rough sketch follows this list).

  • report

  • display
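
As referenced above, here is a minimal sketch of the thresholding logic behind build_mask, assuming a stack of assembled 2D images and boolean masks where True means "keep". The parameter names mirror the config entries, but this is not the actual btx implementation.

import numpy as np

def build_mask(images, thresholds=(-10, 2000), n_edge=2, previous_mask=None):
    """Mask pixels whose intensity falls outside `thresholds` in any sampled image."""
    lo, hi = thresholds
    mask = np.all((images >= lo) & (images <= hi), axis=0)  # True = good pixel
    if n_edge > 0:
        # Mask a border of n_edge pixels along each edge.
        mask[:n_edge, :] = False
        mask[-n_edge:, :] = False
        mask[:, :n_edge] = False
        mask[:, -n_edge:] = False
    if previous_mask is not None:
        mask &= previous_mask  # combine: a pixel must be good in both masks
    return mask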

Refine geometry from AgBehenate run

At the moment geom_opt finds the detector distance.

New features to add:

  • find center
  • refine individual panel geometries (requires adding some logic for area detector geometry refinement)

update Wiki

Transfer from the apeck12/sfx_utils repository.

Write and document `setup_om` scripts

We need a DAG that spans the following tasks (a rough sketch of the first one follows the list):

  1. fetch_mask, which takes as input the detector (e.g. jungfrau4M or epix10k2M), experiment, format (crystfel, cctbx, psana), and savename. If the experiment is latest, then the latest mask from the given detector will be retrieved. After retrieving the correct mask, it will be formatted if needed and saved.
  2. fetch_geom, which does the same as above, except retrieving / converting / saving the most recent geom file.
  3. deploy_om, which deploys OM.
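
As noted above, a rough sketch of what the fetch_mask task could look like under these requirements. The mrxv layout, root path, and file naming below are placeholders, not the real mrxv structure, and format conversion is left out.

import glob
import os
import numpy as np

MRXV_ROOT = "/path/to/mrxv/masks"  # placeholder path, not the real mrxv location

def fetch_mask(detector, experiment="latest", fmt="psana", savename="mask.npy"):
    """Retrieve a mask for `detector` (the most recent one if experiment == "latest") and save it."""
    if experiment == "latest":
        candidates = sorted(glob.glob(os.path.join(MRXV_ROOT, detector, "*_mask.npy")))
        path = candidates[-1]  # latest entry by name stands in for "latest"
    else:
        path = os.path.join(MRXV_ROOT, detector, f"{experiment}_mask.npy")
    mask = np.load(path)
    np.save(savename, mask)  # psana-style .npy output; crystfel/cctbx conversion would go here
    return mask

fetch_geom would follow the same pattern for .geom files, and deploy_om would consume both outputs.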

Test cases/tutorials

We should add small tutorials with test files for people to try things out on. This would be especially useful for students coming to LCLS for the first time.

I was also thinking a general description of file formats would be helpful, e.g. a lot of people don't seem to know where to find specific information, or how to check whether detector offsets have been applied correctly in CrystFEL. Maybe a troubleshooting wiki would be useful too, e.g. "my data isn't indexing: what are the common causes at LCLS, and how do I check that I'm doing what I think I'm doing?"

`summarize_sample` DAG

We list the tasks that define each DAG. We consider a task completed when it has been run successfully by hand on the FFB nodes, and a DAG completed when it has been run successfully through the eLog.

  • stream_analysis
$ . elog_submit.sh -c ../tutorial/mfxlv4920.yaml -t stream_analysis -n 8
stream_analysis:
  tag: 'test'

can be used to produce the peakogram and cell distribution plots, which will be saved in: ${config.root_dir}/index/figs

  • wilson_plot

  • cell_volume: #60

  • clean_up

  • report

  • display

Cell volume as a function of time

Add a diagnostic to plot the cell volume as a function of time / run number. It seems this may be an issue at MFX but not CXI: the apparent cell volume gradually fluctuates due to drift in the detector distance or beam energy.
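
A small sketch of what this diagnostic could compute and plot, using the general triclinic cell-volume formula V = abc * sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma) + 2*cos(alpha)*cos(beta)*cos(gamma)). Data access is mocked up here; in practice the cell parameters would come from the indexed stream files.

import numpy as np
import matplotlib.pyplot as plt

def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume from lengths (Angstrom) and angles (degrees), triclinic formula."""
    ca, cb, cg = np.cos(np.radians([alpha, beta, gamma]))
    return a * b * c * np.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

def plot_cell_volume_vs_run(runs, cells, savename="cell_volume_vs_run.png"):
    """cells is a list of (a, b, c, alpha, beta, gamma) tuples, one per run."""
    volumes = [cell_volume(*cell) for cell in cells]
    plt.plot(runs, volumes, "o-")
    plt.xlabel("Run number")
    plt.ylabel("Mean cell volume (A^3)")
    plt.savefig(savename, dpi=150)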

sourcing the correct python in elog_submit.sh

For some reason, we've had to specify the full python / mpirun paths in elog_submit.sh since the line source /reg/g/psdm/etc/psconda.sh -py3 supplied to sbatch doesn't seem to be properly recognized. With tpprwr we may be able to avoid this altogether, but we should address this issue in the meantime regardless.

AutoSFX legacy scripts

Figure out how to kill DAG jobs through eLog

Also: if a SLURM job is killed outside of Airflow, Airflow does not seem to report it as "CANCELLED" but as "DONE". It would be nice to see the job marked "CANCELLED" in Airflow, or at least some indication that it failed.

Trigger a DAG run as `user1` from a DAG run triggered by `user2`

The use case would be:

cxiopr triggers a DAG run that creates and populates a directory in its home (which needs to execute on cxi-daq), copying files from some accessible location in the file system (i.e. not from the web).

However, the location it copies from is a git repository that should be updated with the GitHub remote before anything happens. This should be another DAG executed by another user from the ps-data group, on a node with web access.

Stream interface IndexError

The following worked:

st = StreamInterface("/reg/d/psdm/mfx/mfxlz0420/scratch/tmalla/optimize/lyzozyme/stream_files/lyzo-v1.stream") 
st.plot_peakogram()

But this one did not:

st = StreamInterface("/reg/d/psdm/mfx/mfxlz0420/scratch/tmalla/optimize/clen-p2/stream_files/Geo-104.stream")
st.plot_peakogram()
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_29822/1044716678.py in <module>
----> 1 st = StreamInterface("/reg/d/psdm/mfx/mfxlz0420/scratch/tmalla/optimize/clen-p2/stream_files/Geo-104.stream")
      2 st.plot_peakogram()

~/sfx_utils/sfx_utils/stream_interface.py in __init__(self, input_file, cell_only)
      9         self.stream_file = input_file
     10         self.cell_only = cell_only
---> 11         self.stream_data = self.read_stream(input_file)
     12 
     13     def read_stream(self, input_file):

~/sfx_utils/sfx_utils/stream_interface.py in read_stream(self, input_file)
     62         stream_data = np.array(stream_data)
     63         if not self.cell_only:
---> 64             stream_data[:,-1] = xtal_utils.compute_resolution(stream_data[:,2:8], stream_data[:,8:11])
     65 
     66         return stream_data

IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
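
From the traceback, stream_data appears to come back 1-dimensional, which would happen if, for example, no reflections were parsed from that stream file. A self-contained reproduction of the failure mode (not the actual sfx_utils code):

import numpy as np

# An empty per-reflection list becomes a 1-D array, so 2-D column slicing raises IndexError.
stream_data = np.array([])   # stands in for what read_stream builds from an "empty" stream
try:
    stream_data[:, -1] = 0.0
except IndexError as err:
    print(err)               # "too many indices for array: array is 1-dimensional, ..."

A guard on stream_data.ndim (or a clearer error message) before the column slicing in read_stream would make this easier to diagnose.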

`setup_metrology` DAG

We list the tasks that define each DAG. We consider a task completed when it has been run successfully by hand on the FFB nodes, and a DAG completed when it has been run successfully through the eLog.

  • fetch_mask: #71
$ . elog_submit.sh -c ../tutorial/mfxlv4920.yaml -t fetch_mask -n 1
fetch_mask:
  dataset: '/entry_1/data_1/mask'

Creates ${config.root_dir}/mask/r000.npy

  • fetch_geom: #71
$ . elog_submit.sh -c ../tutorial/mfxlv4920.yaml -t fetch_geom -n 1
fetch_geom:

Creates ${config.root_dir}/geom/r000.geom

  • report

ARP Report to eLog function: not working yet

I can't seem to report back to the eLog.

This is what we are using:

btx/scripts/tasks.py

Lines 11 to 16 in b547499

# Fetch the URL to post progress update
update_url = os.environ.get('JID_UPDATE_COUNTERS')

def test(config):
    print(config)
    requests.post(update_url, json=[config])

No report is made (screenshot omitted).

Everything else seems to work:

  • Airflow log:
*** Log file does not exist: /opt/airflow/logs/test/test/2022-04-14T00:29:46.023176+00:00/1.log
*** Fetching from: http://airflow-worker-0.airflow-worker.airflow.svc.cluster.local:8793/log/test/test/2022-04-14T00:29:46.023176+00:00/1.log

[2022-04-14, 00:29:47 UTC] {taskinstance.py:1037} INFO - Dependencies all met for <TaskInstance: test.test 4646cb75-ce29-4375-9e8f-8c81f9bdbbb1 [queued]>
[2022-04-14, 00:29:47 UTC] {taskinstance.py:1037} INFO - Dependencies all met for <TaskInstance: test.test 4646cb75-ce29-4375-9e8f-8c81f9bdbbb1 [queued]>
[2022-04-14, 00:29:47 UTC] {taskinstance.py:1243} INFO - 
--------------------------------------------------------------------------------
[2022-04-14, 00:29:47 UTC] {taskinstance.py:1244} INFO - Starting attempt 1 of 1
[2022-04-14, 00:29:47 UTC] {taskinstance.py:1245} INFO - 
--------------------------------------------------------------------------------
[2022-04-14, 00:29:47 UTC] {taskinstance.py:1264} INFO - Executing <Task(JIDSlurmOperator): test> on 2022-04-14 00:29:46.023176+00:00
[2022-04-14, 00:29:47 UTC] {standard_task_runner.py:52} INFO - Started process 2096 to run task
[2022-04-14, 00:29:47 UTC] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'test', 'test', '4646cb75-ce29-4375-9e8f-8c81f9bdbbb1', '--job-id', '1033', '--raw', '--subdir', 'DAGS_FOLDER/test.py', '--cfg-path', '/tmp/tmphqx2f85d', '--error-file', '/tmp/tmpwqwgfspf']
[2022-04-14, 00:29:47 UTC] {standard_task_runner.py:77} INFO - Job 1033: Subtask test
[2022-04-14, 00:29:47 UTC] {logging_mixin.py:109} INFO - Running <TaskInstance: test.test 4646cb75-ce29-4375-9e8f-8c81f9bdbbb1 [running]> on host airflow-worker-0.airflow-worker.airflow.svc.cluster.local
[2022-04-14, 00:29:47 UTC] {taskinstance.py:1431} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=test
AIRFLOW_CTX_TASK_ID=test
AIRFLOW_CTX_EXECUTION_DATE=2022-04-14T00:29:46.023176+00:00
AIRFLOW_CTX_DAG_RUN_ID=4646cb75-ce29-4375-9e8f-8c81f9bdbbb1
[2022-04-14, 00:29:47 UTC] {jid.py:156} INFO - Attempting to run at SLAC...
[2022-04-14, 00:29:47 UTC] {jid.py:159} INFO - Queueing slurm job...
[2022-04-14, 00:29:47 UTC] {jid.py:161} INFO - {'_id': '230295b1-8c39-4a69-b9f0-27954f2f437e', 'experiment': 'mfxp19619', 'run_num': '592022-04-14T00:29:45.893539', 'user': 'fpoitevi', 'status': '', 'tool_id': '', 'def_id': '46e88852-b70e-45f0-acb3-4f318f1a46eb', 'def': {'_id': 'ec7ed17a-aa86-4e40-911c-93085cd6b522', 'name': 'test', 'executable': '/cds/sw/package/autosfx/btx/scripts/elog_submit.sh', 'trigger': 'MANUAL', 'location': 'SLAC', 'parameters': '--config_file /cds/sw/package/autosfx/btx/tutorial/test.yaml --dag test --queue psanaq --ncores 1 --jid_update_counters http://psdm02.pcdsn:8442/jid_slac/jid/ws/replace_counters/62576af8b72ad5a60048f9e2 --task test', 'run_as_user': 'airflow'}}
[2022-04-14, 00:29:47 UTC] {jid.py:136} INFO - Calling http://psdm02:8446/jid_slac/jid/ws/mfxp19619/start_job with {'_id': '230295b1-8c39-4a69-b9f0-27954f2f437e', 'experiment': 'mfxp19619', 'run_num': '592022-04-14T00:29:45.893539', 'user': 'fpoitevi', 'status': '', 'tool_id': '', 'def_id': '46e88852-b70e-45f0-acb3-4f318f1a46eb', 'def': {'_id': 'ec7ed17a-aa86-4e40-911c-93085cd6b522', 'name': 'test', 'executable': '/cds/sw/package/autosfx/btx/scripts/elog_submit.sh', 'trigger': 'MANUAL', 'location': 'SLAC', 'parameters': '--config_file /cds/sw/package/autosfx/btx/tutorial/test.yaml --dag test --queue psanaq --ncores 1 --jid_update_counters http://psdm02.pcdsn:8442/jid_slac/jid/ws/replace_counters/62576af8b72ad5a60048f9e2 --task test', 'run_as_user': 'airflow'}}...
[2022-04-14, 00:29:47 UTC] {jid.py:138} INFO -  + 200: {"success":true,"value":{"_id":"230295b1-8c39-4a69-b9f0-27954f2f437e","def":{"_id":"ec7ed17a-aa86-4e40-911c-93085cd6b522","executable":"/cds/sw/package/autosfx/btx/scripts/elog_submit.sh","location":"SLAC","name":"test","parameters":"--config_file /cds/sw/package/autosfx/btx/tutorial/test.yaml --dag test --queue psanaq --ncores 1 --jid_update_counters http://psdm02.pcdsn:8442/jid_slac/jid/ws/replace_counters/62576af8b72ad5a60048f9e2 --task test","run_as_user":"airflow","trigger":"MANUAL"},"def_id":"46e88852-b70e-45f0-acb3-4f318f1a46eb","experiment":"mfxp19619","run_num":"592022-04-14T00:29:45.893539","status":"SUBMITTED","tool_id":360306,"user":"fpoitevi"}}

[2022-04-14, 00:29:47 UTC] {jid.py:163} INFO - jobid 360306 successfully submitted!
[2022-04-14, 00:29:57 UTC] {jid.py:168} INFO - Checking for job completion...
[2022-04-14, 00:29:57 UTC] {jid.py:136} INFO - Calling http://psdm02:8446/jid_slac/jid/ws/job_statuses with [{'_id': '230295b1-8c39-4a69-b9f0-27954f2f437e', 'def': {'_id': 'ec7ed17a-aa86-4e40-911c-93085cd6b522', 'executable': '/cds/sw/package/autosfx/btx/scripts/elog_submit.sh', 'location': 'SLAC', 'name': 'test', 'parameters': '--config_file /cds/sw/package/autosfx/btx/tutorial/test.yaml --dag test --queue psanaq --ncores 1 --jid_update_counters http://psdm02.pcdsn:8442/jid_slac/jid/ws/replace_counters/62576af8b72ad5a60048f9e2 --task test', 'run_as_user': 'airflow', 'trigger': 'MANUAL'}, 'def_id': '46e88852-b70e-45f0-acb3-4f318f1a46eb', 'experiment': 'mfxp19619', 'run_num': '592022-04-14T00:29:45.893539', 'status': 'SUBMITTED', 'tool_id': 360306, 'user': 'fpoitevi'}]...
[2022-04-14, 00:29:57 UTC] {jid.py:138} INFO -  + 200: {"success":true,"value":[{"_id":"230295b1-8c39-4a69-b9f0-27954f2f437e","counters":[{"setup":{"root_dir":"/cds/data/psdm/mfx/mfxp19619/scratch/test/"},"test":{"parameter":"value"}}],"def":{"_id":"ec7ed17a-aa86-4e40-911c-93085cd6b522","executable":"/cds/sw/package/autosfx/btx/scripts/elog_submit.sh","location":"SLAC","name":"test","parameters":"--config_file /cds/sw/package/autosfx/btx/tutorial/test.yaml --dag test --queue psanaq --ncores 1 --jid_update_counters http://psdm02.pcdsn:8442/jid_slac/jid/ws/replace_counters/62576af8b72ad5a60048f9e2 --task test","run_as_user":"airflow","trigger":"MANUAL"},"def_id":"46e88852-b70e-45f0-acb3-4f318f1a46eb","experiment":"mfxp19619","run_num":"592022-04-14T00:29:45.893539","status":"DONE","tool_id":360306,"user":"fpoitevi"}]}

[2022-04-14, 00:30:27 UTC] {jid.py:136} INFO - Calling http://psdm02:8446/jid_slac/jid/ws/mfxp19619/job_log_file with {'_id': '230295b1-8c39-4a69-b9f0-27954f2f437e', 'counters': [{'setup': {'root_dir': '/cds/data/psdm/mfx/mfxp19619/scratch/test/'}, 'test': {'parameter': 'value'}}], 'def': {'_id': 'ec7ed17a-aa86-4e40-911c-93085cd6b522', 'executable': '/cds/sw/package/autosfx/btx/scripts/elog_submit.sh', 'location': 'SLAC', 'name': 'test', 'parameters': '--config_file /cds/sw/package/autosfx/btx/tutorial/test.yaml --dag test --queue psanaq --ncores 1 --jid_update_counters http://psdm02.pcdsn:8442/jid_slac/jid/ws/replace_counters/62576af8b72ad5a60048f9e2 --task test', 'run_as_user': 'airflow', 'trigger': 'MANUAL'}, 'def_id': '46e88852-b70e-45f0-acb3-4f318f1a46eb', 'experiment': 'mfxp19619', 'run_num': '592022-04-14T00:29:45.893539', 'status': 'DONE', 'tool_id': 360306, 'user': 'fpoitevi'}...
[2022-04-14, 00:30:27 UTC] {jid.py:138} INFO -  + 200: {"success":true,"value":"/cds/sw/package/autosfx/btx/scripts/main.py -c /cds/sw/package/autosfx/btx/tutorial/test.yaml -t test\nDEBUG:urllib3.connectionpool:Starting new HTTP connection (1): psdm02.pcdsn:8442\nDEBUG:urllib3.connectionpool:http://psdm02.pcdsn:8442/ \"POST /jid_slac/jid/ws/replace_counters/230295b1-8c39-4a69-b9f0-27954f2f437e HTTP/1.1\" 200 103\nError: required argument 'global' is not configured.\nError: required argument 'global' is not configured.\n{'setup': {'root_dir': '/cds/data/psdm/mfx/mfxp19619/scratch/test/'}, 'test': {'parameter': 'value'}}\nTask successfully executed\n"}

[2022-04-14, 00:30:27 UTC] {taskinstance.py:1282} INFO - Marking task as SUCCESS. dag_id=test, task_id=test, execution_date=20220414T002946, start_date=20220414T002947, end_date=20220414T003027
[2022-04-14, 00:30:27 UTC] {local_task_job.py:154} INFO - Task exited with return code 0
[2022-04-14, 00:30:27 UTC] {local_task_job.py:264} INFO - 0 downstream tasks scheduled from follow-on schedule check
  • eLog log:
Actual command /cds/sw/package/autosfx/btx/scripts/elog_trigger.py -d test -q psanaq -n 1 -c /cds/sw/package/autosfx/btx/tutorial/test.yaml 
{
  "conf": {
    "Authorization": "Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1c2VyIjoiZnBvaXRldmkiLCJleHBlcmltZW50X25hbWUiOiJtZnhwMTk2MTkiLCJleHBpcmVzIjoxNjQ5OTI0OTg1LjQxNzY1Nn0.Hj_15vVfGLxsi7dZC3PJ2mHq7_JA49WipgCiBoAByS1D81H7VCV6ZufZQgcGOa4X8wOi6yvpdmvNyfvfuno1n0rMvAqUJdEuODC4FKRnNDRfXwCkuYq2XP-J7EsXh-Cx7VCbblpMQUMZCKUTn5vy4jblMG5k7ldlmW3o9NtoU5oQRpkWzU2V1V7kU-2n5HIAgWGwltDtjRG71b4oot4exOBWRq6cA8_nRKdFOOTvQtB1yM24OnwKQOg3qQDgYK5rDfp3lgtAWhyHbMQToqS6lHqP7K5sjmWrVaGd3ViPwURGRc-F1HNyqlcz9BEuEcf5dkuKOd8YIFdkOp0S5m5z2cu4sGWzlL0GkDki7seTVkJr8NWF_0D_z8M6dKV3MpskVEfue4rVgB72PgG3tA3QBVhvMxsEf3PsX8VAJqPwwaoEilxx8TQZbKt4Bw6x58UgEqJb5itKEo8fFITSBBSAesW0abFXF9OEGVu2N8qxWZerYocwu9Rej6KgoWsFp-FzP0LflHDnJ8pu6GIG_J9c5LOuojubFRLLqPeI26iyorysuIcKbotDubOtf_uQOsCq9JvfNskpAOZ7lGLsXxOBoxJVNRyVZ_lkowJxE9fpV4HhIlscO9zj58tvDeCd3ER0OHDpLDAWoFGvYUL-iGkG6aj_2Z16ZniXDIRIpDcd7Tg",
    "JID_UPDATE_COUNTERS": "http://psdm02.pcdsn:8442/jid_slac/jid/ws/replace_counters/62576af8b72ad5a60048f9e2",
    "experiment": "mfxp19619",
    "parameters": {
      "config_file": "/cds/sw/package/autosfx/btx/tutorial/test.yaml",
      "dag": "test",
      "jid_update_counters": "http://psdm02.pcdsn:8442/jid_slac/jid/ws/replace_counters/62576af8b72ad5a60048f9e2",
      "ncores": "1",
      "queue": "psanaq"
    },
    "run_id": "592022-04-14T00:29:45.893539",
    "user": "fpoitevi"
  },
  "dag_id": "test",
  "dag_run_id": "4646cb75-ce29-4375-9e8f-8c81f9bdbbb1",
  "end_date": null,
  "execution_date": "2022-04-14T00:29:46.023176+00:00",
  "external_trigger": true,
  "logical_date": "2022-04-14T00:29:46.023176+00:00",
  "start_date": null,
  "state": "queued"
}
  • SLURM log:
[fpoitevi@psanagpu107 scratch]$ cat slurm-360306.out 
/cds/sw/package/autosfx/btx/scripts/main.py -c /cds/sw/package/autosfx/btx/tutorial/test.yaml -t test
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): psdm02.pcdsn:8442
DEBUG:urllib3.connectionpool:http://psdm02.pcdsn:8442 "POST /jid_slac/jid/ws/replace_counters/230295b1-8c39-4a69-b9f0-27954f2f437e HTTP/1.1" 200 103
Error: required argument 'global' is not configured.
Error: required argument 'global' is not configured.
{'setup': {'root_dir': '/cds/data/psdm/mfx/mfxp19619/scratch/test/'}, 'test': {'parameter': 'value'}}
Task successfully executed

Write eLog script for automation

Figure out how to write a SLURM script that can be used to define a Workflow in the eLog, launching at the end of each run and displaying some result in the eLog Summary.

Discuss possibility to define a JIDPythonOperator

At the moment, we have a JIDSlurmOperator that is really a JIDBashOperator: a "control doc" (JSON dictionary) is given to the remote procedure call (RPC) server (the JID?), which executes the def["executable"] entry.

At the moment this entry needs to be a bash script. It would be great if it could be a Python script; previous attempts failed.

This could be useful to deploy Cheetah for example.

`process_run` DAG

We list the tasks that define each DAG. We consider a task completed when it has been run successfully by hand on the FFB nodes, and a DAG completed when it has been run successfully through the eLog.

  • run_analysis
$ . elog_submit.sh -c ../tutorial/mfxlv4920.yaml -t run_analysis -n 8
run_analysis:
  max_events: -1

Figures visualizing the powder pattern and run statistics will be saved to ${config.root_dir}/powder/fig/*_r{run:04}.png.

  • opt_distance
$ . elog_submit.sh -c ../tutorial/mfxlv4920.yaml -t opt_distance -n 1
opt_distance:
  center: 960 960

If the run is a silver behenate run, this will estimate the distance to the detector and generate a new CrystFEL-style geometry file with the coffset parameter updated accordingly. A figure of the fit can be found at: ${config.root_dir}/geom/figs/r${run:04}.png

  • find_peaks: #46
$ . elog_submit.sh -c ../tutorial/mfxlv4920.yaml -t find_peaks -n 8
find_peaks:
  tag: ''
  psana_mask: False
  min_peaks: 10
  max_peaks: 2048
  npix_min: 2
  npix_max: 30
  amax_thr: 300.
  atot_thr: 600.
  son_min: 10.0
  peak_rank: 3
  r0: 3.0
  dr: 2.0
  nsigm: 10.0
  • index
$ . elog_submit.sh -c ../tutorial/mfxlv4920.yaml -t index -n 8
index:
  tag: 'test'
  int_radius: '4,5,6'
  methods: 'mosflm'
  cell: '/cds/data/psdm/mfx/mfxlv4920/scratch/apeck/cco.cell'
  tolerance: '5,5,5,1.5'
  no_revalidate: True
  multi: True
  profile: True
  • display

Read HKL files and compare them

It would be nice to be able to read two HKL files (say light and dark) and show how their intensities compare across shared reflections.

[h1,k1,l1,i1] + [h2,k2,l2,i2] ---intersect----> [h,k,l,i1,i2]

and plot i2 versus i1 across resolution shells.
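
A small sketch of that intersection and plot, assuming each HKL file can be read as whitespace-separated columns h, k, l, I (the exact CrystFEL .hkl layout, headers, and sigmas are glossed over). Binning by resolution shell would additionally require the unit cell to compute d-spacings per reflection.

import numpy as np
import matplotlib.pyplot as plt

def load_hkl(path):
    """Assumed layout: whitespace-separated columns h, k, l, I with no header lines."""
    data = np.loadtxt(path)
    return {tuple(row[:3].astype(int)): row[3] for row in data}

def compare_hkl(path1, path2, savename="i2_vs_i1.png"):
    """Intersect two reflection lists on (h, k, l) and plot I2 versus I1."""
    ref1, ref2 = load_hkl(path1), load_hkl(path2)
    common = sorted(set(ref1) & set(ref2))   # reflections present in both files
    i1 = np.array([ref1[hkl] for hkl in common])
    i2 = np.array([ref2[hkl] for hkl in common])
    plt.loglog(i1, i2, ".", markersize=2)
    plt.xlabel("I1 (e.g. light)")
    plt.ylabel("I2 (e.g. dark)")
    plt.savefig(savename, dpi=150)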

List available detectors

More generally, it would be useful to have a method that quickly lists all the available PVs.
