
cookies-n-code's Introduction

Cookies-n-Code, CAS

A repository for all the current and past discussions and tutorials at Cookies 'n' Code at the Centre for Astrophysics and Supercomputing (CAS), Swinburne.

Ground Rules

  • Keep your comments constructive.
  • As much as possible, keep your comments positive. If you have to say “That’s not the right way to do that”, maybe add “but it’s a good/common starting point”.
  • Keep conversation to a minimum when someone is presenting/talking to the whole group.

Covered Topics

You can find all of the covered topics here.

Beginner Friendly Topics

Git

Python

The Python Visualisation Mega-Month

Interactive Python Plots

Profiling and Speeding Up Python Code

Python MCMC

Python MPI

Python Testing and Package Development

Talks at the Python in Astronomy Conference series

All the talks are stored and indexed on Zenodo

Misc. Python Tutorials

Shell

SQL

OzStar

IDEs

Misc. Topics

SSH Keys and Config

Code and Plot Brags

cookies-n-code's People

Contributors

abatten, annehutter, aparthas3112, caitlinadams, coljac, dberke, jacobaskew, leoniechevalier, lewisdez, macrocosme, manodeep, mshamohammadi, olivercoad, respiewak, sabinebellstedt, wfarah


cookies-n-code's Issues

SSH Keys Slides

I missed the presentation on SSH Keys last fortnight. If there were slides/notes could you upload them @abatten ? Realizing I have to set it up for OzStar... >.>

typedef struct in C

I was wondering what is the difference and implications between the following two cases?

typedef struct foo{
    int x;
} foo;

and

typedef struct {
    int x;
} foo;

Is the first one just needed for linked lists?

(@manodeep edited for syntax highlighting)

PBS being driven by a C script

I am looking to run 3 C scripts followed by a Python script iteratively over a number of snapshots. The C scripts are parallelized and the Python script is not (the Python script just handles setting up ini files and cleaning up for the next snapshot iteration).

Ideally I would like to have a master C script to handle running each of these other scripts via the system() command. This way I can explicitly handle the return values of each of my scripts.

However since these would all be deployed to largemem, I need to think about the correct way to request processors/memory from PBS. For example, should it look like

#!/bin/bash
#PBS -q largemem
#PBS -l nodes=2:ppn=32
#PBS -l walltime=01:00:00:00
#PBS -l pmem=1000gb

/Path/to/driver_script

And then inside the driver script have

// Loop over snapshots //
exit_code = system("mpirun -np 64 /path/to/first_script first_ini.ini");
// Handle exit code //

exit_code = system("mpirun -np 64 /path/to/second_script second_ini.ini");

// etc...

Will the driver_script be executed once and the 64 processors wait idly until they're called with the mpirun call? Do I need to deal with nasty synchronization (please no)?

I definitely will give this implementation a shot myself, but I thought I'd throw it open and see if anyone has had experience executing multiple (parallel) codes within the scheduling system.

Cheers!
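As a runnable sketch of the same driver logic in Python (an alternative to C's system(), not the asker's actual code): subprocess.run returns a CompletedProcess whose returncode can be checked explicitly after each step. The paths and mpirun flags below are hypothetical.

```python
import subprocess
import sys

def run_step(cmd):
    """Run one pipeline step; abort the whole job if it fails."""
    result = subprocess.run(cmd)
    if result.returncode != 0:
        sys.exit(f"step {' '.join(cmd)} failed with code {result.returncode}")
    return result.returncode

# The real snapshot loop would chain the mpirun calls, e.g.
#   run_step(["mpirun", "-np", "64", "/path/to/first_script", "first_ini.ini"])
#   run_step(["mpirun", "-np", "64", "/path/to/second_script", "second_ini.ini"])
# Here a trivial command stands in so the sketch runs anywhere:
code = run_step([sys.executable, "-c", "pass"])
```

Under SLURM/PBS, the driver itself runs serially on one core and each mpirun (or srun) launch grabs the allocated processors only for the duration of that step, so no manual synchronisation should be needed between steps.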

rpy2 installation through pip failing

I have yet another issue with a Python package installation! This one is about the package rpy2, which interfaces between R and Python. I have this installed just fine on my desktop, but I've been having difficulty installing it on my laptop for a long time. Because I'm still using Python 2, I need to install an older version of the package. When I do so (either via pip, or by downloading and installing it manually), I get the following error:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/7d/8zmh9zyj6d7_yc4sgd5y59_hxn3yk1/T/pip-build-sE9iOi/rpy2/setup.py", line 441, in <module>
        [os.path.join('doc', 'source', 'rpy2_logo.png')])],
      File "/Users/sbellstedt/anaconda2/lib/python2.7/distutils/core.py", line 151, in setup
        dist.run_commands()
      File "/Users/sbellstedt/anaconda2/lib/python2.7/distutils/dist.py", line 953, in run_commands
        self.run_command(cmd)
      File "/Users/sbellstedt/anaconda2/lib/python2.7/distutils/dist.py", line 972, in run_command
        cmd_obj.run()
      File "/Users/sbellstedt/anaconda2/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 280, in run
        self.find_sources()
      File "/Users/sbellstedt/anaconda2/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 295, in find_sources
        mm.run()
      File "/Users/sbellstedt/anaconda2/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 526, in run
        self.add_defaults()
      File "/Users/sbellstedt/anaconda2/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 562, in add_defaults
        sdist.add_defaults(self)
      File "/Users/sbellstedt/anaconda2/lib/python2.7/site-packages/setuptools/command/py36compat.py", line 36, in add_defaults
        self._add_defaults_ext()
      File "/Users/sbellstedt/anaconda2/lib/python2.7/site-packages/setuptools/command/py36compat.py", line 119, in _add_defaults_ext
        build_ext = self.get_finalized_command('build_ext')
      File "/Users/sbellstedt/anaconda2/lib/python2.7/distutils/cmd.py", line 312, in get_finalized_command
        cmd_obj.ensure_finalized()
      File "/Users/sbellstedt/anaconda2/lib/python2.7/distutils/cmd.py", line 109, in ensure_finalized
        self.finalize_options()
      File "/private/var/folders/7d/8zmh9zyj6d7_yc4sgd5y59_hxn3yk1/T/pip-build-sE9iOi/rpy2/setup.py", line 152, in finalize_options
        config += get_rconfig(r_home, about)
      File "/private/var/folders/7d/8zmh9zyj6d7_yc4sgd5y59_hxn3yk1/T/pip-build-sE9iOi/rpy2/setup.py", line 312, in get_rconfig
        rc = RConfig.from_string(rconfig, allow_empty = allow_empty)
      File "/private/var/folders/7d/8zmh9zyj6d7_yc4sgd5y59_hxn3yk1/T/pip-build-sE9iOi/rpy2/setup.py", line 272, in from_string
        + '\nin string\n' + string)
    ValueError: Invalid substring
    -fopenmp
    in string
    -fopenmp -L/usr/local/lib -F/Library/Frameworks/R.framework/.. -framework R -lpcre -llzma -lbz2 -lz -licucore -lm -liconv

I originally thought this could be some issue with the xcode installation, but after some checking I think that's all up-to-date. Any ideas??

C++ library installed, but cannot be found

I'm trying to install the python implementation of Profit (a bulge-disc decomposition code), called pyprofit. It can be installed via pip, but in order for this to work the C++ library libprofit needs to be installed.

I have gone through the process of installing libprofit (I thought successfully) via cmake / make, but when I go to install pyprofit I get an error:

error: No libprofit installation found on your system.

How do I point the computer to the installation it just did?

Bandwidth Caps for Git LFS

I recently started tracking a large file using Git LFS. It's only a single file that is ~128 MB in size.

The repository is connected to my tests so each time I want to build my code on Travis it has to access the large file. After working for a few hours, I got an email stating


Git LFS has been disabled on your personal account jacobseiler because you’ve exceeded your data plan by at least 150%. Please purchase additional data packs to cover your bandwidth and storage usage:

https://github.com/account/billing/data/upgrade

Current usage as of 20 Aug 2018 02:38AM UTC:

Bandwidth: 1.5 GB / 1 GB (150%)
Storage: 0.13 GB / 1 GB (13%)

This seems to say that I can only git clone the repo a certain number of times before I go over the bandwidth limit? I'm absolutely baffled by this...

Does anybody have any experience using Git LFS and care to offer some help?

Getting jupyter notebooks to work with conda envs

While trying to set up the tutorial this week, I thought I'd set up a conda environment with emcee and chainconsumer so that any user could easily run the notebook. Then I discovered that even when I run jupyter notebook from a conda environment, the notebook isn't actually using the environment. I checked this by running sys.executable which gave me:

'/Users/cadams/anaconda3/bin/python'

when it should have given me:

/Users/cadams/anaconda3/envs/chainconsumerenv/bin/python

I've tried various solutions as suggested in https://stackoverflow.com/questions/37085665/in-which-conda-environment-is-jupyter-executing but something almost always broke.

Does anyone have any experience solving this? Can we do this at cookies-n-code this week instead of a tutorial?

Caitlin

Running Jupyter notebook through OzSTAR

Hey everyone,

Trying to run jupyter notebook through OzSTAR, but when I copy the link I get 'The site cannot be reached'. Can someone point out what I've done wrong?


Copying data between two remote computers

I need to copy files from the Netherlands (ASTRON) to ICRAR, UWA. I have access to the ICRAR repository, and the ASTRON data is stored in an ftp server with anonymous access.

help with remote vscode

I had finally gotten my VSCode working remotely on ozstar where my loaded python environment was being detected (so I had the linter working, and potentially the debugger)

But I ran into the quota limit on home directory, so I deleted the .vscode directory (it was ~ 800MB!), based on the internet's advice that the directory would be re-created. The directory has indeed been re-created but now I do not have any python being detected :'(

@olivercoad Help?

Plotting for multiple snapshots for multiple simulations

This isn't really an issue, more of a "Could I have done it better?". I have multiple simulations, e.g.,

galaxies_model1 = '/lustre/projects/p004_swin/jseiler/september/galaxies/mysim_IRA_z5.000'
galaxies_model2 = '/lustre/projects/p004_swin/jseiler/late_september/galaxies/tiamat_IRA_z1.827'
and within each simulation I wish to read in data, perform some calculations (such as the Stellar Mass Function) and then plot at different snapshots (e.g. comparing snapshots [78, 64, 53] with snapshots [73, 61, 40] for each simulation respectively).

My question is then the best way to handle this in Python. I want it to be extendable, so I can plot an arbitrary number of models each with an arbitrary number of snapshots.

My current method (which probably makes Manodeep very disappointed in me) is to abuse nested loops. E.g.,

SMF = []
for model_number in range(number_models):
    # Set up an empty histogram for each snapshot of this model.
    SMF.append([])
    for snapshot_idx in range(len(SnapList[model_number])):
        SMF[model_number].append(np.zeros(NB_gal, dtype=np.int32))  # NB_gal is the number of mass bins.
    # Read in a data file that contains information for a small number of
    # galaxies across all snapshots, then bin the stellar masses.
    for snapshot_idx in range(len(SnapList[model_number])):
        (counts_local, bin_edges, bin_middle) = AllVars.Calculate_Histogram(mass_gal, bin_width, 0, m_gal_low, m_gal_high)
        SMF[model_number][snapshot_idx] += counts_local  # Update the stellar mass function.

This then produces an array that can be indexed via the model number and the snapshot, SMF[model_number][snapshot_idx], to produce the calculated data. Plotting is then the same process of looping over range(number_models) and SnapList[model_number]. This is useful as well because each simulation may require different normalizations which can be thrown in another nested array.

Any tips on how this complex plotting could be improved?

Cheers
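One commonly suggested restructuring for this kind of bookkeeping is a single dict keyed by (model, snapshot), which extends to any number of models and snapshots without adding nesting. This is only a sketch: snap_list, n_bins, the mass range, and the accumulate helper are invented stand-ins for SnapList and the Calculate_Histogram machinery above.

```python
import numpy as np

# Hypothetical stand-ins for SnapList and the binning setup in the question.
snap_list = {"model1": [78, 64, 53], "model2": [73, 61, 40]}
n_bins = 20
m_low, m_high = 8.0, 12.0  # log stellar-mass range (invented)

# One flat dict keyed by (model, snapshot) replaces the nested lists, so
# adding a model or a snapshot never changes the loop structure.
smf = {(model, snap): np.zeros(n_bins, dtype=np.int64)
       for model, snaps in snap_list.items() for snap in snaps}

def accumulate(model, snap, log_masses):
    """Bin one chunk of galaxy masses into the right histogram."""
    counts, _ = np.histogram(log_masses, bins=n_bins, range=(m_low, m_high))
    smf[(model, snap)] += counts

accumulate("model1", 78, np.array([9.0, 9.5, 11.0]))
```

Per-model extras such as normalisations can live in a parallel dict with the same keys, and plotting is one loop over smf.items().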

Least-squares fitting on a series of 2048x2048 images

I want to know the detector response for various levels of illumination (read: determine the pixel to pixel sensitivity or flat field of a detector). To solve this problem involves a simple analytical linear fit:

                        Y = a*X + b

where X is the level of 'illumination' (10x1),
Y is the series of images (10x2048x2048),
a is the detector linear response (2048x2048), and
b is the additive read noise (2048x2048).

I have written some code to solve this but it is painfully slow to run (and involves a double for loop GASP).

a=np.zeros((2048,2048))
b=np.zeros((2048,2048))
for i in range(2048):
        for j in range(2048):
            a[i,j],b[i,j] = np.polyfit( X,Y[:,i,j],1 )

Is there a smarter way of doing this? The problem could require iteration and hence solving it multiple times so a more efficient way would be preferable.
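One vectorised possibility: np.polyfit accepts a 2D y and fits every column in a single call, so the double loop can collapse into one reshape. The function name and the small test shapes below are illustrative, not from the original code.

```python
import numpy as np

def fit_detector_response(X, Y):
    """Fit Y = a*X + b for every pixel in one vectorised np.polyfit call.

    X: (n_levels,) illumination levels.
    Y: (n_levels, ny, nx) stack of images.
    Returns a, b, each of shape (ny, nx).
    """
    n_levels, ny, nx = Y.shape
    # np.polyfit fits all columns of a (n_levels, n_pixels) array at once,
    # returning coefficients of shape (deg + 1, n_pixels).
    coeffs = np.polyfit(X, Y.reshape(n_levels, -1), 1)
    a = coeffs[0].reshape(ny, nx)  # slope: per-pixel linear response
    b = coeffs[1].reshape(ny, nx)  # intercept: per-pixel read noise
    return a, b
```

For a 10x2048x2048 stack this replaces ~4 million tiny fits with one lapack call, which should also make the iterative re-solving far cheaper.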

Annotating points on the plot

Hi,
what solution is the best to annotate points on the plot (more than a few)?

np.loadtxt will not load/recognize floats (and I use it to load data);
Manually typing:

ax.annotate('cool galaxy', (Mass[0], SFR[0]))
...
ax.annotate('boring galaxy', (Mass[45], SFR[45]))

takes ages and ages (and makes my jupyter notebook ugly).

Suggestions?

(@manodeep edited for syntax highlighting)
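A loop over the arrays keeps the notebook tidy, since ax.annotate takes the xy position as a tuple. This sketch invents small stand-in arrays for Mass, SFR, and the label list.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

# Hypothetical stand-ins for Mass, SFR, and the galaxy labels.
mass = np.array([1e9, 5e9, 2e10])
sfr = np.array([0.5, 2.0, 10.0])
names = ["cool galaxy", "boring galaxy", "weird galaxy"]

fig, ax = plt.subplots()
ax.scatter(mass, sfr)
# One annotate call per point, offset a few points so labels don't sit on markers.
for name, x, y in zip(names, mass, sfr):
    ax.annotate(name, (x, y), textcoords="offset points", xytext=(3, 3))
```

The labels themselves can come from a separate np.loadtxt call with dtype=str (or np.genfromtxt), so nothing needs to be typed by hand.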

Problems with Using Fortran and Github Actions

So I am having problems testing my Fruitbat package using Github actions because it requires a Fortran compiler.

I am getting this error:

ImportError: cannot import name 'dmdsm' from partially initialized module 'pygedm' (most likely due to a circular import)

You can see the action here: https://github.com/abatten/fruitbat/runs/2621302889?check_suite_focus=true

This is a known issue with macOS on M1 and Big Sur, but I have also found this issue with Ubuntu. You can see this being discussed in the pygedm package (which is the one that requires a Fortran compiler):
FRBs/pygedm#8

FRBs/pygedm#7

The question is, does anyone have experience using Fortran compilers with GitHub Actions?

Standardising the "beginner" sessions

One suggestion made last week was that we have a few "1st-year" sessions that we schedule once per month or so, to run throughout the year. The idea here is that new students can attend these to get started in coding, but they don't have to be taken in any particular order, so students starting at any time of the year can jump right in. Practically speaking, we should have 5 or 6 if we want to have the "course" run twice per year with one of these sessions every month or so.

Currently, our list is:

  1. python: importing packages, writing loops and conditionals, useful functions (like print and enumerate), object types (integers, floats, strings, lists, dictionaries, and tuples), etc.
  2. Unix: man pages, cat/less/more, find (basics), ls (and useful flags), rm (and dangers), etc.
  3. git/github: basics of working on your own and contributing to someone else's code (forks)
  4. slurm: submitting jobs, getting interactive sessions. Might not be necessary if ADACS has regular tutorials
  5. ssh: ssh-keys, X11 forwarding, ssh config files, etc.
  6. machine learning?

Thoughts?

Old python package is overriding newly installed version

After updating my version of matplotlib to the most recent version, I quickly came upon the error that the most recent version of six was required for the new version of matplotlib to operate.

The version I have installed on my machine is 1.4.1, and I require 1.10.0 or higher. I used pip to install the new version, however every time I run python, the old version is sourced. I can't use pip to uninstall the old version, so clearly past Sabine installed the older package through a non-pip mechanism.

How can I either uninstall the old version so that it doesn't override the new version, or alternatively point python to the path of the new one, to source it instead?
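A first diagnostic step is to ask Python which file it would actually import. The locate helper below is invented for illustration; it uses the stdlib json module so the sketch runs anywhere, but substituting "six" would reveal the stale copy's path, which can then be deleted or renamed.

```python
import importlib.util

def locate(module_name):
    """Return the file a module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return getattr(spec, "origin", None)

# Stdlib example so this runs everywhere; for the issue above you would call
# locate("six") and remove whatever stale path it reports.
path = locate("json")
print(path)
```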

Coordinate epoch conversions

I am struggling to find a way in astropy to convert a set of current apparent coordinates (RA, DEC) into coordinates at a different epoch (e.g. J2000), and vice-versa. Astropy has multiple coordinate frames (FK5, ICRS, etc.) and any coordinate has two properties: equinox and obstime. I cannot figure out what functions to use to convert the coordinates in a given frame (say ICRS) between two epochs.

OSX laptop fan running full-speed after reboot

I kept encountering this issue that my laptop fan would run continuously after a reboot. I tracked it down by using console, and seeing that there was a continuously repeating log entry

Aug 21 10:05:19 Manodeeps-MacBook-Pro com.apple.xpc.launchd[1] (com.citrix.ReceiverHelper[83341]): Could not find and/or execute program specified by service: 2: No such file or directory: /usr/local/libexec/ReceiverHelper.app/Contents/MacOS/ReceiverHelper
Aug 21 10:05:19 Manodeeps-MacBook-Pro com.apple.xpc.launchd[1] (com.citrix.ReceiverHelper[83341]): Service setup event to handle failure and will not launch until it fires.
Aug 21 10:05:19 Manodeeps-MacBook-Pro com.apple.xpc.launchd[1] (com.citrix.ServiceRecords[83342]): Could not find and/or execute program specified by service: 2: No such file or directory: /usr/local/libexec/ServiceRecords.app/Contents/MacOS/ServiceRecords
Aug 21 10:05:19 Manodeeps-MacBook-Pro com.apple.xpc.launchd[1] (com.citrix.ServiceRecords[83342]): Service setup event to handle failure and will not launch until it fires.
Aug 21 10:05:19 Manodeeps-MacBook-Pro com.apple.xpc.launchd[1] (com.citrix.ReceiverHelper[83343]): Could not find and/or execute program specified by service: 2: No such file or directory: /usr/local/libexec/ReceiverHelper.app/Contents/MacOS/ReceiverHelper
Aug 21 10:05:19 Manodeeps-MacBook-Pro com.apple.xpc.launchd[1] (com.citrix.ReceiverHelper[83343]): Service setup event to handle failure and will not launch until it fires.
Aug 21 10:05:19 Manodeeps-MacBook-Pro com.apple.xpc.launchd[1] (com.citrix.ServiceRecords[83344]): Could not find and/or execute program specified by service: 2: No such file or directory: /usr/local/libexec/ServiceRecords.app/Contents/MacOS/ServiceRecords
Aug 21 10:05:19 Manodeeps-MacBook-Pro com.apple.xpc.launchd[1] (com.citrix.ServiceRecords[83344]): Service setup event to handle failure and will not launch until it fires.

Searching on the internet seemed to imply that [sudo] launchctl remove com.citrix.ReceiverHelper and [sudo] launchctl remove com.citrix.ServiceRecords would solve the problem. Previously that was the case, until the next reboot, after which I would have to run launchctl remove again.

However, this morning my laptop restarted "unexpectedly", and since then the launchctl remove stopped working and my laptop fan was continuously running at full speed.

Turns out that the way to stop these "daemons" was to prevent them from starting up in the first place. Within the folder, /Library/LaunchAgents/, there were two files com.citrix.ServiceRecords.plist and com.citrix.ReceiverHelper.plist. Each of these files contained a key called Disabled like so:

<key>Disabled</key>
<false/>

I replaced the false with true in both the plist files and voila. Those daemons are not invoked any more after reboots.

Added here in case someone else with a Swin laptop is facing a similar issue.

Fast 2D Binning to Produce 3D Datacube

I have been working on some code to create IFU like observations from hydrosim data. What my code does is it takes in the positions, velocities, and masses of particles, does some rotations, then creates 2D maps of mass, velocity, and velocity dispersion. The final 2D binning step is performed by scipy.binned_statistic_2d and this can just provide me with a single value for each pixel.

The problem is that this loses a lot of kinematic information since the distributions are rarely (if ever) perfectly gaussian. E.g. when I get the velocity value for a pixel I just take the mean velocity of all the particles in that bin, but if the distribution is a double gaussian I would never know. In the end I would like to produce cubes, instead of 2D images, where the third dimension is a histogram of values. This could then be combined with emission line and/or stellar spectra to create IFU like datacubes. My own attempts to do this have been ridiculously slow.

-rob
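One possibly faster route: np.histogramdd bins all three quantities (x, y, velocity) in a single call, so the third cube axis is exactly the per-pixel velocity histogram. The function name, grid sizes, and ranges below are invented for illustration.

```python
import numpy as np

def velocity_datacube(x, y, v, nxy=32, nv=16, extent=1.0, vmax=300.0):
    """Bin particles into an (nxy, nxy, nv) cube of velocity histograms.

    A single np.histogramdd call replaces a slow per-pixel loop: axis 2
    holds the velocity distribution of each spatial pixel, so double-peaked
    profiles survive instead of being collapsed to a mean.
    """
    sample = np.column_stack([x, y, v])
    bins = (np.linspace(-extent, extent, nxy + 1),
            np.linspace(-extent, extent, nxy + 1),
            np.linspace(-vmax, vmax, nv + 1))
    cube, _ = np.histogramdd(sample, bins=bins)
    return cube
```

A mass-weighted cube falls out of the same call via the weights argument, and moments of axis 2 recover the mean-velocity and dispersion maps for comparison with the scipy.binned_statistic_2d results.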

Conda needs `ruamel_yaml` to install `ruamel_yaml`

Today I tried running conda update spyder. I'd recently fixed an issue where conda would fail with an error about the environment having been modified by a more recent version of conda, by downgrading conda from 4.6.4 to 4.5.11. (I don't know how relevant this is, but I'll include it anyway.) Anyway, I'd just successfully run conda update conda, and decided to upgrade my editor. I probably should've been suspicious when it wanted to download something like 40 new packages as a result of this simple command, but I left it downloading while I went to lunch.

Getting back after lunch, I found the following error:

ERROR conda.core.link:_execute(656): An error occurred while installing package 'defaults::qt-5.9.7-h468cd18_1'.
Rolling back transaction: done

LinkError: post-link script failed for package defaults::qt-5.9.7-h468cd18_1
location of failed script: /Users/dberke/anaconda3/bin/.qt-post-link.sh
==> script messages <==
<None>
==> script output <==
stdout: 
stderr: cp: /Users/dberke/anaconda3/bin/Assistantapp: No such file or directory
rm: /Users/dberke/anaconda3/bin/Assistantapp: No such file or directory
cp: /Users/dberke/anaconda3/bin/Designerapp: No such file or directory
rm: /Users/dberke/anaconda3/bin/Designerapp: No such file or directory
cp: /Users/dberke/anaconda3/bin/Linguistapp: No such file or directory
rm: /Users/dberke/anaconda3/bin/Linguistapp: No such file or directory
cp: /Users/dberke/anaconda3/bin/pixeltoolapp: No such file or directory
rm: /Users/dberke/anaconda3/bin/pixeltoolapp: No such file or directory
cp: /Users/dberke/anaconda3/bin/qmlapp: No such file or directory
rm: /Users/dberke/anaconda3/bin/qmlapp: No such file or directory

return code: 1

I tried starting Spyder and it failed with the error /bin/bash: /Users/dberke/anaconda3/bin/pythonw: No such file or directory. I then tried conda install spyder, and got the following error:

Traceback (most recent call last):
  File "/Users/dberke/anaconda3/lib/python3.6/site-packages/conda/common/serialize.py", line 19, in get_yaml
    import ruamel_yaml as yaml
ModuleNotFoundError: No module named 'ruamel_yaml'

(cutting out a bunch of stack trace...)

ImportError: No yaml library available.
To proceed, conda install ruamel_yaml

Trying to run conda install ruamel_yaml, however, produced the exact same error, as did any other conda command. I tried running pip install ruamel_yaml, and got an error:

pip install ruamel_yaml
-bash: /Users/dberke/anaconda3/bin/pip: No such file or directory

(This despite the fact that I'd used pip a few hours before.) A which pip pointed me to /usr/local/bin/pip, so I tried using that to install this ruamel_yaml. It ran without giving an error, but I can't import ruamel_yaml in the Python interpreter, and conda keeps throwing the same error. And to top it all off, Spyder is apparently gone as well, so I'm out an editor.

Edit: running ~$ find . -name "ruamel*" gives the following output:

./anaconda3/conda-meta/ruamel_yaml-0.11.14-py36h9d7ade0_2.json
./anaconda3/lib/python3.6/site-packages/ruamel_yaml-0.11.14-py3.6.egg-info
./anaconda3/pkgs/ruamel_yaml-0.11.14-py36h9d7ade0_2
./anaconda3/pkgs/ruamel_yaml-0.11.14-py36h9d7ade0_2/lib/python3.6/site-packages/ruamel_yaml
./anaconda3/pkgs/ruamel_yaml-0.11.14-py36h9d7ade0_2/lib/python3.6/site-packages/ruamel_yaml-0.11.14-py3.6.egg-info
./anaconda3/pkgs/ruamel_yaml-0.15.46-py36h1de35cc_0
./anaconda3/pkgs/ruamel_yaml-0.15.46-py36h1de35cc_0/lib/python3.6/site-packages/ruamel_yaml
./anaconda3/pkgs/ruamel_yaml-0.15.46-py36h1de35cc_0/lib/python3.6/site-packages/ruamel_yaml-0.15.46-py3.6.egg-info
./anaconda3/pkgs/ruamel_yaml-0.15.46-py36h1de35cc_0.tar.bz2

Edit 2: Ah, system pip is Python 2.7, so using it to install doesn't do anything for me. And pip3 fails with

Traceback (most recent call last):
  File "/Users/dberke/anaconda3/bin/pip3", line 6, in <module>
    from pip._internal import main
ImportError: cannot import name 'main'

Having trouble running emcee jobs on OzSTAR

I've just transitioned over to OzSTAR, but can't get my emcee jobs to run. They appear to be submitted but then disappear. No error or output files are being generated. The program was running fine on g2, so I think it must be something either with installed packages or my sbatch script. @manodeep have you run emcee on OzSTAR yet?

The steps I took to install emcee are:

  1. load the anaconda package
  2. create a conda environment
  3. pip install emcee

The jobscript looks like this:

#SBATCH -J emcee_mock_1
#SBATCH -o ozstar.swin.edu.au:/home/cadams/submissions/output/emcee_mock_1_simultaneous_meanmocknexp_nexpnorm_baddzero_fbeta_kmax0p15_z0p1_sigg3p0_sigu15p0_d2030_2018528.out
#SBATCH -e ozstar.swin.edu.au:/home/cadams/submissions/error/emcee_mock_1_simultaneous_meanmocknexp_nexpnorm_baddzero_fbeta_kmax0p15_z0p1_sigg3p0_sigu15p0_d2030_2018528.err
#SBATCH --account=oz073
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --time=24:00:00
#SBATCH --mem-per-cpu=4G

echo `date`
module purge
module load anaconda3/5.1.0
source activate rsd_dv_analysis
cd /fred/oz073/cadams/RSD_DV_Analysis
srun python emceerun.py ./inputoutput/Covariances/odensz0.1_pvz0.053_d2030_ kmin0.002500_kmax0.150000_sigmag3.000000 norsd_kmin0.150000_kmax1.000000 kmin0.002500_kmax0.150000_sigmag3.000000_sigmau15.000000 kmin0.002500_kmax0.150000_sigmau15.000000 ./inputoutput/Gridded_Data/ mock_1 _odensz_gridded_30mpch_odens_sample_meanmocknexp_nexpnorm.txt _pvz_mostmass_gridded_20mpch_vel_sample_consterr_0.12_compl_NEW.txt ./inputoutput/Chains/ simultaneous_meanmocknexp_nexpnorm_baddzero_fbeta_kmax0p15_z0p1_sigg3p0_sigu15p0_d2030 400 500 --simultaneous &
wait
echo `date`
echo Job done
source deactivate

Within emceerun.py, I have requested 16 threads:

dd_fit_args, dd_add_args = gen_dd_args(args.cov_path, args.data_name,
                                    args.dd_lin_ext, args.dd_nonlin_ext)
vv_args = gen_vv_args(args.cov_path, args.data_name, args.vv_ext)
dv_args = gen_simultaneous_dv_args(args.cov_path, args.data_name,
                                            args.dv_ext)

print("Reading data")
odens_data_dict = get_odens_data(args.data_path, args.data_name,
                                        args.dens_specifier)
vel_data_dict = get_vel_data(args.data_path, args.data_name,
                                        args.vel_specifier)

n_dimensions = 4 #Number of free parameters
n_walkers = args.nwalkers #Number of independent walkers
n_steps = args.nsteps #Number of steps each walker takes

save_interval = 300 #Save chains every x seconds 300 -> 5 minutes

chain_file, loglike_file, chi2_file = gen_output_files(args.out_path,
                                                args.data_name, args.out_name)

r_g = 1.0 #Initially fixed cross-correlation coefficient
fsig8_initial = 0.5
bsig8_fit_initial = 1.2
beta_fit_initial = fsig8_initial/bsig8_fit_initial
bsig8_add_initial = 1.2
sigv_initial = 250
ld_initial = 0.09

initialguess = [fsig8_initial, sigv_initial, ld_initial, beta_fit_initial]
perturbation = [0.01, 1, 0.01, 0.01]

pos_initial = [initialguess + perturbation*np.random.randn(n_dimensions) for i in range(n_walkers)]

sampler = emcee.EnsembleSampler(n_walkers, n_dimensions, lnprob_full, threads=16, args=[r_g, dd_fit_args, dd_add_args, vv_args, dv_args, odens_data_dict, vel_data_dict])

run_emcee(save_interval, sampler, pos_initial, n_walkers, n_steps, n_dimensions, chain_file, loglike_file, chi2_file)

I've also tried submitting the job on an interactive node using salloc --account=oz073 --nodes=1 --ntasks-per-node=16 --time=4:00:00 --mem-per-cpu=4G
The print statements from emceerun.py did appear for each thread, but then the program never got any further.

Any help would be greatly appreciated! I'm at a complete loss for what I'm doing wrong.

version clash with pip install of emcee (2.2.1) on g2

I have loaded the default version of python on g2 (currently python-2.7.2) and found that it uses the old version of emcee (1.2.0). So I used pip to install emcee-2.2.1, but python will not recognise this version of it (i.e. it defaults to emcee-1.2.0). Adding the directory (where pip installed the new version) to $PYTHONPATH does not work for importing the updated version either.
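The behaviour comes down to import order: Python takes the first match on sys.path, so an older copy earlier on the path shadows a newer pip-installed one, and a PYTHONPATH entry only helps if it lands ahead of the directory holding the old version. A tiny self-contained demonstration of the mechanism (shadow_demo is an invented module name):

```python
import pathlib
import sys
import tempfile

# Create a throwaway module and put its directory at the FRONT of sys.path;
# from then on it shadows any same-named module later on the path.
tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "shadow_demo.py").write_text("VERSION = '2.2.1'\n")
sys.path.insert(0, tmp)

import shadow_demo
print(shadow_demo.VERSION)
```

For the emcee clash, printing sys.path and emcee.__file__ after import shows exactly which copy wins and where the PYTHONPATH entry actually ended up in the search order.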

Pip and OzStar

Howdy Y'all,

I'm trying to use the package pre-commit, which runs a set of scripts or checks on every commit that you make. I am having issues with compatibility of Pip on OzStar.

Here are the steps I'm taking. If possible, could someone go to a repository that is Git-controlled on OzStar and attempt to follow these steps?

$ pip install pre_commit
$ pre-commit --version
pre-commit 1.18.1

This installs the package as expected and everything is hunky dory.

$ pre-commit install

This installs the hooks required.

To run, pre-commit requires a .pre-commit-config.yaml file to tell it what checks it should run on every commit. Here is my .yaml file that I am using.

repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v1.2.3
    hooks:
    - id: flake8

Basically, on every commit, flake8 will run and check Pep8 compliance (#EllertIsHappy).

At this point, pre-commit should be able to run. We can check this with

$ pre-commit

This is where I get the crash.

(py3.7) [jseiler@farnarkle1 rsage]$ pre-commit
[INFO] Stashing unstaged files to /home/jseiler/.cache/pre-commit/patch1565761019.
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Restored changes from /home/jseiler/.cache/pre-commit/patch1565761019.
An unexpected error has occurred: CalledProcessError: Command: ('/home/jseiler/.cache/pre-commit/repo6j0kyldw/py_env-python3.7/bin/python', '/home/jseiler/.cache/pre-commit/repo6j0kyldw/py_env-python3.7/bin/pip', 'install', '.')
Return code: 1
Expected return code: 0
Output: 
    Processing /home/jseiler/.cache/pre-commit/repo6j0kyldw
    
Errors: 
    Could not install packages due to an EnvironmentError: [('/home/jseiler/.cache/pre-commit/repo6j0kyldw/.git/objects/pack/pack-e19eaf5d2a40d7a63ef1a73ac1a6f2934cdd7e5c.pack', '/fred/oz004/jseiler/pip_tmp/pip-req-build-ulrdsljz/.git/objects/pack/pack-e19eaf5d2a40d7a63ef1a73ac1a6f2934cdd7e5c.pack', "[Errno 13] Permission denied: '/fred/oz004/jseiler/pip_tmp/pip-req-build-ulrdsljz/.git/objects/pack/pack-e19eaf5d2a40d7a63ef1a73ac1a6f2934cdd7e5c.pack'"), ('/home/jseiler/.cache/pre-commit/repo6j0kyldw/.git/objects/pack/pack-e19eaf5d2a40d7a63ef1a73ac1a6f2934cdd7e5c.idx', '/fred/oz004/jseiler/pip_tmp/pip-req-build-ulrdsljz/.git/objects/pack/pack-e19eaf5d2a40d7a63ef1a73ac1a6f2934cdd7e5c.idx', "[Errno 13] Permission denied: '/fred/oz004/jseiler/pip_tmp/pip-req-build-ulrdsljz/.git/objects/pack/pack-e19eaf5d2a40d7a63ef1a73ac1a6f2934cdd7e5c.idx'")]

Because pre-commit is trying to boot up a Virtual Env using Pip, there is a permission problem. Normally, you would circumvent this using the --user flag but since this command isn't being run by me in the command line, I can't add that flag!

I have attempted to fix the issues by setting the TMPDIR environment variable to /fred/oz004/jseiler/pip_tmp/ but alas it is still yelling at me.

Any insights would be welcome.
