justinbois / bebi103

Utilities for BE/Bi 103

License: BSD 3-Clause "New" or "Revised" License

Python 66.57% Shell 0.05% Stan 0.11% Jupyter Notebook 33.27%


bebi103's Issues

Put in NaN handling in cat plots
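
One possible approach, sketched with the standard library only: drop every (category, value) pair whose value is NaN or missing before the plot is built. The helper name drop_nan_pairs is hypothetical and not part of bebi103.

```python
import math


def drop_nan_pairs(cats, vals):
    """Return (cats, vals) without pairs whose value is NaN or None.

    Hypothetical helper for illustration; not part of bebi103.
    """
    kept = [
        (c, v)
        for c, v in zip(cats, vals)
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    # Handle the case where nothing survives the filter.
    if not kept:
        return [], []
    kept_cats, kept_vals = zip(*kept)
    return list(kept_cats), list(kept_vals)


cats, vals = drop_nan_pairs(["a", "a", "b"], [1.0, float("nan"), 2.5])
```

Whether NaNs should be dropped silently or with a warning is a design choice this sketch leaves open.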

Typo in viz.fill_between

In the function viz.fill_between, in line 157, it states line_alpha=line_width, which should be line_alpha=line_alpha.

N and n_cores in SBC

When running N = 9 on 9 cores, I get 81 SBC samples. N should not be per-core.
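
If N is meant to be the total number of SBC samples rather than a per-core count, the total could be divided among the cores roughly like this. The function name split_across_cores is an assumption for illustration, not bebi103 code.

```python
def split_across_cores(n_total, n_cores):
    """Split n_total SBC runs into per-core counts that sum to n_total."""
    if n_total < 0 or n_cores < 1:
        raise ValueError("need n_total >= 0 and n_cores >= 1")
    base, extra = divmod(n_total, n_cores)
    # The first `extra` cores each take one additional run.
    return [base + 1 if i < extra else base for i in range(n_cores)]


per_core = split_across_cores(9, 9)
```

With N = 9 on 9 cores, each core gets a single run, so the total stays at 9 instead of 81.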

I was using the module and think I found a bug in the viz.im_click() function. It seems that there is a call to bokeh.layout.row() that should be a call to bokeh.layouts.row(). Here is the code to reproduce the error:

im = np.array([[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]])
bokeh.io.show(bebi103.viz.im_click(im))

The error is as follows:

AttributeError Traceback (most recent call last)
in ()
1 im = np.array([[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]])
----> 2 bokeh.io.show(bebi103.viz.im_click(im))

~/anaconda3/lib/python3.6/site-packages/bebi103/viz.py in im_click(im, color_mapper, plot_height, plot_width, length_units, interpixel_distance, x_range, y_range, no_ticks, x_axis_label, y_axis_label, title, flip)
2118
2119 div = bokeh.models.Div(width=200)
-> 2120 layout = bokeh.layout.row(p, div)
2121
2122 p.js_on_event(bokeh.events.Tap, display_event(div, attributes=['x', 'y']))

AttributeError: module 'bokeh' has no attribute 'layout'

corner() fails with only one parameter

corner() fails when plotting with only one parameter. It should just give a plot of the histogram or ECDF.
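
A rough sketch of the requested fallback: with a single parameter, plan one ECDF panel instead of a grid. Both function names are hypothetical, and the "plan" is only a list of labels standing in for actual plotting calls.

```python
def ecdf_points(samples):
    """Return sorted values and ECDF heights for one parameter."""
    x = sorted(samples)
    n = len(x)
    y = [(i + 1) / n for i in range(n)]
    return x, y


def corner_plan(pars):
    """Decide which panels to draw (hypothetical dispatch, not bebi103's)."""
    if len(pars) == 0:
        raise ValueError("need at least one parameter")
    if len(pars) == 1:
        # Single parameter: just the one-dimensional panel.
        return [("ecdf", pars[0])]
    # Diagonal panels first, then the lower-triangle pairwise panels.
    plan = [("ecdf", p) for p in pars]
    plan += [
        ("scatter", (pars[j], pars[i]))
        for i in range(len(pars))
        for j in range(i)
    ]
    return plan


single = corner_plan(["mu"])
```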

Importing bebi103

Hey Justin,

I'm trying to use bebi103 on AWS to do something with Stan, but I'm running into an issue. It looks like the up-to-date numpy is not working with scikit-image in the way that bebi103 expects. Here is the error I get:

---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
in
----> 1 import bebi103

~/miniconda/lib/python3.7/site-packages/bebi103/__init__.py in
12 from . import viz
13
---> 14 from . import image
15
16 try:

~/miniconda/lib/python3.7/site-packages/bebi103/image.py in
4 import numba
5
----> 6 import skimage.io
7 import skimage.measure
8

~/miniconda/lib/python3.7/site-packages/skimage/io/__init__.py in
9 from .collection import *
10
---> 11 from ._io import *
12 from ._image_stack import *
13

~/miniconda/lib/python3.7/site-packages/skimage/io/_io.py in
2
3 from ..io.manage_plugins import call_plugin
----> 4 from ..color import rgb2gray
5 from .util import file_or_url_context
6 from ..exposure import is_low_contrast

~/miniconda/lib/python3.7/site-packages/skimage/color/__init__.py in
59 hpx_from_rgb)
60
---> 61 from .colorlabel import color_dict, label2rgb
62
63 from .delta_e import (deltaE_cie76,

~/miniconda/lib/python3.7/site-packages/skimage/color/colorlabel.py in
3 import numpy as np
4
----> 5 from .._shared.utils import warn
6 from ..util import img_as_float
7 from . import rgb_colors

~/miniconda/lib/python3.7/site-packages/skimage/_shared/utils.py in
6 import numbers
7
----> 8 from ..util import img_as_float
9 from ._warnings import all_warnings, warn
10

~/miniconda/lib/python3.7/site-packages/skimage/util/__init__.py in
10 from .unique import unique_rows
11 from ._invert import invert
---> 12 from ._montage import montage
13
14 from .._shared.utils import copy_func

~/miniconda/lib/python3.7/site-packages/skimage/util/_montage.py in
1 import numpy as np
----> 2 from .. import exposure
3
4
5 __all__ = ['montage']

~/miniconda/lib/python3.7/site-packages/skimage/exposure/__init__.py in
----> 1 from .exposure import histogram, equalize_hist, \
2 rescale_intensity, cumulative_distribution, \
3 adjust_gamma, adjust_sigmoid, adjust_log, \
4 is_low_contrast
5

~/miniconda/lib/python3.7/site-packages/skimage/exposure/exposure.py in
3 from ..color import rgb2gray
4 from ..util.dtype import dtype_range, dtype_limits
----> 5 from .._shared.utils import warn
6
7

ImportError: cannot import name 'warn' from 'skimage._shared.utils' (/home/ec2-user/miniconda/lib/python3.7/site-packages/skimage/_shared/utils.py)

On AWS, I have:
numpy 1.16.2 py37h7e9f1db_0
bebi103 0.0.41 pypi_0
scikit-image 0.15.0 py37he1b5a44_0 (conda-forge)

The conda-forge scikit-image is the latest version I tried.

Do you have any suggestions for what I could do?

stan.sbc() does not handle non-scalar parameters

If a parameter is defined as non-scalar in the Stan model for a prior predictive check, it cannot be given as a parameter input to stan.sbc(). In stan._get_prior_sd(), standard deviations are computed only for the parameters given as input, but stan.xarray_to_ndarray() expands each non-scalar parameter into one entry per element instead.
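
To illustrate the mismatch, here is a minimal sketch of expanding parameter names element by element, so that a name list given by the user could be compared with the expanded one. The function name, the zero-based indices, and the bracket naming are all assumptions for illustration; bebi103 may name the entries differently.

```python
import itertools


def expand_param_names(shapes):
    """Map {name: shape} to one name per element, e.g. 'theta[0]'.

    Zero-based indexing and the bracket format are assumptions.
    """
    names = []
    for name, shape in shapes.items():
        if len(shape) == 0:
            names.append(name)  # scalar parameter: keep the name as is
            continue
        for idx in itertools.product(*(range(n) for n in shape)):
            names.append(f"{name}[{','.join(map(str, idx))}]")
    return names


expanded = expand_param_names({"sigma": (), "theta": (2,), "A": (2, 2)})
```

A user asking for "theta" would then match two expanded entries, which is the case the standard deviation computation would need to handle.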

Datashader import error.

Every time I import the module, I get the following warning:

bebi103-0.0.55-py3.7.egg/bebi103/viz.py:37: UserWarning: DataShader import failed with error 
"cannot import name 'encode_utf8' from 'bokeh.embed.notebook' 
(/Users/mrazomej/anaconda3/lib/python3.7/site-packages/bokeh/embed/notebook.py)".
Features requiring DataShader will not work and you will get exceptions.

The error seems to be in import datashader.bokeh_ext.

When I use corner plots with datashade=True I get the following error:

~/anaconda3/lib/python3.7/site-packages/bebi103-0.0.55-py3.7.egg/bebi103/viz.py in corner(samples, pars, labels, datashade, plot_width, plot_ecdf, cmap, color_by_chain, palette, divergence_color, alpha, single_param_color, bins, show_contours, contour_color, bins_2d, levels, weights, smooth, extend_contour_domain, plot_width_correction, plot_height_correction, xtick_label_orientation)

NameError: name 'datashader' is not defined

Any ideas on why this is happening?

Computational environment:

CPython 3.7.7
IPython 7.13.0

bebi103 0.0.55
bokeh 2.0.1
datashader 0.10.0

compiler   : Clang 4.0.1 (tags/RELEASE_401/final)
system     : Darwin
release    : 18.7.0
machine    : x86_64
processor  : i386
CPU cores  : 8
interpreter: 64bit
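
The NameError suggests the failed import leaves the name undefined at the point of use. A common pattern, sketched here with hypothetical helper names, is to record a failed optional import as None and raise a descriptive error when the feature is requested.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def require(module, name, feature):
    """Raise a descriptive error when an optional dependency is missing."""
    if module is None:
        raise RuntimeError(
            f"{feature} requires {name}, which could not be imported."
        )
    return module


# 'not_a_real_module_xyz' stands in for an optional dependency that fails to import.
missing = optional_import("not_a_real_module_xyz")
```

Calling require(missing, "datashader", "datashade=True") would then fail with a message naming the missing dependency rather than with a NameError.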

Variable names/column names

Check to make sure no variables are named "lnprob," "beta," and the like.
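
Such a check could be as small as the sketch below. The reserved set contains only the two names mentioned above; the real list is presumably longer, and the function name is hypothetical.

```python
# Only the two names from the issue; the full list is an open question.
RESERVED_NAMES = {"lnprob", "beta"}


def find_reserved(names, reserved=RESERVED_NAMES):
    """Return the names that collide with reserved ones, in input order."""
    return [n for n in names if n in reserved]


clashes = find_reserved(["alpha", "beta", "sigma", "lnprob"])
```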

viz.predictive_ecdf incorrectly computes vertical line segments when diff=True

I made the following plot using bebi103.viz.predictive_ecdf when doing posterior predictive checks (my likelihood is discrete, negative binomial):

(Note the cropped y-axis.)

Looks good, I thought, but the quantiles are so narrow, it's hard to see where my observed data actually are relative to the predictive quantiles. So I turned the diff option in predictive_ecdf to True and generated this plot:

It appears that the vertical line segments of the data ECDF are being computed as if diff=False. As the next fig shows, this is not unique to discrete distributions, but I never would have seen it in the following continuous model unless I had gone back looking for it:

SBC needs posterior and prior to have the same parameters

When a posterior and a prior do not define the same parameters, SBC throws an uninformative error. A check should be added so that the error message is informative.
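
A minimal sketch of such a check, assuming the parameter names of both models are available as plain lists of strings (the function name is hypothetical):

```python
def check_same_parameters(prior_params, posterior_params):
    """Raise a descriptive ValueError if the two parameter sets differ."""
    prior, post = set(prior_params), set(posterior_params)
    if prior == post:
        return
    only_prior = sorted(prior - post)
    only_post = sorted(post - prior)
    raise ValueError(
        "Prior and posterior models must define the same parameters. "
        f"Only in prior: {only_prior}. Only in posterior: {only_post}."
    )


# Matching sets pass silently, regardless of order.
check_same_parameters(["mu", "sigma"], ["sigma", "mu"])
```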

Plotting confidence intervals on ECDFs of bootstrap samples

[Feature Request] cmdstan error reporting from within bebi103.stan.sbc

As per our Slack conversation, there is currently no way to specify an output directory for the .txt files produced by cmdstan inside bebi103.stan.sbc. If the top-level function took a dictionary of extra arguments to pass to sm.sample(), the user could direct the cmdstan output to a more accessible location, making debugging much easier.
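
The requested pass-through might look like the following sketch. Both functions are stand-ins: fake_sample only echoes its arguments, and the sampling_kwargs parameter name is an assumption, not an existing bebi103 argument.

```python
def fake_sample(data, iter_sampling=1000, output_dir=None):
    """Stand-in for a sampler call; it only echoes its arguments."""
    return {
        "data": data,
        "iter_sampling": iter_sampling,
        "output_dir": output_dir,
    }


def sbc_like(data, sample_fn=fake_sample, sampling_kwargs=None):
    """Forward user-supplied keyword arguments to the sampler unchanged."""
    kwargs = dict(sampling_kwargs or {})  # copy so the caller's dict is untouched
    return sample_fn(data, **kwargs)


result = sbc_like({"N": 3}, sampling_kwargs={"output_dir": "./sbc_output"})
```

Keeping the extra arguments in one dictionary avoids name clashes with the top-level function's own keyword arguments.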

Versions > 0.0.12 do not import on some systems

ubuntu 14.04 / Python 3.6.2 installed via anaconda
bebi103 versions > 0.0.12 DO NOT WORK (0.0.16/0.0.15/0.0.14/0.0.13 tested)
bebi103 versions <= 0.0.12 Do work

Error Message:

import bebi103
Traceback (most recent call last):
  File "", line 1, in
  File "/home/aiden/anaconda3/lib/python3.6/site-packages/bebi103/__init__.py", line 7, in
    from . import pm
  File "/home/aiden/anaconda3/lib/python3.6/site-packages/bebi103/pm.py", line 10, in
    from .hotdists import *
  File "/home/aiden/anaconda3/lib/python3.6/site-packages/bebi103/hotdists.py", line 147, in
    class HotConstantDist(pm.ConstantDist):
TypeError: function() argument 1 must be code, not str

Additional Info:
Python modules
alabaster==0.7.10
anaconda-client==1.6.3
anaconda-navigator==1.6.4
anaconda-project==0.6.0
asn1crypto==0.22.0
astroid==1.5.3
astropy==2.0.1
Babel==2.5.0
backports.shutil-get-terminal-size==1.0.0
beautifulsoup4==4.6.0
bebi103==0.0.9
biopython==1.69
bitarray==0.8.1
bkcharts==0.2
blaze==0.10.1
bleach==1.5.0
bokeh==0.12.7
boto==2.48.0
Bottleneck==1.2.1
category-encoders==1.2.4
certifi==2016.2.28
cffi==1.10.0
chardet==3.0.4
click==6.7
cloudpickle==0.4.0
clyent==1.2.2
colorama==0.3.9
colorcet==0.9.1
conda==4.3.29
contextlib2==0.5.5
cryptography==1.8.1
cycler==0.10.0
Cython==0.26
cytoolz==0.8.2
dask==0.15.2
datashader==0.6.1
datashape==0.5.4
decorator==4.1.2
distributed==1.18.1
docutils==0.14
entrypoints==0.2.3
et-xmlfile==1.0.1
fastcache==1.0.2
Flask==0.12.2
Flask-Cors==3.0.3
rope-py3k==0.9.4.post1
scikit-image==0.13.0
scikit-learn==0.19.0
scipy==0.19.1
seaborn==0.8
selenium==3.0.2
simplegeneric==0.8.1
singledispatch==3.4.0.3
six==1.10.0
snowballstemmer==1.2.1
sortedcollections==0.5.3
sortedcontainers==1.5.7
Sphinx==1.6.3
sphinxcontrib-websupport==1.0.1
spyder==3.2.3
SQLAlchemy==1.1.13
statsmodels==0.8.0
sympy==1.1.1
tables==3.4.2
tblib==1.3.2
terminado==0.6
testpath==0.3
Theano==0.9.0
toolz==0.8.2
tornado==4.5.2
tqdm==4.15.0
traitlets==4.3.2
unicodecsv==0.14.1
wcwidth==0.1.7
Werkzeug==0.12.2
widgetsnbextension==3.0.2
wrapt==1.10.11
xarray==0.9.6
xlrd==1.1.0
XlsxWriter==0.9.8
xlwt==1.3.0
zict==0.1.2

[Feature Request] Option to change transparency of confidence intervals in viz.predictive_ecdf() and viz.predictive_regression()

It would be nice to have the option to change the transparency of the confidence intervals in viz.predictive_ecdf() and viz.predictive_regression(). This would enable the user to overlay multiple plots.

Put in marker_kwargs

In cat plots, the kwargs argument should apply to creation of the figure, and marker_kwargs should apply to the markers.

[Feature Request] Raise warning for unused keyword arguments if bokeh figure is initialized.

When using viz.predictive_ecdf(), viz.histogram(), and viz.predictive_regression(), any keyword argument is accepted if a Bokeh figure is initialized in advance and passed in via p. The following code runs without complaint, but I think silently ignoring keyword arguments should be avoided. At the very least, a warning should be raised.

import bebi103
import numpy as np
import bokeh.plotting
import bokeh.io

bokeh.io.output_notebook()

sample = np.array([np.random.normal(0, 10, size=1000) for i in range(10)])

p1 = bokeh.plotting.figure(
    plot_width=200,
    plot_height=200
)


bokeh.io.show(
    bebi103.viz.predictive_ecdf(
        sample,
        p=p1,
        foo="bar"
    )
)


p2 = bokeh.plotting.figure(
    plot_width=200,
    plot_height=200
)

bokeh.io.show(
    bebi103.viz.histogram(
        sample[0],
        p=p2,
        foo="bar"
    )
)

p3 = bokeh.plotting.figure(
    plot_width=200,
    plot_height=200
)

bokeh.io.show(
    bebi103.viz.predictive_regression(
        sample,
        sample[0],
        p=p3,
        foo="bar"
    )
)

with package versions
CPython 3.7.5
IPython 7.11.1

bokeh 1.4.0
bebi103 0.0.52
numpy 1.17.4
jupyterlab 1.2.6
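
The requested behavior could be sketched as follows. The function is hypothetical and a plain dict plays the role of the Bokeh figure, so the example runs without Bokeh installed.

```python
import warnings


def make_or_reuse_figure(p=None, **kwargs):
    """Return (figure, used_kwargs); warn if kwargs would be dropped.

    Hypothetical stand-in: a dict plays the role of a Bokeh figure.
    """
    if p is not None:
        if kwargs:
            warnings.warn(
                "Figure `p` was supplied, so these keyword arguments "
                f"are ignored: {sorted(kwargs)}",
                UserWarning,
            )
        return p, {}
    # No figure given: the kwargs are used to create one.
    return {"figure_kwargs": kwargs}, kwargs


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fig, used = make_or_reuse_figure(p={"existing": True}, foo="bar")
```

Raising a TypeError instead of warning would be stricter, at the cost of breaking code that currently runs.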

DOI link not working

The DOI link in README.md leads to a "Page not found" website.

cmdstanpy args changed

CmdStanPy changed the argument names of some of its functions, as addressed in #221.
As a result, calling functions such as bebi103.stan.sbc raises an error, because the deprecated sampling_iters argument has been renamed to iter_sampling.
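
One way to stay compatible with callers that still use the old name is a small translation step before the sampler is called. This is a sketch with hypothetical names; the mapping lists only the rename mentioned above.

```python
import warnings

# Old name -> new name. Only the rename mentioned in the issue is listed.
RENAMED_ARGS = {"sampling_iters": "iter_sampling"}


def translate_kwargs(kwargs, renamed=RENAMED_ARGS):
    """Return a copy of kwargs with deprecated names replaced."""
    out = {}
    for key, value in kwargs.items():
        new_key = renamed.get(key, key)
        if new_key != key:
            warnings.warn(
                f"'{key}' is deprecated; use '{new_key}'", DeprecationWarning
            )
        if new_key in out:
            raise TypeError(f"got both old and new spellings of '{new_key}'")
        out[new_key] = value
    return out


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    translated = translate_kwargs({"sampling_iters": 500, "chains": 4})
```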

Allow SBC to retain and return samples when there is a crash

When doing an SBC calculation with bebi103.stan.sbc(), if one of the sampling runs crashed (e.g., by hitting an infinity or something), then all of the SBC samples gathered up to that point are unavailable, which can be frustrating because they took a lot of computation time to acquire.

We need to allow access of samples even when a crash occurs.
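
A minimal sketch of the idea, assuming each SBC trial can be run as an independent function call: collect results as they arrive and, on failure, return what has been gathered together with the exception. The names run_sbc_trials and flaky_trial are hypothetical.

```python
def run_sbc_trials(trial_fn, n_trials):
    """Run trials one by one; return (results, error).

    If a trial raises, stop and return everything collected so far
    together with the exception instead of discarding it.
    """
    results = []
    for i in range(n_trials):
        try:
            results.append(trial_fn(i))
        except Exception as err:  # broad on purpose: any sampler failure
            return results, err
    return results, None


def flaky_trial(i):
    """Stand-in for one sampling run; the fourth run fails."""
    if i == 3:
        raise FloatingPointError("hit an infinity")
    return {"trial": i, "rank": i % 2}


partial, error = run_sbc_trials(flaky_trial, 10)
```

An alternative design would skip the failed trial and continue, but that could bias the SBC ranks, so stopping and reporting seems the more cautious default.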

Release for 2015 edition?

Might be good to tag commit 998b923 as a release, so that API-incompatible changes can be made for the coming year while also making sure old lesson/homework notebooks stay functional.

Put in default axis labels for cat plots

Defaults could just be DataFrame column names if axis labels are not specified.
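
The fallback could be as simple as the sketch below. It assumes the categorical column goes on the x-axis and the value column on the y-axis; the function name and argument names are hypothetical.

```python
def default_axis_labels(cat_column, val_column, x_axis_label=None, y_axis_label=None):
    """Fall back to column names when axis labels are not given.

    Assumes categories on the x-axis and values on the y-axis.
    """
    x_label = cat_column if x_axis_label is None else x_axis_label
    y_label = val_column if y_axis_label is None else y_axis_label
    return x_label, y_label


labels = default_axis_labels("genotype", "area")
```

Testing against None rather than truthiness lets a user pass an empty string to suppress a label on purpose.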