dataanalysisrecipes's Introduction

Data Analysis Recipes

Chapters from Hogg's non-existent book.

Authors:

(Contributions have come from all of the following.)

  • David W. Hogg, New York University
  • Jo Bovy, Institute for Advanced Study
  • Dan Foreman-Mackey, University of Washington
  • Dustin Lang, Princeton University

License:

Copyright 2010, 2011, 2012, 2013, 2014, 2015, 2016 the authors. All rights reserved.

If you have interest in using or re-using any of this content, get in touch with Hogg.

Style notes:

  • tentative: use "pdf" not "PDF".
  • When at the end of the sentence, put the \note after the period, but when at the end of a phrase, put the \note before the comma or parenthesis.
  • Make sure the endnotes can be read on their own, outside of context.
  • Be careful with the words "error", "uncertainty", "probability", "frequency", "likelihood".
  • Use () for function arguments, and [] for grouping/precedence.
  • Define macros; remember "1, 2, infinity".
  • Put new terms in \emph{}, put only referred-to words in quotation marks.
  • Do in-text itemized lists with \textsl{(a)}~ and so on.

Git migration notes:

When I want to import stuff from the old SVN repository, I do the following:

  1. I create a new github repository called foo and follow the svn import instructions.

  2. I git clone that repository and do things like move the files into a directory structure that won't conflict with the current structure, like:

     cd
     git clone git@github.com:davidwhogg/foo.git
     cd foo
     mkdir straightline
     git add straightline # probably a no-op: git does not track empty directories
     git mv *.pdf straightline
     # etc
     # . . .
     git commit -a -m "fixed up directory structure"
     git push
    
  3. I make a subtree merge or something like that (I am new to all this) like so:

     cd
     cd DataAnalysisRecipes
     git pull # to get up-to-date
     git remote add foo git@github.com:davidwhogg/foo.git
     git fetch foo
     git merge foo/master
     git push
    
  4. Then I delete the foo repo from github so as not to confuse myself.
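The merge in step 3 can be rehearsed on throwaway local repositories before touching the real ones. The following is a minimal sketch, not the author's procedure: the repository names `foo` and `main`, the file `straightline/notes.tex`, and the commit messages are all hypothetical stand-ins. It assumes git 2.9 or later, where merging two repositories with no common ancestor needs `--allow-unrelated-histories`. It then checks that the imported commit is still reachable after the merge.

```shell
#!/bin/sh
# Sketch: merge a stand-in "foo" repository into a stand-in main repository
# and confirm that foo's commits are still in the history afterwards.
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the SVN-imported repository.
git init -q foo
cd foo
git config user.email "you@example.com"
git config user.name "Example"
mkdir straightline
echo "draft" > straightline/notes.tex
git add straightline/notes.tex
git commit -q -m "imported from svn"
# The default branch name depends on local git configuration, so record it.
branch=$(git rev-parse --abbrev-ref HEAD)
cd ..

# Hypothetical stand-in for the main repository.
git init -q main
cd main
git config user.email "you@example.com"
git config user.name "Example"
echo "readme" > README
git add README
git commit -q -m "initial commit"

# Same pattern as step 3, but against the local stand-in.
git remote add foo ../foo
git fetch -q foo
git merge -q --allow-unrelated-histories -m "merge foo" "foo/$branch"

# The imported file's log should still list the original commit.
history=$(git log --oneline -- straightline/notes.tex)
echo "$history"
```

If a file was renamed or moved before the merge, `git log --follow -- <path>` traces its history across the rename, which a plain `git log -- <path>` does not.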

dataanalysisrecipes's People

Contributors

alessandro-gentilini, changhoonhahn, davidwhogg, dstndstn, jobovy

dataanalysisrecipes's Issues

ideas for straightline from Froeschle

  • the scale dependency of the Bayes Factor (i.e. the evidence) when using the line angle instead of the slope as a parameter if the data points have units.
  • the issue of the posterior probability distribution hitting the lower limits of the uncertainties in both directions, because the model does not know where to put the scatter. I showed you a corner plot with this effect. You suggested giving the uncertainties not only upper limits but also sensible lower limits.
  • Markus's question about how a model can encode the difference between all data points lying on one side of the line and their being equally distributed on both sides. In our case we couldn't tell our model that it is an important finding that all redshifts lie above zero.

Uncertainties and variances

I believe there is a typo in

    When the uncertainties are Gaussian and their variances $\sigma_{yi}$

which should be

    When the uncertainties are Gaussian and their variances $\sigma_{yi}^2$

I also found this problem to be poorly worded:

\begin{problem}\label{prob:chi2}
Re-do the fit of \problemname~\ref{prob:easy} but setting all
$\sigma_{yi}^2=S$, that is, ignoring the uncertainties and replacing
them all with the same value $S$. What uncertainty variance $S$ would
make $\chi^2 = N-2$? Relevant plots are shown in
\figurename~\ref{fig:chi2}. How does it compare to the mean and
median of the uncertainty variances $\allsigmay$?
\end{problem}

The statement

setting all $\sigma_{yi}^2=S$, that is, ignoring the uncertainties and replacing them all with the same value $S$

is not self-consistent.
I believe it should read as

\begin{problem}\label{prob:chi2} 
 Re-do the fit of \problemname~\ref{prob:easy} but setting all 
 $\sigma_{yi}=\sqrt{S}$, that is, ignoring the uncertainties and replacing 
 them all with the same value $\sqrt{S}$.  What uncertainty variance $S$ would 
 make $\chi^2 = N-2$?  Relevant plots are shown in 
 \figurename~\ref{fig:chi2}.  How does it compare to the mean and 
 median variance of the uncertainty $\allsigmay$? 
 \end{problem} 
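
For reference, the value of $S$ that the problem asks for can be worked out in one line. This is a sketch that assumes the usual straight-line chi-squared with slope $m$ and intercept $b$; those two symbols are assumptions here and do not appear in the issue text.

```latex
\begin{align}
\chi^2 &= \sum_{i=1}^N \frac{[y_i - m\,x_i - b]^2}{\sigma_{yi}^2}
        = \frac{1}{S}\,\sum_{i=1}^N [y_i - m\,x_i - b]^2
        \quad\mbox{when all $\sigma_{yi}^2 = S$} \\
\chi^2 = N - 2 \quad&\Longrightarrow\quad
S = \frac{1}{N-2}\,\sum_{i=1}^N [y_i - m\,x_i - b]^2
\end{align}
```

Because every point carries the same weight, the best-fit $m$ and $b$ do not depend on $S$, so the fit can be done first and $S$ read off from the residuals. In either wording of the problem $S$ is a variance, and the corresponding uncertainty is $\sqrt{S}$.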

add DFM to author list

should be on README

should be on PGM chapter (is)

should be on probability.tex if/when he adds a PGM to that chapter

svn history seems to have disappeared

I am pissed, because I thought the import from SVN was supposed to preserve it; and it did at the foo stage; the git fetch and git merge seemed to eradicate it...?

Series needs a better name!!

I now don't like Data Analysis Recipes. Other options:

  • Data Analysis
  • Data Analysis for natural sciences and engineering
  • The Joy of Data Analysis
  • ... add suggestions!
