
detrending's Introduction

Light curves & detrending workgroup at #exosamsi

Information

Room 219

Office Hours 9am

Members

How to join the exosamsi organization

One option: star this repository and I'll add you to the organization. Alternatively, if you want to have more fun with git+GitHub:

  • Figure out how to add your name to the members list in this file 😄.
  • Then submit a pull request.

How to contribute to the repository

Once you're a member of the organization, you should be able to push to the main repository. To get a copy you can push to, clone the main repository by running:

git clone https://github.com/exosamsi/detrending.git

or, if you already have a checked-out copy, you can point it at the main repository by running:

git remote set-url origin https://github.com/exosamsi/detrending.git

Here, origin is the name of the remote: it's the default name git gives to the repository you originally cloned.

detrending's People

Contributors

aprsa, benmontet, davidwhogg, dfm, hpparvi, jessielchristiansen, mrtommyb, pdbaines, ruthangus, rwolpert, saturnaxis


detrending's Issues

Injected data and earth pointing

Tom:
Hi team,
I've been playing with injecting and recovering 1-year orbital period planets. I started with a dumb median filter but kept getting spurious detections right after earth points, caused by faulty filtering at the data gaps.
I then tried Gal's detrender and am finding that it causes massive spikes after every earth point. What am I doing wrong? I've attached a plot showing the dumb median filter in red and Gal's in blue. I tried playing with the window size, with little success.

By the way, the data files contain a column called SAP_QUALITY. The values in this column encode events, i.e. there is a numerical flag for 'earth-point just happened'. For now I've started just throwing away the data for two days after every earth point.

Cheers,
Tom
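
A minimal sketch of the masking Tom describes, assuming SAP_QUALITY is a bitmask and that the earth-point flag has value 8 (that's my reading of the Kepler Archive Manual; verify before relying on it):

import numpy as np

EARTH_POINT = 8  # assumed flag value; check against the Kepler Archive Manual

def mask_after_earth_points(time, flux, quality, pad=2.0):
    """Drop `pad` days of data after each cadence flagged as an earth point."""
    keep = np.ones(len(time), dtype=bool)
    for t0 in time[(quality.astype(int) & EARTH_POINT) != 0]:
        keep &= (time < t0) | (time >= t0 + pad)
    return time[keep], flux[keep]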

Bekki:
We're working on various approaches to avoid detections at data gaps. For example, what I'm doing now is (a minimal sketch of steps 1-2 follows below):

  1. Identify change points based on time gaps and/or big flux jumps
  2. Detrend each segment between change points separately, using a running median filter with a 3-day width
  3. Delete data 3 days before and after each gap
  4. Use wavelets to interpolate between gaps

I think others are working on different things, but basically, for now, most of us are sacrificing data near gaps.
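
A minimal sketch of steps 1-2, assuming change points come from time gaps only (flux-jump detection omitted) and Kepler long-cadence sampling of about 29.4 minutes (0.0204 days); names and defaults here are illustrative:

import numpy as np
from scipy.ndimage import median_filter

def detrend_segments(time, flux, gap_days=0.5, window_days=3.0, cadence=0.0204):
    """Split at time gaps, then running-median detrend each segment."""
    breaks = np.where(np.diff(time) > gap_days)[0] + 1   # step 1: change points
    width = max(3, int(window_days / cadence) | 1)       # odd window, in cadences
    detrended = np.empty_like(flux)
    for seg in np.split(np.arange(len(time)), breaks):   # step 2: per-segment
        trend = median_filter(flux[seg], size=width, mode='nearest')
        detrended[seg] = flux[seg] / trend
    return detrended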

Ruth is working with Gal's median filter too and I think she's implementing something to try to address this issue.

Tom:
Thanks for the update Bekki. I too am finding that sacrificing data near gaps is a necessary evil.
Why do you remove data before the gaps? Those data shouldn't be affected by the nasty 'thermal ramps'; the spacecraft doesn't know it's about to do an earth-point.
(I guess we should probably be having this conversation on GitHub, oops.)

Hope all is well there with you guys! It’s rainy here in SF!! in June!!

Papers we may be working on

Title: Just use the PDC SAP
Abstract: Who needs a fancy detrending algorithm?

Title: A wavelet-based transit likelihood? No thanks.
Abstract: Chi-squared was good enough for our grand-advisors.

Title: The CMB or the Kepler CCD array?
Abstract: We know the temperature of one of these pretty well.

Title: A Gaussian approach to transit detection
Abstract: We drew a box around the transit light curves and assumed that the flux through the surface was proportional to the number of planets inside.

Title: God wouldn't make other planets like the Earth.
Abstract: We set the prior to zero and didn't find any.

Build issues

From @benmontet

On EPD numpy 1.6.something, the build fails with the errors:

/usr/include/string.h:548:5: error: unknown type name ‘__locale_t’
/usr/include/string.h:552:18: error: unknown type name ‘__locale_t’
untrendy/_untrendy.c: In function ‘untrendy_find_discontinuities’:
untrendy/_untrendy.c:27:46: error: ‘NPY_ARRAY_IN_ARRAY’ undeclared (first use in this function)

A list of Earth-like candidates? You wish

I wouldn't dignify these with the term "candidates", but in the pound stars folder I've put a zipped "gallery" of PNG images of the two most promising signals for each "pound star" (i.e. 72 bright, low-noise stars with known small, short-period planets). You'll see some examples of possible signals, and even more examples of detrending failures and transit-duration-timescale stellar variability (e.g. an especially deep wiggle among tons of similar wiggles).

A couple of examples (attached images): 7286173_2 and 7364176_1.

The colors represent a rainbow of time, from early (purple) to late (red). If the transit seems to be mostly one color (i.e. it only occurs at one time), or shows different depths at different times/colors, be very suspicious.

Supposedly my program automatically masked out the signals of the known planets in each system, but I haven't checked that this succeeded for every star (it worked for the several that I checked). Also, it's very possible that my detrending introduced transit signals or distorted true transit signals (see my other post, coming soon, about the recovery rate of Dan's injected planets; about 24% of planets are being detected, and all are above 120 ppm transit depth). Speaking of which, Ruth and Billy have been investigating how detrending distorts transits; you'll see that in our slides when we post them (along with some interesting points and results from Hannu).

If any of these not-really-candidates catches your eye, let me know if you'd like me to take a closer look. Let me know if you'd like any more information or even if you'd like to have a "how the other half lives" type experience by looking at some of my IDL code.

The exact process through which they were obtained (a minimal sketch of steps 4-5 follows below):

  1. Detrend the PDC using a running median filter on each segment (between gaps)
  2. Trim away the edges of each segment
  3. Use wavelets to interpolate between gaps
  4. At each datapoint, remove a 13 hour/80 ppm transit and record the delta likelihood if it is greater than zero [Note: this is equivalent to imposing a super-strict prior; I'm going to try imposing a weaker prior soon]
  5. Fold and sum the likelihoods to identify the best period

As you'll see in the other post, at the moment the chi^2 likelihood is working better than (my implementation of) the wavelet likelihood for this search process (i.e. performing better at recovering Dan's planets), but I'm hopeful that can be improved on.
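
A minimal sketch of steps 4-5, assuming flux normalized to 1, Gaussian errors, and a box-shaped transit; all names and defaults here are illustrative, not Bekki's actual code:

import numpy as np

def box_delta_loglike(time, flux, err, t0, duration=13.0 / 24.0, depth=80e-6):
    """Delta log-likelihood from inserting a box transit of given depth at t0."""
    m = np.abs(time - t0) < 0.5 * duration
    r = flux[m] - 1.0  # residuals from the flat (no-transit) model
    # 0.5 * (chi^2 of flat model - chi^2 of box model with in-transit flux 1 - depth)
    return 0.5 * (np.sum((r / err[m]) ** 2) - np.sum(((r + depth) / err[m]) ** 2))

def fold_and_sum(time, dll, periods, n_bins=200):
    """Fold the positive delta log-likelihoods at each trial period; return the peak."""
    dll = np.clip(dll, 0.0, None)  # only improvements count (the 'strict prior')
    scores = np.empty(len(periods))
    for i, period in enumerate(periods):
        bins = ((time % period) / period * n_bins).astype(int)
        scores[i] = np.bincount(bins, weights=dll, minlength=n_bins).max()
    return scores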

All the best,

Bekki

Tutorial for working with sandbox data

Hey @saturnaxis, can you add a little tutorial for how to interact with the Sandbox data? You've already pretty much written it on the SAMSI group.

How about a README.md file in the sandbox directory of this repository?

Make the Savitzky-Golay filter an importable module

I want to import the Savitzky-Golay filtering as a module and then call it like

import SGfilt  # obviously pick whatever name you choose
SGfilt.do_filtering(time, flux, **kwargs)

where the keywords take sensible defaults.
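
A minimal sketch of such a module, assuming scipy's savgol_filter as the backend and Kepler long-cadence sampling for the default window (the module and function names are the requester's placeholders; everything else is an assumption):

# SGfilt.py
from scipy.signal import savgol_filter

def do_filtering(time, flux, window_days=2.0, polyorder=3, cadence=0.0204):
    """Return flux divided by a Savitzky-Golay trend; keywords take sensible defaults."""
    # assumes nearly uniform sampling; `time` is kept to match the requested signature
    window = max(polyorder + 2, int(window_days / cadence))
    if window % 2 == 0:  # savgol_filter requires an odd window length
        window += 1
    trend = savgol_filter(flux, window, polyorder)
    return flux / trend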

Bekki, WTF are you doing?

@dawsonri how are you so awesome at finding candidates? What method are you using and what are your tricks? If there is a paper, point us to it. If there isn't, can we help you write it?

Proposal for common data format

We need a way of comparing the output of all of the detrending algorithms. This will (probably) involve something including but not limited to:

  • running a standardized search algorithm on all the different outputs
  • visualizing the results of the different methods in the same way/simultaneously
  • other things?

Anything that we do will benefit from a common output format for the codes (for obvious reasons).

I see 2 main options:

  1. ASCII tables (gasp!) with specified columns (kbjd, detrended_flux, detrended_flux_uncert, ...)
  2. FITS tables with the same format as the original Kepler data products (including the relevant metadata) with added columns with the same information as above

The first option is far easier to implement in any programming language (lowering the barrier to entry), so I'm inclined to go with that, but the second seems more useful (and self-contained) for the search phase, depending on what we decide to do.
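
A minimal sketch of option 1's round trip, assuming the three columns named above (the function names are hypothetical):

import numpy as np

COLUMNS = "kbjd detrended_flux detrended_flux_uncert"

def write_detrended(path, kbjd, flux, uncert):
    """Write the proposed three-column ASCII table with a header comment."""
    np.savetxt(path, np.column_stack([kbjd, flux, uncert]), header=COLUMNS)

def read_detrended(path):
    """Read the table back; any language with a whitespace-table reader can too."""
    return np.loadtxt(path, unpack=True)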

Thoughts?
