
contracosta's Introduction

contracosta

Exploring wavelength-dependent starspot contrast, now funded!

Content:

  • Gif animations for KeplerSciCon talk slides
  • TESS GI Cycle 2 proposal (Not selected)
  • TESS GI Cycle 4 proposal (Selected and funded 7/28/2021!)

Authors:

  • Michael Gully-Santiago (UT Austin, formerly Kepler/K2 GO Office)
  • Caroline Morley (UT Austin)
  • Ryan Hartung (UT Austin)

contracosta's People

Contributors

gully, ryanhartung


contracosta's Issues

Explain the method better, based on feedback from 2019 Panel review

The cycle 2 panel review states:

The proposal does not clearly justify whether the Kepler/TESS contrasts are sufficient to
determine spot coverage fractions. The method itself is adequately demonstrated, but it is unclear
from the TESS and Kepler throughput curves shown in Figure 1c and the transit depths shown in
Figure 1b that the wavelength separation between the two is sufficient for determining meaningful
constraints on spot coverage fractions.

EPIC 201189968 seems to have missing data

When I attempt to download the Kepler EVEREST lightcurve for "EPIC 201189968", I receive a large error message (attached as an image).

The file and my program were working perfectly fine on Friday.

Calculate the Temperature of the Starspots using A_TESS/A_Kepler and Teff

  • Create a new column in the data frame: T_nearest (each Teff rounded to the nearest 100 K)
  • Create a new column in the data frame: C_TESS
  • Create a new column in the data frame: C_Kepler
  • Create a new column in the data frame: T_spot:
    def get_Tspot(T_nearest, amp_ratio): ... return T_spot (where amp_ratio = A_TESS / A_Kepler)
    To find the intersection, find the minimum (numpy.argmin) of the absolute value of the difference of the two functions
  • Spot check the data by hand to make sure that we are getting the correct answers
  • Show the spot-checking procedures by hand (include graphs)
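As a sketch of the inversion step, here is a hypothetical `get_Tspot` that uses monochromatic blackbody contrasts (at assumed effective wavelengths of roughly 800 nm for TESS and 600 nm for Kepler) as a stand-in for the gollum model-spectrum contrast curves:

```python
import numpy as np

def planck(wav_m, temp_k):
    """Planck spectral radiance B_lambda; wavelength in meters, temperature in Kelvin."""
    h, c, k = 6.626e-34, 2.998e8, 1.381e-23
    return (2 * h * c**2 / wav_m**5) / np.expm1(h * c / (wav_m * k * temp_k))

def get_Tspot(T_nearest, amp_ratio, lam_tess=8.0e-7, lam_kepler=6.0e-7):
    """Invert the model curve A_TESS/A_Kepler(T_spot) = amp_ratio for T_spot.

    Assumption: monochromatic blackbody contrasts stand in for the
    gollum model-spectrum contrasts; the wavelengths are placeholders.
    """
    T_grid = np.arange(1000.0, T_nearest, 10.0)      # candidate spot temperatures
    c_tess = planck(lam_tess, T_grid) / planck(lam_tess, T_nearest)
    c_kepler = planck(lam_kepler, T_grid) / planck(lam_kepler, T_nearest)
    model_ratio = (1.0 - c_tess) / (1.0 - c_kepler)  # model A_TESS/A_Kepler per T_spot
    # The intersection is the minimum (numpy.argmin) of the absolute value of
    # the difference between the model curve and the measured ratio:
    i = np.argmin(np.abs(model_ratio - amp_ratio))
    return T_grid[i]
```

Ratios closer to 1 should map to cooler (blacker) spots, which is a quick spot-check of the inversion.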


Address feedback from Cycle 2: Cosmic Variance

The panel feedback in 2019 states this major weakness:

There is a risk to achieving the proposed goals based on the assumption of the similarity
between the distinct TESS/Kepler samples (cosmic variance). While the utility of using the
overlapping Kepler targets to determine relative contrast ratios and, thus, calibrate the TESS data
set, is well-motivated, the suggestion that the cosmic variance between the TESS and Kepler
samples average out, keeping subsample mean amplitudes the same, was not fully justified.

Convert TESS-to-K2 amplitude ratio into a starspot contrast assuming the peak is spot-free

Imagine the peak of the lightcurve represents a spot-free surface (i.e. a calm yellow Sun 🌞 )

Now imagine the valley (i.e. the "trough" or minimum) of the lightcurve represents the coexistence of a single dark reddish spot 🔴 and a surrounding yellow Solar-like disk (i.e. like this sunflower 🌻 )

The TESS-to-K2 amplitude ratio is some number, for a given star, let's say 0.6 (so the TESS amplitude is 60% as deep as the K2 amplitude).

If the spot were truly black ⚫, how big would it have to be? Note, though, that a truly black spot would have the same amplitude in TESS as it would in K2.

  1. What flux ratio does the spot have to have in order to achieve a 60% ratio in amplitude?
  2. Can you jointly constrain the size of the spot with this flux ratio?
  3. Write down a system of algebraic equations to relate the filling factor of spots $f_{\rm spot}$, the amplitude of TESS $A_{\rm TESS}$, the amplitude of K2 $A_{\rm K2}$, and the starspot contrast $c_{\rm spot}$.
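One way to set up such a system, under the simplest single-spot model (a sketch, not necessarily the intended answer):

```python
# Simplest single-spot model (an assumed sketch):
#   A_K2   = f_spot * (1 - c_K2)      amplitude if the trough hides one spot
#   A_TESS = f_spot * (1 - c_TESS)    same spot seen through the TESS filter
# The amplitude ratio then cancels f_spot:
#   A_TESS / A_K2 = (1 - c_TESS) / (1 - c_K2)
# so the ratio alone constrains the contrasts, but not the spot size.

def amplitude(f_spot, c_spot):
    """Peak-to-trough amplitude for one spot with filling factor f_spot and
    flux contrast c_spot (1 = invisible spot, 0 = perfectly black spot)."""
    return f_spot * (1.0 - c_spot)

# A truly black spot (c = 0) gives the same amplitude in both bands:
ratio_black = amplitude(0.02, 0.0) / amplitude(0.02, 0.0)   # -> 1.0

# A 60% amplitude ratio instead requires (1 - c_TESS) = 0.6 * (1 - c_K2),
# e.g. c_K2 = 0.5 forces c_TESS = 0.7:
ratio_red = amplitude(0.02, 0.7) / amplitude(0.02, 0.5)     # -> 0.6
```

Because $f_{\rm spot}$ cancels in the ratio, question 2 above (jointly constraining spot size) needs additional information beyond the ratio itself.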

How to analyze outliers in K2 or TESS data

Analyze the K2 data by cutting each star's lightcurve into 3 subsets of 27 days each, taken from the beginning, middle, and end, to remove outliers. This isn't an urgent issue, just something to keep in mind.

How to filter TESS search results when many are available?

We have a problem of abundance: lightkurve search queries to MAST often return more than one result: different data-reduction authors, different sectors, and different exposure times (aka "cadence"). While it would be nice to handle all of these in a systematic way, doing so becomes a "big data" and/or heterogeneous-data challenge.

I propose we solve this problem in two steps.
For now: simply pick whichever lightcurve comes first in the search result: lc = sr[0]. This choice will help us move forward with building our analysis tools.

But eventually, I think we should adopt the ELEANOR-LITE standard (#11). These are not available yet (see the referenced issue), but when they are, we can simply flip the switch to use those exclusively.

List of things to complete

  • Rerun the data using normalized light curves
  • Recreate the figures and replace the current ones for the paper
  • Start working on creating gollum heat maps comparing TESS and Kepler star spot ratios
  • Add more citations to the first table of the paper
  • Recreate the 04_01 notebook and spot check the values in the plot and sample
  • Take a spectrum from gollum, resample it, and find the weighted average over the TESS and Kepler response curves (specutils graph)
  • Go to the gollum tutorials and follow the resampling example
  • Download the two tess_response figures to use for data

8/1/22:

  • Finish histogram on notebook 02_04

Downsize the total sample to a "pathfinder sample"

We downloaded and inspected the entire Table 2 of Reinhold & Hekker 2020; it totals over 30,000 sources. @RyanHartung and I decided we should cut the sample down to a more manageable subsample, which will serve as a pathfinder for the overall project. For now we think we should cut based on Teff (i.e. spectral type).
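A minimal pandas sketch of such a cut, with hypothetical column names and example Teff bounds (the real table's columns and the final bounds may differ):

```python
import pandas as pd

# Hypothetical stand-in for the Reinhold & Hekker Table 2; adapt the
# column names to the actual catalog.
df = pd.DataFrame({
    "EPIC": [201189968, 201234567, 201345678, 201456789],
    "Teff": [3500, 4800, 5700, 6400],
})

# Example: keep roughly K-type stars as the pathfinder subsample
pathfinder = df[(df["Teff"] >= 4000) & (df["Teff"] < 5300)].reset_index(drop=True)
```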

Port some of the Proposal text into the paper draft

We wrote a nice, award-winning proposal. We should recycle some of that proposal text to serve as the introduction and methods sections of the paper. There are useful equations that we could and should expand upon.

The audience and aims for a paper differ from those of a proposal, so we will retool some wording and organization. But the metaphorical bones are there.

Quantify what fraction of our sample "worked": meaning the period and amplitude are trustworthy

We want to quantify what fraction are in the green box, compared to the total sample:


@RyanHartung reports that out of 4,196 attempted TESS sources, only 1,085 had postage-stamp data available, and therefore lightcurves on MAST. So about 25.9% had data available. We expect that with the addition of eleanor-lite, we should get close to 99% of sources with a ready-made (and hopefully trustworthy) lightcurve.

We have essentially convinced ourselves that it will be very difficult to measure periods greater than about 7 days, so we are effectively restricted to even less than the rapid-rotator sample (which had been defined as <10 days). 222 sources came up with TESS periods greater than 7 days (these are probably not trustworthy). Therefore, about 20% have to be discarded, leaving us with at most about 3,300 sources.

Of those 3,300, some fraction will suffer from various lightcurve data artifacts and will have to be discarded. We are targeting a "survival rate" of about 55%, which would yield about 1,800 sources. We want to make the survival rate as high as possible, but there is an economic tradeoff in time; the 55% number reflects a cost/benefit analysis of the value of our time spent scrutinizing all the flaws in the lightcurves.
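The bookkeeping above can be sketched as follows; the eleanor-lite coverage and survival-rate numbers are the assumed values quoted in this issue, not measurements:

```python
# Bookkeeping for the sample-size projections (assumptions flagged inline)
attempted = 4196
with_data = 1085
available_frac = with_data / attempted             # ~25.9% had MAST lightcurves

eleanor_frac = 0.99                                # assumed future eleanor-lite coverage
long_period_frac = 222 / with_data                 # ~20% with untrustworthy P > 7 d
survival_rate = 0.55                               # targeted artifact survival rate

# Projected final sample once eleanor-lite lightcurves are available:
projected = attempted * eleanor_frac * (1 - long_period_frac) * survival_rate
```

This reproduces the ~3,300 intermediate and ~1,800 final figures quoted above.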

Pathfinder exploration 2: Can we see hints of starspot contrast "Locus" in the pathfinder sample?

Recreate the plot from the proposal:
fig2.pdf

The black points should be the K2 "EVEREST" points, with plot coordinates $(x,y)$ representing (Period EVEREST, Amplitude EVEREST).

Then, plot the TESS points as different color dots, with plot coordinates $(x,y)$ representing (Period TESS, Amplitude TESS). Our prediction is that those TESS points should fall in the orange region.

Was our prediction correct?? Why or why not? Do we need more data to answer?

Explore the degeneracy of Amplitude ratio, f_spot, T_spot, and T_ambient

All we measure for a given star is an amplitude ratio between TESS and K2. There is a subtle "self dilution effect" that we allude to in the proposal. Most practitioners have not contemplated this effect, so there is not a great reference for it. This paper can help you think about the degeneracies though. Here is my attempt. Imagine three stars:


The simple spot blocks some amount of bright yellow light compared to the spot-free scenario: let's say 2% of the area with a 0.5 contrast, so 1% of the flux is lost in the simple-spot case compared to the spot-free case.

But now consider the star with a major polar cap of spots that is seen at all rotational phases. There, the simple spot tends to block the brighter-than-average flux, so that same 2% of disk area blocks, say, 2.8% of the yellow light.

So mapping between amplitude of modulation ratio, f_spot, T_spot, and T_ambient is subtle and partially degenerate.

There is a way to quantify the degeneracy: simply brute-force compute a grid across all combinations of f_spot and T_spot that are consistent with:

  1. The TESS-to-K2 amplitude ratio
  2. The T_eff reported in the catalog we are using (set T_eff = T_ambient for simplicity)

For each star we will then have a heatmap of $f_{spot}$ and $T_{spot}$. It is unwieldy to work with these heat maps for many stars! So we will have to combine them in some way; that will be another task. For now let's get familiar with this idea of degeneracy and think about how we may scale out this analysis.
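A minimal sketch of the brute-force grid, again using monochromatic blackbody contrasts (at assumed wavelengths) in place of gollum model spectra, and a hypothetical measured ratio:

```python
import numpy as np

def planck(wav_m, temp_k):
    """Planck spectral radiance; wavelength in meters, temperature in Kelvin."""
    h, c, k = 6.626e-34, 2.998e8, 1.381e-23
    return (2 * h * c**2 / wav_m**5) / np.expm1(h * c / (wav_m * k * temp_k))

T_amb = 5700.0                                   # set T_eff = T_ambient for simplicity
f_grid = np.linspace(0.005, 0.5, 100)            # candidate filling factors
T_grid = np.linspace(1500.0, T_amb - 50.0, 200)  # candidate spot temperatures

# Monochromatic blackbody contrasts as a stand-in for gollum model spectra
c_tess = planck(8.0e-7, T_grid) / planck(8.0e-7, T_amb)
c_kepler = planck(6.0e-7, T_grid) / planck(6.0e-7, T_amb)
model_ratio = (1.0 - c_tess) / (1.0 - c_kepler)  # note: f_spot cancels out

observed_ratio = 0.85                            # hypothetical measurement
tol = 0.01
consistent = np.abs(model_ratio - observed_ratio) < tol
heatmap = np.tile(consistent, (f_grid.size, 1))  # shape (n_f, n_T)
# Every row of the heatmap is identical: in this simple model, the
# amplitude ratio alone leaves f_spot completely unconstrained.
```

The fact that every `f_spot` row is identical is exactly the degeneracy this issue asks us to build intuition for.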

Pathfinder exploration 1: Compare periods measured from Kepler/K2 to periods measured from TESS.

Now that we have a pathfinder sample (#8) and a populated metadata table (#13), we should explore what the data are telling us. First up is to assess whether the periods derived from K2 data (e.g. EVEREST) equal the periods derived from TESS data. To that end, I recommend:

Make a plot with the following characteristics:

  • the plot should be square plt.figure(figsize=(6,6))
  • the plot $x$ axis should go from 0.5 to 10 days
  • the plot $y$ axis should go from 0.5 to 10 days
  • the $x$ axis should be "Period, Kepler"
  • the $y$ axis should be "Period, TESS"
  • plot a diagonal line (hint: connect the $(x, y)$ coordinates (0.5, 0.5) and (10, 10); the line between them is the diagonal)

Plot each star as a point at its $(x, y)$ coordinates. The points will line up along the diagonal line if the periods match, e.g. (6.2, 6.2) and (3.4, 3.4) and so on. Points that veer off this line have mismatched periods, e.g. (4.1, 3.7) or (9.1, 1.3).

So making this plot forms a "quality assurance" experiment at-a-glance.
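A sketch of the plot described above, with hypothetical period values standing in for the real DataFrame columns:

```python
import matplotlib
matplotlib.use("Agg")        # render off-screen (no display needed)
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical period measurements; swap in the real DataFrame columns.
p_kepler = np.array([6.2, 3.4, 4.1, 9.1])
p_tess = np.array([6.2, 3.4, 3.7, 1.3])

fig, ax = plt.subplots(figsize=(6, 6))                 # square plot
ax.plot([0.5, 10], [0.5, 10], ls="--", color="gray")   # matching-period diagonal
ax.scatter(p_kepler, p_tess, s=20)                     # one point per star
ax.set_xlim(0.5, 10)
ax.set_ylim(0.5, 10)
ax.set_xlabel("Period, Kepler")
ax.set_ylabel("Period, TESS")
fig.savefig("period_qa.png", dpi=150)
```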

Explore how TESS-to-K2 contrast maps to T_spot/T_ambient ratio

The ratio that you derived in #25 is a single number between zero and one. For a given star, the physics of what you are measuring is the flux ratio of the spot in one color (the TESS filter, "red") to some other color (the Kepler filter, roughly blue-green-orange; it is "white light").

That scalar number has a physical origin and physical interpretation: temperature.

The temperature of the spot is lower than the temperature of the ambient photosphere.

In this assignment, we are asking you to find a relationship between the contrast measured in #25 and the physical temperature of the spot. There are two main ways to obtain this relationship: A) through the ratio of blackbody radiation, or B) through the use of model photospheres (Figure 1 from the TESS Cycle 4 proposal text). I recommend we use the latter. You will have to install gollum to get the model atmospheres.

Pathological examples to check for understanding:

What contrast ratio would you measure if the spot were absolute zero Kelvin and the ambient photosphere were 5700 Kelvin?

What contrast ratio (approximately) would you measure if the spot were 5699.999 Kelvin and the ambient photosphere were 5700 Kelvin?
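These two limiting cases can be checked with a monochromatic blackbody sketch (the assumed ~800 nm wavelength is a crude stand-in for the full TESS bandpass):

```python
import numpy as np

def planck(wav_m, temp_k):
    """Planck spectral radiance; wavelength in meters, temperature in Kelvin."""
    h, c, k = 6.626e-34, 2.998e8, 1.381e-23
    return (2 * h * c**2 / wav_m**5) / np.expm1(h * c / (wav_m * k * temp_k))

def contrast(T_spot, T_amb, wav_m=8.0e-7):
    """Monochromatic blackbody flux ratio spot/photosphere (assumed wavelength)."""
    if T_spot == 0.0:
        return 0.0       # a 0 K spot emits nothing: perfectly black, contrast 0
    return float(planck(wav_m, T_spot) / planck(wav_m, T_amb))

contrast(0.0, 5700.0)        # -> 0.0 (perfectly black spot)
contrast(5699.999, 5700.0)   # just under 1.0 (a nearly invisible spot)
```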

Subtasks for this goal:

  1. Install gollum
  2. Replicate Figure 1 from the proposal (ask for the Jupyter notebook that generated it), adapting it to use gollum behind the scenes.

Quality assurance: Spot check outliers

The plots in #14, #15, and #16 are likely to yield outliers: points that are way outside the expected range. Randomly select one of these outliers by finding it in the DataFrame. Make two plots for this one object:

  1. Normalized EVEREST lightcurve (with horizontal lines indicating flux loss)
  2. Normalized TESS lightcurve (with horizontal lines indicating flux loss)

Assign a period "by eye". Does it match the automatically computed period? Why or why not?
Compute an amplitude by eye. Does it match the 5%-95% percentile-based estimate we made? Why or why not?
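Assuming the percentile-based estimate mentioned above is the 95th-minus-5th percentile of the normalized flux, here is a sanity-check sketch on a synthetic lightcurve:

```python
import numpy as np

def percentile_amplitude(flux):
    """Robust amplitude estimate: 95th minus 5th percentile of the
    normalized flux (assumed definition of the estimate in the text)."""
    lo, hi = np.percentile(flux, [5, 95])
    return hi - lo

# Synthetic sinusoid with 2% peak-to-trough modulation as a sanity check
t = np.linspace(0.0, 27.0, 2000)
flux = 1.0 + 0.01 * np.sin(2 * np.pi * t / 3.4)
amp = percentile_amplitude(flux)   # slightly below 0.02, by construction
```

The percentile clipping means the estimate sits a bit below the true peak-to-trough amplitude, which is useful to remember when comparing it to an "by eye" value.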

Create a new "Rapid Rotator" subsample that should constitute about 10% of the entire sample

@RyanHartung estimates that expanding to the entire 35,000+ source K2 sample would "merely" take about a day or two. So let's consider expanding the pathfinder sample to something a bit larger, maybe 3,000 sources. This sample represents a "scale-up test", with about 10% of the entire sample.

We can have three samples:

  1. Pathfinder (413, or about 1%)
  2. Rapid Rotator ( $\sim$ 3000, or about 10%)
  3. Full sample (35,000+, or 100%)

I recommend that this sample has:

  1. No cut on $T_{\mathrm{eff}}$
  2. Period $<10$ days
  3. No cut on Amplitude

You may need to pick a period threshold different from 10 days in order to achieve our goal of 3000 sources in the sample. My guess is that threshold is somewhere in the range: $P_{cut} \in (7, 13)$ days, so we can guess-and-check. The 3000 number is approximate, so anywhere between 3000 and 3500 sources is fine.
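The guess-and-check loop might look like this sketch, run here on a synthetic period catalog (the real period distribution, and therefore the real threshold, will differ):

```python
import numpy as np
import pandas as pd

# Synthetic catalog standing in for the full ~35,000-source K2 sample;
# the uniform period distribution is an assumption for illustration only.
rng = np.random.default_rng(42)
df = pd.DataFrame({"Period": rng.uniform(0.5, 100.0, size=35000)})

# Guess-and-check thresholds in the (7, 13)-day range until the subsample
# lands between 3,000 and 3,500 sources:
for p_cut in np.arange(7.0, 13.1, 0.5):
    n = int((df["Period"] < p_cut).sum())
    if 3000 <= n <= 3500:
        break
```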

TESS Cycle 4 proposal

The TESS Cycle 4 proposal is due on Friday, January 22, 2021. We plan to adapt our unfunded Cycle 2 proposal for this cycle.

Scaling out the K2/TESS Period/Amplitude analysis to the "pathfinder sample"

Ok! We have identified a pathfinder sample with 413 sources (rows in the table). We now want to compute the period and amplitude for each source's TESS and K2 lightcurves. So for each source we want to populate new entries in the dataframe:


Currently those entries are populated with NaN placeholders.

We also want to find out how long this process takes, so let's write down the computation time of the for-loop step. (Note that lightkurve auto-caches, so this number will likely reflect only the computation time and not the download time.)
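A minimal timing sketch using `time.perf_counter`; `analyze` here is a placeholder for the real per-source period/amplitude computation:

```python
import time

# `analyze` is a hypothetical stand-in for the real work done on one
# source (download, period search, amplitude measurement, ...).
def analyze(source_id):
    return sum(i * i for i in range(1000))   # stand-in workload

source_ids = range(100)
t0 = time.perf_counter()
results = [analyze(s) for s in source_ids]
elapsed = time.perf_counter() - t0
print(f"{elapsed:.3f} s total, {1000 * elapsed / len(results):.3f} ms per source")
```

The per-source figure is the useful number: multiplying it by the full sample size gives a rough estimate of the scale-up cost.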

Set up a LaTeX paper draft in the repository

We will stick to using GitHub to house our $\LaTeX$ paper draft for now. At some point we may migrate to Overleaf to facilitate real-time sharing. Until then, I like the idea of the paper remaining part of the overall project revision history. Also, writing $\LaTeX$ in VS Code is convenient and works offline.

The one downside is the prospect of merge conflicts, so if you are working on the paper, simply be sure to git pull and git push often to avoid getting too out of sync.

Pathfinder exploration 3: Comparison to the Reinhold & Hekker 2020 paper

Remake the figures in #14 and #15, but with the following change:

Rather than comparing EVEREST to TESS, compare EVEREST to Reinhold & Hekker's (Period, Amplitudes).

We should get the same answer, since they used the same K2 data. There will likely be some differences, though; those differences probe "repeatability" and serve as another important quality-assurance step.
