
Comments (8)

JessicaS11 avatar JessicaS11 commented on June 16, 2024

Hello @arrran! Thanks for getting in touch. Could you please provide some more information on the inputs you're using so I can recreate the problem you are having? For example, the exact commands you are running (including their inputs) that are giving you an empty output. gda_lib.ATL08_to_dict() takes two inputs, a list of files and a dictionary of dataset variable names/paths, but you only mentioned which values you used for the latter. In addition, the .head() method must be called on a dataframe (in this case, we're using a geodataframe), but gda_lib.ATL08_to_dict() returns a dictionary, not a dataframe.

Thanks for using icepyx - we'd love to have you join the team and learn more about your examples/use cases!

from icepyx.

arrran avatar arrran commented on June 16, 2024

Hi Jessica,
Thanks for the help!

I'm following your spatial_subsetting_vis.ipynb notebook. The only difference is that I am using a different polygon (points attached in a txt file), and I am not using a subregion.

If you look at your spatial_subsetting_vis.ipynb notebook on this GitHub, it already has the same error message. Scroll down to Out [31]: UnboundLocalError: local variable 'df_final' referenced before assignment

I was wondering how, despite that error, you managed to jump to In [14], which has the data in a geodataframe. Did you load it in a command which later got deleted?

The Out [31] error is solved by adding df_final = [] after line 131 of gda_lib.py.
With that fix, I still couldn't load the data into a geodataframe, as data_dict = ATL08_to_dict(ATL06_fn,dataset_dict) is producing an empty list.

study_area_buffer_points.txt

Thanks very much for the help!!
arran


arrran avatar arrran commented on June 16, 2024

Attached here is the code I followed, in case my description doesn't make sense.
It's just the code from your Jupyter notebook spatial_subsetting_vis.ipynb, except I have copied in the functions that I was fiddling with.

Cheers!


JessicaS11 avatar JessicaS11 commented on June 16, 2024

Hello Arran,
Of course! I think I see what the problem is (and why my initial answer may not have been super helpful). When you referenced the filename and commands you were getting errors on, I immediately looked where those functions are called in the relevant example (icepyx/doc/examples/ICESat-2_DEM_comparison_Colombia_working.ipynb). I failed to notice that you were referencing one of the dev-notebooks, which are not meant to be standalone/runnable examples. Rather, the dev-notebook directory is where we put our experimental notebooks as we're creating them and experimenting with and debugging the code. It sounds like we should do a better job hiding them in the directory structure!

That said, I looked into what's going on. From looking at the error message in the dev-notebook the error is actually being produced in the ATL08_to_dict function of topolib, and the problem arises when an earlier step is erroneously producing an empty dataframe (which is why adding df_final=[] removed that error but didn't solve the underlying issue). From looking farther up in the code you shared, it looks like you are searching and downloading ATL06 data, but then trying to extract ATL08 parameters. My guess is that the ATL06 data doesn't have the data you are trying to extract using the dataset_dict variable, which is for ATL08. Thus, an empty dataframe is returned and you ultimately run into the errors you've encountered.
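The failure mode described above can be reproduced in isolation. The sketch below is not the topolib code: `collect_rows`, `collect_rows_patched`, and the toy file contents are hypothetical stand-ins, written on the assumption that `df_final` is only assigned once at least one file yields the requested data.

```python
def collect_rows(files, wanted_key):
    """Hypothetical reader: the result variable is only bound when
    at least one file contains the requested key."""
    for contents in files:
        if wanted_key in contents:
            df_final = contents[wanted_key]
    # If no file had the key, df_final was never assigned.
    return df_final


def collect_rows_patched(files, wanted_key):
    """Same logic with the `df_final = []` workaround applied."""
    df_final = []
    for contents in files:
        if wanted_key in contents:
            df_final = contents[wanted_key]
    return df_final


# Toy "files" holding an ATL06-style group, queried with an ATL08-style key.
atl06_like_files = [{"land_ice_segments": [1.0, 2.0]}]

try:
    collect_rows(atl06_like_files, "land_segments")
    error_name = None
except UnboundLocalError as err:
    error_name = type(err).__name__

patched_result = collect_rows_patched(atl06_like_files, "land_segments")

print(error_name)
print(patched_result)
```

The patched version no longer raises, but it still returns nothing useful, which matches the empty result reported earlier in the thread: the workaround hides the symptom while the key mismatch remains.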

As for solutions, assuming that's the issue, your next step will depend on which dataset you'd actually like to use. If it's ATL08, you will need to edit your icepyx object to order ATL08 instead of ATL06. If it's ATL06, you'll have to figure out which variables and paths you'd like to extract into your dataframe and update the dataset_dict accordingly. We're in the process of working on some major code changes to make it easier to use and provide some default variable lists for each dataset, but we haven't finished compiling those yet. We'd love to have your input as a data user!
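One way to catch this kind of product mismatch before reading any files is to compare the requested paths against the paths a product is known to contain. The sketch below assumes `dataset_dict` maps a group path to a list of variable names; that shape, the helper `missing_paths`, and the short path lists are illustrative assumptions rather than the library's API, and the variable names should be verified against the ATL06 and ATL08 data dictionaries.

```python
# Small, hand-picked samples of per-beam variable paths. The real products
# contain many more; check names against the product data dictionaries.
ATL06_PATHS = {
    "land_ice_segments/h_li",
    "land_ice_segments/latitude",
    "land_ice_segments/longitude",
    "land_ice_segments/delta_time",
    "land_ice_segments/atl06_quality_summary",
}


def missing_paths(dataset_dict, available):
    """Return requested 'group/variable' paths that are absent from `available`."""
    requested = [
        f"{group}/{name}"
        for group, names in dataset_dict.items()
        for name in names
    ]
    return [path for path in requested if path not in available]


# An ATL08-style request (assumed shape: group -> list of variable names).
atl08_dict = {
    "land_segments": ["latitude", "longitude"],
    "land_segments/terrain": ["h_te_best_fit"],
}

# An ATL06-style request using the same assumed shape.
atl06_dict = {
    "land_ice_segments": ["h_li", "latitude", "longitude", "delta_time"],
}

missing_for_atl08_request = missing_paths(atl08_dict, ATL06_PATHS)
missing_for_atl06_request = missing_paths(atl06_dict, ATL06_PATHS)

print(missing_for_atl08_request)
print(missing_for_atl06_request)
```

Every path in the ATL08-style request is missing from the ATL06 sample, which is consistent with an empty result when ATL06 files are read with an ATL08 dictionary.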


arrran avatar arrran commented on June 16, 2024

Okay great! Yeah, that makes sense. Whoops, as well as looking at the wrong notebooks I had 08/06 mixed up.

I am using ATL06, just for the higher res (though I don't really know much).

I just need to set the right variables in the dictionary. The goal is to get the data I have downloaded into a geodataframe. Are there any notebooks which do this for ATL06?
I can see the huge ATL06-data-dictionary-v001, but it would probably be easier to follow someone who knew what they were doing, if that's available.

Thanks very much for the help,


JessicaS11 avatar JessicaS11 commented on June 16, 2024

I'm glad to hear you were able to sort out the errors!

Currently there's no collated resource I can point you towards for ATL06, though the information does exist. Some potential places to look:

  • We're currently working on a tutorial notebook for variable subsetting, which includes building a dictionary for the desired data variables. You're welcome to view the in-progress notebook.
  • Last year's ICESat-2 Hackweek topohack project (which you had to install to use the example notebook) had some notebook examples and code (topolib) related to ATL06.
  • If nothing else, I suspect you should be able to use icepyx to build a variable dictionary that you can feed in as the dataset_dict parameter to the ATL08_to_dict function.
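As a rough illustration of the last suggestion, a flat list of full variable paths can be regrouped into a dictionary keyed by group. This is a sketch under assumptions: `paths_to_dataset_dict` is a hypothetical helper, the group-to-variable-list shape of `dataset_dict` is assumed rather than confirmed by the thread, and the input paths are typed in by hand here rather than obtained from icepyx.

```python
from collections import defaultdict

# ICESat-2 ground track (beam) group names.
BEAMS = {"gt1l", "gt1r", "gt2l", "gt2r", "gt3l", "gt3r"}


def paths_to_dataset_dict(paths):
    """Group full variable paths into {group_path: [variable names]}.

    The leading beam group is dropped so that one dictionary can be
    applied to every beam. Top-level variables (no group) are skipped.
    """
    grouped = defaultdict(list)
    for path in paths:
        parts = path.strip("/").split("/")
        if parts[0] in BEAMS:
            parts = parts[1:]
        if len(parts) < 2:
            continue  # no enclosing group to use as a key
        group, name = "/".join(parts[:-1]), parts[-1]
        if name not in grouped[group]:
            grouped[group].append(name)
    return dict(grouped)


# Hand-written ATL06-style paths, used only for illustration.
example_paths = [
    "gt1l/land_ice_segments/h_li",
    "gt1r/land_ice_segments/h_li",
    "gt1l/land_ice_segments/latitude",
    "gt1l/land_ice_segments/ground_track/x_atc",
]

dataset_dict = paths_to_dataset_dict(example_paths)
print(dataset_dict)
```

Dropping the beam prefix is a design choice that suits a reader which loops over beams itself; if the reader expects beam-specific paths, the prefix would need to be kept.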

I hope this helps - we'd love for you to contribute whatever you come up with to icepyx!


arrran avatar arrran commented on June 16, 2024

Ah yep, great, thanks, and yes, keen to contribute anything I write!

One thing I haven't figured out: in the notebooks, why do you often define a bounding box or boundary shapefile and then turn subset off (e.g. here)?

Also, with subset off, it doesn't seem to download all ICESat-2 data. What exactly is the subset?


JessicaS11 avatar JessicaS11 commented on June 16, 2024

Awesome - thanks, @arrran!

You've hit on a question that gets asked of the NSIDC a lot. If you take a look at the data access tutorial notebook, it outlines some of these steps in more detail. The gist of it is that there are actually several stages to data access. First is a simple query that runs quickly based on metadata (CMR format in this case) stored about each granule (without ever looking at any of the actual data files). Next is the actual data order step, which contacts the NSIDC and asks them to make sure the data you want is ready for download. This can include subsetting, which I'll explain more in a second. Third, you actually download the data that's been prepared for your order.

You're probably familiar with subsetting from your own work - we're used to downloading geospatial data as files containing large granules or scenes. Then, we have to extract the actual area (or time period or variables) we want from each file locally. For ICESat-2 data, NSIDC has a subsetting service which will do this as part of the ordering process, eliminating this step from your later workflow and significantly decreasing your file size (which is obviously helpful for both download times and storage). The subsetting service can do spatial, temporal, and variable subsetting as well as a few other types of file conversions - I'd highly recommend taking advantage of it if you don't need full granules! Sometimes, this means that you might have x granules returned by the query but x-y (where y>0) returned in the order. This is because the metadata used to search for the granules isn't perfect, so sometimes when the subsetter starts it determines that there's not actually any data that meet your search criteria within a given granule (I'm sure you've seen this phenomenon happening in online data visualization centers, where a tiny corner of an image is in your search area so the whole granule/scene is given as a result, even though most of it is outside your area of interest).
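The gap between the query count and the order count can be illustrated with a toy model. The sketch below is not the NSIDC subsetter: the granule names, footprints, and points are invented, and it assumes the metadata query only tests a granule's bounding-box footprint while the subsetting step tests the individual points.

```python
def boxes_overlap(a, b):
    """Boxes are (min_lon, min_lat, max_lon, max_lat)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def subset_points(points, box):
    """Keep only the (lon, lat) points that fall inside the box."""
    return [
        (lon, lat)
        for lon, lat in points
        if box[0] <= lon <= box[2] and box[1] <= lat <= box[3]
    ]


search_box = (0.0, 0.0, 10.0, 10.0)

# Invented granules: a coarse footprint plus the actual measurement points.
granules = {
    "granule_a": {"footprint": (5.0, 5.0, 20.0, 20.0),
                  "points": [(6.0, 7.0), (15.0, 18.0)]},
    "granule_b": {"footprint": (9.0, 9.0, 30.0, 30.0),
                  "points": [(12.0, 11.0), (25.0, 28.0)]},
    "granule_c": {"footprint": (40.0, 40.0, 50.0, 50.0),
                  "points": [(45.0, 45.0)]},
}

# Stage 1: metadata-only query, based on footprints.
query_hits = [
    name for name, g in granules.items()
    if boxes_overlap(g["footprint"], search_box)
]

# Stage 2: subsetting looks at the data and drops granules left empty.
subsetted = {
    name: subset_points(granules[name]["points"], search_box)
    for name in query_hits
}
order_results = {name: pts for name, pts in subsetted.items() if pts}

print(query_hits)
print(order_results)
```

Here the query returns two granules (x = 2), but only one survives subsetting (y = 1), because `granule_b` touches the search box with a corner of its footprint and has no points inside it.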
