
lecture-datascience.myst's People

Contributors

aadsouza, aakashgfude, alim-faraji, arnavs, doctor-phil, drdrij, jbrightuniverse, jlperla, mmcky, sglyon, wenxinm, wupeifan

lecture-datascience.myst's Issues

Add a single unit test to a datascience lecture

@mmcky proposed the solution of adding the following tag:

:tags: ["remove-cell"]

which should hide both input and output. He said that this way the cells will execute as part of jupyter-cache, but neither the input nor the output of the code-cell that carries the tag will be included in the html. Sounds good enough to me!

Now to verify that this works prior to applying it all over the julia code, we should add in a single unit test in one of the lectures. And we should do it in a very "low risk" place. I suggest going to https://github.com/QuantEcon/lecture-datascience.myst/blob/main/src/applications/visualization_rules.md and adding in something completely at the bottom of the lecture that does

```{code-cell} python
:tags: ["remove-cell"]
print("Unit test running")
assert True, "Failure in unit test"
```

Instead of at the bottom, maybe put it after the text and before something else just so we can see if it messes up spacing. For example change the text in that lecture to be

    ### Exercise 1
    
    Create a draft of the alternative way to organize time and education -- that is, have two subplots (one for each education level) and four groups of points (one for each year).
    ```{code-cell} python
    :tags: ["remove-cell"]
    print("Unit test running")
    assert True, "Failure in unit test"
    ```
    Why do you think they chose to organize the information as they did rather than this way?

OK, after adding that, you need to verify a few things:

  • Leave the assert True as is and generate the HTML. Verify that no funny business happened (e.g. no text shows up, there isn't a weird cell, etc.) and that the spacing wasn't drastically changed.
  • Leave the assert True as is and generate the jupyter notebooks. Check whether the generated notebooks include the "Unit test running" output, and maybe all of that code, in a cell.
    • If so, then make sure an issue is posted on the appropriate repo and link to it from this issue. We can launch the notebooks with that small quirk at the end of that notebook until the feature is implemented. People can just ignore the cell until then.
  • I can't remember if we have pdf support yet, but if we do then do the same thing. Hopefully the cell doesn't show up, but if it does then we should ensure an issue is posted - and not worry about the bug in the pdf output because the stakes are low.
  • Now modify the code to assert False etc. and run things to compile that file locally. Ensure that the error isn't just silent.
  • Make the change to assert False and then commit to a PR. Make sure that the error isn't just silent. This is a "breaking" change and it should be very visible in the CI in one form or another. If it stops the process of building the full netlify/etc. for the PR that is fine - we just need to make sure the errors aren't silent.
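
For the HTML check in the first bullet, a small script could do the looking for us. This is only a sketch: the function name `find_leaks` is made up for illustration, and it assumes the built pages live under something like `_build/html`, which may not match the actual build layout.

```python
from pathlib import Path

# The string printed by the hidden test cell in the issue above.
MARKER = "Unit test running"


def find_leaks(build_dir, marker=MARKER):
    """Return the built HTML files under build_dir whose text contains marker.

    build_dir is an assumption about where the site is written,
    e.g. "_build/html"; adjust it to the real output folder.
    """
    leaks = []
    for page in sorted(Path(build_dir).rglob("*.html")):
        if marker in page.read_text(encoding="utf-8", errors="ignore"):
            leaks.append(page)
    return leaks
```

An empty list from `find_leaks("_build/html")` would suggest the tag hides the output as intended. It says nothing about spacing, so that part still needs a visual check.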

conda and pip install packages on machine

# Install Python packages
conda install python-graphviz
conda install -c conda-forge "nodejs>=10.0" xgboost
pip install qeds fiona geopandas pyLDAvis gensim folium descartes pyarrow --upgrade

Proof of concept devcontainer

@WenxinM @aadsouza Anything that @arnavs gives you takes precedence, but on downtime we should see what it would take to get a .devcontainer working for this sort of thing, along with the appropriate vscode extensions. In particular, the python and jupyter extensions.

For this, I think the best way to do it is to create a test one in your own githubs, and put in a .py file and a .ipynb notebook, etc. from a couple of the notebooks in https://github.com/QuantEcon/quantecon-notebooks-datascience. You will also need some sort of environment.yml or requirements.txt file for dependencies at some point.

Then, install the remote containers extension in vscode, and use the Add Development Container Configuration Files command in https://code.visualstudio.com/docs/remote/create-dev-container#_automate-dev-container-creation to try out a "python" one. But it won't have jupyter installed. For that, you can install a conda-specific one instead. See https://github.com/microsoft/vscode-dev-containers/tree/master/containers/python-3-anaconda

If that anaconda one works, you can also install the vscode jupyter extension in https://github.com/microsoft/vscode-dev-containers/blob/master/containers/python-3-anaconda/.devcontainer/devcontainer.json#L26-L29 etc. when you modify it.

The idea would be that the new version of this notebook repository would have those extra folders in them, which would enable Github codespaces and vscode remote containers/etc. But if you get things working with a devcontainer for remote use in vscode, it will probably also work in the github codespaces. There aren't a lot of specific customizations required: https://code.visualstudio.com/docs/remote/codespaces

You can look into the docs for this sort of thing (e.g. https://code.visualstudio.com/docs/remote/create-dev-container) but I think that copying a preexisting one with conda setup is a better bet.

Issue List

  • Bibtex working with 2.0 (not essential, but helpful) (@arnavs)
  • Fix images in generated notebooks (@aadsouza, @WenxinM)
  • Development extras: devcontainer, .py generation, flattened folder in notebooks repo, local editing instructions, etc. (@arnavs)
  • Execution in theme -- not an issue, disabled manually by Aakash, just need to re-enable
  • Extras for theme merging
  • Prepare for migration here -- update screenshots, links, paths, etc. to point to this repo and new site (@WenxinM @aadsouza)
  • Do "nice-to-haves" (anything marked later)

Look for Whitespace Errors

Let's say @WenxinM does the first three sections, and @aadsouza does the last two (the first section, "Introduction", is just one lecture.)

Per @mmcky:

Sometimes a > will crop up in myst files that have been generated by sphinx-tomyst. That is because a lot of users get white space wrong in rst so any form of incorrect indentation wraps items in a block_quote object within the sphinx.ast. So these are something to keep an eye out for.

Let's find and fix these if possible. This should be doable based on the source itself. Basically find occurrences of that character and see if they correspond to a valid block quote on the website.
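
To narrow down the candidates before checking them against the website, something like the following could list every line that starts with `>` outside a fenced block. It is a rough sketch with made-up helper names (`find_blockquote_lines`, `scan_tree`), and it assumes the lectures are `.md` files under one source folder. It cannot tell a valid block quote from a stray one, so each hit still needs a manual look.

```python
import re
from pathlib import Path

# A fence line: optional indentation, then 3+ backticks or tildes.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")


def find_blockquote_lines(text):
    """Return (line_number, line) pairs for lines starting with '>' outside fences."""
    hits = []
    open_fence = None  # the fence string that opened the current block, if any
    for number, line in enumerate(text.splitlines(), start=1):
        match = FENCE.match(line)
        if match:
            fence = match.group(1)
            if open_fence is None:
                open_fence = fence
            elif (
                fence[0] == open_fence[0]
                and len(fence) >= len(open_fence)
                and line.strip() == fence
            ):
                open_fence = None  # a bare fence of sufficient length closes the block
            continue
        if open_fence is None and line.lstrip().startswith(">"):
            hits.append((number, line))
    return hits


def scan_tree(src_dir):
    """Map each .md file under src_dir that has at least one hit to its hits."""
    results = {}
    for path in sorted(Path(src_dir).rglob("*.md")):
        hits = find_blockquote_lines(path.read_text(encoding="utf-8"))
        if hits:
            results[path] = hits
    return results
```

Lines inside code cells (for example a Python comparison that wraps onto a new line) are skipped on purpose, since only markdown-level `>` characters become block quotes.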

[sphinx-tomyst] Add support for exercise_node

Currently this is an unknown node and it needs to be added to sphinx-tomyst.

We could migrate:

https://github.com/QuantEcon/sphinxcontrib-jupyter/blob/master/sphinxcontrib/jupyter/directive/exercise.py

across to a separate package to support exercise nodes and jupyter-book.

Alternatively we could swap to: sphinx-exercise

however there is currently a limitation in that code-blocks are not executable, as documented in this issue. In summary, execution for jupyter-book is handled by jupyter-cache, so it happens before the sphinx parse phase, which means it is more difficult to interpret any nested objects and map them to the notebook. They currently just get embedded in the markdown cells -- so they are there and syntax highlighted, but not executed.

code-cell does not render outputs

On the website, most code-cells do not render outputs.

Outputs are rendered in Python Fundamentals (except for Basics).
Outputs are not rendered in Applications, Pandas (except for Index), or Scientific Computing (except for Intro to Numpy).

Fix Outline Formatting

Right now, it's some weird standalone highlighted block.

It should look like a normal section, analogous to the current live lectures.

edit: cut section as outline on side

Change code-block into code-cell

When changing the exercise sections to markdown blocks, we noticed that some code-block directives are used in the text (not in the special content blocks). If those code-blocks are not executable, we might want to change all or some of them into code-cell directives, depending on the content.
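
If we do decide to convert some of them in bulk, a regex pass along these lines might be a starting point. It is a sketch under a couple of assumptions: the helper name `to_code_cells` is made up, and it only touches fence openers written as `{code-block} python` (or `python3`), leaving every other language alone. Since the decision depends on the content, the resulting diff should be reviewed by hand rather than applied blindly.

```python
import re

# Matches a fence opener such as "```{code-block} python" on its own line.
OPENER = re.compile(
    r"^([ \t]*`{3,})\{code-block\}([ \t]+python3?[ \t]*)$",
    re.MULTILINE,
)


def to_code_cells(text):
    """Rewrite Python code-block openers as code-cell openers; leave the rest as-is."""
    return OPENER.sub(r"\1{code-cell}\2", text)
```

Options that are valid for code-block but not for code-cell (if any are used in the lectures) would survive the rewrite unchanged, so they need a separate check.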

Literal Include Fix

WARNING: Include file `.../lecture-datascience.myst/src/applications/_static/colab_full.raw` not found or reading it failed

Note that _static is not in the sub-repos. The same issue occurs across lectures that use the literal include directive.

Exercise ````

Many of the exercise directives are preceded by four tick marks (see below).

````{exercise}

But they aren't closed with four tick marks, which leads to formatting errors.
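
A quick way to find the affected files might be to walk each one and track whether a backtick fence is still open at the end. The sketch below follows the usual markdown rule that a fence is closed only by a bare fence with at least as many backticks as the opener. The function name `first_unclosed_fence` is made up for illustration, and it reports only the first unclosed fence per file, because everything after it is swallowed into that block.

```python
import re

# A backtick fence line: up to three spaces, 3+ backticks, then an optional info string.
FENCE = re.compile(r"^ {0,3}(`{3,})(.*)$")


def first_unclosed_fence(text):
    """Return (line_number, line) of the first backtick fence left open, else None."""
    opener = None  # (tick_count, line_number, line) of the fence currently open
    for number, line in enumerate(text.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        ticks = len(match.group(1))
        rest = match.group(2).strip()
        if opener is None:
            opener = (ticks, number, line)
        elif rest == "" and ticks >= opener[0]:
            opener = None  # a bare fence at least as long as the opener closes it
    return None if opener is None else (opener[1], opener[2])
```

A four-tick `{exercise}` opener followed only by three-tick closers is reported, while the same text ending in a four-tick line comes back as `None`.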

Pygment and Literal block

lexer name errors

...\src\pandas\storage_formats.md:317: WARNING: Pygments lexer name 'markdown' is not known

...\src\introduction\local_install.md:145: WARNING: Could not lex literal_block as "powershell". Highlighting skipped.

Problem sets access from website

When typing "problem set" into the search box, people can access the problem set files, although they do not show up in the table of contents. Not sure if this is what we want.

Remove sphinx-exercise dependency

  • Change exercise section at bottom of each lecture to vanilla markdown sections.
  • Change "see exercise below" exercise directives to native jupyter-book note directives.

edit:

  • Change "see exercise below" exercise directives to native jupyter-book note admonition directives with exercise title.
  • code-block to executable code-cell

Cleanup Issues General

  • moving files
WARNING: logo file '_static/qe-logo.png' does not exist
WARNING: favicon file '_static/lectures-favicon.ico' does not exist
  • toctree documentation thing

Hint Directives

Change "Hint: foo bar" to hint directives on all exercises.
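
For the simple one-line cases this could be scripted. The sketch below assumes each hint sits on its own line and starts with the literal text `Hint:`; the helper name `to_hint_directives` is made up. Hints that span several lines, or that sit inside another three-tick directive, would need manual handling.

```python
import re

# Built programmatically so this example nests cleanly inside a markdown fence.
FENCE = "`" * 3

# A line that starts with "Hint:" followed by the hint text.
HINT_LINE = re.compile(r"^Hint:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def to_hint_directives(text):
    """Wrap each single-line 'Hint: ...' in a {hint} directive block."""

    def replace(match):
        return f"{FENCE}{{hint}}\n{match.group(1)}\n{FENCE}"

    return HINT_LINE.sub(replace, text)
```

Text that merely mentions the word "hint" mid-sentence is left alone, since the pattern is anchored to the start of the line.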

Invalid links in the notebooks

There are two invalid links in the current notebooks, possibly due to the nested structure of the exercise.

  • One is in Lecture Randomness

(https://datascience.quantecon.org/scientific/randomness.html or https://5ff76619566d1a00a3e5f162--nervous-villani-f8256d.netlify.app/scientific/randomness.html)
There should be a link to the lecture Applied Linear Algebra in exercise 3, but it does not show up. The corresponding code is at line 484 in the md file (I couldn't find a way to link to the code in the md files):

**Exercise 3**

Let's revisit the unemployment example from the {doc}`linear algebra lecture <applied_linalg>`.

We'll repeat necessary details here.
  • The other is in Lecture Introduction to Numpy

There should be a link to the lecture Control Flow in exercise 5. It does not show up here https://datascience.quantecon.org/scientific/numpy_arrays.html but works as expected here https://5ff76619566d1a00a3e5f162--nervous-villani-f8256d.netlify.app/scientific/numpy_arrays.html. The corresponding code is line 417 in the md file:

**Exercise 5**

Let's revisit a bond pricing example we saw in {doc}`Control flow <../python_fundamentals/control_flow>`.

@arnavs

Update Paths

e.g. for https://github.com/QuantEcon/lecture-datascience.notebooks

index page of lectures with toc-tree

Either suppress this, because the toc-tree is rendered in the margin, or fix it. The same issue occurs across sub-repos.

.../lecture-datascience.myst/src/applications/index.md:22: WARNING: Directive 'tableofcontents': No content permitted

:maxdepth: 2

applications/visualization_rules
applications/regression
applications/recidivism
applications/maps
applications/classification
applications/working_with_text
applications/ml_in_economics
applications/heterogeneity

Hard-code exercise list

Exercise list directive on bottom of all .md files with exercises does not render.

edit: hardcode exerciselist equivalent

[Migration] Warnings Issued by sphinx-tomyst build of myst

Here is the build output for the current sphinx-tomyst build of the myst files from lecture-source-ds

~/repos-collab/lecture-source-ds (sphinx-tomyst)
▶ make clean && make myst
Removing everything under '_build'...
Running Sphinx v3.1.2
making output directory... done
/lecture-source-ds/conf.py:432: RemovedInSphinx40Warning: The app.add_stylesheet() is deprecated. Please use app.add_css_file() instead.
  app.add_stylesheet('css/custom.css')
WARNING: html_static_path entry '_static' does not exist
building [mo]: targets for 0 po files that are out of date
building [myst]: targets for 47 source files that are out of date
updating environment: [new config] 47 added, 0 changed, 0 removed
checking for /lecture-source-ds/src/applications/applications.bib in bibtex cache... not found                         
parsing bibtex file /lecture-source-ds/src/applications/applications.bib... parsed 17 entries
checking for /lecture-source-ds/src/applications/applications.bib in bibtex cache... up to date                        
checking for /lecture-source-ds/src/applications/applications.bib in bibtex cache... up to date                        
checking for /lecture-source-ds/src/applications/applications.bib in bibtex cache... up to date                        
reading sources... [100%] scientific/randomness                                                                                                        
/lecture-source-ds/src/applications/regression.rst:271: WARNING: Inline interpreted text or phrase reference start-string without end-string.
/lecture-source-ds/src/applications/visualization_rules.rst:3: WARNING: Duplicate explicit target name: "this article".
/lecture-source-ds/src/python_fundamentals/basics.rst:897: WARNING: Inline emphasis start-string without end-string.
/lecture-source-ds/src/python_fundamentals/basics.rst:897: WARNING: Inline emphasis start-string without end-string.
looking for now-outdated files... none found
pickling environment... done
checking consistency... lecture-source-ds/src/problem_sets/index.rst: WARNING: document isn't included in any toctree
done
preparing documents... done
CONFIG [definition_list] support for definition list in myst requires:                                                                                 
            myst_deflist_enable = True
        to be specified in the conf.py
CONFIG [definition_list] support for definition list in myst requires:                                                                                 
            myst_deflist_enable = True
        to be specified in the conf.py
CONFIG [definition_list] support for definition list in myst requires:                                                                                 
            myst_deflist_enable = True
        to be specified in the conf.py
CONFIG [definition_list] support for definition list in myst requires:                                                                                 
            myst_deflist_enable = True
        to be specified in the conf.py
writing output... [100%] scientific/randomness                                                                                                         
WARNING: SKIP python_fundamentals/collections [transition] objects are not supported by sphinx-tomyst
build succeeded, 7 warnings.
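
To triage output like this, it might help to tally the warnings per source file. The following is a minimal sketch: the function name `summarize_warnings` is made up, and it assumes warnings always contain the literal text `WARNING: `, optionally preceded by `path:` or `path:line:` as in the log above. Lines such as the `RemovedInSphinx40Warning` deprecation notice are not counted.

```python
import re
from collections import Counter

# "path: WARNING: ..." or "path:123: WARNING: ..." anywhere in the line.
LOCATED = re.compile(r"(?P<path>\S+?)(?::\d+)?: WARNING: ")


def summarize_warnings(log_text):
    """Count WARNING lines per source path; unlocated ones go under '(no file)'."""
    per_file = Counter()
    for line in log_text.splitlines():
        if "WARNING: " not in line:
            continue
        match = LOCATED.search(line)
        per_file[match.group("path") if match else "(no file)"] += 1
    return per_file
```

Summing the counts gives the total number of warning lines, which can be compared against the `build succeeded, N warnings.` line at the end of the log.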

Some problematic links

  • introduction, course description: invalid link to the previous projects.

  • introduction, cloud setup: link to the old website, which should be to the new homepage.

  • introduction: troubleshooting: link to the old website, which should be to the new homepage.

  • introduction: troubleshooting: issue tracker, the current one is on QuantEcon/quantecon-notebooks-datascience

Homepage and Links to the Previous Projects

The homepage is still under construction. It is blank right now and we need to add:

  • text
  • links to VSE and to the previous projects
  • contributors and links to their website
  • icons of and links to the 5 main chapters

Also, in the lecture Course Description, there should be a link to the previous project as well, which will be added once the website for the previous projects is migrated.
