quantecon / lecture-datascience.myst
Source repository for QuantEcon Datascience
Home Page: https://datascience.quantecon.org
@mmcky proposed the solution of adding the following tag to the cell:
:tags: ["remove-cell"]
This should hide both input and output. He said that this way the cells will execute as part of jupyter-cache, but neither the input nor the output of the code-cell they are contained in will be included in the HTML. Sounds good enough to me!
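One way to check the "neither input nor output in the HTML" claim on a built page is to search the page text for the string the test cell prints. The following is only a minimal sketch: `marker_leaked` is a hypothetical helper name, and it assumes the built HTML has already been read into a string.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect every text node of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def marker_leaked(html_text, marker="Unit test running"):
    """Return True if `marker` appears in the page text (hypothetical helper).

    Text nodes are joined before searching, so the marker is still found
    when syntax highlighting splits the cell source across several tags.
    """
    collector = _TextCollector()
    collector.feed(html_text)
    collector.close()
    return marker in "".join(collector.chunks)
```

If this returns True for the built lecture page, either the input or the output of the tagged cell made it into the HTML.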
Now to verify that this works prior to applying it all over the julia code, we should add in a single unit test in one of the lectures. And we should do it in a very "low risk" place. I suggest going to https://github.com/QuantEcon/lecture-datascience.myst/blob/main/src/applications/visualization_rules.md and adding in something completely at the bottom of the lecture that does
```{code-cell} python
:tags: ["remove-cell"]
print("Unit test running")
assert True, "Failure in unit test"
```
Instead of at the bottom, maybe put it after the text and before something else just so we can see if it messes up spacing. For example change the text in that lecture to be
### Exercise 1
Create a draft of the alternative way to organize time and education -- that is, have two subplots (one for each education level) and four groups of points (one for each year).
```{code-cell} python
:tags: ["remove-cell"]
print("Unit test running")
assert True, "Failure in unit test"
```
Why do you think they chose to organize the information as they did rather than this way?
OK, after adding that, you need to verify a few things:
Leave `assert True` on and generate the HTML. Verify that no funny business happened (e.g. no text shows up, there isn't a weird cell, etc.) and that the spacing wasn't drastically changed.
Leave `assert True` on and generate the jupyter notebooks. Check whether the generated notebooks contain the "Unit test running" output, and maybe all of that code, in a cell.
Switch to `assert False` etc. and compile that file locally. Ensure that the error isn't just silent.
Switch to `assert False` and then commit to a PR. Make sure that the error isn't just silent. This is a "breaking" change and it should be very visible in the CI in one form or another. If it stops the process of building the full netlify/etc. for the PR, that is fine; we just need to make sure the errors aren't silent.
Moved `prep_ahs.R` from source repo, but it needs to run to generate the training and testing datasets.
Tag as no execute.
# Install Python packages
conda install python-graphviz
conda install -c conda-forge "nodejs>=10.0" xgboost
pip install qeds fiona geopandas pyLDAvis gensim folium descartes pyarrow --upgrade
@WenxinM @aadsouza Anything that @arnavs gives you takes precedence, but in downtime we should see what it would take to get a `.devcontainer` working for this sort of thing, along with the appropriate vscode extensions. In particular, the python and jupyter extensions.
For this, I think the best way to do it is to create a test one in your own githubs, put in a `.py` file and a `.ipynb` notebook, etc. from a couple of those in https://github.com/QuantEcon/quantecon-notebooks-datascience. You will also need some sort of `environment.yml` or `requirements.txt` file for dependencies at some point.
Then, install the remote containers extension in vscode, and use the `Add Development Container Configuration Files` command in https://code.visualstudio.com/docs/remote/create-dev-container#_automate-dev-container-creation to try out a "python" one. But it won't have jupyter installed. For that, you can install a conda specific one instead. See https://github.com/microsoft/vscode-dev-containers/tree/master/containers/python-3-anaconda
If that anaconda one works, you can also install the vscode jupyter extension in https://github.com/microsoft/vscode-dev-containers/blob/master/containers/python-3-anaconda/.devcontainer/devcontainer.json#L26-L29 etc. when you modify it.
The idea would be that the new version of this notebook repository would have those extra folders in them, which would enable Github codespaces and vscode remote containers/etc. But if you get things working with a devcontainer for remote use in vscode, it will probably also work in the github codespaces. There aren't a lot of specific customizations required: https://code.visualstudio.com/docs/remote/codespaces
You can look into the docs for this sort of thing (e.g. https://code.visualstudio.com/docs/remote/create-dev-container) but I think that copying a preexisting one with conda setup is a better bet.
`.py` generation, flattened folder in notebooks repo, local editing instructions, etc. (@arnavs)
(later)
Let's say @WenxinM does the first three sections, and @aadsouza does the last two (the first section, "Introduction", is just one lecture).
Per @mmcky:
Sometimes a > will crop up in myst files that have been generated by sphinx-tomyst. That is because a lot of users get white space wrong in rst so any form of incorrect indentation wraps items in a block_quote object within the sphinx.ast. So these are something to keep an eye out for.
Let's find and fix these if possible. This should be doable based on the source itself. Basically find occurrences of that character and see if they correspond to a valid block quote on the website.
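A minimal sketch of that search, assuming the generated myst source has been read into a string: it reports the line numbers of lines that start with `>` outside fenced code, so each can be compared against the website. `find_blockquote_lines` is a hypothetical helper, not part of sphinx-tomyst.

```python
def _fence_run(stripped):
    """Return (char, length) for a leading run of backticks or tildes."""
    if not stripped or stripped[0] not in "`~":
        return None, 0
    char = stripped[0]
    return char, len(stripped) - len(stripped.lstrip(char))


def find_blockquote_lines(text):
    """Return 1-based numbers of lines starting with '>' outside code fences."""
    hits = []
    fence = None  # (char, length) of the currently open fence, if any
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        char, run = _fence_run(stripped)
        if fence is None:
            if run >= 3:
                # Opening fence, e.g. a code-cell directive: skip its body.
                fence = (char, run)
                continue
            if stripped.startswith(">"):
                hits.append(number)
        elif char == fence[0] and run >= fence[1] and stripped[run:] == "":
            # A closing fence is at least as long and carries no other text.
            fence = None
    return hits
```

Each reported line still needs a manual look, since some of them will be intentional block quotes.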
Currently this is an unknown node and it needs to be added to `sphinx-tomyst`.
We could migrate:
across to a separate package to support exercise nodes and `jupyter-book`.
Alternatively we could swap to `sphinx-exercise`; however, there is currently a limitation in that `code-blocks` are not executable, as documented in this issue. In summary, execution for `jupyter-book` is handled by `jupyter-cache`, so it happens before the `sphinx` parse phase, which means it is more difficult to interpret any nested objects and map them to the notebook. They currently just get embedded in the markdown cells, so they are there and syntax highlighted but not executed.
On the website, most of the code-cells do not render outputs.
Outputs are rendered in Python Fundamentals (but not in Basics).
Outputs are not rendered in Applications, Pandas (except for Index), or Scientific Computing (except for Intro to Numpy).
Per executablebooks/jupyter-book#1137, 2,0 doesn't work.
I saw an issue in the heterogenous effects model (see #58 ). I think that you guys need to go through every one of the pages looking at the existing one on the other screen and make sure you don't see any issues. Post bugs here as you find them, and then we can fix them one-by-one.
maintain this in the main code repo. Copy over in the build process.
https://quantecon.github.io/lecture-datascience.myst/applications/heterogeneity.html
Probably needs `xgboost` or whatnot in the environment.
Right now, it's some weird standalone highlighted block.
It should look like a normal section, analogous to the current live lectures.
edit: cut section as outline on side
When changing the exercise sections to markdown blocks, we noticed that some `code-block` directives are used in the text (not in the special content blocks). If a `code-block` is not executable, we might want to change all or some of them into `code-cell`, depending on the content.
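For the blocks that should become executable, the change is mechanical. A minimal sketch, assuming the directives are opened with a backtick fence followed by `{code-block}`; `code_block_to_code_cell` is a hypothetical helper, and it does not touch directive options, which may differ between the two directives.

```python
import re

# Matches a fence opener of three or more backticks followed by {code-block}.
_OPENER = re.compile(r"^([ \t]*`{3,})\{code-block\}", flags=re.MULTILINE)


def code_block_to_code_cell(text):
    """Rewrite every {code-block} fence opener in `text` to {code-cell}."""
    return _OPENER.sub(r"\1{code-cell}", text)
```

Since only some blocks should be converted, it is probably safer to run this on hand-picked snippets than on whole files.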
The Github Action should:
gh-pages
WARNING: Include file `.../lecture-datascience.myst/src/applications/_static/colab_full.raw` not found or reading it failed
Notice that `_static` is not in sub-repos. Similar issue across lectures that have a literal include directive.
Many of the exercise directives are opened with four tick marks (e.g. ````{exercise}), but they aren't closed with four tick marks, which leads to formatting errors.
lexer name errors
...\src\pandas\storage_formats.md:317: WARNING: Pygments lexer name 'markdown' is not known
...\src\introduction\local_install.md:145: WARNING: Could not lex literal_block as "powershell". Highlighting skipped.
For windows on vscode and osx/linux.
When typing `problem set` in the search box, people can access the files for problem sets, although they do not show up in the table of contents. Not sure if this is what we want.
Right now there is only a hardcoded button: https://jupyterbook.org/interactive/launchbuttons.html#jupyterhub-buttons-for-your-pages
What would be required to have it configurable...
Get this working with a hardcoded UBC Syzygy JupyterHub button. Turn off later if we wish.
jupyter nbconvert --to script --output myscript thenotebook.ipynb
Push the output to the notebooks folder. Flatten to get rid of that setup.
`exercise` directives to native jupyter-book `note` directives.
edit: `exercise` directives to native jupyter-book `note` `admonition` directives with exercise title.
`code-block` to executable `code-cell`
WARNING: logo file '_static/qe-logo.png' does not exist
WARNING: favicon file '_static/lectures-favicon.ico' does not exist
Change "Hint: foo bar" to hint directives on all exercises.
There are two invalid links in the current notebooks, possibly due to the nested structure of the exercise.
(https://datascience.quantecon.org/scientific/randomness.html or https://5ff76619566d1a00a3e5f162--nervous-villani-f8256d.netlify.app/scientific/randomness.html)
There should be a link to the lecture Applied Linear Algebra in exercise 3 but it does not show up. The corresponding code is line 484 in the md file (I couldn't find a way to refer to the codes in the md files):
**Exercise 3**
Let's revisit the unemployment example from the {doc}`linear algebra lecture <applied_linalg>`.
We'll repeat necessary details here.
There should be a link to the lecture Control Flow in exercise 5. It does not show up here https://datascience.quantecon.org/scientific/numpy_arrays.html but works as expected here https://5ff76619566d1a00a3e5f162--nervous-villani-f8256d.netlify.app/scientific/numpy_arrays.html. The corresponding code is line 417 in the md file:
**Exercise 5**
Let's revisit a bond pricing example we saw in {doc}`Control flow <../python_fundamentals/control_flow>`.
`sklearn.utils.testing` used in heterogeneity lectures causing module not found error.
`iid` argument in `model_selection.GridSearchCV` used in recidivism and ml in economics. See https://stackoverflow.com/questions/60869018/gridsearchcv-deprecated-iid-future-warning
e.g. for https://github.com/QuantEcon/lecture-datascience.notebooks
either suppress because of toc-tree rendered in margin or fix. same issue across sub-repos.
.../lecture-datascience.myst/src/applications/index.md:22: WARNING: Directive 'tableofcontents': No content permitted
:maxdepth: 2
applications/visualization_rules
applications/regression
applications/recidivism
applications/maps
applications/classification
applications/working_with_text
applications/ml_in_economics
applications/heterogeneity
Exercise list directive on bottom of all `.md` files with exercises does not render.
edit: hardcode `exerciselist` equivalent
Here is the build output for the current `sphinx-tomyst` build of the `myst` files from `lecture-source-ds`
~/repos-collab/lecture-source-ds sphinx-tomyst ✔ 24d22h
▶ make clean && make myst
Removing everything under '_build'...
Running Sphinx v3.1.2
making output directory... done
/lecture-source-ds/conf.py:432: RemovedInSphinx40Warning: The app.add_stylesheet() is deprecated. Please use app.add_css_file() instead.
app.add_stylesheet('css/custom.css')
WARNING: html_static_path entry '_static' does not exist
building [mo]: targets for 0 po files that are out of date
building [myst]: targets for 47 source files that are out of date
updating environment: [new config] 47 added, 0 changed, 0 removed
checking for /lecture-source-ds/src/applications/applications.bib in bibtex cache... not found
parsing bibtex file /lecture-source-ds/src/applications/applications.bib... parsed 17 entries
checking for /lecture-source-ds/src/applications/applications.bib in bibtex cache... up to date
checking for /lecture-source-ds/src/applications/applications.bib in bibtex cache... up to date
checking for /lecture-source-ds/src/applications/applications.bib in bibtex cache... up to date
reading sources... [100%] scientific/randomness
/lecture-source-ds/src/applications/regression.rst:271: WARNING: Inline interpreted text or phrase reference start-string without end-string.
/lecture-source-ds/src/applications/visualization_rules.rst:3: WARNING: Duplicate explicit target name: "this article".
/lecture-source-ds/src/python_fundamentals/basics.rst:897: WARNING: Inline emphasis start-string without end-string.
/lecture-source-ds/src/python_fundamentals/basics.rst:897: WARNING: Inline emphasis start-string without end-string.
looking for now-outdated files... none found
pickling environment... done
checking consistency... lecture-source-ds/src/problem_sets/index.rst: WARNING: document isn't included in any toctree
done
preparing documents... done
CONFIG [definition_list] support for definition list in myst requires:
myst_deflist_enable = True
to be specified in the conf.py
CONFIG [definition_list] support for definition list in myst requires:
myst_deflist_enable = True
to be specified in the conf.py
CONFIG [definition_list] support for definition list in myst requires:
myst_deflist_enable = True
to be specified in the conf.py
CONFIG [definition_list] support for definition list in myst requires:
myst_deflist_enable = True
to be specified in the conf.py
writing output... [100%] scientific/randomness
WARNING: SKIP python_fundamentals/collections [transition] objects are not supported by sphinx-tomyst
build succeeded, 7 warnings.
using `:nonumber:` and setting exercise title to local number
introduction, course description: invalid link to the previous projects.
introduction, cloud setup: link to the old website, which should be to the new homepage.
introduction: troubleshooting: link to the old website, which should be to the new homepage.
introduction: troubleshooting: issue tracker, the current one is on QuantEcon/quantecon-notebooks-datascience
The homepage is still under construction. It is blank right now and we need to add:
Also, in the lecture Course Description, there should be a link to the previous project as well, which will be added once the website for the previous projects is migrated.