
prog-edu-assistant's Introduction

Programming Education Assistant


A project to create a set of tools to add autograding capability to Python programming courses using Jupyter or Colab notebooks.

Who is this project for?

The main target audience is teaching staff who develop programming courses using Jupyter notebooks. The tools provided by this project facilitate adding autogradable tests to programming assignments, automatically extracting the autograding tests and student versions of the notebooks, and easily deploying the autograding backend to the cloud.

The main focus is Japanese universities, so the example assignments provided in the exercises/ subdirectory are mostly in Japanese.

How to integrate the autograder into your course

If you have a course based on Jupyter notebooks and want to integrate the autochecking tests, there are several ways the autochecking tests can be run:

  • Inside the student notebook (e.g. on Colab). The autochecking tests are executed within the same Python runtime that the student uses. Note that this approach only supports self-checking and cannot be used for grading student work. See the details in docs/colab.md.

  • Hosted on Google Cloud Run. This repository provides a server and build scripts to build a Docker image that can be deployed to Google Cloud Run. Student submissions can optionally be logged. See the details in docs/cloudrun.md.

  • Manual execution via scripts. This can be used for local grading of student submissions against the tests defined in the instructor notebook. See the details in docs/grading.md.

The markup format for instructor notebooks is common to all of these approaches; see the documentation for details.
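
For illustration, a pair of cells in a master notebook might look like the following minimal sketch (the %%solution and %%submission magics are referenced by the issues below; the actual markup format supports more markers than shown here):

    %%solution
    def add(a, b):
        return a + b

    %%submission
    def add(a, b):
        return a - b  # a deliberately incorrect solution used to test the autograder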

Development environment setup

If you want to start authoring notebooks or contribute to the project's development, see SETUP.md for instructions on how to set up the development environment.

License

Apache-2.0; see LICENSE for details.

Disclaimer

This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.

prog-edu-assistant's People

Contributors

dependabot[bot], keiichiw, salikh, sicotronic, victorkwan, yunabe


prog-edu-assistant's Issues

Create a custom web front-end designed for teaching

The notebook interface (Jupyter or Colab) is designed to be a universal tool for a multitude of use cases (data exploration, reports, dashboards, computational experiments, etc.), and it has a number of design traits that make it less than ideal for teaching introductory programming.

It would be an interesting idea to prototype a purpose-built web frontend designed specifically for introductory programming instruction. Key requirements:

  • It should not have a notion of cell execution order; perhaps there should be one big cell with function definitions (with state-changing code prohibited there), which is executed automatically after any change, plus an IPython-style console for ad-hoc code execution (where the execution order is always fixed top-to-bottom and it is impossible to re-execute old cells)
  • It should have a comprehensive inline help system that could show either reference documentation (API, function names, argument lists, etc.) or the instructional content (a code tutorial or the assignment).

From the implementation point of view, it is likely easiest to hack something up inside JupyterLab, as it is supposed to make designing new interaction flows easy. Alternatively, it could be a completely new frontend that uses the Jupyter kernel interface to talk to the kernel.

Rewrite notebook extraction tool in Python

Currently the tool that extracts the autograder tests
and the student versions of notebooks is written in Go, which requires either a Go installation
or the heavyweight Bazel build system.
It would be nice to enable a lightweight, pure-Python integration by implementing
the notebook processing step in Python.
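
As a rough sketch of feasibility, a pure-Python extraction step could look like the following (assuming, as elsewhere in this document, that solution cells are marked with the %%solution magic; the real tool handles more markers):

    import json

    def split_notebook(path):
        """Sketch: split a master notebook into a student version and a
        list of autograder test sources."""
        with open(path, encoding="utf-8") as f:
            nb = json.load(f)
        student_cells, tests = [], []
        for cell in nb["cells"]:
            src = "".join(cell.get("source", []))
            if src.startswith("%%solution"):
                # Keep the canonical solution for the autograder, but
                # replace it with a placeholder in the student version.
                tests.append(src)
                cell = dict(cell, source=["# YOUR CODE HERE\n"])
            student_cells.append(cell)
        return dict(nb, cells=student_cells), tests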

Publish prog_edu_assistant_tools package to PyPI

Publishing the tools package to PyPI would increase author convenience and enable more workflows:

  • Starting a master notebook using standard tools, without depending on a local installation of the autograder work directory
  • Authoring flow in Google Colab notebooks

Support Google Colab for authoring workflow

Currently the authoring workflow assumes Jupyter notebooks, but it would be nice to support authoring in Colab, as it allows better collaboration between multiple people and a quick turnaround.

Specifically, the following is necessary:

  • Ability of the build system to download notebooks from Colab / Google Drive
  • Ability to install prog_edu_assistant_tools package from PyPI (#34)

Change the assignment_notebook Bazel rule to accept multiple languages

The data flow of the assignment notebook generation works as follows:

  • a single master notebook contains the content in multiple languages (with markers)
    and the autograder tests
  • multiple student notebooks can be generated from the master notebook, one per language
  • a single directory with autograder tests is extracted from the master notebook.

The current Bazel rule "assignment_notebook" can extract the autograder tests
and produce a student notebook for a single language.

It is cumbersome to add separate rules to extract student notebooks in additional languages,
so the rule should be improved to support multiple languages directly.
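
A hypothetical invocation of the improved rule could look like this (the `languages` attribute is an assumption about the improved rule, not the current API):

    assignment_notebook(
        name = "example_assignment",
        src = "example-master.ipynb",
        languages = ["en", "ja"],  # one student notebook per language
    )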

Create a formal release

In order for the autograder to be useful to more people, it should have a formal release process
so that people could depend on it in a stable manner.

Support translation for autograder messages

Autograder tests are code, so maintaining them in multiple languages is very inconvenient.
However, the messages that are returned to students are just strings,
so they are well suited for a lookup-based message translation layer.

There should be a translation layer at the stage of creating the autograder report
to enable a multilingual autograder experience.
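
A minimal sketch of such a lookup-based layer follows; the catalog structure and function name are assumptions for illustration, not the repository's actual API:

    # Message catalog keyed by message ID, then language.
    MESSAGES = {
        "undefined_function": {
            "en": "The function '{name}' is not defined.",
            "ja": "関数 '{name}' が定義されていません。",
        },
    }

    def render_message(key, lang, **kwargs):
        catalog = MESSAGES.get(key, {})
        # Fall back to English if the requested language is missing.
        template = catalog.get(lang) or catalog.get("en") or key
        return template.format(**kwargs)

    print(render_message("undefined_function", "ja", name="add"))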

Create an integration test for local testing of the backend

The local integration test should work as follows:

  1. Build the autograder image
  2. Start the autograder image locally without authentication, just like the script start-local-combined.sh does
  3. Parse the master notebooks and extract the submissions:
    3.1 Canonical solutions from cells with %%solution magic
    3.2 Incorrect solutions from cells with %%submission magic
    3.3 Ideally the autotests for incorrect solutions should also be extracted
  4. Compose the submission notebooks similarly to creating student notebooks
  5. Send the uploads to the local server, check that the response is 200 OK, and (ideally) check that the error message matches what the autotests expect

The local integration test should be made available as a script or a Bazel rule that can be referenced from an external repo.
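
For instance, step 5 could be sketched as follows (the endpoint path and form field name are assumptions; the real values are defined by the upload server started by start-local-combined.sh):

    import requests

    def check_submission(path, server="http://localhost:8000/upload"):
        # Upload one composed submission notebook and verify the response.
        with open(path, "rb") as f:
            resp = requests.post(server, files={"notebook": f})
        assert resp.status_code == 200, resp.text
        # Ideally, also compare the report against the expected autotest output.
        return resp.text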

Support non-ipynb submissions

Currently the autograder accepts only one kind of submission, ipynb notebooks, and it finds the metadata inside the ipynb file.

There are use cases for the autograder where the submission is not a notebook:

  • Submitting HDF5 (h5) model files created by Keras (important in machine-learning courses)
  • Submitting stand-alone Python scripts (important in Python courses that teach students to write stand-alone scripts)

Since the server requires metadata to match a submission to an assignment, the upload server should be extended to accept metadata in addition to the submission file.

There are a few options for inferring metadata:

  • The upload form can take additional input parameters, which could be filled in by the user or supplied as hidden form inputs on the client side.
  • The upload file name could serve as a key for a metadata lookup, with the metadata stored on the server in advance
  • Upload URLs can encode a metadata key (i.e. the upload URL is different for each assignment)
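
The first option could look like this on the client side (the endpoint and field names are hypothetical):

    import requests

    # Upload a non-ipynb file with the assignment metadata passed as
    # extra form fields alongside the file.
    with open("model.h5", "rb") as f:
        resp = requests.post(
            "http://localhost:8000/upload",
            files={"submission": f},
            data={"assignment_id": "ml-week3", "exercise_id": "keras-model"},
        )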

Enable external integration

In order for the autograder to be useful to external people, it must be possible to depend on the autograder code from external projects.

It would be nice to have an example of such an integration and a how-to guide.

Publish the base autograder image to Docker Hub

Currently the build procedure builds everything from source,
using only standard Debian images as a base.
It would be nice to regularly build and publish to Docker Hub a base autograder
image with prebuilt binaries, so that the full autograder image can be built
just by adding a layer with the autograder tests.

Support LTI tool invocation flow

The autograder can be integrated as an LTI tool, i.e. a web application that is registered with an LMS (Learning Management System) and formally added to a course as an assignment.
The user flow can look like this:

  • The LMS admin adds the autograder instance as an LTI tool to the LMS
  • The course facilitator creates the assignments and adds links to specific units to the course in the LMS
  • Students click on the links in the course in their LMS and open the Python programming assignment in Colab
  • The autograder grades the submissions and records the results back to the LMS

A more technical description of the HTTP request flow on LTI tool launch (based on a limited understanding of LTI 1.3, so details may not be correct):

  • The student clicks on a link that goes to the autograder server. The request includes the student ID, role, and the assignment ID (a "line item" in LTI terms)
  • The autograder records the assignment ID and student ID in the user session and initiates the authentication flow with the LMS OAuth provider to obtain a bearer token
  • The autograder responds to the student's request by redirecting them to the Colab notebook with the assignment. Ideally it would pass the authentication token to Colab, but since the templating feature in Colab is currently disabled for security reasons, the autograder will more likely need to show a page with the token and instructions to copy it, open the Colab link, and paste the token inside the Colab notebook
  • On notebook submission, the student should indicate whether it is an intermediate checkpoint or a final submission
  • After grading a final submission, the autograder may submit grades back to the LMS, using the line item, user ID, and authentication token recorded in the user session.
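
A rough sketch of the launch step in Flask, under the same caveat that the LTI 1.3 details may not be correct; the endpoint, form field names, and Colab URL are placeholders:

    from flask import Flask, redirect, request, session

    app = Flask(__name__)
    app.secret_key = "change-me"  # placeholder session signing key

    @app.route("/lti/launch", methods=["POST"])
    def launch():
        # Record who is submitting which assignment; the OAuth token
        # exchange with the LMS is omitted from this sketch.
        session["user_id"] = request.form["user_id"]
        session["line_item"] = request.form["line_item"]
        # Redirect the student to the per-assignment Colab notebook.
        return redirect("https://colab.research.google.com/...")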

Drop message queue support in the backend

Deployment via Google Cloud Run has turned out to be very well-behaved and convenient:

  • no need to manually tune capacity (VM right-sizing, etc.)
  • no need to pay when requests are not coming in (so a backend instance can be kept available for longer)

Thus it would make sense to migrate to exclusive use of the Cloud Run backend and clean up the code that was only needed for the docker-compose deployment:

  • multiple Docker images (server, worker)
  • message queue code

If people want to have more control over the deployment, I guess one could use Knative directly.

Create a helper library and samples for code checks

Writing autograding tests for student submissions is not a trivial task, as there is an unlimited number of ways a submission can be incorrect. For the best student experience, it is necessary to detect as many of these cases as possible and provide feedback messages that are as specific as possible.

To facilitate the creation of good autograder tests, we need:

  • A Python library with functions for common checks (e.g. the presence or type of a variable or a function)
  • A collection of code snippets with explanations.
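
A sketch of the kind of checks such a library could provide (names and signatures are illustrative, not the actual prog_edu_assistant_tools API):

    import inspect

    def check_defined(env, name):
        """Return a specific message if `name` is missing from the submission."""
        if name not in env:
            return f"'{name}' is not defined. Did you run the cell that defines it?"
        return None

    def check_function(env, name, n_args):
        """Check that `name` is a function taking `n_args` arguments."""
        msg = check_defined(env, name)
        if msg:
            return msg
        fn = env[name]
        if not callable(fn):
            return f"'{name}' should be a function, not {type(fn).__name__}."
        if len(inspect.signature(fn).parameters) != n_args:
            return f"'{name}' should take {n_args} argument(s)."
        return None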

Run linters and formatters on commit

Inconsistent formatting is not a big deal while only a few people are working on the project,
but it needlessly slows things down when many people commit code in their favorite formats (e.g. differing whitespace conventions in Markdown).

Support hover annotations on code

It would be nice to be able to annotate user code with additional messages and hints.
Since the code submission is already included in the autograder report,
the most natural place to show the annotations to the user would be via hover annotations,
perhaps with visible markers on the source code.

On the backend side, autograder tests should support emitting source code annotations.
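
A sketch of the annotation record a test could emit; the schema is an assumption, chosen so that the report renderer can anchor each hint to a source range:

    from dataclasses import dataclass

    @dataclass
    class SourceAnnotation:
        line: int       # 1-based line in the submitted code
        col_start: int
        col_end: int
        message: str    # the hint shown on hover

    annotations = [
        SourceAnnotation(line=3, col_start=4, col_end=12,
                         message="This variable is never used."),
    ]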

Collect user consent during log in flow

There is considerable interest in using the code submissions collected by the autograder for further research. Students' consent must be collected before such data collection can be enabled.

The authentication flow is a natural place to add a consent form, so it should be possible to ask users for their consent (and store their response) right after the login screen.
