Applied Machine Learning Intensive

Overview

The Applied Machine Learning Intensive (AMLI) is a collection of content that can be used to teach machine learning. The original content was created for a 10-week, bootcamp-style course for undergraduate college students. Designed for students who weren’t necessarily majoring in computer science, the goal was to enable participants to apply machine learning to different fields using high-level tools.

The content primarily consists of slides, Jupyter notebooks, and facilitator guides. The slide decks are written in Marp Markdown syntax, which can be exported to other formats. The Jupyter notebooks were written for, and are intended to run in, Colab. The instructor guide is an ODT document.

Answer Keys

Applied Machine Learning Intensive instructional materials are available as open source for faculty who want to run this program for students. This repository offers all slide decks, facilitation guides, labs, and gradable items. Because the program is academic in nature, we ask interested faculty to fill out the form below to receive a password for the answer keys. The password unlocks the keys using a standard zip program or the tools/unlock_labs.py tool found in this repository.
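For readers who prefer scripting to a desktop zip program, the following is a minimal sketch of extracting a password-protected archive with Python's standard zipfile module. It is not the repository's tools/unlock_labs.py, whose interface is not described here. The function name, archive name, and destination folder are hypothetical, and the sketch assumes the archive uses the legacy zip encryption that zipfile can read (AES-encrypted archives need a different tool).

```python
import zipfile
from pathlib import Path


def unlock_archive(archive_path, password, dest_dir):
    """Extract a password-protected zip archive into dest_dir.

    Hypothetical helper for illustration only. Returns the list of
    member names contained in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        # zipfile expects the password as bytes, not str.
        archive.extractall(path=dest, pwd=password.encode("utf-8"))
        return archive.namelist()
```

A call might look like unlock_archive("answer_key.zip", "password-from-form", "unlocked"), where all three arguments are placeholders.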

Please fill out the following brief form to receive the answer keys for the curriculum:

https://docs.google.com/forms/d/e/1FAIpQLSd9v0az2wmKP659Xx5SlS7WPbQPD3u3yLXZMn0LHf3Vjj-ziw/viewform

The information that you submit will be maintained in accordance with Google’s Privacy Policy.

Licensing Information

All course content (Colabs, slides, guides, and materials) is open sourced under the CC-BY-4.0 International license. All code contained in this course is open sourced under the Apache 2.0 license.

Attribution and license information for content not created by Google will be presented in the speaker notes.

Contributors

joshmcadams, roesch88

applied-machine-learning-intensive's Issues

T04-05 Using Prebuilt Models Project: Add challenges to avoid having students just cut/paste code from the example[138756088]

Students really liked this project. They said it was a very cohesive unit with the three previous colabs. A few of them asked for it to be less copy/paste of code, since they didn't really know what was actually happening in the code.

Adding concrete challenge problems that force students to actually modify the code may help, as may more specific competencies that require them to dig into the details.

T03-09 Linear Regression Project: Investigate a better data set for the project[138756760]

The dataset is not appropriate for linear regression.

Unfortunately, this dataset didn’t allow contrast between good and bad outcomes.

Perhaps it was intended for "logistic regression." There does not seem to be a linear relationship between any of the variables. Consider using this dataset for the data exploration project and the airline data for regression, but first check whether the airline data actually has any linear relationships. It would be ideal to have examples of variables that are and are not linearly related.
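One quick way to screen a candidate dataset for linear relationships is to compute the Pearson correlation between pairs of columns. The sketch below uses synthetic data as a stand-in, since neither the current dataset nor the airline data is reproduced here; the variable names and the helper pearson_r are assumptions made for illustration.

```python
import numpy as np


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)

# Synthetic stand-ins for real columns (not taken from the course datasets).
linear_y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
curved_y = x ** 2 + rng.normal(scale=0.5, size=x.size)

r_linear = pearson_r(x, linear_y)  # close to 1: strong linear relationship
r_curved = pearson_r(x, curved_y)  # close to 0: related, but not linearly

print(f"linear pair: r = {r_linear:.2f}")
print(f"curved pair: r = {r_curved:.2f}")
```

A coefficient near zero does not mean the variables are unrelated, as the curved pair shows, so a scatter plot is still worth drawing before choosing a dataset.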

T05-01 Classification with SciKit Learn: Exercise 6, “scores” is a vague term which confuses the students as it suggests a scoring method, like accuracy or precision.[138755668]

In Exercise 6, “scores” is a vague term that confuses students because it suggests a scoring method, like accuracy or precision.

Although there is not a straightforward better term, “decision scores” or “decision function output” may be less confusing. If this is changed, the example should be changed to be consistent with this.
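To illustrate the distinction, here is a small sketch of what a decision score is for a linear binary classifier: a signed value per sample, which is thresholded to get a class, rather than a quality metric for the model. The weights, bias, and samples are hypothetical and are not taken from the Colab. scikit-learn exposes values of this kind through the decision_function method of its linear classifiers.

```python
import numpy as np

# Hypothetical weights and bias for a linear binary classifier.
weights = np.array([1.5, -2.0])
bias = 0.25

samples = np.array([
    [2.0, 0.5],  # well inside the positive side
    [0.0, 1.0],  # on the negative side
    [0.5, 0.4],  # positive, but close to the boundary
])

# Decision scores: one signed value per sample, not an accuracy or
# precision figure. The magnitude grows with distance from the boundary.
decision_scores = samples @ weights + bias

# Thresholding the scores at zero yields the predicted classes.
predictions = (decision_scores > 0).astype(int)

print(decision_scores)
print(predictions)
```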

T02-05 Intermediate Pandas: Colab too massive. Consider alternatives.[138756765]

The Colab is too massive.

Investigate breaking it into two Colabs, or try a different sequence: intro Pandas, then simple linear regression, then intermediate Pandas, then harder regression. So much of the Pandas work is about data cleaning (which is the bulk of the work in data science), but it is frustrating and challenging for beginners, and it’s bringing the mood of the class down a little. If we alternate learning Pandas with using it to prepare data for a model, students may see the payoff of building a model and plotting.

TXX-09 Webframeworks: Consider Flask in lieu of Django[138816043]

The Django tutorial seemed a bit disjointed from the rest of the day. Flask might be simpler and more useful for getting the learning objective across.

Address the question of why to use a web framework
Make clear that the command line is not part of Django
Provide an introduction to Jupyter
Have exercises where you’re calling .py files

T05-02 Bucketization Features: It's also very short and could probably be part of another linear regression module.[138762001]

It's also very short and could probably be part of another linear regression module.

It's really just about defining a piecewise function, which isn't a huge jump from regular regression. Maybe it makes the most sense between univariate linear regression and polynomial regression. Then the sequence is univariate linear regression, multivariate linear regression, piecewise linear, polynomial, and we can think of this as a bridge from linear to polynomial. If the "buckets" are close together and follow a polynomial-like pattern, then choosing between a bucketized model and a polynomial one becomes a modeling decision.
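The piecewise idea can be sketched in a few lines of NumPy. The example below shows the simplest case, a piecewise-constant fit that predicts the mean target value of each bucket. It is an illustration only, not the Colab's implementation; the data, the bucket edges, and both helper functions are hypothetical.

```python
import numpy as np


def fit_bucket_means(x, y, edges):
    """Return the mean of y within each bucket of x defined by edges."""
    bucket_ids = np.digitize(x, edges)  # values in 0 .. len(edges)
    return np.array(
        [y[bucket_ids == b].mean() for b in range(len(edges) + 1)]
    )


def predict_bucketized(x, edges, bucket_means):
    """Predict with a piecewise-constant function: one value per bucket."""
    return bucket_means[np.digitize(x, edges)]


# Hypothetical data with three visible plateaus.
x = np.array([0.5, 1.0, 1.5, 2.5, 3.0, 3.5, 4.5, 5.0, 5.5])
y = np.array([1.0, 1.2, 0.8, 3.9, 4.0, 4.1, 9.0, 9.5, 8.5])
edges = np.array([2.0, 4.0])  # hypothetical bucket boundaries

bucket_means = fit_bucket_means(x, y, edges)
predictions = predict_bucketized(np.array([1.2, 3.3, 4.8]), edges, bucket_means)
print(bucket_means)
print(predictions)
```

This sketch assumes every bucket contains at least one training point; an empty bucket would produce a NaN mean. With more, narrower buckets the step function starts to trace out a curve, which is where the comparison with polynomial regression comes in.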

T02-07 Statistical Analysis of Data: The exercises are worded very vaguely.[138767364]

The exercises are worded very vaguely.

Scaffolding the problems would be helpful (maybe suggest exact functions to use, break each problem into commented subparts of the solution block, etc.). Note: We gave this Colab as an optional exercise (for students who finished other work early). Many students felt that this should have been required and wanted to cover these topics together as a class (with a lecture/activity and resulting assignment).

T04-04 OpenCV: Use Numpy operations in lieu of OpenCV built-in functionality?[138756758]

OpenCV, like PIL, has a lot of built-in functionality that can be done just as easily with Numpy. I think this would be a better way to do things with the students, so that we scaffold on the work we’ve already done with them.

For example, suppose we’ve loaded the car.jpg image as “image” with cv. Then “image” gets treated as a Numpy array. Hence, to convert RGB to BGR we could simply do this:

image = image[:, :, ::-1]  # reverse the order of B, G, R
or this:
image[:, :, [0, 2]] = image[:, :, [2, 0]]  # swap B and R

Horizontal and vertical flips are similar:

image = image[::-1, :, :]  # flip across the horizontal axis
and
image = image[:, ::-1, :]  # flip across the vertical axis

Here’s a vertical compression by a factor of 2:
image = image[::2, :, :]  # keep every other row
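The operations above can be checked without OpenCV or an image file. The following sketch applies them to a tiny synthetic array standing in for the loaded image; the array contents and variable names are made up for illustration.

```python
import numpy as np

# Synthetic 4x2 "image" with 3 channels standing in for a loaded picture.
image = np.zeros((4, 2, 3), dtype=np.uint8)
image[:, :, 1] = 1
image[:, :, 2] = 2
image[0, 0, :] = [10, 20, 30]  # mark the top-left pixel

reversed_channels = image[:, :, ::-1]  # channel order reversed
flipped_rows = image[::-1, :, :]       # top row becomes bottom row
flipped_cols = image[:, ::-1, :]       # left column becomes right column
half_height = image[::2, :, :]         # keep every other row

# These slices are views that share memory with the original array.
# Copy into a contiguous array before modifying the result in place.
rgb = np.ascontiguousarray(reversed_channels)

print(reversed_channels[0, 0])  # marked pixel with channels reversed
print(half_height.shape)
```

One design point worth raising with students: because slicing returns a view rather than a copy, writing to the sliced result also changes the original image, and some library functions may require a contiguous copy such as the one made by np.ascontiguousarray.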

T06-03 Convolutional Neural Networks: Additional resources to share with students[138774020]

Additional resources:

"What is the best CNN architecture for MNIST?" - Chris Deotte
Ideas on troubleshooting CNN hyperparameters
https://www.kaggle.com/cdeotte/how-to-choose-cnn-architecture-mnist

"What do we understand about Convolutional Neural Networks" - Isma Hadji and Richard P. Wildes
https://arxiv.org/pdf/1803.08834.pdf
(94 pages)

"Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet" - S. Hasanpour, M. Mohsen Fayyaz, M. Sabokrou, and E. Adeli
https://arxiv.org/pdf/1802.06205.pdf

"A Framework for Designing the Architectures of Deep Convolutional Neural Networks" - Saleh Albelwi and Ausif Mahmood
https://www.mdpi.com/1099-4300/19/6/242
