
python-fundamentals-legacy's Introduction

D-Lab Python Fundamentals Workshop

License: CC BY 4.0

This repository contains the materials for D-Lab’s Python Fundamentals workshop. No prior experience with Python is required to attend this workshop.

Workshop Goals

This four-part, interactive workshop series is your complete introduction to programming Python for people with little or no previous programming experience. By the end of the series, you will be able to apply your knowledge of basic principles of programming and data manipulation to a real-world social science application.

Each of the parts is divided into a lecture-style coding walkthrough interrupted by challenge problems, discussions of the solutions, and breaks. Instructors and TAs are dedicated to engaging you in the classroom and answering questions in plain language.

  • Part 1: Introduction to Python and Jupyter Notebooks, variables, data types, and functions.
  • Part 2: Data structures, loops, conditionals, and creating functions.
  • Part 3: Libraries, File I/O, and scientific computing.
  • Part 4: Error handling, style, and an applied, in-depth project.

Installation Instructions

Anaconda is a package management tool that lets you run Python and Jupyter notebooks easily. Installing Anaconda is the easiest way to make sure you have all the necessary software to run the materials for this workshop. If you would like to run Python on your own computer, complete the following steps prior to the workshop:

  1. Download and install Anaconda (Python 3.9 distribution). Click the "Download" button.

  2. Download the Python Fundamentals workshop materials:

    • Click the green "Code" button in the top right of the repository information.
    • Click "Download Zip".
    • Extract this file to a folder on your computer where you can easily access it (we recommend Desktop).
  3. Optional: if you're familiar with git, you can instead clone this repository by opening a terminal and entering the command git clone git@github.com:dlab-berkeley/Python-Fundamentals.git.

Is Python Not Working on Your Laptop?

If you do not have Anaconda installed and the materials loaded on your computer by the time the workshop starts, we strongly recommend using the D-Lab DataHub to run the materials for these lessons. You can access the DataHub by clicking the following button:

Datahub

The DataHub downloads this repository, along with any necessary packages, and allows you to run the materials in a Jupyter notebook that is stored on UC Berkeley's servers. No installation is necessary on your end: you only need an internet browser and a CalNet ID to log in. By using the DataHub, you can save your work and come back to it at any time. When you want to return to your saved work, go straight to DataHub, sign in, and click on the Python-Fundamentals folder.

If you don't have a Berkeley CalNet ID, you can still run these lessons in the cloud, by clicking this button:

Binder

Binder operates similarly to the D-Lab DataHub, but on a different set of servers. By using Binder, however, you cannot save your work.

Run the Code

Now that you have all the required software and materials, you need to run the code.

  1. Open the Anaconda Navigator application. You should see the green snake logo appear on your screen. Note that this can take a few minutes to load up the first time.

  2. Click the "Launch" button under "JupyterLab" and navigate through your file system on the left hand pane to the Python-Fundamentals folder you downloaded above. Note that, if you download the materials from GitHub, the folder name may instead be Python-Fundamentals-main.

  3. Open 00_workshop_setup.ipynb to begin.

  4. Press Shift + Enter (or Ctrl + Enter) to run a cell.

Note that all of the above steps can be run from the terminal, if you're familiar with how to interact with Anaconda in that fashion. However, using Anaconda Navigator is the easiest way to get started if this is your first time working with Anaconda.

Additional Resources

Check out the following online resources to learn more about Python:

About the UC Berkeley D-Lab

D-Lab works with Berkeley faculty, research staff, and students to advance data-intensive social science and humanities research. Our goal at D-Lab is to provide practical training, staff support, resources, and space to enable you to use Python for your own research applications. Our services cater to all skill levels, and no programming, statistical, or computer science background is necessary. We offer these services in the form of workshops, one-to-one consulting, and working groups that cover a variety of research topics, digital tools, and programming languages.

Visit the D-Lab homepage to learn more about us. You can view our calendar for upcoming events, learn about how to utilize our consulting and data services, and check out upcoming workshops. Subscribe to our newsletter to stay up to date on D-Lab events, services, and opportunities.

Other D-Lab Python Workshops

D-Lab offers a variety of Python workshops, catered toward different levels of expertise.

Introductory Workshops

Intermediate and Advanced Workshops

Contributors

  • Emily Grabowski
  • Pratik Sachdeva
  • Christopher Hench
  • Rochelle Terman

python-fundamentals-legacy's People

Contributors

aculich, akokai, alexestes, bradfora, cadherin, emilygrabowski, geoffbacon, guptasoumya, henchc, pssachdeva, rochelleterman, samyag1


python-fundamentals-legacy's Issues

Kernel dying

The kernel was dying for me and a couple of other participants on DataHub. I wasn't running up against memory limits, and I made sure to shut down notebooks as I moved along. Day 3 seemed to be particularly problematic.

Part 1 : typos

  • Fix link to 'documentation' at the top of notebook 5
  • 'An argument with out the proper number of arguments' --> 'A function without the proper number of arguments' near the top of notebook 5

Inconsistency in Notebook 9

Notebook 9: in Principles of Custom Functions, adjust the 'Plan' section output to be a dataframe, rather than two lists, to match the rest of the section.

Provide Guidance for Windows Anaconda Installation

While installing Anaconda on Windows 10, I get this pop-up and I am not sure what I should do. Should I keep the defaults or should I change something in the advanced options of the installation? It would help to have some documentation about what to do when I see this screen.

Anaconda

Typo in notebook 5

'An argument with out the proper number of arguments' --> 'A function without the proper number of arguments' near the top of notebook 5

Unnecessary line continuation (backslash) characters for dictionary literals

It's not necessary to use line continuation (backslash) characters in the first two code cells with dictionary literals in Day_3/12_Dictionaries.ipynb.

poets_dict = {"name": "Forough Farrokhzad", \
            "year of birth": 1935, \
            "year of death": 1967, \
            "place of birth": "Iran", \
            "language": "Persian", \
            "works": ["Remembrance of a Day","Unison","The Shower of Your Hair","Portrait of Forough"]}

works just fine as this:

poets_dict = {"name": "Forough Farrokhzad",
            "year of birth": 1935,
            "year of death": 1967,
            "place of birth": "Iran",
            "language": "Persian",
            "works": ["Remembrance of a Day","Unison","The Shower of Your Hair","Portrait of Forough"]}

This should be just fine in both python3 and python2 (I think?!) so I'm not sure why this was used in the first place?

Also, we should introduce the term "dictionary literal", since this way of defining a dictionary uses the "literal" syntax. For some explanation see: https://softwareengineering.stackexchange.com/questions/319547/literals-versus-instantiating-by-name-lists-and-dicts-in-python and https://www.python.org/dev/peps/pep-0586/
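As a minimal sketch of the point above: Python joins lines implicitly inside braces, brackets, and parentheses, so a dictionary literal needs no backslashes. The shortened dictionary and the comparison with the dict constructor are illustrations added here, not code from the notebook.

```python
# Implicit line continuation: no backslashes are needed inside { }.
poets_dict = {
    "name": "Forough Farrokhzad",
    "year of birth": 1935,
    "year of death": 1967,
}

# The same dictionary built by calling the dict constructor by name.
# Keys containing spaces cannot be keyword arguments, so pairs are used.
poets_dict_by_name = dict([
    ("name", "Forough Farrokhzad"),
    ("year of birth", 1935),
    ("year of death", 1967),
])

print(poets_dict == poets_dict_by_name)  # True
```

Both forms produce equal dictionaries; the literal is simply the shorter and more common spelling.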

Day 3 Comprehensions & Day 4 Notebooks

16_Comprehensions

  • Consider adding an if, else list comprehension as well (question came up from participants)
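One possible shape for such an example, assuming a simple list of integers (the variable names are hypothetical and not taken from the notebook). It contrasts a filtering comprehension with an if/else comprehension, since the position of the condition differs between the two.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Filtering: the `if` goes at the end and drops items that fail the test.
evens = [n for n in numbers if n % 2 == 0]

# Choosing a value: `if ... else` goes before `for` and keeps every item.
labels = ["even" if n % 2 == 0 else "odd" for n in numbers]

print(evens)   # [2, 4, 6]
print(labels)  # ['odd', 'even', 'odd', 'even', 'odd', 'even']
```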

16_Answers

  • For each comprehension challenge question, we may want to reinitialize or use new variable names to be used in the answers so the solutions can be created without using the variable names that were already run and set using a for loop (even if the code was incorrect, you would get the correct answer if you reuse the variables)
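To illustrate the concern, here is a small sketch with hypothetical names (not the notebook's actual challenge). Giving the comprehension answer its own variable means a buggy answer cannot silently pass by reusing the value left over from the for loop.

```python
# Earlier cell: the loop-based version sets `squares`.
squares = []
for n in range(4):
    squares.append(n ** 2)

# Answer cell with a deliberate bug (doubles instead of squaring),
# stored under a new name so the stale loop result is not reused.
squares_comp = [n * 2 for n in range(4)]

# Comparing against the loop result exposes the mistake.
print(squares)                  # [0, 1, 4, 9]
print(squares_comp)             # [0, 2, 4, 6]
print(squares == squares_comp)  # False
```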

UPR_Workbook

  • Import statements for 'os' and 'csv' are easy to miss as they fall right under the long introduction text. May want to move these down to Part B where
  • 'csv' module is never called, so there is no need to import it

Part A1

  • The parameter name 'file_name' in section 1.5 is different from the originally used variable 'filename' (no _) used in section 1.1, which could lead to some confusion. We may want to redefine the function parameter as 'filename' to keep it consistent

Part A2

  • In section 2.2, the instructions list the wrong variable names. The 3 new lists should be 'accept_recs', 'examine_recs', 'reject_recs'

Part A3

  • Might be a little out of scope for this workshop, but we could use this part as a good opportunity to introduce Regular Expressions (regex), and demonstrate what a solution using regex might look like.
    • Same comment throughout the notebook. For clarity, we may want to use the actual variable name in the markdown.
  • We do not specify what strings to use for the 'decision_type' variable. We may want to make that explicit

DataFrame subsetting (notebook 17 and 6)

Streamline subsetting to 2-3 key approaches (e.g., selecting a column and using a boolean mask), and add framing that points students to the pandas workshop for further work with dataframes and subsetting.

Notebook 17 part 1 scaffolding

Increase scaffolding for notebook 17 part 1 (importing data), and include skeleton code for the answers to help guide students.

Day 3 Workshop Notes 12-8-2021

11 attendees at beginning of workshop. Student comments at start : in this class there is a lot of material to get acquainted with when not familiar with programming, may need another class after this one to continue to learn. Student question concerning order of executing cells in a notebook, having to do with lengths/red/green/blue example. Instructor demonstration about how to restart kernel and clear outputs, note that we don't have to work on notebook in a linear fashion. 16 participants as of 11:00 am. Introduced and discussed python pretty print (pprint library), working with dictionaries/collections. Bulk of time spent on notebook 12 and really drilling down into dictionaries and files, having students share their screens/code and working through it. 14 attendees after the noon break. 15 as of 12:52. Workshop went super smooth.

Day 2 Notebook 11 & Day 3 Notebooks

11_Scope

  • We use docstrings in some of the functions, but never talk about what they are. We may want to have a quick blurb introducing what the triple quotes mean and what docstrings are used for.
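A short blurb could be paired with an example along these lines. The function is hypothetical and only meant to show that a triple-quoted string placed first in a function body becomes its docstring, which help() and the __doc__ attribute then display.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# The docstring is stored on the function object itself.
print(celsius_to_fahrenheit.__doc__)

# help() shows the signature together with the docstring.
help(celsius_to_fahrenheit)
```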

13_Files:

  • Bash is introduced without any context, may want to add a little blurb about it here
  • For Challenge 1, some participants decided to read in the file using Pandas. This sets 'Alameda' to the header. Here, it may be a good opportunity to include a hint about setting one of the function arguments 'header=None', and linking to the Pandas documentation to introduce how to read source code docs.
  • Need "import numpy as np" statement for Challenge 2 question

15_Errors

  • File Errors: The fourth bullet, which describes the UnsupportedOperation error, should probably say 'the "write" flag instead of the "read" flag' rather than the other way around, because the code cell underneath uses the write flag.
  • We should clarify that errors_01 and errors_02 packages are custom .py files we have written, maybe by specifying this during the import statement.
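For reference, the UnsupportedOperation case from the File Errors bullet can be reproduced with a sketch like the following. It uses a temporary file rather than the notebook's data, and the file name is an assumption made for illustration.

```python
import io
import os
import tempfile

# Create a scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

caught = None
with open(path, "w") as f:  # "w" opens the file for writing only
    f.write("hello\n")
    try:
        f.read()  # reading from a write-only file is not supported
    except io.UnsupportedOperation as err:
        caught = err

print(type(caught).__name__, "-", caught)
```

Opening the file with the "r" flag (or "r+" for both reading and writing) avoids the error.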

15_Answers

  • error_02 module cannot be found because it is not in the same directory as the answers notebooks. We would need either to copy the errors_02 module to the answers directory, or to create an importable directory that contains both of these error_* files (e.g., an error directory with an __init__.py alongside both error_* files).

17_Beautiful-Code

  • Not sure what the import statements are for under the "Long Lines & Continuations" section. They also throw errors.

day 1 , lesson 6 change parameter to argument

In the sentence "A function is a piece of code that is called by name. It can be passed data to operate on (ie. the parameters) and can optionally return data (the return value)." change parameters to arguments.
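The distinction can be shown in a few lines. This is a hypothetical example, not one from the lesson: a parameter is the name in the function definition, and an argument is the value supplied when the function is called.

```python
def greet(name):          # `name` is a parameter: a placeholder in the definition
    return "Hello, " + name

message = greet("Ada")    # "Ada" is an argument: the value passed in the call
print(message)            # Hello, Ada
```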

Curriculum Review

  • Day 2: 07 Answers and 07_lists have different answers for the low_high question.
  • Add time, bio breaks, and breakout group suggestions (in the questions/problems) to the presentation so they are not skipped in remote workshops.
  • Teaching Day 4 is hard because it is not very interactive, especially in remote workshops.
  • Across all of these, create a set of instructions that we can project onto the screen (PowerPoint, maybe?):
    • Download Anaconda
    • Go to the D-Lab GitHub
    • How to open a file in Jupyter Notebook
  • Also add an FAQ to the invitation about whether we will provide recordings.

Part 2 08_loops Issue with mountain_df

Under Challenge 2: Looping Through a Series, the end of the code block that creates the mountains_df data frame refers to "mountain_df" instead of "mountains_df".

Part 2, 08_loops error with rounding

Under "Loops for repeated computation", the list is currently:

tires = [41,35,28]

This should be changed to:

tires = [40.9, 35.2, 28.4]

This will show the round() in the loop is actually working.
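A minimal sketch of why the change matters. The loop body here is assumed, since the notebook's exact code is not quoted in this issue.

```python
tires = [40.9, 35.2, 28.4]

rounded = []
for value in tires:
    # round() with a single argument returns the nearest integer
    rounded.append(round(value))

print(rounded)  # [41, 35, 28]
```

With the original integer list, the output would be identical to the input, so the effect of round() would be invisible.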

Data Frame methods (notebook 12)

Reduce the number of methods introduced to 2-3 and include a challenge question to reinforce learning (move the other functions to an appendix or another section).

Parts 2-4 considerations

Part 2:

Potentially split up notebook 6 into two sections, lists and dictionaries/dataframes.

Notebook 9: in Principles of Custom Functions, adjust the 'Plan' section output to be a dataframe, rather than two lists, to match the rest of the section.

Consider reducing material slightly (I ended up teaching the second half of notebook 9 on Day 3)

Part 3:

  • Consider reducing number of skills covered slightly. Some suggestions:
  • Shorten or remove numpy notebook (consider introducing the numpy package in the Libraries notebook (10) instead)
  • Reduce number of functions introduced in pandas notebook, and increase guided exercise

Part 4:

  • Include answer for notebook 15 challenge 1 in solutions notebook
  • Increase scaffolding throughout the exercise in notebook 17, clarify steps explicitly, and include skeleton code in some areas to help facilitate coding. Potentially shorten sections 2 and 3 slightly to reduce the total amount of material

Split notebook 6

Split up notebook 6 into two sections, lists and dictionaries/dataframes.

Suggested Revisions for the Day 4 project

The Day 4 project got a bit sidetracked by the complexity of section 4.4, Combine output dictionaries. So I added a section 2.4 to the attached solutions notebook and then reworked the rest of the code as one way to make it a bit more digestible. This file would need to be cleaned up a bit and then worked into the other two notebooks, but I think it would be worthwhile because this is a 95% great notebook.
UPR_Solutions-Patty-revisions.ipynb.zip

Update installation instructions to say Python 3.7 or higher

The instructions currently specify Python 3.7 exactly, which confuses some detail-oriented people because they can no longer find the older 3.7 now that Anaconda provides a newer 3.8 version of Python.

This confusion can be avoided by updating the instructions everywhere (not just in this repo; we should check for this elsewhere, too) to say Python 3.7 or higher.
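If it helps attendees confirm their setup, a "3.7 or higher" check could look like the sketch below. The MINIMUM constant and the status messages are assumptions for illustration, not part of the workshop materials.

```python
import sys

# Compare against a minimum version rather than an exact one.
MINIMUM = (3, 7)

if sys.version_info >= MINIMUM:
    status = "OK"
else:
    status = "too old"

# sys.version starts with the version number, e.g. "3.x.y ..."
print("Python", sys.version.split()[0], "is", status)
```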

---------- Forwarded message ---------
From: D-Lab Frontdesk
Date: Tue, Oct 26, 2021 at 2:17 PM
Subject: Re: Python 3.7 Installation problem?
To: Matt Nolan

Hi Matt, the instructions should say Python 3.7 or higher. Thanks for pointing out the confusion!

On Tue, Oct 26, 2021 at 10:08 AM Matt Nolan wrote:
The instructions for the class seem to be out of date.

The page you pointed to contains places to download Python 3.8 but not Python 3.7.

Python 3.7 isn't listed in the archive

Maybe it's the notes in the Google Calendar event that are mistaken. They say 3.7; should they say 3.8?

Can you add some clarity about what exactly I am supposed to download to be prepared for the class today? thanks

Software Requirements: Installation Instructions for Python Anaconda

D-Lab Python Fundamentals
This is the repository for D-Lab's introductory Python-Fundamentals workshop series. Laptop, Internet connection, and Zoom account required.

Workshop goals
There are four folders (one for each day) that contain the notebooks we will walk through for each day:

Day_1 - Running Python, Jupyter Notebooks, variables assignment, data type conversion, working with strings, built-in functions

Day_2 - Lists, for-loops, conditional statements, writing your own functions, scope

Day_3 - Dictionaries, reading and writing data from and to files, installing and importing libraries, debugging errors, list comprehensions, beautiful code

Day_4 - Python application for information retrieval. You will extract targeted information from a text data set of United Nations documents to generate tabular data in a .csv file suitable for subsequent statistical analysis. Everything needed for this exercise is covered in Days 1, 2, and 3.

Installation Instructions
Download and install Python Anaconda distribution 3.7 and the workshop materials to get started. Before Part 1 be sure to:

Download and install Python Anaconda distribution 3.7 --> Click "Download" and then click 64-bit "Graphical Installer" for your current operating system.

====

I searched the page and "Download" isn't found. Towards the bottom of the page linked for the 3.7 installation, the downloads are for Python 3.8.

Reduce material in part 3 - numpy

Reduce/remove numpy notebook. I see potential for featuring numpy in the 'libraries' notebook (10) rather than a standalone notebook

Workshop Notes PyFun 4-5-2022 through 4-14-2022

Attendees = 8 at start. Wide range of student interests in learning python - to broaden skills, for jobs, not to get left behind when it comes to programming languages, 13 as of 9:36. Workshop going super smoothly. Lots of good questions and student interest. Drop reasons include choppy zoom/audio, other causes include other commitments. 12 as of 10:33. Some of the usual issues with Anaconda - tab completion not working, files not showing up in file list when attempting to open in Jupyter Notebook, installs not working properly. Love that we are offering Datahub link as an alternative to facilitate learning in what is most often the first experience for folks who are interested in learning python. 8 as of 11:15. 9 as of 11:33.

Shift custom functions to Part 1

Move 10_custom_functions to Part 1, and streamline it down so that it doesn't use material from Part 2. Move the challenges to Part 2.

introduction of encoding = "utf-8" without explanation

In the section "Read a text file in one line", the Files lesson (Day_3/13_Files.ipynb) introduces encoding = "utf-8" without any explanation.

my_text = open("../Day_4/data/txts/fiji2014.txt", encoding = "utf-8").read()

Also it turns out for OSX (my OS) and Python3, it is not necessary to include the encoding parameter at all, as this works just fine:

my_text = open("../Day_4/data/txts/fiji2014.txt").read()

But this may fail elsewhere (even with Python3) if the default encoding for either the OS or the filesystem is something other than utf-8.

So, either we need to remove this (simplest solution) from the code or we need to provide an explanation and motivating example(s) (which could probably be a whole module itself on text file encodings, etc).
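If an explanation is added, a motivating example might look like this sketch. It writes a temporary file containing non-ASCII characters and reads it back with an explicit encoding; the sample text and file name are assumptions, not the workshop data.

```python
import locale
import os
import tempfile

# Sample text with non-ASCII characters (hypothetical, not the workshop data).
text = "São Tomé and Príncipe"

path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write and read with an explicit encoding so the result does not depend
# on the operating system's default.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

with open(path, encoding="utf-8") as f:
    round_trip = f.read()

print(round_trip == text)  # True

# Without `encoding=`, open() uses a default that can depend on the platform.
print(locale.getpreferredencoding(False))
```

On a machine where the second print shows something other than a UTF-8 encoding, omitting the encoding argument can garble or fail on characters like these.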

Day 4 Solution Notebook Part 5 `process_document` function uses global variable

The process_document function in part 5 of the Day 4 Solution Notebook does not extract the year from each file. Instead, it uses the global year variable assigned in section 1.2 of the notebook, whose value is 2014. As a result, even for the text files with 2013 in the name, the final output csv has 2014 as the year.

Since the read_recommendations function already extracts the year and country from the filename but does nothing with this information, it might be good to have this function ALSO return year and country on top of the recs. Then there could be a local variable year within the process_document function that stores these returned variables in year and country variables.
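A rough sketch of the suggested fix follows. The function bodies are hypothetical stand-ins rather than the solution notebook's code, and they assume file names shaped like fiji2014.txt; the 2013 path is likewise made up for illustration.

```python
import os
import re


def read_recommendations(path):
    """Hypothetical stand-in: return (country, year, recs) for one file."""
    filename = os.path.basename(path)
    # Assumes names shaped like "fiji2014.txt": letters, then a 4-digit year.
    match = re.fullmatch(r"([a-z_]+)(\d{4})\.txt", filename)
    if match is None:
        raise ValueError("unexpected filename: " + filename)
    country = match.group(1)
    year = int(match.group(2))
    recs = []  # placeholder; the real function parses the file contents
    return country, year, recs


def process_document(path):
    """Use the year returned for this file instead of a global `year`."""
    country, year, recs = read_recommendations(path)
    return {"country": country, "year": year, "n_recs": len(recs)}


print(process_document("../Day_4/data/txts/fiji2014.txt"))
print(process_document("../Day_4/data/txts/fiji2013.txt"))  # hypothetical file
```

Because year is now local to process_document, each output row reflects the file it came from.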
