dsc-looping-over-collections-lab's Introduction

Looping Over Collections - Lab

Introduction

In this lab, we will practice what we know about for loops. We will use them to reduce the amount of code we write by hand when iterating through collections. We will use data from the Excel file cities.xlsx, which contains data on different cities, their populations, and their areas. Finally, we will use this information to plot and compare the cities. Let's get started!

Objectives

You will be able to:

  • Use a for loop to iterate over a collection

Identifying When To Use a For Loop

In the last lesson, we worked with some of our travel data. Additional data has been compiled in the cities.xlsx Excel spreadsheet. Let's retrieve this data from Excel using the pandas library. Don't worry if pandas feels unfamiliar; it will be covered in detail later. For now, just follow the provided code and get a feel for what is happening. First, read the information from the Excel file as a list of dictionaries, with each dictionary representing a location. Then, assign this list to the variable cities.

import pandas as pd
file_name = './cities.xlsx'
travel_df = pd.read_excel(file_name)
cities = travel_df.to_dict('records')

Next, retrieve the names of the first three cities, stored under the 'City' key of each dictionary, along with the 'Population' of each of those cities. Then plot the names as our x_values and the populations as our y_values using the matplotlib library. Again, don't worry about understanding all of the details behind what matplotlib is doing. It will be covered in more detail soon.

import matplotlib.pyplot as plt

%matplotlib inline

x_values = [cities[0]['City'], cities[1]['City'], cities[2]['City']]
y_values = [cities[0]['Population'], cities[1]['Population'], cities[2]['Population']]
 
plt.bar(x_values, y_values)
plt.ylabel('Population')
plt.title('City Populations')
 
plt.show()

Of course, as you may have spotted, there is a good amount of repetition in displaying this data. Just take a look at how we retrieved the data for our x_values and y_values. Notice also that, unless we know the exact number of cities in our Excel file, this method of retrieving data might miss some rows or try to access values that don't exist.

We can take a close look at this below:

x_values = [cities[0]['City'], cities[1]['City'], cities[2]['City']]
y_values = [cities[0]['Population'], cities[1]['Population'], cities[2]['Population']]

As we can see, if we have more than 3 rows of data, our x_values and y_values will be incomplete, and if we had only 2 rows of data, our code would break with an IndexError when it reached cities[2].

So in this lesson, we will use for loops to display information about our travel locations with less repetition and more accuracy.
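To make the contrast concrete, here is a minimal sketch of the loop-based approach. It runs on a small made-up list of dictionaries (the name `sample_cities` and its values are hypothetical, not the contents of cities.xlsx), so it illustrates the idea rather than the lab's solution.

```python
# Hypothetical sample data; the real lab reads its records from cities.xlsx.
sample_cities = [
    {'City': 'Alpha', 'Population': 300},
    {'City': 'Beta', 'Population': 200},
]

x_values = []
y_values = []

# The loop visits every dictionary, however many the list contains,
# so nothing is skipped and no missing index is ever requested.
for city in sample_cities:
    x_values.append(city['City'])
    y_values.append(city['Population'])

print(x_values)
print(y_values)
```

Because the loop never names an index, the same code works unchanged whether the list holds 2 entries or 200.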

Instructions

Before we get into creating graphs from our cities data, let's get a bit more comfortable with the data we are working with. Let's start with just one element (i.e. a city dictionary object) and retrieve its area.

buenos_aires = cities[0]
buenos_aires
# here we want to find just the area of buenos_aires
buenos_aires_area = None
# code goes here

buenos_aires_area
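As a reminder of how dictionary lookups work, the sketch below reads a single value by its key. The dictionary `sample_city` and its values are hypothetical; the 'Area' key is assumed to match the column name used later in this lab.

```python
# Hypothetical record shaped like one entry of the cities list.
sample_city = {'City': 'Alpha', 'Population': 300, 'Area': 42}

# Square brackets with the key return the value stored under that key.
sample_area = sample_city['Area']
print(sample_area)
```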

Now that we have a bit more familiarity with our dictionaries, we can move on to gathering all the information we need to create our plots.

Our cities list contains information about the top 12 cities. For our upcoming iteration tasks, it will be useful to have a list of the numbers 0 through 11. Use what we know about len and range to generate a list of numbers 0 through 11. Assign this to a variable called city_indices.

city_indices = None
city_indices # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
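For reference, this is roughly how len and range combine on an arbitrary list. The names `sample_names` and `sample_indices` are hypothetical and stand in for the lab's own variables.

```python
# Hypothetical list; any list behaves the same way.
sample_names = ['Alpha', 'Beta', 'Gamma']

# len() gives the number of elements, range() counts from 0 up to (not including)
# that number, and list() turns the range object into an actual list.
sample_indices = list(range(len(sample_names)))
print(sample_indices)  # [0, 1, 2]
```

Deriving the indices from len means the result stays correct if the list later grows or shrinks.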

Now, using the cities list, we want to create a list of the names for each city. Loop through each city and append its name ('City') to the city_names list.

city_names = []

city_names

Your task is to assign the variable names_and_ranks to a list, with each element equal to the city name and its corresponding rank. For example, the first element would be "1. Buenos Aires" and the second would be "2. Toronto". Luckily for us, the list of cities that we read from our Excel file is already ordered from most populous to least populous. So, all we need to do is add the numbers 1 through 12 to the beginning of each city name.

Use a for loop and the lists city_indices and city_names to accomplish this. We'll need to perform some nifty string interpolation to format our strings properly. Check out f-string interpolation to see how we can pass values into a string. Remember that list indices start at zero, but we want our names_and_ranks list to start at one!

names_and_ranks = []
# write a for loop that adds the properly formatted string to the names_and_ranks list

names_and_ranks
# run this cell to check that your output matches the format
print(names_and_ranks[0]) # '1. Buenos Aires'
print(names_and_ranks[1]) # '2. Toronto'
print(names_and_ranks[-1]) # '12. Iguazu Falls'
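If f-strings are new to you, the sketch below shows the general pattern of shifting a zero-based index to a one-based rank inside the string. It uses a made-up list (`sample_names` and `labels` are hypothetical names), so treat it as an illustration of the formatting rather than the expected answer.

```python
# Hypothetical list of names, already in the desired order.
sample_names = ['Alpha', 'Beta', 'Gamma']

labels = []
for index in range(len(sample_names)):
    # Expressions inside the curly braces are evaluated and inserted into the string;
    # index + 1 turns the zero-based position into a rank that starts at one.
    labels.append(f"{index + 1}. {sample_names[index]}")

print(labels)  # ['1. Alpha', '2. Beta', '3. Gamma']
```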

OK, now use another for loop to iterate through our list of cities and create a new list called city_populations that has the population for each city ('Population').

city_populations = []
# use a for loop to iterate through the list of cities with their corresponding population
print(city_populations[0]) # 2891000
print(city_populations[1]) # 2800000
print(city_populations[-1]) # 0

Great! Now we can begin to plot this data. Again, we'll use matplotlib to create a bar graph with our cities and their respective population data. To do this, we use the .bar() function and pass in our x-axis and y-axis values, add a label and title, and finally call the .show() function to view our new bar graph.

Note: In the example below, we are adding a custom rotation for our x-axis labels so that they do not overlap.

plt.bar(names_and_ranks, city_populations)
plt.xticks(rotation='vertical')
plt.ylabel('Population')
plt.title('City Populations')
plt.show()

Now we want to declare a variable called city_areas that points to a list of the areas of all the cities. Let's use a for loop to iterate through our cities and append each city's area to city_areas.

city_areas = []
# write a for loop that adds the 'Area' of each city to the list city_areas

Now that we have the city areas and populations, let's plot them to see how the size of each city compares to its population.

plt.bar(names_and_ranks, city_populations)

plt.ylabel('Population')
plt.xlabel('Cities')
plt.title('City Populations')
plt.xticks(rotation='vertical')
 
plt.show()
plt.bar(names_and_ranks, city_areas)
plt.ylabel('Area')
plt.xlabel('Cities')
plt.title('City Areas')
plt.xticks(rotation='vertical')
 
plt.show()

Summary

In this section we saw how we can use for loops to go through elements of a list and perform the same operation on each. By using for loops we were able to reduce the amount of code that we wrote and write more expressive code.

dsc-looping-over-collections-lab's People

Contributors

cheffrey2000, hoffm386, loredirick, mas16, peterbell, tkoar

dsc-looping-over-collections-lab's Issues

Problem with spreadsheet

Link to Canvas

Issue Subtype

  • Master branch code
  • Solution branch code
  • Code tests
  • Layout/rendering issue
  • Instructions unclear
  • [x] Other (explain below)

Describe the Issue

Source

The spreadsheet used during this segment of the class has some data inverted. The population and area for Toronto are inverted causing the results to be wrong. Not only that but the data is not given to the user organized by rank (highest population), which is something said to be true in the notebook.

Concern

Toronto's population and area are inverted and the cities are not organized by rank.

(Optional) Proposed Solution

Fix spreadsheet.

What OS Are You Using?

  • OS X
  • Windows
  • WSL
  • [x] Linux
  • Saturn Cloud from Canvas

Any Additional Context?

Needs additional python package

Same issue as in the Working with Dictionaries - Lab:

Missing optional dependency 'xlrd'. Install xlrd >= 1.0.0 for Excel support Use pip or conda to install xlrd.

Would need to install an additional python library to work with this file type, which is throwing errors in IllumiDesk - recommend moving the data from a .xlsx file to a csv or something

Solution doesn't follow prompt

On the solution page, when defining city_indices, the prompt asks you to use len(). The solution doesn't use len().

Unable to see assignment instructions, only see blank notebook titled "Index"

Link to Canvas

https://learning.flatironschool.com/courses/7380/assignments/258627?module_item_id=613169

Issue Subtype

[ ] Master branch code
[ ] Solution branch code
[ ] Code tests
[X] Layout/rendering issue
[ ] Instructions unclear
[X] Other (explain below)

Describe the Issue

Source

**There is no code or markdown causing the issue, because I cannot even see the assignment instructions. I am able to open Saturn Cloud, but all I see is a completely blank notebook titled "index".**

Concern

There is no code or markdown causing the issue, because I cannot even see the assignment instructions. I am able to open Saturn Cloud, but all I see is a completely blank notebook titled "Index".

(Optional) Proposed Solution

I do not have a suggestion for alternative code or markdown that would resolve the issue.

What OS Are You Using?

  • OS X
  • Windows
  • WSL
  • Linux
  • [X] Saturn Cloud from Canvas

Any Additional Context?

This is the first time in the course that I have really had an issue with Saturn Cloud from Canvas and if I cannot even see the assignment instructions, then obviously I will not be able to successfully complete it. Please resolve the issue as soon as possible. Thank you so much!

Incorrect data

Link to Canvas

https://learning.flatironschool.com/courses/7620/assignments/268916?module_item_id=646890

Issue Subtype

  • Master branch code
  • Solution branch code
  • Code tests
  • Layout/rendering issue
  • Instructions unclear
  • [x] Other (explain below)

Describe the Issue

Source

The issue is in the downloaded file, not the notebook

Concern

The notebook includes descriptions of what is in the excel file, but those descriptions are incorrect:

  1. The cities are not in order by population.
  2. The values for Toronto's area and population are inverted.
  3. The values for Buenos Aires' and Toronto's populations differ from what is described in the notebook.

(Optional) Proposed Solution

Please fix the data in the excel file to ensure that it is correct.

What OS Are You Using?

  • OS X
  • [x] Windows
  • WSL
  • Linux
  • Saturn Cloud from Canvas

Any Additional Context?
