effective-pandas's Introduction

effective-pandas's People

Contributors

jzwinck, lebigot, ocowchun, rns, tobypatterson, tomaugspurger, whitehaven

effective-pandas's Issues

chapter4_tidy_data variable versus rest plot gives some empty plots

These tutorials are superb, and the author has clearly put a lot of effort into them. The code may have run fine in 2016, but now (February 2019)
some of the code in chapter4_tidy_data fails.

Issue 1

example:

g = sns.FacetGrid(tidy, col='team', col_wrap=6, hue='team', size=2)
g.map(sns.barplot, 'variable', 'rest');

Gives:
a grid in which some of the plots are empty (Imgur link).

Issue 2

(rest.unstack()
     .query('away_team < 7')
     .rolling(7)
     .mean())

This gives all NaNs, and the plot fails.

and so on.
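
One possible explanation for the all-NaN result in Issue 2, offered as an assumption rather than a confirmed diagnosis of the notebook: by default rolling(7) needs 7 non-missing observations in each window, and unstacked data often has gaps. The sketch below uses made-up values and a window of 3 to show the effect and how min_periods relaxes it.

```python
import numpy as np
import pandas as pd

# Hypothetical rest-day values for one team; NaN marks a day without a game.
rest = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0, np.nan, 7.0])

# Default behaviour: min_periods equals the window size, so a window
# with fewer than 3 valid observations produces NaN.
strict = rest.rolling(3).mean()

# Relaxed behaviour: a window needs only one valid observation.
relaxed = rest.rolling(3, min_periods=1).mean()

print(strict.isna().all())
print(relaxed.tolist())
```

Whether min_periods=1 is appropriate depends on what the plot is meant to show; it averages over whatever values happen to be present in each window.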

error when loading data

When running cell 2 of the first notebook I get:

SSLError: HTTPSConnectionPool(host='www.transtats.bts.gov', port=443): Max retries exceeded with url: /DownLoad_Table.asp?Table_ID=236&Has_Group=3&Is_Zipped=0 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)')))

It looks like a problem with the connection to the API.
Would it be possible to include the data in the repo?
Thanks
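
Until the data is bundled with the repo, one way to avoid depending on the remote server for every run is to cache the download locally. This is a minimal sketch, not the notebook's code: load_or_fetch and fake_fetch are hypothetical names, and the fetch callable stands in for whatever download code the notebook uses.

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_or_fetch(path: Path, fetch: Callable[[], bytes]) -> bytes:
    """Return the bytes cached at `path`; call `fetch` only if the file is missing."""
    if path.exists():
        return path.read_bytes()
    payload = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(payload)
    return payload


# Demonstration with a stand-in for the real download.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "flights.csv.zip"
    calls = []

    def fake_fetch() -> bytes:
        calls.append(1)  # record each simulated network request
        return b"placeholder bytes"

    first = load_or_fetch(cache, fake_fetch)   # triggers the fetch
    second = load_or_fetch(cache, fake_fetch)  # served from the cached file
```

This does not fix the certificate error itself; it only means a copy obtained once (for example through a browser) can be reused.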

Python 3.6 pandas_datareader not found

I am running Windows 10 64-bit, Python 3.6, and Spyder. I have both pandas_datareader and pandas_datareader-0.5.0.dist-info installed in the same site-packages directory as pandas. When I attempt to import the module I receive the following error message:

Python 3.6.2 |Anaconda, Inc.| (default, Sep 19 2017, 08:03:39) [MSC v.1900 64 bit (AMD64)]
Type "copyright", "credits" or "license" for more information.

IPython 6.1.0 -- An enhanced Interactive Python.

import pandas as pd

import pandas_datareader as pdr
Traceback (most recent call last):

File "", line 1, in
import pandas_datareader as pdr

ModuleNotFoundError: No module named 'pandas_datareader'

How do I import pandas_datareader via another method, or get Python to recognize the module?

Effective pandas part 1 - 'modern-1-url.txt'

To set up the flight data in part 1, the notebook uses the following code to populate the data variable; however, I can't find the text file ('modern-1-url.txt') it reads from. Where is this file, or what does it contain?

with open('modern-1-url.txt', encoding='utf-8') as f:
    data = f.read().strip()

modern pandas 3 - merge impossible

The merge (many-to-one) at the end of the third notebook results in an empty data frame, because the weather data is for 2014 and the flights data for 2017. Your results show flight data for 2014, so I imagine you may be using a different dataset.

This may also have to do with the source data being changed; I also noticed that the underscores are removed from the flight data set, e.g. fl_date has become flightdate and unique_carrier has become uniquecarrier.

p.s. Thanks for sharing your well-written code and insights into pandas, they are a very welcome and useful read!
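
A quick way to confirm both observations above is to rename the columns back to their old names and merge with indicator=True, which records whether each row found a match. This is a sketch on made-up data: the column tmpf and all values are hypothetical, and only the renamed column names come from the report above.

```python
import pandas as pd

# Made-up flights for 2017, using the new column names without underscores.
flights = pd.DataFrame({
    "flightdate": pd.to_datetime(["2017-01-01", "2017-01-02"]),
    "uniquecarrier": ["AA", "DL"],
})

# Made-up weather for 2014; "tmpf" is a hypothetical column name.
weather = pd.DataFrame({
    "fl_date": pd.to_datetime(["2014-01-01", "2014-01-02"]),
    "tmpf": [30.0, 28.0],
})

# Restore the older names so code written against them keeps working.
flights = flights.rename(
    columns={"flightdate": "fl_date", "uniquecarrier": "unique_carrier"}
)

# A left merge keeps every flight; the _merge column shows which rows matched.
checked = flights.merge(
    weather, on="fl_date", how="left", indicator=True, validate="m:1"
)
print(checked["_merge"].value_counts())
```

If every row comes back as left_only, the two tables share no keys, which is what an empty inner merge would indicate.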

Visualization: Feather seems not to play nicely under Windows

See conda-forge/feather-format-feedstock#1 for a hint on this. Installation is at best problematic - and I found it impossible.

I worked around it as follows.
Comment out all of the following:
import feather

%load_ext rpy2.ipython

%%R
suppressPackageStartupMessages(library(ggplot2))
library(feather)
write_feather(diamonds, 'diamonds.fthr')

And then replace
import feather
df = feather.read_dataframe('diamonds.fthr')
df.head()

with:
from ggplot import diamonds
# type(diamonds) # dataframe...
df = diamonds # primitive!
df.head()

There is one much more mundane issue, which I'll raise separately

File download doesn't work

When executing the cell with file download from the Transtats website, we do not download a zip file but an HTML page containing:

<head>
	<script type="text/javascript" src="js/dot_ostr_analytics.js"></script>
</head>
<body>

start time ==> 5:59:28 PM<br>complete time==> 5:59:28 PM

</body></html>

Can you specify which table we should download from the government website?
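
To catch this situation before pandas tries to read the file, the downloaded bytes can be checked with the standard library. The helper below is a minimal sketch; describe_download is a hypothetical name and the file contents are stand-ins.

```python
import tempfile
import zipfile
from pathlib import Path


def describe_download(path: Path) -> str:
    """Report whether `path` holds a zip archive, an HTML page, or something else."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return "zip archive containing: " + ", ".join(zf.namelist())
    head = path.read_bytes()[:200].lower()
    if b"<html" in head or b"<head" in head or b"<body" in head:
        return "HTML page, not a zip archive"
    return "unknown content"


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the HTML response described above.
    html_path = Path(tmp) / "response.zip"
    html_path.write_bytes(b"<head>\n<script></script>\n</head>\n<body></body></html>")

    # Stand-in for a successful download.
    zip_path = Path(tmp) / "flights.csv.zip"
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("flights.csv", "a,b\n1,2\n")

    html_result = describe_download(html_path)
    zip_result = describe_download(zip_path)

print(html_result)
print(zip_result)
```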

Discrepancy when downloading the dataset in the notebook .zip / .csv

Hi,

I've just tried to use the first notebook in the series and it turns out the data is downloaded to a file called "flights.csv" which is then opened as "flights.csv.zip".

Suggestion: correct save filename to "flights.csv.zip". Might be especially useful for beginners...

Best regards,
Florian

Intro notebook UnsortedIndexError on the date range (cell 18)

The last cell of notebook 1 is throwing an exception, along with the message: UnsortedIndexError: 'MultiIndex slicing requires the index to be lexsorted: slicing on levels [4], lexsort depth 3'. I don't understand this, since cell 13 seems like it should be sorting the index.

python 3.7, pandas 0.23.4 and 0.24.1.

Incidentally, upgrading to 0.24.1 also broke cell 6, which crashes with KeyError: "['tail_num'] not in index" (which I don't understand at all).

All of that said, I've now learned about IndexSlice, which is truly awesome! Thanks.
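
For anyone hitting the same UnsortedIndexError, the sketch below shows the general pattern on made-up data: sort the MultiIndex, keep the result, then slice with IndexSlice. It does not reproduce the notebook, and the carrier and date values are hypothetical. One thing worth checking is that sort_index returns a new DataFrame rather than sorting in place.

```python
import pandas as pd

# Made-up two-level index, deliberately out of order.
index = pd.MultiIndex.from_tuples(
    [
        ("DL", "2014-01-02"),
        ("AA", "2014-01-03"),
        ("DL", "2014-01-01"),
        ("AA", "2014-01-01"),
    ],
    names=["carrier", "date"],
)
df = pd.DataFrame({"delay": [5, 12, 7, 3]}, index=index)

# sort_index returns a sorted copy, so the result must be assigned.
ordered = df.sort_index()

# Slice the second level across all carriers.
window = ordered.loc[pd.IndexSlice[:, "2014-01-01":"2014-01-02"], :]
print(window)
```

Checking ordered.index.is_monotonic_increasing before slicing is a cheap way to confirm that the sort took effect on the object being sliced.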

Visualization notebook - pandas_datareader

Arguably it would have been nicer to see
from pandas_datareader import fred
at the top with the other imports, so I could see the dependencies more easily.

Feel free to close with no change of course.

Unable to open file on github or offline

GitHub just says there was a problem with the file; Jupyter provides the following error message:
Unreadable Notebook:
..\modern_2_method_chaining.ipynb NotJSONError('Notebook does not appear to be JSON: '\n\n\n\n\n<html lang="en...',)
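
The error text suggests the saved file contains an HTML page rather than the notebook's JSON, which can happen when a web page is saved in place of the raw file. This is a guess based on the message. A minimal check, assuming the nbformat 4 layout with a top-level "cells" key (check_notebook is a hypothetical helper):

```python
import json
import tempfile
from pathlib import Path


def check_notebook(path: Path) -> str:
    """Classify a file as a notebook, an HTML page, or other content."""
    text = path.read_text(encoding="utf-8")
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        if "<html" in text[:500].lower():
            return "HTML page saved under an .ipynb name"
        return "not valid JSON"
    if isinstance(doc, dict) and "cells" in doc:
        return "looks like a notebook"
    return "JSON, but no 'cells' key"


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the broken download.
    bad = Path(tmp) / "bad.ipynb"
    bad.write_text('\n\n<html lang="en"><body>page</body></html>', encoding="utf-8")

    # Stand-in for a minimal valid notebook.
    good = Path(tmp) / "good.ipynb"
    good.write_text(
        json.dumps({"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 2}),
        encoding="utf-8",
    )

    bad_result = check_notebook(bad)
    good_result = check_notebook(good)

print(bad_result)
print(good_result)
```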

pandas_datareader

Not able to find enough resolution on this issue, so I decided to post (relatively new to GitHub, so apologies in advance if I'm doing something wrong).

Operating system: Windows 10
IDE: VS Code (1.37.0)
Python: 3.7.3
Distribution: Anaconda

  1. I created a virtual environment using Conda
  2. I installed pandas_datareader using: conda install -c anaconda pandas-datareader
  3. I confirmed that the package is present in the virtual environment
  4. pandas version: 0.25.0
  5. pandas_datareader: 0.7.4
  6. When I run import pandas_datareader in VS Code (and I'm running the command interactively/in Jupyter notebook), I get the following error

ModuleNotFoundError Traceback (most recent call last)
in
----> 1 import pandas_datareader

ModuleNotFoundError: No module named 'pandas_datareader'

I think another option is to install a previous version of pandas, but at this point I'm not entirely sure what to do. I wanted to check before I take any further steps.
Thanks
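
A common cause of this symptom is that the notebook kernel runs a different interpreter from the one the package was installed into. That is an assumption about this setup, not something the report confirms. The sketch below prints which interpreter is active and whether it can locate the module; module_report is a hypothetical helper built on the standard library.

```python
import importlib.util
import sys


def module_report(name: str) -> dict:
    """Describe the running interpreter and whether it can find module `name`."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


report = module_report("pandas_datareader")
stdlib_report = module_report("json")  # sanity check against a stdlib module

print(report)
print(stdlib_report)
```

Run it once in the notebook and once in the terminal where the package was installed. If the "interpreter" paths differ, select the conda environment's interpreter in the editor, or install into the active one by running that interpreter with -m pip install pandas-datareader.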
