magigo / data_science_tool_book_code

Python 9.84% HTML 90.16%

data_science_tool_book_code's Issues

The chapter 8 code does not run

def read_excel():
    """Read the census statistics broken down by ethnic group / age / sex.
    """
    excel_content = pd.read_excel("A0201.xls",
                                  skiprows=2)
    race_list = excel_content.irow(0)[1:][::3].tolist()
    # Remove the spaces inside the strings
    age_list = map(lambda x: str(x).replace(" ", ""),
                   excel_content.icol(0)[2:].tolist())
    excel_content = pd.read_excel("A0201.xls",
                                  skiprows=4)

    def get_num(lines):
        ret_dict = OrderedDict()
        for k, v in lines.to_dict().items():
            new_v_dict = OrderedDict()
            for vk, vv in v.items():
                new_v_dict[age_list[int(vk)]] = vv
            # Strip everything after the "." in each column header
            ret_dict[k.split(".", 1)[0]] = new_v_dict
        return ret_dict

    result_dict = OrderedDict()
    for i, x in enumerate(range(1, 178, 3)):
        ids = [x, x + 1, x + 2]
        race_list[i] = race_list[i].replace(" ", "")
        result_dict[race_list[i]] = get_num(excel_content.icol(ids))

    return result_dict

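This is the code the author gives for section 8.1.1. It does not run and fails with AttributeError: 'DataFrame' object has no attribute 'irow'. I looked it up and found that the irow and icol attributes have been deprecated. According to the official documentation, iloc should be used instead, but the documentation is entirely in English and hard for me to follow. After switching to iloc, it fails again with 'DataFrame' object has no attribute 'tolist', so it still does not run. There are also too few comments. This is an introductory book, and without comments I cannot follow the code. For example, I do not understand excel_content.irow(0)[1:][::3].tolist(). The book never covered lambda earlier, yet the code uses it without a comment, which makes it hard to read. This code does not run, and neither does the rest of chapter 8. I hope the author can correct it soon.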
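For readers who hit the same errors, the deprecated accessors map onto iloc roughly as sketched below. This is a minimal sketch, not the author's corrected code: the `sheet` frame is a hypothetical miniature of A0201.xls with made-up values. One likely cause of the second error is that iloc with a list or slice selector returns a DataFrame, which has no `tolist`, whereas a single integer returns a Series, which does.

```python
import pandas as pd

# Hypothetical miniature of the sheet layout (values are made up):
# row 0 has a group name every third column, column 0 has age labels.
sheet = pd.DataFrame([
    ["age", "Han", None, None, "Hui", None, None],
    [None, "total", "male", "female", "total", "male", "female"],
    ["0 - 4", 10, 6, 4, 3, 2, 1],
    ["5 - 9", 12, 7, 5, 4, 2, 2],
])

# irow(0) -> iloc[0]: a single integer returns a Series, which has .tolist().
first_row = sheet.iloc[0]
# iloc[[0]] (a list selector) returns a DataFrame, which has no .tolist().
first_row_frame = sheet.iloc[[0]]

# [1:][::3] means "skip the first cell, then keep every third cell",
# which is the same as the single slice 1::3.
race_list = sheet.iloc[0, 1::3].tolist()

# icol(0) -> iloc[:, 0]. A list comprehension replaces map + lambda so the
# result is a real list that can be indexed under Python 3.
age_list = [str(x).replace(" ", "") for x in sheet.iloc[2:, 0].tolist()]

# icol(ids) -> iloc[:, ids] with a list of column positions.
ids = [1, 2, 3]
han_block = sheet.iloc[2:, ids]

print(race_list)
print(age_list)
```

The lambda in the book's code does the same job as the list comprehension here: it removes the spaces from each age label.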

SyntaxError raised when executing Hello.py

I typed Hello.py exactly as shown in the book, but got the following error:
SyntaxError: Non-ASCII character '\xe4' in file E:\Program Files\Learning\Python\Hello.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
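Python 2 treats source files as ASCII unless the file carries the encoding declaration described in PEP 263, which is what the error message points to. A minimal sketch of a file that avoids the error follows. The string content is hypothetical, since the book's actual Hello.py is not quoted in this report, and the file is assumed to be saved as UTF-8.

```python
# -*- coding: utf-8 -*-
# The declaration above must be on the first or second line of the file.
# It tells Python 2 that the source is saved as UTF-8.

# Hypothetical content; the book's Hello.py may differ.
greeting = "你好，世界"
print(greeting)
```

Python 3 already assumes UTF-8 source, so there the declaration is harmless but not required.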

Object IDs returned by id() do not match what the book shows

>>> dollar_rate=USD_to_CNY=6.631
>>> id(6.631)
42354128L
>>> id(USD_to_CNY)
42354152L
>>> id(dollar_rate)
42354152L

Python version:
Python 2.7.14 (v2.7.14:84471935ed, Sep 16 2017, 20:25:58) [MSC v.1500 64 bit (AMD64)] on win32
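In CPython the value returned by id() is tied to where the object lives in memory, so the exact numbers differ between runs and machines and are not expected to match a printed example. What can be checked reliably is whether two names refer to the same object. A minimal sketch, assuming CPython:

```python
dollar_rate = USD_to_CNY = 6.631

# Both names are bound to the same float object, so their ids agree.
same_object = dollar_rate is USD_to_CNY
same_id = id(dollar_rate) == id(USD_to_CNY)

# A float built separately at run time has an equal value, but in CPython
# it is a different object while dollar_rate is still alive.
rebuilt = float("6.631")
equal_value = rebuilt == dollar_rate
distinct_object = rebuilt is not dollar_rate

print(same_object, same_id, equal_value, distinct_object)
```

This matches the transcript above: USD_to_CNY and dollar_rate share one id, while the separately written literal 6.631 gets another.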

Is there a problem with the iris data preprocessing in the KNN (k-nearest neighbors) example?

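import csv
import numpy as np

def get_data(loc='iris.csv'):
    with open(loc, 'r') as fr:
        lines = csv.reader(fr)
        data_file = np.array(list(lines))
    # Skip the first row; every column except the last is a feature
    data = data_file[1:, 0:-1].astype(float)
    # The last column holds the class label
    labels = data_file[1:, -1]
    return data, labels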

I changed the slice here to start from 0.

There is one more place:

def try_once():
    data, labels = get_data()
    index = range(len(data))
    data = data[index]
    labels = labels[index]
    index = list(index)
    random.shuffle(index)
    labels = labels[index]
    data = data[index]
    input_data = data[-1]
    data = data[:-1]

In Python 3, random.shuffle cannot be applied to a range object, so I changed the code to convert it to a list first.
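The behaviour described here can be reproduced in isolation. The sketch below uses small made-up arrays as a hypothetical stand-in for what get_data() returns, and shows both the failure on a range and the list-based workaround.

```python
import random
import numpy as np

# Hypothetical stand-in for the iris arrays returned by get_data().
data = np.arange(12, dtype=float).reshape(6, 2)
labels = np.array(["a", "b", "c", "d", "e", "f"])

# In Python 3, range() is immutable, so shuffling it in place fails.
try:
    random.shuffle(range(len(data)))
    shuffle_error = None
except TypeError as exc:
    shuffle_error = exc

# Converting to a list first gives shuffle something it can reorder.
index = list(range(len(data)))
random.shuffle(index)

# Indexing both arrays with the same list keeps rows and labels paired.
shuffled_data = data[index]
shuffled_labels = labels[index]
```

In Python 2, range() returned a list, which is why the book's original code worked there.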

scikit-learn cannot be imported

C:\Users\Administrator>python -m pip install sklearn，scipy
This succeeded.
But when importing:
C:\Users\Administrator>python

>>> import sklearn
it reports that NUMPY-MKL cannot be imported.
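To narrow down which package is actually failing, each dependency can be imported on its own. The helper below is a hypothetical diagnostic sketch, not part of the book's code; `check_imports` is a name introduced here for illustration.

```python
import importlib


def check_imports(names=("numpy", "scipy", "sklearn")):
    """Try to import each package and record its version or the error."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown version")
        except ImportError as exc:
            # Keep the message so the failing dependency is visible.
            report[name] = "import failed: {}".format(exc)
    return report


report = check_imports()
for name, status in report.items():
    print(name, "->", status)
```

If numpy and scipy import cleanly but sklearn does not, the problem is more likely in how scikit-learn was installed than in NumPy itself.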

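A small error on page 23

Page 23, lines 3 and 4:

"abc"+"dfe"
abcdef
Swapping the order of "fe" in one of the two lines fixes it.
This book is fairly easy for beginners to get into, and reading it does not give me a headache. Some other books are so deep that the more I read, the more my head hurts.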
