data_science_tool_book_code's People

Contributors

magigo

data_science_tool_book_code's Issues

Chapter 8 code does not run

import pandas as pd
from collections import OrderedDict


def read_excel():
    """Read the census statistics broken down by ethnic group, age, and sex."""
    excel_content = pd.read_excel("A0201.xls",
                                  skiprows=2)
    race_list = excel_content.irow(0)[1:][::3].tolist()
    # Remove the spaces inside each string
    age_list = map(lambda x: str(x).replace(" ", ""),
                   excel_content.icol(0)[2:].tolist())
    excel_content = pd.read_excel("A0201.xls",
                                  skiprows=4)

    def get_num(lines):
        ret_dict = OrderedDict()
        for k, v in lines.to_dict().items():
            new_v_dict = OrderedDict()
            for vk, vv in v.items():
                new_v_dict[age_list[int(vk)]] = vv
            # Drop everything after the "." in each column header
            ret_dict[k.split(".", 1)[0]] = new_v_dict
        return ret_dict

    result_dict = OrderedDict()
    for i, x in enumerate(range(1, 178, 3)):
        ids = [x, x + 1, x + 2]
        race_list[i] = race_list[i].replace(" ", "")
        result_dict[race_list[i]] = get_num(excel_content.icol(ids))

    return result_dict

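This is the code the author gives for section 8.1.1. It does not run and fails with AttributeError: 'DataFrame' object has no attribute 'irow'. After looking it up, I found that the irow and icol attributes have been deprecated. According to the official documentation, iloc should be used instead, but the documentation is entirely in English and very hard for me to follow. After switching to iloc, it fails again with 'DataFrame' object has no attribute 'tolist', so it still does not run. There are also too few comments. This is an introductory book, and with so few comments I cannot follow the code. I do not understand excel_content.irow(0)[1:][::3].tolist() either. The earlier chapters never covered lambda, yet the code uses it without a single comment, which makes it hard to read. This code does not run, and neither does the rest of chapter 8. I hope the author can correct it soon.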
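For readers who hit the same two errors, here is a minimal sketch of how the removed `irow`/`icol` calls map onto `iloc`. It does not read A0201.xls. The small `frame` below is hypothetical data that only imitates the layout the code seems to expect (a label column followed by three columns per group), so treat it as an illustration rather than the author's fix.

```python
import pandas as pd

# Hypothetical miniature of the sheet: column 0 holds age labels,
# then three columns (total / male / female) per ethnic group.
frame = pd.DataFrame(
    [
        ["", "Han", None, None, "Zhuang", None, None],
        ["", "total", "male", "female", "total", "male", "female"],
        ["0 岁", 10, 6, 4, 3, 2, 1],
        ["1 - 4 岁", 20, 11, 9, 5, 3, 2],
    ]
)

# irow(0) -> iloc[0]: a single integer selects one row as a Series,
# and a Series does have .tolist().
first_row = frame.iloc[0]

# [1:] skips the label column; [::3] then keeps every third entry,
# which leaves one name per three-column group.
race_list = first_row[1:][::3].tolist()

# icol(0) -> iloc[:, 0]: one column as a Series.
# A list comprehension replaces map(lambda ...), because in Python 3
# map returns an iterator that cannot be indexed like age_list[i].
age_list = [str(x).replace(" ", "") for x in frame.iloc[:, 0][2:].tolist()]

# icol([1, 2, 3]) -> iloc[:, [1, 2, 3]]: a list of positions returns a
# DataFrame, which has no .tolist(); use .to_dict() or .values.tolist().
han_block = frame.iloc[2:, [1, 2, 3]]
han_rows = han_block.values.tolist()
```

The "'DataFrame' object has no attribute 'tolist'" error is what you would expect if a slice or list was passed to `iloc` where the original code selected a single row or column, since that yields a DataFrame instead of a Series.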

Is there a problem with the iris data preprocessing in the KNN nearest-neighbor algorithm?

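import csv

import numpy as np


def get_data(loc='iris.csv'):
    with open(loc, 'r') as fr:
        lines = csv.reader(fr)
        data_file = np.array(list(lines))
    # Skip the header row; every column except the last is a feature
    data = data_file[1:, 0:-1].astype(float)
    # The last column holds the class label
    labels = data_file[1:, -1]
    return data, labels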

Here I changed the column slice so that it starts from 0.

There is one more spot:
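To show what the change of slice start does, here is a small sketch on an in-memory sample. The three rows and the header names are made up for illustration, and it assumes the file has no leading index column. If the real iris.csv does have one, starting at 1 would be the right choice instead.

```python
import csv
import io

import numpy as np

# Hypothetical sample in the common iris layout: four features, then the label.
sample = io.StringIO(
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
    "6.3,3.3,6.0,2.5,virginica\n"
)
data_file = np.array(list(csv.reader(sample)))

# Row 0 is the header, so rows start at 1; columns 0 up to (not including)
# the last one are the features.
data = data_file[1:, 0:-1].astype(float)
labels = data_file[1:, -1]

# Starting the column slice at 1 silently drops the first feature.
truncated = data_file[1:, 1:-1].astype(float)

print(data.shape, truncated.shape)
```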

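import random


def try_once():
    data, labels = get_data()
    # random.shuffle needs a mutable sequence, so build a list, not a range
    index = list(range(len(data)))
    random.shuffle(index)
    # Apply the same shuffled order to the labels and the samples
    labels = labels[index]
    data = data[index]
    # Take the last shuffled sample as the input; the rest stay in data
    input_data = data[-1]
    data = data[:-1]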

In Python 3, shuffle cannot be applied to a range object, so I changed the code to convert it to a list first.
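The behavior described above can be reproduced with the standard library alone. This is a minimal sketch with a made-up length of 5, not the book's code:

```python
import random

# In Python 3, range() returns an immutable sequence, and random.shuffle
# works by assigning to positions in place, so this raises TypeError.
index = range(5)
try:
    random.shuffle(index)
    shuffle_error = None
except TypeError as exc:
    shuffle_error = exc

# Converting to a list first gives shuffle something it can reorder.
index = list(range(5))
random.shuffle(index)

# random.sample returns a new shuffled list without modifying its input,
# so it also accepts a range directly.
sampled = random.sample(range(5), 5)
```

In Python 2, `range` returned a list, which is presumably why the original code worked when the book was written.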

scikit-learn cannot be imported

C:\Users\Administrator>python -m pip install sklearn,scipy
This succeeded.
But on import:
C:\Users\Administrator>python

>>> import sklearn
it says that NUMPY-MKL cannot be imported.
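One way to narrow this down is to check which of the packages Python can actually locate. The sketch below uses only the standard library, and `missing_packages` is a hypothetical helper name chosen for illustration. Note that the library is published on PyPI as `scikit-learn` (imported as `sklearn`), and that pip normally takes package names separated by spaces rather than commas.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# scikit-learn is imported as "sklearn" and depends on numpy and scipy.
print(missing_packages(["numpy", "scipy", "sklearn"]))
```

An empty list only means the packages are installed. A message about NumPy-MKL may indicate that the installed SciPy build expects a NumPy build linked against MKL, in which case reinstalling numpy and scipy from the same source is a reasonable thing to try.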

A small error on page 23

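Page 23, lines 3 and 4:

"abc"+"dfe"
abcdef
Swapping the order of "fe" in one of the two lines would fix it.
This book is fairly easy for beginners to get into. Reading it does not give me a headache, unlike some books that are so abstruse that the more you read, the more lost you get.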
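The mismatch reported for page 23 is easy to confirm in an interpreter. A short check, using only the two strings quoted above plus the presumably intended "def":

```python
# As printed in the book: the operands do not produce the shown result.
printed = "abc" + "dfe"

# With the second operand corrected, the result matches the printed output.
corrected = "abc" + "def"

print(printed)    # abcdfe
print(corrected)  # abcdef
```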
