
easy12306's Introduction

easy12306

Two model files are required:

  1. Text recognition: model.h5
  2. Image recognition: 12306.image.model.h5

Download locations for the recognizer data:

  1. Baidu Netdisk
  2. https://drive.google.com/drive/folders/1GDCQyaHr36c7y1H-19pOKjc_EdAI1wn0

python3 main.py <img.jpg>

I wrote up the design approach in the wiki: https://github.com/zhaipro/easy12306/wiki

How well does it work?

(image: 2)

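~$ python3 main.py 2.jpg 2> /dev/null
电子秤
风铃        # the two items to find: electronic scale (电子秤) and wind chime (风铃)
0 0 电子秤  # row 0, column 0 is an electronic scale
0 1 绿豆    # mung beans
0 2 蒸笼    # bamboo steamer
0 3 蒸笼
1 0 风铃
1 1 电子秤
1 2 网球拍  # tennis racket
1 3 网球拍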
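
The output above lists the target labels first and then one `row column label` line per grid cell. A minimal sketch of how a caller might turn that text into the cells to click, assuming exactly this layout; `parse_output` and `cells_to_click` are hypothetical helper names, not part of the project:

```python
# Sketch only: assumes the plain-text layout shown above
# (target labels first, then "row col label" lines).
SAMPLE = """电子秤
风铃
0 0 电子秤
0 1 绿豆
0 2 蒸笼
0 3 蒸笼
1 0 风铃
1 1 电子秤
1 2 网球拍
1 3 网球拍
"""


def parse_output(text):
    """Return (targets, grid) where grid maps (row, col) to a label."""
    targets = []
    grid = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        # A grid line has two integer coordinates followed by a label.
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            grid[(int(parts[0]), int(parts[1]))] = parts[2]
        else:
            targets.append(parts[0])
    return targets, grid


def cells_to_click(targets, grid):
    """Return the sorted (row, col) positions whose label is a target."""
    return sorted(pos for pos, label in grid.items() if label in targets)


if __name__ == "__main__":
    targets, grid = parse_output(SAMPLE)
    print(cells_to_click(targets, grid))
```

For the sample shown, the matching cells are the two electronic scales and the wind chime.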

Recognizing never-before-seen images

(image: 8)

Class index numbering: texts.txt

~$ python3 mlearn_for_image.py 8.jpg
[0.8991613]  # confidence
[0]          # class 0 is 打字机 (typewriter)

What?

Just want to recognize 12306 captchas? Go back to version 3.0.0, which uses similar-image search.
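
The two printed values are a confidence and a class index. A minimal sketch of how such a pair can be derived from a vector of class probabilities, assuming the model ends in a softmax-style output; the `labels` list here is a hypothetical stand-in for the contents of texts.txt, and only the fact that index 0 is 打字机 comes from the output above:

```python
# Sketch only: the probability vector and the label order (apart from
# index 0) are illustrative assumptions, not the project's data.
def top_prediction(probabilities, labels):
    """Return (index, confidence, label) for the most probable class."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return index, probabilities[index], labels[index]


if __name__ == "__main__":
    labels = ["打字机", "电子秤", "风铃"]       # hypothetical order after index 0
    probabilities = [0.8991613, 0.06, 0.04]  # hypothetical model output
    print(top_prediction(probabilities, labels))
```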

easy12306's People

Contributors

busyyang, zhaipro


easy12306's Issues

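Quick notes on deep learning

The dataset here was used for testing, with these results:
Accuracy of the "statistics expert": 0.9422140966882884
Accuracy of the deep learning model that learned from the statistics expert: 0.9811081335640064

The statistical method recognizes paper-cuts (剪纸) correctly only 64% of the time; my guess is that there are simply too many kinds of paper-cut.
The class the deep learning model recognizes worst is the wall clock (挂钟):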

1577/1577 [==============================] - 42s 26ms/step
[0.24407617484627453, 0.9302473050095117]

My guess is that wall clocks (挂钟) and clocks (钟表) are genuinely hard to tell apart.
Recognition results for clocks:

1608/1608 [==============================] - 44s 27ms/step
[0.22922349847223036, 0.9359452736318408]

The deep learning model is most confident about treadmills (跑步机):

1564/1564 [==============================] - 43s 27ms/step
[0.0026093199646667294, 1.0]

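Does this prove that the trained network can recognize images it has never seen before?
Can we say that just 10,000 images are enough to learn from?
Could the machine learn a useful skill from even less teaching material?
In practice it recognizes captchas reasonably well, but its accuracy on real-world photos is not as high.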
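
The bracketed pairs printed under each progress bar read like Keras-style `[loss, accuracy]` results for one class at a time; that reading is an assumption. A minimal sketch of computing per-class accuracy from true and predicted labels, using made-up toy labels rather than the project's data (`per_class_accuracy` is a hypothetical helper name):

```python
from collections import defaultdict


def per_class_accuracy(true_labels, predicted_labels):
    """Return {label: fraction of that label's samples predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


if __name__ == "__main__":
    # Toy data only: one wall clock is mistaken for a clock.
    y_true = ["挂钟", "挂钟", "钟表", "跑步机"]
    y_pred = ["挂钟", "钟表", "钟表", "跑步机"]
    for label, accuracy in per_class_accuracy(y_true, y_pred).items():
        print(label, accuracy)
```

Sorting the resulting dictionary by value is one way to find the weakest class, as the notes do for the wall clock.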

the model was *not* compiled. Compile it manually

E:\ProgramData\Anaconda3\lib\site-packages\keras\engine\saving.py:292: UserWarning: No training configuration found in save file: the model was not compiled. Compile it manually.
warnings.warn('No training configuration found in save file: '

I get this error when running mlearn.py. How can I fix it? Thanks.
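
The message quoted above is a `UserWarning` rather than an error: Keras emits it when an .h5 file was saved without its training configuration, and a model loaded this way can still run predictions. If a script goes on to train or evaluate the model, it needs a `model.compile(...)` call first. The sketch below only shows how the warning could be silenced with the standard `warnings` module; `load_quietly` and `fake_loader` are hypothetical names, and the stand-in loader replaces the real Keras call so the example runs without Keras installed:

```python
import warnings


def load_quietly(loader, path):
    """Call loader(path) while ignoring the 'not compiled' UserWarning."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore",
            message="No training configuration found",
            category=UserWarning,
        )
        return loader(path)


def fake_loader(path):
    """Hypothetical stand-in for a model loader that emits the same warning."""
    warnings.warn(
        "No training configuration found in save file: "
        "the model was *not* compiled. Compile it manually."
    )
    return "model:" + path


if __name__ == "__main__":
    print(load_quietly(fake_loader, "model.h5"))
```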

Can't deploy easy12306 successfully on my PC

I copied your program and downloaded the model files (12306.image.model.h5, model.v3.0.h5) as described in your README. Unfortunately, an inscrutable problem occurred after I put the model files in the same folder as the rest of the program. Here is the error message from the Python console:

Traceback (most recent call last):
  File "C:\Users\XM8\Desktop\easy12306-master\easy12306-master\main.py", line 60, in <module>
    main(sys.argv[1])
IndexError: list index out of range
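
The traceback shows `sys.argv[1]` being read when no image path was given on the command line, which is what raises `IndexError`. A minimal sketch of a guard that prints the README's usage line instead; `image_path_from_args` is a hypothetical helper, not the project's code:

```python
import sys


def image_path_from_args(argv):
    """Return the image path argument, or exit with a usage message."""
    if len(argv) < 2:
        raise SystemExit("usage: python3 main.py <img.jpg>")
    return argv[1]


if __name__ == "__main__":
    print(image_path_from_args(sys.argv))
```

Running the script with an image path, as in `python3 main.py 2.jpg`, avoids the error in the first place.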

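Training process

Hello, I have studied some AI fundamentals and now want to work through the whole pipeline, from data collection and preprocessing to model generation. Your project looks like a good fit. However, after reading the code I could not find the model generation step, and I am curious about that part. Could you share your approach, or the data? Thanks.

How to run the code end to end

Hello, I would like to learn how the author implemented this and run the code end to end. So far I understand that:
1. I first run pretreatement.py to get the data.npz dataset;
2. baidu.py recognizes the labels through the Baidu API;
3. What should the third step be? mlearn.py and mlearn_for_image.py need .npz or .npy files, and I cannot tell how to generate them. They are available on Google Drive, but I would like to know how they were produced.

Also, how many images can be downloaded? After I had downloaded about 1800, large numbers of duplicate files started to appear.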
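
One way to measure how many downloads are repeats is to hash each file's bytes. A minimal sketch, assuming the downloads sit in a single folder; `find_duplicates` is a hypothetical helper, and an exact SHA-256 match only catches byte-identical files, not visually similar images:

```python
import hashlib
from pathlib import Path


def find_duplicates(folder):
    """Return (duplicate_name, first_seen_name) pairs for byte-identical files."""
    seen = {}
    duplicates = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append((path.name, seen[digest]))
        else:
            seen[digest] = path.name
    return duplicates


if __name__ == "__main__":
    import sys

    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    for duplicate, original in find_duplicates(folder):
        print(duplicate, "duplicates", original)
```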

My first time playing with convolutional neural networks; keeping this as a souvenir

zhaipro@localhost ~/easy12306> python3 mlearn.py
Using TensorFlow backend.
Train on 10047 samples, validate on 1117 samples
Epoch 1/30
[=] - 14s 1ms/step - loss: 1.9007 - acc: 0.5465 - val_loss: 0.5589 - val_acc: 0.8478
Epoch 2/30
[=] - 14s 1ms/step - loss: 0.2237 - acc: 0.9438 - val_loss: 0.1225 - val_acc: 0.9678
Epoch 18/30
[=] - 14s 1ms/step - loss: 8.1089e-06 - acc: 1.0000 - val_loss: 0.0216 - val_acc: 0.9937
Epoch 30/30
[=] - 14s 1ms/step - loss: 8.0525e-06 - acc: 1.0000 - val_loss: 0.0211 - val_acc: 0.9937

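How can the statistics expert be made more effective?

(image: a)

Image file naming rule: <class>.<occurrence count>.(<frequency within the current class>).<index>.jpg

The index only prevents file name collisions.

My rough estimate of the parameters for judging accuracy:

  1. The occurrence count must be greater than 15; with few occurrences, the statistic is not trustworthy enough.
  2. The frequency must exceed 0.182, because some images occur often enough but at too low a frequency; I suspect the hash algorithm made a mistake in those cases.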
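
The naming rule and the two thresholds above can be sketched as a small filter. This assumes the parentheses around the frequency are literal characters in the file name and that the index is an integer; `parse_name` and `is_reliable` are hypothetical helpers, and the example file names are invented for illustration:

```python
import re

# <class>.<count>.(<frequency>).<index>.jpg, parentheses taken literally.
NAME_RE = re.compile(
    r"^(?P<label>.+?)\.(?P<count>\d+)\.\((?P<freq>\d+(?:\.\d+)?)\)\.(?P<index>\d+)\.jpg$"
)


def parse_name(filename):
    """Return (label, count, frequency, index), or None if the name does not match."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    return (
        match["label"],
        int(match["count"]),
        float(match["freq"]),
        int(match["index"]),
    )


def is_reliable(filename, min_count=15, min_freq=0.182):
    """Apply the two thresholds: count > 15 and frequency > 0.182."""
    parsed = parse_name(filename)
    if parsed is None:
        return False
    _, count, freq, _ = parsed
    return count > min_count and freq > min_freq


if __name__ == "__main__":
    for name in ["电子秤.20.(0.25).3.jpg", "风铃.10.(0.9).0.jpg", "挂钟.40.(0.1).7.jpg"]:
        print(name, is_reliable(name))
```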

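Suggestion: add a usage tutorial

Please consider adding a usage tutorial. Newcomers run into many baffling environment problems when running the code.

Evaluation metric for text recognition on texts.npy

I have read your wiki. If I understand correctly, texts.npy contains the captchas that Baidu recognized incorrectly, which means this part of the data has no annotated ground-truth answers.
You mention in the wiki that the model achieved satisfactory results on this data.

I am reproducing your code and have a question. How did you evaluate the model on this unlabeled data? Did you inspect some predictions by hand and estimate the accuracy from them, or is there a quantifiable metric similar to the accuracy reported during training?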
