ypwhs / dogs_vs_cats


Dogs vs. Cats (猫狗大战)

Jupyter Notebook 100.00%

kaggle image-classification keras-tutorials keras resnet inception xception jupyter-notebook

dogs_vs_cats's Issues

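Question for Yang: why is the loss low and the accuracy high on a new classification task, while accuracy in actual testing is poor?

I am using Baidu's pet dog recognition dataset with 100 classes. The model is Inception-v3; I also tried ResNet-50, and the results are the same: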

loss: 0.0699 - acc: 0.9895 - val_loss: 0.1637 - val_acc: 0.9898
loss: 0.0676 - acc: 0.9892 - val_loss: 0.1607 - val_acc: 0.9899
loss: 0.0671 - acc: 0.9890 - val_loss: 0.1630 - val_acc: 0.9899
loss: 0.0670 - acc: 0.9888 - val_loss: 0.1620 - val_acc: 0.9898
loss: 0.0674 - acc: 0.9886 - val_loss: 0.1620 - val_acc: 0.9896
loss: 0.0677 - acc: 0.9884 - val_loss: 0.1584 - val_acc: 0.9898
loss: 0.0684 - acc: 0.9883 - val_loss: 0.1607 - val_acc: 0.9897
loss: 0.0685 - acc: 0.9882 - val_loss: 0.1637 - val_acc: 0.9896

But in actual testing the error rate is above 0.36.
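One possible cause, offered only as an assumption because the compile call is not shown in this issue: if the model was compiled with loss='binary_crossentropy' on one-hot labels, the Keras 'accuracy' metric is computed per label entry rather than per sample. With 100 classes, even a model that never picks the right class then scores about 0.99, which is close to the values in the log above. The sketch below uses made-up data and plain Python to show the gap between the two metrics.

```python
def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of individual label entries predicted correctly."""
    hits = total = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            hits += int((p > threshold) == bool(t))
            total += 1
    return hits / total


def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted argmax equals the true class."""
    hits = 0
    for t_row, p_row in zip(y_true, y_pred):
        hits += int(t_row.index(max(t_row)) == p_row.index(max(p_row)))
    return hits / len(y_true)


num_classes = 100
# Hypothetical data: 10 samples, sample i belongs to class i + 1.
y_true = [[1.0 if c == i + 1 else 0.0 for c in range(num_classes)]
          for i in range(10)]
# A useless "model": the same small probability for every class.
y_pred = [[0.01] * num_classes for _ in range(10)]

bin_acc = binary_accuracy(y_true, y_pred)       # 99 of every 100 entries match
cat_acc = categorical_accuracy(y_true, y_pred)  # no sample is classified correctly
print(bin_acc, cat_acc)
```

If this is what happened, compiling with loss='categorical_crossentropy' should make the reported accuracy comparable to the measured error rate.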

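About the network initialization weights

Hello, could you share the weight files used to initialize ResNet50, Xception, InceptionV3, and VGG19? I have tried many times and the download fails every time. Many thanks. @ypwhs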

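Ordering of the test set files

Hello, why are the test files ordered 1, 10, 100, 1000, 10000, 10001 instead of 1, 2, 3? Why do the first five numbers form a geometric sequence, while after 10000 they form an arithmetic sequence? And why is 10000 the dividing point?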
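The ordering described here is what string (lexicographic) sorting of file names produces: names are compared character by character, so "10.jpg" sorts before "2.jpg". The following minimal sketch uses a handful of hypothetical file names to reproduce the pattern and shows a numeric sort key as an alternative.

```python
# Hypothetical subset of test file names.
names = ["1.jpg", "2.jpg", "3.jpg", "10.jpg", "100.jpg",
         "1000.jpg", "10000.jpg", "10001.jpg"]

# Default string sort: compares character by character.
lexicographic = sorted(names)

# Numeric sort: compare the integer before the extension.
numeric = sorted(names, key=lambda name: int(name.split(".")[0]))

print(lexicographic)
print(numeric)
```

Because the order is deterministic, predictions can be mapped back to image ids by parsing the number out of each file name instead of relying on position.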

nb_sample does not exist in Keras

In Keras 2, ImageDataGenerator().flow_from_generator.nb_sample does not exist. How should this be resolved? Thanks.

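The infinity problem with logloss

First, thank you for sharing your work. However, I do not fully understand the loss handling you designed, and the link that inspired it (about the infinity problem in logloss) no longer opens. Could you please explain how this problem is fixed? Thanks.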
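As background for this question: log loss charges -log(p) for the probability p assigned to the true class, so a confidently wrong prediction with p close to 0 costs an unbounded amount. A common remedy is to clip predictions away from 0 and 1 before scoring or submitting. The sketch below illustrates that idea; the bounds 0.005 and 0.995 are assumptions chosen for illustration, not values confirmed by this thread.

```python
import math


def clip(p, low=0.005, high=0.995):
    """Keep a probability inside [low, high] (bounds are illustrative)."""
    return min(max(p, low), high)


def log_loss(y_true, y_pred):
    """Mean binary cross-entropy over a list of predictions."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# One confidently wrong prediction: the true label is 1, the model says ~0.
y_true = [1]
raw_pred = [1e-15]
clipped_pred = [clip(p) for p in raw_pred]

raw_loss = log_loss(y_true, raw_pred)
clipped_loss = log_loss(y_true, clipped_pred)
print(raw_loss, clipped_loss)
```

Clipping slightly penalizes correct, confident predictions, but it caps the cost of each wrong one at -log(0.005).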

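Question: why doesn't the VGG network in gap.ipynb use preprocess_input?

Is it because VGG16's preprocess_input, when used together with the generator, would scramble the RGB channel order and produce wrong features?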

fit_generator steps

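Peiwen, it seems that the second argument of fit_generator, steps, is the number of batches, whereas the document passes the total number of samples (nb_batch * batch_size). In effect, the data seen per epoch is the source dataset expanded batch_size times through augmentation. I think writing steps as nb_batch * expansion factor would be easier for readers to understand.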
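The relationship described in this issue can be written down directly. This is a minimal sketch, assuming the Keras 2 convention that one step is one batch; the helper name and the passes parameter (how many augmented passes over the data make up one epoch) are hypothetical.

```python
import math


def steps_per_epoch(num_samples, batch_size, passes=1):
    """Number of batches needed to cover the dataset `passes` times.

    A pass ends with one short batch when num_samples is not a
    multiple of batch_size, hence the ceiling.
    """
    return passes * math.ceil(num_samples / batch_size)


one_pass = steps_per_epoch(25000, 16)
two_passes = steps_per_epoch(25000, 16, passes=2)
print(one_pass, two_passes)
```

Passing the sample count itself as steps would instead run batch_size times more batches than a single pass requires.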

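Keras released a new version, the code no longer runs, and version 1.2.2 can no longer be downloaded anywhere

Hello, I am a student who has just started with deep learning, and there is a great deal I do not understand. I have only just started learning Python as well. My current environment is Windows 10, Keras 2.0.4, and Python 3.5, with PyCharm as the IDE. I copied the code but none of it runs, and there are parts of it I cannot follow, such as MODEL.fun_name. In def write_gap(MODEL..,.,.), MODEL is just a parameter you defined, so why does it have nb_sample and fuc_name?
I need to publish a paper (admittedly a weak one, but I spent the previous year on web projects for my advisor and only started learning machine learning from scratch two months ago), so I want to build a multi-class model. So far I have only followed the official cats-vs-dogs example and fine-tuned VGG16 into a 4-class classifier. I want to try other experiments, but after reading a lot of material I still feel I understand nothing. I may need to start from Python basics; otherwise modifying the model is really too hard.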

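The feature files in the release cannot be downloaded

As the title says: could you provide a Baidu Cloud or Google Drive link?

Thanks

Is this the project from the second term of Udacity's deep learning course? Is there a TensorFlow version? Thanks

Why is preprocess_input not applied to ResNet50?

Why does ResNet50 not use preprocess_input, while InceptionV3 and Xception need it?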
write_gap(ResNet50, (224, 224))
write_gap(InceptionV3, (299, 299), inception_v3.preprocess_input)
write_gap(Xception, (299, 299), xception.preprocess_input)
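The thread does not record why the author left preprocess_input out for ResNet50, so the sketch below only shows what the two preprocessing families compute, using a single RGB pixel in plain Python. In Keras, InceptionV3 and Xception scale inputs to the range [-1, 1], while ResNet50 and the VGG models use the Caffe convention: reorder RGB to BGR and subtract the per-channel ImageNet mean. The function names are hypothetical.

```python
# Per-channel ImageNet means in BGR order, as used by the Caffe-style models.
IMAGENET_BGR_MEAN = (103.939, 116.779, 123.68)


def preprocess_tf_style(rgb_pixel):
    """InceptionV3 / Xception style: map [0, 255] to [-1, 1]."""
    return tuple(v / 127.5 - 1.0 for v in rgb_pixel)


def preprocess_caffe_style(rgb_pixel):
    """ResNet50 / VGG style: RGB -> BGR, then subtract the channel mean."""
    r, g, b = rgb_pixel
    bgr = (b, g, r)
    return tuple(v - m for v, m in zip(bgr, IMAGENET_BGR_MEAN))


pixel = (255.0, 127.5, 0.0)  # hypothetical RGB pixel
tf_out = preprocess_tf_style(pixel)
caffe_out = preprocess_caffe_style(pixel)
print(tf_out)
print(caffe_out)
```

The channel swap in the Caffe-style path is intentional, since those weights were trained on BGR input. Skipping it means the network sees data in a different range and channel order than it was trained on, which tends to degrade the features without raising any error.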

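A question about the operation after predicting on the images

Why is this operation performed after prediction in this section? Please advise.

Why, when I use this on my own dataset, do the extracted features contain so many zeros, with almost no meaningful feature values?

Is the test used at the end of gap_train the one obtained by creating the symbolic link myself?

What is test here?
Why do my predictions end up looking like this?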

Error of "model.predict_generator"

Hi, thanks for the great code. It is great for a Keras beginner like me. But I found some errors when I ran the code. In gap.py, the terminal reports "AttributeError: 'DirectoryIterator' object has no attribute 'nb_sample'". I wonder whether this is due to a different version of Keras, and whether there is any solution. Thanks in advance!
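This error is consistent with the Keras 2 API, where the iterator returned by flow_from_directory exposes the image count as samples instead of nb_sample. A small compatibility helper, sketched below, reads whichever attribute is present. The helper name and the two stand-in classes are hypothetical and exist only to make the example runnable without Keras.

```python
def sample_count(iterator):
    """Return the number of images an iterator covers.

    Tries the Keras 2 attribute name first, then the Keras 1 name.
    """
    for name in ("samples", "nb_sample"):
        if hasattr(iterator, name):
            return getattr(iterator, name)
    raise AttributeError("iterator has neither 'samples' nor 'nb_sample'")


class NewStyleIterator:
    """Stand-in for a Keras 2 DirectoryIterator."""
    samples = 25000


class OldStyleIterator:
    """Stand-in for a Keras 1 DirectoryIterator."""
    nb_sample = 12500


new_count = sample_count(NewStyleIterator())
old_count = sample_count(OldStyleIterator())
print(new_count, old_count)
```

Note that Keras 2 also changed the second argument of predict_generator from a sample count to a number of batches, so replacing the attribute name alone is not enough.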

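About data augmentation

I combined several models and added a fully connected layer at the output for training, but the loss stops decreasing at around 1.x. The code is as follows: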

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2'
from keras.models import Model
from keras.callbacks import ReduceLROnPlateau
from keras.utils import plot_model
from keras.layers import *
from keras.applications import *
from keras import backend as K
from keras.preprocessing.image import *
import matplotlib.pyplot as plt
import h5py

base_models = locals()
use_models = {#ResNet50:{'preprocessing':None, 'size':(224, 224)},
              InceptionV3:{'preprocessing':inception_v3.preprocess_input, 'size':(299, 299), 'device':'/gpu:0'},
              Xception:{'preprocessing':xception.preprocess_input, 'size':(299, 299), 'device':'/gpu:1'}, 
              InceptionResNetV2:{'preprocessing':inception_resnet_v2.preprocess_input, 'size':(299, 299), 'device':'/gpu:2'}
              }

model_bottlenecks = []
input_tensor = Input((299, 299, 3))
for model, method in use_models.items():
    width, height = method['size']
    preprocessing = method['preprocessing']
    device = method['device']
    # input_tensor = Input((height, width, 3))
    if preprocessing:
        x = Lambda(preprocessing)(input_tensor)
    with K.tf.device(device):
      base_models[model.__name__] = model(input_tensor=x, weights='imagenet', include_top=False)
      for layer in base_models[model.__name__] .layers:
          layer.trainable = False
      model_bottleneck = GlobalAveragePooling2D()(base_models[model.__name__].output)
      model_bottlenecks.append(model_bottleneck)

with K.tf.device('/gpu:0'):
  stack_bottleneck = concatenate(model_bottlenecks)
  fc = Dropout(0.5)(stack_bottleneck)
  fc = Dense(120, activation='softmax')(fc)

stack_model = Model(input_tensor, fc)
# plot_model(stack_model, to_file='model.png')


datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.1,
                             horizontal_flip=True,
                             vertical_flip=True
                             )
train_generator = datagen.flow_from_directory("raw_images", (299, 299), batch_size=64)
test_generator = datagen.flow_from_directory("test", (299, 299), batch_size=1)

stack_model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
learning_rate_reduction = ReduceLROnPlateau(monitor='loss', 
                                            patience=3, 
                                            verbose=1, 
                                            factor=0.5, 
                                            min_lr=1e-8)

stack_model.fit_generator(train_generator,
                    steps_per_epoch=train_generator.n/64, 
                    epochs=100, 
                    verbose=1,
                    callbacks=[learning_rate_reduction])
stack_model.save_weights('stack_model.h5')

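One more programming question: the preprocess_input of VGG and ResNet differs from Inception's, and putting it directly into a Lambda layer raises an error. How should this be handled?

Running gap.ipynb raises AttributeError: 'DirectoryIterator' object has no attribute 'nb_sample'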

AttributeError: 'DirectoryIterator' object has no attribute 'nb_sample'

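Why does the gap h5 we generated ourselves give 99% model accuracy, yet many of the test results are wrong?

Hello. Since real-world data does not come with that many images, I tried shrinking the test dataset (to 1000 images) and re-ran gap.ipynb to generate the h5 files.

Unfortunately, many of the predictions are wrong. Why is that?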

# coding: utf-8

# In[1]:

import os
import shutil

# In[2]:
SMALL_DATASET_COUNT = 1000

train_filenames = os.listdir('train')
train_cat = filter(lambda x: x[:3] == 'cat', train_filenames)
train_dog = filter(lambda x: x[:3] == 'dog', train_filenames)

test_file_dir = os.listdir('test')
test_filenames = filter(lambda x: int(x[:-4]) <= SMALL_DATASET_COUNT, test_file_dir)


# In[3]:

def rmrf_mkdir(dirname):
    if os.path.exists(dirname):
        shutil.rmtree(dirname)
    os.mkdir(dirname)


# rmrf_mkdir('train2')
# os.mkdir('train2/cat')
# os.mkdir('train2/dog')
#
# rmrf_mkdir('test2')
# os.symlink('../test/', 'test2/test')
#
#
# for filename in train_cat:
#     os.symlink('../../train/'+filename, 'train2/cat/'+filename)
#
# for filename in train_dog:
#     os.symlink('../../train/'+filename, 'train2/dog/'+filename)


# In[ ]:


rmrf_mkdir('train-small-dataset')
os.mkdir('train-small-dataset/cat')
os.mkdir('train-small-dataset/dog')

file_count = 0
for filename in train_cat:
    os.symlink('../../train/' + filename, 'train-small-dataset/cat/' + filename)
    file_count += 1
    if file_count >= SMALL_DATASET_COUNT:
        break

file_count = 0
for filename in train_dog:
    os.symlink('../../train/' + filename, 'train-small-dataset/dog/' + filename)
    file_count += 1
    if file_count >= SMALL_DATASET_COUNT:
        break

rmrf_mkdir('test-small-dataset')
rmrf_mkdir('test-small-dataset/test')

file_count = 0
for filename in test_filenames:
    os.symlink('../../test/' + filename, 'test-small-dataset/test/' + filename)
    file_count += 1
    if file_count >= SMALL_DATASET_COUNT:
        break

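A question about using ImageDataGenerator

Isn't the purpose of ImageDataGenerator to increase the amount of data? Then why does the training set in the saved h5 file still contain only 25000 samples?

How to change this to multi-class classification

The data was generated with gap.py as before: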
Found 102945 images belonging to 30 classes.
Found 11456 images belonging to 30 classes.

I want to change it to 30 classes: x = Dense(30, activation='softmax')(x)
But model.fit(X_train, y_train, batch_size=8, nb_epoch=8, validation_split=0.2) fails.
Error: ValueError: Error when checking model target: expected dense_1 to have shape (None, 30) but got array with shape (102945, 1)
How do I change the shape (102945, 1) to (102945, 30)?
I hope you can help!
Thanks
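The shape mismatch in this error arises because a 30-way softmax trained with categorical_crossentropy expects one-hot targets, while the labels here are integer class indices. The sketch below shows the conversion in plain Python on made-up labels; the function name is hypothetical. In Keras, keras.utils.to_categorical performs the same conversion on arrays, and compiling with loss='sparse_categorical_crossentropy' accepts the integer labels directly.

```python
def one_hot(labels, num_classes):
    """Turn integer class indices into one-hot rows of length num_classes."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[int(label)] = 1.0
        rows.append(row)
    return rows


# Hypothetical integer labels for four samples out of 30 classes.
labels = [0, 3, 29, 3]
targets = one_hot(labels, num_classes=30)

print(len(targets), len(targets[0]))
```

For the labels to be meaningful they must follow the same order as the extracted features, which with flow_from_directory and shuffle=False corresponds to the generator's classes attribute.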

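Dimensions of the three NumPy arrays in the exported h5 file

Hello, and thank you for your code; reading it was very instructive. However, I ran into a problem when running it myself. The h5 file exported at the end of your gap.ipynb contains three NumPy arrays: train (25000, 2048), test (12500, 2048), and label (25000,). What I got is train (399880, 2048), test (199820, 2048), and label (25000,). Where did I go wrong? Thanks!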
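The reported shapes are consistent with Keras 2 treating the second argument of predict_generator as a number of batches rather than a number of samples, so that passing the sample count makes the generator loop over the data many times. The sketch below reproduces the arithmetic in plain Python. The batch size of 16 is an assumption (it is not stated in this issue), and the function name is hypothetical.

```python
import math


def samples_yielded(num_images, batch_size, steps):
    """Count samples produced by `steps` batches from a looping generator.

    Each pass over the data ends with one short batch when num_images is
    not a multiple of batch_size, after which the generator starts over.
    """
    batches_per_pass = math.ceil(num_images / batch_size)
    full_passes, extra_steps = divmod(steps, batches_per_pass)
    # The leftover steps never reach the short final batch of a pass,
    # so each of them contributes a full batch.
    return full_passes * num_images + extra_steps * batch_size


# Passing the sample count as the number of steps:
train_rows = samples_yielded(25000, 16, steps=25000)
test_rows = samples_yielded(12500, 16, steps=12500)

# Passing the number of batches instead:
train_rows_fixed = samples_yielded(25000, 16, steps=math.ceil(25000 / 16))

print(train_rows, test_rows, train_rows_fixed)
```

Under these assumptions the first two values equal the 399880 and 199820 rows reported above, and using ceil(samples / batch_size) as the step count restores the expected 25000 rows.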