Comments (27)
Hi!
I just ran into this, and I found the problem.
If you leave the out_name flag at its default, each run creates a new output folder and then tries to load the checkpoint from that new, empty folder.
So the solution is to set the out_name flag to the folder of a previously trained model, either in the code or as a command-line parameter, like:
--out_name="20190818.161208 - data - test - x64.z100.uniform_signed.y64.b64"
Peace
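To find a value to pass as out_name, something like the following might help. It is only a sketch: it assumes run folders live under ./out and keep their saved files in a subfolder named checkpoint (as the paths in the logs in this thread suggest), and the function name find_trained_runs is made up for illustration.

```python
import os


def find_trained_runs(out_root="./out", ckpt_subdir="checkpoint"):
    """List run folders under out_root whose checkpoint subfolder is non-empty.

    Assumption: each run folder name starts with a timestamp such as
    20190818.161208, so sorting by name also sorts oldest to newest.
    """
    runs = []
    if not os.path.isdir(out_root):
        return runs
    for name in sorted(os.listdir(out_root)):
        ckpt_dir = os.path.join(out_root, name, ckpt_subdir)
        # Keep only runs that actually saved something.
        if os.path.isdir(ckpt_dir) and os.listdir(ckpt_dir):
            runs.append(name)
    return runs


if __name__ == "__main__":
    runs = find_trained_runs()
    if runs:
        # Newest run that has saved files; pass it back on the command line.
        print('--out_name="%s"' % runs[-1])
    else:
        print("no run folder with saved checkpoints found under ./out")
```

An empty result would mean no earlier run saved anything, in which case training has to run first.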
from dcgan-tensorflow.
Hi
I am also facing this load failure. Can anyone help me with this?
I trained the model successfully, but it fails at testing.
[*] Reading checkpoints... ./out/20190712.155740 - data - face - x96.z100.uniform_signed.y96.b64/checkpoint
[*] Failed to find a checkpoint
[!] Load failed...
[ 1 Epoch:[ 0/300] [ 0/ 1] time: 7.3325, d_loss: 12.05198956, g_loss: 0.00001405
[ 2 Epoch:[ 1/300] [ 0/ 1] time: 12.2742, d_loss: 10.16373634, g_loss: 0.00009138
[ 3 Epoch:[ 2/300] [ 0/ 1] time: 17.1198, d_loss: 12.54014683, g_loss: 0.00000996
[ 4 Epoch:[ 3/300] [ 0/ 1] time: 22.0227, d_loss: 12.31657314, g_loss: 0.00003112
[ 5 Epoch:[ 4/300] [ 0/ 1] time: 26.9523, d_loss: 12.80654812, g_loss: 0.00008534
[ 6 Epoch:[ 5/300] [ 0/ 1] time: 31.8358, d_loss: 13.03531933, g_loss: 0.00001151
[ 7 Epoch:[ 6/300] [ 0/ 1] time: 36.8203, d_loss: 13.17454529, g_loss: 0.00003603
[ 8 Epoch:[ 7/300] [ 0/ 1] time: 41.7034, d_loss: 13.25488567, g_loss: 0.00006080
[ 9 Epoch:[ 8/300] [ 0/ 1] time: 46.6821, d_loss: 12.10309505, g_loss: 0.00041392
[ 10 Epoch:[ 9/300] [ 0/ 1] time: 51.5538, d_loss: 9.41581345, g_loss: 0.00186340
[ 11 Epoch:[10/300] [ 0/ 1] time: 56.5072, d_loss: 10.10222530, g_loss: 0.00050865
[ 12 Epoch:[11/300] [ 0/ 1] time: 61.3804, d_loss: 4.94413757, g_loss: 0.14236367
[ 13 Epoch:[12/300] [ 0/ 1] time: 66.4153, d_loss: 11.87597275, g_loss: 0.00002049
[ 14 Epoch:[13/300] [ 0/ 1] time: 71.3269, d_loss: 1.26149607, g_loss: 3.08130121
[ 15 Epoch:[14/300] [ 0/ 1] time: 76.1823, d_loss: 12.14586639, g_loss: 0.00001769
[ 16 Epoch:[15/300] [ 0/ 1] time: 81.1462, d_loss: 1.42883611, g_loss: 1.98843646
[ 17 Epoch:[16/300] [ 0/ 1] time: 85.9693, d_loss: 11.67339706, g_loss: 0.00007065
[ 18 Epoch:[17/300] [ 0/ 1] time: 90.9558, d_loss: 0.86891878, g_loss: 5.06913090
[ 19 Epoch:[18/300] [ 0/ 1] time: 96.0543, d_loss: 10.42274475, g_loss: 0.00019286
[ 20 Epoch:[19/300] [ 0/ 1] time: 101.0552, d_loss: 1.43598580, g_loss: 8.86531544
[ 21 Epoch:[20/300] [ 0/ 1] time: 106.2137, d_loss: 3.21956253, g_loss: 0.13176931
[ 22 Epoch:[21/300] [ 0/ 1] time: 111.1346, d_loss: 4.63180637, g_loss: 0.03671605
[ 23 Epoch:[22/300] [ 0/ 1] time: 115.9590, d_loss: 0.81582117, g_loss: 11.40536022
[ 24 Epoch:[23/300] [ 0/ 1] time: 120.7342, d_loss: 1.90553343, g_loss: 0.64395195
[ 25 Epoch:[24/300] [ 0/ 1] time: 125.7073, d_loss: 8.43229103, g_loss: 0.00032412
[ 26 Epoch:[25/300] [ 0/ 1] time: 130.6088, d_loss: 0.95951736, g_loss:
Training is fine, but testing does not run. It throws this error:
raise Exception("Checkpoint not found in " + FLAGS.checkpoint_dir)
Exception: Checkpoint not found in ./out/20190712.103742 - data - face/checkpoint
I have trained on my own dataset.
from dcgan-tensorflow.
@carpedm20 Thanks for the code. It is raising crazy errors:
TypeError: DataType float32 for attr 'Tshape' not in list of allowed values: int32, int64 at
self.z_, [-1, s_h16, s_w16, self.gf_dim * 8]) /model.py in generator
Can you share your implementation?
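One possible cause, assuming s_h16 and s_w16 are computed with `/` under Python 3: the division yields floats, and a reshape target shape must contain integers. A minimal sketch of keeping the sizes integral is below; the helper names are illustrative, and whether this matches your copy of model.py is an assumption.

```python
import math


def conv_out_size_same(size, stride):
    """Integer output size of a 'SAME'-padded convolution with this stride."""
    return int(math.ceil(float(size) / float(stride)))


def generator_sizes(output_size, n_halvings=4):
    """Sizes from the full image down to the 1/16 feature map, all ints."""
    sizes = [output_size]
    for _ in range(n_halvings):
        sizes.append(conv_out_size_same(sizes[-1], 2))
    return sizes


print(64 / 16)              # 4.0 under Python 3: a float, not valid in a shape
print(generator_sizes(64))  # [64, 32, 16, 8, 4]
print(generator_sizes(96))  # [96, 48, 24, 12, 6]
```

The last entry of each list would play the role of s_h16 or s_w16 in the reshape call.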
from dcgan-tensorflow.
If the model has no checkpoint yet, that is normal output, not an error.
from dcgan-tensorflow.
@carpedm20 Thank you very much! Do you have an email?
from dcgan-tensorflow.
I got the same error, as below:
[*] Reading checkpoints...
[*] Failed to find a checkpoint
[!] Load failed...
[*] 0
[*] 1
[*] 2
[*] 3
[*] 4
[*] 5
[*] 6
[*] 7
[*] 8
[*] 9
[*] 10
[*] 11
[*] 12
[*] 13
[*] 14
[*] 15
[*] 16
[*] 17
[*] 18
[*] 19
[*] 20
[*] 21
[*] 22
[*] 23
[*] 24
[*] 25
[*] 26
[*] 27
[*] 28
[*] 29
[*] 30
[*] 31
[*] 32
[*] 33
[*] 34
[*] 35
[*] 36
[*] 37
[*] 38
[*] 39
[*] 40
[*] 41
[*] 42
[*] 43
[*] 44
[*] 45
[*] 46
[*] 47
[*] 48
[*] 49
[*] 50
[*] 51
[*] 52
[*] 53
[*] 54
[*] 55
[*] 56
[*] 57
[*] 58
[*] 59
[*] 60
[*] 61
[*] 62
[*] 63
[*] 64
[*] 65
[*] 66
[*] 67
[*] 68
[*] 69
[*] 70
[*] 71
[*] 72
[*] 73
[*] 74
[*] 75
[*] 76
[*] 77
[*] 78
[*] 79
[*] 80
[*] 81
[*] 82
[*] 83
[*] 84
[*] 85
[*] 86
[*] 87
[*] 88
[*] 89
[*] 90
[*] 91
[*] 92
[*] 93
[*] 94
[*] 95
[*] 96
[*] 97
[*] 98
[*] 99
from dcgan-tensorflow.
@shartoo Did you train the model first? Aren't you trying to test (generate samples from) a model without any training?
from dcgan-tensorflow.
@carpedm20 No, the output above is exactly what the training step prints when training on celebA.
from dcgan-tensorflow.
@shartoo I am getting the same error. Can you suggest how to proceed? Can you share the checkpoint?
from dcgan-tensorflow.
@carpedm20 This error occurs when your training dataset contains nothing; you should check it.
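A quick way to check this before training could look like the sketch below. It assumes images sit in ./data/<dataset> and are matched by a glob pattern like the one given to --input_fname_pattern; the default batch size of 64 and both function names are assumptions for illustration.

```python
import os
from glob import glob


def count_training_images(dataset, data_dir="./data", pattern="*.jpg"):
    """Count files under data_dir/dataset that match the glob pattern."""
    return len(glob(os.path.join(data_dir, dataset, pattern)))


def check_dataset(dataset, data_dir="./data", pattern="*.jpg", batch_size=64):
    """Return a short status string describing whether training can start."""
    n = count_training_images(dataset, data_dir, pattern)
    if n == 0:
        return "no files match %s" % os.path.join(data_dir, dataset, pattern)
    if n < batch_size:
        return "only %d images, fewer than one batch of %d" % (n, batch_size)
    return "ok: %d images" % n


if __name__ == "__main__":
    print(check_dataset("celebA"))
```

If this reports no matches, the folder name or the file pattern is the first thing to look at.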
from dcgan-tensorflow.
@carpedm20 I have given the path to the celebA dataset and it is able to read the data. It is failing to get a checkpoint. Can you suggest how to train and get our own checkpoint?
from dcgan-tensorflow.
@carpedm20 The code should be run in two stages: training and generation. The training stage creates the checkpoint directory, and the generation stage loads it. When running the code for the first time, you should set it up for training. In main.py:
flags.DEFINE_boolean("is_train", False, "True for training, False for testing [False]")
If the variable is_train is True, the training step runs; if it is False, the generation step runs.
If you want to run on your own dataset, change this line (in `main.py`):
flags.DEFINE_string("dataset", "celebA", "The name of dataset [celebA, mnist, lsun]")
Replace celebA with the name of your dataset folder under ./data/ in the current directory, and take care of the image size: the images must be 64*64. My code for cropping is below (some functions are copied from utils.py of the project):
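# -*- coding:utf-8 -*-
'''
crop the image to square
'''
# Note: scipy.misc.imread, imresize and imsave were deprecated in SciPy 1.0
# and removed in later releases, so this script needs an old SciPy.
import os
from glob import glob  # the script calls glob(...) directly

import numpy as np
import scipy.misc

def center_crop(x, crop_h, crop_w=None, resize_w=64):
    # Cut a crop_h x crop_w patch from the image centre, then resize it
    # to resize_w x resize_w.
    if crop_w is None:
        crop_w = crop_h
    h, w = x.shape[:2]
    j = int(round((h - crop_h)/2.))
    i = int(round((w - crop_w)/2.))
    return scipy.misc.imresize(x[j:j+crop_h, i:i+crop_w],
                               [resize_w, resize_w])

def imread(path, is_grayscale=False):
    # Read an image as a float array; flatten=True collapses it to grayscale.
    if is_grayscale:
        return scipy.misc.imread(path, flatten=True).astype(np.float64)
    else:
        return scipy.misc.imread(path).astype(np.float64)

def imsave(image, path):
    return scipy.misc.imsave(path, image)

if __name__ == '__main__':
    # Overwrite every JPEG under ./data/human_face with its cropped version.
    data = glob(os.path.join("./data", "human_face", "*.jpg"))
    for file in data:
        imsave(center_crop(imread(file, False), 64), file)
        print("crop image %s saved.." % file)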
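On a current SciPy the scipy.misc image helpers used above are no longer available, so here is a rough equivalent using Pillow instead. This is a swapped-in library, not the code from the thread: it keeps the same centre-crop-then-resize idea, assumes RGB images, and the crop_file helper is a made-up name.

```python
from PIL import Image


def center_crop(img, crop_h, crop_w=None, resize_w=64):
    """Cut a crop_h x crop_w patch from the centre and resize it to a square."""
    if crop_w is None:
        crop_w = crop_h
    w, h = img.size  # Pillow reports (width, height)
    top = int(round((h - crop_h) / 2.0))
    left = int(round((w - crop_w) / 2.0))
    patch = img.crop((left, top, left + crop_w, top + crop_h))
    return patch.resize((resize_w, resize_w))


def crop_file(src_path, dst_path, crop_size=64):
    """Read one image, centre-crop it, and write the result to dst_path."""
    with Image.open(src_path) as img:
        # Assumption: 3-channel training data; use "L" for grayscale instead.
        center_crop(img.convert("RGB"), crop_size).save(dst_path)
```

Writing to a separate dst_path rather than overwriting the source keeps the original images around in case the crop size needs changing.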
from dcgan-tensorflow.
I am working on a super-resolution algorithm based on DCGAN. I am getting the same error (Reading checkpoints, Load failed). I changed the code as you suggested.
from dcgan-tensorflow.
@Chaitu1509 I got the same error. Did you solve it? It is really strange, almost crazy.
from dcgan-tensorflow.
I mean that 'Tshape' one
from dcgan-tensorflow.
@Chaitu1509 I solved it, see
from dcgan-tensorflow.
requirement
from dcgan-tensorflow.
@shartoo I have already done this based on your code, but I still get the same error:
$ python main.py --dataset=cloth --input_fname_pattern=".png" --c_dim=1 --is_train=True
[*] Reading checkpoints...
[*] Failed to find a checkpoint
[!] Load failed...
[*] 0
[*] 1
[*] 2
[*] 3
[*] 4
[*] 5
[*] 6
[*] 7
[*] 8
[*] 9
[*] 10
[*] 11
[*] 12
[*] 13
[*] 14
[*] 15
[*] 16
[*] 17
[*] 18
[*] 19
[*] 20
[*] 21
[*] 22
[*] 23
[*] 24
[*] 25
[*] 26
[*] 27
[*] 28
[*] 29
[*] 30
[*] 31
[*] 32
[*] 33
[*] 34
[*] 35
[*] 36
[*] 37
[*] 38
[*] 39
[*] 40
[*] 41
[*] 42
[*] 43
[*] 44
[*] 45
[*] 46
[*] 47
[*] 48
[*] 49
[*] 50
[*] 51
[*] 52
[*] 53
[*] 54
[*] 55
[*] 56
[*] 57
[*] 58
[*] 59
[*] 60
[*] 61
[*] 62
[*] 63
[*] 64
[*] 65
[*] 66
[*] 67
[*] 68
[*] 69
[*] 70
[*] 71
[*] 72
[*] 73
[*] 74
[*] 75
[*] 76
[*] 77
[*] 78
[*] 79
[*] 80
[*] 81
[*] 82
[*] 83
[*] 84
[*] 85
[*] 86
[*] 87
[*] 88
[*] 89
[*] 90
[*] 91
[*] 92
[*] 93
[*] 94
[*] 95
[*] 96
[*] 97
[*] 98
[*] 99
from dcgan-tensorflow.
@tewea Check the command you ran. Should the parameter --is_train be --is_train=True?
from dcgan-tensorflow.
@shartoo It is still the same error even with --is_train=True. Is there something I can change in the crop code? I have my own dataset: I created a new cloth folder under ./data and modified flags.DEFINE_string("dataset", "cloth", "The name of dataset [cloth, mnist, lsun]") as well, but I am still getting the same error.
from dcgan-tensorflow.
@tewea This error has nothing to do with the crop code. It could be caused by a missing dataset; maybe you could put your images in ./data/cloth and keep the same layout as other datasets such as lsun.
from dcgan-tensorflow.
@shartoo You are right, it works fine and loads successfully for "celebA" after I repeat it twice. But I still have an issue when I use small pictures: it shows only empty grey PNG pictures in the sample folder. How can I use your crop code, and in which file? It is also still not working with my own dataset. I copied my images to ./data/cloth; they were downloaded manually from http://people.ee.ethz.ch/~lbossard/projects/accv12/index.html, and I cropped a few of them to 64*64 by hand for testing.
from dcgan-tensorflow.
It has been a long time since I last debugged this. This repository seems not to work well with new training datasets. I'll send my code to your email and post it here after some modifications. Please leave your email address.
from dcgan-tensorflow.
Hi
@shartoo My email id is [email protected]. Please send me your code. This repo is not working: the checkpoint is not saved after training completes on my dataset, so testing is not possible.
Thanks in advance, Shartoo.
from dcgan-tensorflow.
I am also having this issue
from dcgan-tensorflow.
Thank you so much!
With your help, I've solved my problem!
from dcgan-tensorflow.