cqfio / photographicimagesynthesis
Photographic Image Synthesis with Cascaded Refinement Networks
Home Page: https://cqf.io/ImageSynthesis/
Hello, I have run the code and trained the network, but I need some of GTA5's original images and labels.
Where can I get these?
After I run python download_models.py, I run python demo_256p.py and it fails with:

Traceback (most recent call last):
  File "demo_256p.py", line 115, in <module>
    label_images[ind]=helper.get_semantic_map("data/cityscapes/Label256Full/%08d.png"%ind)#training label
  File "/home/ah/disk/jiangyifan/PhotographicImageSynthesis/helper.py", line 15, in get_semantic_map
    semantic=scipy.misc.imread(path)
  File "/usr/local/lib/python2.7/dist-packages/scipy/misc/pilutil.py", line 156, in imread
    im = Image.open(name)
  File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 2258, in open
    fp = builtins.open(filename, "rb")
IOError: [Errno 2] No such file or directory: 'data/cityscapes/Label256Full/00002618.png'

The same error also happens when I run python demo_512p.py.
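The traceback indicates the training labels in data/cityscapes/Label256Full were never downloaded or generated. A minimal sketch to list which expected label files are missing before launching a demo; the directory name and 8-digit naming scheme are taken from the traceback above, and the default count of 2975 (the Cityscapes training-set size) is an assumption:

```python
import os

def find_missing_labels(label_dir="data/cityscapes/Label256Full", count=2975):
    """Return the indices of expected label files that are absent.

    The directory layout and %08d.png naming come from the traceback
    above; `count` (2975, the Cityscapes train-set size) is a guess.
    """
    missing = []
    for ind in range(count):
        if not os.path.isfile(os.path.join(label_dir, "%08d.png" % ind)):
            missing.append(ind)
    return missing

missing = find_missing_labels()
if missing:
    print("missing %d label files, e.g. %s" % (len(missing), missing[:3]))
```

Running this before training tells you immediately whether the data step was skipped, rather than failing mid-epoch.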
In GTA_Diversity_256p.py you use the get_semantic_map function to get the semantic map, but the Cityscapes color palette is used inside get_semantic_map. Does this mean the color palettes of GTA5 and Cityscapes are the same?
I haven't changed anything, and it still fails:
FileNotFoundError: [Errno 2] No such file or directory: 'data/cityscapes/Label256Full/00002161.png'
Perhaps the inputs are supposed to be in a different format, but changing it to this works much better for me:

def print_semantic_map(semantic, path):
    dataset = Dataset('cityscapes')
    prediction = np.argmax(semantic, axis=3)  # this used to be axis=2 and had a transpose before it
    prediction = np.squeeze(prediction)       # added this
    color_image = dataset.palette[prediction.ravel()].reshape((prediction.shape[0], prediction.shape[1], 3))
    row, col, dump = np.where(np.sum(semantic, axis=2) == 0)
    color_image[row, col, :] = 0
    scipy.misc.imsave(path, color_image)
Once this is done, we can simply print using:

a = get_semantic_map('../cityscapes_data/gtFine/train/aachen/aachen_000000_000019_gtFine_color.png')
print_semantic_map(a, 'test.png')
@CQFIO perhaps you have another intended use for this function?
Thanks for the great project!
My question:
In the training demo, what does "the weights lambda are collected at the 100th epoch" mean?
Should I manually replace the lambda in "tf.image.resize_area(label,(sp//16,sp//8)))*10/1.5" with the new value obtained during training?
Sincerely, thanks ~
Can you provide a direct Google Drive URL for the trained models? I have some connection issues. Thank you!
In the paper it says: "The hyperparameters {λ_l} are set automatically. They are initialized to the inverse of the number of elements in each layer. After 100 epochs, {λ_l} are rescaled to normalize the expected contribution of each term ‖Φ_l(I) − Φ_l(g(L; θ))‖₁ to the loss." I am trying to find that part in the demo_256p.py code, but I fail to spot it. From what I can see, the weights for p0 to p5 are fixed values.
Can you point me to the part where this is applied?
Also I am wondering if the difference between the weights in demo_256p.py:
#demo_256p.py
p0=compute_error(vgg_real['input'],vgg_fake['input'],label)
p1=compute_error(vgg_real['conv1_2'],vgg_fake['conv1_2'],label)
p2=compute_error(vgg_real['conv2_2'],vgg_fake['conv2_2'],tf.image.resize_area(label,(sp//2,sp)))
p3=compute_error(vgg_real['conv3_2'],vgg_fake['conv3_2'],tf.image.resize_area(label,(sp//4,sp//2)))
p4=compute_error(vgg_real['conv4_2'],vgg_fake['conv4_2'],tf.image.resize_area(label,(sp//8,sp//4)))
p5=compute_error(vgg_real['conv5_2'],vgg_fake['conv5_2'],tf.image.resize_area(label,(sp//16,sp//8)))*10
and the ones in demo_512p.py and demo_1024p.py is intentional?
#demo_512p.py & demo_1024p.py
p0=compute_error(vgg_real['input'],vgg_fake['input'],label)
p1=compute_error(vgg_real['conv1_2'],vgg_fake['conv1_2'],label)/2.6
p2=compute_error(vgg_real['conv2_2'],vgg_fake['conv2_2'],tf.image.resize_area(label,(sp//2,sp)))/4.8
p3=compute_error(vgg_real['conv3_2'],vgg_fake['conv3_2'],tf.image.resize_area(label,(sp//4,sp//2)))/3.7
p4=compute_error(vgg_real['conv4_2'],vgg_fake['conv4_2'],tf.image.resize_area(label,(sp//8,sp//4)))/5.6
p5=compute_error(vgg_real['conv5_2'],vgg_fake['conv5_2'],tf.image.resize_area(label,(sp//16,sp//8)))*10/1.5
Thanks for the code, it's a cool implementation.
I have two questions about it.
1. The first is about the compute_error function. For example, the shapes of fake and real are NHWC; after expand_dims(tf.reduce_mean(tf.abs(fake-real),reduction_indices=[3]),-1) the shape is NHW1. Multiplying by label then gives NHW20, and the final result of this function is N*20, right?
2. The second question is about the label: np.concatenate((label_images[ind],np.expand_dims(1-np.sum(label_images[ind],axis=3),axis=3)),axis=3). Why are the 19-channel label maps concatenated with np.expand_dims(1-np.sum(label_images[ind],axis=3),axis=3)? I guess the latter part accounts for some transformation error between the RGB label images and the 19-channel labels.
Thanks
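The shape flow in question 1 can be checked with a small NumPy sketch (shapes only, not the actual TF graph; the dimensions N=1, H=W=4, 64 VGG channels, and 20 label classes are arbitrary illustrative values):

```python
import numpy as np

N, H, W, C, K = 1, 4, 4, 64, 20  # batch, height, width, VGG channels, label classes
fake = np.random.rand(N, H, W, C).astype(np.float32)
real = np.random.rand(N, H, W, C).astype(np.float32)
label = np.random.rand(N, H, W, K).astype(np.float32)

# tf.expand_dims(tf.reduce_mean(tf.abs(fake-real), reduction_indices=[3]), -1) -> NHW1
diff = np.abs(fake - real).mean(axis=3, keepdims=True)
assert diff.shape == (N, H, W, 1)

# broadcasting NHW1 against the NHW(K) label gives one masked error map per class
per_class = label * diff
assert per_class.shape == (N, H, W, K)

# reducing over the spatial axes leaves one scalar per image and per class,
# i.e. the "N*20" shape the question suggests
loss_terms = per_class.sum(axis=(1, 2))
assert loss_terms.shape == (N, K)
```

So yes, under this reading the per-class masking via broadcasting is what turns the NHW1 diff into an NHW20 tensor before the spatial reduction.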
Could you tell me the purpose and theory behind this part of the get_semantic_map function:

for k in range(dataset.palette.shape[0]):
    tmp[:,:,k]=np.float32((semantic[:,:,0]==dataset.palette[k,0])&(semantic[:,:,1]==dataset.palette[k,1])&(semantic[:,:,2]==dataset.palette[k,2]))
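That loop builds a one-hot encoding of the RGB label image: channel k of tmp is 1.0 wherever the pixel's color exactly matches palette entry k. A minimal self-contained sketch with a hypothetical 3-color palette (the colors are illustrative, not the actual Cityscapes palette):

```python
import numpy as np

# hypothetical 3-entry palette (illustrative colors only)
palette = np.array([[128, 64, 128],   # e.g. road
                    [70, 70, 70],     # e.g. building
                    [107, 142, 35]])  # e.g. vegetation

# a tiny 1x2 "label image": first pixel matches entry 0, second matches entry 1
semantic = np.array([[[128, 64, 128], [70, 70, 70]]], dtype=np.uint8)

tmp = np.zeros((semantic.shape[0], semantic.shape[1], palette.shape[0]), np.float32)
for k in range(palette.shape[0]):
    # channel k is 1.0 where all three color components match palette row k
    tmp[:, :, k] = np.float32((semantic[:, :, 0] == palette[k, 0]) &
                              (semantic[:, :, 1] == palette[k, 1]) &
                              (semantic[:, :, 2] == palette[k, 2]))

print(tmp[0, 0])  # -> [1. 0. 0.]: the first pixel matches palette entry 0
```

Pixels whose color matches no palette entry end up all-zero across the channel axis, which is exactly what the `1 - np.sum(...)` trick in the training script detects.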
In the demo_256p code, the weights of the different losses are 1, 1/1.6, 1/2.3, 1/2.8, and 10/0.5. Where do these hyperparameters come from?
In the paper it says they are the "inverse of the number of elements in each layer". What do you mean by "number of elements", and how were the weights above calculated?
Looking forward to your reply, thank you.
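"Number of elements" most plausibly means the size of the feature tensor Φ_l (channels × height × width) at each matched VGG layer. A sketch of the initialization under that reading; the layer shapes assume a 256×512 input to VGG-19 and should be treated as illustrative, not as the values the released code uses:

```python
# feature-map shapes (C, H, W) at the layers matched in the loss,
# assuming a 256x512 input to VGG-19 (illustrative assumption)
layer_shapes = {
    "input":   (3,   256, 512),
    "conv1_2": (64,  256, 512),
    "conv2_2": (128, 128, 256),
    "conv3_2": (256, 64,  128),
    "conv4_2": (512, 32,  64),
    "conv5_2": (512, 16,  32),
}

# lambda_l initialized to the inverse of the element count of each layer
lambdas = {name: 1.0 / (c * h * w) for name, (c, h, w) in layer_shapes.items()}
for name, lam in lambdas.items():
    print("%-8s %.3e" % (name, lam))
```

Deeper layers have fewer elements, so they start with larger weights; the fixed divisors in the demo scripts would then be the result of the epoch-100 rescaling baked in as constants.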
Hello, I started training the 256p model; however, after using all 32 GB of RAM and about 16 GB of swap, the process is killed. Have you experienced a similar issue? The amount of used RAM increases with each iteration of the program, so I suspect a memory leak.
Thank you very much
Four GPUs with 8 GB each.
I run out of memory when I train the 1024p demo.
Could you please tell me how to deal with it?
Hello, I have been trying to use mturk_script to evaluate the model, but in the end I can't get results like the ones you showed in your paper. Can you tell me specifically how to use mturk_script to evaluate the generated model? Thank you very much.
I have some questions about the loss weights.
1. At the beginning of training, the loss weights are "initialized to the inverse of the number of elements in each layer". Does "the number of elements in each layer" refer to the number of elements of the feature map m_i extracted by layer i (or to the number of elements of the weights of layer i)?
So the initial value of the loss weight w_i is: w_i = 1/count(m_i)
2. Keep the loss weights constant and train for 100 epochs. In the last epoch we get the loss L_ik of layer i on image k.
So the final weight w_i is: w_i = (L_i1 + L_i2 + ... + L_in)/n
I am not sure whether I understand this correctly. Looking forward to your reply.
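One plausible reading of the rescaling step (a sketch of the intent, not the released code): after 100 epochs, each λ_l is divided by the running average of its own weighted loss term, so that every layer's expected contribution to the total loss becomes roughly equal afterwards.

```python
import numpy as np

def rescale_lambdas(lambdas, running_losses):
    """Divide each weight by the mean of its recorded (already
    lambda-weighted) loss term, so the expected contribution of every
    layer to the total loss normalizes to ~1. This sketches one
    plausible reading of the paper, not the released code."""
    return {name: lam / np.mean(running_losses[name])
            for name, lam in lambdas.items()}

# hypothetical weights and per-iteration weighted losses collected during epoch 100
lambdas = {"conv1_2": 1e-7, "conv5_2": 4e-6}
running_losses = {"conv1_2": [0.5, 0.7], "conv5_2": [2.0, 2.0]}
new = rescale_lambdas(lambdas, running_losses)
```

Under this reading, your step 2 is close but inverted: the averaged loss divides the weight rather than becoming the weight itself.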
You write in the paper that you rescale {lambda_l} after 100 epochs.
Thanks
Would you please release your evaluation implementation? Thank you and I am looking forward to that!
What is "123.6800, 116.7790, 103.9390", and could you tell me how you got these values? Thanks
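Those three numbers are the standard per-channel ImageNet pixel means (RGB order) used for VGG preprocessing: the mean is subtracted from the input image before it is fed to the VGG feature extractor. A minimal sketch (the function name is illustrative):

```python
import numpy as np

# per-channel ImageNet means commonly used for VGG preprocessing (RGB order)
VGG_MEAN = np.array([123.6800, 116.7790, 103.9390], dtype=np.float32)

def preprocess_for_vgg(image):
    """Center an HxWx3 uint8 RGB image by subtracting the channel means."""
    return image.astype(np.float32) - VGG_MEAN

img = np.full((2, 2, 3), 124, dtype=np.uint8)
out = preprocess_for_vgg(img)
```

They come from averaging each color channel over the ImageNet training set; VGG expects mean-centered inputs because it was trained that way.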
I cannot download the CRN 512p model using download_models.py. Could you please give the URL for downloading the parameters? We could then download them from a browser.
I want to train this network on my own data. In which format is the data required? I have the data as RGB images with their corresponding masks. Thanking you in anticipation.
Hi Dr. Chen. It looks like you are using the GTA5 dataset. It seems that the size of the images (512x256) is not at the same scale as that of the original GTA5 dataset (1914x1052). What preprocessing did you do? Did you scale and crop the images, or simply scale them?
Hi, I read this code and downloaded leftImg8bit_trainvaltest.zip from the Cityscapes website. Then I ran generate_vivid_imges_256full.m and got the corresponding vivid output. My question is where Label256Full comes from; I can't find code that generates it.
It seems that continuing a training session of demo_512 and demo_1024 does not work, since after restoring a previously trained model it gets immediately overwritten by a blank Saver. I think that last line should be moved before the ckpt check, as in demo_256:

ckpt=tf.train.get_checkpoint_state("result_512p")
if ckpt:
    print('loaded '+ckpt.model_checkpoint_path)
    saver=tf.train.Saver(var_list=[var for var in tf.trainable_variables() if var.name.startswith('g_')])
    saver.restore(sess,ckpt.model_checkpoint_path)
else:
    ckpt_prev=tf.train.get_checkpoint_state("result_256p")
    saver=tf.train.Saver(var_list=[var for var in tf.trainable_variables() if var.name.startswith('g_') and not var.name.startswith('g_512')])
    print('loaded '+ckpt_prev.model_checkpoint_path)
    saver.restore(sess,ckpt_prev.model_checkpoint_path)
saver=tf.train.Saver(max_to_keep=1000)