
iccv2019-learningtopaint's Introduction

Hi there 👋

  • I used to be a competitive programming contestant (NOI 🥈, ICPC regional 🏅️).

  • I worked at MEGVII Research from 2017 to 2023 and currently work at StepFun. I received my B.S. degree from Peking University in 2020.

Main Projects:

Cooperation Projects:

Google Scholar, Zhihu, algorithm blog, Email, CV

Service: CVPR22-24/ECCV22-24/ICCV23/AAAI23/NeurIPS23-24/ICLR24/ICML24/WACV24/SIGGRAPH/TIP/TPAMI/TOMM


iccv2019-learningtopaint's Issues

Stroke opacity

Hi,
I noticed that each stroke is transparent, so that layers over layers of color will add up over time to form the target picture.
Is there a possibility to adjust the opacity to simulate painting with an opaque palette? I guess training a new model would be necessary for that.

Thanks in advance.
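For reference, the compositing used in the repo (quoted in the "Decoding of strokes" issue below) blends each stroke onto the canvas with a per-pixel alpha, which is why layers add up. A hypothetical sketch of injecting an extra opacity factor is below; as noted above, the agent would most likely need to be retrained against such a renderer:

def composite(canvas, stroke, color_stroke, opacity=1.0):
    # canvas, color_stroke: (b, 3, 128, 128); stroke: (b, 1, 128, 128) alpha in [0, 1]
    # (shapes are an assumption). opacity=1.0 reproduces the original blending;
    # thresholding the alpha, e.g. (stroke > 0.5).float(), would approximate
    # fully opaque paint where each new stroke completely covers the canvas.
    alpha = stroke * opacity
    return canvas * (1 - alpha) + color_stroke * opacity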

Training parameters

Hi !

I am trying to train the paint agent on my GPU. In the paper I read that training took about 2 days in your case.

Can you tell me what parameters you used to train the paint agent? In my case training takes more than 1 week (I am also training on a GPU, so the time difference is surprisingly large).

Thanks so much!

Renderer input features

Hi @hzwer ,
Could you clarify the input features of the neural renderer: is it a 10-value vector or a 13-value vector (+RGB)?
If it is trained with a 10-value vector, how can the painter generate color pictures?

Bests,
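For reference, the comment on decode(x, canvas) quoted in the "Decoding of strokes" issue below ("b * (10 + 3)") suggests the full action is 13 values: 10 shape parameters that feed the neural renderer (which outputs a grayscale alpha map) plus 3 RGB values used to color that map afterwards. A small sketch of that split (an interpretation, not confirmed by the authors):

def split_action(x):
    # x: (b, 13) action tensor -- assumed layout
    shape_params = x[:, :10]  # fed to the neural renderer, which only produces an alpha map
    rgb = x[:, 10:]           # used to tint the rendered stroke into a color stroke
    return shape_params, rgb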

stroke

I want to get the final stroke parameters. What should I do? Thank you!

How to make L2 rewards work?

I have tried to use the L2 reward in ddpg.py line 102 and cancel the WGAN optimization, but after the same number of iterations the painter is not as good as with the WGAN reward.
How do you make the L2 reward work?
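For anyone experimenting along these lines, here is a minimal sketch of one common L2-based step reward (the decrease in L2 distance to the target after the new strokes are applied). This is only an illustration of the idea, not the configuration the authors used:

def l2_step_reward(canvas_before, canvas_after, target):
    # Tensors assumed to be (b, 3, H, W) in [0, 1]; the reward is positive when
    # the newly painted strokes move the canvas closer to the target image.
    d_before = ((canvas_before - target) ** 2).mean(dim=(1, 2, 3))
    d_after = ((canvas_after - target) ** 2).mean(dim=(1, 2, 3))
    return d_before - d_after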

Divide parameter and k=5

Hello :)

I have some doubts...

I have seen that the algorithm defines a "divide" parameter which splits the canvas into mini-canvases in order to improve the agent's accuracy. But I would like to understand when this happens during training (what the steps are): when the actor is about to make a stroke, is the canvas divided and then reconstructed afterwards?

I have also seen that for each state the actor performs 5 actions (brush strokes), and I understand that the discriminator gives the reward to the actor. But what about the critic? Is Q updated for each of the five actions?

Thank you very much in advance

some typos

I noticed some typos in your paper:

  1. Equation 3 has a hanging parenthesis on the very right:

V(s_t) = r(s_t, a_t) + γV(s_{t+1}))

suggested fix:

V(s_t) = r(s_t, a_t) + γV(s_{t+1})

  2. On page 5, in the first sentence of the last paragraph:

The neural renderer network is consisting of several fully connect layers and convolution layers

suggested fix:

The neural renderer network consists of several fully connected layers and convolution layers

Hope it helps :)

training

When running train.py with CelebA, training automatically gets interrupted:

loaded 200000 images
finish loading data, 197999 training images, 2001 testing images
observation_space (96, 128, 128, 7) action_space 13
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1332: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
/content/LearningToPaint/baseline/DRL/ddpg.py:158: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  s0 = torch.tensor(self.state, device='cpu')
/content/LearningToPaint/baseline/DRL/ddpg.py:161: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  s1 = torch.tensor(state, device='cpu')
^C

Also, I am running on a GPU.

spectral normalization GAN

Have you tried a spectral normalization GAN and adding an L1 distance term to the WGAN loss? I wonder how these two changes would impact performance:

1. Replacing WGAN-GP with spectral normalization

Spectral normalization has two main advantages:

  1. Slight performance improvement relative to WGAN-GP on ResNet. The inception score with spectral normalization had a slight upper hand (approximately 0.16 higher), with less deviation compared to WGAN-GP.

  2. Spectral normalization is ~30% more computationally efficient.
    Since both the actor and the critic use ResNet as the backbone, replacing WGAN-GP with spectral normalization could potentially yield meaningful results (a small sketch follows below).

2. Combining WGAN-GP with spectral normalization

The authors of the spectral normalization paper suggest that combining WGAN-GP with spectral normalization can further improve results compared to both the baseline WGAN-GP and the spectral normalization GAN.
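As a concrete reference for point 1, here is a minimal sketch (not part of this repo) of a spectrally normalized critic in PyTorch. The 6 input channels, assuming the canvas and the target image are concatenated, are my assumption:

import torch.nn as nn
from torch.nn.utils import spectral_norm

class SNCritic(nn.Module):
    # Wrapping each conv/linear layer with spectral_norm bounds its Lipschitz
    # constant, which removes the need for the WGAN-GP gradient penalty term.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            spectral_norm(nn.Conv2d(6, 64, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            spectral_norm(nn.Conv2d(64, 128, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            spectral_norm(nn.Linear(128, 1)),
        )

    def forward(self, x):
        return self.net(x)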

How to change strokes?

The readme.md says a step contains 5 strokes by default. When I train another model, where can I change the number of strokes?

Output strokes per iteration only.

Hi,
thank you for your work.

I am struggling to modify the code so that, when running python3 baseline/test.py --max_step=80 --actor=actor.pkl --renderer=renderer.pkl --img=path_to_image --divide=5, it generates images containing only the strokes added during the latest iteration, instead of the sum of all strokes.

Do you have an idea on this?

Thank you

hard_update

Hello :)

Could you tell me why this function is necessary and what exactly it does?

def hard_update(target, source):
    for m1, m2 in zip(target.modules(), source.modules()):
        m1._buffers = m2._buffers.copy()
    for target_param, param in zip(target.parameters(), source.parameters()):
        target_param.data.copy_(param.data)

I do not understand! Thanks so much!
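For context, here is my understanding of the standard DDPG setup (not an authoritative answer for this repo): hard_update copies the source network's buffers and parameters into the target network in one shot, which is typically done when the target networks are created so they start identical to the online networks. During training the target networks are usually moved only gradually, with a soft update such as:

def soft_update(target, source, tau=0.001):
    # Nudge the target network toward the online network after each training
    # step; keeping the target slow-moving stabilizes the TD targets.
    for target_param, param in zip(target.parameters(), source.parameters()):
        target_param.data.copy_(target_param.data * (1.0 - tau) + param.data * tau)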

Any other reward function

Very cool project! It seems using GAN loss here is a natural choice to compare the drawing and images. Have you ever tried other losses like the perceptual loss? Thank you!

stroke sequence learning

@hzwer
Can this method be trained not only to paint, but also to paint in a certain sequence?
I am interested in training a network to learn the sequence and order of the drawing and strokes.
Any suggestions?

Renderer Training Doubts

Hi, I have a question about the way the neural renderer is trained.
In a generic ML/DL training procedure we fix train and validation sets of fixed size, backpropagate on the train set, and evaluate on the validation set. But here, batches of 64 samples are generated randomly for both training and validation (validation after every 1000th iteration, as far as I remember), and training runs for 500,000 iterations. I find this confusing: the randomly generated samples can vary drastically across iterations, so how do you ensure the model improves? Are you simply trying to fit the model to all possible combinations of coordinates on the canvas? I want to understand why you took this approach.

Thanks
Niharika
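One way to read the training scheme (my interpretation, not the authors' statement): the renderer is regressing onto a deterministic rasterizer rather than onto a fixed dataset, so sampling fresh stroke parameters every batch amounts to drawing i.i.d. samples from an effectively unlimited training distribution, and the random validation batches are just a noisy estimate of the same loss. A minimal sketch of such a loop, where rasterize is a hypothetical stand-in for the ground-truth stroke renderer:

import numpy as np
import torch

def renderer_train_step(renderer, optimizer, rasterize, batch_size=64):
    # Sample fresh stroke parameters, rasterize them with the non-differentiable
    # ground-truth renderer (rasterize(p) is assumed to return a float32 numpy
    # image), and regress the neural renderer onto the result.
    params = np.random.uniform(size=(batch_size, 10)).astype(np.float32)
    target = torch.stack([torch.from_numpy(rasterize(p)) for p in params])
    pred = renderer(torch.from_numpy(params))
    loss = ((pred - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()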

Stroke gen

Hi, looking at the draw() function it seems like the generator creates greyscale brushstrokes. Where do the colour parameters get inputted?

Undefined name 'init' in actor.py

flake8 testing of https://github.com/hzwer/LearningToPaint on Python 3.7.1

$ flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics

./baseline/DRL/actor.py:17:9: F821 undefined name 'init'
        init.xavier_uniform(m.weight, gain=np.sqrt(2))
        ^
./baseline/DRL/actor.py:18:9: F821 undefined name 'init'
        init.constant(m.bias, 0)
        ^
2     F821 undefined name 'init'
2

E901, E999, F821, F822, and F823 are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. These 5 are different from most other flake8 issues, which are merely "style violations": useful for readability, but they do not affect runtime safety.

  • F821: undefined name name
  • F822: undefined name name in __all__
  • F823: local variable name referenced before assignment
  • E901: SyntaxError or IndentationError
  • E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree
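A likely fix (my assumption about the intended code, not a confirmed patch) is to import the init module in baseline/DRL/actor.py; note that current PyTorch prefers the underscore-suffixed init functions:

import numpy as np
import torch.nn as nn
from torch.nn import init

def weights_init(m):
    # Xavier initialization for conv/linear layers, matching the flagged calls.
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        init.xavier_uniform_(m.weight, gain=np.sqrt(2))
        init.constant_(m.bias, 0)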

Questions about other datasets

Hello! When I train with the CUB Birds and Stanford Cars datasets, the generated images show only a single color and do not change as training progresses. The only code I modified is load_data(). What could cause this?

The differences between `env_batch` and `batch_size`

Hi, I dived into the code of your paper and I'm confused by the two variables env_batch and batch_size, which seem to be the same according to your implementation.

Could you give me some hints to help me figure it out? Thank you very much

Critic and discriminator

Hi!
I am trying to understand the deep reinforcement learning part. I know that the actor outputs a set of stroke parameters based on the canvas state and the target image, and that the discriminator gives the actor a reward at each step. But what about the critic? What are its inputs and outputs? I am reading the paper but I do not understand this part.
thank you so much

cleanup

While I was reviewing your code to better understand your paper, I found some dead code. Would you mind if I cleaned up some of it, added some instructive comments (for people like me), and sent a PR?

Question about img_test in load_data

def load_data(self):
    # CelebA
    global train_num, test_num
    for i in range(200000):
        img_id = '%06d' % (i + 1)
        try:
            img = cv2.imread('/data/CelebA/celeba/img_align_celeba/' + img_id + '.jpg', cv2.IMREAD_UNCHANGED)
            img = cv2.resize(img, (width, width))
            if i > 2000:
                train_num += 1
                img_train.append(img)
            else:
                test_num += 1
                img_test.append(img)
        finally:
            if (i + 1) % 10000 == 0:
                print('loaded {} images'.format(i + 1))
    print('finish loading data, {} training images, {} testing images'.format(str(train_num), str(test_num)))
In the load_data function in env.py, images 0–1999 are appended to the img_test list. Where are these test images actually used? I would like to use these 2000 images for a quantitative evaluation of the model; how should I do that? test.py only tests a single image.

Question about Q value

I love this amazing project. I'm surprised that neural networks can do such an incredible thing.
I have a small question about the Q value. In the paper, cur_q = reward + γ * target_q, so normally evaluate() should "return Q, gan_reward". This is indeed the case in the model-free method, but in the model-based method it is "return (Q + gan_reward), gan_reward", which confuses me. Why does the Q value need the reward of the same step added to it?

how wgan is trained??

Thank you for sharing your awesome work!

In your paper, you mention that the WGAN discriminator loss is used to define the reward.

But how is the WGAN trained in your work? (Is it pre-trained to some extent beforehand and then used in cal_dis?)
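A hedged note: the usual recipe in this kind of setup is to train the WGAN critic online, alternating its updates with the agent's updates, rather than fully pre-training it; whether that is exactly what happens around cal_dis here is for the authors to confirm. For reference, a generic WGAN-GP critic step looks like:

import torch

def update_discriminator(D, opt, real, fake, lambda_gp=10.0):
    # Standard WGAN-GP critic update: maximize D(real) - D(fake) while keeping
    # the gradient norm on interpolated samples close to 1.
    opt.zero_grad()
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake.detach()).requires_grad_(True)
    grad = torch.autograd.grad(D(interp).sum(), interp, create_graph=True)[0]
    gp = ((grad.view(grad.size(0), -1).norm(2, dim=1) - 1) ** 2).mean()
    loss = D(fake.detach()).mean() - D(real).mean() + lambda_gp * gp
    loss.backward()
    opt.step()
    return loss.item()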

Different Neural Renderer

Hello @hzwer,
Kindly, I have 2 questions:-

  1. I noticed you provide extra renderers in the README file. What modifications did you apply to the stroke_gen file so that you could train those renderers?
  2. What do the file names bezierwotrans.pkl and actor_notrans.pkl stand for?

Thanks in advance

Decoding of strokes

The strokes are rendered from parameters and composited onto a canvas in the decode function:

def decode(x, canvas): # b * (10 + 3)

I've got a couple of questions regarding the procedure. Why does the decoder return

return 1 - x.view(-1, 128, 128)
? It is trained by comparing against the ground truth, so why should it learn the inverse instead of the actual image?

Why is the stroke then

stroke = 1 - Decoder(x[:, :10])
?

And why is it added to the canvas via

canvas = canvas * (1 - stroke[:, i]) + color_stroke[:, i]
?

I don't understand why you would do the 1 - stroke at every step in this chain. Also the canvas is initialized to all zeros. Is the canvas * (1 - stroke[:, k]) in canvas = canvas * (1 - stroke[:, k]) + color_stroke[:, k] really necessary? stroke is included in color_stroke anyway.

Am I missing something? Thanks for any help!
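For what it's worth, here is an annotated reading of that compositing step as standard alpha ("over") blending; the tensor shapes and the interpretation of the 1 - x flip are my assumptions, not something confirmed by the authors:

def composite_strokes(canvas, stroke, color_stroke):
    # canvas: (b, 3, 128, 128); stroke: (b, 5, 1, 128, 128) alpha maps;
    # color_stroke: (b, 5, 3, 128, 128) = alpha map times RGB color (assumed shapes).
    # One consistent reading: the renderer's training target has paint = 0 and
    # background = 1, so 1 - Decoder(...) flips it into an alpha map with paint = 1.
    for k in range(5):
        # "Over" compositing: the stroke's alpha first erases the covered region
        # of the canvas, then the colored stroke is painted in. Without the
        # (1 - stroke[:, k]) term, overlapping strokes would simply sum (and can
        # overflow) instead of a later stroke covering an earlier one.
        canvas = canvas * (1 - stroke[:, k]) + color_stroke[:, k]
    return canvas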

How were straight strokes, circles and triangles generated?

Thanks for your nice work,

I am just wondering: for simple strokes like (flat) circles, triangles, and rectangles, do we really need the renderer, since we already have a simpler state representation? For example, a circle only needs a center and a radius instead of a 10-value state vector.
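One observation (not an official answer): for flat primitives a closed-form rasterizer is indeed enough to produce the image, and the main reason for a neural renderer is that it is differentiable, so gradients can flow from the canvas loss back to the stroke parameters. A hypothetical OpenCV rasterizer for a circle parameterized by a normalized center and radius:

import cv2
import numpy as np

def draw_circle(cx, cy, radius, width=128):
    # cx, cy, radius are in [0, 1]; returns a (width, width) float32 mask.
    # This is easy to write but, unlike a neural renderer, not differentiable.
    canvas = np.zeros((width, width), dtype=np.float32)
    cv2.circle(canvas, (int(cx * width), int(cy * width)),
               int(radius * width), 1.0, -1)
    return canvas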

Run out of memory

When I ran train.py on a GPU, it seems that RAM ran out. My computer has 46 GB of RAM in total, including 30 GB of virtual memory.

$ python3 baseline/train.py --max_step=200 --debug --batch_size=96
mkdir: cannot create directory ‘./model’: File exists
loaded 10000 images
loaded 20000 images
loaded 30000 images
loaded 40000 images
loaded 50000 images
loaded 60000 images
loaded 70000 images
loaded 80000 images
loaded 90000 images
loaded 100000 images
loaded 110000 images
loaded 120000 images
loaded 130000 images
loaded 140000 images
loaded 150000 images
loaded 160000 images
loaded 170000 images
loaded 180000 images
loaded 190000 images
loaded 200000 images
finish loading data, 197999 training images, 2001 testing images
observation_space (96, 128, 128, 7) action_space 13
/home/rody/xu/npaint/LearningToPaint/baseline/DRL/ddpg.py:157: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  s0 =torch.tensor(self.state, device='cpu')
/home/rody/xu/npaint/LearningToPaint/baseline/DRL/ddpg.py:163: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  s1 =torch.tensor(state, device='cpu')
 #0: steps:200 interval_time:9.08 train_time:0.00
 #1: steps:400 interval_time:22.40 train_time:0.00
 #2: steps:600 interval_time:19.66 train_time:6.90
 #3: steps:800 interval_time:20.01 train_time:5.28
 #4: steps:1000 interval_time:20.89 train_time:6.01
 #5: steps:1200 interval_time:20.52 train_time:6.34
 #6: steps:1400 interval_time:18.20 train_time:7.01
Killed

Here's the memory footprint

              total        used        free      shared  buff/cache   available
Mem:          15892       15627         139          11         125          81
Swap:         30273       30273           0
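The log suggests all 200,000 decoded CelebA images are held in RAM at once, which is already roughly 200000 × 128 × 128 × 3 bytes ≈ 9.8 GB for uint8 pixels, before any replay buffers. A sketch of one workaround (not part of the repo) that keeps only the file paths in memory and decodes images on access:

import cv2

class LazyCelebA:
    # Path layout follows the load_data snippet quoted in another issue;
    # adjust root/count to your setup.
    def __init__(self, root, width=128, count=200000):
        self.paths = ['%s/%06d.jpg' % (root, i + 1) for i in range(count)]
        self.width = width

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = cv2.imread(self.paths[idx], cv2.IMREAD_UNCHANGED)
        return cv2.resize(img, (self.width, self.width))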

Neural Renderer

Hello !

I want to understand the neural renderer (the DL network) part. How did you train this neural renderer?
If there is a dataset, please provide a link to it.
Did you use a traditional rendering algorithm in this case? (If so, how?)

Thank you

Parameter Doubts

A few doubts about parameters:

Q1. What is the difference between max_step, train_times, and episode_train_times? Can you please define them?

Q2. What happens during the warmup stage? (Is there any issue if we keep warmup=0?)

About stroke generation

In stroke_gen.py you use a quadratic Bezier curve to generate strokes. I wonder why (x1, y1) is calculated from (x0, y0) and (x2, y2):

x1 = x0 + (x2 - x0) * x1
y1 = y0 + (y2 - y0) * y1

What would happen if I commented out these 2 lines?
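My reading (not an official answer): since the raw x1, y1 are in [0, 1], these two lines remap the control point so that it lies inside the box spanned by the two endpoints, which keeps the curve close to the segment from (x0, y0) to (x2, y2); commenting them out would let the curve bow arbitrarily far from its endpoints. For reference, the quadratic Bezier point at parameter t is:

def bezier_point(t, x0, y0, x1, y1, x2, y2):
    # Standard quadratic Bezier: endpoints (x0, y0), (x2, y2), control point (x1, y1).
    x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * x1 + t ** 2 * x2
    y = (1 - t) ** 2 * y0 + 2 * (1 - t) * t * y1 + t ** 2 * y2
    return x, y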

confused by the update policy.

In update_policy():
cur_q, step_reward = self.evaluate(state, action)
target_q += step_reward.detach()
value_loss = criterion(cur_q, target_q)

It's quite confusing... so value_loss = discount * (self.critic(S_{t+1}) + reward(S_{t+1})) - self.critic(S_t)?

Shouldn't it be: value_loss = discount * self.critic(S_{t+1}) + reward(S_t) - self.critic(S_t)?
