Comments (6)
I think I have solved this question now. Please read the code that computes the reconstruction loss. Here is my explanation.
Gray-scale shading assumption
- A colorful pixel can be decomposed into:
- RGB = intensity * chromaticity
- That means,
- input_rgb_image = img_intensity * img_chromaticity
- Note that the input image here is represented in linear RGB space, whereas a photo loaded from disk is usually in sRGB space. You can find the conversion code in this project. Be careful: the input image for the network is in sRGB space, but the reconstruction loss is computed on the image in linear RGB space.
- R = R_intensity * R_chromaticity
- S = S_intensity * S_chromaticity
- input_rgb_image = img_intensity * img_chromaticity
- Under the gray-scale shading (white lighting) assumption, the above can be rewritten as:
- S = S_intensity
- img_chromaticity = R_chromaticity
- R = R_intensity * img_chromaticity
- For a more detailed explanation of intensity and chromaticity, I recommend reading:
- Intrinsic images in the wild
- Revisiting Deep Intrinsic Image Decompositions
- You can find all of this code in the project, especially in the data_loader script (if not, look at the code in the IIW and SAW repositories):
- rgb to srgb
- srgb to rgb
- compute intensity (it is actually the average value of the RGB channels)
- compute chromaticity
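The decomposition above can be sketched in a few lines of NumPy. This is only an illustration, not the project's code: the function names are hypothetical, the sRGB decoding uses the standard piecewise formula, and chromaticity is defined here as RGB divided by the mean intensity so that the product recovers the pixel exactly (the project's helper may normalize by a different constant).

```python
import numpy as np


def srgb_to_linear(srgb):
    """Standard sRGB decoding for values in [0, 1] (assumed, not the project's code)."""
    srgb = np.asarray(srgb, dtype=np.float64)
    low = srgb / 12.92
    high = ((srgb + 0.055) / 1.055) ** 2.4
    return np.where(srgb <= 0.04045, low, high)


def compute_intensity(rgb):
    """Average of the RGB channels; HxWx3 -> HxWx1."""
    return np.mean(rgb, axis=-1, keepdims=True)


def compute_chromaticity(rgb, eps=1e-6):
    """RGB divided by intensity, so that rgb == intensity * chromaticity."""
    return rgb / np.maximum(compute_intensity(rgb), eps)


# Synthetic sRGB image in [0.05, 1] standing in for a photo loaded from disk.
srgb_image = np.random.default_rng(0).uniform(0.05, 1.0, size=(4, 4, 3))
linear_image = srgb_to_linear(srgb_image)

intensity = compute_intensity(linear_image)
chromaticity = compute_chromaticity(linear_image)

# RGB = intensity * chromaticity
recomposed = intensity * chromaticity
```

With this definition the three chromaticity channels of each pixel average to 1, which is why multiplying by the mean intensity gives back the linear RGB value.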
Network prediction
- The outputs of the network are actually:
- S_pred = log(S_intensity)
- R_pred = log(R_intensity)
- If you want to get S and R:
- S = exp(S_pred)
- R = exp(R_pred) * img_chromaticity
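As a small sketch of the two formulas above, assuming channel-first arrays (1xHxW predictions and a 3xHxW chromaticity map); `decode_predictions` is a hypothetical helper, and NumPy is used here in place of torch only to keep the example dependency-light:

```python
import numpy as np


def decode_predictions(S_pred, R_pred, img_chromaticity):
    """Turn log-space network outputs into shading S and reflectance R."""
    S = np.exp(S_pred)                     # 1xHxW gray-scale shading
    R = np.exp(R_pred) * img_chromaticity  # 1xHxW broadcast against 3xHxW
    return S, R


# Toy inputs: constant shading 0.5, constant reflectance intensity 0.8.
S_pred = np.log(np.full((1, 2, 2), 0.5))
R_pred = np.log(np.full((1, 2, 2), 0.8))
chroma = np.ones((3, 2, 2))
chroma[0] = 1.2  # slightly more of the first channel
chroma[2] = 0.8  # slightly less of the third channel

S, R = decode_predictions(S_pred, R_pred, chroma)
```

Broadcasting does the work of repeating the single-channel reflectance intensity across the three colour channels, so no explicit `repeat` is needed.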
Reconstruct original input image
- RGB_reconst_image = S * R
- If you want to visualize it, convert RGB_reconst_image into sRGB space
- sRGB is the space displays expect, so the image looks correct to the human eye
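A minimal sketch of this reconstruction step, assuming channel-last arrays (HxWx1 shading, HxWx3 reflectance) in linear RGB. The helper names are hypothetical and the encoding is the standard sRGB transfer function, which may differ in detail from the project's own rgb-to-srgb helper:

```python
import numpy as np


def linear_to_srgb(linear):
    """Standard sRGB encoding for linear values, clipped to [0, 1]."""
    linear = np.clip(np.asarray(linear, dtype=np.float64), 0.0, 1.0)
    low = 12.92 * linear
    high = 1.055 * linear ** (1.0 / 2.4) - 0.055
    return np.where(linear <= 0.0031308, low, high)


def reconstruct_srgb_uint8(S, R):
    """RGB_reconst_image = S * R, then encode to 8-bit sRGB for viewing."""
    rgb_reconst = S * R  # HxWx1 shading broadcast over HxWx3 reflectance
    srgb = linear_to_srgb(rgb_reconst)
    return np.round(srgb * 255.0).astype(np.uint8)


# Toy example: uniform shading and uniform gray reflectance.
S = np.full((2, 2, 1), 0.5)
R = np.full((2, 2, 3), 0.4)
image = reconstruct_srgb_uint8(S, R)
```

The resulting `image` array can be handed to any image writer that expects 8-bit data; skipping the sRGB encoding typically makes the saved picture look too dark.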
from cgintrinsics.
Thanks to @tlsshh, I've also solved the visualization problem.
Here is my implementation, applied to a different image set (neither IIW nor SAW).
(Based on test_iiw.py)
import cv2
import numpy as np
import torch

# TrainOptions, create_model, srgb_to_rgb and rgb_to_chromaticity are helpers from the project code;
# get_minc_loader is defined further down. `save` is an image-writing helper whose import is not shown here.
opt = TrainOptions().parse()  # set CUDA_VISIBLE_DEVICES before import torch
#root = "/home/zl548/phoenix24/"
#full_root = root +'/phoenix/S6/zl548/'
model = create_model(opt)
minc_loader = get_minc_loader(batch_size=16)
for i, (x, y) in enumerate(minc_loader):
    x = x.float().cuda()
    x = x.view(-1, 3, 256, 256)  # -1 so that a smaller final batch also works
    output_R, output_S = model.test_minc(x)
    for j in range(0, output_R.size(0)):
        # the network outputs are in log space, so exponentiate them
        prediction_R = output_R.data[j, :, :, :]
        prediction_R = torch.exp(prediction_R)  #.repeat(1,3,1,1)
        prediction_S = output_S.data[j, :, :, :]
        prediction_S = torch.exp(prediction_S)  #.repeat(1,3,1,1)
        # calc chromaticity
        srgb_img = x.data[j, :, :, :]
        rgb_img = srgb_to_rgb(np.transpose(srgb_img.cpu().numpy(), (1, 2, 0)))
        rgb_img[rgb_img < 1e-4] = 1e-4
        chromaticity = rgb_to_chromaticity(rgb_img)  # opt 1
        # opt 2 chromaticity = rgb_to_chromaticity(srgb_img.cpu().numpy())
        chromaticity = torch.from_numpy(np.transpose(chromaticity, (2, 0, 1))).contiguous().float()
        # R = exp(R_pred) * img_chromaticity
        p_R = torch.mul(chromaticity, prediction_R.cpu())
        p_R_np = p_R.cpu().numpy()
        p_R_np = np.transpose(p_R_np, (1, 2, 0))
        p_R_np = cv2.cvtColor(p_R_np, cv2.COLOR_BGR2RGB)
        p_S_np = np.transpose(prediction_S.cpu().numpy(), (1, 2, 0))
        p_S_np = np.squeeze(p_S_np, axis=2)
        save('D:/minc_decomposition/yuv/' + str(i) + '_' + str(j) + '_albedo.png', p_R_np)
        save('D:/minc_decomposition/yuv/' + str(i) + '_' + str(j) + '_shading.png', p_S_np)
        cv2.imwrite('D:/minc_decomposition/yuv/' + str(i) + '_' + str(j) + '_original.png', np.transpose(srgb_img.cpu().numpy(), (1, 2, 0)))
print('save data done')
For better understanding, here is my additional code: the test function and the data loader.
Here is the test_minc function that I added to intrinsic_model.py
def test_minc(self, input_):
    # forward pass only: returns the raw log-space predictions for R and S
    prediction_R, prediction_S = self.netG.forward(input_)
    return prediction_R, prediction_S
Here are the dataset class and the loader function that I added to data_loader
class TestDataSet(Dataset):
    """
    Custom Test DataSet class
    """
    def __init__(self, transform=None):
        x_test = []
        y_test = []
        test_file_name = 'D:/labels/test2.txt'  # image list
        with open(test_file_name, 'r') as test_file:
            test_list = test_file.readlines()
        print('start loading test data')
        for i in range(len(test_list)):
            img = cv2.imread('D:/MINC/minc-2500/minc-2500/' + test_list[i].rstrip('\n'))  # load image
            img = cv2.resize(img, (256, 256))  # cv2 loads images in BGR format
            img = np.transpose(img, (2, 0, 1))  # HxWxC -> CxHxW
            x_test.append(img)
            # this project does not require image labels, so every image gets a dummy label of 1
            y_test.append(1)  #convert_class(label[1])) # ex) 'brick'
        print('loading test data done')
        x = np.asarray(x_test)
        y = np.asarray(y_test)
        #y_test = np_utils.to_categorical(y_test[:len(test_list)], num_classes)
        self.len = len(y)
        #self.transform = transform
        self.x_data = torch.from_numpy(x)
        self.y_data = torch.from_numpy(y)

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.len
def get_minc_loader(batch_size,
                    shuffle=True,
                    show_sample=False,
                    num_workers=2,
                    pin_memory=True):
    # load dataset
    # note: shuffle, show_sample and num_workers are accepted but not used below
    test_dataset = TestDataSet()
    test_loader = torch.utils.data.DataLoader(
        test_dataset, batch_size=batch_size, shuffle=False,
        pin_memory=pin_memory)
    return test_loader
@lixx2938 In your previous work, Learning Intrinsic Image Decomposition from Watching the World, the model predicts two images (a 3-channel log-reflectance and a 1-channel log-lighting), so I could compute torch.exp(log_r + log_s).
But in this repository the model predicts two 1-channel images. How can I use them to reconstruct the original image?
same question here
I'm really looking forward to visualizing the prediction R and S……
I have the same question, have you solved it?