
Comments (6)

jundanl commented on July 18, 2024

I think I have solved this question now. Please read the code that computes the reconstruction loss. Here is my explanation.

Gray-scale shading assumption

  1. A color pixel can be decomposed as:
    1. RGB = intensity * chromaticity
    2. That means:
      • input_rgb_image = img_intensity * img_chromaticity
        • Note that the input image here is represented in linear RGB space, while a photo loaded from disk is usually in sRGB space. You can find the conversion code in this project. Be careful: the image fed to the network is in sRGB space, but when computing the reconstruction loss the image is in linear RGB space.
      • R = R_intensity * R_chromaticity
      • S = S_intensity * S_chromaticity
  2. Under the gray-scale shading (white lighting) assumption, the above can be rewritten as:
    • S = S_intensity
    • img_chromaticity = R_chromaticity
    • R = R_intensity * img_chromaticity
  3. For a more detailed explanation of intensity and chromaticity, I recommend reading:
    • Intrinsic images in the wild
    • Revisiting Deep Intrinsic Image Decompositions
  4. You can find all of this code in the project (especially in the data_loader script); if not, look at the code in the IIW and SAW repositories. A minimal sketch of these helpers follows after this list:
    • rgb to srgb
    • srgb to rgb
    • compute intensity (it is actually the average of the RGB channels)
    • compute chromaticity
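
For reference, here is a minimal sketch of what those helpers look like, assuming float images in [0, 1] with shape (H, W, 3). The names follow the CGIntrinsics/IIW conventions; the versions in the project's data_loader are authoritative.

import numpy as np

def srgb_to_rgb(srgb):
    # Inverse sRGB companding: sRGB values in [0, 1] -> linear RGB.
    ret = np.zeros_like(srgb)
    low = srgb <= 0.04045
    high = srgb > 0.04045
    ret[low] = srgb[low] / 12.92
    ret[high] = np.power((srgb[high] + 0.055) / 1.055, 2.4)
    return ret

def rgb_to_srgb(rgb):
    # Forward sRGB companding: linear RGB -> sRGB for display.
    ret = np.zeros_like(rgb)
    low = rgb <= 0.0031308
    high = rgb > 0.0031308
    ret[low] = rgb[low] * 12.92
    ret[high] = 1.055 * np.power(rgb[high], 1.0 / 2.4) - 0.055
    return ret

def rgb_to_intensity(rgb):
    # Intensity is the average of the three linear RGB channels.
    return np.mean(rgb, axis=2)

def rgb_to_chromaticity(rgb):
    # Chromaticity is the per-pixel color direction: RGB divided by intensity.
    intensity = rgb_to_intensity(rgb)[:, :, np.newaxis]
    return rgb / np.maximum(intensity, 1e-4)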

Network prediction

  1. The outputs of the network are actually:
    • S_pred = log(S_intensity)
    • R_pred = log(R_intensity)
  2. If you want to recover S and R (a sketch follows this list):
    • S = exp(S_pred)
    • R = exp(R_pred) * img_chromaticity
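
As a minimal sketch, assuming S_pred and R_pred are the raw 1-channel network outputs and img_chromaticity is the 3-channel chromaticity of the linear-RGB input (the function name is illustrative):

import torch

def predictions_to_linear(S_pred, R_pred, img_chromaticity):
    # S_pred, R_pred: 1-channel log-intensity outputs of the network.
    # img_chromaticity: 3-channel chromaticity of the linear-RGB input image.
    S = torch.exp(S_pred)                     # shading intensity
    R = torch.exp(R_pred) * img_chromaticity  # colored reflectance
    return S, R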

Reconstruct original input image

  1. RGB_reconst_image = S * R
  2. If you want to visualize it, convert RGB_reconst_image into sRGB space (a sketch follows below)
    • sRGB space is suitable for human eyes
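
Putting these together, a minimal reconstruction sketch (reusing the hypothetical predictions_to_linear outputs and the rgb_to_srgb helper sketched above; tensors are assumed to be in (C, H, W) layout, linear RGB space):

import numpy as np
import torch

def reconstruct_srgb(S, R):
    # S: (1, H, W) shading, R: (3, H, W) reflectance, both in linear RGB space.
    rgb_reconst = (S * R).cpu().numpy()                 # RGB_reconst_image = S * R
    rgb_reconst = np.transpose(rgb_reconst, (1, 2, 0))  # CHW -> HWC
    rgb_reconst = np.clip(rgb_reconst, 0.0, 1.0)        # keep values displayable
    return rgb_to_srgb(rgb_reconst)                     # compand for display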


shapin94 commented on July 18, 2024

Thanks to @tlsshh, I've also solved the visualization problem.
Here is my implementation code with a different image set (neither IIW nor SAW),
based on test_iiw.py:

import cv2
import numpy as np
import torch

# imports as in the project's test_iiw.py
from options.train_options import TrainOptions
from models.models import create_model
# srgb_to_rgb and rgb_to_chromaticity are the conversion helpers
# from the project's data_loader utilities

opt = TrainOptions().parse()  # set CUDA_VISIBLE_DEVICES before import torch

#root = "/home/zl548/phoenix24/"
#full_root = root +'/phoenix/S6/zl548/'

model = create_model(opt)
minc_loader = get_minc_loader(batch_size=16)

for i, (x, y) in enumerate(minc_loader):
    x = x.float().cuda()
    x = x.view(-1, 3, 256, 256)  # -1 keeps the last, possibly smaller, batch working
    output_R, output_S = model.test_minc(x)

    for j in range(0, output_R.size(0)):

        # Network outputs are log-intensities; exponentiate them.
        prediction_R = output_R.data[j, :, :, :]
        prediction_R = torch.exp(prediction_R)

        prediction_S = output_S.data[j, :, :, :]
        prediction_S = torch.exp(prediction_S)

        # Compute chromaticity from the linear-RGB version of the input.
        srgb_img = x.data[j, :, :, :]
        rgb_img = srgb_to_rgb(np.transpose(srgb_img.cpu().numpy(), (1, 2, 0)))
        rgb_img[rgb_img < 1e-4] = 1e-4
        chromaticity = rgb_to_chromaticity(rgb_img)  # option 1: from linear RGB
        # option 2: chromaticity = rgb_to_chromaticity(srgb_img.cpu().numpy())
        chromaticity = torch.from_numpy(
            np.transpose(chromaticity, (2, 0, 1))).contiguous().float()

        # Colored reflectance = chromaticity * predicted reflectance intensity.
        p_R = torch.mul(chromaticity, prediction_R.cpu())
        p_R_np = np.transpose(p_R.cpu().numpy(), (1, 2, 0))
        # The loader reads images with cv2 (BGR order), so swap to RGB before saving.
        p_R_np = cv2.cvtColor(p_R_np, cv2.COLOR_BGR2RGB)

        p_S_np = np.transpose(prediction_S.cpu().numpy(), (1, 2, 0))
        p_S_np = np.squeeze(p_S_np, axis=2)

        # save() is an image-saving helper from the surrounding project code.
        save('D:/minc_decomposition/yuv/' + str(i) + '_' + str(j) + '_albedo.png', p_R_np)
        save('D:/minc_decomposition/yuv/' + str(i) + '_' + str(j) + '_shading.png', p_S_np)
        # scale back to 0-255 since the loader normalized the input to [0, 1]
        cv2.imwrite('D:/minc_decomposition/yuv/' + str(i) + '_' + str(j) + '_original.png',
                    np.transpose(srgb_img.cpu().numpy(), (1, 2, 0)) * 255.0)
    print('save data done')

For better understanding, here is my additional code for the test function and the data loader.

Here is the test_minc function, which I added to intrinsic_model.py:

    def test_minc(self, input_):
        prediction_R, prediction_S = self.netG.forward(input_)
        return prediction_R, prediction_S

Here are the dataset class and loader function, which I added to the data_loader script:

import cv2
import numpy as np
import torch
from torch.utils.data import Dataset

class TestDataSet(Dataset):
    """
    Custom test dataset class.
    """
    def __init__(self, transform=None):
        x_test = []
        y_test = []
        test_file_name = 'D:/labels/test2.txt'  # image list
        test_file = open(test_file_name, 'r')
        test_list = test_file.readlines()
        test_file.close()

        print('start loading test data')
        for i in range(len(test_list)):
            # cv2 loads images in BGR channel order
            img = cv2.imread('D:/MINC/minc-2500/minc-2500/' + test_list[i].rstrip('\n'))
            img = cv2.resize(img, (256, 256))
            img = img.astype(np.float32) / 255.0  # scale to [0, 1]; the sRGB/linear conversions expect this range
            img = np.transpose(img, (2, 0, 1))    # HWC -> CHW

            x_test.append(img)
            # The project does not use image labels here, so a dummy label of 1 is assigned.
            y_test.append(1)

        print('loading test data done')
        x = np.asarray(x_test)
        y = np.asarray(y_test)

        self.len = len(y)
        self.x_data = torch.from_numpy(x)
        self.y_data = torch.from_numpy(y)

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.len

def get_minc_loader(batch_size,
                    shuffle=False,
                    num_workers=2,
                    pin_memory=True):
    # load dataset
    test_dataset = TestDataSet()

    # A test loader is not shuffled by default.
    test_loader = torch.utils.data.DataLoader(
        test_dataset, batch_size=batch_size, shuffle=shuffle,
        num_workers=num_workers, pin_memory=pin_memory)

    return test_loader


windj007 commented on July 18, 2024

@lixx2938 In your previous work, Learning Intrinsic Image Decomposition from Watching the World, the model predicts two images (3-channel log-reflectance and 1-channel log-lighting), so I could do torch.exp(log_r + log_s).

But in this repository the model predicts two 1-channel images. How can I use them to reconstruct the original image?


irismanchen commented on July 18, 2024

Same question here.


jundanl commented on July 18, 2024

I'm really looking forward to visualizing the predicted R and S…


zhangmohan commented on July 18, 2024

I have the same question. Have you solved it?

