
Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm under Mixed Illumination (ICCV 2021)


Change Log

LSMI Dataset Version : 1.1

1.0 : LSMI dataset released. (Aug 05, 2021)

1.1 : Added an option for saving sub-pair images for 3-illuminant scenes (e.g. _1, _12, _13) and for saving subtracted images (e.g. _2, _3, _23). (Feb 20, 2022)

About

[Paper] [Project site] [Download Dataset] [Video]

This is the official repository of "Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm under Mixed Illumination", which was accepted as a poster at ICCV 2021.

This repository provides

  1. Preprocessing code for the "Large Scale Multi-Illuminant (LSMI) Dataset"
  2. Code for the pixel-level illumination inference U-Net
  3. Pre-trained model parameters for testing the U-Net

If you use our code or dataset, please cite our paper:

@inproceedings{kim2021large,
  title={Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm Under Mixed Illumination},
  author={Kim, Dongyoung and Kim, Jinwoo and Nam, Seonghyeon and Lee, Dongwoo and Lee, Yeonkyung and Kang, Nahyup and Lee, Hyong-Euk and Yoo, ByungIn and Han, Jae-Joon and Kim, Seon Joo},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={2410--2419},
  year={2021}
}

Requirements

Our running environment is as follows:

  • Python version 3.8.3
  • PyTorch version 1.7.0
  • CUDA version 11.2

We provide a Docker image that includes all extra requirements (e.g. dcraw, rawpy, tensorboard, ...) as well as the versions of Python, PyTorch, and CUDA specified above.

You can download the docker image here.

The following instructions assume you are running inside a container based on the Docker image we provide.

Getting Started

Clone this repo

In the docker container, clone this repository first.

git clone https://github.com/DY112/LSMI-dataset.git

Download the LSMI dataset

You should first download the LSMI dataset from here.

The dataset is composed of 3 sub-folders named "galaxy", "nikon", and "sony".

Each camera folder contains several scenes, and each scene folder contains full-resolution RAW files and JPG files converted to the sRGB color space.

Move all three folders to the root of the cloned repository.

Each sub-folder also contains metadata (meta.json) and a train/val/test scene index (split.json).

meta.json provides the following information (a minimal loading sketch follows the list):

  • NumOfLights : Number of illuminants in the scene
  • MCCCoord : Locations of Macbeth color chart
  • Light1,2,3 : Normalized chromaticities of each illuminant (calculated by running 1_make_mixture_map.py)
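A minimal Python sketch of reading these fields (field names follow the list above; the exact nesting of the released files may differ, so treat this as an illustration only):

import json

# Sketch: read per-scene metadata from one sub-folder's meta.json.
with open("galaxy/meta.json", "r") as f:
    meta = json.load(f)

for scene, info in meta.items():
    num_lights = info["NumOfLights"]   # number of illuminants in the scene
    mcc_coord = info["MCCCoord"]       # Macbeth color chart locations
    light1 = info.get("Light1")        # normalized chromaticity of illuminant 1 (if present)
    print(scene, num_lights, light1)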

Preprocess the LSMI dataset

  1. Convert raw images to tiff files

    To convert the original 1-channel Bayer-pattern images to 3-channel RGB tiff images, run the following code:

    python 0_cvt2tiff.py

    You should set the SOURCE and EXT variables appropriately.

    The converted tiff files are generated at the same location as the source file.

    This process uses the DCRAW command with '-h -D -4 -T' as options (see the short sketch at the end of this step).

    No black level subtraction, saturated-pixel clipping, or other processing is applied.

    You can change the parameters as appropriate for your purpose.
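
    For reference, a minimal Python sketch of the kind of DCRAW invocation that 0_cvt2tiff.py wraps (the file name below is just a placeholder):

    import subprocess

    # Sketch: convert one raw file with the DCRAW options noted above.
    # -h half-size, -D document mode (no scaling), -4 linear 16-bit, -T write TIFF.
    subprocess.run(["dcraw", "-h", "-D", "-4", "-T", "scene.dng"], check=True)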

  2. Make mixture map

    python 1_make_mixture_map.py

    Set the CAMERA variable to the target directory you want to process.

    This code does the following operations for each scene:

    • Subtract the black level (no saturation clipping)
    • Use the Macbeth color chart's achromatic patches to find each illuminant's chromaticity
    • Use green-channel pixel values to calculate the pixel-level illuminant mixture map (see the sketch at the end of this step)
    • Mask uncalculable pixel positions (those with a value of 0 for all scene pairs) with ZERO_MASK

    After running this code, mixture-map data in .npy format will be generated in each scene's directory.

    ⚠️ If you run this code with ZERO_MASK=-1, the full-resolution mixture map may contain -1 for uncalculable pixels. You MUST replace this value appropriately before resizing, to prevent the negative value from being interpolated with valid values.
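
    For intuition, here is a minimal sketch (not the repository code) of how a per-pixel mixture coefficient could be estimated from green-channel values for a 2-illuminant scene, with uncalculable pixels set to ZERO_MASK:

    import numpy as np

    ZERO_MASK = -1  # same masking convention as described above

    def mixture_coefficient(img_1, img_12):
        # img_1 : black-level-subtracted image lit by illuminant 1 only
        # img_12: black-level-subtracted image lit by illuminants 1 and 2
        g1 = img_1[..., 1].astype(np.float64)
        g12 = img_12[..., 1].astype(np.float64)
        alpha = np.full(g12.shape, ZERO_MASK, dtype=np.float64)
        valid = g12 > 0
        alpha[valid] = np.clip(g1[valid] / g12[valid], 0.0, 1.0)
        return alpha  # uncalculable pixels stay at ZERO_MASK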

  3. Crop for train/test U-Net (Optional)

    python 2_preprocess_data.py

    This preprocessing code is written only for the U-Net, so you can skip this step and freely process the full-resolution LSMI set (tiff and npy files).

    The image and the mixture map are resized to a square whose side length is given by the SIZE variable inside the code, and the ground-truth image is also generated.

    Note that the sides of the image will be cropped to make the image square.

    If you don't want to crop the sides and just want to resize the whole image, set SQUARE_CROP=False.

    We set the default test size to 256, the train size to 512, and SQUARE_CROP=True.

    The new dataset is created in a folder named CAMERA_SIZE (e.g. galaxy_512).
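
    A rough sketch of the square crop and resize (assuming OpenCV; the actual 2_preprocess_data.py may differ in details). Nearest-neighbor interpolation is a sensible choice for the mixture map so that ZERO_MASK values are not blended with valid ones:

    import cv2

    def square_crop_resize(img, size, square_crop=True, interpolation=cv2.INTER_LINEAR):
        # Center-crop to a square (optional), then resize to size x size.
        if square_crop:
            h, w = img.shape[:2]
            side = min(h, w)
            top, left = (h - side) // 2, (w - side) // 2
            img = img[top:top + side, left:left + side]
        return cv2.resize(img, (size, size), interpolation=interpolation)

    # e.g. mixmap_512 = square_crop_resize(mixmap, 512, interpolation=cv2.INTER_NEAREST)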

Use U-Net for pixel-level AWB

You can download the pre-trained model parameters here.

The pre-trained model was trained on 512x512 data with random crop and random pixel-level relighting augmentation (see the sketch below).

Place the downloaded models folder inside SVWB_Unet.
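
A rough sketch of what the random pixel-level relighting augmentation could look like for a 2-illuminant scene (an illustration built on the mixture map, not the exact training code; the scaling range is an assumption):

import numpy as np

def random_relight(img, alpha, rng=None):
    # img   : HxWx3 input image
    # alpha : HxW mixture map (fraction of illuminant 1 at each pixel)
    rng = rng or np.random.default_rng()
    aug = rng.uniform(0.8, 1.2, size=(2, 3))      # random per-channel scale for each illuminant (assumed range)
    a = alpha[..., None]
    tint_map = a * aug[0] + (1.0 - a) * aug[1]    # per-pixel tint mixed via the mixture map
    return img * tint_map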

  • Test U-Net

    cd SVWB_Unet
    sh test.sh
  • Train U-Net

    cd SVWB_Unet
    sh train.sh

Dataset License

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Contributors

dy112, haoban


Issues

illum_map becomes ZERO in 2_preprocess_data.py due to alpha and chromaticities

Hello,

In the 2_preprocess_data.py code, there is a line:

img_wb = img / illum_map 

This is given in the paper as

Iab(WB) = Iab / Lab where Lab is illum_map. [equation 1]

The illumination map is calculated as alpha*La + (1-alpha)*Lb = Lab.

In this case, if alpha = 0, then Lb = Lab.

Lb is an RGB vector. If even one RGB channel has "zero" chromaticity, then Lb has an element equal to 0, causing a division by zero in [equation 1].

This issue arises, for example, in the galaxy dataset at illum_map[1143,376,2], which is the blue channel.

How can this issue be overcome? Should there be a workaround in calculating alpha, say, so that alpha cannot become 0, or should illuminant RGB vectors have a minimum chromaticity value rather than simply "0"?

Thanks!

about pre-trained model

Hi,
I found that I can't download the pre-trained model; the link won't open. Could you re-share the pre-trained model? Thank you so much.

tint_map has negative values resulting in NaN in uvl conversion

Hello,
In dataloader_v4.py:

if self.random_color and self.split=='train':
    augment_chroma = self.random_color(illum_count)
    ret_dict["illum_chroma"] *= augment_chroma
    tint_map = mix_chroma(uncalculable_masked_mixmap,augment_chroma,illum_count)
    input_rgb = input_rgb * tint_map    # apply augmentation to input image

Random augmentation is applied to the input image here. However, when tint_map has negative values, the input_rgb image also ends up with negative values, and then:

def rgb2uvl(img_rgb):
    """
    convert 3 channel rgb image into uvl
    """
    epsilon = 1e-8
    img_uvl = np.zeros_like(img_rgb, dtype='float32')
    img_uvl[:,:,2] = np.log(img_rgb[:,:,1] + epsilon)
    img_uvl[:,:,0] = np.log(img_rgb[:,:,0] + epsilon) - img_uvl[:,:,2]   
    img_uvl[:,:,1] = np.log(img_rgb[:,:,2] + epsilon) - img_uvl[:,:,2]

    return img_uvl

input_rgb goes into this function and the logarithm receives negative values; therefore img_uvl ends up with NaN values.

How did you train and obtain your results using this augmentation without running into the NaN issue mentioned here?

A workaround might be clamping negative elements of tint_map to 0, but this might adversely affect reproducing your results.

Looking forward to hearing from you and thank you!

Cem

Error in Visualization part

To visualize the coefficient map (.npy file), I changed VISUALIZE to True in the make_mixture_map file. After making this change, I ran the file and it shows: ValueError: could not broadcast input array from shape (6000,8000) into shape (1500,2000), line 214, in apply_wb_raw.

MAE_rgb and MAE_illum

Hello, may I ask whether the average angular error reported in the paper corresponds to the MAE_illum result or the MAE_rgb result in the public code?
Looking forward to your reply

Sony dataset

Hi,

I am using the Sony dataset for my task at hand.
I'm following your provided code. I have a query about the mixture-map part where you define CAMERA and RAW_EXT (for saturation). I have modified the CAMERA variable with the path (D:/Sony). For RAW_EXT, am I supposed to use the CAMERA.dng file (provided on GitHub, "sony.dng"), or do I have to replace that with "CAMERA.arw"?

In the latter case, I'm getting an "input/output error".

about data collection

Hello author, I am very interested in your article and methods and would like to ask some questions about data collection.
I would like to know whether there are any restrictions or requirements on the brightness of the different light sources, the difference in color temperature, and the position of the color charts when collecting data. I previously used a camera to capture RGB data without AWB to generate GT, but this time, with the same code and equipment, I am unable to produce the correct GT. What should I consider here? Also, I would like to confirm: pic12 - pic1 = pic2, where pic12 and pic1 are both RGB images without AWB, right?
Looking forward to your response. Thanks!

No access to dataset

Hi

I am interested in using your dataset.
Unfortunately, the link provided on the GitHub page doesn't work anymore.
The Google Drive link provided on your official website doesn't work either.
Is it possible to fix this?

Aleksander Kempski

The problem of Sony-sensor's effective bit-depth

Based on your analysis, this Sony sensor seems a bit odd. I noticed it is a 14-bit-depth sensor, but if that's the case, its black level should be closer to 512. However, according to your analysis, 128 is what allows for correct image output, which suggests that you might have used a 12-bit mode for saving. In other words, the effective bit depth of this Sony sensor's RAW data is actually 12 bits, right?

Originally posted by @shuwei666 in #12 (comment)

bug in preprocess_data.py

In the preprocess data file, all the obtained output images are black; they are not the white-balanced images that we need for training the model.

about train size

hi,
Dongyoung, I would like to ask you a question.

I found that you resize all images to 256×256 for use in the U-Net. I want to know why you resize to 256. If I want to use 4K (3840×2160) images to train the U-Net, will this have any impact? And if I train on 256 but test on 4K images, will it affect accuracy?
In short, what training size should I use if I want to apply the model to 4K images? I saw that you have a crop operation, and I am puzzled why you still need to resize the original image after the crop. I would like to hear your suggestions and look forward to your response.

Which script is used to get the correct GT?

To obtain sRGB images of the ground truth, I processed the images in two ways with your scripts. In detail:
1st: run '1_make_mixture_map.py' with 'VISUALIZE=True'. This gave a GT sRGB image, e.g. 'Placexxx_wb12.png'.
2nd: run '2_preprocess_data.py' at full resolution (i.e. SQUARE_CROP = False, SIZE = None, TEST_SIZE = None), which gave a GT tiff image, e.g. 'Placexxx_12_gt.tiff'. Then I processed it with 'visualize.py' and got a GT sRGB image, e.g. 'Placexxx_12_gt.png'.
HOWEVER, when I compared these two sRGB images, they are different.
What causes this, and what should I do to fix it?
Looking forward to your reply.

about pick data

Hi,
your work is great! I would like to know how you label the color checker's location. Do you use a model like YOLO, or segmentation, or do you annotate every color chart by hand? I look forward to your response, thanks.

SAMSUNG S23 ultra data

I have a few questions for which i have not found any documentation.

  1. The main sensor images (108 MPix, I assume) are only provided at 12 MPix. Is this the highest resolution available as raw from Samsung? Is there a description of how the DNG raw is obtained from the original data?
  2. The raw data appears to be clipped at 0... with 0 offset. This makes it difficult to avoid color bias in dark regions (as the negative values are, I think, pre-clipped and unknown).
  3. Scene illumination: Do you have any information about the room and the extra illuminants (2 and 3) in the scenes? What types of illumination (FL, incandescent, LED, other)? I understand that one of the extra lights is used in many scenes. Any spec on this? CCT, spectrum?
  4. I have implemented an explicit pipeline using the linear color matrices together with the Bradford transform. In one scene, Place71, I noticed that the difference image (img_12 - img_1) shows gray ladder steps. I am a bit surprised by this, as the ladder looks partially green in the img_1 (sun alone) version. All tests were done using the second white (gray) patch in chart 0 for WB. I do not understand this large tinge...
    (Attached images: Place71_diff_12-1, Place71_1_CM1)

HDRNet

Hello, can you release the code for training HDRNet on this dataset?

Error in the meta.json of sony

Hello, in the process of generating the Sony dataset GT with the code you provided, I found that there were NaN values in the meta.json file, and the results generated with scripts 0, 1, and 2 could not reproduce your paper correctly. How can we solve this problem?
Looking forward to your reply.

json files for nikon and sony subsets

Could you please share the JSON files for the nikon and sony subsets separately from the RAW data zips? Especially the metadata file, namely "meta.json".

We could extract the RAW data without downloading the z01 and z02 files by using some unzipping tricks; however, this does not produce the JSON files for the metadata. This leads to some issues in preparing mixture maps.

We do not have the storage space to download all of them at this time. So, if you have any chance to share the JSON files for the nikon and sony subsets, we would be very happy!

[BUG] Wrong arxiv link

Hi, Dongyoung.
I'm very impressed by your amazing work, LSMI.
It seems your link to arxiv is wrong.


I hope it will be fixed soon.
Take care :)

[DATA] mixture map generation

Hi DongYoung,

I ran into a problem when using the mixture map generated from the Nikon D810 dataset. Could you check the scene below? It has a rather strange illumination contour. The image is Place550_12 in the Nikon dataset. I get a different result, from both rawpy and my own prediction, in the area with the bluish color cast.

(Attached image: Place550_12)

Thanks
