layout-generation / layout-generation

Layout Generation and Baseline implementations

License: MIT License

Python 41.58% Jupyter Notebook 58.42%

layoutvae layoutgan layouttransformer tensorflow2 pytorch

layout-generation's Introduction

Layout Generation and Baseline Implementation

Layout VAE
Layout Transformer
- Layout Transformer Model Architecture
- Results
LayoutGAN
Quantitative Comparison

Layout VAE

LayoutVAE is a model based on the variational autoencoder. It is a probabilistic, autoregressive model that generates a scene layout using low-dimensional latent variables. It can generate different layouts from the same data point.

CountVAE: This is the first part of the LayoutVAE model; it takes the label set as input and predicts the number of bounding boxes for each label. The input is provided as a multilabel encoding.
BBox VAE: This is the second part of the model, a BBox VAE with LSTM-based embedding generation. As in CountVAE, previous predictions, together with the label set and label counts, are used as conditioning information for the current prediction.
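
The multilabel input described for CountVAE can be illustrated with a small sketch. The label vocabulary below (the five PubLayNet categories) and the helper names `multilabel_encode` and `count_vector` are assumptions made for illustration; they are not taken from the repository.

```python
from typing import Dict, Iterable, List

# Assumed label vocabulary (the five PubLayNet categories); the repository's
# actual label order is not documented here.
LABELS: List[str] = ["text", "title", "list", "table", "figure"]


def multilabel_encode(label_set: Iterable[str], vocab: List[str] = LABELS) -> List[int]:
    """Return a 0/1 vector with a 1 for every label present in the layout."""
    present = set(label_set)
    unknown = present - set(vocab)
    if unknown:
        raise ValueError(f"labels not in vocabulary: {sorted(unknown)}")
    return [1 if name in present else 0 for name in vocab]


def count_vector(counts: Dict[str, int], vocab: List[str] = LABELS) -> List[int]:
    """Return per-label box counts, the quantity CountVAE is described as predicting."""
    return [counts.get(name, 0) for name in vocab]


label_encoding = multilabel_encode({"title", "text", "figure"})
target_counts = count_vector({"title": 1, "text": 4, "figure": 2})
print(label_encoding)  # [1, 1, 0, 0, 1]
print(target_counts)   # [4, 1, 0, 0, 2]
```

Here `label_encoding` plays the role of the conditioning input and `target_counts` the role of the training target for the count prediction step.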

Layout VAE Model

Flow Diagram

Results Obtained

Layout Transformer

Layout Transformer is a model proposed for generating structured layouts that can be used for documents, websites, apps, etc. It uses the decoder block of the Transformer model, which can capture the relationship between each document box and the previously predicted boxes (or inputs). Since it is an autoregressive model, it can be used to generate entirely new layouts or to complete existing partial layouts. The paper also emphasizes that this model performed better than the models existing at that time, in the following respects:

Able to generate layouts of arbitrary lengths
Gives better alignment due to the discretized grid
Is able to effectively capture the relationships between boxes in a single layout, which gives meaningful layouts
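
The discretized grid mentioned above can be sketched as follows. This is a minimal illustration, assuming each box is given as (category, x, y, width, height) in page units and flattened into five integer tokens; the grid size of 32 and the function names are hypothetical and are not taken from the repository or the paper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, float, float, float, float]  # (category, x, y, width, height)


def quantize(value: float, extent: float, grid_size: int) -> int:
    """Map a coordinate in [0, extent] to an integer bin in [0, grid_size - 1]."""
    if extent <= 0 or grid_size <= 0:
        raise ValueError("extent and grid_size must be positive")
    ratio = min(max(value / extent, 0.0), 1.0)  # clamp to the page
    return min(int(ratio * grid_size), grid_size - 1)


def layout_to_tokens(boxes: Sequence[Box], page_w: float, page_h: float,
                     grid_size: int = 32) -> List[int]:
    """Flatten a layout into one token sequence, five tokens per box."""
    tokens: List[int] = []
    for category, x, y, w, h in boxes:
        tokens.extend([
            category,
            quantize(x, page_w, grid_size),
            quantize(y, page_h, grid_size),
            quantize(w, page_w, grid_size),
            quantize(h, page_h, grid_size),
        ])
    return tokens


tokens = layout_to_tokens([(1, 50.0, 100.0, 200.0, 40.0)], page_w=400.0, page_h=800.0)
print(tokens)  # [1, 4, 4, 16, 1]
```

Because every coordinate snaps to one of a fixed number of bins, boxes that are nearly aligned in the continuous layout end up with identical tokens, which is one plausible reading of why the grid helps alignment.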

Layout Transformer Model Architecture

Results

LayoutGAN

LayoutGAN uses a GAN, with the generator taking randomly sampled inputs (class probabilities and geometric parameters), arranging them, and producing refined geometric and class parameters.
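
The randomly sampled generator input described above can be sketched in plain Python. Everything in this block is an assumption for illustration: the uniform sampling, the four-value (x, y, width, height) geometry, and the name `sample_generator_input` are not taken from the repository.

```python
import random
from typing import List, Optional, Tuple

Element = Tuple[List[float], List[float]]  # (class probabilities, geometry)


def sample_generator_input(num_elements: int, num_classes: int,
                           seed: Optional[int] = None) -> List[Element]:
    """Sample random class probabilities and geometry for each layout element."""
    rng = random.Random(seed)
    elements: List[Element] = []
    for _ in range(num_elements):
        raw = [rng.random() for _ in range(num_classes)]
        total = sum(raw)
        if total == 0.0:
            # Degenerate draw: fall back to a uniform distribution.
            class_probs = [1.0 / num_classes] * num_classes
        else:
            class_probs = [v / total for v in raw]
        # Assumed parameterization: x, y, width, height, each in [0, 1).
        geometry = [rng.random() for _ in range(4)]
        elements.append((class_probs, geometry))
    return elements


noise = sample_generator_input(num_elements=3, num_classes=5, seed=0)
print(len(noise), len(noise[0][0]), len(noise[0][1]))  # 3 5 4
```

A generator network would take a set like `noise` as input and output refined class probabilities and geometry of the same shape.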

Architecture

Results on MNIST

Results on single column layouts

Quantitative Comparison

A total of three metrics were used to compare the models.

Overlapping Loss
Intersection over Union (IoU)
Alignment Loss

After calculating the losses for each model, the following comparison table was obtained:

Model	Overlap	IoU	Alignment
Original Data	1.000000	1.000000	1.000000
LayoutGAN	1172.005234	2745.437529	1.164882
LayoutVAE	119.320127	185.864381	3.493406
Layout Transformer	1.090315	1.422297	0.739862
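
As a rough illustration of the IoU metric listed above, the sketch below computes IoU for a pair of boxes and a sum over all pairs in one layout. The (x1, y1, x2, y2) box format and the pairwise summation are assumptions; how the repository aggregates or normalizes its scores is not stated here.

```python
from itertools import combinations
from typing import Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 <= x2 and y1 <= y2


def box_iou(a: Rect, b: Rect) -> float:
    """Intersection over Union of two axis-aligned boxes, always in [0, 1]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def total_pairwise_iou(boxes: Sequence[Rect]) -> float:
    """Sum of IoU over every unordered pair of boxes in one layout."""
    return sum(box_iou(a, b) for a, b in combinations(boxes, 2))


print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))   # 1/7, about 0.142857
print(total_pairwise_iou([(0, 0, 1, 1)] * 3))  # 3.0
```

The IoU of a single pair never exceeds 1, but a sum over pairs or over layouts can, so a score above 1 suggests that some aggregation or normalization has been applied.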

layout-generation's Issues

rico_data = np.load(root+'Data/rico_new.npy')

Excuse me. Thanks for sharing your code, it is very nice!
For Layout Transformer, where can I find the rico_new.npy dataset? Thank you very much!

For LayoutVAE, the DATA_PATH link https://developer.ibm.com/exchanges/data/all/publaynet/ shows that the PubLayNet dataset is 96 GB. Is there any other dataset that can be used for training? Thank you very much!

For Metrics,
Transformer_res=np.load(root+"trans.npy")
VAE_res = np.load(root+"VAE_res.npy")
GAN_res = np.load(root+"GAN_res.npy")
How can I get those .npy files?

Sorry to bother you. Thanks for sharing your code!! Thanks a million.

Transformer_res=np.load(root+"trans.npy") VAE_res = np.load(root+"VAE_res.npy") GAN_res = np.load(root+"GAN_res.npy")

Excuse me, where are the three npy files? Thanks

How to show the data Publaynet.npy, trans.npy, GAN_res.npy

I mean I would like to visualize the data.
Thanks

What is the format for the numpy file when loading data in LayoutVAE.py

Hi,

I am trying to run the main.py file inside LayoutVAE/Source folder. There is a load_data method inside layoutvae.py module. However, there is no information on the format of this file.

Can you please share some information about the format this .npy file needs to be in?

Thanks
Sara

Question about IoU Metrics error

Hi, sorry to bother you. The max value of the IoU metric is 1 and the min is 0. However, the results show values as high as 2700, and all of them are greater than 1.

How to create .npy files from the RICO JSON data ??

Hey, can you tell me how you converted the RICO semantic annotations.json to the RICO.npy file, and can you share the link for it too?

Sorry for troubling you.
Thank you in advance

Reproduce Results for Layout Transformer

Hi! I was following your code to reproduce the results for Layout Transformer. To get the results you show on the website, did you use the same setup as shown in the notebook, i.e. batch_size = 1 and training with only 10k examples from the PubLayNet dataset?

can we get the rico data

Hi,

thanks for your code. I wonder if you could share the preprocessed rico data with us? Thanks!

Evaluation metrics

Hi,
Good Repo!

I have a few questions regarding the metrics:

Can you please provide the .npy files which you used in metrics.ipynb?
Do the metrics require only predictions, or both ground truth and predictions?
How many predicted images were used for each model when calculating the metrics?

Layout Transformer Quantitative Evaluation

The Overlap, IoU, and Alignment losses of the Layout Transformer are all close to 1.
Also, there is a large difference between these values and those of the other models.

As per the layout_completion function in Layout_Transformer.ipynb, the training set (the first two elements of each file) was used for layout generation. There is a high chance that the model is memorising the training dataset.

Can you please compute the metrics on the validation set to check whether the values are the same for the Layout Transformer?

Thank you!

RICO Training

Hey,
Thank you for releasing this code!

Any pointers on which rico dataset to use and how to preprocess it to feed into the transformer?
Are you starting with "RICO UI SCREENSHOTS AND HIERARCHIES WITH SEMANTIC ANNOTATIONS"?

Thanks!