layout-generation's Introduction

Layout Generation and Baseline Implementation

Contents

Layout VAE

LayoutVAE is a model based on the variational autoencoder (VAE). It is a probabilistic, autoregressive model that generates a scene layout from lower-dimensional latent variables. It can generate different layouts from the same data point.

  • CountVAE: This is the first part of the LayoutVAE model. It takes the label set as input and predicts the number of bounding boxes for each label. The input is provided as a multi-label encoding.
  • BBoxVAE: This is the second part of the model, which uses LSTM-based embedding generation. As in CountVAE, the previous predictions, together with the label set and label counts, are used as conditioning information for the current prediction.
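The multi-label encoding mentioned for CountVAE can be illustrated with a minimal sketch. The label vocabulary `LABELS` and the helper `multilabel_encode` are hypothetical names chosen for this example; the repository's actual vocabulary and encoding code may differ.

```python
import numpy as np

# Hypothetical label vocabulary; the real one depends on the dataset used.
LABELS = ["text", "title", "list", "table", "figure"]


def multilabel_encode(label_set, labels=LABELS):
    """Return a 0/1 vector with a 1 for every label present in label_set."""
    index = {name: i for i, name in enumerate(labels)}
    vec = np.zeros(len(labels), dtype=np.float32)
    for name in label_set:
        vec[index[name]] = 1.0
    return vec


# A layout that contains text and figure elements, in any quantity.
encoded = multilabel_encode({"text", "figure"})
print(encoded)
```

The vector only records which labels are present. Predicting how many boxes each present label gets is the job described for CountVAE.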

Layout VAE Model

modelvae

Flow Diagram

Architecture

Results Obtained

VAE_result

Layout Transformer

Layout Transformer is a model proposed for generating structured layouts, which can be used for documents, websites, apps, and so on. It uses the decoder block of the Transformer model, which captures the relation of each document box to the previously predicted boxes (or inputs). Since it is an autoregressive model, it can be used to generate entirely new layouts or to complete existing partial layouts. The paper also emphasizes that this model performed better than the models existing at the time, in the following respects:

  • Able to generate layouts of arbitrary lengths
  • Gives better alignment due to the discretized grid
  • Is able to effectively capture the relationships between boxes in a single layout, which gives meaningful layouts
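The effect of a discretized grid on alignment can be sketched as follows. This is an illustrative example only: the grid size of 32, the `(x, y, w, h)` box format, and the function names are assumptions, not the settings used in the repository or the paper.

```python
import numpy as np

GRID_SIZE = 32  # assumed number of bins per axis


def quantize_box(box, grid_size=GRID_SIZE):
    """Map a box (x, y, w, h) with values in [0, 1] to integer grid bins."""
    box = np.clip(np.asarray(box, dtype=np.float64), 0.0, 1.0)
    # A value of exactly 1.0 would land in bin grid_size, so cap it.
    return np.minimum((box * grid_size).astype(np.int64), grid_size - 1)


def dequantize_box(bins, grid_size=GRID_SIZE):
    """Map bin indices back to the centre of each bin in [0, 1]."""
    return (np.asarray(bins, dtype=np.float64) + 0.5) / grid_size


# Two boxes whose left edges differ slightly (0.10 vs 0.11).
a = quantize_box([0.10, 0.20, 0.50, 0.25])
b = quantize_box([0.11, 0.60, 0.30, 0.10])
print(a, b)
```

After quantization both boxes share the same x bin, so small coordinate differences no longer show up as misalignment. The bin indices can also serve as discrete tokens for an autoregressive decoder.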

Layout Transformer Model Architecture

Trans_model

Results

Trans_result

LayoutGAN

LayoutGAN uses a generative adversarial network (GAN). The generator takes randomly sampled inputs (class probabilities and geometric parameters), arranges them, and produces refined class and geometric parameters.
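A rough sketch of what randomly sampled generator inputs might look like is shown below. The sampling distributions, the choice of four geometric parameters per element, and the function name `sample_generator_input` are assumptions made for illustration; they are not taken from the repository.

```python
import numpy as np


def sample_generator_input(num_elements, num_classes, seed=None):
    """Sample random class probabilities and geometric parameters.

    Assumptions: class probabilities come from a softmax over Gaussian
    logits, and each element has 4 geometric parameters drawn uniformly
    from [0, 1).
    """
    rng = np.random.default_rng(seed)
    logits = rng.normal(size=(num_elements, num_classes))
    # Numerically stable softmax so each row sums to 1.
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    class_probs = exp / exp.sum(axis=1, keepdims=True)
    geometry = rng.uniform(0.0, 1.0, size=(num_elements, 4))
    return class_probs, geometry


class_probs, geometry = sample_generator_input(num_elements=8, num_classes=5, seed=0)
print(class_probs.shape, geometry.shape)
```

A generator would then refine both arrays jointly, so that the final class and geometry of each element depend on the other elements in the layout.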

Architecture

Results on MNIST

Results on single column layouts

Quantitative Comparison

A total of three metrics were used to compare the models.

  • Overlapping Loss
  • Intersection over Union (IoU)
  • Alignment Loss
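For reference, the standard IoU between two axis-aligned boxes can be computed as in the sketch below. The `(x_min, y_min, x_max, y_max)` box format and the function name `iou` are assumptions for this example. How the repository aggregates or scales per-pair values into a single score per model is not stated here.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap region, clamped at zero.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap area 1, union area 7
```

For a single pair of boxes this value always lies between 0 (no overlap) and 1 (identical boxes).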

After calculating the losses for each model, the following comparison table was obtained:

Model               Overlap       IoU           Alignment
Original Data       1.000000      1.000000      1.000000
LayoutGAN           1172.005234   2745.437529   1.164882
LayoutVAE           119.320127    185.864381    3.493406
Layout Transformer  1.090315      1.422297      0.739862

layout-generation's People

Contributors

23yashm, nicky7767, swapnil2001-master, tanishk1gupta, tushar-jain01, yashjain7856

layout-generation's Issues

Question about IoU Metrics error

Hi, sorry to bother you. The maximum value of the IoU metric is 1 and the minimum is 0, but the results show values such as 2700 rather than values of at most 1.

rico_data = np.load(root+'Data/rico_new.npy')

Excuse me. Thanks for sharing your code, very nice!
For Layout Transformer, I would like to ask where to find the rico_new.npy dataset. Thank you very much!

For LayoutVAE, the DATA_PATH link https://developer.ibm.com/exchanges/data/all/publaynet/ shows that the PubLayNet dataset is 96 GB. Is there any other dataset for training? Thank you very much!

For Metrics,
Transformer_res=np.load(root+"trans.npy")
VAE_res = np.load(root+"VAE_res.npy")
GAN_res = np.load(root+"GAN_res.npy")
How can I get these npy files?

Sorry to bother you. Thanks for sharing your code! Thanks a million.

What is the format for the numpy file when loading data in LayoutVAE.py

Hi,

I am trying to run the main.py file inside the LayoutVAE/Source folder. There is a load_data method inside the layoutvae.py module. However, there is no information on the format of the file it loads.

Can you please share some information about the format this .npy file needs to be in?

Thanks
Sara

Reproduce Results for Layout Transformer

Hi! I was following your code to reproduce the results for Layout Transformer. To get the results you show on the website, did you use the same setup as shown in the notebook, i.e. batch_size = 1 and training with only 10k examples from the PubLayNet dataset?

Layout Transformer Quantitative Evaluation

The Overlap, IoU, and Alignment losses of the Layout Transformer are all close to 1.
Also, there is a large difference between these metric values and those of the other models.

According to the layout_completion function in Layout_Transformer.ipynb, the training set (the first two elements of each file) was used for layout generation. There is a high chance that the model is memorising the training dataset.

Can you please compute the metrics on the validation set to check whether the values are the same for Layout Transformer?

Thank you!

Evaluation metrics

Hi,
Good Repo!

I have a few questions regarding the metrics:

  1. Can you please provide the npy files that you used in metrics.ipynb?

  2. Do the metrics require only predictions, or both ground truth and predictions?

  3. How many predicted images were used for each model when calculating the metrics?

RICO Training

Hey,
Thank you for releasing this code!

Any pointers on which rico dataset to use and how to preprocess it to feed into the transformer?
Are you starting with "RICO UI SCREENSHOTS AND HIERARCHIES WITH SEMANTIC ANNOTATIONS"?

Thanks!

can we get the rico data

Hi,

thanks for your code. I wonder if you could share the preprocessed rico data with us? Thanks!
