Comments (7)

speculaas commented on August 25, 2024

"how you created the database?"
Sorry, I may not have been clear in my previous comment.

What I want to ask about is your design idea.
A general picture of the training script:
how images are arranged in trn.tfrecords,
and how score(full image) - score(crop) is computed as a result of the arrangement you designed.

In particular,
when the images are written to trn.tfrecords, the full image and its crop seem to be right next to each other,
but when computing score(full) minus score(crop), these two lines:
q = score(feature_vec)
p = tf.matmul(loss_matrix,q)

behave as if the full image and its crop are batch_size apart from each other, as the following script seems to say:
loss_matrix[k,k] = 1
loss_matrix[k,k+batch_size] = -1

Best regards,
JimmyYS

from view-finding-network.

kloppjp commented on August 25, 2024

Hey,

Your observation is correct: each entry in the DB has six channels, three for the crop and three for the original. However, these are split and queued separately for batching. At training time, we take one batch of crops and its corresponding batch of originals and concatenate them into a single array. Please see also https://github.com/yiling-chen/view-finding-network/blob/master/vfn_train.py#L82
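
The concatenation described above is what makes the two `loss_matrix` lines from the question line up. A minimal NumPy sketch, assuming the batch holds `batch_size` crops followed by `batch_size` originals; the function name `build_loss_matrix` and the score values are illustrative, not taken from the repository:

```python
import numpy as np

def build_loss_matrix(batch_size):
    """Return L such that (L @ q)[k] == q[k] - q[k + batch_size]."""
    loss_matrix = np.zeros((batch_size, 2 * batch_size), dtype=np.float32)
    for k in range(batch_size):
        loss_matrix[k, k] = 1                # picks entry k of the first half
        loss_matrix[k, k + batch_size] = -1  # subtracts entry k of the second half
    return loss_matrix

batch_size = 3
# Made-up scores: rows 0..2 stand for crops, rows 3..5 for the matching originals.
q = np.array([[0.2], [0.5], [0.1], [0.9], [0.4], [0.7]], dtype=np.float32)
p = build_loss_matrix(batch_size) @ q  # NumPy analogue of tf.matmul(loss_matrix, q)
print(p.ravel())
```

Under that assumed ordering, row k of `p` is the score of crop k minus the score of its original, so each pair is compared even though its two members sit `batch_size` rows apart. If the two halves were concatenated in the other order, the sign would flip.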

For the sliding window part, @yiling-chen will help you ;)

Jan

speculaas commented on August 25, 2024

Dear Yiling,
After tracing vfn_train.py and create_dbs.py
again, I can see how the full image and the crop are arranged:

  1. when writing to trn.tfrecords: full and crop are indeed next to each other:
    img_comb = (np.append(img_crop, img_full ...
  2. and when doing "def read_and_decode()" , the image_raw is split along axis=2:
    return tf.split(image, 2, 2)

and then arranged such that "training_images" contains an array of cropped images followed by an array of full images:
crop, full = read_and_decode(
return tf.concat([crops, fulls], 0)
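
The arrangement traced above can be reproduced with plain NumPy. This is only a sketch: the image sizes and random pixel values are made up, `axis=2` for `np.append` is an assumption (the original call is truncated in this thread), and `np.split` / `np.concatenate` stand in for `tf.split` / `tf.concat`:

```python
import numpy as np

height, width, batch_size = 4, 4, 2
rng = np.random.default_rng(0)

# Writing: one record holds a crop and its full image side by side in the channel axis.
records = []
for _ in range(batch_size):
    img_crop = rng.integers(0, 256, size=(height, width, 3)).astype(np.uint8)
    img_full = rng.integers(0, 256, size=(height, width, 3)).astype(np.uint8)
    img_comb = np.append(img_crop, img_full, axis=2)  # assumed axis; shape (H, W, 6)
    records.append(img_comb)

# Reading: split each record back into two 3-channel images along axis 2.
crops, fulls = [], []
for img_comb in records:
    crop, full = np.split(img_comb, 2, axis=2)
    crops.append(crop)
    fulls.append(full)

# Batching: all crops first, then all full images, along the batch axis.
training_images = np.concatenate([np.stack(crops), np.stack(fulls)], axis=0)
print(training_images.shape)  # (4, 4, 4, 3)
```

Here `training_images[k]` is crop k and `training_images[k + batch_size]` is its full image, which is the `batch_size` offset asked about in the first comment.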

Embarrassingly, question no. 1 turned out to be trivial, and I can now see clearly how the images are arranged in trn.tfrecords.

What remains to be clarified is question no. 2.

Best regards,
JimmyYS

speculaas commented on August 25, 2024

Dear Jan,
Thanks for your help.
I didn't see your response while I was clarifying my own embarrassingly trivial question no. 1,
but I see it now.
Thanks again for helping me so quickly.

yiling-chen commented on August 25, 2024

Hi @speculaas,

sliding_window.json, as its name suggests, simply contains sliding windows. :)
It was originally used in another work of mine:
https://github.com/yiling-chen/flickr-cropping-dataset
Since our goal was to provide a fair benchmark between all baseline image croppers, we used a fixed set of candidate windows to let every baseline pick the best crop and compare the accuracy (with ground truth). Note that to enhance the performance of an image cropper, you are welcome to apply more advanced methods to generate good proposal windows before feeding them into the image croppers.

You can find a sample implementation that generates the sliding windows on the fly and evaluates an image cropper with a saliency map here:
https://github.com/yiling-chen/flickr-cropping-dataset/blob/master/baselines/saliency_crop.py
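
For readers who just want the general idea of a fixed candidate set, here is a rough sketch of generating sliding windows and letting a scorer pick one. It is not the procedure behind sliding_window.json or saliency_crop.py; the function names, scales, and step size are all hypothetical choices for illustration:

```python
def sliding_windows(img_w, img_h, scales=(0.5, 0.75), step_frac=0.25):
    """Enumerate (x, y, w, h) windows at a few scales (hypothetical parameters)."""
    windows = []
    for s in scales:
        w, h = int(img_w * s), int(img_h * s)
        # Step is a fraction of the window size, at least one pixel.
        step_x = max(1, int(w * step_frac))
        step_y = max(1, int(h * step_frac))
        for y in range(0, img_h - h + 1, step_y):
            for x in range(0, img_w - w + 1, step_x):
                windows.append((x, y, w, h))
    return windows

def pick_best(windows, score_fn):
    """Return the window with the highest score under any scoring function."""
    return max(windows, key=score_fn)

candidates = sliding_windows(640, 480)
# Toy scorer standing in for a real cropper: prefer windows near the image center.
best = pick_best(
    candidates,
    lambda win: -abs(win[0] + win[2] / 2 - 320) - abs(win[1] + win[3] / 2 - 240),
)
print(len(candidates), best)
```

Because every cropper is handed the same `candidates` list, differences in accuracy come from the scoring function alone, which is the benchmarking argument made above.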

speculaas commented on August 25, 2024

Dear Yiling,
Thanks for your pointer!
Your response is exactly what I was looking for.

I think what I had in mind is a model to suggest a good composition for a camera user.

BR,
JimmyYS

speculaas commented on August 25, 2024

Sorry, my response seemed incomplete:

"I think what I had in mind is a model to suggest a good composition for a camera user."
As a result, when I found your paper, I was looking not only for a ranker, but also for something like a crop generator. Then I saw that the crops are pre-generated.
Thanks to your clarification, I now see:

  1. how the crops can be generated, and
  2. that these pre-generated crops also serve as a fair benchmark.
