tau-yihouxiang / ws_dan
The official TensorFlow implementation of WS-DAN.
In bilinear_attention_pooling, the pooled features are multiplied by 100. Why is that?
Thanks so much for your work!
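For context, here is a minimal NumPy sketch of bilinear attention pooling as I understand it from the code (names and shapes are illustrative, not the repo's). The pooled part features are L2-normalized, so their norms collapse to ~1; multiplying by a large constant (100 in the repo) restores a usable logit scale for the downstream softmax:

```python
import numpy as np

def bilinear_attention_pooling(features, attentions, scale=100.0):
    """Pool feature maps with attention maps, then L2-normalize and scale.

    features:   (H, W, C) backbone feature maps
    attentions: (H, W, M) attention maps
    returns:    (M, C)    scaled part features
    """
    H, W, C = features.shape
    M = attentions.shape[-1]
    feats = features.reshape(H * W, C)    # (HW, C)
    atts = attentions.reshape(H * W, M)   # (HW, M)
    pooled = atts.T @ feats / (H * W)     # (M, C) spatially averaged
    # After L2 normalization every part feature has norm ~1; the x100
    # scale (as in the repo) keeps the softmax logits from being tiny.
    pooled = pooled / (np.linalg.norm(pooled, axis=1, keepdims=True) + 1e-12)
    return pooled * scale

feats = np.random.rand(14, 14, 768).astype(np.float32)
atts = np.random.rand(14, 14, 32).astype(np.float32)
parts = bilinear_attention_pooling(feats, atts)
print(parts.shape)  # (32, 768)
```

This is only a sketch under the assumption that the scale is there to counteract the unit-norm features; the authors may have additional reasons.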
In your work, is the final output the average of the two output probabilities?
I have some confusion about this. Thanks!
In the paper, the cropping and dropping thresholds are 0.5, but in the implementation code they are sampled with random.uniform().
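To illustrate the discrepancy the issue describes, here is a small sketch. The paper fixes both thresholds at 0.5, while the released code samples them per image; the U(0.4, 0.6) range below is purely illustrative, so check the repo for the ranges actually used:

```python
import random
import numpy as np

def crop_drop_thresholds(paper_style=False):
    """Return (crop_theta, drop_theta).

    paper_style=True reproduces the fixed 0.5 thresholds from the paper;
    otherwise thresholds are sampled per image. The (0.4, 0.6) range is
    an assumption for illustration, not taken from the repo.
    """
    if paper_style:
        return 0.5, 0.5
    return random.uniform(0.4, 0.6), random.uniform(0.4, 0.6)

def crop_mask(attention_map, theta):
    # Keep spatial locations whose attention exceeds theta * max response.
    return attention_map >= theta * attention_map.max()

att = np.random.rand(14, 14)
theta_c, theta_d = crop_drop_thresholds(paper_style=True)
mask = crop_mask(att, theta_c)
print(mask.shape)  # (14, 14) boolean mask
```

Randomizing the thresholds acts as extra augmentation, which may be why the code diverges from the paper; only the authors can confirm.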
Hello, I found a small performance issue with tf.random_uniform in WS_DAN/nets/nasnet/nasnet_test.py. If the function is called inside a loop, the op is rebuilt on every iteration and execution efficiency suffers. I think the tf.random_uniform op should be created before the loop. There are also several similar places in the code.
In train_sample.py, np.random.choice(np.arange(0, num_parts), 1, p=part_weights) raises "probabilities contain NaN".
With batch_size = 1, 12, 16, or 32 there is no error. Why is that? Does the algorithm place special constraints on the batch size?
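One plausible cause (an assumption, not confirmed by the authors): if the raw attention weights sum to ~0 for a degenerate attention map, normalizing them divides by zero and produces NaNs, which np.random.choice rejects. A defensive normalization sketch, with illustrative names:

```python
import numpy as np

def safe_part_weights(raw_weights, eps=1e-12):
    """Normalize attention-map weights for np.random.choice.

    If the raw weights sum to ~0 (e.g. a degenerate attention map),
    plain division yields NaNs and np.random.choice raises
    "probabilities contain NaN". Fall back to a uniform distribution.
    """
    w = np.asarray(raw_weights, dtype=np.float64)
    w = np.clip(w, 0.0, None)            # probabilities must be >= 0
    total = w.sum()
    if not np.isfinite(total) or total < eps:
        return np.full(len(w), 1.0 / len(w))
    return w / total

num_parts = 32
weights = safe_part_weights(np.zeros(num_parts))   # degenerate case
idx = np.random.choice(np.arange(0, num_parts), 1, p=weights)
print(weights.sum(), idx.shape)
```

Whether certain batch sizes make degenerate maps more likely is a separate question for the authors.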
I ran the TF code and got 89+% accuracy. I think my implementation is almost the same as your TF version, so are there any details that you didn't mention in the paper?
Hi there,
Thanks for the contribution! After reading the code, I am somewhat confused about the attention regularization part. Please correct me if I have misunderstood something.
From the code, my understanding of the center-loss part is that for every class (label) you keep a center for the features, and those same features, multiplied by a scale of 100, are also used for softmax classification. However, what the paper claims is that the center loss is used for attention regularization, which assigns each attention feature in the feature matrix its own center. The equation used in the paper for the center loss is the sum of distances between those attention features and their centers (with a distinguishing M in the equation).
Is there any explanation of doing this?
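My reading of the paper's attention regularization, sketched in NumPy (this is an interpretation, not the repo's TF implementation; the update rate alpha and all names are illustrative): each class keeps a center of shape (M, C), one row per attention map, and the loss pulls each sample's part-feature matrix toward its class center.

```python
import numpy as np

def attention_center_loss(part_features, centers, labels, alpha=0.05):
    """Center loss over BAP part features, per my reading of the paper.

    part_features: (N, M, C) batch of part-feature matrices
    centers:       (num_classes, M, C) per-class feature centers
    labels:        (N,) integer class ids
    Returns (loss, updated_centers); centers are updated with a
    moving average (alpha is an illustrative update rate).
    """
    batch_centers = centers[labels]                 # (N, M, C)
    diff = part_features - batch_centers
    loss = np.mean(np.sum(diff ** 2, axis=(1, 2)))  # squared distance
    # Moving-average center update (done with TF ops in the repo).
    new_centers = centers.copy()
    for i, y in enumerate(labels):
        new_centers[y] = new_centers[y] + alpha * (part_features[i] - new_centers[y])
    return loss, new_centers

N, M, C, K = 4, 32, 768, 10
feats = np.random.rand(N, M, C)
centers = np.zeros((K, M, C))
labels = np.array([0, 1, 1, 3])
loss, centers = attention_center_loss(feats, centers, labels)
print(loss >= 0.0)  # True
```

Under this reading, per-class centers and per-attention-map centers coincide: the class center simply has one row per attention map, so the code and the paper may be describing the same computation from different angles.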
Hello, author.
Which TensorFlow version did you use: 1.12, or another release after 1.9?
Thank you!
Hello, author.
A question: when your data is converted to TFRecord, is it normalized when it is read back?
I did not find the relevant code.
The preprocess_for_train function's parameters:
Args:
  image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
    [0, 1]; otherwise it will be converted to tf.float32 assuming that the
    range is [0, MAX], where MAX is the largest positive representable number
    for the int(8/16/32) data type (see tf.image.convert_image_dtype for
    details).
  height: integer
  width: integer
  bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords],
    where each coordinate is in [0, 1) and the coordinates are arranged as
    [ymin, xmin, ymax, xmax].
  fast_mode: Optional boolean; if True, avoids slower transformations (i.e.
    bi-cubic resizing, random_hue or random_contrast).
  scope: Optional scope for name_scope.
  add_image_summaries: Enable image summaries.
Returns:
  3-D float Tensor of distorted image used for training, with range [-1, 1].
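Since the docstring says the output range is [-1, 1], the normalization in question is presumably the standard Inception-style mapping. A NumPy sketch of that mapping (illustrative only, not the repo's TF code):

```python
import numpy as np

def to_inception_range(image_uint8):
    """Map a uint8 image to the [-1, 1] range the docstring describes.

    Mirrors the usual Inception preprocessing: uint8 [0, 255] is first
    scaled to float [0, 1] (what tf.image.convert_image_dtype does),
    then shifted and scaled to [-1, 1]. This is an assumption about
    where the normalization happens, based on the docstring.
    """
    img = image_uint8.astype(np.float32) / 255.0   # [0, 1]
    img = (img - 0.5) * 2.0                        # [-1, 1]
    return img

img = np.array([[[0, 128, 255]]], dtype=np.uint8)
out = to_inception_range(img)
print(out.min(), out.max())
```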
Hello, where was this paper eventually published? I would like to cite it, but I cannot find a BibTeX entry other than the arXiv one.
Hello author, may I ask which TensorFlow version you used?
Thanks.
The depth of the feature maps, i.e. the depth of Mixed_6e
from Inception_v3, is 768, and by default 32 attention maps are generated. After the BAP module the width and height of the tensor are reduced, leaving a tensor of shape (N, 32, 768), right?
It is then normalized and reshaped to (N, 32*768) as the embedding. It confuses me that this seems a bit too large for an embedding; I have read other papers on metric learning and most of them do not generate embeddings larger than 512 dimensions.
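The arithmetic behind that concern, assuming the shapes described above (32 attention maps, 768-channel Mixed_6e features):

```python
# Embedding size after BAP, assuming 32 attention maps over
# 768-channel Inception-v3 Mixed_6e features as described above.
num_attention_maps = 32
feature_depth = 768
embedding_dim = num_attention_maps * feature_depth
print(embedding_dim)  # 24576, far above the 128-512 dims
                      # common in metric-learning papers
```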
Hello, may I ask some questions?
During testing, even when the batch size is less than 64, the memory usage is 8707 MB. Which operation causes this? I don't understand.
Excuse me.
$ python convert_data.py --dataset_name=Aircraft --dataset_dir=./Aircraft/Data
Traceback (most recent call last):
File "convert_data.py", line 76, in <module>
tf.app.run()
File "/mnt/disk0/home/mahailong/anaconda3/envs/CenterNet/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "convert_data.py", line 65, in main
convert_aircraft.run(FLAGS.dataset_dir)
File "/mnt/disk0/home/mahailong/WS_DAN/datasets/convert_aircraft.py", line 196, in run
train_dataset, test_dataset = generate_datasets(dataset_dir)
File "/mnt/disk0/home/mahailong/WS_DAN/datasets/convert_aircraft.py", line 155, in generate_datasets
train_info = np.loadtxt(os.path.join(data_root, 'fgvc-aircraft-2013b/data', 'images_variant_trainval.txt'), str)
File "/mnt/disk0/home/mahailong/anaconda3/envs/CenterNet/lib/python3.6/site-packages/numpy/lib/npyio.py", line 1141, in loadtxt
for x in read_data(_loadtxt_chunksize):
File "/mnt/disk0/home/mahailong/anaconda3/envs/CenterNet/lib/python3.6/site-packages/numpy/lib/npyio.py", line 1065, in read_data
% line_num)
ValueError: Wrong number of columns at line 1235
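One likely cause (an assumption based on the FGVC-Aircraft label files, not confirmed by the repo): lines in images_variant_trainval.txt are "<image_id> <variant name>", and variant names can contain spaces, so np.loadtxt's whitespace splitting sees a varying column count and raises "Wrong number of columns". Splitting each line only on the first space avoids this; the helper name below is illustrative:

```python
import numpy as np

def load_images_variant(path):
    """Parse an images_variant_*.txt file robustly.

    Splits each line on the first space only, so variant names that
    themselves contain spaces (e.g. "Boeing 707-320") stay intact,
    unlike np.loadtxt's default whitespace splitting.
    """
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            image_id, variant = line.split(" ", 1)
            pairs.append((image_id, variant))
    return np.array(pairs, dtype=object)

# Demo with a made-up two-line file in the same format.
demo = "demo_images_variant.txt"
with open(demo, "w", encoding="utf-8") as f:
    f.write("1025794 Boeing 707-320\n0056978 A320\n")
info = load_images_variant(demo)
print(info.shape)  # (2, 2)
```

The repo's generate_datasets in convert_aircraft.py would need the equivalent change in place of its np.loadtxt call.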
When saving the model after "Finished trainig!Saving model to disk." is printed, I get the warning: attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
Which hyperparameter setting has achieved the results in the paper?
Hi @tau-yihouxiang @ManWingloeng @Danbinabo, can anyone please tell me how to run inference on an input image or video? I have trained the model using the README instructions, but I don't know how to run inference on a single image.
Please help me out.
Thanks a lot.