tau-yihouxiang / ws_dan
The official TensorFlow implementation of WS-DAN.
In bilinear_attention_pooling, the pooled features are multiplied by 100. Why is that?
Thanks so much for your work!
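For context, here is a minimal NumPy sketch of bilinear attention pooling as I understand it from the code (names and shapes are illustrative, not the repo's). The pooled part features are L2-normalized, so their norms collapse to ~1; multiplying by a large constant (100 in the repo) restores a usable logit scale for the downstream softmax:

```python
import numpy as np

def bilinear_attention_pooling(features, attentions, scale=100.0):
    """Pool feature maps with attention maps, then L2-normalize and scale.

    features:   (H, W, C) backbone feature maps
    attentions: (H, W, M) attention maps
    returns:    (M, C)    scaled part features
    """
    H, W, C = features.shape
    M = attentions.shape[-1]
    feats = features.reshape(H * W, C)    # (HW, C)
    atts = attentions.reshape(H * W, M)   # (HW, M)
    pooled = atts.T @ feats / (H * W)     # (M, C) spatially averaged
    # After L2 normalization every part feature has norm ~1; the x100
    # scale (as in the repo) keeps the softmax logits from being tiny.
    pooled = pooled / (np.linalg.norm(pooled, axis=1, keepdims=True) + 1e-12)
    return pooled * scale

feats = np.random.rand(14, 14, 768).astype(np.float32)
atts = np.random.rand(14, 14, 32).astype(np.float32)
parts = bilinear_attention_pooling(feats, atts)
print(parts.shape)  # (32, 768)
```

This is only a sketch under the assumption that the scale is there to counteract the unit-norm features; the authors may have additional reasons.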
In your work, is the final output the average of the two output probabilities?
I have some confusion about this. Thanks!
In the paper, the cropping and dropping thresholds are 0.5, but in the implementation code they are sampled with random.uniform().
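To illustrate the discrepancy the issue describes, here is a small sketch. The paper fixes both thresholds at 0.5, while the released code samples them per image; the U(0.4, 0.6) range below is purely illustrative, so check the repo for the ranges actually used:

```python
import random
import numpy as np

def crop_drop_thresholds(paper_style=False):
    """Return (crop_theta, drop_theta).

    paper_style=True reproduces the fixed 0.5 thresholds from the paper;
    otherwise thresholds are sampled per image. The (0.4, 0.6) range is
    an assumption for illustration, not taken from the repo.
    """
    if paper_style:
        return 0.5, 0.5
    return random.uniform(0.4, 0.6), random.uniform(0.4, 0.6)

def crop_mask(attention_map, theta):
    # Keep spatial locations whose attention exceeds theta * max response.
    return attention_map >= theta * attention_map.max()

att = np.random.rand(14, 14)
theta_c, theta_d = crop_drop_thresholds(paper_style=True)
mask = crop_mask(att, theta_c)
print(mask.shape)  # (14, 14) boolean mask
```

Randomizing the thresholds acts as extra augmentation, which may be why the code diverges from the paper; only the authors can confirm.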
Hello, I found a small performance issue with tf.random_uniform in WS_DAN/nets/nasnet/nasnet_test.py. If the function is called inside a loop, the op is rebuilt on every iteration and execution efficiency suffers. I think the tf.random_uniform op should be created before the loop. There are also several similar places in the code.
In train_sample.py, np.random.choice(np.arange(0, num_parts), 1, p=part_weights) raises "probabilities contain NaN".
With batch_size = 1, 12, 16, or 32 there is no error. Why is that? Does the algorithm place special constraints on the batch size?
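One plausible cause (an assumption, not confirmed by the authors): if the raw attention weights sum to ~0 for a degenerate attention map, normalizing them divides by zero and produces NaNs, which np.random.choice rejects. A defensive normalization sketch, with illustrative names:

```python
import numpy as np

def safe_part_weights(raw_weights, eps=1e-12):
    """Normalize attention-map weights for np.random.choice.

    If the raw weights sum to ~0 (e.g. a degenerate attention map),
    plain division yields NaNs and np.random.choice raises
    "probabilities contain NaN". Fall back to a uniform distribution.
    """
    w = np.asarray(raw_weights, dtype=np.float64)
    w = np.clip(w, 0.0, None)            # probabilities must be >= 0
    total = w.sum()
    if not np.isfinite(total) or total < eps:
        return np.full(len(w), 1.0 / len(w))
    return w / total

num_parts = 32
weights = safe_part_weights(np.zeros(num_parts))   # degenerate case
idx = np.random.choice(np.arange(0, num_parts), 1, p=weights)
print(weights.sum(), idx.shape)
```

Whether certain batch sizes make degenerate maps more likely is a separate question for the authors.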
I ran the TF code and got 89+% accuracy. I think my implementation is almost the same as your TF version, so are there any details that you didn't mention in the paper?
Hi there,
Thanks for the contribution! After reading the code, I am somewhat confused about the attention regularization part. Please correct me if I have misunderstood something.
From the code, my understanding of the center-loss part is that for every class (label) you keep a center for the features, and those same features, multiplied by a scale of 100, are also used for softmax classification. However, what the paper claims is that the center loss is used for attention regularization, which assigns each attention feature in the feature matrix its own center. The equation used in the paper for the center loss is the sum of distances between those attention features and their centers (with a distinguishing M in the equation).
Is there any explanation of doing this?
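My reading of the paper's attention regularization, sketched in NumPy (this is an interpretation, not the repo's TF implementation; the update rate alpha and all names are illustrative): each class keeps a center of shape (M, C), one row per attention map, and the loss pulls each sample's part-feature matrix toward its class center.

```python
import numpy as np

def attention_center_loss(part_features, centers, labels, alpha=0.05):
    """Center loss over BAP part features, per my reading of the paper.

    part_features: (N, M, C) batch of part-feature matrices
    centers:       (num_classes, M, C) per-class feature centers
    labels:        (N,) integer class ids
    Returns (loss, updated_centers); centers are updated with a
    moving average (alpha is an illustrative update rate).
    """
    batch_centers = centers[labels]                 # (N, M, C)
    diff = part_features - batch_centers
    loss = np.mean(np.sum(diff ** 2, axis=(1, 2)))  # squared distance
    # Moving-average center update (done with TF ops in the repo).
    new_centers = centers.copy()
    for i, y in enumerate(labels):
        new_centers[y] = new_centers[y] + alpha * (part_features[i] - new_centers[y])
    return loss, new_centers

N, M, C, K = 4, 32, 768, 10
feats = np.random.rand(N, M, C)
centers = np.zeros((K, M, C))
labels = np.array([0, 1, 1, 3])
loss, centers = attention_center_loss(feats, centers, labels)
print(loss >= 0.0)  # True
```

Under this reading, per-class centers and per-attention-map centers coincide: the class center simply has one row per attention map, so the code and the paper may be describing the same computation from different angles.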
Hello, author.
Which TensorFlow version did you use: 1.12, or another release after 1.9?
Thank you!
Hello, author.
A question: when your data is converted to TFRecord, is it normalized when it is read back?
I did not find the relevant code.
The preprocess_for_train function's parameters:
Args:
  image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
    [0, 1]; otherwise it will be converted to tf.float32 assuming that the
    range is [0, MAX], where MAX is the largest positive representable number
    for the int(8/16/32) data type (see tf.image.convert_image_dtype for
    details).
  height: integer
  width: integer
  bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords],
    where each coordinate is in [0, 1) and the coordinates are arranged as
    [ymin, xmin, ymax, xmax].
  fast_mode: Optional boolean; if True, avoids slower transformations (i.e.
    bi-cubic resizing, random_hue or random_contrast).
  scope: Optional scope for name_scope.
  add_image_summaries: Enable image summaries.
Returns:
  3-D float Tensor of distorted image used for training, with range [-1, 1].
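Since the docstring says the output range is [-1, 1], the normalization in question is presumably the standard Inception-style mapping. A NumPy sketch of that mapping (illustrative only, not the repo's TF code):

```python
import numpy as np

def to_inception_range(image_uint8):
    """Map a uint8 image to the [-1, 1] range the docstring describes.

    Mirrors the usual Inception preprocessing: uint8 [0, 255] is first
    scaled to float [0, 1] (what tf.image.convert_image_dtype does),
    then shifted and scaled to [-1, 1]. This is an assumption about
    where the normalization happens, based on the docstring.
    """
    img = image_uint8.astype(np.float32) / 255.0   # [0, 1]
    img = (img - 0.5) * 2.0                        # [-1, 1]
    return img

img = np.array([[[0, 128, 255]]], dtype=np.uint8)
out = to_inception_range(img)
print(out.min(), out.max())
```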
Hello, where was this paper eventually published? I would like to cite it, but I cannot find a BibTeX entry other than the arXiv one.
Hello author, may I ask which TensorFlow version you used?
Thanks.
The depth of the feature maps, i.e. the depth of Mixed_6e
from Inception_v3, is 768, and by default 32 attention maps are generated. After the BAP module the width and height of the tensor are reduced, leaving a tensor of shape (N, 32, 768), right?
It is then normalized and reshaped to (N, 32*768) as the embedding. It confuses me that this seems a bit too large for an embedding; I have read other papers on metric learning and most of them do not generate embeddings larger than 512 dimensions.
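The arithmetic behind that concern, assuming the shapes described above (32 attention maps, 768-channel Mixed_6e features):

```python
# Embedding size after BAP, assuming 32 attention maps over
# 768-channel Inception-v3 Mixed_6e features as described above.
num_attention_maps = 32
feature_depth = 768
embedding_dim = num_attention_maps * feature_depth
print(embedding_dim)  # 24576, far above the 128-512 dims
                      # common in metric-learning papers
```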
Hello, may I ask some questions?
During testing, even when the batch size is less than 64, the memory usage is 8707 MB. Which operation causes this? I don't understand.
Excuse me.
$ python convert_data.py --dataset_name=Aircraft --dataset_dir=./Aircraft/Data
Traceback (most recent call last):
File "convert_data.py", line 76, in <module>
tf.app.run()
File "/mnt/disk0/home/mahailong/anaconda3/envs/CenterNet/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "convert_data.py", line 65, in main
convert_aircraft.run(FLAGS.dataset_dir)
File "/mnt/disk0/home/mahailong/WS_DAN/datasets/convert_aircraft.py", line 196, in run
train_dataset, test_dataset = generate_datasets(dataset_dir)
File "/mnt/disk0/home/mahailong/WS_DAN/datasets/convert_aircraft.py", line 155, in generate_datasets
train_info = np.loadtxt(os.path.join(data_root, 'fgvc-aircraft-2013b/data', 'images_variant_trainval.txt'), str)
File "/mnt/disk0/home/mahailong/anaconda3/envs/CenterNet/lib/python3.6/site-packages/numpy/lib/npyio.py", line 1141, in loadtxt
for x in read_data(_loadtxt_chunksize):
File "/mnt/disk0/home/mahailong/anaconda3/envs/CenterNet/lib/python3.6/site-packages/numpy/lib/npyio.py", line 1065, in read_data
% line_num)
ValueError: Wrong number of columns at line 1235
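One likely cause (an assumption based on the FGVC-Aircraft label files, not confirmed by the repo): lines in images_variant_trainval.txt are "<image_id> <variant name>", and variant names can contain spaces, so np.loadtxt's whitespace splitting sees a varying column count and raises "Wrong number of columns". Splitting each line only on the first space avoids this; the helper name below is illustrative:

```python
import numpy as np

def load_images_variant(path):
    """Parse an images_variant_*.txt file robustly.

    Splits each line on the first space only, so variant names that
    themselves contain spaces (e.g. "Boeing 707-320") stay intact,
    unlike np.loadtxt's default whitespace splitting.
    """
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            image_id, variant = line.split(" ", 1)
            pairs.append((image_id, variant))
    return np.array(pairs, dtype=object)

# Demo with a made-up two-line file in the same format.
demo = "demo_images_variant.txt"
with open(demo, "w", encoding="utf-8") as f:
    f.write("1025794 Boeing 707-320\n0056978 A320\n")
info = load_images_variant(demo)
print(info.shape)  # (2, 2)
```

The repo's generate_datasets in convert_aircraft.py would need the equivalent change in place of its np.loadtxt call.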
When saving the model after "Finished trainig!Saving model to disk." is printed, I get the warning: attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
Which hyperparameter setting has achieved the results in the paper?
Hi @tau-yihouxiang @ManWingloeng @Danbinabo, can anyone please tell me how to run inference on an input image or video? I have trained the model using the README instructions, but I don't know how to run inference on a single image.
Please help me out.
Thanks a lot.