

locating-objects-without-bboxes's Issues

Runtime Error: An attempt has been made to start a new process before the current process has finished

Hello, I am having the same issue as #9 and #11, which were both closed due to inactivity. Is the only suggested solution to use this on a Unix machine? I am running this on Windows and currently do not have access to a Unix machine.

This is the command I used:
python -m object-locator.train --train-dir training --val-dir auto --epochs 100 --no-cuda --save lobes.ckpt --visdom-server http://localhost --visdom-port 8097 --visdom-env main

The full error message can be found here: https://pastebin.com/gvcTXYxV

I get the same error when trying to run locate.py with this command:
python -m object-locator.locate --dataset mall_dataset --out output --model mall,lambdaa=1,BS=32,Adam,LR1e-4.ckpt --no-cuda
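For anyone else hitting this on Windows: a minimal sketch of the usual workaround, not specific to this repo. Windows starts DataLoader workers with "spawn", which re-imports the main module, so the entry point must be guarded; the run_training wrapper below is hypothetical, and reducing --nThreads (if it maps to DataLoader workers) may also avoid the problem.

    import multiprocessing

    def run_training():
        # Invoke the trainer from here. Passing --nThreads 0 (so the
        # DataLoader uses no worker processes) also sidesteps the issue.
        ...

    if __name__ == '__main__':
        multiprocessing.freeze_support()  # no-op unless frozen into an .exe
        run_training()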

training from scratch on the ShanghaitechB dataset

I used the pre-trained model you provided (ShanghaitechB) and got pretty good results. But when I follow the train.py example you gave and train on the same dataset (ShanghaitechB) from scratch, I get totally different results: your pre-trained model works well, but my model cannot locate anyone. The estimated maps are full of yellow blocks, no matter which epoch I choose (100, 500, 1000).

Pre-trained model

Hi, thanks for providing the code. The code mentions that a pre-trained model comes with this package, but the pre-trained model "unet_256x256_sorghum.ckpt" is not available in the checkpoints folder. I was trying to test the code on the provided test set, and it gives an error on this file. Should I first train it myself on some dataset?

need help with validation and locate.py

train.py and locate.py are running, but a new problem occurs. I found that the GPU is used during training, but the code gets slow during validation; why is the GPU not used during validation?
When locate.py is running, it has the same problem as validation. I am sure my GPU is usable and that I passed the right arguments.

A question about F1-scores in Figure 6 and Figure 8

In your paper, Figure 8 shows that the F1-scores on the three datasets are over 88.6 at r=5. However, in my reading, Figure 6 shows the F1-score at r=5 is under 60. My question is: which dataset is Figure 6 based on? Or do I have a wrong understanding of Figure 6? Can you give me some suggestions?
@javiribera

About greater batch size during validation

Hi, thanks for the great work!

When I was training with my own data, the bottleneck of the whole run was clearly the validation pass, since it uses batch size = 1. In one epoch, only 20% of the time went to training and back-prop, but 80% went to validation. Why can't the batch size for validation be greater than 1? Is there any issue that would need to be solved to support this?

need help with this problem when running train.py

(object_loader) D:\google_download\locating-objects-without-bboxes-master>python -m object-locator.train --train-dir D:/_PNGImages --batch-size 32 --lr 1e-3 --val-dir D:/_PNGImages --optim Adam --save saved_model.ckpt
W: Not connected to any Visdom server. You will not visualize any training/validation plot or intermediate image
replace ballpark
Building network... DONE (took 0.363471 seconds)
Epoch 0 (420 images): 0%| | 0/14 [00:00<?, ?it/s]
W: Not connected to any Visdom server. You will not visualize any training/validation plot or intermediate image
replace ballpark
Building network... DONE (took 0.365666 seconds)
Epoch 0 (420 images): 0%| | 0/14 [00:00<?, ?it/s]

The parent process and the spawned worker each print a traceback (their output was interleaved in my console; untangled here). The worker:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\spawn.py", line 223, in prepare
    _fixup_main_from_name(data['init_main_from_name'])
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\spawn.py", line 249, in _fixup_main_from_name
    alter_sys=True)
  File "D:\anaconda\envs\object_loader\lib\runpy.py", line 205, in run_module
    return _run_module_code(code, init_globals, run_name, mod_spec)
  File "D:\anaconda\envs\object_loader\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "D:\anaconda\envs\object_loader\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\google_download\locating-objects-without-bboxes-master\object-locator\train.py", line 170, in <module>
    for batch_idx, (imgs, dictionaries) in enumerate(iter_train):
  File "D:\anaconda\envs\object_loader\lib\site-packages\tqdm\_tqdm.py", line 940, in __iter__
    for obj in iterable:
  File "D:\anaconda\envs\object_loader\lib\site-packages\torch\utils\data\dataloader.py", line 278, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "D:\anaconda\envs\object_loader\lib\site-packages\torch\utils\data\dataloader.py", line 682, in __init__
    w.start()
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

And the parent:

Traceback (most recent call last):
  File "D:\anaconda\envs\object_loader\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "D:\anaconda\envs\object_loader\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\google_download\locating-objects-without-bboxes-master\object-locator\train.py", line 170, in <module>
    for batch_idx, (imgs, dictionaries) in enumerate(iter_train):
  File "D:\anaconda\envs\object_loader\lib\site-packages\tqdm\_tqdm.py", line 940, in __iter__
    for obj in iterable:
  File "D:\anaconda\envs\object_loader\lib\site-packages\torch\utils\data\dataloader.py", line 278, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "D:\anaconda\envs\object_loader\lib\site-packages\torch\utils\data\dataloader.py", line 682, in __init__
    w.start()
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\anaconda\envs\object_loader\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

Mismatch of tensor in WHD forward?

Hello Javier,
I'm trying to understand your WHD forward function, but fail to understand the dimension of normalized_y.

            # One by one
            prob_map_b = prob_map[b, :, :]
            gt_b = gt[b]
            orig_size_b = orig_sizes[b, :]
            norm_factor = (orig_size_b/self.resized_size).unsqueeze(0)

            # Pairwise distances between all possible locations and the GTed locations
            n_gt_pts = gt_b.size()[0]
            normalized_x = norm_factor.repeat(self.n_pixels, 1) * self.all_img_locations
            normalized_y = norm_factor.repeat(len(gt_b), 1) * gt_b
            d_matrix = cdist(normalized_x, normalized_y)

From my understanding, gt_b.size() = [H, W] and norm_factor.size() = [1, 2], so n_gt_pts = H; self.all_img_locations.size() = [HxW, 2], which leads to normalized_x.size() = [HxW, 2].

Then here comes my puzzle: norm_factor.repeat(len(gt_b), 1) gives me [B, 2], but gt_b.size() = [H, W], so how can these two tensors be multiplied? Did you use some special reshape operation here?

Thank you!
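For context, a minimal shape experiment (my own sketch, with illustrative sizes) assuming, as the repo's data loading suggests, that gt_b is an [N, 2] tensor of point coordinates rather than an [H, W] map:

    import torch

    H, W, N = 256, 256, 5                     # illustrative sizes
    norm_factor = torch.tensor([[1.5, 2.0]])  # [1, 2] per-axis rescaling factor
    # every pixel location, [H*W, 2]
    all_img_locations = torch.stack(
        torch.meshgrid(torch.arange(H), torch.arange(W)), dim=-1
    ).reshape(-1, 2).float()
    gt_b = torch.rand(N, 2) * H               # [N, 2]: one (y, x) point per object

    normalized_x = norm_factor.repeat(H * W, 1) * all_img_locations  # [H*W, 2]
    normalized_y = norm_factor.repeat(N, 1) * gt_b                   # [N, 2]
    # cdist(normalized_x, normalized_y) is then an [H*W, N] distance matrix.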

Explanation about True positives, False Positives and False Negatives

Hi, I was wondering: what is the difference between np.sum(detected_pts) (the true-positive count) and np.sum(detected_gt), which is used to estimate the false-negative count? My understanding is that gt are the real points and pts are the detected ones (within the radius), but what would the difference be if r is big enough, and why are they not the same? For instance, in the mall set, if there are 5 people and the program correctly identifies 3 of them, true positives will be 3, false positives 2, and false negatives also 2, so those sums would be equal. I also get that both will be higher with a bigger radius, but the gt one will (sometimes) be bigger than the pts one. I ask because in your paper the false-negative definition reads: "A false negative is counted if a true location does have any estimated location at a distance at most r", which to me is the same as the true-positive definition, just seen from the estimated location's point of view. And if that is a misspelling, why is a false negative not the same as a false positive?
Thank you for your attention.

KeyError

dictionary['locations'] = eval(dictionary['locations'])
KeyError: 'locations'

running locate.py without an XML file

When I don't have an XML file, it fails at line 93 of data_plant_stuff.py, trying to read a None file:

    # Read all XML as a string
    with open(os.path.join(directory, xml_filename), 'r') as fd:
        xml_str = fd.read()
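A possible guard (my sketch; the there_is_gt flag is illustrative, not taken from the repo) would skip the XML parsing when no annotation file exists, so locate.py could still output pure estimations:

    import os

    # Hypothetical patch around data_plant_stuff.py line 93: only read the
    # XML when one was actually found, instead of os.path.join()-ing None.
    if xml_filename is None:
        there_is_gt = False   # downstream: return estimations only, no metrics
        xml_str = None
    else:
        there_is_gt = True
        with open(os.path.join(directory, xml_filename), 'r') as fd:
            xml_str = fd.read()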

Questions about ground truth format.

Hi,

I was reading the help information that the command python3.6 -m object-locator.locate -h offered me when I came across:

--dataset DATASET    Directory with test images. Must contain image files
                     (any format), and a CSV or XML file containing a
                     groundtruth file following the API v0.4 (default:
                     None)

As far as I know, the groundtruth in the mall dataset is a .mat file, which cannot be used directly.
I have no clue what 'API v0.4' refers to. Is there any documentation I can catch up with?

Thanks.

Adapting your loss to segmentation

Hello,
Thanks for this nice paper and the code.
From my understanding, in your code (the WeightedHausdorffDistance class) the ground truth gt is a list of points corresponding to different objects. Is it possible to have gt be a list of points from the same object?

I am wondering about using your loss for a segmentation task in medical imaging, to replace (or run in parallel with) the Dice loss. So we have the ground-truth segmentation of one object.

Best Regards,
Théo
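For what it's worth, a minimal sketch (mine, not from the repo) of feeding a single-object segmentation mask to the loss by treating every foreground pixel as a GT point:

    import torch

    def mask_to_points(mask):
        """Turn a binary [H, W] mask of one object into an [N, 2] tensor
        of (row, col) foreground coordinates."""
        return torch.nonzero(mask > 0).float()

    mask = torch.zeros(192, 192)
    mask[80:100, 90:110] = 1          # a small square object
    gt = [mask_to_points(mask)]       # list of one [N, 2] point set per image
    print(gt[0].shape)                # torch.Size([400, 2])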

Question about Result of Metrics

I use the command CUDA_VISIBLE_DEVICES=1 python -m object-locator.locate --imgsize 256X256 --dataset mall_dataset/frames/test/ --out mall_dataset/output/ --model checkpoints/mall,lambdaa=1,BS=32,Adam,LR1e-4.ckpt --evaluate, and the test set takes pictures from seq_001801.jpg to seq_002000.jpg. However, the results differ greatly from your paper. Can you give me some suggestions? @javiribera
(attached result plots: bmm_stats, fscore_vs_tau, precision_vs_th, recall_vs_tau)

The max distance in Weighted Hausdorff Loss is not the actual max distance

The maximum possible distance is calculated from the resized image size:

self.max_dist = math.sqrt(resized_height**2 + resized_width**2)

But it should be calculated from the original image size, since the code below also computes distances in the original image.

The two lines below need to be changed:

weighted_d_matrix = (1 - p_replicated)*self.max_dist + p_replicated*d_matrix

terms_2.append(torch.tensor([self.max_dist],

Suggested change

            max_dist = (orig_size_b **2).sum().sqrt()
....
            # Corner case: no GT points
            if gt_b.ndimension() == 1 and (gt_b < 0).all().item() == 0:
                terms_1.append(torch.tensor([0],
                                            dtype=torch.get_default_dtype()))
                terms_2.append(torch.tensor([max_dist],
                                            dtype=torch.get_default_dtype()))
                continue
.....
            weighted_d_matrix = (1 - p_replicated)*max_dist  + p_replicated*d_matrix

Sorry, I'm too lazy to make a pull request to fix it myself right now.

Training epoch to get

Hi, I have been training with the mall dataset.
This is the command I used:
python -m object-locator.train --train-dir ./object-locator/dataset/mall_dataset/frames/ \
    --batch-size 32 \
    --lr 1e-3 --optim Adam \
    --resume ./model_08.ckpt --visdom-server localhost --visdom-port 8097 \
    --val-dir auto \
    --epochs 1000000 \
    --nThreads 4 \
    --val-freq 10 \
    --max-mask-pts 100 \
    --paint

Could you please share the training command you used to achieve your head-position results, especially the validation frequency and epoch settings?

mall_small_dataset at 1200 epochs

(attachments: mall_small_dataset.zip and a ground-truth example, seq_000001)

I created a smaller dataset to observe the training of the network. For faster training, the mall images were cropped to half their width and height, and only 500 images were fed into training (see the attached dataset, ground-truth image, and Visdom screenshot at 1200 epochs). The following command was used for training:

---mall_small_dataset command for training
python -m object-locator.train --train-dir "data/mall_small_dataset" --batch-size 16 --lr 1e-4 --val-dir "data/mall_small_dataset" --optim adam --val-freq 10 --save "data/mall_small_dataset-model.ckpt" --visdom-env mall_small_dataset_training --visdom-server http://localhost --visdom-port 8097

The code ran 1200 epochs overnight, but it still hasn't converged to the object locations. Could you give me guidance on resolving this?

(attachment: Visdom screenshot, 2020-01-30)

Thank you

localhost error

Hi,
I'm trying to train on my own dataset using this command:
python -m object-locator.train --train-dir real_train --batch-size 32 --visdom-env mytrainingsession --visdom-server localhost --lr 1e-3 --val-dir real_train --optim Adam --save saved_model.ckpt

I got this error message:
Traceback (most recent call last):
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/urllib3/connection.py", line 160, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/urllib3/connectionpool.py", line 392, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/http/client.py", line 1262, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/http/client.py", line 1308, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/http/client.py", line 1257, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/http/client.py", line 1036, in _send_output
self.send(msg)
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/http/client.py", line 974, in send
self.connect()
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/urllib3/connection.py", line 187, in connect
conn = self._new_conn()
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/urllib3/connection.py", line 172, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f22cdda79b0>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/urllib3/connectionpool.py", line 725, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/urllib3/util/retry.py", line 439, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8989): Max retries exceeded with url: /env/mytrainingsession (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f22cdda79b0>: Failed to establish a new connection: [Errno 111] Connection refused',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/visdom/init.py", line 711, in _send
data=json.dumps(msg),
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/visdom/init.py", line 677, in _handle_post
r = self.session.post(url, data=data)
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/requests/sessions.py", line 578, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/requests/sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "/home/odedb/.conda/envs/object-locator/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8989): Max retries exceeded with url: /env/mytrainingsession (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f22cdda79b0>: Failed to establish a new connection: [Errno 111] Connection refused',))
[Errno 111] Connection refused
E: cannot connect to Visdom server localhost:8989

thanks in advance!

Mall dataset testing time

As specified in the paper, you did a train/val/test split of the mall dataset (2000 frames) of 80/10/10, which gives 200 test frames. How long should it take to locate all the objects in those 200 images? After using a script to turn the .mat file into train/val/test datasets with the proper ground-truth files, and using the command
python -m object-locator.locate --dataset mall_dataset/mall_test/ --out mall_out/ --model mall.ckpt --evaluate
it takes ~1.5 hours to run on the entire test set, which is much longer than the training/validation iterations. I'm just running it locally on an i5-6600K and a 1070. Training took about 1.5 minutes per epoch and validation around 5 minutes, which makes me wonder what I could be doing wrong in testing. Perhaps this is expected, but I just wanted to check.

Thanks in advance!

weird error

Hi,
I am using this commit version: commit d848560

I tried to train using this train command: python -m object-locator.train --train-dir real_train --batch-size 32 --visdom-env mytrainingsession --visdom-server localhost --lr 1e-3 --val-dir real_train --optim Adam --save saved_model.ckpt

got this error:
https://pastebin.com/JMbjWZPn

I attached the gt.csv file (uploaded as gt.xlsx).

thanks in advance!

Converged but too wide

The code ran for 1800 epochs and the loss doesn't get any lower. The object locations in training look too wide. I used the following command and parameters. Could you guide me on improving the accuracy of the object locations?

python -m object-locator.train --train-dir "data/mall_small_dataset" --batch-size 10 --lr 1e-6 --val-dir "data/mall_small_validate" --optim adam --val-freq 10 --save "data/mall_small_dataset-model.ckpt" --visdom-env mall_500_v2_dataset_training --visdom-server http://localhost --visdom-port 8097 --imgsize 256x320 --resume "data/mall_small_dataset-model.ckpt"

(attachment: Visdom screenshot, 2020-01-31)

Thank you

questions about the Hausdorff loss

Thank you very much for sharing the code.

I have some questions about the averaged Hausdorff loss. Currently I am trying to solve a boundary-detection problem on a medical-image dataset. I tried to use your AveragedHausdorffLoss, but the inputs of your function are two point sets, while my inputs are a 2-class softmax probability map and ground-truth labels. The critical issue is that I have to compute set1 from the probability map using torch.max(), and torch.max() is not differentiable, so it cannot be back-propagated.

My question is: do you know methods to avoid the 'cannot be back-propagated' problem, or other implementations that directly use the probability map to compute the loss?

Best regards!
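In case it helps other readers: one common workaround (a sketch of mine, not part of this repo) is a "soft-argmax", which replaces the hard torch.max() with a softmax-weighted average of pixel coordinates, so the extracted location stays differentiable. Note also that the weighted Hausdorff distance in this repo is designed to consume the probability map directly, with no argmax at all.

    import torch

    def soft_argmax_2d(score_map, beta=100.0):
        """Differentiable argmax of an [H, W] map: the expected (row, col)
        under softmax(beta * scores); large beta approaches the hard argmax."""
        h, w = score_map.shape
        weights = torch.softmax(beta * score_map.reshape(-1), dim=0)   # [H*W]
        ys, xs = torch.meshgrid(torch.arange(h, dtype=score_map.dtype),
                                torch.arange(w, dtype=score_map.dtype))
        coords = torch.stack([ys.reshape(-1), xs.reshape(-1)], dim=1)  # [H*W, 2]
        return (weights.unsqueeze(1) * coords).sum(dim=0)              # [2]

    prob_map = torch.rand(64, 64, requires_grad=True)
    loc = soft_argmax_2d(prob_map)
    loc.sum().backward()   # gradients reach the probability map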

About validation time

When I run train.py on my own data, validation takes a very long time with very low GPU utilization. I want to know why. My data are 1000x1000 images with thousands of objects per image.

W: The dataset directory data does not contain a CSV file with groundtruth.

I ran the command below:

python -m object-locator.locate --dataset data --out result --model pupil,lambdaa=1,BS=64,SGD,LR1e-3,p=-1,ultrasmallNet.ckpt --ultrasmallnet

and got an error as follows:

W: The dataset directory data does not contain a CSV file with groundtruth. Metrics will not be evaluated. Only estimations will be returned.
Loading checkpoint...
\__ loaded checkpoint 'pupil,lambdaa=1,BS=64,SGD,LR1e-3,p=-1,ultrasmallNet.ckpt' with 6.02M trainable parameters
DONE (took 0.076988 seconds)
  0%| | 0/1 [00:00<?, ?it/s]<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=612x408 at 0x7FF904D587B8>
Traceback (most recent call last):
  File "/opt/conda/envs/object-locator/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/envs/object-locator/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/locating-objects-without-bboxes/object-locator/locate.py", line 186, in <module>
    total=len(testset_loader)):
  File "/opt/conda/envs/object-locator/lib/python3.6/site-packages/tqdm/_tqdm.py", line 940, in __iter__
    for obj in iterable:
  File "/opt/conda/envs/object-locator/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/opt/conda/envs/object-locator/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
KeyError: 'Traceback (most recent call last):
  File "/opt/conda/envs/object-locator/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/opt/conda/envs/object-locator/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/tmp/locating-objects-without-bboxes/object-locator/data.py", line 291, in __getitem__
    dictionary[\'locations\'] = eval(dictionary[\'locations\'])
KeyError: \'locations\'
'

Is there any solution?

My directory hierarchy is:

test

The ground-truth file gt.csv includes:

gt

Thanks.

Any suggestions for this problem

(object-locator) D:\2019\wildandfields\Locating Objects Without Bounding Boxes\locating-objects-without-bboxes-master>python -m object-locator.locate --dataset 20160613_F54_validation --out output --model "checkpoints\plants_20160613_F54,BS=32,Adam,LR1e-5,p=-1.ckpt" --evaluate --no-gpu --radii 5
Loading checkpoint...
__ loaded checkpoint 'checkpoints\plants_20160613_F54,BS=32,Adam,LR1e-5,p=-1.ckpt' with 64.8M trainable parameters
DONE (took 3.866121 seconds)
Loading checkpoint...
__ loaded checkpoint 'checkpoints\plants_20160613_F54,BS=32,Adam,LR1e-5,p=-1.ckpt' with 64.8M trainable parameters
DONE (took 3.257291 seconds)

The spawned worker's traceback and the parent's were interleaved; untangled, the worker fails with:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\spawn.py", line 223, in prepare
    _fixup_main_from_name(data['init_main_from_name'])
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\spawn.py", line 249, in _fixup_main_from_name
    alter_sys=True)
  File "D:\sofewarespace\envs\object-locator\lib\runpy.py", line 205, in run_module
    return _run_module_code(code, init_globals, run_name, mod_spec)
  File "D:\sofewarespace\envs\object-locator\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "D:\sofewarespace\envs\object-locator\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\2019\wildandfields\Locating Objects Without Bounding Boxes\locating-objects-without-bboxes-master\object-locator\locate.py", line 186, in <module>
    for batch_idx, (imgs, dictionaries) in tqdm(enumerate(testset_loader),
  File "D:\sofewarespace\envs\object-locator\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)
  File "D:\sofewarespace\envs\object-locator\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
    w.start()
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

and the parent with:

Traceback (most recent call last):
  File "D:\sofewarespace\envs\object-locator\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "D:\sofewarespace\envs\object-locator\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\2019\wildandfields\Locating Objects Without Bounding Boxes\locating-objects-without-bboxes-master\object-locator\locate.py", line 186, in <module>
    for batch_idx, (imgs, dictionaries) in tqdm(enumerate(testset_loader),
  File "D:\sofewarespace\envs\object-locator\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)
  File "D:\sofewarespace\envs\object-locator\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
    w.start()
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\sofewarespace\envs\object-locator\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

General questions regarding experiments with wHD

Could you comment on the following? I believe it would help me and potentially future readers:

  1. Have you used this loss in a multitask sense? For example, combining segmentation and heat-maps from your loss?
  2. What happens if you try spatial softmax for the heat-map instead of sigmoid?
  3. In a multi-class problem where the number of GT points is known and fixed, could you potentially make N heat-maps for N points, apply softmax across channels, and expect your solution to work?

some problems about usage

usage: train.py [-h] --train-dir TRAIN_DIR [--val-dir VAL_DIR] [--imgsize HxW]
[--batch-size N] [--epochs N] [--nThreads N] [--lr LR] [-p P]
[--no-cuda] [--no-data-augm] [--drop-last-batch] [--seed S]
[--resume PATH] [--save PATH] [--log-interval N]
[--max-trainset-size N] [--max-valset-size N] [--val-freq F]
[--visdom-env NAME] [--visdom-server SRV] [--visdom-port PRT]
[--optimizer OPTIM] [--replace-optimizer] [--max-mask-pts M]
[--paint] [--radius R] [--n-points N] [--ultrasmallnet]
[--lambdaa L]
train.py: error: unrecognized arguments: --env-name sorghum

The code has no argument named "--env-name", but you use it in your example (judging by the usage text above, the flag is "--visdom-env").

Errors during validation and checkpoint saving

Hello!

Several problems appear when I use your code.

First of all, my data are 4000x6000 drone images of animals, previously cut into 400x400 tiles.
I work with Windows 10 in the Anaconda prompt, with a 4 GB GPU.

I run a training with this command:
python -m object-locator.train --train-dir mytraindir --batch-size 1 --visdom-env training --visdom-server localhost --visdom-port 8097 --epochs 2 --lr 1e-3 --val-dir myvaldir --save otherdir\saved_model.ckpt --nThreads 1 --imgsize 400x400

Training is going well. However, an error occurs during the validation, at each loaded image:
C:\Users\delplaal\AppData\Local\Continuum\anaconda3\envs\object-locator\lib\site-packages\object-locator\utils.py:86: RuntimeWarning: invalid value encountered in true_divide
array_scaled = ((array - minn)/(maxx - minn)*255)

It seems that the denominator is zero, but I do not know where that comes from.
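For reference, the warning means maxx == minn, i.e. the array being visualized is constant (for example an all-zero intermediate image). A guarded version of that scaling would be (my sketch, not the repo's code):

    import numpy as np

    def scale_to_uint8(array):
        """Min-max scale to [0, 255], tolerating constant input."""
        minn, maxx = array.min(), array.max()
        if maxx == minn:                   # constant array: avoid 0/0
            return np.zeros_like(array, dtype=np.uint8)
        return ((array - minn) / (maxx - minn) * 255).astype(np.uint8)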

Finally, at the end of this pseudo-validation, this error appears:
E: Don't overwrite a checkpoint without resuming from it. Are you sure you want to do that? (if you do, remove it manually).

And I get a broken pipe, so I can't run more than one epoch.

I also tried without validation and I get this error:
File "C:\Users\delplaal\AppData\Local\Continuum\anaconda3\envs\object-locator\lib\runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "C:\Users\delplaal\AppData\Local\Continuum\anaconda3\envs\object-locator\lib\runpy.py", line 85, in _run_code exec(code, run_globals) File "C:\Users\delplaal\AppData\Local\Continuum\anaconda3\envs\object-locator\lib\site-packages\object-locator\train.py", line 273, in <module> if args.save and (epoch + 1) % args.val_freq == 0: ZeroDivisionError: integer division or modulo by zero

I'm stuck. What can I do to fix this?

Here is a sample of my dataset: https://we.tl/t-ybaoqdEqha

Thank you in advance for your answers!

Alexandre

mall dataset issue

Hello,

I tried to run your code according to README.md, but I can't solve a problem with mall_dataset.

Your README suggests that I should have a well-formatted gt.csv file, and the repo contains some '.txt to .csv' Python code.

But the mall dataset link's zip file has only a .mat ground-truth file.

Of course, it'll be fine if I spend more time converting the .mat file, but I wondered whether there is another way to solve this, or something I missed. Thank you :)
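In case it saves someone time, here is a rough conversion sketch. It assumes the mall ground truth is mall_gt.mat with a 'frame' struct array whose 'loc' field holds per-frame (x, y) head positions, and that the repo's gt.csv wants (y, x) tuples in a 'locations' column; verify both assumptions against your files before trusting it:

    import scipy.io
    import pandas as pd

    mat = scipy.io.loadmat('mall_gt.mat', squeeze_me=True)
    rows = []
    for i, frame in enumerate(mat['frame'], start=1):
        pts = frame['loc']                                  # [N, 2] (x, y) heads
        locations = [(float(y), float(x)) for x, y in pts]  # (y, x) for the repo
        rows.append({'filename': 'seq_{:06d}.jpg'.format(i),
                     'count': len(locations),
                     'locations': str(locations)})
    pd.DataFrame(rows).to_csv('gt.csv', index=False)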

validation time

I know there have been many issues related to validation time, but it's still not clear to me.

I am training on the mall dataset and set --taus to the default -2, since it's faster than the other options. It still takes about 4.5 s per image for validation.
I saw a post where you said 'Only the neural network can use the GPU. The thresholding and EM can only use the CPU.'
Does that mean validation mostly has to run on the CPU? Can I know the reason?

Thank you

Improving information presented in functions

Hi,

Great work; I might include it in future papers of mine. I found one issue:
https://github.com/javiribera/locating-objects-without-bboxes/blob/master/object-locator/losses.py

It might help if you explicitly mentioned that, in the case of a single GT point, it must be provided as a list of [1, 2] tensors. This shape is consistent with your code, and it leads to failure if people provide GT points as [2,] or [2, 1].

Further, I suggest removing the "device" dependency from the function definition and instead considering something like self.all_img_locations = self.all_img_locations.to(prob_map.device).

That would help future users who train on a cluster.
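Concretely, both points in a short sketch (the variable names are mine):

    import torch

    # 1) A single GT point must be a [1, 2] tensor inside the gt list:
    single_pt = torch.tensor([120.0, 87.0])   # shape [2] -- fails in the loss
    gt = [single_pt.reshape(1, 2)]            # shape [1, 2] -- consistent

    # 2) Rather than fixing a device at construction, follow the input
    #    tensor at call time:
    all_img_locations = torch.rand(256 * 256, 2)   # stand-in for the cached grid
    prob_map = torch.rand(256, 256)                # lives on cuda:N in practice
    all_img_locations = all_img_locations.to(prob_map.device)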

Training About Mall Dataset

I'm trying to replicate your training process using "mall,lambdaa=1,BS=32,Adam,LR1e-4.ckpt". I find that there is still room for training after loading the weights, and the loss on the validation set is very large. I'm confused about this; can you give me some suggestions?
(attached screenshot)

About the XML groundtruth file

Hi sir,
I want to train the model on the mall dataset with your code.
After downloading the dataset, the GT file is in .mat format.
Can you provide the XML file for each dataset?

The error log is:
(object-locator) imre@imre2018-09-01:~/baiyan/locating-objects-without-bboxes$ python -m object-locator.train --train-dir ./mall_dataset/frames/ --batch-size 32 --visdom-env sorghum --lr 1e-3 --val-dir ./mall_dataset/frames/ --optim Adam --save saved_model.ckpt
W: Not connected to any Visdom server. You will not visualize any training/validation plot or intermediate image
W: The dataset directory ./mall_dataset/frames/ does not contain a XML file with groundtruth. Metrics will not be evaluated.Only estimations will be returned.
Traceback (most recent call last):
  File "/home/imre/anaconda2_v2/envs/object-locator/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/imre/anaconda2_v2/envs/object-locator/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/imre/baiyan/locating-objects-without-bboxes/object-locator/train.py", line 93, in <module>
    max_valset_size=args.max_valset_size)
  File "/home/imre/baiyan/locating-objects-without-bboxes/object-locator/data.py", line 131, in get_train_val_loaders
    seed=seed)
  File "/home/imre/baiyan/locating-objects-without-bboxes/object-locator/data.py", line 66, in build_dataset
    seed=seed)
  File "/home/imre/baiyan/locating-objects-without-bboxes/object-locator/data_plant_stuff.py", line 93, in __init__
    with open(os.path.join(directory, xml_filename), 'r') as fd:
  File "/home/imre/anaconda2_v2/envs/object-locator/lib/python3.6/posixpath.py", line 94, in join
    genericpath._check_arg_types('join', a, *p)
  File "/home/imre/anaconda2_v2/envs/object-locator/lib/python3.6/genericpath.py", line 149, in _check_arg_types
    (funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'NoneType'

Loss breaks after reaching minima

Would you happen to have any intuition on this?
I'm using a U-Net-style network (with skip connections). The output has 3 channels; the centre of mass, in my case the pupil centre, is regressed from channel 1.

I use torch.sigmoid on channel 1 before giving it as input to the weighted Hausdorff loss, and a sufficiently small learning rate (5e-5) with Adam.

I observe that the loss decreases from 0.03 to 0.009 and the output of channel 1 starts to look as expected, i.e., we start seeing the expected blob. After convergence to a minimum (which happens within 1 epoch), the loss jumps to its maximum (0.1 in my case) and stays there. I checked the gradient norms and found a lot of fluctuation in their values. Furthermore, the loss is jumpy on every iteration.

Would you have an intuition about this?

Different size of test images

Hi,
The trained network detects objects only in images that have the same dimensions as the training images. Can it be modified to detect objects in test images of different dimensions?

'WeightedHausdorffDistance' object has no attribute '_backward_hooks'

Hi,

I am trying to use WHD in order to optimize a semantic segmentation model. I have written this code:

import torch

from losses import *

whd = WeightedHausdorffDistance(resized_height = 192, resized_width = 192)
prob_map = torch.rand(1, 192, 192, requires_grad = True)
prob_map.requires_grad = True
gt = [torch.randint(0, 2, (192, 192))]
orig_sizes = np.array([[192, 192]])
whd(prob_map, gt, orig_sizes)

The code gives me the following error:

AttributeError: 'WeightedHausdorffDistance' object has no attribute '_backward_hooks'

Can someone help me?

Thanks in advance
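Not an answer to the hooks error itself, but two things in the snippet differ from the input format the loss appears to expect, which may be worth ruling out first. A usage sketch under my reading of the forward() signature (untested against this exact error): gt should be a list of [N, 2] point tensors rather than [H, W] masks, and orig_sizes should be a torch tensor rather than a numpy array.

    import torch
    from losses import WeightedHausdorffDistance

    whd = WeightedHausdorffDistance(resized_height=192, resized_width=192)

    prob_map = torch.rand(1, 192, 192, requires_grad=True)  # [B, H, W] in [0, 1]
    # one [N, 2] tensor of point coordinates per batch element, not a mask
    gt = [torch.tensor([[40.0, 60.0], [100.0, 120.0]])]
    # original (H, W) of each image, as a tensor
    orig_sizes = torch.tensor([[192, 192]])

    loss = whd(prob_map, gt, orig_sizes)
    loss.backward()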

error when installing ballpark

Collecting ballpark==1.4.0 (from -r /test/locating-objects-without-bboxes-master/condaenv.ivbc642u.requirements.txt (line 1))
Cache entry deserialization failed, entry ignored
Using cached https://files.pythonhosted.org/packages/8f/5b/e259db671525c63202885c2cad5fc90cf095a65149a2ad3a7586fa73180f/ballpark-1.4.0.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-build-fxtjjwiz/ballpark/setup.py", line 5, in <module>
    long_description=open('README.rst').read(),
  File "/root/anaconda3/envs/object-locator/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 650: ordinal not in range(128)

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-fxtjjwiz/ballpark/

CondaValueError: pip returned an error

PearsonRConstantInputWarning

Hi,
while training I keep getting the following warning:
PearsonRConstantInputWarning: An input array is constant; the correlation coefficient is not defined.

Is it important?

Small issue when using plants dataset

When using the plants dataset, object-locator/data.py needs the following changes to prevent KeyErrors:

dictionary['locations'] = eval(dictionary['locations'])
to
dictionary['locations'] = eval(dictionary['plant_locations'])

and

dictionary['count'] = torch.tensor([dictionary['count']], dtype=torch.get_default_dtype())
to
dictionary['count'] = torch.tensor([dictionary['plant_count']], dtype=torch.get_default_dtype())

I didn't make a PR because I figured you'd probably want to change it in some way that generalizes to all datasets, which I'm not aware of.
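For example, something like this could generalize it (a sketch; the candidate-key lists are assumptions based only on the plants dataset):

    # Sketch for data.py's __getitem__: try the generic column names first,
    # then known dataset-specific variants such as the plants dataset's.
    def get_first_key(dictionary, candidates):
        for key in candidates:
            if key in dictionary:
                return dictionary[key]
        raise KeyError('none of {} found in groundtruth row'.format(candidates))

    locations = eval(get_first_key(dictionary, ['locations', 'plant_locations']))
    count = get_first_key(dictionary, ['count', 'plant_count'])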

Cheers, thanks for the great repo!
