Comments (6)

kishwarshafin commented on September 25, 2024

I remembered I have multi-GPU functionality for bonito:

https://github.com/kishwarshafin/bonito/blob/nanoporetech-master/bonito/basecaller_distributed.py

iiSeymour commented on September 25, 2024

Hey @noncodo

Yes, it's possible and easy to add; see the patch below.

+++ b/bonito/train.py
@@ -69,6 +69,12 @@ def main(args):
             print("* error: Cannot use AMP: Apex package needs to be installed manually, See https://github.com/NVIDIA/apex")
             exit(1)
 
+    if args.multi_gpu:
+        from torch.nn import DataParallel
+        model = DataParallel(model)
+        model.stride = model.module.stride
+        model.alphabet = model.module.alphabet
+
     schedular = CosineAnnealingLR(optimizer, args.epochs * len(train_loader))
 
     log_interval = np.floor(len(train_dataset) / args.batch * 0.05)
@@ -80,7 +86,15 @@ def main(args):
             log_interval, model, device, train_loader, optimizer, epoch, use_amp=args.amp
         )
         test_loss, mean, median = test(model, device, test_loader)
+
+        if args.multi_gpu:
+            state = model.module.state_dict()
+        else:
+            state = model.state_dict()
+
-        torch.save(model.state_dict(), os.path.join(workdir, "weights_%s.tar" % epoch))
+        # save the model state (unwrapped from DataParallel when enabled)
+        torch.save(state, os.path.join(workdir, "weights_%s.tar" % epoch))
+
         with open(os.path.join(workdir, 'training.csv'), 'a', newline='') as csvfile:
             csvw = csv.writer(csvfile, delimiter=',')
             if epoch == 1:
@@ -111,6 +125,7 @@ def argparser():
     parser.add_argument("--batch", default=32, type=int)
     parser.add_argument("--chunks", default=1000000, type=int)
     parser.add_argument("--validation_split", default=0.99, type=float)
+    parser.add_argument("--multi-gpu", action="store_true", default=False)
     parser.add_argument("--amp", action="store_true", default=False)
     parser.add_argument("-f", "--force", action="store_true", default=False)
     return parser

I found DataParallel could hang on multi-GPU systems without NVLink/NVSwitch, so I haven't merged it yet. Guarding the import and its uses behind --multi-gpu is probably safe, so I'll look at getting this into master.

noncodo commented on September 25, 2024

Wonderful! I'll give 'er a spin on my NVLink-less system and report back. Beats the heck out of splitting fast5s into n batches.

iiSeymour commented on September 25, 2024

Oh sorry @noncodo, I just realised you are after multi-gpu inference, not training!

That will be a little more complicated as the fast5 reader, decoder and fasta writer sit in different processes and the main loop is currently set up for a single consumer.
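
For reference, a minimal sketch of one way to spread that single consumer over several GPUs: one consumer process per device, all fed from a single reader via a shared queue, with each consumer writing its own fasta so the writers never need to coordinate. load_model, decode and fast5_reads are stand-ins here, not bonito's actual API.

import torch
import torch.multiprocessing as mp


def consumer(gpu_id, queue, out_prefix):
    device = torch.device(f"cuda:{gpu_id}")
    model = load_model().to(device).eval()              # stand-in for bonito's model loading
    with open(f"{out_prefix}.gpu{gpu_id}.fasta", "w") as out, torch.no_grad():
        while True:
            item = queue.get()
            if item is None:                             # sentinel: the reader is finished
                break
            read_id, signal = item
            posteriors = model(signal.to(device).unsqueeze(0))
            out.write(">%s\n%s\n" % (read_id, decode(posteriors)))  # stand-in decoder


if __name__ == "__main__":
    mp.set_start_method("spawn")                         # safest start method with CUDA workers
    n_gpus = torch.cuda.device_count()
    queue = mp.Queue(maxsize=8 * n_gpus)
    workers = [mp.Process(target=consumer, args=(i, queue, "basecalls"))
               for i in range(n_gpus)]
    for w in workers:
        w.start()
    for read_id, signal in fast5_reads("reads/"):        # stand-in single fast5 reader
        queue.put((read_id, signal))
    for _ in workers:                                    # one sentinel per worker
        queue.put(None)
    for w in workers:
        w.join()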

kishwarshafin commented on September 25, 2024

@iiSeymour and @noncodo,

I have recently implemented multi-GPU support for HELEN (https://github.com/kishwarshafin/helen), both training and inference. You'd have to switch to DistributedDataParallel. Our DataParallel implementation gave only a modest speedup, whereas DistributedDataParallel gave us a big improvement.

The way bonito could do it is to use a dataloader that creates segments of data for each GPU to process, and have each process write its own fasta/fastq. Keep track of the names of the fasta/fastq files and concatenate them at the end, or leave it to the user to cat them.
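
A rough sketch of that per-rank pattern with torch.distributed, launched with one process per GPU (for example via torchrun); load_model, list_reads and basecall are placeholders rather than bonito or HELEN functions.

import os
import torch
import torch.distributed as dist


def main():
    dist.init_process_group("nccl")                    # reads rank/world size from the environment
    rank, world = dist.get_rank(), dist.get_world_size()
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)
    device = torch.device("cuda", local_rank)

    model = load_model().to(device).eval()             # placeholder model loading
    # for training you would additionally wrap:
    # model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])

    reads = sorted(list_reads("reads/"))               # placeholder: full list of reads
    shard = reads[rank::world]                         # each rank takes its own segment

    with open("basecalls.rank%d.fastq" % rank, "w") as out, torch.no_grad():
        for read in shard:
            out.write(basecall(model, read, device))   # placeholder basecaller

    dist.barrier()                                     # wait for every rank to finish
    # then: cat basecalls.rank*.fastq > basecalls.fastq, or leave that to the user


if __name__ == "__main__":
    main()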

I am not sure if Bonito is at that point in production; I think it is still not producing quality scores? At least the version we are working with doesn't. Let me know if you have any questions.

Distributed training script: https://github.com/kishwarshafin/helen/blob/master/helen/modules/python/models/train_distributed.py
Distributed inference script: https://github.com/kishwarshafin/helen/blob/master/helen/modules/python/models/predict_gpu.py

iiSeymour commented on September 25, 2024

For performant multi-gpu inference see https://github.com/nanoporetech/dorado
