stonesjtu / pytorch_memlab
Profiling and inspecting memory in pytorch
License: MIT License
I tried to install using pip install pytorch_memlab
or pip install git+https://github.com/stonesjtu/pytorch_memlab
and in both cases got this error:
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-sx0xf6jy/pandas/setup.py", line 333
f"{extension}-source file '{sourcefile}' not found.\n"
^
SyntaxError: invalid syntax
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-sx0xf6jy/pandas/
pytorch_memlab works excellently for gpu0; however, the memory of all tensors turns to 0 when I use gpu2, 3, and 4.
Thank you for these productive tools for the open source community!
Hi, thank you for a very useful Python library. I am just checking whether there is a way to redirect the report() output directly to a txt file, without redirecting stdout or anything similar. Or perhaps add a method that returns the report as a string, something like reporter.report() -> str.
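Until such a method exists, a stdlib workaround is to capture whatever report() prints; a minimal sketch, assuming report() writes to stdout (the MemReporter usage in the comment is illustrative, not confirmed API behavior):

```python
import io
from contextlib import redirect_stdout

def capture_report(reporter) -> str:
    """Return whatever reporter.report() prints as a string."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        reporter.report()
    return buf.getvalue()

# Hypothetical usage with pytorch_memlab:
# from pytorch_memlab import MemReporter
# with open("report.txt", "w") as f:
#     f.write(capture_report(MemReporter()))
```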
I'm developing a custom network layer, and it consequently has many unnamed tensors in the MemReporter output. Below is a snippet example:
pcconv5_1.weight_net.layers.4.weight (1, 64) 512.00B
pcconv5_1.weight_net.layers.4.bias (1,) 512.00B
pcconv5_1.feature_layer.weight (1, 32) 512.00B
Tensor0 (1, 10, 10, 10, 8) 31.50K
Tensor1 (1, 8, 3) 512.00B
Tensor2 (1, 10, 10, 10, 12) 47.00K
Tensor3 (1, 12, 3) 512.00B
Tensor4 (1, 8, 8) 512.00B
Tensor5 (1, 8, 1, 1, 1, 8, 5) 1.50K
Tensor6 (1, 8, 1, 1, 1, 8, 32) 8.00K
Tensor7 (1, 8, 1, 1, 1, 8, 32) 8.00K
Tensor8 (1, 8, 1, 1, 1, 8, 64) 16.00K
Tensor9 (1, 8, 1, 1, 1, 8, 64) 16.00K
Tensor10(->Tensor4) (1, 8, 1, 1, 1, 8, 1) 0.00B
Is there any way to name or label these tensors (Tensor0 through Tensor10, for example) so that I can more easily determine which operation creates them?
Hi, thanks for this awesome package; it has indeed helped me a lot.
However, when I try to run some basic code from the Jupyter demo you provide,
I get an error [TypeError: can only concatenate tuple (not "list") to tuple] and do not know how to solve the issue.
My environment settings are:
Do you have any idea how to solve this issue? Thanks for your attention; I hope to get your reply soon.
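For what it's worth, the TypeError itself is ordinary Python behavior: a tuple and a list cannot be concatenated directly. A minimal reproduction, independent of pytorch_memlab internals:

```python
t = (1, 2)
lst = [3, 4]
try:
    t + lst  # mixing sequence types is rejected
except TypeError as e:
    print(e)  # can only concatenate tuple (not "list") to tuple

# Converting one side fixes it:
combined = t + tuple(lst)
print(combined)  # (1, 2, 3, 4)
```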
Hi, I was just wondering what this message in the MemReporter output means:
Total Tensors: 266979334 Used Memory: 924.71M
The allocated memory on cuda:0: 1.31G
Memory differs due to the matrix alignment or invisible gradient buffer tensors
Also, what is the difference between Used Memory and allocated memory?
Many thanks
Like https://pypi.org/project/memory-profiler/
do you support profiling and printing the memory usage of each line while training?
Hi, when I try to use the LineProfiler example, the following error message pops up:
>>> import torch
>>> from pytorch_memlab import LineProfiler
>>> def inner():
...     torch.nn.Linear(100, 100).cuda()
...
>>> def outer():
...     linear = torch.nn.Linear(100, 100).cuda()
...     linear2 = torch.nn.Linear(100, 100).cuda()
...     inner()
...
>>> with LineProfiler(outer, inner) as prof:
...     outer()
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/wyuancs/miniconda3/envs/response-selection/lib/python3.8/site-packages/pytorch_memlab/line_profiler/line_profiler.py", line 45, in __init__
self.add_function(func)
File "/home/wyuancs/miniconda3/envs/response-selection/lib/python3.8/site-packages/pytorch_memlab/line_profiler/line_profiler.py", line 59, in add_function
first_line = inspect.getsourcelines(func)[1]
File "/home/wyuancs/miniconda3/envs/response-selection/lib/python3.8/inspect.py", line 979, in getsourcelines
lines, lnum = findsource(object)
File "/home/wyuancs/miniconda3/envs/response-selection/lib/python3.8/inspect.py", line 798, in findsource
raise OSError('could not get source code')
OSError: could not get source code
How could I solve this problem? I tried to search online but could not find any solution.
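For context, inspect.getsourcelines needs a source file behind the function, which code typed into the interactive interpreter does not have; a stdlib reproduction of the same OSError (assuming that is what is happening here):

```python
import inspect

# A function compiled from a string, like one defined at the >>> prompt,
# has no source file for inspect to read back.
ns = {}
exec("def f():\n    pass", ns)
try:
    inspect.getsourcelines(ns["f"])
except OSError as e:
    print(e)  # could not get source code
```

Putting the demo in a .py file and running it with `python demo.py` avoids this failure mode.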
What do the negative values mean? For example below is part of what I captured when running some code:
Line # Max usage Peak usage diff max diff peak Line Contents
===============================================================
56 @profile
58 def main():
59 0.00B 0.00B -4.14G -4.64G dtype, multidevice, backward_device = setup_gpu()
Hello,
Firstly, congratulations on memlab. I have been trying to use it in Google Colab, but sometimes this error happens:
ReferenceError Traceback (most recent call last)
in <module>()
33 print('Reporter!!!!!!!')
34 reporter = MemReporter()
---> 35 reporter.report()
2 frames
/usr/local/lib/python3.7/dist-packages/pytorch_memlab/mem_reporter.py in <listcomp>(.0)
62 #FIXME: make the grad tensor collected by gc
63 objects = gc.get_objects()
---> 64 tensors = [obj for obj in objects if isinstance(obj, torch.Tensor)]
65 for t in tensors:
66 self.device_mapping[t.device].append(t)
ReferenceError: weakly-referenced object no longer exists
In my code, I use MemReporter() right after the training phase, i.e.:
for epoch in range(num_epochs):
    net.train()
    ...
# end of training
print('Reporter!!!!!!!')
reporter = MemReporter()
reporter.report()
Do you know what the problem is?
Thank you and regards.
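The error can be reproduced without torch at all: touching a weakref.proxy whose referent is gone raises ReferenceError, which is presumably what the isinstance check inside MemReporter hits on a stale object returned by gc.get_objects:

```python
import weakref

class Thing:
    pass

obj = Thing()
proxy = weakref.proxy(obj)
del obj  # the referent is now gone

try:
    isinstance(proxy, Thing)  # looking up __class__ dereferences the dead proxy
except ReferenceError as e:
    print(e)  # weakly-referenced object no longer exists
```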
Hi,
Thanks for the super useful package. Currently, it seems it is not possible to use it in a Jupyter notebook. If I create a new notebook and add a cell with the contents
import torch
from pytorch_memlab import profile

@profile
def work():
    linear = torch.nn.Linear(100, 100).cuda()
    linear2 = torch.nn.Linear(100, 100).cuda()
    linear3 = torch.nn.Linear(100, 100).cuda()

work()
I get no results printed. However, when I use MemReporter, I do get results:
import torch
from pytorch_memlab import MemReporter
linear = torch.nn.Linear(1024, 1024).cuda()
reporter = MemReporter()
reporter.report()
Element type Size Used MEM
-------------------------------------------------------------------------------
Storage on cuda:0
Parameter0 (1024, 1024) 4.00M
Parameter1 (1024,) 4.00K
Parameter2 (1024, 1024) 4.00M
Parameter3 (1024,) 4.00K
-------------------------------------------------------------------------------
Total Tensors: 2099200 Used Memory: 8.01M
The allocated memory on cuda:0: 8.01M
-------------------------------------------------------------------------------
Additionally, I wonder if it is possible to add a line magic, similar to %lprun, for profiling cells.
The variable tensor_names is referenced before assignment if no model is passed into the MemReporter. It needs to be moved into the if statement above.
When I attempt to import anything from pytorch_memlab on a Google Colab CPU instance, I get the following error:
Could not reset CUDA stats and cache: 'NoneType' object has no attribute 'lower'
I have a pl.LightningModule (pytorch-lightning) that includes many nn.Modules.
It's not obvious from the documentation how I can profile all the LightningModule tensors and the subordinate Module tensors. Could you please provide an example?
Hello,
This is probably a naive question. I would like to turn the decorator on and off without commenting things out. What would be an elegant way to achieve this? I have something like the following in mind, but I don't know how to implement it. Would you like to share some thoughts on this? Many thanks!
profile_flag = True  # False

@profile(profile_flag)
def func1():
    ...

@profile(profile_flag)
def func2():
    ...
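One stdlib pattern for this is a wrapper that returns the real decorator only when the flag is set; `profile` in the comment stands for pytorch_memlab's decorator, but any decorator works (this is a sketch, not library API):

```python
def toggle(decorator, enabled: bool):
    """Return `decorator` when enabled, else a no-op decorator."""
    if enabled:
        return decorator
    return lambda func: func

# Hypothetical usage:
# from pytorch_memlab import profile
# PROFILE = True
# @toggle(profile, PROFILE)
# def func1():
#     ...
```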
This is a very interesting library and will hopefully help with my constant OOM errors. Beyond profiling memory usage, does memlab offer a way to clear GPU memory (if we find the GPU holding onto tensors after execution)? Thanks!
The demo code does not work with this version of PyTorch; no output is printed:
import torch
from pytorch_memlab import LineProfiler

def inner():
    torch.nn.Linear(100, 100).cuda()

def outer():
    linear = torch.nn.Linear(100, 100).cuda()
    linear2 = torch.nn.Linear(100, 100).cuda()
    inner()

with LineProfiler(outer, inner) as prof:
    outer()
prof.display()
The tensor_name and tensor_device mapping dicts maintain references to their key/value tensors, which prevents Python from collecting unused tensors.
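A common fix for this pattern is a weak mapping, which does not keep its values alive; a minimal stdlib illustration (not the library's actual code):

```python
import gc
import weakref

class FakeTensor:  # stand-in for torch.Tensor
    pass

strong = {}
weak = weakref.WeakValueDictionary()

a, b = FakeTensor(), FakeTensor()
strong["a"] = a
weak["b"] = b

del a, b
gc.collect()

print("a" in strong)  # True: the plain dict pins its object in memory
print("b" in weak)    # False: the weak mapping let it be collected
```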
Great repository! Thanks for creating this :)
Just a quick question: does pytorch_memlab track memory usage via torch.autograd.profiler? Thanks a lot! Just want to clarify this to see if I need to use both :)
I need to show that a technique called gradient checkpointing can really save GPU memory during backward propagation. In the results there are two columns on the left showing active_bytes and reserved_bytes. In my testing, while active bytes read 3.83G, reserved bytes read 9.35G. So why does PyTorch still reserve that much GPU memory?
I added the @profile decorator to a custom module method (after from pytorch_memlab import LineProfiler, profile), and the output is a truncated display with the error KeyError: "['XXX'] not in index" for my method XXX. What does this error mean?
I'm trying to understand why there is a large discrepancy between the 'Used Memory' / 'allocated memory on cuda:0' reported by MemReporter and the memory usage reported by nvtop (or nvidia-smi). For example, while training a model (RetinaNet from detectron2, for context), I'm seeing ~285M from MemReporter and ~15G from nvtop/nvidia-smi.
Is this all due to the autograd graph? I've been trying to read more about this but haven't found good references.
Thanks for your work on this library, and any pointers you can share about this!
This is a great tool for finding where the memory has gone - thank you!
I have a request:
Problem:
Memory is being reported incorrectly for any loop or function: since those run multiple times, the output doesn't show how the peak/active memory counters progressed in, say, the first iteration; it shows the data for the whole loop/function after it has run more than once. It's correct for the final iteration, but not the first one, and it's crucial to see the first moment memory usage went up and where it peaked.
This functionality is typically not needed in a normal memory profiler, since all we want to know there is the frequency of calls and the total usage for a given line; but here, as an investigation tool, we need to see the first few uses. I hope I was able to convey the issue clearly.
I tried to work around this manually by unrolling the loop in the code I was profiling and replicating the iteration code multiple times, which is not very sustainable.
It's also typical that the memory footprint changes from iteration 1 to 2 and then stabilizes at iteration 3 and onward (if there is no leak, that is). So there could probably be an option to record and print 3 different stats:
The same applies to functions.
I'm thinking the low-hanging fruit is perhaps to give users an option to record a loop iteration or function only the first time it runs and report that. That alone would already be very useful and perhaps not too difficult to implement.
Thank you!
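The first-iteration-only option could plausibly be approximated in user code today with a wrapper that routes only the first call through the profiling decorator (a sketch; `profile` from pytorch_memlab is the assumed decorator, and any decorator works):

```python
import functools

def profile_first_call(decorator):
    """Send only the first invocation through `decorator`;
    later calls run the plain function."""
    def wrap(func):
        decorated = decorator(func)
        state = {"seen": False}
        @functools.wraps(func)
        def inner(*args, **kwargs):
            if not state["seen"]:
                state["seen"] = True
                return decorated(*args, **kwargs)
            return func(*args, **kwargs)
        return inner
    return wrap

# Hypothetical usage:
# from pytorch_memlab import profile
# @profile_first_call(profile)
# def train_step(batch):
#     ...
```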
I want to measure the max memory used during the execution of a script.
I think doing this naively will count max reserved memory, while I want the max memory actually used.
I do not need any further details.
Will this add overhead? I want to do this on scripts that take hours to finish.
I'm using MemReporter. After running reporter.report(), I can't tell which parts changed, since there are a lot of layers and tensors in the output. I'm wondering if there is a way to find the diff easily.
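Until the library offers this, two captured report strings can be compared with the stdlib; a sketch, assuming the report text has already been captured as strings (e.g. via contextlib.redirect_stdout):

```python
import difflib

def report_diff(before: str, after: str) -> str:
    """Unified diff between two report outputs captured as strings."""
    return "\n".join(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm=""))

print(report_diff("Tensor0 1.0M\nTensor1 2.0M",
                  "Tensor0 1.0M\nTensor1 3.0M"))
```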
From the readme:
You can also filter the device to report on by passing extra arguments:
report(device=torch.device(0))
The device parameter seems to work with neither the report decorator nor the MemReporter. If this feature is still available, the readme should be clarified.
Hi,
Thanks a lot for providing this very helpful library.
I have a question about Used Memory and GPU memory.
I followed your code to get the Used Memory of my model for one batch (size: (16, 3, 224, 224)): it is 928.02M. But the same code for the same model could not run on a 2070 Super GPU (8 GiB capacity).
928.02M vs 8 GiB: what is the difference between the Used Memory in your code and GPU memory? Thanks.
I installed pytorch_memlab with pip3 install pytorch_memlab, and I get this error when trying to set the target GPU:
from pytorch_memlab import profile, set_target_gpu
ImportError: cannot import name 'set_target_gpu'