All else being the same, compiling ~100 modules takes ~4-5s when using python3.9, but

Investigate lengthy data collection in python 3.10 (ml-compiler-opt, CLOSED)

mtrofin commented on May 2, 2024

Investigate lengthy data collection in python 3.10

from ml-compiler-opt.

Comments (6)

mtrofin commented on May 2, 2024

So far, using viztracer, I was able to pinpoint protobuf serialization as the very likely root cause. The protobuf package is at the same version in both setups, though, so I'm looking further.
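For readers who want to reproduce this kind of investigation, here is a minimal sketch of how a hot spot such as serialization can be surfaced with a profiler. It uses the standard-library cProfile rather than viztracer so that it runs without extra packages, and `serialize_records`, `collect`, and `profile_top` are hypothetical stand-ins, not functions from ml-compiler-opt or protobuf.

```python
import cProfile
import io
import pstats


def serialize_records(records):
    # Hypothetical stand-in for the protobuf serialization step; it only
    # encodes the repr of each record so the sketch has no dependencies.
    return [repr(record).encode("utf-8") for record in records]


def collect(records):
    # Hypothetical stand-in for one data collection pass.
    return serialize_records(records)


def profile_top(func, *args, limit=5):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so expensive call chains appear first.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()


result, report = profile_top(collect, list(range(1000)))
print(report)
```

Running the same script under both interpreter versions and comparing the cumulative time of the serialization entry is one way to check whether that step explains the gap.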

mtrofin commented on May 2, 2024

It seems that updating to protobuf==3.19.5 addresses that. I would prefer to postpone changing the version in requirements.txt until we bump the tf version. This is also another reason to move away from protobuf altogether.

boomanaiden154 commented on May 2, 2024

I believe the protobuf version has already been updated to 3.19.5 in the version of TensorFlow that we're using (or the constraint is just set to > 3.x.x && < 4.0.0, and the resolved version was lower the last time the lockfile was generated). When I regenerated the lockfile in #128, it automatically picked up 3.19.5, since we didn't really touch any of the dependency versions when we originally upgraded to TensorFlow nightly (#118).

mtrofin commented on May 2, 2024

Huh. So I must have quite a messed-up local setup.

boomanaiden154 commented on May 2, 2024

It's not your local setup. The protobuf version in requirements.txt is still set to the lower version in main.
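A quick way to spot this kind of mismatch is to compare the pin in requirements.txt with what is actually installed. The following is a rough sketch using only the standard library; the helper names and the sample requirements text are illustrative assumptions, and the version parsing is deliberately simplistic (it only looks at the first three numeric components).

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '3.19.5' into (3, 19, 5), keeping at most three numeric parts."""
    return tuple(int(number) for number in re.findall(r"\d+", text)[:3])


def pinned_version(requirements_text, package):
    """Return the version from a 'package==X.Y.Z' line, or None if absent."""
    pattern = re.compile(rf"\s*{re.escape(package)}\s*==\s*([^\s;#]+)")
    for line in requirements_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def installed_at_least(package, minimum):
    """True/False if the package is installed, None if it is not."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return parse_version(installed) >= parse_version(minimum)


# Illustrative requirements text, not the real file from the repository.
sample_requirements = "protobuf==3.19.5\n"
pin = pinned_version(sample_requirements, "protobuf")
print("pinned:", pin)
print("installed >= 3.19.5:", installed_at_least("protobuf", "3.19.5"))
```

In practice one would read the real requirements.txt from disk and run the check inside each virtual environment being compared.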

boomanaiden154 commented on May 2, 2024

Testing after #189:
Default trace over an LLVM corpus, default settings (Python 3.8.10):

real	1m43.543s
user	120m39.134s
sys	4m22.878s

Warmstart:

real	1m32.171s
user	4m54.158s
sys	3m45.621s

PPO training, default settings (except setting num_policy_iterations):

real	6m42.992s
user	219m37.325s
sys	40m12.776s

Default trace over an LLVM corpus, default settings (Python 3.10.6):

real	1m41.111s
user	121m11.636s
sys	2m33.097s

Warmstart:

real	1m30.252s
user	5m6.882s
sys	3m32.267s

PPO training, default settings (except setting num_policy_iterations):

real	6m32.612s
user	199m24.214s
sys	34m36.932s

While these runs aren't perfectly controlled to just the data collection step (although the default trace runs should come close), they do show that the performance is pretty comparable between 3.10 and previous versions now. I'm seeing similar times for module compilation between both versions and about the performance I'd "expect" given timings/performance that I've seen before.
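To make the comparison concrete, the wall-clock ("real") figures above can be converted to seconds and compared directly. This is a small sketch; `to_seconds` is a helper introduced here for illustration, and the inputs are copied from the timings in this comment.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style value such as '1m43.543s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# (Python 3.8.10, Python 3.10.6) wall-clock times from the runs above.
real_times = {
    "default trace": ("1m43.543s", "1m41.111s"),
    "warmstart": ("1m32.171s", "1m30.252s"),
    "PPO training": ("6m42.992s", "6m32.612s"),
}

for name, (py38, py310) in real_times.items():
    old, new = to_seconds(py38), to_seconds(py310)
    change = (new - old) / old * 100
    print(f"{name}: {old:.1f}s -> {new:.1f}s ({change:+.1f}%)")
```

On these numbers the 3.10.6 runs come out about 2 to 3 percent faster in wall-clock time, which is well within what one would expect from run-to-run noise.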

I compiled some documentation on what I did in this gist. Given that there no longer seems to be a performance difference between the versions, I'm going to close this issue.
