Comments (4)
Thanks for your report! Could you please share your minimal reproducible example?
from pytorch.
@shink I wish I could isolate the issue better.
I get a similar Caffe2/aten issue when compiling with GCC 14.1.0, too:
/usr/bin/cmake: symbol lookup error: /usr/lib64/libstdc++.so.6: undefined symbol: _ZNKSt7__cxx110messagesIcE7do_openERKNS_12basic_stringIcSt11char_traitsIcESaIcEEERKSt6locale, version GLIBCXX_3.4.21
make[2]: *** [caffe2/CMakeFiles/ATEN_CUDA_FILES_GEN_TARGET.dir/build.make:7118: aten/src/ATen/ops/bitwise_right_shift_cpu_dispatch.h] Error 127
make[2]: *** Deleting file 'aten/src/ATen/ops/bitwise_right_shift_cpu_dispatch.h'
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:1122: caffe2/CMakeFiles/ATEN_CUDA_FILES_GEN_TARGET.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
I'm not sure why it says GLIBCXX_3.4.21; my glibc version is 2.39.
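For what it's worth, GLIBCXX_3.4.21 is a libstdc++ symbol-version node, not a glibc version, so the two numbers are unrelated. A quick way to check which GLIBCXX nodes the libstdc++ picked up at runtime actually exports is sketched below (this assumes `gcc` is on `PATH` and that `strings` from binutils is installed):

```shell
# Locate the libstdc++ that gcc itself would link against, then list the
# GLIBCXX version nodes it exports. The "undefined symbol ... version
# GLIBCXX_3.4.21" error above means the libstdc++.so.6 resolved at runtime
# is missing that node (it was introduced with GCC 5.1).
libstdcxx="$(gcc -print-file-name=libstdc++.so.6)"
strings -a "$libstdcxx" | grep -o 'GLIBCXX_[0-9][0-9.]*' | sort -Vu
```

If GLIBCXX_3.4.21 is missing from the output, a different (older) libstdc++.so.6 is likely shadowing the system one, e.g. via LD_LIBRARY_PATH or a bundled toolchain.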
Updated versions from collect_env.py:
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A
OS: Slackware Linux (x86_64)
GCC version: (GCC) 14.1.0
Clang version: 18.1.5
CMake version: version 3.29.3
Libc version: glibc-2.39
Python version: 3.11.9 (main, Apr 2 2024, 13:43:44) [GCC 13.2.0] (64-bit runtime)
Python platform: Linux-6.9.0-x86_64-AMD_Ryzen_Threadripper_2990WX_32-Core_Processor-with-glibc2.39
Is CUDA available: N/A
CUDA runtime version: 12.4.99
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: Quadro RTX 4000
Nvidia driver version: 550.54.14
cuDNN version: Probably one of the following:
/usr/share/cuda/lib64/libcudnn.so.9.1.1
/usr/share/cuda/lib64/libcudnn_adv.so.9.1.1
/usr/share/cuda/lib64/libcudnn_cnn.so.9.1.1
/usr/share/cuda/lib64/libcudnn_engines_precompiled.so.9.1.1
/usr/share/cuda/lib64/libcudnn_engines_runtime_compiled.so.9.1.1
/usr/share/cuda/lib64/libcudnn_graph.so.9.1.1
/usr/share/cuda/lib64/libcudnn_heuristic.so.9.1.1
/usr/share/cuda/lib64/libcudnn_ops.so.9.1.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 43 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Vendor ID: AuthenticAMD
BIOS Vendor ID: Advanced Micro Devices, Inc.
Model name: AMD Ryzen Threadripper 2990WX 32-Core Processor
BIOS Model name: AMD Ryzen Threadripper 2990WX 32-Core Processor Unknown CPU @ 3.0GHz
BIOS CPU family: 107
CPU family: 23
Model: 8
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 1
Stepping: 2
Frequency boost: enabled
CPU(s) scaling MHz: 74%
CPU max MHz: 3000.0000
CPU min MHz: 2200.0000
BogoMIPS: 5999.96
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca sev sev_es
Virtualization: AMD-V
L1d cache: 1 MiB (32 instances)
L1i cache: 2 MiB (32 instances)
L2 cache: 16 MiB (32 instances)
L3 cache: 64 MiB (8 instances)
NUMA node(s): 4
NUMA node0 CPU(s): 0-7,32-39
NUMA node1 CPU(s): 16-23,48-55
NUMA node2 CPU(s): 8-15,40-47
NUMA node3 CPU(s): 24-31,56-63
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Mitigation; untrained return thunk; SMT vulnerable
Vulnerability Spec rstack overflow: Mitigation; Safe RET
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines; IBPB conditional; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] flake8==7.0.0
[pip3] numpy==1.26.3
[conda] Could not collect
After removing the -fPIC flag, the build continued further, but I encountered the same issue again:
Fatal glibc error: malloc.c:4376 (_int_malloc): assertion failed: (unsigned long) (size) >= (unsigned long) (nb)
during GIMPLE pass: dce
/tmp/SBo/pytorch-v2.3.0/aten/src/ATen/FunctionalInverses.cpp: In static member function 'static at::Tensor at::functionalization::FunctionalInverses::_nested_get_values_inverse(const at::Tensor&, const at::Tensor&, at::functionalization::InverseReturnMode)':
/tmp/SBo/pytorch-v2.3.0/aten/src/ATen/FunctionalInverses.cpp:315:8: internal compiler error: Aborted
315 | Tensor FunctionalInverses::_nested_get_values_inverse(const Tensor& base, const Tensor& mutated_view, InverseReturnMode inverse_return_mode) {
| ^~~~~~~~~~~~~~~~~~
0x1fc8df8 internal_error(char const*, ...)
???:0
0x7feb1f696aab __pthread_kill
???:0
0x7feb1f642e11 __GI_raise
???:0
0x7feb1f62849e abort
???:0
0x7feb1f6292c9 __libc_message_impl.cold
???:0
0x7feb1f639e02 __libc_assert_fail
???:0
0x7feb1f6a3c84 _int_malloc
???:0
0x7feb1f6a3f51 _int_realloc
???:0
0x7feb1f6a51a5 __libc_realloc
???:0
0x2058b30 xrealloc
???:0
0xacce0e get_dominated_to_depth(cdi_direction, basic_block_def*, int)
???:0
0xacce9a get_all_dominated_blocks(cdi_direction, basic_block_def*)
???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
It seems to be a compiler bug.
There is a GCC bug report related to compiling GridSamplerKernel.cpp for which the suggested workaround was -fno-strict-aliasing, but that didn't help in my case.
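For anyone who wants to try the same workaround, one way to feed the flag into PyTorch's CMake-based build is through the standard CFLAGS/CXXFLAGS environment variables before invoking the build (a sketch; the rebuild command itself is shown commented out because it is long-running and system-specific):

```shell
# Append -fno-strict-aliasing to the C/C++ flags that PyTorch's CMake
# build reads from the environment, then rebuild from a clean build tree.
export CFLAGS="${CFLAGS:-} -fno-strict-aliasing"
export CXXFLAGS="${CXXFLAGS:-} -fno-strict-aliasing"
# rm -rf build && python setup.py develop   # commented: full rebuild
echo "CXXFLAGS now: $CXXFLAGS"
```

Note that the flags only take effect if CMake reconfigures, hence the suggestion to start from a clean build directory.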
I was encountering these compiler errors because of a hardware issue: my DRAM clock was set too high.