
Comments (5)

ezyang commented on July 18, 2024

For reference, the error is:

2024-05-31T23:12:43.7830534Z =========================== short test summary info ============================
2024-05-31T23:12:43.7832053Z FAILED test/integration/test_integration.py::TestSubclass::test_int8_dynamic_quant_subclass_api_1_cpu - torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
2024-05-31T23:12:43.7833246Z CppCompileError: C++ compile error
2024-05-31T23:12:43.7833517Z 
2024-05-31T23:12:43.7833621Z Command:
2024-05-31T23:12:43.7840753Z g++ /tmp/torchinductor_root/ag/cag5cdm6uh26pig7xgkfwgbqqh377nc7ldeqen544wvh5totpuza.cpp -shared -fPIC -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -D_GLIBCXX_USE_CXX11_ABI=0 -I/opt/conda/envs/venv/lib/python3.9/site-packages/torch/include -I/opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/TH -I/opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/THC -I/opt/conda/envs/venv/include/python3.9 -L/opt/conda/envs/venv/lib/python3.9/site-packages/torch/lib -L/opt/conda/envs/venv/lib -L/opt/conda/envs/venv/lib/python3.9/site-packages/torch/lib -ltorch -ltorch_cpu -lgomp -ltorch_python -lc10 -mavx512f -mavx512dq -mavx512vl -mavx512bw -mfma -DCPU_CAPABILITY_AVX512 -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -fopenmp -D C10_USING_CUSTOM_GENERATED_MACROS -o /tmp/torchinductor_root/ag/cag5cdm6uh26pig7xgkfwgbqqh377nc7ldeqen544wvh5totpuza.so
2024-05-31T23:12:43.7847453Z 
2024-05-31T23:12:43.7847561Z Output:
2024-05-31T23:12:43.7848972Z /tmp/torchinductor_root/ag/cag5cdm6uh26pig7xgkfwgbqqh377nc7ldeqen544wvh5totpuza.cpp: In function ‘void kernel(half*, const half*, const int8_t*, const int64_t*, const half*, half*, half*, half*, long int)’:
2024-05-31T23:12:43.7851691Z /tmp/torchinductor_root/ag/cag5cdm6uh26pig7xgkfwgbqqh377nc7ldeqen544wvh5totpuza.cpp:77:36: error: no match for ‘operator*’ (operand types are ‘at::vec::CPU_CAPABILITY::Vectorized<int>’ and ‘at::vec::CPU_CAPABILITY::Vectorized<float>’)
2024-05-31T23:12:43.7853239Z    77 |                 auto tmp40 = tmp37 * tmp39;
2024-05-31T23:12:43.7853703Z       |                              ~~~~~ ^ ~~~~~
2024-05-31T23:12:43.7854122Z       |                              |       |
2024-05-31T23:12:43.7854580Z       |                              |       Vectorized<float>
2024-05-31T23:12:43.7855061Z       |                              Vectorized<int>
2024-05-31T23:12:43.7856049Z In file included from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec512/vec512.h:8,
2024-05-31T23:12:43.7857334Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec.h:4,
2024-05-31T23:12:43.7858557Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/functional_base.h:6,
2024-05-31T23:12:43.7859815Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/functional.h:3,
2024-05-31T23:12:43.7861053Z                  from /tmp/torchinductor_root/sk/cskh5dx62fglpphcrl6723dnmowdabouerrzy3dmqcngbxwfa7bv.h:35,
2024-05-31T23:12:43.7862212Z                  from /tmp/torchinductor_root/ag/cag5cdm6uh26pig7xgkfwgbqqh377nc7ldeqen544wvh5totpuza.cpp:2:
2024-05-31T23:12:43.7865215Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec_base.h:629:41: note: candidate: ‘template<class T> at::vec::CPU_CAPABILITY::Vectorized<T> at::vec::CPU_CAPABILITY::operator*(const at::vec::CPU_CAPABILITY::Vectorized<T>&, const at::vec::CPU_CAPABILITY::Vectorized<T>&)’
2024-05-31T23:12:43.7867719Z   629 | template <class T> Vectorized<T> inline operator*(const Vectorized<T> &a, const Vectorized<T> &b) {
2024-05-31T23:12:43.7868487Z       |                                         ^~~~~~~~
2024-05-31T23:12:43.7869739Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec_base.h:629:41: note:   template argument deduction/substitution failed:
2024-05-31T23:12:43.7871736Z /tmp/torchinductor_root/ag/cag5cdm6uh26pig7xgkfwgbqqh377nc7ldeqen544wvh5totpuza.cpp:77:38: note:   deduced conflicting types for parameter ‘T’ (‘int’ and ‘float’)
2024-05-31T23:12:43.7872918Z    77 |                 auto tmp40 = tmp37 * tmp39;
2024-05-31T23:12:43.7873378Z       |                                      ^~~~~
2024-05-31T23:12:43.7874349Z In file included from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec_base.h:1132,
2024-05-31T23:12:43.7875777Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec512/vec512.h:8,
2024-05-31T23:12:43.7876969Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec.h:4,
2024-05-31T23:12:43.7878188Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/functional_base.h:6,
2024-05-31T23:12:43.7879447Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/functional.h:3,
2024-05-31T23:12:43.7880709Z                  from /tmp/torchinductor_root/sk/cskh5dx62fglpphcrl6723dnmowdabouerrzy3dmqcngbxwfa7bv.h:35,
2024-05-31T23:12:43.7881848Z                  from /tmp/torchinductor_root/ag/cag5cdm6uh26pig7xgkfwgbqqh377nc7ldeqen544wvh5totpuza.cpp:2:
2024-05-31T23:12:43.7884406Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec_n.h:316:37: note: candidate: ‘template<class T, int N> at::vec::CPU_CAPABILITY::VectorizedN<T, N> at::vec::CPU_CAPABILITY::operator*(const at::vec::CPU_CAPABILITY::VectorizedN<T, N>&, const at::vec::CPU_CAPABILITY::VectorizedN<T, N>&)’
2024-05-31T23:12:43.7886385Z   316 | VECTORIZEDN_DEFINE_BINARY_OP_GLOBAL(operator*)
2024-05-31T23:12:43.7886877Z       |                                     ^~~~~~~~
2024-05-31T23:12:43.7888170Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec_n.h:297:28: note: in definition of macro ‘VECTORIZEDN_DEFINE_BINARY_OP_GLOBAL’
2024-05-31T23:12:43.7889512Z   297 |   inline VectorizedN<T, N> op(                                                 \
2024-05-31T23:12:43.7890106Z       |                            ^~
2024-05-31T23:12:43.7891273Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec_n.h:316:37: note:   template argument deduction/substitution failed:
2024-05-31T23:12:43.7892428Z   316 | VECTORIZEDN_DEFINE_BINARY_OP_GLOBAL(operator*)
2024-05-31T23:12:43.7892924Z       |                                     ^~~~~~~~
2024-05-31T23:12:43.7894198Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec_n.h:297:28: note: in definition of macro ‘VECTORIZEDN_DEFINE_BINARY_OP_GLOBAL’
2024-05-31T23:12:43.7895523Z   297 |   inline VectorizedN<T, N> op(                                                 \
2024-05-31T23:12:43.7896112Z       |                            ^~
2024-05-31T23:12:43.7897781Z /tmp/torchinductor_root/ag/cag5cdm6uh26pig7xgkfwgbqqh377nc7ldeqen544wvh5totpuza.cpp:77:38: note:   ‘at::vec::CPU_CAPABILITY::Vectorized<int>’ is not derived from ‘const at::vec::CPU_CAPABILITY::VectorizedN<T, N>’
2024-05-31T23:12:43.7899201Z    77 |                 auto tmp40 = tmp37 * tmp39;
2024-05-31T23:12:43.7899663Z       |                                      ^~~~~
2024-05-31T23:12:43.7900660Z In file included from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec512/vec512.h:10,
2024-05-31T23:12:43.7901934Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec.h:4,
2024-05-31T23:12:43.7903154Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/functional_base.h:6,
2024-05-31T23:12:43.7904587Z                  from /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/functional.h:3,
2024-05-31T23:12:43.7921001Z                  from /tmp/torchinductor_root/sk/cskh5dx62fglpphcrl6723dnmowdabouerrzy3dmqcngbxwfa7bv.h:35,
2024-05-31T23:12:43.7922202Z                  from /tmp/torchinductor_root/ag/cag5cdm6uh26pig7xgkfwgbqqh377nc7ldeqen544wvh5totpuza.cpp:2:
2024-05-31T23:12:43.7925013Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec512/vec512_bfloat16.h:790:29: note: candidate: ‘at::vec::CPU_CAPABILITY::Vectorized<c10::BFloat16> at::vec::CPU_CAPABILITY::operator*(const at::vec::CPU_CAPABILITY::Vectorized<c10::BFloat16>&, const at::vec::CPU_CAPABILITY::Vectorized<c10::BFloat16>&)’
2024-05-31T23:12:43.7927404Z   790 | Vectorized<BFloat16> inline operator*(const Vectorized<BFloat16>& a, const Vectorized<BFloat16>& b) {
2024-05-31T23:12:43.7928166Z       |                             ^~~~~~~~
2024-05-31T23:12:43.7930114Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec512/vec512_bfloat16.h:790:67: note:   no known conversion for argument 1 from ‘at::vec::CPU_CAPABILITY::Vectorized<int>’ to ‘const at::vec::CPU_CAPABILITY::Vectorized<c10::BFloat16>&’
2024-05-31T23:12:43.7932167Z   790 | Vectorized<BFloat16> inline operator*(const Vectorized<BFloat16>& a, const Vectorized<BFloat16>& b) {
2024-05-31T23:12:43.7933196Z       |                                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
2024-05-31T23:12:43.7935472Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec512/vec512_bfloat16.h:1389:25: note: candidate: ‘at::vec::CPU_CAPABILITY::Vectorized<c10::Half> at::vec::CPU_CAPABILITY::operator*(const at::vec::CPU_CAPABILITY::Vectorized<c10::Half>&, const at::vec::CPU_CAPABILITY::Vectorized<c10::Half>&)’
2024-05-31T23:12:43.7937710Z  1389 | Vectorized<Half> inline operator*(const Vectorized<Half>& a, const Vectorized<Half>& b) {
2024-05-31T23:12:43.7938422Z       |                         ^~~~~~~~
2024-05-31T23:12:43.7940336Z /opt/conda/envs/venv/lib/python3.9/site-packages/torch/include/ATen/cpu/vec/vec512/vec512_bfloat16.h:1389:59: note:   no known conversion for argument 1 from ‘at::vec::CPU_CAPABILITY::Vectorized<int>’ to ‘const at::vec::CPU_CAPABILITY::Vectorized<c10::Half>&’
2024-05-31T23:12:43.7942298Z  1389 | Vectorized<Half> inline operator*(const Vectorized<Half>& a, const Vectorized<Half>& b) {
2024-05-31T23:12:43.7943051Z       |                                   ~~~~~~~~~~~~~~~~~~~~~~~~^
2024-05-31T23:12:43.7943515Z 
2024-05-31T23:12:43.7943536Z 
2024-05-31T23:12:43.7943823Z Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
2024-05-31T23:12:43.7944276Z 
2024-05-31T23:12:43.7944280Z 
2024-05-31T23:12:43.7944583Z You can suppress this exception and fall back to eager by setting:
2024-05-31T23:12:43.7945129Z     import torch._dynamo
2024-05-31T23:12:43.7945526Z     torch._dynamo.config.suppress_errors = True
2024-05-31T23:12:43.7946971Z FAILED test/integration/test_integration.py::TestSubclass::test_int8_dynamic_quant_subclass_api_2_cpu - torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
2024-05-31T23:12:43.7948162Z CppCompileError: C++ compile error
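
The core of the compiler output above is that the only generic `operator*` candidate takes two `Vectorized<T>` operands with the same `T`, so `Vectorized<int> * Vectorized<float>` has no match. A minimal sketch of that rule, using a toy Python class (`Vec` and its `convert` method are hypothetical stand-ins for illustration, not the ATen API):

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Vec:
    """Toy stand-in for a typed SIMD vector; `dtype` plays the role of T."""
    dtype: type
    lanes: Tuple

    def __mul__(self, other: "Vec") -> "Vec":
        # Mirrors a single-T multiply: both operands must share one element
        # type, and there is no implicit int -> float promotion.
        if not isinstance(other, Vec) or other.dtype is not self.dtype:
            raise TypeError(
                f"no match for operator*: Vec<{self.dtype.__name__}> and "
                f"{getattr(other, 'dtype', type(other)).__name__}"
            )
        return Vec(self.dtype, tuple(a * b for a, b in zip(self.lanes, other.lanes)))

    def convert(self, dtype: type) -> "Vec":
        # Explicit element-wise conversion (hypothetical helper name).
        return Vec(dtype, tuple(dtype(v) for v in self.lanes))


# Variable names follow the generated kernel line `auto tmp40 = tmp37 * tmp39;`.
tmp37 = Vec(int, (3, -2, 7, 0))            # integer lanes
tmp39 = Vec(float, (0.5, 0.25, 2.0, 1.0))  # float lanes

try:
    tmp37 * tmp39  # mixed element types are rejected
    mixed_multiply_failed = False
except TypeError:
    mixed_multiply_failed = True

# Converting the integer vector first makes both operands the same type.
tmp40 = tmp37.convert(float) * tmp39
```

In the sketch, the multiply only succeeds once both operands have been brought to the same element type explicitly, which is the property the generated kernel violates at line 77.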

from pytorch.

leslie-fang-intel commented on July 18, 2024

BF16 passes but FP16 fails. Looking at the lowered graphs, they already differ between BF16 and FP16. Is this a CPU-specific issue, @jerryzh168?

  • Here is the readable FX graph for BF16. The tensor is converted to BF16 after clamp_min.
    [image: readable FX graph for BF16]
  • Here is the readable FX graph for FP16. The tensor is converted to FP32 after clamp_min.
    [image: readable FX graph for FP16]


leslie-fang-intel commented on July 18, 2024

This line of code in TorchAO looks suspicious: https://github.com/pytorch/ao/blob/950a89388e88e10f26bbbbe2ec0b1710ba3d33d1/torchao/quantization/quant_api.py#L413. It hardcodes the data type as None for BF16 but as FP32 for FP16.

==============
Update

After unifying the data type to None for both FP16 and BF16, this test case passes on my local system.
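
A rough sketch of the difference described above, assuming that a requested dtype of None means "follow the input dtype" (that fallback, the function names, and the use of plain strings for dtypes are all assumptions for illustration, not the TorchAO code):

```python
from typing import Optional


def requested_dtype_original(x_dtype: str) -> Optional[str]:
    # Behaviour as described in the comment: FP32 is hardcoded for FP16
    # inputs, while BF16 inputs get None.
    return "float32" if x_dtype == "float16" else None


def requested_dtype_unified(x_dtype: str) -> Optional[str]:
    # Local experiment from the update: None for both FP16 and BF16.
    return None


def effective_dtype(x_dtype: str, requested: Optional[str]) -> str:
    # Assumption: None falls back to the dtype of the input tensor.
    return requested if requested is not None else x_dtype


for name in ("bfloat16", "float16"):
    before = effective_dtype(name, requested_dtype_original(name))
    after = effective_dtype(name, requested_dtype_unified(name))
    print(f"{name}: original -> {before}, unified -> {after}")
```

Under these assumptions only the FP16 path changes: it stops going through FP32, which would match the FP16 graph no longer differing from the BF16 one after clamp_min.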


jerryzh168 commented on July 18, 2024

I think this fails for both CPU and CUDA.

I think the linked detail is important to avoid regressing performance for some internal model. Why can't Inductor support this path?


leslie-fang-intel commented on July 18, 2024

Thanks for the reminder. After further investigation, we did find a CPP backend issue, and #128498 fixes it. With this PR, I think this test case now works with the CPP backend.
