Comments (13)

smithp35 commented on June 4, 2024

Would it be possible to get a small reproducer? With a simple test file:

        .text
        vfma.f32 q0, q3, q7
        vcvt.f16.s16 q1, q7, #1
        vsub.f32        q0, q2, q1

I can assemble this with clang (16) using --target=arm-none-eabi -mcpu=cortex-m85 -c

Looking at the definition of the cortex-m85, it should enable MVE and floating point by default. If you are using -march then you'll need -march=armv8.1-m.main+mve.fp. This can be used with -mcpu as well (-mcpu=cortex-m85+mve.fp), although it shouldn't be necessary.

The Arm Compiler 6 documentation (a commercial compiler based on LLVM, but this applies to clang in general) at https://developer.arm.com/documentation/101754/0620/armclang-Reference/armclang-Command-line-Options/-mcpu?lang=en may help.

from llvm-embedded-toolchain-for-arm.

renesas-kyle-finch commented on June 4, 2024

With the contents you provided in test.s, I got the following output with clang (17) from LLVM 17:

clang --target=arm-none-eabi -mcpu=cortex-m85 -c -o test.o test.s 
clang: warning: no multilib found matching flags: --target=thumbv8.1m.main-none-unknown-eabi -march=thumbv8.1m.main+dsp+mve+mve.fp+fp16+ras+lob+pacbti+nocrc+nocrypto+nosha2+noaes+nodotprod+nofp16fml+nobf16+nosb+noi8mm+nocdecp0+nocdecp1+nocdecp2+nocdecp3+nocdecp4+nocdecp5+nocdecp6+nocdecp7 -mfloat-abi=softfp -mfpu=fp-armv8-fullfp16-d16 [-Wmissing-multilib]
clang: note: available multilibs are:
--target=aarch64-none-unknown-elf
--target=armv4t-none-unknown-eabi -mfpu=none
--target=armv5e-none-unknown-eabi -mfpu=none
--target=thumbv6m-none-unknown-eabi -mfpu=none
--target=armv7-none-unknown-eabi -mfpu=none
--target=armv7-none-unknown-eabihf -mfpu=vfpv3-d16
--target=armv7r-none-unknown-eabi -mfpu=none
--target=armv7r-none-unknown-eabihf -mfpu=vfpv3-d16
--target=thumbv7m-none-unknown-eabi -mfpu=none
--target=thumbv7em-none-unknown-eabi -mfpu=none
--target=thumbv7em-none-unknown-eabihf -mfpu=fpv4-sp-d16
--target=thumbv7em-none-unknown-eabihf -mfpu=fpv5-d16
--target=thumbv8m.main-none-unknown-eabi -mfpu=none
--target=thumbv8m.main-none-unknown-eabihf -mfpu=fpv5-d16
--target=thumbv8.1m.main-none-unknown-eabi -mfpu=none
--target=thumbv8.1m.main-none-unknown-eabihf -march=thumbv8.1m.main+fp16 -mfpu=fp-armv8-fullfp16-sp-d16
--target=thumbv8.1m.main-none-unknown-eabihf -march=thumbv8.1m.main+dsp+mve -mfpu=none

If I add -mfloat-abi=hard to the above, then it seems to assemble okay with the test.

With my own code, if I specify --target=arm-none-eabi -mcpu=cortex-m85 I get an error elsewhere indicating math.h could not be found. But when I add -mfloat-abi=hard, I get the originally reported error:

source.s:386:2: error: invalid instruction, any one of the following would fix this:
        vsub.f32        q0, q4, q1
        ^
source.s:386:6: note: invalid operand for instruction
        vsub.f32        q0, q4, q1
            ^
source.s:386:2: note: instruction requires: mve.fp
        vsub.f32        q0, q4, q1
        ^
source.s:388:2: error: instruction requires: mve.fp
        vfma.f32        q1, q0, q2


smithp35 commented on June 4, 2024

There is at least one problem here, which is that we don't have an armv8.1m.main_hard_fp_mve library variant, only armv8.1m.main_hard_nofp_mve. This means that multilib selection will fail and you'll get strange errors like math.h not being found. I'll mention that to the team next week.

One workaround for that is to manually set the include and library directories to the armv8.1m.main_hard_fp directory. This will be compatible with hardfp and mve+fp. As far as I know the math library when compiled in mve+fp configuration won't use MVE instructions anyway.

For example:

-isystem /path/to/LLVMEmbeddedToolchainForArm-16.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/include -L/path/to/LLVMEmbeddedToolchainForArm-16.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/lib
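To avoid repeating the long path, the two flags can be generated from the toolchain root and the variant name. This is only a sketch: multilib_flags is a hypothetical helper name, and it assumes the runtimes live under lib/clang-runtimes/arm-none-eabi/ as in the path above.

```shell
# Hypothetical helper (not part of the toolchain): print the -isystem and -L
# flags for one multilib variant directory.
#   $1: toolchain install root (assumption: adjust to your installation)
#   $2: variant directory name, e.g. armv8.1m.main_hard_fp
multilib_flags() {
    root=$1
    variant=$2
    dir="$root/lib/clang-runtimes/arm-none-eabi/$variant"
    printf '%s %s\n' "-isystem $dir/include" "-L$dir/lib"
}

# Example use (the root path is a placeholder):
#   clang --target=arm-none-eabi -mcpu=cortex-m85 -mfloat-abi=hard \
#       $(multilib_flags /path/to/LLVMEmbeddedToolchainForArm-16.0.0-Linux-x86_64 armv8.1m.main_hard_fp) \
#       -c source.c
```

The command substitution is left unquoted on purpose so that the shell splits the output into separate arguments; that only works if the toolchain path contains no spaces.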

There is probably a multilib.yaml flag mapping that could map +mve+fp to armv8.1m.main_hard_fp, but I can't think of one off the top of my head.

I'm at a loss to explain why source.s isn't working. That is assuming the build system/make is using the same --target and -mcpu as the test.s file.


renesas-kyle-finch commented on June 4, 2024

I will try setting the include and library directories manually. But adding a flag mapping to multilib.yaml would be ideal. Our build system is being used for other architectures besides armv8.1.

Regarding source.s vs test.s, in this case, source.s is generated by LLVM from our source.c file. The --target and -mcpu are the same between source.s and test.s though. I can try to come up with a simple project that reproduces the issue, stripping out a bunch of our other proprietary stuff. There seems to be good information when I use clang -v.


smithp35 commented on June 4, 2024

We've continued to look into this.

I think there are two parts. The first is assembling instructions with -mcpu=cortex-m85 (or -mcpu=cortex-m85+fp+mve); we've not been able to reproduce this. -mfloat-abi=hard selects a calling convention, so it doesn't affect which instructions are available. It is possible that a .arch directive in the file has overridden the command-line setting locally (docs: https://sourceware.org/binutils/docs/as/ARM-Directives.html). We would probably need an example source file and a command line.
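A quick way to check for such an override is to search the assembly file for target-selection directives. The helper below is a sketch; list_target_directives is a made-up name, and the set of directives it looks for (.arch, .arch_extension, .cpu, .fpu) is an assumption about which ones are relevant here.

```shell
# Hypothetical helper: list target-selection directives in an assembly file,
# each prefixed with its line number. Exits non-zero if none are found.
#   $1: path to the .s file
list_target_directives() {
    grep -nE '^[[:space:]]*\.(arch|arch_extension|cpu|fpu)[[:space:]]' "$1"
}

# Example use:
#   list_target_directives source.s
```

Any directive it reports applies from that line onwards and can change which instructions the assembler accepts, regardless of what -mcpu says on the command line.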

The second is the available multilibs. On investigation there is a difference between LLVM 16 (the latest official release, which had a downstream multilib patch) and LLVM 17, which has the upstream multilib patch. There is an LLVM 17 preview release here: https://github.com/ARM-software/LLVM-embedded-toolchain-for-Arm/releases/tag/preview-17.0.0-devdrop0

In summary:
LLVM16 only has a viable multilib for -mcpu=cortex-m85+nomve+nofp -> armv8.1m.main_soft_nofp_nomve
The closest workarounds I can find are to add an explicit fpu or use march.

  • -mcpu=cortex-m85+fp+mve -mfpu=fp-armv8-fullfp16-sp-d16 -> armv8.1m.main_hard_fp
  • -march=armv8.1-m.main+fp+mve -mfloat-abi=hard -> armv8.1m.main_hard_fp

LLVM17 has a viable multilib for -mcpu=cortex-m85 -mfloat-abi=hard -> armv8.1m.main_hard_fp
LLVM17 is missing a viable multilib for -mcpu=cortex-m85 -mfloat-abi=softfp (sadly the default) as we don't currently have a softfp variant. There is a soft floating point variant, but that won't use the floating point hardware at all: -mcpu=cortex-m85 -mfloat-abi=soft -> armv8.1m.main_soft_nofp_nomve

If you are able to use the LLVM 17 preview this should fix the multilib problem for -mfloat-abi=hard. We are intending to have a compatible softfp library variant for the final release.

Tests (adding -Wl,--verbose) to see the libraries used by the linker
LLVM17 -mcpu=cortex-m85+fp+mve -mfloat-abi=hard

LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/clang --target=arm-none-eabi -mcpu=cortex-m85+fp+mve mve.c   -Wl,--verbose -mfloat-abi=hard
ld.lld: /tmp/mve-643fe0.o
ld.lld: LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/lib/libc.a
ld.lld: LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/lib/libm.a
ld.lld: LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/lib/libclang_rt.builtins.a

LLVM16 -mcpu=cortex-m85+fp+mve -mfloat-abi=hard

LLVMEmbeddedToolchainForArm-16.0.0-Linux-x86_64/bin/clang --target=arm-none-eabi -mcpu=cortex-m85+fp+mve mve.c   -Wl,--verbose -mfloat-abi=hard
ld.lld: /tmp/mve-5f8036.o
ld.lld: error: unable to find library -lc
ld.lld: error: unable to find library -lm
ld.lld: error: unable to find library -lclang_rt.builtins-arm

LLVM16 using -march=armv8.1-m.main+fp+mve

LLVMEmbeddedToolchainForArm-16.0.0-Linux-x86_64/bin/clang --target=arm-none-eabi -march=armv8.1-m.main+fp+mve mve.c   -Wl,--verbose -mfloat-abi=hard
ld.lld: /tmp/mve-d65dda.o
ld.lld: LLVMEmbeddedToolchainForArm-16.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/lib/libc.a
ld.lld: LLVMEmbeddedToolchainForArm-16.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/lib/libm.a
ld.lld: LLVMEmbeddedToolchainForArm-16.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/lib/libclang_rt.builtins.a

We are planning to add at least one softfp variant for the full release.


renesas-kyle-finch commented on June 4, 2024

I have already been using the preview release of LLVM 17. I have been working on making a simplified example to send over and have made two observations.

Example source code:

#include <stdint.h>
#include <math.h>

#define DUMMY_CONST_1 (0.0012345F)
#define DUMMY_CONST_2 (0.01F)
#define DUMMY_CONST_3 (0.02F)
#define DUMMY_CONST_4 (0.03F)
#define DUMMY_CONST_5 (0.04F)

typedef struct
{
    float a;
    float b;
    float c;
    float d;
} dummy_t;

int8_t foo(dummy_t *handle)
{
    handle->a += DUMMY_CONST_2 * (DUMMY_CONST_1 - handle->a);
    handle->b += DUMMY_CONST_3 * (DUMMY_CONST_1 - handle->b);
    handle->c += DUMMY_CONST_4 * (DUMMY_CONST_1 - handle->c);
    handle->d += DUMMY_CONST_5 * (DUMMY_CONST_1 - handle->d);
    return 0;
}
  1. With the provided example source code and the command below, without specifying -mfloat-abi=hard I get an error that math.h cannot be found. With the -v flag sent to clang, it appears that it is using an invalid -internal-isystem /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/include. I think this could be related to the missing softfp variant.

Here is the full output:

clang -v -std=c99 -x c -O2 -mcpu=cortex-m85+fp+mve --target=arm-none-eabi -mthumb -save-temps=obj -c -o mve_compile_test.o mve_compile_test.c 
clang: warning: no multilib found matching flags: --target=thumbv8.1m.main-none-unknown-eabi -march=thumbv8.1m.main+dsp+mve+mve.fp+fp16+ras+lob+pacbti+nocrc+nocrypto+nosha2+noaes+nodotprod+nofp16fml+nobf16+nosb+noi8mm+nocdecp0+nocdecp1+nocdecp2+nocdecp3+nocdecp4+nocdecp5+nocdecp6+nocdecp7 -mfloat-abi=softfp -mfpu=fp-armv8-fullfp16-d16 [-Wmissing-multilib]
clang: note: available multilibs are:
--target=aarch64-none-unknown-elf
--target=armv4t-none-unknown-eabi -mfpu=none
--target=armv5e-none-unknown-eabi -mfpu=none
--target=thumbv6m-none-unknown-eabi -mfpu=none
--target=armv7-none-unknown-eabi -mfpu=none
--target=armv7-none-unknown-eabihf -mfpu=vfpv3-d16
--target=armv7r-none-unknown-eabi -mfpu=none
--target=armv7r-none-unknown-eabihf -mfpu=vfpv3-d16
--target=thumbv7m-none-unknown-eabi -mfpu=none
--target=thumbv7em-none-unknown-eabi -mfpu=none
--target=thumbv7em-none-unknown-eabihf -mfpu=fpv4-sp-d16
--target=thumbv7em-none-unknown-eabihf -mfpu=fpv5-d16
--target=thumbv8m.main-none-unknown-eabi -mfpu=none
--target=thumbv8m.main-none-unknown-eabihf -mfpu=fpv5-d16
--target=thumbv8.1m.main-none-unknown-eabi -mfpu=none
--target=thumbv8.1m.main-none-unknown-eabihf -march=thumbv8.1m.main+fp16 -mfpu=fp-armv8-fullfp16-sp-d16
--target=thumbv8.1m.main-none-unknown-eabihf -march=thumbv8.1m.main+dsp+mve -mfpu=none
clang version 17.0.0
Target: arm-none-unknown-eabi
Thread model: posix
InstalledDir: /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin
 "/opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/clang-17" -cc1 -triple thumbv8.1m.main-none-unknown-eabi -E -save-temps=obj -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name mve_compile_test.c -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -nostdsysteminc -target-cpu cortex-m85 -target-feature +soft-float-abi -target-feature -crc -target-feature -dotprod -target-feature +mve.fp -target-feature +ras -target-feature -fp16fml -target-feature -bf16 -target-feature -sb -target-feature -i8mm -target-feature +lob -target-feature -cdecp0 -target-feature -cdecp1 -target-feature -cdecp2 -target-feature -cdecp3 -target-feature -cdecp4 -target-feature -cdecp5 -target-feature -cdecp6 -target-feature -cdecp7 -target-feature +pacbti -target-feature -hwdiv-arm -target-feature +hwdiv -target-feature +vfp2 -target-feature +vfp2sp -target-feature -vfp3 -target-feature +vfp3d16 -target-feature +vfp3d16sp -target-feature -vfp3sp -target-feature +fp16 -target-feature -vfp4 -target-feature +vfp4d16 -target-feature +vfp4d16sp -target-feature -vfp4sp -target-feature -fp-armv8 -target-feature +fp-armv8d16 -target-feature +fp-armv8d16sp -target-feature -fp-armv8sp -target-feature +fullfp16 -target-feature +fp64 -target-feature -d32 -target-feature -neon -target-feature +dsp -target-feature +mve -target-feature -crypto -target-feature -sha2 -target-feature -aes -target-feature +strict-align -target-abi aapcs -mfloat-abi soft -Wunaligned-access -debugger-tuning=gdb -v -fcoverage-compilation-dir=/home/coder/workspace/peaks -resource-dir /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17 -internal-isystem /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17/include -internal-isystem /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/include -O0 -std=c99 -fdebug-compilation-dir=/home/coder/workspace/peaks 
-ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fcolor-diagnostics -faddrsig -o mve_compile_test.i -x c mve_compile_test.c
clang -cc1 version 17.0.0 based upon LLVM 17.0.0-rc1 default target aarch64-linux-gnu
ignoring nonexistent directory "/opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/include"
ignoring duplicate directory "/opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17/include
End of search list.
mve_compile_test.c:2:10: fatal error: 'math.h' file not found
    2 | #include <math.h>
      |          ^~~~~~~~
1 error generated.
  2. When I do specify -mfloat-abi=hard, I get the originally reported errors. However, I don't think these are related to multilib. I think these errors are related to optimization. Using O0 or O1, this compiles fine. But using O2, O3, Ofast, Os, or Oz, I get the following errors and output:
clang -v -std=c99 -x c -mfloat-abi=hard -O2 -mcpu=cortex-m85+fp+mve --target=arm-none-eabi -mthumb -save-temps=obj -c -o mve_compile_test.o mve_compile_test.c 
clang version 17.0.0
Target: arm-none-unknown-eabi
Thread model: posix
InstalledDir: /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin
 "/opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/clang-17" -cc1 -triple thumbv8.1m.main-none-unknown-eabihf -E -save-temps=obj -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name mve_compile_test.c -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -nostdsysteminc -target-cpu cortex-m85 -target-feature -crc -target-feature -dotprod -target-feature +mve.fp -target-feature +ras -target-feature -fp16fml -target-feature -bf16 -target-feature -sb -target-feature -i8mm -target-feature +lob -target-feature -cdecp0 -target-feature -cdecp1 -target-feature -cdecp2 -target-feature -cdecp3 -target-feature -cdecp4 -target-feature -cdecp5 -target-feature -cdecp6 -target-feature -cdecp7 -target-feature +pacbti -target-feature -hwdiv-arm -target-feature +hwdiv -target-feature +vfp2 -target-feature +vfp2sp -target-feature -vfp3 -target-feature +vfp3d16 -target-feature +vfp3d16sp -target-feature -vfp3sp -target-feature +fp16 -target-feature -vfp4 -target-feature +vfp4d16 -target-feature +vfp4d16sp -target-feature -vfp4sp -target-feature -fp-armv8 -target-feature +fp-armv8d16 -target-feature +fp-armv8d16sp -target-feature -fp-armv8sp -target-feature +fullfp16 -target-feature +fp64 -target-feature -d32 -target-feature -neon -target-feature +dsp -target-feature +mve -target-feature -crypto -target-feature -sha2 -target-feature -aes -target-feature +strict-align -target-abi aapcs -mfloat-abi hard -Wunaligned-access -debugger-tuning=gdb -v -fcoverage-compilation-dir=/home/coder/workspace/peaks -resource-dir /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17 -internal-isystem /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17/include -internal-isystem /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/include -internal-isystem 
/opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8m.main_hard_fp/include -internal-isystem /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv7em_hard_fpv5_d16/include -internal-isystem /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv7em_hard_fpv4_sp_d16/include -O2 -std=c99 -fdebug-compilation-dir=/home/coder/workspace/peaks -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -o mve_compile_test.i -x c mve_compile_test.c
clang -cc1 version 17.0.0 based upon LLVM 17.0.0-rc1 default target aarch64-linux-gnu
ignoring duplicate directory "/opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17/include
 /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8.1m.main_hard_fp/include
 /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv8m.main_hard_fp/include
 /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv7em_hard_fpv5_d16/include
 /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/../lib/clang-runtimes/arm-none-eabi/armv7em_hard_fpv4_sp_d16/include
End of search list.
 "/opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/clang-17" -cc1 -triple thumbv8.1m.main-none-unknown-eabihf -emit-llvm-bc -emit-llvm-uselists -save-temps=obj -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name mve_compile_test.c -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -nostdsysteminc -target-cpu cortex-m85 -target-feature -crc -target-feature -dotprod -target-feature +mve.fp -target-feature +ras -target-feature -fp16fml -target-feature -bf16 -target-feature -sb -target-feature -i8mm -target-feature +lob -target-feature -cdecp0 -target-feature -cdecp1 -target-feature -cdecp2 -target-feature -cdecp3 -target-feature -cdecp4 -target-feature -cdecp5 -target-feature -cdecp6 -target-feature -cdecp7 -target-feature +pacbti -target-feature -hwdiv-arm -target-feature +hwdiv -target-feature +vfp2 -target-feature +vfp2sp -target-feature -vfp3 -target-feature +vfp3d16 -target-feature +vfp3d16sp -target-feature -vfp3sp -target-feature +fp16 -target-feature -vfp4 -target-feature +vfp4d16 -target-feature +vfp4d16sp -target-feature -vfp4sp -target-feature -fp-armv8 -target-feature +fp-armv8d16 -target-feature +fp-armv8d16sp -target-feature -fp-armv8sp -target-feature +fullfp16 -target-feature +fp64 -target-feature -d32 -target-feature -neon -target-feature +dsp -target-feature +mve -target-feature -crypto -target-feature -sha2 -target-feature -aes -target-feature +strict-align -target-abi aapcs -mfloat-abi hard -Wunaligned-access -debugger-tuning=gdb -v -fcoverage-compilation-dir=/home/coder/workspace/peaks -resource-dir /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17 -O2 -std=c99 -fdebug-compilation-dir=/home/coder/workspace/peaks -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fcolor-diagnostics -vectorize-loops -vectorize-slp -disable-llvm-passes -faddrsig -o mve_compile_test.bc -x cpp-output mve_compile_test.i
clang -cc1 version 17.0.0 based upon LLVM 17.0.0-rc1 default target aarch64-linux-gnu
#include "..." search starts here:
#include <...> search starts here:
 /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17/include
End of search list.
 "/opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/clang-17" -cc1 -triple thumbv8.1m.main-none-unknown-eabihf -S -save-temps=obj -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name mve_compile_test.c -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -nostdsysteminc -target-cpu cortex-m85 -target-feature -crc -target-feature -dotprod -target-feature +mve.fp -target-feature +ras -target-feature -fp16fml -target-feature -bf16 -target-feature -sb -target-feature -i8mm -target-feature +lob -target-feature -cdecp0 -target-feature -cdecp1 -target-feature -cdecp2 -target-feature -cdecp3 -target-feature -cdecp4 -target-feature -cdecp5 -target-feature -cdecp6 -target-feature -cdecp7 -target-feature +pacbti -target-feature -hwdiv-arm -target-feature +hwdiv -target-feature +vfp2 -target-feature +vfp2sp -target-feature -vfp3 -target-feature +vfp3d16 -target-feature +vfp3d16sp -target-feature -vfp3sp -target-feature +fp16 -target-feature -vfp4 -target-feature +vfp4d16 -target-feature +vfp4d16sp -target-feature -vfp4sp -target-feature -fp-armv8 -target-feature +fp-armv8d16 -target-feature +fp-armv8d16sp -target-feature -fp-armv8sp -target-feature +fullfp16 -target-feature +fp64 -target-feature -d32 -target-feature -neon -target-feature +dsp -target-feature +mve -target-feature -crypto -target-feature -sha2 -target-feature -aes -target-feature +strict-align -target-abi aapcs -mfloat-abi hard -Wunaligned-access -debugger-tuning=gdb -v -fcoverage-compilation-dir=/home/coder/workspace/peaks -resource-dir /opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/lib/clang/17 -O2 -std=c99 -fdebug-compilation-dir=/home/coder/workspace/peaks -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -o mve_compile_test.s -x ir mve_compile_test.bc
clang -cc1 version 17.0.0 based upon LLVM 17.0.0-rc1 default target aarch64-linux-gnu
 "/opt/LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/clang-17" -cc1as -triple thumbv8.1m.main-none-unknown-eabihf -filetype obj -main-file-name mve_compile_test.c -target-cpu cortex-m85 -target-feature -crc -target-feature -dotprod -target-feature +mve.fp -target-feature +ras -target-feature -fp16fml -target-feature -bf16 -target-feature -sb -target-feature -i8mm -target-feature +lob -target-feature -cdecp0 -target-feature -cdecp1 -target-feature -cdecp2 -target-feature -cdecp3 -target-feature -cdecp4 -target-feature -cdecp5 -target-feature -cdecp6 -target-feature -cdecp7 -target-feature +pacbti -target-feature -hwdiv-arm -target-feature +hwdiv -target-feature +vfp2 -target-feature +vfp2sp -target-feature -vfp3 -target-feature +vfp3d16 -target-feature +vfp3d16sp -target-feature -vfp3sp -target-feature +fp16 -target-feature -vfp4 -target-feature +vfp4d16 -target-feature +vfp4d16sp -target-feature -vfp4sp -target-feature -fp-armv8 -target-feature +fp-armv8d16 -target-feature +fp-armv8d16sp -target-feature -fp-armv8sp -target-feature +fullfp16 -target-feature +fp64 -target-feature -d32 -target-feature -neon -target-feature +dsp -target-feature +mve -target-feature -crypto -target-feature -sha2 -target-feature -aes -target-feature +strict-align -fdebug-compilation-dir=/home/coder/workspace/peaks -dwarf-version=5 -mrelocation-model static -mllvm -arm-add-build-attributes -o mve_compile_test.o mve_compile_test.s
mve_compile_test.s:44:2: error: invalid instruction, any one of the following would fix this:
        vsub.f32        q1, q1, q0
        ^
mve_compile_test.s:44:6: note: invalid operand for instruction
        vsub.f32        q1, q1, q0
            ^
mve_compile_test.s:44:2: note: instruction requires: mve.fp
        vsub.f32        q1, q1, q0
        ^
mve_compile_test.s:45:2: error: instruction requires: mve.fp
        vfma.f32        q0, q1, q2


smithp35 commented on June 4, 2024

Thanks for the example. The missing header file is definitely part of the missing softfp multilib variant.

The second error looks like it is related to the compiler's assembly output. With -save-temps=obj, clang writes an assembly file and then assembles that file.

Looking at the assembly:

        .text
        .syntax unified
        .eabi_attribute 67, "2.09"      @ Tag_conformance
        .cpu    cortex-m85
        .eabi_attribute 6, 21   @ Tag_CPU_arch
        .eabi_attribute 7, 77   @ Tag_CPU_arch_profile
        .eabi_attribute 8, 0    @ Tag_ARM_ISA_use
        .eabi_attribute 9, 3    @ Tag_THUMB_ISA_use
        .fpu    fpv5-d16
        ...

It looks like the .cpu and .fpu directives here are overriding the command line -mcpu option and are losing the MVE. This can be reproduced separately with:

LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/clang --target=arm-none-eabi -mcpu=cortex-m85+fp+mve -mfloat-abi=hard -O2 -S t.c -o t.s
LLVMEmbeddedToolchainForArm-17.0.0-Linux-x86_64/bin/clang --target=arm-none-eabi -mcpu=cortex-m85+fp+mve -mfloat-abi=hard -O2 -c t.s
t.s:44:2: error: invalid instruction, any one of the following would fix this:
        vsub.f32        q1, q1, q0
        ^
t.s:44:6: note: invalid operand for instruction
        vsub.f32        q1, q1, q0
            ^
t.s:44:2: note: instruction requires: mve.fp
        vsub.f32        q1, q1, q0
        ^
t.s:45:2: error: instruction requires: mve.fp
        vfma.f32        q0, q1, q2
        ^

When editing t.s I can remove/comment out the .cpu and .fpu directives and the file will assemble correctly.
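That manual edit can be scripted as a temporary workaround. The sketch below turns the .cpu and .fpu lines into comments using '@', the comment character already used in the listing above; comment_out_cpu_fpu is a hypothetical name, and the approach assumes the command line supplies the right -mcpu and -mfloat-abi once the directives are gone.

```shell
# Hypothetical workaround helper: write a copy of an assembly file to stdout
# with its .cpu and .fpu directives commented out. All other lines, including
# .eabi_attribute lines, are passed through unchanged.
#   $1: path to the .s file
comment_out_cpu_fpu() {
    sed -E 's/^([[:space:]]*)\.(cpu|fpu)([[:space:]])/\1@ .\2\3/' "$1"
}

# Example use, reusing the options from the reproduction above:
#   comment_out_cpu_fpu t.s > t_fixed.s
#   clang --target=arm-none-eabi -mcpu=cortex-m85+fp+mve -mfloat-abi=hard -O2 -c t_fixed.s
```

Commenting the lines out rather than deleting them keeps line numbers stable, so assembler diagnostics still point at the same lines as in the original file.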

So it looks like the compiler's assembly output is at fault here.


renesas-kyle-finch commented on June 4, 2024

Thanks for the confirmation on both issues.

For the first, I would guess that this issue lies in "LLVM-embedded-toolchain-for-Arm"?

What about the second? Do I need to submit a ticket with the LLVM project?


smithp35 commented on June 4, 2024

The first is definitely within the scope of this project. We're going to make at least one softfp variant to start with so that there is at least a compatible softfp multilib.

I have an internal ticket that I raised for the code-generation problem. I can submit it to the llvm-project for more visibility. Will do that tomorrow.


smithp35 commented on June 4, 2024

I've raised llvm/llvm-project#65722


voltur01 commented on June 4, 2024

With change #302, which added a softfp library variant, the default for Cortex-M85 is now satisfied:

./LLVMEmbeddedToolchainForArm-18.0.0-Linux-x86_64/bin/clang --target=arm-none-eabi -mcpu=cortex-m85 -c -o test.o test.c --print-multi-directory
# arm-none-eabi/armv7m_soft_fpv4_sp_d16

./LLVMEmbeddedToolchainForArm-18.0.0-Linux-x86_64/bin/clang --target=arm-none-eabi -mcpu=cortex-m85+fp -c -o test.o test.c --print-multi-directory
# arm-none-eabi/armv7m_soft_fpv4_sp_d16

./LLVMEmbeddedToolchainForArm-18.0.0-Linux-x86_64/bin/clang --target=arm-none-eabi -mcpu=cortex-m85+fp+mve -c -o test.o test.c --print-multi-directory
# arm-none-eabi/armv7m_soft_fpv4_sp_d16


voltur01 commented on June 4, 2024

https://github.com/ARM-software/LLVM-embedded-toolchain-for-Arm/releases/tag/release-17.0.1 has fixes for both the multilib and the MVE reassembling issues; however, we will need to upstream the latter into LLVM for future releases.


voltur01 commented on June 4, 2024

Both changes were upstreamed, closing.
