
cpu_features's Introduction

cpu_features

A cross-platform C library to retrieve CPU features (such as available instructions) at runtime.

GitHub-CI Status

[CI status badges: CMake and Bazel builds on Linux, FreeBSD, MacOS and Windows for amd64, AArch64, ARM, MIPS, POWER, RISCV, LOONGARCH and s390x]

Design Rationale

  • Simple to use. See the snippets below for examples.
  • Extensible. Easy to add missing features or architectures.
  • Compatible with old compilers and available on many architectures so it can be used widely. To ensure that cpu_features works on as many platforms as possible, we implemented it in a highly portable version of C: C99.
  • Sandbox-compatible. The library uses a variety of strategies to cope with sandboxed environments or when cpuid is unavailable. This is useful when running integration tests in hermetic environments.
  • Thread safe, no memory allocation, and raises no exceptions. cpu_features is suitable for implementing fundamental libc functions like malloc, memcpy, and memcmp.
  • Unit tested.

Code samples

Note: For C++ code, the library functions are defined in the cpu_features namespace.

Checking features at runtime

Here's a simple example that executes a codepath if the CPU supports both the AES and the SSE4.2 instruction sets:

#include "cpuinfo_x86.h"

// For C++, add `using namespace cpu_features;`
static const X86Features features = GetX86Info().features;

void Compute(void) {
  if (features.aes && features.sse4_2) {
    // Run optimized code.
  } else {
    // Run standard code.
  }
}

Caching for faster evaluation of complex checks

If you wish, you can read all the features at once into a global variable, and then query for the specific features you care about. Below, we store all the ARM features and then check whether AES and NEON are supported.

#include <stdbool.h>
#include "cpuinfo_arm.h"

// For C++, add `using namespace cpu_features;`
static const ArmFeatures features = GetArmInfo().features;
static const bool has_aes_and_neon = features.aes && features.neon;

// use has_aes_and_neon.

This is a good approach to take if you're checking for combinations of features when using a compiler that is slow to extract individual bits from bit-packed structures.
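As a rough illustration of the caching idea, the sketch below uses a hypothetical bit-packed struct (ExampleFeatures, standing in for ArmFeatures) and stores the combined result in a plain bool. All names here are assumptions made for the example, not library API. Because C, unlike C++, does not allow a file-scope object to be initialized with a function call, the sketch fills the cache from an explicit init function.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a bit-packed feature struct such as ArmFeatures. */
typedef struct {
  unsigned int aes : 1;
  unsigned int neon : 1;
} ExampleFeatures;

/* Cached result of the combined check; written once by ExampleInit(). */
static bool g_has_aes_and_neon = false;

/* Call once at startup, before any code reads the cached value. */
static void ExampleInit(ExampleFeatures features) {
  g_has_aes_and_neon = features.aes && features.neon;
}

/* Hot paths read one plain bool instead of extracting two bit-fields. */
static bool ExampleHasAesAndNeon(void) { return g_has_aes_and_neon; }
```

In real code, ExampleInit would presumably receive GetArmInfo().features converted to whatever representation the caller prefers.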

Checking compile time flags

The following code determines whether the compiler was told to use the AVX instruction set (e.g., g++ -mavx) and sets has_avx accordingly.

#include <stdbool.h>
#include "cpuinfo_x86.h"

// For C++, add `using namespace cpu_features;`
static const X86Features features = GetX86Info().features;
static const bool has_avx = CPU_FEATURES_COMPILED_X86_AVX || features.avx;

// use has_avx.

CPU_FEATURES_COMPILED_X86_AVX is set to 1 if the compiler was instructed to use AVX and 0 otherwise, combining compile time and runtime knowledge.
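To make the 0/1 convention concrete, here is a minimal sketch of how such a macro could be defined and combined with a runtime value. EXAMPLE_COMPILED_X86_AVX and ExampleHasAvx are hypothetical names used only for illustration; the runtime_avx parameter stands in for features.avx.

```c
#include <stdbool.h>

/* Hypothetical macro following the 0/1 convention described above:
 * 1 when the compiler was told to target AVX (it then defines __AVX__),
 * 0 otherwise. */
#if defined(__AVX__)
#define EXAMPLE_COMPILED_X86_AVX 1
#else
#define EXAMPLE_COMPILED_X86_AVX 0
#endif

/* runtime_avx stands in for the value detected at runtime (features.avx).
 * When the macro is 1, the compiler can fold the whole expression to true. */
static bool ExampleHasAvx(bool runtime_avx) {
  return EXAMPLE_COMPILED_X86_AVX || runtime_avx;
}
```

Defining the macro as a literal 0 or 1 keeps it usable both in #if directives and in ordinary C expressions.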

Rejecting poor hardware implementations based on microarchitecture

On x86, the first incarnation of a feature in a microarchitecture might not be the most efficient (e.g. AVX on Sandy Bridge). We provide a function to retrieve the underlying microarchitecture so you can decide whether to use it.

Below, has_fast_avx is set to true if the CPU supports the AVX instruction set, but only if it's not Sandy Bridge.

#include <stdbool.h>
#include "cpuinfo_x86.h"

// For C++, add `using namespace cpu_features;`
static const X86Info info = GetX86Info();
static const X86Microarchitecture uarch = GetX86Microarchitecture(&info);
static const bool has_fast_avx = info.features.avx && uarch != INTEL_SNB;

// use has_fast_avx.

This feature is currently available only for x86 microarchitectures.

Running sample code

Building cpu_features (see the quickstart below) produces a small executable for testing the library.

 % ./build/list_cpu_features
arch            : x86
brand           :        Intel(R) Xeon(R) CPU E5-1650 0 @ 3.20GHz
family          :   6 (0x06)
model           :  45 (0x2D)
stepping        :   7 (0x07)
uarch           : INTEL_SNB
flags           : aes,avx,cx16,smx,sse4_1,sse4_2,ssse3
% ./build/list_cpu_features --json
{"arch":"x86","brand":"       Intel(R) Xeon(R) CPU E5-1650 0 @ 3.20GHz","family":6,"model":45,"stepping":7,"uarch":"INTEL_SNB","flags":["aes","avx","cx16","smx","sse4_1","sse4_2","ssse3"]}
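The family, model and stepping values shown above are conventionally packed into the EAX register returned by CPUID leaf 1. The sketch below decodes them following the usual Intel convention; it is an illustration with hypothetical names (ExampleSignature, ExampleDecodeSignature), not the library's own code. The signature 0x000206D7 used in the note after the block is an assumed input chosen to match the family 6, model 45, stepping 7 output above.

```c
#include <stdint.h>

typedef struct {
  int family;
  int model;
  int stepping;
} ExampleSignature;

/* Decodes the CPUID leaf 1 EAX layout:
 * bits 3..0 stepping, 7..4 model, 11..8 family,
 * bits 19..16 extended model, 27..20 extended family. */
static ExampleSignature ExampleDecodeSignature(uint32_t eax) {
  const int base_family = (int)((eax >> 8) & 0xF);
  const int base_model = (int)((eax >> 4) & 0xF);
  const int ext_family = (int)((eax >> 20) & 0xFF);
  const int ext_model = (int)((eax >> 16) & 0xF);
  ExampleSignature s;
  /* The extended family is added only when the base family is 0xF. */
  s.family = (base_family == 0xF) ? base_family + ext_family : base_family;
  /* The extended model supplies the high nibble for families 0x6 and 0xF. */
  s.model = (base_family == 0x6 || base_family == 0xF)
                ? (ext_model << 4) + base_model
                : base_model;
  s.stepping = (int)(eax & 0xF);
  return s;
}
```

With the assumed input 0x000206D7, the fields decode to family 6, model 45 (0x2D) and stepping 7.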

What's supported

          x86³   AArch64   ARM       MIPS⁴     POWER     RISCV   Loongarch   s390x
Linux     yes²   yes¹      yes¹      yes¹      yes¹      yes¹    yes¹        yes¹
FreeBSD   yes²   not yet   not yet   not yet   not yet   N/A     not yet     not yet
MacOs     yes²   yes⁵      N/A       N/A       N/A       N/A     N/A         N/A
Windows   yes²   not yet   not yet   N/A       N/A       N/A     N/A         N/A
Android   yes²   yes¹      yes¹      yes¹      N/A       N/A     N/A         N/A
iOS       N/A    not yet   not yet   N/A       N/A       N/A     N/A         N/A

  1. Features revealed from Linux. We gather data from several sources depending on availability:
    • from glibc's getauxval
    • by parsing /proc/self/auxv
    • by parsing /proc/cpuinfo
  2. Features revealed from CPU. Features are retrieved by using the cpuid instruction.
  3. Microarchitecture detection. On x86 some features are not always implemented efficiently in hardware (e.g. AVX on Sandy Bridge). Exposing the microarchitecture allows the client to reject particular microarchitectures.
  4. All flavors of MIPS are supported, little and big endian as well as 32/64 bits.
  5. Features revealed from sysctl. Features are retrieved through the sysctl interface.
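The "several sources depending on availability" strategy from note 1 can be pictured as an ordered fallback chain. The sketch below is only an illustration of that pattern: the ExampleHwcapSource type and all function names are hypothetical, and the two placeholder sources stand in for real ones such as getauxval or a /proc/self/auxv parser.

```c
#include <stdbool.h>
#include <stddef.h>

/* A source returns true and fills *hwcap if it could obtain the value. */
typedef bool (*ExampleHwcapSource)(unsigned long* hwcap);

/* Tries each source in order and stops at the first one that succeeds. */
static bool ExampleFirstAvailable(const ExampleHwcapSource* sources,
                                  size_t count, unsigned long* hwcap) {
  for (size_t i = 0; i < count; ++i) {
    if (sources[i](hwcap)) return true;
  }
  return false;
}

/* Placeholder: simulates a source that is blocked, e.g. by a sandbox. */
static bool ExampleUnavailable(unsigned long* hwcap) {
  (void)hwcap;
  return false;
}

/* Placeholder: simulates a source that works and reports a fixed value. */
static bool ExampleFixed(unsigned long* hwcap) {
  *hwcap = 0x5UL;
  return true;
}
```

Ordering the sources from cheapest and most reliable to most fragile keeps text parsing as a last resort.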

Android NDK's drop in replacement

cpu_features now officially supports Android and offers a drop-in replacement for the NDK's cpu-features.h; see the ndk_compat folder for details.

License

The cpu_features library is licensed under the terms of the Apache license. See LICENSE for more information.

Build with CMake

Please check the CMake build instructions.

Quickstart

  • Run list_cpu_features

    cmake -S. -Bbuild -DBUILD_TESTING=OFF -DCMAKE_BUILD_TYPE=Release
    cmake --build build --config Release -j
    ./build/list_cpu_features --json

    Note: Use --target ALL_BUILD on the second line for Visual Studio and Xcode.

  • Run tests

    cmake -S. -Bbuild -DBUILD_TESTING=ON -DCMAKE_BUILD_TYPE=Debug
    cmake --build build --config Debug -j
    cmake --build build --config Debug --target test

    Note: Use --target RUN_TESTS on the last line for Visual Studio and --target RUN_TEST for Xcode.

  • Install cpu_features

    cmake --build build --config Release --target install -v

    Note: Use --target INSTALL for Visual Studio.

    Note: When using the Makefile or Xcode generator, you can use DESTDIR to install into a local directory, e.g.

    cmake --build build --config Release --target install -v -- DESTDIR=install

Community bindings

Links provided here are not affiliated with Google but are kindly provided by the OSS Community.

Send PR to showcase your wrapper here

cpu_features's Issues

no member named 'vfpv' in 'ArmFeatures' when compiling ndk_compat for arm

Compiling ndk_compat for Android arm devices results in the error below:

/build/src/ndk_compat/cpu-features.c:118:21: error: no member named 'vfpv' in 'ArmFeatures'
  if (info.features.vfpv) g_cpuFeatures |= ANDROID_CPU_ARM_FEATURE_VFPv2;
      ~~~~~~~~~~~~~ ^
1 error generated.

..and sure enough looking in include/cpuinfo_arm.h there is no vfpv property. Looking through the file history it seems there never was. I'm scratching my head here, am I doing something wrong or is this a bug that has been there forever?

Commenting out line 118 in ndk_compat/cpu-features.c solves the issue

Add cx16 (cmpxchg16b) cpuid flag

cmpxchg16b is supported on most reasonably modern CPUs, but since the timeline already goes back to the old Core 2s, I wonder whether this could be added.
It's an important instruction (if you want 16-byte atomics), and it is not supported on earlier Core 2s and some special AMDs.

Similarities to other CPU feature projects (deduplicate?)

ART on Android works around quirks/bugs in /proc/cpuinfo and aux vector, has support for cross compilation, detection of features via C preprocessor, CPUID instructions, undefined instruction exceptions, supports MIPS, ARM, Intel 32 and 64, etc.:
https://android.googlesource.com/platform/art/+/master/runtime/arch/instruction_set_features.h#36
Android NDK has something simple:
https://developer.android.com/ndk/guides/cpu-features.html
V8 CPU feature detection:
https://github.com/v8/v8/blob/master/src/base/cpu.h#L32

As there are OS/CPU bugs that cause issues, standardizing on one library makes sense. As an author of the ART code, I have a bias :-)

Fatal error: 'cpu-features.h' file not found

I downloaded lib and copied it to my project directory. My CMakeLists.txt :

 cmake_minimum_required(VERSION 3.6)

 add_subdirectory(cpu_features)

 add_library(
     myLib
     SHARED
     app/src/main/cpp/test.cpp
)
 
 link_libraries(myLib cpu_features)

My test.cpp file:

#include "test.h"
#include <cpu-features.h>

I get this error: 10: fatal error: 'cpu-features.h' file not found.

Please tell me what I'm doing wrong.
Thanks.

CLI: JSON output

It would be super-handy to have a -o json or a --json flag so that output from the CLI would be easy to parse and consume by other programs.

Inspired by #19
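As a rough idea of what such output could involve, the sketch below formats a list of flag names as a small JSON object into a caller-supplied buffer, without allocating memory. The function name and the exact JSON shape are assumptions for illustration, and the sketch assumes flag names never contain characters that need JSON escaping.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes {"flags":["a","b",...]} into out.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if the buffer is too small. Assumes flag names need no escaping. */
static int ExampleFormatFlagsJson(const char* const* flags, size_t count,
                                  char* out, size_t out_size) {
  size_t used = 0;
  int n = snprintf(out, out_size, "{\"flags\":[");
  if (n < 0 || (size_t)n >= out_size) return -1;
  used = (size_t)n;
  for (size_t i = 0; i < count; ++i) {
    /* Separate entries with a comma, except before the first one. */
    n = snprintf(out + used, out_size - used, "%s\"%s\"",
                 i > 0 ? "," : "", flags[i]);
    if (n < 0 || (size_t)n >= out_size - used) return -1;
    used += (size_t)n;
  }
  n = snprintf(out + used, out_size - used, "]}");
  if (n < 0 || (size_t)n >= out_size - used) return -1;
  return (int)(used + (size_t)n);
}
```

Writing into a fixed buffer keeps the helper allocation-free, in line with the library's stated design goals.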

Allow building as a shared library

This would allow languages that can call into shared libraries to natively wrap this awesome library. 🙂 I would submit a PR for this, but my CMake isn't so good.

Compile-time checks not OK?

Hi, when following your readme my compiler complained and I can't disagree with it: Your code
CPU_FEATURES_COMPILED_X86_AVX || features.avx;
after CPP expands to
defined(__AVX__) || features.avx;
and that doesn't compile (or is there a trick to get the CPP-command defined to become executable?).

--> Wouldn't it be better to do something like this in cpu_features_macros.h:

#if defined(__AVX__)
#define CPU_FEATURES_COMPILED_X86_AVX 1
#else
#define CPU_FEATURES_COMPILED_X86_AVX 0
#endif

instead of the current

#define CPU_FEATURES_COMPILED_X86_AVX defined(__AVX__)

to make your sample code work?

Or should one use your sample code in a specific / different way than by simply gcc-ing it?

SSSE3, SSE4.1, SSE4.2 not detected on non-AVX CPUs

ParseCpuId is coded to require the XMM XCR0 bit in order to recognize SSSE3, SSE4.1, and SSE4.2, but this is not the right way to detect it. Machines without AVX will not have XCR0 at all, and will fail this check.

In the absence of XCR0, on x86-32, the proper way to detect SSE is to ask the operating system. (On x86-64, just assume yes.) In Windows, that's IsProcessorFeaturePresent(PF_XMMI_INSTRUCTIONS_AVAILABLE). In Linux, unfortunately, it's parsing /proc/cpuinfo. In macOS and iOS (simulator), it's sysctlbyname on hw.optional.sse, though you could also just assume yes.

Something could be said that the number of x86 machines without SSE in the CPU and OS is about zero now--it depends on how far back you want your library to support.
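The decision logic proposed above could be sketched roughly as follows. All names are hypothetical, and the os_reports_sse parameter stands in for whichever OS-specific query applies (IsProcessorFeaturePresent, /proc/cpuinfo, sysctlbyname). The sketch assumes XCR0 is only consulted when the OS has enabled XSAVE (the OSXSAVE bit reported by cpuid), with bit 1 of XCR0 covering SSE (XMM) state.

```c
#include <stdbool.h>
#include <stdint.h>

/* XCR0 bit 1 indicates that the OS manages SSE (XMM) register state. */
#define EXAMPLE_XCR0_SSE (UINT32_C(1) << 1)

/* Decides whether the OS supports SSE state.
 * has_osxsave: whether cpuid reports OSXSAVE, i.e. XCR0 may be read.
 * xcr0: low 32 bits of XCR0; ignored when has_osxsave is false.
 * os_reports_sse: answer from an OS-specific query used as a fallback. */
static bool ExampleOsSupportsSse(bool has_osxsave, uint32_t xcr0,
                                 bool os_reports_sse) {
  if (has_osxsave) return (xcr0 & EXAMPLE_XCR0_SSE) != 0;
  return os_reports_sse;
}

/* An SSE-family feature needs both the cpuid bit and OS support. */
static bool ExampleHasSsse3(bool cpuid_ssse3, bool os_supports_sse) {
  return cpuid_ssse3 && os_supports_sse;
}
```

On a machine without XSAVE, the result then depends only on the OS query rather than on an XCR0 value that cannot be read.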

suitability of using cpu_features for fundamental libc functions

Thread safe, no memory allocation, and raises no exceptions. cpu_features is suitable for implementing fundamental libc functions like malloc, memcpy, and memcmp.

The README claims the above. But grepping through the code shows that at least memcpy and memcmp from libc are already used. Hence, implementing those functions on top of cpu_features would create a dependency loop. I assume cpu_features would need rework so that it does not use libc functions internally.

Need better platform code separation

I was interested in adding iOS, macOS and Windows support, but the layout of the code is not conducive to this. src/cpuinfo_aarch64.c is highly Linux-specific. For an ARM64 Windows implementation, where would the code go? (In Windows, the ARM64 implementation would be based around IsProcessorFeaturePresent.)

Small driver program to test from command line

It would be helpful in many ways to have a small driver program that would exercise the features of this library and would emit output on stdout after detecting all available features. This would allow for easier testing of the library on new platforms, as well as making it possible to integrate cpu_features into non-C99 applications.

Unable to build this on macOS

@gchatelet , I ran into an error while running this command on the macOS system


Abhinavs-MacBook-Pro:cpu_features eklavya$ cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -H. -Bcmake_build
-- The C compiler identification is AppleClang 9.1.0.9020039
-- The CXX compiler identification is AppleClang 9.1.0.9020039
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/eklavya/projects/code/c-shared-lab/cpu_features/cmake_build


Abhinavs-MacBook-Pro:cpu_features eklavya$ cd cmake_build/


Abhinavs-MacBook-Pro:cmake_build eklavya$ make 
Scanning dependencies of target cpu_features
[  6%] Building C object CMakeFiles/cpu_features.dir/src/linux_features_aggregator.c.o
[ 13%] Building C object CMakeFiles/cpu_features.dir/src/cpuid_x86_clang_gcc.c.o
[ 20%] Building C object CMakeFiles/cpu_features.dir/src/cpuid_x86_msvc.c.o
[ 26%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_aarch64.c.o
[ 33%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_arm.c.o
[ 40%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_mips.c.o
[ 46%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_ppc.c.o
[ 53%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_x86.c.o
[ 60%] Building C object CMakeFiles/cpu_features.dir/src/filesystem.c.o
[ 66%] Building C object CMakeFiles/cpu_features.dir/src/hwcaps.c.o
[ 73%] Building C object CMakeFiles/cpu_features.dir/src/stack_line_reader.c.o
[ 80%] Building C object CMakeFiles/cpu_features.dir/src/string_view.c.o
[ 86%] Linking C shared library libcpu_features.dylib
Undefined symbols for architecture x86_64:
  "_CpuFeatures_GetPlatformType", referenced from:
      _GetPPCPlatformStrings in cpuinfo_ppc.c.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [libcpu_features.dylib] Error 1
make[1]: *** [CMakeFiles/cpu_features.dir/all] Error 2
make: *** [all] Error 2

Link failure under centos 6 for the shared lib

When trying to build with -DBUILD_SHARED_LIBS=ON :
...
[ 77%] Linking C shared library libcpu_features.so
cmake-3.13.3/bin/cmake -E cmake_link_script CMakeFiles/cpu_features.dir/link.txt --verbose=1
gcc-6.1.0/bin/gcc -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libcpu_features.so -o libcpu_features.so CMakeFiles/cpu_features.dir/src/cpuinfo_x86.c.o CMakeFiles/utils.dir/src/filesystem.c.o CMakeFiles/utils.dir/src/stack_line_reader.c.o CMakeFiles/utils.dir/src/string_view.c.o -ldl
/usr/bin/ld: CMakeFiles/utils.dir/src/string_view.c.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
CMakeFiles/utils.dir/src/string_view.c.o: could not read symbols: Bad value

The static lib builds ok, so not a blocking bug (for me) but still.

Instructions on how to use library at compile time

There are demo snippets in the README showing how to use the library at the source level, and there is a building with cmake part about building it, but it would be nice to have a section that explains how to actually to add the library to your build: e.g., do you need to link against any binary? or just include the header?

You cover how a project using cmake should consume this, but a large number of projects aren't using cmake, so even though they'll build cpu_features itself with cmake, they should be able to then integrate the output into their own build system.

I guess one or two lines about that scenario would suffice.

New tag request

0.2 tag is already several months old.
Would you now be open to creating a new tag (0.3.0 or whatever)?
Kind

Mips implementation is buggy

The hwcaps.h MIPS constants

#define MIPS_HWCAP_VZ (1UL << 0)
#define MIPS_HWCAP_EVA (1UL << 1)
#define MIPS_HWCAP_HTW (1UL << 2)
#define MIPS_HWCAP_FPU (1UL << 3)
#define MIPS_HWCAP_MIPS32R2 (1UL << 4)
#define MIPS_HWCAP_MIPS32R5 (1UL << 5)
#define MIPS_HWCAP_MIPS64R6 (1UL << 6)
#define MIPS_HWCAP_DSPR1 (1UL << 7)
#define MIPS_HWCAP_DSPR2 (1UL << 8)
#define MIPS_HWCAP_MSA (1UL << 9)

are not valid; they should match the values defined by the Linux kernel instead.
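For comparison, the sketch below uses the two bit positions that the Linux MIPS uapi hwcap header is believed to define, HWCAP_MIPS_R6 as bit 0 and HWCAP_MIPS_MSA as bit 1. Treat these values as an assumption and verify them against your kernel headers; the EXAMPLE_ names and helper functions are hypothetical.

```c
#include <stdbool.h>

/* Assumed to mirror the Linux MIPS uapi hwcap header (verify locally):
 * bit 0 = MIPS release 6, bit 1 = MSA. */
#define EXAMPLE_HWCAP_MIPS_R6 (1UL << 0)
#define EXAMPLE_HWCAP_MIPS_MSA (1UL << 1)

/* hwcap is the value reported by the kernel, e.g. via AT_HWCAP. */
static bool ExampleHasMipsR6(unsigned long hwcap) {
  return (hwcap & EXAMPLE_HWCAP_MIPS_R6) != 0;
}

static bool ExampleHasMsa(unsigned long hwcap) {
  return (hwcap & EXAMPLE_HWCAP_MIPS_MSA) != 0;
}
```

Under that assumption, a kernel value of 0x2 means MSA, whereas the constants quoted above would read the same value as EVA.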

How to compile this as a shared library ?

Hi everyone,

I was wondering if you could please tell how I could compile this as a shared library ?

I tried using cmake but so far, without any success. Please point me to a better direction

build process requires outbound internet connections

I would appreciate it if the documentation would reflect the fact that this library downloads approximately 55,000 lines of code from github at build time. This can cause problems for people who expect a build to work within sandboxed package-build environments.

Missing ARMv7 define

Thanks for the cpu_features library.

It looks like the library is missing a define for ARMv7. ARMv7 is important for some hand tuned algorithms. For example, one might bundle Andy Polyakov's Cryptogams AES and SHA algorithms, and make the baseline ARMv7 with -march=armv7-a.

There are several reasons to make ARMv7 the baseline. ARMv7 is mostly the de facto standard nowadays, so using ARMv7 addresses the common case out of the box. ARMv7 ISA also enjoys a performance boost over older ISAs, and provides additional loads and stores not available in ARMv6 and below. However, ARMv5 and ARMv6 will show up on occasion, like old dev-boards and iPads, so a fallback is needed on occasion.

At runtime, user code may do something like the following:

if (HasARMv7()) {
    CRYPTOGAMS_aes_encrypt(data, blocks, subkeys);
} else {
    CXX_aes_encrypt(data, blocks, subkeys);
}

We can't really use ARM_NEON as a proxy for ARMv7 because some ARMv6 devices have NEON. And some ARMv7 dev-boards lack NEON.


It looks like Mozilla synthesises it from /proc/cpuinfo on Linux; see Changeset 522951ff7046. Crypto++ library utilizes a SIGILL probe using movw and movt because getauxval does not provide a define (movw and movt are part of ARMv7 ISA):

int a;
asm volatile("movw %0,%1 \n"
             "movt %0,%1 \n"
             : "=r"(a) : "i"(0x1234));

hasARMv7 = (a == 0x12341234);

Unfortunately, SIGILL probes have several shortcomings. First, they are expensive when compared to getauxval and friends. Second, they only work with the GNU Assembler (GAS) or compatible assemblers. Third, they trash a program on Apple platforms. Apple does not restore the context properly when a longjmp is taken, so they can't be used on iPhones and iPads.

Overwriting windows syscalls

Windows, specifically the Win32 API, has functions called ReadFile, OpenFile and CloseFile. You define all three in filesystem.c without a namespace prefix, thereby overriding the Win32 syscalls for everybody who links with your library. This is a huge oversight and cost me a couple of hours of trying to figure out why my programs all suddenly started to crash when reading files.

int OpenFile(const char* filename) { return _open(filename, _O_RDONLY); }
void CloseFile(int file_descriptor) { _close(file_descriptor); }
int ReadFile(int file_descriptor, void* buffer, size_t buffer_size) {
  return _read(file_descriptor, buffer, buffer_size);
}
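One common way to avoid this kind of clash in C is to give every externally visible symbol a library-specific prefix. The sketch below shows the idea with POSIX calls so that it compiles on Unix-like systems; on Windows the bodies would use _open, _close and _read as in the code above. The CpuFeatures_ prefix is borrowed from other symbol names visible in the build logs in this document; the exact function names are an assumption for illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Prefixed names cannot collide with OpenFile/ReadFile from other APIs. */
int CpuFeatures_OpenFile(const char* filename) {
  return open(filename, O_RDONLY);
}

void CpuFeatures_CloseFile(int file_descriptor) { close(file_descriptor); }

/* Returns the number of bytes read, 0 at end of file, or -1 on error. */
int CpuFeatures_ReadFile(int file_descriptor, void* buffer,
                         size_t buffer_size) {
  return (int)read(file_descriptor, buffer, buffer_size);
}
```

Helpers that are only used within one source file can additionally be declared static so they never reach the linker's global namespace.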

Unable to build this on ARMv7 server

I tried it on my bare-metal ARMv7 C1 instance on Scaleway and it failed to compile again:

root@tuffy:~/projects/code/c-shared-libs# cd cpu_features/
root@tuffy:~/projects/code/c-shared-libs/cpu_features# cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -H. -Bcmake_build
-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /root/projects/code/c-shared-libs/cpu_features/cmake_build
root@tuffy:~/projects/code/c-shared-libs/cpu_features# cd cmake_build/
root@tuffy:~/projects/code/c-shared-libs/cpu_features/cmake_build# make
Scanning dependencies of target cpu_features
[  6%] Building C object CMakeFiles/cpu_features.dir/src/linux_features_aggregator.c.o
[ 13%] Building C object CMakeFiles/cpu_features.dir/src/cpuid_x86_clang_gcc.c.o
[ 20%] Building C object CMakeFiles/cpu_features.dir/src/cpuid_x86_msvc.c.o
[ 26%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_aarch64.c.o
[ 33%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_arm.c.o
[ 40%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_mips.c.o
[ 46%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_ppc.c.o
[ 53%] Building C object CMakeFiles/cpu_features.dir/src/cpuinfo_x86.c.o
[ 60%] Building C object CMakeFiles/cpu_features.dir/src/filesystem.c.o
[ 66%] Building C object CMakeFiles/cpu_features.dir/src/hwcaps.c.o
[ 73%] Building C object CMakeFiles/cpu_features.dir/src/stack_line_reader.c.o
[ 80%] Building C object CMakeFiles/cpu_features.dir/src/string_view.c.o
[ 86%] Linking C shared library libcpu_features.so
[ 86%] Built target cpu_features
Scanning dependencies of target list_cpu_features
[ 93%] Building C object CMakeFiles/list_cpu_features.dir/src/utils/list_cpu_features.c.o
[100%] Linking C executable list_cpu_features
libcpu_features.so: undefined reference to `GetXCR0Eax'
libcpu_features.so: undefined reference to `CpuId'
collect2: error: ld returned 1 exit status
CMakeFiles/list_cpu_features.dir/build.make:95: recipe for target 'list_cpu_features' failed
make[2]: *** [list_cpu_features] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/list_cpu_features.dir/all' failed
make[1]: *** [CMakeFiles/list_cpu_features.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

Am I doing something wrong here ?

Shared build not working?

Positive statement first: All works fine when doing static builds. Our project that wants to use cpu_features has to build its libraries shared, though, too, and there I'm running into problems:

I read and followed these suggestions (#44 ) as well as the subproject build instructions but to no avail:

  1. When doing a cpu_features "local" cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DBUILD_PIC=ON -H. -Bcmake_build this is the result of make on Ubuntu:
/usr/bin/ld: CMakeFiles/utils.dir/src/string_view.c.o: relocation R_X86_64_PC32 against symbol `CpuFeatures_StringView_IndexOf' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
CMakeFiles/cpu_features.dir/build.make:100: recipe for target 'libcpu_features.so' failed
make[2]: *** [libcpu_features.so] Error 1
CMakeFiles/Makefile2:178: recipe for target 'CMakeFiles/cpu_features.dir/all' failed
make[1]: *** [CMakeFiles/cpu_features.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2
  2. When doing this embedded in our larger project, not only can I not build the shared lib, even this happens during make:
[ 95%] Linking C executable list_cpu_features
CMakeFiles/list_cpu_features.dir/src/utils/list_cpu_features.c.o: In function `CreateTree':
list_cpu_features.c:(.text+0x575): undefined reference to `GetX86Info'
list_cpu_features.c:(.text+0x57d): undefined reference to `GetX86CacheInfo'
list_cpu_features.c:(.text+0x585): undefined reference to `FillX86BrandString'
list_cpu_features.c:(.text+0x676): undefined reference to `GetX86Microarchitecture'
list_cpu_features.c:(.text+0x67d): undefined reference to `GetX86MicroarchitectureName'
list_cpu_features.c:(.text+0x6c9): undefined reference to `GetX86FeaturesEnumValue'
list_cpu_features.c:(.text+0x6d9): undefined reference to `GetX86FeaturesEnumName'
collect2: error: ld returned 1 exit status
src/cpu_features/CMakeFiles/list_cpu_features.dir/build.make:95: recipe for target 'src/cpu_features/list_cpu_features' failed
make[2]: *** [src/cpu_features/list_cpu_features] Error 1
CMakeFiles/Makefile2:432: recipe for target 'src/cpu_features/CMakeFiles/list_cpu_features.dir/all' failed
make[1]: *** [src/cpu_features/CMakeFiles/list_cpu_features.dir/all] Error 2
Makefile:129: recipe for target 'all' failed

--> Key question: Is the use of cpu_features as a shared lib not just discouraged, but even no longer supported? If so do you see any way how to integrate it into a project that has to build its libraries shared? If so could you update your guidance for integration ? Thanks in advance!

Completed missing ARM features

  • swp SWP instruction (atomic read-modify-write)
  • _26bit "26 Bit" Model (Processor status register folded into program counter)
  • fpa Floating point accelerator
  • crunch MaverickCrunch coprocessor
  • thumbee ThumbEE
  • vfpd32 VFP with 32 D-registers
  • lpae Large Physical Address Extension (>4GB physical memory on 32-bit architecture)
  • evtstrm kernel event stream using generic architected timer

fixed by #79

Frequency scaling can hurt, even in same family

This looks like a fantastically useful library.

Just a point for consideration, given that one of the targets is to make it easier to write fast code:
https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/

Note that same generation, same feature set can have wildly different performance depending on if it's "Silver", "Gold" or "Platinum", none of which is exposed by flags. Silver, for example, is de-tuned for AVX-512 (as someone puts it in the comments there).

You can parse that out of /proc/cpuinfo, if you're solely looking at Linux use cases, but this seems like something that would be ideal to find out via this library.

CPU flags not correctly detected

Although cat /proc/cpuinfo shows SSE3, SSE4.1 and SSE4.2, list_cpu_features doesn't list them.

Output of list_cpu_features (compiled from source as of the time of posting):

arch            : x86
brand           : Intel(R) Core(TM) i7 CPU       M 620  @ 2.67GHz
family          :   6 (0x06)
model           :  37 (0x25)
stepping        :   2 (0x02)
uarch           : INTEL_WSM
flags           : aes,cx16,smx
leewp14@CTK942S:~/Desktop/cpu_features-master$ 

Output of cat /proc/cpuinfo :

[sudo] password for leewp14: 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 37
model name	: Intel(R) Core(TM) i7 CPU       M 620  @ 2.67GHz
stepping	: 2
microcode	: 0xe
cpu MHz		: 1199.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm pti tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs		: cpu_meltdown spectre_v1 spectre_v2
bogomips	: 5319.90
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 37
model name	: Intel(R) Core(TM) i7 CPU       M 620  @ 2.67GHz
stepping	: 2
microcode	: 0xe
cpu MHz		: 1199.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 2
apicid		: 4
initial apicid	: 4
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm pti tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs		: cpu_meltdown spectre_v1 spectre_v2
bogomips	: 5319.90
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 37
model name	: Intel(R) Core(TM) i7 CPU       M 620  @ 2.67GHz
stepping	: 2
microcode	: 0xe
cpu MHz		: 1199.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm pti tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs		: cpu_meltdown spectre_v1 spectre_v2
bogomips	: 5319.90
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 37
model name	: Intel(R) Core(TM) i7 CPU       M 620  @ 2.67GHz
stepping	: 2
microcode	: 0xe
cpu MHz		: 1199.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 2
apicid		: 5
initial apicid	: 5
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm pti tpr_shadow vnmi flexpriority ept vpid dtherm arat
bugs		: cpu_meltdown spectre_v1 spectre_v2
bogomips	: 5319.90
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

leewp14@CTK942S:~$ 

vpclmulqdq should be pclmulqdq

vpclmulqdq should be pclmulqdq. The "v" form of the instruction is simply the same instruction encoded in AVX format instead of SSE format.
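The naming matters when matching against the `flags` line of /proc/cpuinfo, which in the dump above lists `pclmulqdq`. Because one name is a substring of the other, a plain substring search can confuse them. Below is a minimal sketch of whole-token matching; `HasCpuinfoFlag` is a hypothetical helper written for illustration, not a cpu_features function.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole whitespace-separated token in
 * `flags`, and 0 otherwise. Hypothetical helper, not part of cpu_features. */
static int HasCpuinfoFlag(const char* flags, const char* flag) {
  const size_t flag_len = strlen(flag);
  const char* p = flags;
  if (flag_len == 0) return 0;
  while ((p = strstr(p, flag)) != NULL) {
    /* The match must start at the beginning of the line or after a separator. */
    const int starts_token = (p == flags) || (p[-1] == ' ') || (p[-1] == '\t');
    /* The match must end at the end of the line or before a separator. */
    const char end = p[flag_len];
    const int ends_token =
        (end == '\0') || (end == ' ') || (end == '\t') || (end == '\n');
    if (starts_token && ends_token) return 1;
    p += 1; /* Keep searching after this partial match. */
  }
  return 0;
}
```

With this helper, searching for `pclmulqdq` in a line that only contains `vpclmulqdq` does not report a match, and vice versa.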

Support disabling of extensions

To test all code paths of apps that use cpu_features to select between different implementations of an algorithm, it would be useful to be able to mock having a CPU with fewer features than actually supported.

While this can be done in a wrapper layer that uses cpu_features to query the real feature set, little prevents code from calling cpu_features functions directly, which would defeat the purpose of the wrapper. Hence I think it would be useful if this were a core feature of cpu_features itself.

One complication is that, for tests to be able to run concurrently, the override would have to be thread-local (assuming the tests are single-threaded). For multi-threaded tests, a global mutex could be used to prevent concurrent execution from the point where cpu_features is given the override options to the point where they are removed.

Multiple users of cpu_features have to deal with this themselves currently, so having it as a core feature instead avoids duplicate effort and incompatible solutions which might have subtle bugs that are best fixed centrally.
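As a rough illustration of the thread-local variant of this request, here is a minimal sketch. Everything in it is an assumption made for the example: the feature bits, `DetectFeatures`, and the override functions are hypothetical and are not part of cpu_features, and `_Thread_local` requires C11 rather than the C99 the library targets.

```c
#include <stdint.h>

/* Hypothetical feature bits for illustration; not the cpu_features API. */
enum {
  FEATURE_AES = 1 << 0,
  FEATURE_SSE4_2 = 1 << 1,
  FEATURE_AVX2 = 1 << 2
};

/* Per-thread mask of features to hide. Zero means "no override". */
static _Thread_local uint32_t g_disabled_features = 0;

/* Stand-in for real hardware detection: pretends every feature is present. */
static uint32_t DetectFeatures(void) {
  return FEATURE_AES | FEATURE_SSE4_2 | FEATURE_AVX2;
}

/* Hides the given features from the calling thread only. */
static void DisableFeaturesForThisThread(uint32_t mask) {
  g_disabled_features = mask;
}

/* Removes the calling thread's override. */
static void ClearFeatureOverride(void) { g_disabled_features = 0; }

/* Detected features minus whatever the calling thread has disabled. */
static uint32_t GetEffectiveFeatures(void) {
  return DetectFeatures() & ~g_disabled_features;
}

/* Returns 1 if every bit in `feature` is effectively available. */
static int HasFeature(uint32_t feature) {
  return (GetEffectiveFeatures() & feature) == feature;
}
```

A single-threaded test could call `DisableFeaturesForThisThread(FEATURE_AES)`, exercise the fallback code path, and then call `ClearFeatureOverride()`. Other threads are unaffected because the mask is thread-local.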

this is not c89

You've got a build system that depends on C++, // comments in your code, and the inline keyword in string_view.h, not to mention the stdint.h includes and uintxx_t types scattered everywhere.

Attempting to compile this even once with -std=c89 would have revealed these problems. I recommend you put, at least,
set(CMAKE_C_FLAGS "-std=c89 ${CMAKE_C_FLAGS}")
in your CMakeLists.txt file.
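For comparison, here is a small sketch of what C89-clean code looks like for the constructs listed above: block comments only, no `inline`, and no stdint.h types. The `View` type and the function names are illustrative stand-ins, not the library's actual string_view.h.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for a string view; not the library's StringView. */
typedef struct {
  const char* ptr;
  size_t size;
} View;

/* Plain `static` instead of `static inline`: C89 has no inline keyword. */
static View MakeView(const char* s) {
  View v; /* C89 requires declarations at the top of the block. */
  v.ptr = s;
  v.size = strlen(s);
  return v;
}

/* Returns 1 if both views hold the same bytes, 0 otherwise. */
static int ViewsEqual(View a, View b) {
  if (a.size != b.size) return 0;
  return memcmp(a.ptr, b.ptr, a.size) == 0;
}

/* unsigned long is at least 32 bits wide in C89, so it can hold a 32-bit
 * register value without relying on uint32_t from stdint.h. */
static int IsBitSet(unsigned long reg, unsigned int bit) {
  return (int)((reg >> bit) & 1UL);
}
```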

CMake: Install Missing

Hi and thank you for the great project!

Can you please add install() lines to your CMake code? This would allow passing -DCMAKE_INSTALL_PREFIX to the cmake call and installing the final library via make install.

Here is an example of how to install targets in modern CMake:

If you want, you can also generate a little *Config.cmake file on install which makes it easy to find the library in projects depending on it.
