hardware-perfcounter's Introduction

HardwarePerfCounter

HardwarePerfCounter is a set of libraries and utilities for sampling hardware performance counters for benchmarking/profiling purposes. It currently focuses on GPUs.

Rationale

Benchmarking/profiling is crucial for improving application performance. However, the GPU landscape is diverse, with many different architectures from quite a few hardware vendors. To understand fine-grained performance details, we often need to resort to vendor-specific tools like AMD Radeon GPU Profiler, ARM Mobile Studio, NVIDIA Nsight, and Qualcomm Snapdragon Profiler.

These are all-encompassing tool suites that provide many utilities and aim to profile the whole system. That is great if we just want an integrated solution for one vendor, but less so if we need to support multiple vendors, or if we already have our own benchmarking/profiling solution in the development flow and just need a non-intrusive plugin.

This project aims to provide a cross-vendor, lightweight, and embeddable library for sampling GPU performance counters to serve such needs. It is inspired by HWCPipe. However, HWCPipe only supports Mali GPUs, and it uses many C++ features (STL, exceptions, etc.) that cannot be configured away.

Goals and Features

  • Cross-vendor: for now, Adreno GPUs and Mali GPUs on Android/Linux. Others to come.
  • Layered abstractions: there are abstractions across vendors, specific to one vendor, and specific to one vendor product category. You can target broadly with fewer counter choices, or target a specific product with more counter choices.
  • Layered libraries: the core libraries are implemented in C and directly talk to kernel drivers under the hood. C++ libraries (to be added later) are on top of the C libraries.
  • Flexible building configurations: follows good CMake practice for integration. There are CMake options for each vendor. There is proper support for installation and target import/export.

Status

Right now only the low-level APIs are implemented, which expose vendor-specific counters as is. Such access will stay, as it gives users the choice to dive into the details.
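Because the low-level APIs hand back raw counter values, turning them into a metric is left to the caller. Below is a minimal sketch of that step. It assumes counters are cumulative 64-bit values sampled before and after a workload, which is an assumption and not something this README states. `compute_deltas` and `miss_ratio` are hypothetical helpers, not part of this library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of HardwarePerfCounter): per-counter
// difference between two samples. Assumes cumulative 64-bit counters;
// unsigned subtraction stays correct across a single wraparound.
static void compute_deltas(const uint64_t *before, const uint64_t *after,
                           uint64_t *deltas, size_t count) {
  assert(before != NULL && after != NULL && deltas != NULL);
  for (size_t i = 0; i < count; ++i) {
    deltas[i] = after[i] - before[i];
  }
}

// Hypothetical derived metric: fraction of requests that missed.
// Returns 0.0 when there were no requests, to avoid dividing by zero.
static double miss_ratio(uint64_t requests, uint64_t misses) {
  if (requests == 0) return 0.0;
  return (double)misses / (double)requests;
}
```

For example, given deltas for a cache request counter and its matching miss counter taken over the same interval, `miss_ratio(deltas[0], deltas[1])` yields the miss ratio for that interval.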

The next step is to build high-level APIs to provide vendor-agnostic counters.

Dependencies

This repository requires a common C/C++ project development environment:

  • CMake with version >= 3.13
  • Optional Ninja build system
  • A C/C++ compiler that supports C11/C++14

Building

Android

git clone https://github.com/google/HardwarePerfCounter.git
cd HardwarePerfCounter

cmake -G Ninja -S ./ -B build-android/  \
  -DCMAKE_TOOLCHAIN_FILE="${ANDROID_NDK?}/build/cmake/android.toolchain.cmake" \
  -DANDROID_ABI="arm64-v8a" -DANDROID_PLATFORM=android-30
cmake --build build-android/

Here ANDROID_NDK is the path to the Android NDK installation. See Android's CMake guide for an explanation of ANDROID_ABI and ANDROID_PLATFORM.

Linux/macOS

git clone https://github.com/google/HardwarePerfCounter.git
cd HardwarePerfCounter

cmake -G Ninja -S ./ -B build/
cmake --build build/

Windows

Not supported yet.

hardware-perfcounter's People

Contributors

antiagainst

hardware-perfcounter's Issues

How much data can be fetched in one counter ioctl?

I followed the example and created a counters array to query some data.
I found that the data for each counter is inconsistent when the counters are listed in a different order.
For example, I created the counter array like this:
`hpc_gpu_adreno_common_counter_t counters[] = {
  HPC_GPU_ADRENO_COMMON_CP_BUSY_GFX_CORE_IDLE,
  HPC_GPU_ADRENO_COMMON_CP_BUSY_CYCLES,

  HPC_GPU_ADRENO_COMMON_TP_L1_CACHELINE_REQUESTS,
  HPC_GPU_ADRENO_COMMON_TP_L1_CACHELINE_MISSES,

  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_POINT,
  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_BILINEAR,
  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_MIP,
  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_ANISO,
  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_ZERO_LOD,

  HPC_GPU_ADRENO_COMMON_PC_VERTEX_HITS,
  HPC_GPU_ADRENO_COMMON_PC_NON_DRAWCALL_GLOBAL_EVENTS,
};`

I got the resulting counter data; call it data1.
Then I changed the counter array to:
`hpc_gpu_adreno_common_counter_t counters[] = {
  HPC_GPU_ADRENO_COMMON_PC_VERTEX_HITS,
  HPC_GPU_ADRENO_COMMON_PC_NON_DRAWCALL_GLOBAL_EVENTS,

  HPC_GPU_ADRENO_COMMON_CP_BUSY_GFX_CORE_IDLE,
  HPC_GPU_ADRENO_COMMON_CP_BUSY_CYCLES,

  HPC_GPU_ADRENO_COMMON_TP_L1_CACHELINE_REQUESTS,
  HPC_GPU_ADRENO_COMMON_TP_L1_CACHELINE_MISSES,

  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_POINT,
  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_BILINEAR,
  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_MIP,
  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_ANISO,
  HPC_GPU_ADRENO_COMMON_TP_OUTPUT_PIXELS_ZERO_LOD,
};`
I got the resulting counter data; call it data2.
The only change is that the last two items were moved to the front. On the same phone and in the same scene, the values for PC_VERTEX_HITS and PC_NON_DRAWCALL_GLOBAL_EVENTS are of a completely different magnitude compared with data1.

Is there a limit on the number of counters that can be requested in one counter ioctl?

Ninja build error. What is missing?

ninja: error: '../third_party/envytools/registers/adreno/a5xx.xml', needed by '../lib/gpu/adreno/a5xx.c', missing and no known rule to make it

Could you make the README more specific?
For example: NDK version, targetSDK version, the specific C/C++ compiler and its version, and the Ninja version. Thanks.

Out of memory error in example

I compiled for Android by following the README, and I get:

crate context: Out of memory

This happens 99% of the time when I run mali_common_c_example on a Pixel 6 Pro.

Where can I find descriptions of the Adreno counters?

I want to collect the same performance data as Snapdragon Profiler. However, I can't find any information on Qualcomm's websites about the counters this project collects, and the counters differ from those in Snapdragon Profiler.

Could you give me some suggestions? Thanks a lot.
