apnn-tc's Issues

Question about the bit-width of outputs of int1 Tensor Core

Hello, thanks for your wonderful work and for making the code available! While reading the paper, I came up with the following questions.

In the paper, the authors claim that "the int1 Tensor Core compute primitive can only generate 32 outputs". A 32-bit output seems relatively wide for an int1 matrix multiply-accumulate (MMA), so I want to check whether the output bit-width of the int1 MMA can be controlled.

First, I tried to access the white paper in reference [32], https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-amperearchitecture-whitepaper.pdf, but I got the error "Even AI can't find this page!".

Then I opened https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf to search for the relevant information. However, I still cannot find any description of how the bit-width of the Tensor Core outputs is set.

So, my questions are:
(1) Where can I find a description, in the white papers or other documents, that supports the claim that "the int1 Tensor Core compute primitive can only generate 32 outputs"?
(2) Is there any way to control the output bit-width of the int1 Tensor Core computation?
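
To make the question concrete, here is a minimal CPU sketch of how I understand the int1 MMA accumulation: XOR the packed bits, count the set bits, and add the count to a 32-bit integer accumulator. It assumes the accumulator is a 32-bit `int`, which is how I read the CUDA WMMA sub-byte interface; this is my reading, not something confirmed by the paper. The names `Row`, `xor_popc_dot` and `saturate_to_int8` are hypothetical and are not part of APNN-TC or CUDA. The narrowing step is done in software after accumulation, because I do not know of a way to request a narrower accumulator from the hardware.

```cpp
#include <algorithm>
#include <array>
#include <bitset>
#include <cassert>
#include <cstdint>

// Assumption: one int1 step covers K = 128 bits, packed as four 32-bit words.
constexpr int kWords = 4;
using Row = std::array<uint32_t, kWords>;

// Emulates one XOR + popcount step: counts the bit positions where a and b
// differ and adds that count to a 32-bit accumulator.
int32_t xor_popc_dot(const Row& a, const Row& b, int32_t acc) {
    for (int i = 0; i < kWords; ++i) {
        acc += static_cast<int32_t>(std::bitset<32>(a[i] ^ b[i]).count());
    }
    return acc;
}

// Software narrowing applied after accumulation: clamp to the int8 range.
int8_t saturate_to_int8(int32_t v) {
    return static_cast<int8_t>(std::min<int32_t>(127, std::max<int32_t>(-128, v)));
}

// Small self-check of the two helpers on hand-computed inputs.
void check_example() {
    const Row ones = {{0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu}};
    const Row zeros = {{0u, 0u, 0u, 0u}};
    assert(xor_popc_dot(ones, zeros, 0) == 128);  // all 128 bits differ
    assert(saturate_to_int8(128) == 127);         // 128 does not fit in int8
}
```

A single 128-bit step contributes at most 128 to the accumulator, but the value keeps growing as more steps along K are accumulated, which is presumably why a wide accumulator is used.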

Looking forward to your reply.
Best wishes!

Compiled a .bin file. How do I run it?

Hi! I am studying your code. First, I downloaded the whole zip file. Second, I put the cutlass package into the cutlass directory. Then I changed into cutlass_kernel and ran "make all", and it worked! It printed something like this:
(base) C:\trash_can\APNN-TC-main\cutlass_kernel>make all
nvcc -I../cutlass/include -I../cutlass/tools/util/include -I../cutlass/examples/common -std=c++11 -O3 -w -arch=sm_86 bench_gemm.cu -o bench_gemm.bin
nvcc warning : The -std=c++11 flag is not supported with the configured host compiler. Flag will be ignored.
bench_gemm.cu
Creating library bench_gemm.lib and object bench_gemm.exp
So now I have a bench_gemm.bin, but I am not sure how to run it. Previously nvcc has given me a.exe on Windows and a.out on Linux as output files, but I have never seen a .bin file, and I could not find anything on Google either.

Thank you!!!!
