apnn-tc's Issues
Question about the bit-width of outputs of int1 Tensor Core
Hello, thanks for your wonderful work and the available code! While reading the paper, I had the following questions.
In the paper, the authors state that "the int1 Tensor Core compute primitive can only generate 32 outputs". A 32-bit output seems relatively wide for an int1 matrix multiply-accumulate (MMA), so I want to check whether we can control the output bit-width of the int1 MMA.
First, I tried to access the white paper in reference [32], https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-amperearchitecture-whitepaper.pdf, but got an error saying "Even AI can't find this page!".
Then I accessed https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf and searched for related information in that white paper, but I still could not find any description of how the Tensor Core output bit-width is set.
So, my questions are:
(1) Where can I find a description, in a white paper or other documentation, that supports the claim that "the int1 Tensor Core compute primitive can only generate 32 outputs"?
(2) Is there any way to control the output bit-width of the int1 Tensor Core compute primitive?
Looking forward to your reply.
Best wishes to you!
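For readers with the same question: in CUDA's WMMA API, the 1-bit (b1) MMA path hard-codes a 32-bit signed integer accumulator, which is where the fixed output width comes from. Below is a minimal sketch, assuming an sm_75+ GPU and CUDA's experimental sub-byte WMMA API (host launch code omitted); it is an illustration, not code from the APNN-TC repository.

```cuda
#include <mma.h>
using namespace nvcuda;

// One 8x8x128 binary (1-bit) MMA tile. The only accumulator type the
// API accepts for b1 operands is int (32-bit) -- there is no narrower
// option, which matches the paper's statement about int1 outputs.
__global__ void b1_mma(const unsigned *A, const unsigned *B, int *C) {
    wmma::fragment<wmma::matrix_a, 8, 8, 128,
                   wmma::experimental::precision::b1, wmma::row_major> a;
    wmma::fragment<wmma::matrix_b, 8, 8, 128,
                   wmma::experimental::precision::b1, wmma::col_major> b;
    wmma::fragment<wmma::accumulator, 8, 8, 128, int> c;  // int = 32-bit

    wmma::fill_fragment(c, 0);
    wmma::load_matrix_sync(a, A, 128);  // leading dimension in bits
    wmma::load_matrix_sync(b, B, 128);
    // XOR + popcount accumulation, as typically used for binary networks.
    wmma::bmma_sync(c, a, b, c,
                    wmma::experimental::bmmaBitOpXOR,
                    wmma::experimental::bmmaAccumulateOpPOPC);
    wmma::store_matrix_sync(C, c, 8, wmma::mem_row_major);
}
```

So, as far as the public WMMA (and underlying PTX mma) interfaces go, the answer to (2) appears to be no: the b1 accumulator is fixed at 32-bit int, and any narrower output would have to be produced by truncating or re-packing after the MMA.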
Compile out a bin file. How to run it?
Hi! I am learning from your code. First, I downloaded the whole zip file; then I put the cutlass package into the cutlass directory here. Next I cd into cutlass_kernel and run "make all", and it works! It shows something like this:
(base) C:\trash_can\APNN-TC-main\cutlass_kernel>make all
nvcc -I../cutlass/include -I../cutlass/tools/util/include -I../cutlass/examples/common -std=c++11 -O3 -w -arch=sm_86 bench_gemm.cu -o bench_gemm.bin
nvcc warning : The -std=c++11 flag is not supported with the configured host compiler. Flag will be ignored.
bench_gemm.cu
Creating library bench_gemm.lib and object bench_gemm.exp
And I get a bench_gemm.bin. I am not sure how to run this .bin file. Previously I have seen a.exe (Windows) and a.out (Linux) as nvcc's output files, but never .bin, and I can't find anything about it on Google.
Thank you!!!!