Comments (2)
- Eager will compare with the BF16 data type. For 0.2 and 0.2002, when rounded to nearest even, both values get the same bit pattern 0x3E4D in storage.
- Inductor will convert to FP32 and compare in FP32, as shown in the attached Inductor-generated C++ code:
cpp_fused_gt_0 = async_compile.cpp_pybinding(['const bfloat16*', 'bool*'], '''
#include "/tmp/torchinductor_leslie/sk/cskh5dx62fglpphcrl6723dnmowdabouerrzy3dmqcngbxwfa7bv.h"
extern "C" void kernel(const bfloat16* in_ptr0,
                       bool* out_ptr0)
{
    {
        #pragma omp simd simdlen(8)
        for(long x0=static_cast<long>(0L); x0<static_cast<long>(4L); x0+=static_cast<long>(1L))
        {
            auto tmp0 = in_ptr0[static_cast<long>(x0)];
            auto tmp1 = c10::convert<float>(tmp0);
            auto tmp2 = static_cast<float>(0.2);
            auto tmp3 = tmp1 > tmp2;
            out_ptr0[static_cast<long>(x0)] = tmp3;
        }
    }
}
''')
from pytorch.
We therefore take this numerical difference as expected behavior.
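The divergence above can be reproduced without torch. Below is a minimal pure-Python sketch of the two paths, assuming the standard BF16 conversion (truncate FP32 to its top 16 bits with round-to-nearest, ties-to-even); the helper names are illustrative, not PyTorch APIs.

```python
import struct

def fp32_bits(x: float) -> int:
    # Bit pattern of x as an IEEE-754 single-precision float.
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bf16_bits(x: float) -> int:
    # Round an FP32 value to BF16 (round to nearest, ties to even)
    # and return the resulting 16-bit pattern.
    bits = fp32_bits(x)
    upper, lower = bits >> 16, bits & 0xFFFF
    if lower > 0x8000 or (lower == 0x8000 and (upper & 1)):
        upper += 1
    return upper

def bf16_to_fp32(bits16: int) -> float:
    # Reinterpret a BF16 bit pattern as FP32 (the upconversion the
    # Inductor kernel performs via c10::convert<float>).
    return struct.unpack("<f", struct.pack("<I", bits16 << 16))[0]

# Both literals collapse to the same BF16 storage, 0x3E4D.
assert bf16_bits(0.2) == bf16_bits(0.2002) == 0x3E4D

# Eager-style path: both operands are BF16, identical bits, so "greater" is False.
print(bf16_to_fp32(bf16_bits(0.2002)) > bf16_to_fp32(bf16_bits(0.2)))  # False

# Inductor-style path: the BF16 input is upconverted to 0.2001953125
# and compared against the FP32 scalar 0.2, so "greater" is True.
print(bf16_to_fp32(bf16_bits(0.2002)) > 0.2)  # True
```

The key point is that 0x3E4D decodes to 0.2001953125, which sits strictly above the FP32 value 0.2, so the FP32 comparison flips the result relative to the all-BF16 comparison.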