Comments (4)
I tried to fix it. Can you try again? Anyway, this faulthandler support is optional, so you can remove that. But patches to fix it are also welcome.
from returnn.
The TF benchmark also uses our native LSTM op, which it tries to compile on the fly.
It (the native LSTM op, like the other native ops) expects to find the TF header files. They are usually part of the PIP package of TF. It gets the path via tf.sysconfig.get_include(). Maybe you can check whether that function returns a valid path for you and whether the header files are in there. If they are not, it's a bit strange that the Windows PIP package does not contain the headers. Maybe you can report that upstream.
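A minimal sketch of that check, in case it helps. The helper name check_tf_headers and the choice of probe header are assumptions of mine, not part of RETURNN; the probe is the file named in the compile error reported in this thread. Only tf.sysconfig.get_include() is taken from the text above.

```python
import os

# Header used as a probe. This is an assumption: it is the file that the
# compile error reported in this thread complains about.
PROBE_HEADER = "tensorflow/core/framework/op.h"


def check_tf_headers(include_dir, probe=PROBE_HEADER):
    """Return (dir_exists, header_exists) for the given include directory.

    Hypothetical helper for illustration; not part of RETURNN or TF.
    """
    dir_exists = os.path.isdir(include_dir)
    header_path = os.path.join(include_dir, *probe.split("/"))
    return dir_exists, os.path.isfile(header_path)


def main():
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not importable in this environment.")
        return
    include_dir = tf.sysconfig.get_include()
    dir_exists, header_exists = check_tf_headers(include_dir)
    print("include dir:", include_dir)
    print("directory exists:", dir_exists)
    print("found %s: %s" % (PROBE_HEADER, header_exists))


if __name__ == "__main__":
    main()
```

If the directory exists but the probe header is missing, the PIP package was probably shipped without headers, which would be the thing to report upstream.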
The next thing is that it expects a compiler to be available. Currently g++/nvcc is hardcoded, and the flags are probably partly specific to those as well. You will find the related code in TFUtil.OpCodeCompiler. You would need to work out how to call the default Windows compiler and which settings it needs. See the TF documentation on how to compile a custom op on Windows.
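To see quickly which of these compilers are available at all, something like the following sketch could be used. The helper find_compilers is hypothetical and not part of RETURNN. Including cl (the executable name of the MSVC compiler) in the list is my assumption about what the default Windows compiler would be, and this only checks the PATH; it does not verify flags or versions.

```python
import shutil

# g++ and nvcc are the compilers mentioned above as hardcoded.
# "cl" (MSVC) is an assumed candidate for the default Windows compiler.
CANDIDATES = ("g++", "nvcc", "cl")


def find_compilers(names=CANDIDATES):
    """Map each executable name to its full path, or None if not on PATH.

    Hypothetical helper for illustration only.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print("%-5s %s" % (name, path or "not found on PATH"))
```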
You don't need the native ops, though, if you don't use them. They are faster, especially the LSTM implementation, but you can also use LSTMBlockFused, for example. In that case it should work (or maybe you will hit another bug, as Windows is not really tested by us). For example, you can try:
rnn.py demos/demo-tf-contribrnn-lstm.12ax.config
Or:
rnn.py demos/demo-tf-vanilla-lstm.12ax.config
You can also run the TF benchmark without the native op, e.g.:
demos/demo-tf-benchmark.py --selected "BasicLSTM,StandardLSTM,LSTMBlock,LSTMBlockFused,CudnnLSTM"
@albertz Thank you very much for the fast reply. It solves the issue!
However, when running demos/demo-tf-benchmark.py, another issue is raised:
fatal error: tensorflow/core/framework/op.h: No such file or directory
The Windows environment does not seem well suited to open source projects.
Thank you again for providing such awesome code.
@albertz Great, everything works smoothly now. I also created a PR to fix the print in the config file to support Python 3.5. 😄 #12
Thanks for your patient help.