
Comments (4)

daltonj commented on September 12, 2024

Start on a colab notebook
https://colab.research.google.com/drive/1G2-EPTbQMQ_QWDh5Ht-p38jFBLc1SUyi

from lasertagger.

varepsilon commented on September 12, 2024

Looks like one needs to store the data in a Google Cloud Storage bucket in order to be able to use Google's TPUs.

E.g., for the BERT data (*):

import os
import tensorflow as tf

# Copy the local BERT checkpoint files to the GCS bucket so the TPU can read them.
INPUT_DIR = 'gs://{}/bert/{}'.format(BUCKET, BERT_MODEL)
tf.gfile.MakeDirs(INPUT_DIR)

for f in tf.gfile.Glob('/content/{}/*'.format(BERT_MODEL)):
  tf.gfile.Copy(f, os.path.join(INPUT_DIR, f.split('/')[-1]))

%env BERT_BASE_DIR=$INPUT_DIR

And for the output data:

# Copy the generated output files to the bucket, skipping subdirectories.
GS_OUTPUT_DIR = 'gs://{}/output'.format(BUCKET)
tf.gfile.MakeDirs(GS_OUTPUT_DIR)

for f in tf.gfile.Glob('/content/output/*'):
  if tf.gfile.Stat(f).is_directory:
    continue
  tf.gfile.Copy(f, os.path.join(GS_OUTPUT_DIR, f.split('/')[-1]))

%env OUTPUT_DIR=$GS_OUTPUT_DIR

(*) Assuming the data was previously stored locally, e.g.,

bert_url = 'https://storage.googleapis.com/bert_models/2018_10_18/' + BERT_MODEL + '.zip'
bert_zip = BERT_MODEL + '.zip'
!wget $bert_url
!unzip $bert_zip


daltonj commented on September 12, 2024

The notebook has been updated to support TPU by writing all data to a cloud bucket.

Note: the prediction step (predict_main.py) doesn't have flags for using a TPU. It's currently running very slowly, roughly one example per second. Is this expected?

I1107 15:31:14.475278 139814862706560 predict_main.py:89] 0 examples processed, 0 converted to tf.Example.
I1107 15:33:06.323075 139814862706560 predict_main.py:89] 100 examples processed, 100 converted to tf.Example.
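For reference, the two log timestamps above work out to roughly the reported rate:

```python
from datetime import datetime

# Timestamps taken from the two log lines above.
t0 = datetime.strptime('15:31:14.475278', '%H:%M:%S.%f')
t1 = datetime.strptime('15:33:06.323075', '%H:%M:%S.%f')

elapsed = (t1 - t0).total_seconds()  # ~111.8 seconds for 100 examples
rate = 100 / elapsed                 # ~0.89 examples per second
```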


ekQ commented on September 12, 2024

Hi and sorry for the slow reply! Having a Colab would indeed be very useful, but at the moment I don't have time to create one. However, if you'd like to create a pull request, I'd be very happy to review it.

Regarding slow inference: This is indeed an issue and the expected behavior when you run the code as is. Internally, we heavily parallelize inference, so it's not an issue in that case. To make it faster, one should ideally increase the batch size (currently it's 1 [*]), which requires small code changes.
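The batch-size change ekQ describes can be sketched generically. This is not the LaserTagger API: batched, predict_all, and predict_batch_fn are illustrative names, with predict_batch_fn standing in for whatever function scores a batch of examples at once.

```python
def batched(items, batch_size):
    """Yield successive slices of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def predict_all(examples, predict_batch_fn, batch_size=32):
    """Run inference over all examples, batch_size examples per model call.

    With batch_size=1 this degenerates to the one-example-at-a-time loop
    that makes prediction slow; larger batches amortize the per-call overhead.
    """
    predictions = []
    for batch in batched(examples, batch_size):
        predictions.extend(predict_batch_fn(batch))
    return predictions
```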

A quicker fix is to use LaserTaggerFF by setting use_t2t_decoder to false in configs/lasertagger_config.json. This should already make prediction about 40 times faster (at least on GPU). This may hurt the accuracy slightly, but not radically, at least in our experiments.
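The quicker fix above is a one-key edit to the JSON config. As a sketch, it could be scripted like this; disable_t2t_decoder is a hypothetical helper, not part of the repo, and the path is simply the one named in the comment:

```python
import json

def disable_t2t_decoder(config_path):
    """Set use_t2t_decoder to false in a LaserTagger JSON config file."""
    with open(config_path) as f:
        config = json.load(f)
    config['use_t2t_decoder'] = False  # switch to the faster LaserTaggerFF decoder
    with open(config_path, 'w') as f:
        json.dump(config, f, indent=2)
```

Usage would be disable_t2t_decoder('configs/lasertagger_config.json') before training or prediction.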

[*] https://github.com/google-research/lasertagger/blob/master/predict_utils.py#L57

