Comments (3)

kentonl commented on May 18, 2024

It looks like it couldn't find any input files matching $NQ_DATA_DIR/dev/nq-dev-*.jsonl.gz to process.

Did you download the data using the first few commands?

export DATA_DIR=data
mkdir -p $DATA_DIR
gsutil -m cp -r gs://natural_questions $DATA_DIR
export NQ_DATA_DIR=$DATA_DIR/natural_questions/v1.0
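One way to confirm that the pattern matches anything before launching the preprocessing job is a quick check like the sketch below. It is not part of the repository: `find_nq_inputs` is a hypothetical helper, and it only assumes that `NQ_DATA_DIR` was exported as in the commands above.

```python
import glob
import os


def find_nq_inputs(relative_pattern="dev/nq-dev-*.jsonl.gz"):
    """Return sorted paths under $NQ_DATA_DIR matching the pattern (hypothetical helper)."""
    data_dir = os.environ.get("NQ_DATA_DIR")
    if not data_dir:
        # Exported variables do not carry over to a newly opened terminal.
        raise RuntimeError("NQ_DATA_DIR is not set; re-run the export commands.")
    pattern = os.path.join(data_dir, relative_pattern)
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError("No input files match %s" % pattern)
    return paths
```

Calling `find_nq_inputs()` in the same shell that will run the script either lists the shards or fails with a message that says which of the two problems occurred.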

abaheti95 commented on May 18, 2024

You were right. I had opened a new terminal, so the $NQ_DATA_DIR variable was no longer set. However, I'm now getting a new decoding error:

python -m language.question_answering.preprocessing.create_nq_short_pipeline_examples   --input_pattern=$NQ_DATA_DIR/dev/nq-dev-*.jsonl.gz   --output_dir=$NQ_DATA_DIR/dev
I0124 14:34:15.836303 4424058304 tf_logging.py:115] Converting input 5 files: ['/Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-00.jsonl.gz', '/Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-01.jsonl.gz', '/Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-04.jsonl.gz', '/Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-03.jsonl.gz', '/Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-02.jsonl.gz']
I0124 14:34:15.851707 4424058304 tf_logging.py:115] Converting examples in /Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-01.jsonl.gz to tf.Examples.
I0124 14:34:15.851823 4424058304 tf_logging.py:115] Converting examples in /Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-04.jsonl.gz to tf.Examples.
I0124 14:34:15.851601 4424058304 tf_logging.py:115] Converting examples in /Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-00.jsonl.gz to tf.Examples.
I0124 14:34:15.851979 4424058304 tf_logging.py:115] Converting examples in /Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-03.jsonl.gz to tf.Examples.
I0124 14:34:15.854323 4424058304 tf_logging.py:115] Converting examples in /Users/user/Reseach/QA_and_Dialog/Datasets/natural_questions/v1.0/dev/nq-dev-02.jsonl.gz to tf.Examples.
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/Users/user/Reseach/QA_and_Dialog/language/language/question_answering/preprocessing/create_nq_short_pipeline_examples.py", line 88, in _create_short_answer_examples
    for i, tf_example in enumerate(_generate_tf_examples(input_file)):
  File "/Users/user/Reseach/QA_and_Dialog/language/language/question_answering/preprocessing/create_nq_short_pipeline_examples.py", line 52, in _generate_tf_examples
    for line in input_file:
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/gzip.py", line 374, in readline
    return self._buffer.readline(size)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/gzip.py", line 463, in read
    if not self._read_gzip_header():
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/gzip.py", line 406, in _read_gzip_header
    magic = self._fp.read(2)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/gzip.py", line 91, in read
    self.file.read(size-self._length+read)
  File "/Users/user/Reseach/QA_and_Dialog/language/venv/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 132, in read
    pywrap_tensorflow.ReadFromStream(self._read_buf, length, status))
  File "/Users/user/Reseach/QA_and_Dialog/language/venv/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 100, in _prepare_value
    return compat.as_str_any(val)
  File "/Users/user/Reseach/QA_and_Dialog/language/venv/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 107, in as_str_any
    return as_str(value)
  File "/Users/user/Reseach/QA_and_Dialog/language/venv/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 80, in as_text
    return bytes_or_text.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/user/Reseach/QA_and_Dialog/language/language/question_answering/preprocessing/create_nq_short_pipeline_examples.py", line 109, in <module>
    app.run(main)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/Users/user/Reseach/QA_and_Dialog/language/language/question_answering/preprocessing/create_nq_short_pipeline_examples.py", line 105, in main
    pool.map(_create_short_answer_examples, input_paths)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

kentonl commented on May 18, 2024

This seems to be a Python 3 compatibility issue. This should be fixed in the latest version.
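For context, 0x1f 0x8b is the gzip magic number, so the error at "position 1" means the compressed bytes were being decoded as UTF-8 text before decompression. The following is only an illustrative sketch using the standard library, not the fix that was applied in the repository; `read_jsonl_gz` and `looks_like_gzip` are hypothetical names.

```python
import gzip
import json


def looks_like_gzip(path):
    """Check for the two-byte gzip magic number 0x1f 0x8b."""
    with open(path, "rb") as raw:
        return raw.read(2) == b"\x1f\x8b"


def read_jsonl_gz(path):
    """Yield one parsed JSON object per non-empty line of a gzipped JSONL file."""
    # Mode "rt" decompresses the raw bytes first and only then decodes UTF-8,
    # so the compressed stream itself is never treated as text.
    with gzip.open(path, "rt", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():
                yield json.loads(line)
```

Under Python 3 the key point is that the underlying file is read as bytes; decoding to `str` happens only after decompression.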
