Comments (7)

cdxeve avatar cdxeve commented on May 20, 2024

Hi, thanks for the report. I reran the code but could not reproduce the error. You could diff your "raw2read.py" against the one in our repo to see if there are any changes, or try upgrading your "concurrent" packages.

from lmops.

rajvadiyala avatar rajvadiyala commented on May 20, 2024

I cloned the repo with git and used VS Code. My Python version is 3.11.

Here is the troubleshooting I did:

I am trying to run a python code and I am getting this error:
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\rajas\Desktop\LMOps-main\adaptllm\raw2read.py", line 88, in <module>
    read_compre =list(process_map(search, raw_texts, max_workers=max_workers, chunksize=8192))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\site-packages\tqdm\contrib\concurrent.py", line 105, in process_map
    return _executor_map(ProcessPoolExecutor, fn, *iterables, **tqdm_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\site-packages\tqdm\contrib\concurrent.py", line 51, in _executor_map
    return list(tqdm_class(ex.map(fn, *iterables, chunksize=chunksize), **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\site-packages\tqdm\std.py", line 1182, in __iter__
    for obj in iterable:
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\process.py", line 602, in _chain_from_iterable_of_lists
    for element in iterable:
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
NameError: name 'overall_cls' is not defined
ChatGPT
The error you're encountering is a NameError which indicates that a name (in this case, overall_cls) is not recognized in the current scope of your code.

Here's the relevant part of the traceback:

    NameError: name 'overall_cls' is not defined
This suggests that somewhere in your code (or potentially in a function or method you're calling), you are trying to reference a variable or function named overall_cls, but it hasn't been defined.

To resolve this issue:

Check the spelling: Ensure that you've spelled the variable/function name correctly everywhere it's used.
Check the scope: Ensure that the variable/function overall_cls is defined and is accessible in the scope where you're trying to use it.
Import missing modules: If overall_cls is part of an external module, ensure that you've imported the module correctly.
If you can provide the relevant sections of your code, especially the parts that reference overall_cls, I can provide more specific guidance.

Will you be kind enough to look into the folder that has the above code and tell me what is going wrong and suggest me steps to fix the issue?
ChatGPT
Of course! Please upload the folder or the relevant code files, and I'll take a look to help you identify and address the issue.

Uploaded files: comparison.png (image), raw2read.py (Python), README.md (file), requirements.txt (document)
ChatGPT
Thank you for providing the files. I'll start by examining the raw2read.py file since it's the one mentioned in the traceback. I'll look for any references to overall_cls and try to understand the context in which it's being used to diagnose the issue.

From the initial part of the raw2read.py file, I see that the overall_cls variable is referenced within the search function:

    # truncate the context to meet the max_seq_len
    context_wo_title = overall_cls.truncate_sentence(context_wo_title, max_len=overall_cls.max_seq_len-200)
It seems that overall_cls is expected to be an object that provides methods and properties related to text truncation. However, within the scope of the search function, overall_cls is not defined.

To fully diagnose the issue, I need to check if overall_cls is defined elsewhere in the file or if it's supposed to be passed as an argument to the search function or accessed in some other way.

Let's continue analyzing the file to understand how overall_cls is intended to be used.

The variable overall_cls is indeed defined later in the code:

    overall_cls = type_map.cls_dic['overall']()
    overall_cls.init_spm(args.general_spm_path, args.domain_spm_path)
However, the search function, which tries to access overall_cls, does not have visibility of this variable since it's defined outside of its scope. The function attempts to use it without it being passed as an argument or defined within its local scope, leading to the NameError.

To fix the issue, you have a few options:

Pass overall_cls as an Argument: Modify the search function definition to accept overall_cls as an argument and pass it when calling the function.
Use a Global Variable: If you intend overall_cls to be a global variable that should be accessible throughout the script, you can declare it as global inside the search function. However, this approach is generally discouraged due to potential side effects and reduced code clarity.
Given the structure of your code, I'd recommend the first approach: passing overall_cls as an argument to the search function. Would you like me to demonstrate how to make this change?

Give me modified code to paste or save
ChatGPT
Sure! I'll provide a modified version of the search function to accept overall_cls as an argument, and I'll also modify the call to process_map to pass overall_cls to the search function. Here's the modified code:

Modify the search function definition:
    def search(entry, overall_cls):
        ...

Modify the call to process_map:

    read_compre = list(process_map(lambda entry: search(entry, overall_cls), raw_texts, max_workers=max_workers, chunksize=8192))

Here's the modified version of the relevant parts of your code:

I've made the necessary modifications. You can download the modified raw2read.py file here.

Replace your original file with this modified version, and the NameError should be resolved.

max_workers: 12
loading raw texts in the input folder...
paths: ['./data_samples/input-raw-texts\0.txt', './data_samples/input-raw-texts\1.txt', './data_samples/input-raw-texts\10.txt', './data_samples/input-raw-texts\11.txt', './data_samples/input-raw-texts\2.txt', './data_samples/input-raw-texts\3.txt', './data_samples/input-raw-texts\4.txt', './data_samples/input-raw-texts\5.txt', './data_samples/input-raw-texts\6.txt', './data_samples/input-raw-texts\7.txt', './data_samples/input-raw-texts\8.txt', './data_samples/input-raw-texts\9.txt']
12it [00:00, 11567.83it/s]
transferring raw texts into reading comprehension...
0%| | 0/12 [00:00<?, ?it/s]
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\queues.py", line 244, in _feed
    obj = _ForkingPickler.dumps(obj)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x0000021EA573A200>: attribute lookup <lambda> on __main__ failed
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\rajas\Desktop\LMOps-main\adaptllm\raw2read.py", line 88, in <module>
    read_compre = list(process_map(lambda entry: search(entry, overall_cls), raw_texts, max_workers=max_workers, chunksize=8192))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\site-packages\tqdm\contrib\concurrent.py", line 105, in process_map
    return _executor_map(ProcessPoolExecutor, fn, *iterables, **tqdm_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\site-packages\tqdm\contrib\concurrent.py", line 51, in _executor_map
    return list(tqdm_class(ex.map(fn, *iterables, chunksize=chunksize), **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\site-packages\tqdm\std.py", line 1182, in __iter__
    for obj in iterable:
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\process.py", line 602, in _chain_from_iterable_of_lists
    for element in iterable:
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\queues.py", line 244, in _feed
    obj = _ForkingPickler.dumps(obj)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x0000021EA573A200>: attribute lookup <lambda> on __main__ failed
ChatGPT
The new error is related to the use of the lambda function with the ProcessPoolExecutor. The issue arises because the lambda function cannot be pickled (i.e., serialized) for the inter-process communication that occurs with ProcessPoolExecutor.

To address this, you can define a new function that wraps around the search function with the required arguments, rather than using a lambda function.

Here's what we can do:

Define a new function called search_wrapper that takes just the entry argument and calls search(entry, overall_cls) inside.
Replace the lambda in the process_map call with this search_wrapper function.
Let's make these modifications:

I've made the necessary modifications. You can download the updated raw2read.py file here.

Replace your original file with this updated version, and it should resolve the pickling issue related to the lambda function.
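
The pickling failure in the traceback can be reproduced without any process pool. The sketch below is only an illustration, not code from the repo: `can_pickle` is a hypothetical helper, and it relies on standard `pickle` behaviour, which stores a function by reference to its importable name, so an anonymous lambda has nothing to look up.

```python
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, roughly the step a process pool performs."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda is named "<lambda>", which cannot be looked up in any module,
# so pickle refuses to serialise it.
anonymous = lambda entry: entry.upper()

print("lambda picklable:", can_pickle(anonymous))
# Built-in and importable functions are stored by reference to their name.
print("len picklable:", can_pickle(len))
```

This is why swapping the lambda for a named, module-level function removes the PicklingError, although it does not by itself make `overall_cls` available in the workers.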

max_workers: 12
loading raw texts in the input folder...
paths: ['./data_samples/input-raw-texts\0.txt', './data_samples/input-raw-texts\1.txt', './data_samples/input-raw-texts\10.txt', './data_samples/input-raw-texts\11.txt', './data_samples/input-raw-texts\2.txt', './data_samples/input-raw-texts\3.txt', './data_samples/input-raw-texts\4.txt', './data_samples/input-raw-texts\5.txt', './data_samples/input-raw-texts\6.txt', './data_samples/input-raw-texts\7.txt', './data_samples/input-raw-texts\8.txt', './data_samples/input-raw-texts\9.txt']
12it [00:00, 7837.38it/s]
transferring raw texts into reading comprehension...
0%| | 0/12 [00:00<?, ?it/s]
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\process.py", line 256, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\process.py", line 205, in _process_chunk
    return [fn(*args) for args in chunk]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\process.py", line 205, in <listcomp>
    return [fn(*args) for args in chunk]
            ^^^^^^^^^
  File "C:\Users\rajas\Desktop\LMOps-main\adaptllm\raw2read_modified_v2.py", line 12, in search_wrapper
    return search(entry, overall_cls)
                         ^^^^^^^^^^^
NameError: name 'overall_cls' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\rajas\Desktop\LMOps-main\adaptllm\raw2read_modified_v2.py", line 93, in <module>
    read_compre = list(process_map(search_wrapper, raw_texts, max_workers=max_workers, chunksize=8192))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\site-packages\tqdm\contrib\concurrent.py", line 105, in process_map
    return _executor_map(ProcessPoolExecutor, fn, *iterables, **tqdm_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\site-packages\tqdm\contrib\concurrent.py", line 51, in _executor_map
    return list(tqdm_class(ex.map(fn, *iterables, chunksize=chunksize), **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\site-packages\tqdm\std.py", line 1182, in __iter__
    for obj in iterable:
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\process.py", line 602, in _chain_from_iterable_of_lists
    for element in iterable:
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rajas\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
NameError: name 'overall_cls' is not defined
ChatGPT
I apologize for the oversight. The problem persists because overall_cls is not accessible in the separate processes spawned by ProcessPoolExecutor.
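
One plausible explanation, given the Windows paths in the traceback: on Windows, worker processes are started with the spawn method, which re-imports the main script under the name `__mp_main__`, so anything assigned under an `if __name__ == "__main__":` guard never exists in the workers. The sketch below simulates that re-import with `exec`. The script text and its dictionary helper are hypothetical, and whether raw2read.py assigns `overall_cls` under such a guard is an assumption that this thread does not confirm.

```python
# Hypothetical stand-in for a script that creates its helper under the main guard.
SCRIPT = '''
def search(text):
    # overall_cls is looked up in the module globals when search() is called.
    return text[:overall_cls["max_seq_len"]]

if __name__ == "__main__":
    overall_cls = {"max_seq_len": 5}
'''


def load_as(module_name):
    """Execute SCRIPT as if it were loaded under the given module name."""
    namespace = {"__name__": module_name}
    exec(SCRIPT, namespace)
    return namespace


def try_search(namespace, text):
    """Call search() and return either its result or the NameError message."""
    try:
        return namespace["search"](text)
    except NameError as exc:
        return f"NameError: {exc}"


parent = load_as("__main__")     # how the script runs when launched directly
worker = load_as("__mp_main__")  # how a spawned worker re-imports it

print("parent:", try_search(parent, "reading comprehension"))
print("worker:", try_search(worker, "reading comprehension"))
```

With the fork start method (the default on Linux in Python 3.11) workers inherit the parent's memory instead, which would be consistent with the maintainers not seeing the error on their side.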

To resolve this, we need to pass overall_cls as part of the data being mapped over. One way to do this is to modify the data passed to process_map to include both the entry and overall_cls for each item.

Here's the plan:

Instead of passing raw_texts directly to process_map, we'll pass a list of tuples, where each tuple contains an entry and overall_cls.
Modify the search_wrapper function to unpack the tuple and call search with the appropriate arguments.
Let's make these modifications.

I've made the further necessary modifications. You can download the newest version of raw2read.py here.

Replace your original file with this latest version, and it should address the issue with the overall_cls not being accessible in the worker processes.
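
The tuple-passing plan described above could look roughly like the following sketch. `Truncator` is a hypothetical stand-in for the real helper class, and the built-in `map` is used in place of `process_map` so the example stays serial; the same wrapper and list of pairs would be handed to `process_map`. The trade-off is that the helper travels with the data to the workers, which adds serialisation overhead.

```python
from dataclasses import dataclass


@dataclass
class Truncator:
    """Hypothetical stand-in for the helper object the real script builds."""
    max_seq_len: int

    def truncate_sentence(self, text, max_len):
        # Keep at most max_len whitespace-separated tokens.
        return " ".join(text.split()[:max_len])


def search(entry, overall_cls):
    """Process one raw text using an explicitly supplied helper."""
    return overall_cls.truncate_sentence(entry, max_len=overall_cls.max_seq_len)


def search_wrapper(pair):
    """Unpack an (entry, helper) tuple so a single-argument map can call search."""
    entry, overall_cls = pair
    return search(entry, overall_cls)


raw_texts = ["one two three four", "five six"]
helper = Truncator(max_seq_len=3)

# Every entry carries its own reference to the helper.
raw_texts_with_overall_cls = [(text, helper) for text in raw_texts]

# Serial map for illustration; a process pool would take the same arguments.
read_compre = list(map(search_wrapper, raw_texts_with_overall_cls))
print(read_compre)
```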

  File "C:\Users\rajas\Desktop\LMOps-main\adaptllm\raw2read_modified_v3.py", line 98
    print('saving reading comprehension texts...')
IndentationError: unexpected indent
ChatGPT
I apologize for the oversight. The error message indicates that there's an indentation issue in the code, specifically around the line that says print('saving reading comprehension texts...').

Let's correct this and provide you with the updated file.

I've corrected the indentation issue. You can download the updated raw2read.py file here.

Replace your original file with this latest version, and it should resolve the indentation error.

  File "C:\Users\rajas\Desktop\LMOps-main\adaptllm\raw2read_modified_v4.py", line 100
    for entry in read_compre:
IndentationError: unexpected indent
ChatGPT
I apologize for the inconvenience. Let's address this additional indentation issue.

I will correct the indentation for the line for entry in read_compre: and any subsequent lines that may be affected.

I've corrected the additional indentation issue. You can download the latest version of raw2read.py here.

Replace your original file with this version, and it should address the IndentationError.

You can't do it. Same error.
ChatGPT
I apologize for the inconvenience. Let's fix this issue once and for all.

To ensure I address the problem correctly, I will explicitly look at the section of code around the line causing the error, adjust the indentation, and share that portion with you for confirmation before creating the entire file.

The displayed code section seems to have inconsistent indentation. Specifically, the lines following the print('saving reading comprehension texts...') statement are indented more than they should be.

Here's the corrected section:

    read_compre = list(process_map(search_wrapper, raw_texts_with_overall_cls, max_workers=max_workers, chunksize=8192))

    print('saving reading comprehension texts...')
    # sort by text_id to align with the order of raw texts
    for entry in read_compre:
        path = os.path.join(args.output_dir, entry["file_name"])
        with open(path, 'w') as f:
            f.write(entry['read_compre'])
            f.close()

    print(f'saved to {args.output_dir}')

I'll replace this corrected section in the entire file and provide you with the updated file.

I've made the necessary corrections to the indentation. You can download the updated raw2read.py file here.

Replace your original file with this latest version, and it should resolve the IndentationError.

I am trying to follow the steps shown on this page:
https://github.com/microsoft/LMOps/tree/main/adaptllm
and I am getting the errors above. Please tell me exactly what I should do. I went to the adaptllm folder, installed requirements.txt, and started getting errors when trying to execute the raw2read.py file.
ChatGPT
Thank you for sharing the context. I understand the frustration, and I'll guide you through the process step by step.

The repository you've shared is Microsoft's Language Model Operations (LMOps) framework for adapting and deploying large language models. The errors you're facing arise from the raw2read.py script, which converts raw text to a reading comprehension format.

From the details you've provided and the code fixes we've worked on, here's a recommended approach:

Dependencies: Make sure all the dependencies are correctly installed. Navigate to the adaptllm folder and run:

    pip install -r requirements.txt

Replace raw2read.py: Use the updated raw2read.py file I provided. Download it and replace the original raw2read.py in the adaptllm folder.

Run the Script: Once you've replaced the file, try running the script again:

    python raw2read.py [your arguments here]

Make sure to replace [your arguments here] with the appropriate arguments needed by the script.

Check Other Scripts: If you intend to run other scripts from the LMOps repository, it might be a good idea to check them for potential issues as well. Given that you encountered errors with raw2read.py, there might be other scripts that require updates or fixes.

Python Version: Ensure you're using a compatible Python version. Some code might behave differently across Python versions. The repository doesn't explicitly mention a version, but since your error logs showed Python 3.11, it might be worth trying with a more commonly used version like Python 3.8 or 3.9, in case there are compatibility issues.

GitHub Issues: If you continue to face challenges, consider checking the "Issues" section of the GitHub repository. Others might have encountered similar problems, and solutions or workarounds could be discussed there. If not, you can create a new issue detailing the problems you're facing.

Following these steps should help you progress with the LMOps framework. If you encounter further issues, please let me know, and I'll assist you.

Thanks in advance

cdxeve avatar cdxeve commented on May 20, 2024

Thanks for your reply, but I am having difficulty understanding it (it looks like a response generated by ChatGPT). Could you please describe more specifically what you ran and what went wrong?

shamanez avatar shamanez commented on May 20, 2024

@cdxeve

I also get the same error.

    context_wo_title = overall_cls.truncate_sentence(context_wo_title, max_len=overall_cls.max_seq_len-200)
                       ^^^^^^^^^^^
NameError: name 'overall_cls' is not defined

I am trying to execute the code with the given examples.

cdxeve avatar cdxeve commented on May 20, 2024

@shamanez
Hi, what is your Python version? Have you tried "pip install --upgrade tqdm"?

buaahsh avatar buaahsh commented on May 20, 2024

@rajvadiyala @shamanez We have fixed this problem by using functools in the process_map call.
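
The exact change is not shown in this thread. As a minimal sketch of what binding the helper with `functools.partial` could look like: `Truncator`, `make_worker`, and `convert` are hypothetical names, and `ProcessPoolExecutor.map` stands in for `process_map`, which forwards to it according to the traceback above.

```python
import functools
from concurrent.futures import ProcessPoolExecutor


class Truncator:
    """Hypothetical stand-in for the object stored in overall_cls."""

    def __init__(self, max_seq_len):
        self.max_seq_len = max_seq_len

    def truncate_sentence(self, text, max_len):
        # Keep at most max_len whitespace-separated tokens.
        return " ".join(text.split()[:max_len])


def search(entry, overall_cls):
    """Module-level function that receives the helper explicitly."""
    return overall_cls.truncate_sentence(entry, max_len=overall_cls.max_seq_len)


def make_worker(overall_cls):
    """Bind the helper so the pool only has to supply each entry."""
    # Unlike a lambda, a partial can be pickled as long as the wrapped
    # function and the bound arguments can be.
    return functools.partial(search, overall_cls=overall_cls)


def convert(raw_texts, overall_cls, max_workers=2):
    """Map the bound worker over raw_texts in separate processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(make_worker(overall_cls), raw_texts, chunksize=1))


# Serial check of the bound worker; no processes are started here.
print(make_worker(Truncator(max_seq_len=3))("one two three four"))
```

Calling `convert(raw_texts, Truncator(max_seq_len=3))` from under an `if __name__ == "__main__":` guard would run the same worker in a pool; the guard matters on Windows because spawned workers re-import the script.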

shamanez avatar shamanez commented on May 20, 2024
