
chatdocs's People

Contributors

ianmeinert, marella, pyrater


chatdocs's Issues

Problem installing GPTQ

Hello,

I have a problem with the command "pip install git+https://github.com/PanQiWei/[email protected]":

ERROR: Failed building wheel for auto-gptq
Running setup.py clean for auto-gptq
Failed to build auto-gptq
ERROR: Could not build wheels for auto-gptq, which is required to install pyproject.toml-based projects

I have installed:
conda install cuda --channel nvidia/label/cuda-12.1.0
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121

chunksize and max_seq_length of embedding not matching

AFAIK the default length measure of RecursiveCharacterTextSplitter is len, while the instructor embeddings use a token-based measure.

The program still works; however, the chunks inserted into the database are smaller than one would expect.
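A minimal sketch of one way to make the two measures line up, assuming the langchain RecursiveCharacterTextSplitter and a Hugging Face tokenizer for the instructor model (the tokenizer name and chunk sizes here are assumptions, not chatdocs defaults):

# Sketch: measure chunk length in tokens so chunk_size lines up with the
# embedding model's max_seq_length (512) instead of the character count (len).
from langchain.text_splitter import RecursiveCharacterTextSplitter
from transformers import AutoTokenizer

# Assumption: the instructor model's tokenizer can be loaded this way.
tokenizer = AutoTokenizer.from_pretrained("hkunlp/instructor-large")

def token_len(text: str) -> int:
    return len(tokenizer.encode(text, add_special_tokens=False))

splitter = RecursiveCharacterTextSplitter(
    chunk_size=512,        # now counted in tokens, matching max_seq_length
    chunk_overlap=50,
    length_function=token_len,
)
chunks = splitter.split_text("some long document text ...")

With a character-based len, a chunk of chunk_size characters is typically only a fraction of that many tokens, so the embeddings see chunks well below max_seq_length, which matches the observation above.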

Error on chatdocs download on macOS (M1 Pro)

Hi guys. As the title says, I tried to run the chatdocs download command on my MacBook and got this error.
Can somebody tell me how to fix it? I would like to try this cool tool on my local machine.

Thank you in advance!

load INSTRUCTOR_Transformer
max_seq_length 512
Fetching 0 files: 0it [00:00, ?it/s]
Fetching 1 files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 19239.93it/s]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/-/Library/Python/3.9/lib/python/site-packages/chatdocs/main.py:26 in download │
│ │
│ 23 │ from .download import download │
│ 24 │ │
│ 25 │ config = get_config(config) │
│ ❱ 26 │ download(config=config) │
│ 27 │
│ 28 │
│ 29 @app.command() │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = { │ │
│ │ │ 'embeddings': {'model': 'hkunlp/instructor-large'}, │ │
│ │ │ 'llm': 'ctransformers', │ │
│ │ │ 'ctransformers': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ │ 'model_type': 'llama', │ │
│ │ │ │ 'config': {'context_length': 1024, 'local_files_only': False} │ │
│ │ │ }, │ │
│ │ │ 'huggingface': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-HF', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'gptq': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ', │ │
│ │ │ │ 'model_file': │ │
│ │ 'Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'download': False, │ │
│ │ │ 'host': 'localhost', │ │
│ │ │ 'port': 5000, │ │
│ │ │ 'auth': False, │ │
│ │ │ 'chroma': { │ │
│ │ │ │ 'persist_directory': 'db', │ │
│ │ │ │ 'chroma_db_impl': 'duckdb+parquet', │ │
│ │ │ │ 'anonymized_telemetry': False │ │
│ │ │ }, │ │
│ │ │ ... +1 │ │
│ │ } │ │
│ │ download = <function download at 0x105ea9d30> │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/chatdocs/download.py:10 in download │
│ │
│ 7 def download(config: Dict[str, Any]) -> None: │
│ 8 │ config = {**config, "download": True} │
│ 9 │ get_embeddings(config) │
│ ❱ 10 │ get_llm(config) │
│ 11 │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = { │ │
│ │ │ 'embeddings': {'model': 'hkunlp/instructor-large'}, │ │
│ │ │ 'llm': 'ctransformers', │ │
│ │ │ 'ctransformers': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ │ 'model_type': 'llama', │ │
│ │ │ │ 'config': {'context_length': 1024, 'local_files_only': False} │ │
│ │ │ }, │ │
│ │ │ 'huggingface': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-HF', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'gptq': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ', │ │
│ │ │ │ 'model_file': │ │
│ │ 'Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'download': True, │ │
│ │ │ 'host': 'localhost', │ │
│ │ │ 'port': 5000, │ │
│ │ │ 'auth': False, │ │
│ │ │ 'chroma': { │ │
│ │ │ │ 'persist_directory': 'db', │ │
│ │ │ │ 'chroma_db_impl': 'duckdb+parquet', │ │
│ │ │ │ 'anonymized_telemetry': False │ │
│ │ │ }, │ │
│ │ │ ... +1 │ │
│ │ } │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/chatdocs/llms.py:73 in get_llm │
│ │
│ 70 │ if config["llm"] == "ctransformers": │
│ 71 │ │ config = {**config["ctransformers"]} │
│ 72 │ │ config = merge(config, {"config": {"local_files_only": local_files_only}}) │
│ ❱ 73 │ │ llm = CTransformers(callbacks=callbacks, **config) │
│ 74 │ elif config["llm"] == "gptq": │
│ 75 │ │ llm = get_gptq_llm(config) │
│ 76 │ else: │
│ │
│ ╭─────────────────────────────────────── locals ───────────────────────────────────────╮ │
│ │ callback = None │ │
│ │ CallbackHandler = <class 'chatdocs.llms.get_llm.<locals>.CallbackHandler'> │ │
│ │ callbacks = None │ │
│ │ config = { │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'config': { │ │
│ │ │ │ 'context_length': 1024, │ │
│ │ │ │ 'local_files_only': False │ │
│ │ │ } │ │
│ │ } │ │
│ │ local_files_only = False │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/langchain/load/serializable.py:74 │
│ in __init__
│ │
│ 71 │ _lc_kwargs = PrivateAttr(default_factory=dict) │
│ 72 │ │
│ 73 │ def __init__(self, **kwargs: Any) -> None: │
│ ❱ 74 │ │ super().__init__(**kwargs) │
│ 75 │ │ self._lc_kwargs = kwargs │
│ 76 │ │
│ 77 │ def to_json(self) -> Union[SerializedConstructor, SerializedNotImplemented]: │
│ │
│ ╭─────────────────────────────────── locals ────────────────────────────────────╮ │
│ │ class = <class 'langchain.load.serializable.Serializable'> │ │
│ │ kwargs = { │ │
│ │ │ 'callbacks': None, │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'config': { │ │
│ │ │ │ 'context_length': 1024, │ │
│ │ │ │ 'local_files_only': False │ │
│ │ │ } │ │
│ │ } │ │
│ │ self = CTransformers() │ │
│ ╰───────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Desktop/chatdocs/pydantic/main.py:339 in pydantic.main.BaseModel.__init__
│ │
│ [Errno 2] No such file or directory: '/Users/-/Desktop/chatdocs/pydantic/main.py' │
│ │
│ /Users/-/Desktop/chatdocs/pydantic/main.py:1102 in pydantic.main.validate_model │
│ │
│ [Errno 2] No such file or directory: '/Users/-/Desktop/chatdocs/pydantic/main.py' │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/langchain/llms/ctransformers.py:73 │
│ in validate_environment │
│ │
│ 70 │ │ │ ) │
│ 71 │ │ │
│ 72 │ │ config = values["config"] or {} │
│ ❱ 73 │ │ values["client"] = AutoModelForCausalLM.from_pretrained( │
│ 74 │ │ │ values["model"], │
│ 75 │ │ │ model_type=values["model_type"], │
│ 76 │ │ │ model_file=values["model_file"], │
│ │
│ ╭──────────────────────────────────────── locals ─────────────────────────────────────────╮ │
│ │ AutoModelForCausalLM = <class 'ctransformers.hub.AutoModelForCausalLM'> │ │
│ │ cls = <class 'langchain.llms.ctransformers.CTransformers'> │ │
│ │ config = {'context_length': 1024, 'local_files_only': False} │ │
│ │ values = { │ │
│ │ │ 'cache': None, │ │
│ │ │ 'verbose': False, │ │
│ │ │ 'callbacks': None, │ │
│ │ │ 'callback_manager': None, │ │
│ │ │ 'tags': None, │ │
│ │ │ 'metadata': None, │ │
│ │ │ 'client': None, │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ ... +2 │ │
│ │ } │ │
│ ╰─────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/ctransformers/hub.py:157 in │
│ from_pretrained │
│ │
│ 154 │ │ │ │ local_files_only=local_files_only, │
│ 155 │ │ │ ) │
│ 156 │ │ │
│ ❱ 157 │ │ return LLM( │
│ 158 │ │ │ model_path=model_path, │
│ 159 │ │ │ model_type=model_type, │
│ 160 │ │ │ config=config.config, │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ cls = <class 'ctransformers.hub.AutoModelForCausalLM'> │ │
│ │ config = AutoConfig( │ │
│ │ │ config=Config( │ │
│ │ │ │ top_k=40, │ │
│ │ │ │ top_p=0.95, │ │
│ │ │ │ temperature=0.8, │ │
│ │ │ │ repetition_penalty=1.1, │ │
│ │ │ │ last_n_tokens=64, │ │
│ │ │ │ seed=-1, │ │
│ │ │ │ batch_size=8, │ │
│ │ │ │ threads=-1, │ │
│ │ │ │ max_new_tokens=256, │ │
│ │ │ │ stop=None, │ │
│ │ │ │ stream=False, │ │
│ │ │ │ reset=True, │ │
│ │ │ │ context_length=1024, │ │
│ │ │ │ gpu_layers=0 │ │
│ │ │ ), │ │
│ │ │ model_type=None │ │
│ │ ) │ │
│ │ kwargs = {'context_length': 1024} │ │
│ │ lib = None │ │
│ │ local_files_only = False │ │
│ │ model_file = 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin' │ │
│ │ model_path = '/Users/-/.cache/huggingface/hub/models--TheBloke--Wizard-V… │ │
│ │ model_path_or_repo_id = 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML' │ │
│ │ model_type = 'llama' │ │
│ │ path_type = 'repo' │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/ctransformers/llm.py:206 in │
__init__
│ │
│ 203 │ │ if not Path(model_path).is_file(): │
│ 204 │ │ │ raise ValueError(f"Model path '{model_path}' doesn't exist.") │
│ 205 │ │ │
│ ❱ 206 │ │ self._lib = load_library(lib, cuda=config.gpu_layers > 0) │
│ 207 │ │ self._llm = self._lib.ctransformers_llm_create( │
│ 208 │ │ │ model_path.encode(), │
│ 209 │ │ │ model_type.encode(), │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = Config( │ │
│ │ │ top_k=40, │ │
│ │ │ top_p=0.95, │ │
│ │ │ temperature=0.8, │ │
│ │ │ repetition_penalty=1.1, │ │
│ │ │ last_n_tokens=64, │ │
│ │ │ seed=-1, │ │
│ │ │ batch_size=8, │ │
│ │ │ threads=-1, │ │
│ │ │ max_new_tokens=256, │ │
│ │ │ stop=None, │ │
│ │ │ stream=False, │ │
│ │ │ reset=True, │ │
│ │ │ context_length=1024, │ │
│ │ │ gpu_layers=0 │ │
│ │ ) │ │
│ │ lib = None │ │
│ │ model_path = '/Users/-/.cache/huggingface/hub/models--TheBloke--Wizard-Vicuna-7B-Un… │ │
│ │ model_type = 'llama' │ │
│ │ self = <ctransformers.llm.LLM object at 0x287252d00> │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/ctransformers/llm.py:101 in │
│ load_library │
│ │
│ 98 │ if hasattr(os, "add_dll_directory") and "CUDA_PATH" in os.environ: │
│ 99 │ │ os.add_dll_directory(os.path.join(os.environ["CUDA_PATH"], "bin")) │
│ 100 │ │
│ ❱ 101 │ path = find_library(path, cuda=cuda) │
│ 102 │ lib = CDLL(path) │
│ 103 │ │
│ 104 │ lib.ctransformers_llm_create.argtypes = [ │
│ │
│ ╭─── locals ───╮ │
│ │ cuda = False │ │
│ │ path = None │ │
│ ╰──────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/ctransformers/lib.py:23 in │
│ find_library │
│ │
│ 20 │ │ else: │
│ 21 │ │ │ from cpuinfo import get_cpu_info │
│ 22 │ │ │ │
│ ❱ 23 │ │ │ flags = get_cpu_info()["flags"] │
│ 24 │ │ │ │
│ 25 │ │ │ if "avx2" in flags: │
│ 26 │ │ │ │ path = "avx2" │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ cuda = False │ │
│ │ get_cpu_info = <function get_cpu_info at 0x287262670> │ │
│ │ lib_directory = PosixPath('/Users/-/Library/Python/3.9/lib/python/site-packages/ctr… │ │
│ │ path = None │ │
│ │ system = 'Darwin' │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'flags'

ValueError: Could not parse output: RetrievalQA.from_chain_type(chain_type="map_rerank")

I switched the chain_type to "map_rerank" and got the first of my three answers, but afterwards I got this error. It seems an update of apply_and_parse is necessary?

C:\Users\USERID\Downloads\word01.00.docx:
score:   0.2196645587682724
C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\llm.py:303: UserWarning: The apply_and_parse method is deprecated, instead pass an output parser directly to LLMChain.
 warnings.warn(
Exception in thread Thread-3 (worker):
Traceback (most recent call last):
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\threading.py", line 1038, in _bootstrap_inner
   self.run()
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\threading.py", line 975, in run
   self._target(*self._args, **self._kwargs)
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\chatdocs\ui.py", line 38, in worker
   res = qa(query)
         ^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 243, in __call__
   raise e
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 237, in __call__
   self._call(inputs, run_manager=run_manager)
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\retrieval_qa\base.py", line 131, in _call
   answer = self.combine_documents_chain.run(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 445, in run
   return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 243, in __call__
   raise e
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 237, in __call__
   self._call(inputs, run_manager=run_manager)
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\combine_documents\base.py", line 106, in _call
   output, extra_return_dict = self.combine_docs(
                               ^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\combine_documents\map_rerank.py", line 154, in combine_docs
   results = self.llm_chain.apply_and_parse(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\llm.py", line 308, in apply_and_parse
   return self._parse_generation(result)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\llm.py", line 314, in _parse_generation
   return [
          ^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\llm.py", line 315, in <listcomp>
   self.prompt.output_parser.parse(res[self.output_key])
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\output_parsers\regex.py", line 32, in parse
   raise ValueError(f"Could not parse output: {text}")
ValueError: Could not parse output:
Please find it all

RuntimeError: Failed to create LLM 'llama' from

I'm getting this error when I type "chatdocs download" in the command prompt. What am I doing wrong?

C:\AI\chatdocs>chatdocs download
load INSTRUCTOR_Transformer
max_seq_length 512
Fetching 0 files: 0it [00:00, ?it/s]
Fetching 1 files: 100%|█████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 999.36it/s]
error loading model: unrecognized tensor type 10

llama_init_from_file: failed to load model
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\dkv9\miniconda3\lib\site-packages\chatdocs\main.py:26 in download │
│ │
│ 23 │ from .download import download │
│ 24 │ │
│ 25 │ config = get_config(config) │
│ ❱ 26 │ download(config=config) │
│ 27 │
│ 28 │
│ 29 @app.command() │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\chatdocs\download.py:10 in download │
│ │
│ 7 def download(config: Dict[str, Any]) -> None: │
│ 8 │ config = {**config, "download": True} │
│ 9 │ get_embeddings(config) │
│ ❱ 10 │ get_llm(config) │
│ 11 │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\chatdocs\llms.py:73 in get_llm │
│ │
│ 70 │ if config["llm"] == "ctransformers": │
│ 71 │ │ config = {**config["ctransformers"]} │
│ 72 │ │ config = merge(config, {"config": {"local_files_only": local_files_only}}) │
│ ❱ 73 │ │ llm = CTransformers(callbacks=callbacks, **config) │
│ 74 │ elif config["llm"] == "gptq": │
│ 75 │ │ llm = get_gptq_llm(config) │
│ 76 │ else: │
│ │
│ C:\AI\chatdocs\pydantic\main.py:339 in pydantic.main.BaseModel.__init__
│ │
│ [Errno 2] No such file or directory: 'C:\AI\chatdocs\pydantic\main.py' │
│ │
│ C:\AI\chatdocs\pydantic\main.py:1102 in pydantic.main.validate_model │
│ │
│ [Errno 2] No such file or directory: 'C:\AI\chatdocs\pydantic\main.py' │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\langchain\llms\ctransformers.py:70 in │
│ validate_environment │
│ │
│ 67 │ │ │ ) │
│ 68 │ │ │
│ 69 │ │ config = values["config"] or {} │
│ ❱ 70 │ │ values["client"] = AutoModelForCausalLM.from_pretrained( │
│ 71 │ │ │ values["model"], │
│ 72 │ │ │ model_type=values["model_type"], │
│ 73 │ │ │ model_file=values["model_file"], │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\ctransformers\hub.py:157 in from_pretrained │
│ │
│ 154 │ │ │ │ local_files_only=local_files_only, │
│ 155 │ │ │ ) │
│ 156 │ │ │
│ ❱ 157 │ │ return LLM( │
│ 158 │ │ │ model_path=model_path, │
│ 159 │ │ │ model_type=model_type, │
│ 160 │ │ │ config=config.config, │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\ctransformers\llm.py:205 in __init__
│ │
│ 202 │ │ │ config.gpu_layers, │
│ 203 │ │ ) │
│ 204 │ │ if self._llm is None: │
│ ❱ 205 │ │ │ raise RuntimeError( │
│ 206 │ │ │ │ f"Failed to create LLM '{model_type}' from '{model_path}'." │
│ 207 │ │ │ ) │
│ 208 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Failed to create LLM 'llama' from
'C:\Users\dkv9.cache\huggingface\hub\models--TheBloke--Wizard-Vicuna-7B-Uncensored-GGML\snapshots\531879da598ebc577cd4
a03bdde9fbe3a641fc63\Wizard-Vicuna-7B-Uncensored.ggmlv3.q2_K.bin'.

Memory optimization

Out of curiosity, I tried containerizing the package with the offline models, the required pip packages installed, and a sample document (size: 62 kB) loaded through chatdocs add.

The build could not complete locally due to memory issues. I tried it in a hosted environment and I got the following warnings during the build process:

docker_build_error
server_health_error

Even though the package does not officially support containers yet, this indicates that we need to optimize the code performing various operations (such as adding files or loading models) so that it works well on low-end PCs.

Incomplete response

The response to a query is incomplete. ChatGPT 3.5 allows users to prompt it to continue the response in a new prompt, and the team at Oobabooga added a "Continue" button to their WebUI to do the same.

There is currently no way to continue a cut-off response through a new prompt, and splitting the prompts into smaller chunks resulted in a loss of context when responding.

I'm unsure whether this is a bug or a missing feature, and am wondering whether there is a fix to allow a full response.

I'm new to this.

Screenshot 2023-07-09 at 08-47-17 ChatDocs
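For what it's worth, the config dumps in the tracebacks elsewhere on this page show generation capped at max_new_tokens=256. A hedged sketch, assuming chatdocs.yml passes these keys through to ctransformers unchanged, would be to raise that limit:

ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
  config:
    context_length: 1024
    max_new_tokens: 512   # default in the dumps above is 256; assumption: a higher cap allows longer answers

The context_length shown in the same dumps still bounds prompt plus answer, so treat these numbers as a starting point rather than known-good values.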

Is it possible to reset the DB?

Hello,
I uploaded documents and later deleted them, but the LLM still has those documents in memory.
How can I reset the memory or refresh the document list?

thanks
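There doesn't seem to be a dedicated reset command mentioned here, but the documentation quoted in another issue below says the processed documents live in the db directory and that it can be deleted. A rough sketch, assuming the default relative db directory and a Linux/macOS shell:

# run from the directory where you originally ran chatdocs add
rm -rf db                        # drop the whole vector store
chatdocs add /path/to/documents  # re-index only the documents you still want

What the LLM "remembers" comes from the vector store rather than the model itself, so rebuilding the db directory should refresh the document list. Note that another issue below ("Document does not import after db folder has been deleted.") reports problems with exactly this flow, so it may not work on all versions.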

Conda Installation? Anyone got it running?

Hi!

I have been having some issues with dependencies lately and wanted to isolate an install with Conda. I am running into some issues. Has anyone gotten this to run under Conda?

can't use ggml-gpt4all-j-v1.3-groovy.bin

I wanted to use another LLM, but I got some errors such as:

Screenshot 2023-06-14 173048

and this is my chatdocs.yml:

Screenshot 2023-06-14 173357

I already did pip install ctransformers and ran CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers.

How can I use ggml-gpt4all-j-v1.3-groovy.bin?
I'm sorry if these questions/problems are easy. I'm still a beginner on this subject, but I really love the work you're putting in.

How does one uninstall everything?

Hi,

I have found most of the folders that this package installs to; it's quite spread out on Windows. Is there a good way to uninstall everything if needed?
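A rough sketch of what manual cleanup could look like, assuming a plain pip install; the paths are taken from the tracebacks and documentation quoted elsewhere on this page and may differ on your machine:

pip uninstall chatdocs
# downloaded models are cached by Hugging Face Hub, e.g.:
#   ~/.cache/huggingface/hub/models--TheBloke--Wizard-Vicuna-7B-Uncensored-GGML   (Linux/macOS)
#   C:\Users\<you>\.cache\huggingface\hub\...                                     (Windows)
# processed documents live in the db directory of each folder where chatdocs add was run

Dependencies pulled in by chatdocs (langchain, ctransformers, chromadb, ...) are separate packages and would need to be uninstalled individually if you want them gone too.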

Trouble with chatdocs download

I have tried this with 0.2.4 and 0.2.5, getting the same results.


pete@chatdocs:~ $ pip show ctransformers
Name: ctransformers
Version: 0.2.15
Summary: Python bindings for the Transformer models implemented in C/C++ using GGML library.
Home-page: https://github.com/marella/ctransformers
Author: Ravindra Marella
Author-email: [email protected]
License: MIT
Location: /home/pete/.local/lib/python3.9/site-packages
Requires: huggingface-hub
Required-by: chatdocs

Initially I had other errors that were resolved by installing:

  • typing_extensions==4.5.0
  • typing-inspect==0.8.0

Now I get the following:

OSError: /home/pete/.local/lib/python3.9/site-packages/ctransformers/lib/avx2/libctransformers.so: cannot open shared object file: No such file or
directory
pete@chatdocs:~ $ ls .local/lib/python3.9/site-packages/ctransformers/lib/avx2
ctransformers.dll libctransformers.dylib libctransformers.so

Permissions on the file are -rwxr-xr-x

So I'm stuck. Any help appreciated. Thanks.

Where do I store pre-downloaded models?

Sorry for the daft question, but I've already downloaded a host of models and would rather not copy them into yet another area of my PC. Is there a location I should store them in for this to access them? Or is there any way I can change the file path to the model?
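One option, sketched from the config another user posted further down this page, is to point model in chatdocs.yml directly at an existing local file instead of a Hugging Face repo id (the path below is a placeholder):

ctransformers:
  model: /path/to/your/existing/models/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin   # local file path instead of a repo id
  model_type: llama

llm: ctransformers

This avoids copying the file; whether the other backends (huggingface, gptq) accept local paths the same way is not confirmed here.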

Won't use GPU

I'm trying to have ctransformers use the GPU, but it won't work.

my chatdocs.yml:

ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
  config:
    gpu_layers: 50


llm: ctransformers
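One thing worth checking, going by the command quoted in another issue on this page, is whether ctransformers was built with CUDA support at all; the assumption here is that the plain wheel is CPU-only, so a rebuild may be needed before gpu_layers has any effect:

CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers

This is a hedged suggestion rather than a confirmed fix for this particular setup.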

Error when parquet files get too big / function for splitting?

Hi!

I have been uploading a lot of data and ran into a Snappy compression error after reaching around 3.6 GB of data in the parquet file.

Error: Invalid Error: Snappy decompression failure

I saw that there is a limit for parquet files and that the limit is 4 GB. Could we add functionality to split the parquet files when they reach 1 GB of data to get rid of this issue? Does anyone know how to do it?

Upload document function and embeddings processing in web ui

Dear Dev,

It would be great to have a sidebar in the web UI with the ability to upload directly (drag-and-drop) and start processing embeddings. After that, a prompt with a "reload web UI" button would be the perfect workflow for adding new documents on the fly.

Highly appreciated :)

Failed building wheel for auto-gptq

` "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin\nvcc" -c autogptq_cuda/autogptq_cuda_kernel.cu -o build\temp.win-amd64-cpython-310\Release\autogptq_cuda/autogptq_cuda_kernel.obj -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\TH -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" -Iautogptq_cuda -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\include -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.36.32532\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.36.32532\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=autogptq_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --use-local-env
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include\crt/host_config.h(160): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2019 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
autogptq_cuda_kernel.cu
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin\nvcc.exe' failed with exit code 2
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for auto-gptq
Running setup.py clean for auto-gptq
Failed to build auto-gptq
ERROR: Could not build wheels for auto-gptq, which is required to install pyproject.toml-based projects

I have an NVIDIA card, the NVIDIA Toolkit mapped to an environment variable, and Visual Studio 2019, and it still won't work.

Document does not import after db folder has been deleted.

I am attempting to use chatdocs in different folders, so that each folder references a different data set. I wanted to clear data set A and start over, so I removed the db folder. When I added a document to data set A using "chatdocs add /path/to/file", I get the following error message:

Loading new documents: 0it [00:00, ?it/s]
No new documents to load

I attempted to uninstall chatdocs via pip, and reinstall, but I'm still getting the same error message.

When attempting to launch chatdocs from the folder (which no longer has a db) it crashes after being asked a question.

Index not found, please create an instance before querying

Config to Limit Chatdoc responses only to documents added

First of all, this is an amazing tool for offline Q&A modules.

I wanted to check whether there is a config available in the chatdocs.yml file to limit responses only to the documents added, and to reply to open questions with a standard "I don't know" response.

Or at least to set the temperature of the models used, to ensure the answers are not made up from the LLMs' pre-trained text.
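On the temperature point, the config dumps in the tracebacks on this page show temperature (default 0.8) as one of the ctransformers config fields, so a hedged sketch of lowering it in chatdocs.yml would be:

ctransformers:
  config:
    temperature: 0.1   # lower sampling temperature for more deterministic answers

Note this only makes generation less random; restricting answers strictly to the added documents would be a prompt/chain-level change, which is not covered by any config key shown on this page.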

Great Project - Question About Docs

Is there an overview or guide as to which kinds of documents this can handle, or what interim steps are needed to process, for example, doc, docx, ppt, or pdf files?

RuntimeError: Failed to create LLM 'llama'

I have the same issue:

(chatdocs-main) PS G:\Chat\chatdocs-main> chatdocs download
load INSTRUCTOR_Transformer
max_seq_length 512
Fetching 0 files: 0it [00:00, ?it/s]
Fetching 1 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<?, ?it/s]
error loading model: failed to open C:\Users\御丶奕.cache\huggingface\hub\models--TheBloke--Wizard-Vicuna-7B-Uncensored-GGML\blobs\c31a4edd96527dcd808bcf9b99e3894065ac950747dac84ecd
415a2387454e7c: No such file or directory
llama_init_from_file: failed to load model
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ E:\chatdocs-main\lib\site-packages\chatdocs\main.py:26 in download │
│ │
│ 23 │ from .download import download │
│ 24 │ │
│ 25 │ config = get_config(config) │
│ ❱ 26 │ download(config=config) │
│ 27 │
│ 28 │
│ 29 @app.command() │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = { │ │
│ │ │ 'embeddings': {'model': 'hkunlp/instructor-large'}, │ │
│ │ │ 'llm': 'ctransformers', │ │
│ │ │ 'ctransformers': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ │ 'model_type': 'llama', │ │
│ │ │ │ 'config': {'context_length': 1024, 'local_files_only': False} │ │
│ │ │ }, │ │
│ │ │ 'huggingface': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-HF', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'gptq': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ', │ │
│ │ │ │ 'model_file': │ │
│ │ 'Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'download': False, │ │
│ │ │ 'host': 'localhost', │ │
│ │ │ 'port': 5000, │ │
│ │ │ 'auth': False, │ │
│ │ │ 'chroma': { │ │
│ │ │ │ 'persist_directory': 'db', │ │
│ │ │ │ 'chroma_db_impl': 'duckdb+parquet', │ │
│ │ │ │ 'anonymized_telemetry': False │ │
│ │ │ }, │ │
│ │ │ ... +1 │ │
│ │ } │ │
│ │ download = <function download at 0x000001F46E199790> │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\chatdocs\download.py:10 in download │
│ │
│ 7 def download(config: Dict[str, Any]) -> None: │
│ 8 │ config = {**config, "download": True} │
│ 9 │ get_embeddings(config) │
│ ❱ 10 │ get_llm(config) │
│ 11 │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = { │ │
│ │ │ 'embeddings': {'model': 'hkunlp/instructor-large'}, │ │
│ │ │ 'llm': 'ctransformers', │ │
│ │ │ 'ctransformers': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ │ 'model_type': 'llama', │ │
│ │ │ │ 'config': {'context_length': 1024, 'local_files_only': False} │ │
│ │ │ }, │ │
│ │ │ 'huggingface': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-HF', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'gptq': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ', │ │
│ │ │ │ 'model_file': │ │
│ │ 'Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'download': True, │ │
│ │ │ 'host': 'localhost', │ │
│ │ │ 'port': 5000, │ │
│ │ │ 'auth': False, │ │
│ │ │ 'chroma': { │ │
│ │ │ │ 'persist_directory': 'db', │ │
│ │ │ │ 'chroma_db_impl': 'duckdb+parquet', │ │
│ │ │ │ 'anonymized_telemetry': False │ │
│ │ │ }, │ │
│ │ │ ... +1 │ │
│ │ } │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\chatdocs\llms.py:73 in get_llm │
│ │
│ 70 │ if config["llm"] == "ctransformers": │
│ 71 │ │ config = {**config["ctransformers"]} │
│ 72 │ │ config = merge(config, {"config": {"local_files_only": local_files_only}}) │
│ ❱ 73 │ │ llm = CTransformers(callbacks=callbacks, **config) │
│ 74 │ elif config["llm"] == "gptq": │
│ 75 │ │ llm = get_gptq_llm(config) │
│ 76 │ else: │
│ │
│ ╭─────────────────────────────────────── locals ───────────────────────────────────────╮ │
│ │ callback = None │ │
│ │ CallbackHandler = <class 'chatdocs.llms.get_llm.<locals>.CallbackHandler'> │ │
│ │ callbacks = None │ │
│ │ config = { │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'config': { │ │
│ │ │ │ 'context_length': 1024, │ │
│ │ │ │ 'local_files_only': False │ │
│ │ │ } │ │
│ │ } │ │
│ │ local_files_only = False │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\langchain\load\serializable.py:74 in __init__ │
│ │
│ 71 │ _lc_kwargs = PrivateAttr(default_factory=dict) │
│ 72 │ │
│ 73 │ def __init__(self, **kwargs: Any) -> None: │
│ ❱ 74 │ │ super().__init__(**kwargs) │
│ 75 │ │ self._lc_kwargs = kwargs │
│ 76 │ │
│ 77 │ def to_json(self) -> Union[SerializedConstructor, SerializedNotImplemented]: │
│ │
│ ╭─────────────────────────────────── locals ────────────────────────────────────╮ │
│ │ class = <class 'langchain.load.serializable.Serializable'> │ │
│ │ kwargs = { │ │
│ │ │ 'callbacks': None, │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'config': { │ │
│ │ │ │ 'context_length': 1024, │ │
│ │ │ │ 'local_files_only': False │ │
│ │ │ } │ │
│ │ } │ │
│ │ self = CTransformers() │ │
│ ╰───────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ G:\Chat\chatdocs-main\pydantic\main.py:339 in pydantic.main.BaseModel.__init__ │
│ │
│ [Errno 2] No such file or directory: 'G:\Chat\chatdocs-main\pydantic\main.py' │
│ │
│ G:\Chat\chatdocs-main\pydantic\main.py:1102 in pydantic.main.validate_model │
│ │
│ [Errno 2] No such file or directory: 'G:\Chat\chatdocs-main\pydantic\main.py' │
│ │
│ E:\chatdocs-main\lib\site-packages\langchain\llms\ctransformers.py:73 in validate_environment │
│ │
│ 70 │ │ │ ) │
│ 71 │ │ │
│ 72 │ │ config = values["config"] or {} │
│ ❱ 73 │ │ values["client"] = AutoModelForCausalLM.from_pretrained( │
│ 74 │ │ │ values["model"], │
│ 75 │ │ │ model_type=values["model_type"], │
│ 76 │ │ │ model_file=values["model_file"], │
│ │
│ ╭──────────────────────────────────────── locals ─────────────────────────────────────────╮ │
│ │ AutoModelForCausalLM = <class 'ctransformers.hub.AutoModelForCausalLM'> │ │
│ │ cls = <class 'langchain.llms.ctransformers.CTransformers'> │ │
│ │ config = {'context_length': 1024, 'local_files_only': False} │ │
│ │ values = { │ │
│ │ │ 'cache': None, │ │
│ │ │ 'verbose': False, │ │
│ │ │ 'callbacks': None, │ │
│ │ │ 'callback_manager': None, │ │
│ │ │ 'tags': None, │ │
│ │ │ 'metadata': None, │ │
│ │ │ 'client': None, │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ ... +2 │ │
│ │ } │ │
│ ╰─────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\ctransformers\hub.py:157 in from_pretrained │
│ │
│ 154 │ │ │ │ local_files_only=local_files_only, │
│ 155 │ │ │ ) │
│ 156 │ │ │
│ ❱ 157 │ │ return LLM( │
│ 158 │ │ │ model_path=model_path, │
│ 159 │ │ │ model_type=model_type, │
│ 160 │ │ │ config=config.config, │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ cls = <class 'ctransformers.hub.AutoModelForCausalLM'> │ │
│ │ config = AutoConfig( │ │
│ │ │ config=Config( │ │
│ │ │ │ top_k=40, │ │
│ │ │ │ top_p=0.95, │ │
│ │ │ │ temperature=0.8, │ │
│ │ │ │ repetition_penalty=1.1, │ │
│ │ │ │ last_n_tokens=64, │ │
│ │ │ │ seed=-1, │ │
│ │ │ │ batch_size=8, │ │
│ │ │ │ threads=-1, │ │
│ │ │ │ max_new_tokens=256, │ │
│ │ │ │ stop=None, │ │
│ │ │ │ stream=False, │ │
│ │ │ │ reset=True, │ │
│ │ │ │ context_length=1024, │ │
│ │ │ │ gpu_layers=0 │ │
│ │ │ ), │ │
│ │ │ model_type=None │ │
│ │ ) │ │
│ │ kwargs = {'context_length': 1024} │ │
│ │ lib = None │ │
│ │ local_files_only = False │ │
│ │ model_file = 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin' │ │
│ │ model_path = 'C:\Users\御丶奕.cache\huggingface\hub\models--TheBloke--Wiz… │ │
│ │ model_path_or_repo_id = 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML' │ │
│ │ model_type = 'llama' │ │
│ │ path_type = 'repo' │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\ctransformers\llm.py:214 in __init__ │
│ │
│ 211 │ │ │ config.gpu_layers, │
│ 212 │ │ ) │
│ 213 │ │ if self._llm is None: │
│ ❱ 214 │ │ │ raise RuntimeError( │
│ 215 │ │ │ │ f"Failed to create LLM '{model_type}' from '{model_path}'." │
│ 216 │ │ │ ) │
│ 217 │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = Config( │ │
│ │ │ top_k=40, │ │
│ │ │ top_p=0.95, │ │
│ │ │ temperature=0.8, │ │
│ │ │ repetition_penalty=1.1, │ │
│ │ │ last_n_tokens=64, │ │
│ │ │ seed=-1, │ │
│ │ │ batch_size=8, │ │
│ │ │ threads=-1, │ │
│ │ │ max_new_tokens=256, │ │
│ │ │ stop=None, │ │
│ │ │ stream=False, │ │
│ │ │ reset=True, │ │
│ │ │ context_length=1024, │ │
│ │ │ gpu_layers=0 │ │
│ │ ) │ │
│ │ lib = None │ │
│ │ model_path = 'C:\Users\御丶奕.cache\huggingface\hub\models--TheBloke--Wizard-Vicuna-… │ │
│ │ model_type = 'llama' │ │
│ │ self = <ctransformers.llm.LLM object at 0x000001F4ED4A19A0> │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Failed to create LLM 'llama' from
'C:\Users\御丶奕.cache\huggingface\hub\models--TheBloke--Wizard-Vicuna-7B-Uncensored-GGML\blobs\c31a4edd96527dcd808bcf9b99e3894065ac950747dac84ecd415a2387454e7c'.

Illegal instruction

Hello. I installed chatdocs as per the docs.

After installing, I ran chatdocs download to get the model and then added the docs with
chatdocs add /root/docs
Then, when I run chatdocs ui,
it reports
Illegal instruction

root@9bb4ee29f890:# chatdocs add /root/docs
Creating new vectorstore
Loading documents from /root/docs
Loading new documents: 100%|██████████████████████| 1/1 [00:00<00:00, 41.39it/s]
Loaded 1 new documents from /root/docs
Creating embeddings. May take a few minutes...
load INSTRUCTOR_Transformer
max_seq_length 512
root@9bb4ee29f890:# chatdocs ui

load INSTRUCTOR_Transformer

max_seq_length 512
Illegal instruction

Not using chatdocs.yml

chatdocs download --config C:\Users\Administrator\Documents\chatdocs\chatdocs.yml

Tried this variation

chatdocs download --config chatdocs.yml

Still does not use it.

How do I pass this to the chatdocs ui command to ensure it uses the GPU and the model specified there? Is this a recent change?

How do I install?

I think the instructions are too vague...

I put the commands in, but it says chatdocs isn't an available command.

Sorry I am a noob.
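For reference, a minimal sketch of the flow pieced together from commands that appear in other issues on this page (assuming pip and Python's scripts directory are on your PATH, which is the usual reason a command shows up as "not available"):

pip install chatdocs
chatdocs download                  # fetch the default models
chatdocs add /path/to/documents    # index your documents into the db directory
chatdocs ui                        # start the web UI (localhost:5000 per the default config shown above)

If chatdocs is still not recognized after installation, the Python scripts directory is likely missing from PATH.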

Doesn't fully use CPU?

I have noticed it never uses more than 400% in glances (the equivalent of 4 cores at 100%); that's only 25% of what my CPU has to offer. Is that normal, or do I have something configured wrong?

$ cat chatdocs.yml
llm: ctransformers

ctransformers:
  model: /mnt/dev/ai/oobabooga_linux/text-generation-webui/models/Wizard-Vicuna-7B-Uncensored/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama

download: false

I tried other (ggml) models, but they behaved pretty much the same.

I remember from trying oobabooga's text-generation webui that "Transformers" had similarly poor performance and I had to switch to "llama.cpp" to get better CPU utilization.

Sadly, ROCm (5.5?) is still not available for my GPU on Manjaro, so the CPU is currently my only option :(.
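The ctransformers Config shown in the tracebacks elsewhere on this page has a threads field (default -1), so a hedged sketch of pinning it explicitly in chatdocs.yml would be:

ctransformers:
  model: /mnt/dev/ai/oobabooga_linux/text-generation-webui/models/Wizard-Vicuna-7B-Uncensored/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
  config:
    threads: 8   # assumption: set to the number of physical cores you want used

download: false

Whether -1 resolves to only 4 threads by default is not confirmed here; this is just the obvious knob to try.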

Cannot delete DB

There are duplicate entries in my DB which I would like to delete.

Where is the DB stored on Windows 10? The documentation says that "The processed documents will be stored in db directory by default" and "Note: When you change the embeddings model, delete the db directory and add documents again."

But where is the DB directory stored? I have tried endless googling and searching for every directory in my computer named "db" and I still cannot find it.

Uninstalling chatdocs / chromadb and reinstalling does nothing; the duplicate entries in chatdocs remain.
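For what it's worth, the config dumps in the tracebacks on this page show chroma.persist_directory set to the relative path db, so the folder is created inside whichever directory chatdocs add was run from, which is why a global search for "db" is hard to narrow down. A hedged sketch of pinning it to a fixed location in chatdocs.yml (assuming an absolute path is accepted here) would be:

chroma:
  persist_directory: C:\chatdocs-db   # assumption: absolute paths work; the default is the relative path db

Deleting that directory (rather than reinstalling packages) is what the quoted documentation suggests for clearing entries.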

Do the Falcon models work with chatdocs?

I know the Falcon models are a little different and that they might not work with chatdocs. Has anyone tried and gotten them to run? Which model did you use: GGML, GPTQ, or maybe the standard models?

Running chatdocs ui on headless server

I'm attempting to run this on a headless server (Ubuntu 22.04) where I have considerably more resources, and I can't get access to it. I've attempted to modify the IP/port numbers in chatdocs.yml and tried modifying ui.py; nothing works. It keeps returning the

Local firewalls are disabled and the device is reachable via ping.

Any suggestions?

Invisible output

I asked two things in Italian; the first time it worked, but the second time I got this (no error log):
Input (Italian): Nella fase di stima del valore di mercato, l'income approach impiega due calcoli differenti, quali, puoi descriverli? (roughly: "In the market value estimation phase, the income approach uses two different calculations; which ones are they, can you describe them?")
Output:
(about 30 blank lines)
helpful il mettere e una o Comment on micro-

(Blank lines, then at the end an incoherent partial sentence, and then the sources.) It's as if the output is there in those lines but invisible, not printed.

Console: [2023-06-16 15:14:19 +0200] [66608] [INFO] 127.0.0.1:53010 GET /favicon.ico 1.1 404 207 997
I'm using the GPU (RTX 3060) on Windows 11:

ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
  config:
    context_length: 1024
    gpu_layers: 50

huggingface:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-HF
  pipeline_kwargs:
    max_new_tokens: 256
  device: 0

gptq:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ
  model_file: Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors
  pipeline_kwargs:
    max_new_tokens: 256
  device: 0

how to ... ?

  1. deploy
  2. update db
  3. hide document/file name source
  4. install to specific folder

Feature: --listen

Is there a way to add a --listen flag to access the UI from other computers on the network? Would it be as simple as changing line 63 in util.py
from app.run(host="localhost", port=config["port"], use_reloader=False)
to app.run(host="0.0.0.0", port=config["port"], use_reloader=False)?

GPTQ model seems slow

I've been using this chatdocs project with a GGML model, which has worked really well, if a bit slowly. I have read a lot online about GPTQ models delivering significantly better speeds, but when I trialled this I'm only getting roughly a 2x speed-up.

When I run the chatdocs ui command, it prints the message "CUDA extension not installed", but I have installed just about every CUDA-related package (several of which looked to be CUDA extensions) I can find online, and the message is still present. Is this likely to be slowing the model down? If so, any idea exactly which package this message wants installed?

I'm also getting the message "skip module injection for FusedLlamaMLPForQuantizedModel not support integrate without triton yet", but again, I have the triton package installed in my env. Any ideas on a likely cause, and whether this issue is likely to affect the speed?

Just to round off, I am very pleased with this project in general. It looks good, works nicely, and was relatively easy to install (I just had to find a few other packages online, such as cuDNN).

Crash on low end PC

It is crashing after adding a lot of PDFs on a low-end PC.

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 258032 bytes for Chunk::new
# Possible reasons:
#   The system is out of physical RAM or swap space
#   The process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
#   JVM is running with Unscaled Compressed Oops mode in which the Java heap is
#     placed in the first 4GB address space. The Java Heap base address is the
#     maximum limit for the native heap growth. Please use -XX:HeapBaseMinAddress
#     to set the Java Heap base and to place the Java Heap above 4GB virtual address.
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (arena.cpp:189), pid=13844, tid=23816
#
# JRE version: OpenJDK Runtime Environment JBR-17.0.6+10-829.9-jcef (17.0.6+10) (build 17.0.6+10-b829.9)
# Java VM: OpenJDK 64-Bit Server VM JBR-17.0.6+10-829.9-jcef (17.0.6+10-b829.9, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# No core dump will be written. Minidumps are not enabled by default on client versions of Windows
#
---------------  S U M M A R Y ------------

Command Line: exit -XX:ErrorFile=C:\Users\support2\\java_error_in_pycharm64_%p.log -XX:HeapDumpPath=C:\Users\support2\\java_error_in_pycharm64.hprof -Xms128m -Xmx750m -XX:ReservedCodeCacheSize=512m -XX:+UseG1GC -XX:SoftRefLRUPolicyMSPerMB=50 -XX:CICompilerCount=2 -XX:+HeapDumpOnOutOfMemoryError -XX:-OmitStackTraceInFastThrow -XX:+IgnoreUnrecognizedVMOptions -XX:CompileCommand=exclude,com/intellij/openapi/vfs/impl/FilePartNodeRoot,trieDescend -ea -Dsun.io.useCanonCaches=false -Dsun.java2d.metal=true -Djbr.catch.SIGABRT=true -Djdk.http.auth.tunneling.disabledSchemes="" -Djdk.attach.allowAttachSelf=true -Djdk.module.illegalAccess.silent=true -Dkotlinx.coroutines.debug=off -Xmx987m -Djb.vmOptionsFile=C:\Users\support2\AppData\Roaming\\JetBrains\\PyCharmCE2023.1\pycharm64.exe.vmoptions -Djava.system.class.loader=com.intellij.util.lang.PathClassLoader -Didea.vendor.name=JetBrains -Didea.paths.selector=PyCharmCE2023.1 -Djna.boot.library.path=C:\Program Files\JetBrains\PyCharm Community Edition 2023.1.2/lib/jna/amd64 -Dpty4j.preferred.native.folder=C:\Program Files\JetBrains\PyCharm Community Edition 2023.1.2/lib/pty4j -Djna.nosys=true -Djna.noclasspath=true -Didea.platform.prefix=PyCharmCore -Dsplash=true --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.ref=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.nio.charset=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/jdk.internal.vm=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.fs=ALL-UNNAMED --add-opens=java.base/sun.security.ssl=ALL-UNNAMED --add-opens=java.base/sun.security.util=ALL-UNNAMED --add-opens=java.base/sun.net.dns=ALL-UNNAMED --add-opens=java.desktop/java.awt=ALL-UNNAMED --add-opens=java.desktop/java.awt.dnd.peer=ALL-UNNAMED --add-opens=java.desktop/java.awt.event=ALL-UNNAMED --add-opens=java.desktop/java.awt.image=ALL-UNNAMED --add-opens=java.desktop/java.awt.peer=ALL-UNNAMED --add-opens=java.desktop/java.awt.font=ALL-UNNAMED --add-opens=java.desktop/javax.swing=ALL-UNNAMED --add-opens=java.desktop/javax.swing.plaf.basic=ALL-UNNAMED --add-opens=java.desktop/javax.swing.text.html=ALL-UNNAMED --add-opens=java.desktop/sun.awt.datatransfer=ALL-UNNAMED --add-opens=java.desktop/sun.awt.image=ALL-UNNAMED --add-opens=java.desktop/sun.awt.windows=ALL-UNNAMED --add-opens=java.desktop/sun.awt=ALL-UNNAMED --add-opens=java.desktop/sun.font=ALL-UNNAMED --add-opens=java.desktop/sun.java2d=ALL-UNNAMED --add-opens=java.desktop/sun.swing=ALL-UNNAMED --add-opens=jdk.attach/sun.tools.attach=ALL-UNNAMED --add-opens=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-opens=jdk.internal.jvmstat/sun.jvmstat.monitor=ALL-UNNAMED --add-opens=jdk.jdi/com.sun.tools.jdi=ALL-UNNAMED -Dide.native.launcher=true -Djcef.sandbox.ptr=0000020E90174FA0 

Host: 12th Gen Intel(R) Core(TM) i7-12700, 20 cores, 7G,  Windows 10 , 64 bit Build 19041 (10.0.19041.2913)
Time: Wed Jun  7 12:36:17 2023 Arabian Standard Time elapsed time: 32.528581 seconds (0d 0h 0m 32s)

---------------  T H R E A D  ---------------

Current thread (0x0000020ebc83f590):  JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=23816, stack(0x000000f0b5800000,0x000000f0b5900000)]


Current CompileTask:
C2:  32528 25800   !   4       com.intellij.ide.IdeEventQueue::dispatchByCustomDispatchers (110 bytes)

Stack: [0x000000f0b5800000,0x000000f0b5900000]
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [jvm.dll+0x683c5a]
V  [jvm.dll+0x842764]
V  [jvm.dll+0x843f5e]
V  [jvm.dll+0x8445c3]
V  [jvm.dll+0x249b75]
V  [jvm.dll+0xabcac]
V  [jvm.dll+0xac27c]
V  [jvm.dll+0x368857]
V  [jvm.dll+0x1bd0b8]
V  [jvm.dll+0x21c359]
V  [jvm.dll+0x21b621]
V  [jvm.dll+0x1a4fdd]
V  [jvm.dll+0x22b098]
V  [jvm.dll+0x229159]
V  [jvm.dll+0x7f81ac]
V  [jvm.dll+0x7f270a]
V  [jvm.dll+0x682a95]
C  [ucrtbase.dll+0x21bb2]
C  [KERNEL32.DLL+0x17614]
C  [ntdll.dll+0x526a1]

---------------  S Y S T E M  ---------------

OS:
 Windows 10 , 64 bit Build 19041 (10.0.19041.2913)
OS uptime: 7 days 4:23 hours
Hyper-V role detected

CPU: total 20 (initial active 20) (10 cores per cpu, 2 threads per core) family 6 model 151 stepping 2 microcode 0x1f, cx8, cmov, fxsr, ht, mmx, 3dnowpref, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, lzcnt, tsc, tscinvbit, avx, avx2, aes, erms, clmul, bmi1, bmi2, adx, sha, fma, vzeroupper, clflush, clflushopt, clwb, hv

Memory: 4k page, system-wide physical 7897M (408M free)
TotalPageFile size 32473M (AvailPageFile size 0M)
current process WorkingSet (physical memory assigned to process): 1128M, peak: 1133M
current process commit charge ("private bytes"): 1004M, peak: 1008M

vm_info: OpenJDK 64-Bit Server VM (17.0.6+10-b829.9) for windows-amd64 JRE (17.0.6+10-b829.9), built on 2023-04-09 by "builduser" with MS VC++ 16.10 / 16.11 (VS2019)

END.


couldn't ingest docs & typer incompatible

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
openapi-python-client 0.13.4 requires typer<0.8.0,>=0.6, but you have typer 0.9.0 which is incompatible.
spacy 3.5.3 requires typer<0.8.0,>=0.3.0, but you have typer 0.9.0 which is incompatible.

1. I had this error when I was trying to pip install chatdocs.
Does it affect my app overall?

2. And when I use it, it behaves like a plain ChatGPT; I couldn't ingest any PDF file using chatdocs add /path/to/documents (adding a screenshot).
Screenshot 2023-06-09 124423

Something is wrong with 0.2.5 - chatdocs download command

Hi @marella

Something broke with 0.2.5. I have tried many different ways to get it to work.

When I run 0.2.4 everything works flawlessly.

With 0.2.5, something goes wrong any time I do chatdocs download; it does not seem to respect changes to chatdocs.yml.

When I go into the site-packages folder and change to the correct model file path, it gives me an error that ctransformers.dll was not found.

Is anyone able to fix this?
