
chatdocs's People

Contributors

ianmeinert, marella, pyrater


chatdocs's Issues

Problem installing GPTQ

Hello,

I have a problem with the command "pip install git+https://github.com/PanQiWei/[email protected]":

ERROR: Failed building wheel for auto-gptq
Running setup.py clean for auto-gptq
Failed to build auto-gptq
ERROR: Could not build wheels for auto-gptq, which is required to install pyproject.toml-based projects

I have installed:
conda install cuda --channel nvidia/label/cuda-12.1.0
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121

chunksize and max_seq_length of embedding not matching

AFAIK the default length measure of RecursiveCharacterTextSplitter is len, while the instructor embeddings use a token-based measure.

The program still works; however, the chunks inserted into the database are smaller than one would expect.
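A minimal sketch of one way to make the two measures line up, assuming the langchain RecursiveCharacterTextSplitter and a Hugging Face tokenizer for the instructor model (the tokenizer name and chunk sizes here are assumptions, not chatdocs defaults):

# Sketch: measure chunk length in tokens so chunk_size lines up with the
# embedding model's max_seq_length (512) instead of the character count (len).
from langchain.text_splitter import RecursiveCharacterTextSplitter
from transformers import AutoTokenizer

# Assumption: the instructor model's tokenizer can be loaded this way.
tokenizer = AutoTokenizer.from_pretrained("hkunlp/instructor-large")

def token_len(text: str) -> int:
    return len(tokenizer.encode(text, add_special_tokens=False))

splitter = RecursiveCharacterTextSplitter(
    chunk_size=512,        # now counted in tokens, matching max_seq_length
    chunk_overlap=50,
    length_function=token_len,
)
chunks = splitter.split_text("some long document text ...")

With a character-based len, a chunk of chunk_size characters is typically only a fraction of that many tokens, so the embeddings see chunks well below max_seq_length, which matches the observation above.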

Error on chatdocs download on macOS (M1 Pro)

Hi guys. As the title says, I tried to run the chatdocs download command on my MacBook and got this error.
Can somebody tell me how to fix it? I would like to try this cool tool on my local machine.

Thank you in advance!

load INSTRUCTOR_Transformer
max_seq_length 512
Fetching 0 files: 0it [00:00, ?it/s]
Fetching 1 files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 19239.93it/s]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/-/Library/Python/3.9/lib/python/site-packages/chatdocs/main.py:26 in download │
│ │
│ 23 │ from .download import download │
│ 24 │ │
│ 25 │ config = get_config(config) │
│ ❱ 26 │ download(config=config) │
│ 27 │
│ 28 │
│ 29 @app.command() │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = { │ │
│ │ │ 'embeddings': {'model': 'hkunlp/instructor-large'}, │ │
│ │ │ 'llm': 'ctransformers', │ │
│ │ │ 'ctransformers': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ │ 'model_type': 'llama', │ │
│ │ │ │ 'config': {'context_length': 1024, 'local_files_only': False} │ │
│ │ │ }, │ │
│ │ │ 'huggingface': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-HF', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'gptq': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ', │ │
│ │ │ │ 'model_file': │ │
│ │ 'Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'download': False, │ │
│ │ │ 'host': 'localhost', │ │
│ │ │ 'port': 5000, │ │
│ │ │ 'auth': False, │ │
│ │ │ 'chroma': { │ │
│ │ │ │ 'persist_directory': 'db', │ │
│ │ │ │ 'chroma_db_impl': 'duckdb+parquet', │ │
│ │ │ │ 'anonymized_telemetry': False │ │
│ │ │ }, │ │
│ │ │ ... +1 │ │
│ │ } │ │
│ │ download = <function download at 0x105ea9d30> │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/chatdocs/download.py:10 in download │
│ │
│ 7 def download(config: Dict[str, Any]) -> None: │
│ 8 │ config = {**config, "download": True} │
│ 9 │ get_embeddings(config) │
│ ❱ 10 │ get_llm(config) │
│ 11 │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = { │ │
│ │ │ 'embeddings': {'model': 'hkunlp/instructor-large'}, │ │
│ │ │ 'llm': 'ctransformers', │ │
│ │ │ 'ctransformers': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ │ 'model_type': 'llama', │ │
│ │ │ │ 'config': {'context_length': 1024, 'local_files_only': False} │ │
│ │ │ }, │ │
│ │ │ 'huggingface': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-HF', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'gptq': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ', │ │
│ │ │ │ 'model_file': │ │
│ │ 'Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'download': True, │ │
│ │ │ 'host': 'localhost', │ │
│ │ │ 'port': 5000, │ │
│ │ │ 'auth': False, │ │
│ │ │ 'chroma': { │ │
│ │ │ │ 'persist_directory': 'db', │ │
│ │ │ │ 'chroma_db_impl': 'duckdb+parquet', │ │
│ │ │ │ 'anonymized_telemetry': False │ │
│ │ │ }, │ │
│ │ │ ... +1 │ │
│ │ } │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/chatdocs/llms.py:73 in get_llm │
│ │
│ 70 │ if config["llm"] == "ctransformers": │
│ 71 │ │ config = {**config["ctransformers"]} │
│ 72 │ │ config = merge(config, {"config": {"local_files_only": local_files_only}}) │
│ ❱ 73 │ │ llm = CTransformers(callbacks=callbacks, **config) │
│ 74 │ elif config["llm"] == "gptq": │
│ 75 │ │ llm = get_gptq_llm(config) │
│ 76 │ else: │
│ │
│ ╭─────────────────────────────────────── locals ───────────────────────────────────────╮ │
│ │ callback = None │ │
│ │ CallbackHandler = <class 'chatdocs.llms.get_llm.<locals>.CallbackHandler'> │ │
│ │ callbacks = None │ │
│ │ config = { │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'config': { │ │
│ │ │ │ 'context_length': 1024, │ │
│ │ │ │ 'local_files_only': False │ │
│ │ │ } │ │
│ │ } │ │
│ │ local_files_only = False │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/langchain/load/serializable.py:74 │
│ in __init__
│ │
│ 71 │ _lc_kwargs = PrivateAttr(default_factory=dict) │
│ 72 │ │
│ 73 │ def __init__(self, **kwargs: Any) -> None: │
│ ❱ 74 │ │ super().__init__(**kwargs) │
│ 75 │ │ self._lc_kwargs = kwargs │
│ 76 │ │
│ 77 │ def to_json(self) -> Union[SerializedConstructor, SerializedNotImplemented]: │
│ │
│ ╭─────────────────────────────────── locals ────────────────────────────────────╮ │
│ │ class = <class 'langchain.load.serializable.Serializable'> │ │
│ │ kwargs = { │ │
│ │ │ 'callbacks': None, │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'config': { │ │
│ │ │ │ 'context_length': 1024, │ │
│ │ │ │ 'local_files_only': False │ │
│ │ │ } │ │
│ │ } │ │
│ │ self = CTransformers() │ │
│ ╰───────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Desktop/chatdocs/pydantic/main.py:339 in pydantic.main.BaseModel.__init__
│ │
│ [Errno 2] No such file or directory: '/Users/-/Desktop/chatdocs/pydantic/main.py' │
│ │
│ /Users/-/Desktop/chatdocs/pydantic/main.py:1102 in pydantic.main.validate_model │
│ │
│ [Errno 2] No such file or directory: '/Users/-/Desktop/chatdocs/pydantic/main.py' │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/langchain/llms/ctransformers.py:73 │
│ in validate_environment │
│ │
│ 70 │ │ │ ) │
│ 71 │ │ │
│ 72 │ │ config = values["config"] or {} │
│ ❱ 73 │ │ values["client"] = AutoModelForCausalLM.from_pretrained( │
│ 74 │ │ │ values["model"], │
│ 75 │ │ │ model_type=values["model_type"], │
│ 76 │ │ │ model_file=values["model_file"], │
│ │
│ ╭──────────────────────────────────────── locals ─────────────────────────────────────────╮ │
│ │ AutoModelForCausalLM = <class 'ctransformers.hub.AutoModelForCausalLM'> │ │
│ │ cls = <class 'langchain.llms.ctransformers.CTransformers'> │ │
│ │ config = {'context_length': 1024, 'local_files_only': False} │ │
│ │ values = { │ │
│ │ │ 'cache': None, │ │
│ │ │ 'verbose': False, │ │
│ │ │ 'callbacks': None, │ │
│ │ │ 'callback_manager': None, │ │
│ │ │ 'tags': None, │ │
│ │ │ 'metadata': None, │ │
│ │ │ 'client': None, │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ ... +2 │ │
│ │ } │ │
│ ╰─────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/ctransformers/hub.py:157 in │
│ from_pretrained │
│ │
│ 154 │ │ │ │ local_files_only=local_files_only, │
│ 155 │ │ │ ) │
│ 156 │ │ │
│ ❱ 157 │ │ return LLM( │
│ 158 │ │ │ model_path=model_path, │
│ 159 │ │ │ model_type=model_type, │
│ 160 │ │ │ config=config.config, │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ cls = <class 'ctransformers.hub.AutoModelForCausalLM'> │ │
│ │ config = AutoConfig( │ │
│ │ │ config=Config( │ │
│ │ │ │ top_k=40, │ │
│ │ │ │ top_p=0.95, │ │
│ │ │ │ temperature=0.8, │ │
│ │ │ │ repetition_penalty=1.1, │ │
│ │ │ │ last_n_tokens=64, │ │
│ │ │ │ seed=-1, │ │
│ │ │ │ batch_size=8, │ │
│ │ │ │ threads=-1, │ │
│ │ │ │ max_new_tokens=256, │ │
│ │ │ │ stop=None, │ │
│ │ │ │ stream=False, │ │
│ │ │ │ reset=True, │ │
│ │ │ │ context_length=1024, │ │
│ │ │ │ gpu_layers=0 │ │
│ │ │ ), │ │
│ │ │ model_type=None │ │
│ │ ) │ │
│ │ kwargs = {'context_length': 1024} │ │
│ │ lib = None │ │
│ │ local_files_only = False │ │
│ │ model_file = 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin' │ │
│ │ model_path = '/Users/-/.cache/huggingface/hub/models--TheBloke--Wizard-V… │ │
│ │ model_path_or_repo_id = 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML' │ │
│ │ model_type = 'llama' │ │
│ │ path_type = 'repo' │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/ctransformers/llm.py:206 in │
__init__
│ │
│ 203 │ │ if not Path(model_path).is_file(): │
│ 204 │ │ │ raise ValueError(f"Model path '{model_path}' doesn't exist.") │
│ 205 │ │ │
│ ❱ 206 │ │ self._lib = load_library(lib, cuda=config.gpu_layers > 0) │
│ 207 │ │ self._llm = self._lib.ctransformers_llm_create( │
│ 208 │ │ │ model_path.encode(), │
│ 209 │ │ │ model_type.encode(), │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = Config( │ │
│ │ │ top_k=40, │ │
│ │ │ top_p=0.95, │ │
│ │ │ temperature=0.8, │ │
│ │ │ repetition_penalty=1.1, │ │
│ │ │ last_n_tokens=64, │ │
│ │ │ seed=-1, │ │
│ │ │ batch_size=8, │ │
│ │ │ threads=-1, │ │
│ │ │ max_new_tokens=256, │ │
│ │ │ stop=None, │ │
│ │ │ stream=False, │ │
│ │ │ reset=True, │ │
│ │ │ context_length=1024, │ │
│ │ │ gpu_layers=0 │ │
│ │ ) │ │
│ │ lib = None │ │
│ │ model_path = '/Users/-/.cache/huggingface/hub/models--TheBloke--Wizard-Vicuna-7B-Un… │ │
│ │ model_type = 'llama' │ │
│ │ self = <ctransformers.llm.LLM object at 0x287252d00> │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/ctransformers/llm.py:101 in │
│ load_library │
│ │
│ 98 │ if hasattr(os, "add_dll_directory") and "CUDA_PATH" in os.environ: │
│ 99 │ │ os.add_dll_directory(os.path.join(os.environ["CUDA_PATH"], "bin")) │
│ 100 │ │
│ ❱ 101 │ path = find_library(path, cuda=cuda) │
│ 102 │ lib = CDLL(path) │
│ 103 │ │
│ 104 │ lib.ctransformers_llm_create.argtypes = [ │
│ │
│ ╭─── locals ───╮ │
│ │ cuda = False │ │
│ │ path = None │ │
│ ╰──────────────╯ │
│ │
│ /Users/-/Library/Python/3.9/lib/python/site-packages/ctransformers/lib.py:23 in │
│ find_library │
│ │
│ 20 │ │ else: │
│ 21 │ │ │ from cpuinfo import get_cpu_info │
│ 22 │ │ │ │
│ ❱ 23 │ │ │ flags = get_cpu_info()["flags"] │
│ 24 │ │ │ │
│ 25 │ │ │ if "avx2" in flags: │
│ 26 │ │ │ │ path = "avx2" │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ cuda = False │ │
│ │ get_cpu_info = <function get_cpu_info at 0x287262670> │ │
│ │ lib_directory = PosixPath('/Users/-/Library/Python/3.9/lib/python/site-packages/ctr… │ │
│ │ path = None │ │
│ │ system = 'Darwin' │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'flags'

ValueError: Could not parse output: RetrievalQA.from_chain_type(chain_type="map_rerank")

I switched the chain_type to "map_rerank" and got the first of my three answers, but afterwards I got this error. It seems an update of apply_and_parse is necessary?

C:\Users\USERID\Downloads\word01.00.docx:
score:   0.2196645587682724
C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\llm.py:303: UserWarning: The apply_and_parse method is deprecated, instead pass an output parser directly to LLMChain.
 warnings.warn(
Exception in thread Thread-3 (worker):
Traceback (most recent call last):
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\threading.py", line 1038, in _bootstrap_inner
   self.run()
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\threading.py", line 975, in run
   self._target(*self._args, **self._kwargs)
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\chatdocs\ui.py", line 38, in worker
   res = qa(query)
         ^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 243, in __call__
   raise e
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 237, in __call__
   self._call(inputs, run_manager=run_manager)
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\retrieval_qa\base.py", line 131, in _call
   answer = self.combine_documents_chain.run(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 445, in run
   return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 243, in __call__
   raise e
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\base.py", line 237, in __call__
   self._call(inputs, run_manager=run_manager)
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\combine_documents\base.py", line 106, in _call
   output, extra_return_dict = self.combine_docs(
                               ^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\combine_documents\map_rerank.py", line 154, in combine_docs
   results = self.llm_chain.apply_and_parse(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\llm.py", line 308, in apply_and_parse
   return self._parse_generation(result)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\llm.py", line 314, in _parse_generation
   return [
          ^
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\chains\llm.py", line 315, in <listcomp>
   self.prompt.output_parser.parse(res[self.output_key])
 File "C:\Users\USERID\.conda\envs\chatdocs_clean\Lib\site-packages\langchain\output_parsers\regex.py", line 32, in parse
   raise ValueError(f"Could not parse output: {text}")
ValueError: Could not parse output:
Please find it all

RuntimeError: Failed to create LLM 'llama' from

I'm getting this error when I type "chatdocs download" in the command prompt. What am I doing wrong?

C:\AI\chatdocs>chatdocs download
load INSTRUCTOR_Transformer
max_seq_length 512
Fetching 0 files: 0it [00:00, ?it/s]
Fetching 1 files: 100%|█████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 999.36it/s]
error loading model: unrecognized tensor type 10

llama_init_from_file: failed to load model
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\dkv9\miniconda3\lib\site-packages\chatdocs\main.py:26 in download │
│ │
│ 23 │ from .download import download │
│ 24 │ │
│ 25 │ config = get_config(config) │
│ ❱ 26 │ download(config=config) │
│ 27 │
│ 28 │
│ 29 @app.command() │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\chatdocs\download.py:10 in download │
│ │
│ 7 def download(config: Dict[str, Any]) -> None: │
│ 8 │ config = {**config, "download": True} │
│ 9 │ get_embeddings(config) │
│ ❱ 10 │ get_llm(config) │
│ 11 │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\chatdocs\llms.py:73 in get_llm │
│ │
│ 70 │ if config["llm"] == "ctransformers": │
│ 71 │ │ config = {**config["ctransformers"]} │
│ 72 │ │ config = merge(config, {"config": {"local_files_only": local_files_only}}) │
│ ❱ 73 │ │ llm = CTransformers(callbacks=callbacks, **config) │
│ 74 │ elif config["llm"] == "gptq": │
│ 75 │ │ llm = get_gptq_llm(config) │
│ 76 │ else: │
│ │
│ C:\AI\chatdocs\pydantic\main.py:339 in pydantic.main.BaseModel.__init__
│ │
│ [Errno 2] No such file or directory: 'C:\AI\chatdocs\pydantic\main.py' │
│ │
│ C:\AI\chatdocs\pydantic\main.py:1102 in pydantic.main.validate_model │
│ │
│ [Errno 2] No such file or directory: 'C:\AI\chatdocs\pydantic\main.py' │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\langchain\llms\ctransformers.py:70 in │
│ validate_environment │
│ │
│ 67 │ │ │ ) │
│ 68 │ │ │
│ 69 │ │ config = values["config"] or {} │
│ ❱ 70 │ │ values["client"] = AutoModelForCausalLM.from_pretrained( │
│ 71 │ │ │ values["model"], │
│ 72 │ │ │ model_type=values["model_type"], │
│ 73 │ │ │ model_file=values["model_file"], │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\ctransformers\hub.py:157 in from_pretrained │
│ │
│ 154 │ │ │ │ local_files_only=local_files_only, │
│ 155 │ │ │ ) │
│ 156 │ │ │
│ ❱ 157 │ │ return LLM( │
│ 158 │ │ │ model_path=model_path, │
│ 159 │ │ │ model_type=model_type, │
│ 160 │ │ │ config=config.config, │
│ │
│ C:\Users\dkv9\miniconda3\lib\site-packages\ctransformers\llm.py:205 in __init__
│ │
│ 202 │ │ │ config.gpu_layers, │
│ 203 │ │ ) │
│ 204 │ │ if self._llm is None: │
│ ❱ 205 │ │ │ raise RuntimeError( │
│ 206 │ │ │ │ f"Failed to create LLM '{model_type}' from '{model_path}'." │
│ 207 │ │ │ ) │
│ 208 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Failed to create LLM 'llama' from
'C:\Users\dkv9.cache\huggingface\hub\models--TheBloke--Wizard-Vicuna-7B-Uncensored-GGML\snapshots\531879da598ebc577cd4
a03bdde9fbe3a641fc63\Wizard-Vicuna-7B-Uncensored.ggmlv3.q2_K.bin'.

Memory optimization

Out of curiosity, I tried containerizing the package with the offline models, the required pip packages installed, and a sample document (size: 62 kB) loaded through chatdocs add.

The build could not complete locally due to memory issues. I tried it in a hosted environment and I got the following warnings during the build process:

docker_build_error
server_health_error

Even though the package does not officially support containers yet, this indicates that we need to optimize the code performing various operations (such as adding files or loading models) so that it works well on low-end PCs.

Incomplete response

The response to a query is incomplete. ChatGPT 3.5 allows users to prompt it to continue the response in a new prompt, and the team at Oobabooga added a "Continue" button to their WebUI to do the same.

There is currently no way to continue a cut-off response through a new prompt, and splitting the prompts into smaller chunks resulted in a loss of context when responding.

I'm unsure whether this is a bug or a missing feature, and am wondering whether there is a fix to allow a full response.

I'm new to this.

Screenshot 2023-07-09 at 08-47-17 ChatDocs
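For what it's worth, the config dumps in the tracebacks elsewhere on this page show generation capped at max_new_tokens=256. A hedged sketch, assuming chatdocs.yml passes these keys through to ctransformers unchanged, would be to raise that limit:

ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
  config:
    context_length: 1024
    max_new_tokens: 512   # default in the dumps above is 256; assumption: a higher cap allows longer answers

The context_length shown in the same dumps still bounds prompt plus answer, so treat these numbers as a starting point rather than known-good values.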

Is it possible to reset the DB?

Hello,
I uploaded documents and later deleted them, but the LLM still has those documents in memory.
How can I reset the memory or refresh the document list?

thanks
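There doesn't seem to be a dedicated reset command mentioned here, but the documentation quoted in another issue below says the processed documents live in the db directory and that it can be deleted. A rough sketch, assuming the default relative db directory and a Linux/macOS shell:

# run from the directory where you originally ran chatdocs add
rm -rf db                        # drop the whole vector store
chatdocs add /path/to/documents  # re-index only the documents you still want

What the LLM "remembers" comes from the vector store rather than the model itself, so rebuilding the db directory should refresh the document list. Note that another issue below ("Document does not import after db folder has been deleted.") reports problems with exactly this flow, so it may not work on all versions.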

Conda Installation? Anyone got it running?

Hi!

I have been having some issues with dependencies lately and wanted to isolate an install with Conda. I am running into some issues. Has anyone gotten this to run under Conda?

can't use ggml-gpt4all-j-v1.3-groovy.bin

I wanted to use another LLM, but I got some errors such as:

Screenshot 2023-06-14 173048

and this is my chatdocs.yml:

Screenshot 2023-06-14 173357

I already did pip install ctransformers and ran CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers.

How can I use ggml-gpt4all-j-v1.3-groovy.bin?
I'm sorry if these questions/problems are easy. I'm still a beginner on this subject, but I really love the work you're putting in.

How does one uninstall everything?

Hi,

I have found most of the folders that this package installs to; it's quite spread out on Windows. Is there a good way to uninstall everything if needed?
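A rough sketch of what manual cleanup could look like, assuming a plain pip install; the paths are taken from the tracebacks and documentation quoted elsewhere on this page and may differ on your machine:

pip uninstall chatdocs
# downloaded models are cached by Hugging Face Hub, e.g.:
#   ~/.cache/huggingface/hub/models--TheBloke--Wizard-Vicuna-7B-Uncensored-GGML   (Linux/macOS)
#   C:\Users\<you>\.cache\huggingface\hub\...                                     (Windows)
# processed documents live in the db directory of each folder where chatdocs add was run

Dependencies pulled in by chatdocs (langchain, ctransformers, chromadb, ...) are separate packages and would need to be uninstalled individually if you want them gone too.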

Trouble with chatdocs download

I have tried this with 0.2.4 and 0.2.5, getting the same results.


pete@chatdocs:~ $ pip show ctransformers
Name: ctransformers
Version: 0.2.15
Summary: Python bindings for the Transformer models implemented in C/C++ using GGML library.
Home-page: https://github.com/marella/ctransformers
Author: Ravindra Marella
Author-email: [email protected]
License: MIT
Location: /home/pete/.local/lib/python3.9/site-packages
Requires: huggingface-hub
Required-by: chatdocs

Initially I had other errors that were resolved by installing:

  • typing_extensions==4.5.0
  • typing-inspect==0.8.0

Now I get the following:

OSError: /home/pete/.local/lib/python3.9/site-packages/ctransformers/lib/avx2/libctransformers.so: cannot open shared object file: No such file or
directory
pete@chatdocs:~ $ ls .local/lib/python3.9/site-packages/ctransformers/lib/avx2
ctransformers.dll libctransformers.dylib libctransformers.so

Permissions on the file are -rwxr-xr-x

So I'm stuck. Any help appreciated. Thanks.

Where do I store pre-downloaded models?

Sorry for the daft question, but I've already downloaded a host of models and would rather not copy them into yet another area of my PC. Is there a location I should store them in for this to access them? Or is there any way I can change the file path to the model?
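One option, sketched from the config another user posted further down this page, is to point model in chatdocs.yml directly at an existing local file instead of a Hugging Face repo id (the path below is a placeholder):

ctransformers:
  model: /path/to/your/existing/models/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin   # local file path instead of a repo id
  model_type: llama

llm: ctransformers

This avoids copying the file; whether the other backends (huggingface, gptq) accept local paths the same way is not confirmed here.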

Won't use GPU

I'm trying to have ctransformers use the GPU, but it won't work.

my chatdocs.yml:

ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
  config:
    gpu_layers: 50


llm: ctransformers
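One thing worth checking, going by the command quoted in another issue on this page, is whether ctransformers was built with CUDA support at all; the assumption here is that the plain wheel is CPU-only, so a rebuild may be needed before gpu_layers has any effect:

CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers

This is a hedged suggestion rather than a confirmed fix for this particular setup.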

Error when parquet files get too big / function for splitting?

Hi!

I have been uploading a lot of data and ran into a Snappy compression error after reaching around 3.6 GB of data in the parquet file.

Error: Invalid Error: Snappy decompression failure

I saw that there is a limit for parquet files and that the limit is 4 GB. Could we add functionality to split the parquet files when they reach 1 GB of data to get rid of this issue? Does anyone know how to do it?

Upload document function and embeddings processing in web ui

Dear Dev,

It would be great to have a sidebar in the web UI with the ability to upload directly (drag-and-drop) and start processing embeddings. After that, a prompt with a "reload web UI" button would be the perfect workflow for adding new documents on the fly.

Highly appreciated :)

Failed building wheel for auto-gptq

` "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin\nvcc" -c autogptq_cuda/autogptq_cuda_kernel.cu -o build\temp.win-amd64-cpython-310\Release\autogptq_cuda/autogptq_cuda_kernel.obj -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\TH -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" -Iautogptq_cuda -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\include -IC:\Users\jsviv\AppData\Local\Programs\Python\Python310\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.36.32532\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.36.32532\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=autogptq_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --use-local-env
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include\crt/host_config.h(160): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2019 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
autogptq_cuda_kernel.cu
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin\nvcc.exe' failed with exit code 2
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for auto-gptq
Running setup.py clean for auto-gptq
Failed to build auto-gptq
ERROR: Could not build wheels for auto-gptq, which is required to install pyproject.toml-based projects

I have an NVIDIA card, the NVIDIA Toolkit mapped to an environment variable, and Visual Studio 2019, and it still won't work.

Document does not import after db folder has been deleted.

I am attempting to use chatdocs in different folders, so that each folder references a different data set. I wanted to clear data set A and start over, so I removed the db folder. When I added a document to data set A using "chatdocs add /path/to/file", I get the following error message:

Loading new documents: 0it [00:00, ?it/s]
No new documents to load

I attempted to uninstall chatdocs via pip, and reinstall, but I'm still getting the same error message.

When attempting to launch chatdocs from the folder (which no longer has a db) it crashes after being asked a question.

Index not found, please create an instance before querying

Config to Limit Chatdoc responses only to documents added

First of all, this is an amazing tool for offline Q&A modules.

I wanted to check whether there is a config available in the chatdocs.yml file to limit responses only to the documents added, and to reply to open questions with a standard "I don't know" response.

Or at least to set the temperature of the models used, to ensure the answers are not made up from the LLMs' pre-trained text.
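On the temperature point, the config dumps in the tracebacks on this page show temperature (default 0.8) as one of the ctransformers config fields, so a hedged sketch of lowering it in chatdocs.yml would be:

ctransformers:
  config:
    temperature: 0.1   # lower sampling temperature for more deterministic answers

Note this only makes generation less random; restricting answers strictly to the added documents would be a prompt/chain-level change, which is not covered by any config key shown on this page.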

Great Project - Question About Docs

Is there an overview or guide as to which kinds of documents this can handle, or what interim steps are needed to process, for example, doc, docx, ppt, or pdf files?

RuntimeError: Failed to create LLM 'llama'

I have the same issue:

(chatdocs-main) PS G:\Chat\chatdocs-main> chatdocs download
load INSTRUCTOR_Transformer
max_seq_length 512
Fetching 0 files: 0it [00:00, ?it/s]
Fetching 1 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<?, ?it/s]
error loading model: failed to open C:\Users\御丶奕.cache\huggingface\hub\models--TheBloke--Wizard-Vicuna-7B-Uncensored-GGML\blobs\c31a4edd96527dcd808bcf9b99e3894065ac950747dac84ecd
415a2387454e7c: No such file or directory
llama_init_from_file: failed to load model
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ E:\chatdocs-main\lib\site-packages\chatdocs\main.py:26 in download │
│ │
│ 23 │ from .download import download │
│ 24 │ │
│ 25 │ config = get_config(config) │
│ ❱ 26 │ download(config=config) │
│ 27 │
│ 28 │
│ 29 @app.command() │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = { │ │
│ │ │ 'embeddings': {'model': 'hkunlp/instructor-large'}, │ │
│ │ │ 'llm': 'ctransformers', │ │
│ │ │ 'ctransformers': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ │ 'model_type': 'llama', │ │
│ │ │ │ 'config': {'context_length': 1024, 'local_files_only': False} │ │
│ │ │ }, │ │
│ │ │ 'huggingface': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-HF', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'gptq': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ', │ │
│ │ │ │ 'model_file': │ │
│ │ 'Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'download': False, │ │
│ │ │ 'host': 'localhost', │ │
│ │ │ 'port': 5000, │ │
│ │ │ 'auth': False, │ │
│ │ │ 'chroma': { │ │
│ │ │ │ 'persist_directory': 'db', │ │
│ │ │ │ 'chroma_db_impl': 'duckdb+parquet', │ │
│ │ │ │ 'anonymized_telemetry': False │ │
│ │ │ }, │ │
│ │ │ ... +1 │ │
│ │ } │ │
│ │ download = <function download at 0x000001F46E199790> │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\chatdocs\download.py:10 in download │
│ │
│ 7 def download(config: Dict[str, Any]) -> None: │
│ 8 │ config = {**config, "download": True} │
│ 9 │ get_embeddings(config) │
│ ❱ 10 │ get_llm(config) │
│ 11 │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = { │ │
│ │ │ 'embeddings': {'model': 'hkunlp/instructor-large'}, │ │
│ │ │ 'llm': 'ctransformers', │ │
│ │ │ 'ctransformers': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ │ 'model_type': 'llama', │ │
│ │ │ │ 'config': {'context_length': 1024, 'local_files_only': False} │ │
│ │ │ }, │ │
│ │ │ 'huggingface': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-HF', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'gptq': { │ │
│ │ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ', │ │
│ │ │ │ 'model_file': │ │
│ │ 'Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors', │ │
│ │ │ │ 'pipeline_kwargs': {'max_new_tokens': 256} │ │
│ │ │ }, │ │
│ │ │ 'download': True, │ │
│ │ │ 'host': 'localhost', │ │
│ │ │ 'port': 5000, │ │
│ │ │ 'auth': False, │ │
│ │ │ 'chroma': { │ │
│ │ │ │ 'persist_directory': 'db', │ │
│ │ │ │ 'chroma_db_impl': 'duckdb+parquet', │ │
│ │ │ │ 'anonymized_telemetry': False │ │
│ │ │ }, │ │
│ │ │ ... +1 │ │
│ │ } │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\chatdocs\llms.py:73 in get_llm │
│ │
│ 70 │ if config["llm"] == "ctransformers": │
│ 71 │ │ config = {**config["ctransformers"]} │
│ 72 │ │ config = merge(config, {"config": {"local_files_only": local_files_only}}) │
│ ❱ 73 │ │ llm = CTransformers(callbacks=callbacks, **config) │
│ 74 │ elif config["llm"] == "gptq": │
│ 75 │ │ llm = get_gptq_llm(config) │
│ 76 │ else: │
│ │
│ ╭─────────────────────────────────────── locals ───────────────────────────────────────╮ │
│ │ callback = None │ │
│ │ CallbackHandler = <class 'chatdocs.llms.get_llm.<locals>.CallbackHandler'> │ │
│ │ callbacks = None │ │
│ │ config = { │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'config': { │ │
│ │ │ │ 'context_length': 1024, │ │
│ │ │ │ 'local_files_only': False │ │
│ │ │ } │ │
│ │ } │ │
│ │ local_files_only = False │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\langchain\load\serializable.py:74 in __init__ │
│ │
│ 71 │ _lc_kwargs = PrivateAttr(default_factory=dict) │
│ 72 │ │
│ 73 │ def __init__(self, **kwargs: Any) -> None: │
│ ❱ 74 │ │ super().__init__(**kwargs) │
│ 75 │ │ self._lc_kwargs = kwargs │
│ 76 │ │
│ 77 │ def to_json(self) -> Union[SerializedConstructor, SerializedNotImplemented]: │
│ │
│ ╭─────────────────────────────────── locals ────────────────────────────────────╮ │
│ │ class = <class 'langchain.load.serializable.Serializable'> │ │
│ │ kwargs = { │ │
│ │ │ 'callbacks': None, │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'config': { │ │
│ │ │ │ 'context_length': 1024, │ │
│ │ │ │ 'local_files_only': False │ │
│ │ │ } │ │
│ │ } │ │
│ │ self = CTransformers() │ │
│ ╰───────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ G:\Chat\chatdocs-main\pydantic\main.py:339 in pydantic.main.BaseModel.__init__ │
│ │
│ [Errno 2] No such file or directory: 'G:\Chat\chatdocs-main\pydantic\main.py' │
│ │
│ G:\Chat\chatdocs-main\pydantic\main.py:1102 in pydantic.main.validate_model │
│ │
│ [Errno 2] No such file or directory: 'G:\Chat\chatdocs-main\pydantic\main.py' │
│ │
│ E:\chatdocs-main\lib\site-packages\langchain\llms\ctransformers.py:73 in validate_environment │
│ │
│ 70 │ │ │ ) │
│ 71 │ │ │
│ 72 │ │ config = values["config"] or {} │
│ ❱ 73 │ │ values["client"] = AutoModelForCausalLM.from_pretrained( │
│ 74 │ │ │ values["model"], │
│ 75 │ │ │ model_type=values["model_type"], │
│ 76 │ │ │ model_file=values["model_file"], │
│ │
│ ╭──────────────────────────────────────── locals ─────────────────────────────────────────╮ │
│ │ AutoModelForCausalLM = <class 'ctransformers.hub.AutoModelForCausalLM'> │ │
│ │ cls = <class 'langchain.llms.ctransformers.CTransformers'> │ │
│ │ config = {'context_length': 1024, 'local_files_only': False} │ │
│ │ values = { │ │
│ │ │ 'cache': None, │ │
│ │ │ 'verbose': False, │ │
│ │ │ 'callbacks': None, │ │
│ │ │ 'callback_manager': None, │ │
│ │ │ 'tags': None, │ │
│ │ │ 'metadata': None, │ │
│ │ │ 'client': None, │ │
│ │ │ 'model': 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML', │ │
│ │ │ 'model_type': 'llama', │ │
│ │ │ 'model_file': 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin', │ │
│ │ │ ... +2 │ │
│ │ } │ │
│ ╰─────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\ctransformers\hub.py:157 in from_pretrained │
│ │
│ 154 │ │ │ │ local_files_only=local_files_only, │
│ 155 │ │ │ ) │
│ 156 │ │ │
│ ❱ 157 │ │ return LLM( │
│ 158 │ │ │ model_path=model_path, │
│ 159 │ │ │ model_type=model_type, │
│ 160 │ │ │ config=config.config, │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ cls = <class 'ctransformers.hub.AutoModelForCausalLM'> │ │
│ │ config = AutoConfig( │ │
│ │ │ config=Config( │ │
│ │ │ │ top_k=40, │ │
│ │ │ │ top_p=0.95, │ │
│ │ │ │ temperature=0.8, │ │
│ │ │ │ repetition_penalty=1.1, │ │
│ │ │ │ last_n_tokens=64, │ │
│ │ │ │ seed=-1, │ │
│ │ │ │ batch_size=8, │ │
│ │ │ │ threads=-1, │ │
│ │ │ │ max_new_tokens=256, │ │
│ │ │ │ stop=None, │ │
│ │ │ │ stream=False, │ │
│ │ │ │ reset=True, │ │
│ │ │ │ context_length=1024, │ │
│ │ │ │ gpu_layers=0 │ │
│ │ │ ), │ │
│ │ │ model_type=None │ │
│ │ ) │ │
│ │ kwargs = {'context_length': 1024} │ │
│ │ lib = None │ │
│ │ local_files_only = False │ │
│ │ model_file = 'Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin' │ │
│ │ model_path = 'C:\Users\御丶奕.cache\huggingface\hub\models--TheBloke--Wiz… │ │
│ │ model_path_or_repo_id = 'TheBloke/Wizard-Vicuna-7B-Uncensored-GGML' │ │
│ │ model_type = 'llama' │ │
│ │ path_type = 'repo' │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ E:\chatdocs-main\lib\site-packages\ctransformers\llm.py:214 in __init__ │
│ │
│ 211 │ │ │ config.gpu_layers, │
│ 212 │ │ ) │
│ 213 │ │ if self._llm is None: │
│ ❱ 214 │ │ │ raise RuntimeError( │
│ 215 │ │ │ │ f"Failed to create LLM '{model_type}' from '{model_path}'." │
│ 216 │ │ │ ) │
│ 217 │
│ │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ config = Config( │ │
│ │ │ top_k=40, │ │
│ │ │ top_p=0.95, │ │
│ │ │ temperature=0.8, │ │
│ │ │ repetition_penalty=1.1, │ │
│ │ │ last_n_tokens=64, │ │
│ │ │ seed=-1, │ │
│ │ │ batch_size=8, │ │
│ │ │ threads=-1, │ │
│ │ │ max_new_tokens=256, │ │
│ │ │ stop=None, │ │
│ │ │ stream=False, │ │
│ │ │ reset=True, │ │
│ │ │ context_length=1024, │ │
│ │ │ gpu_layers=0 │ │
│ │ ) │ │
│ │ lib = None │ │
│ │ model_path = 'C:\Users\御丶奕.cache\huggingface\hub\models--TheBloke--Wizard-Vicuna-… │ │
│ │ model_type = 'llama' │ │
│ │ self = <ctransformers.llm.LLM object at 0x000001F4ED4A19A0> │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Failed to create LLM 'llama' from
'C:\Users\御丶奕.cache\huggingface\hub\models--TheBloke--Wizard-Vicuna-7B-Uncensored-GGML\blobs\c31a4edd96527dcd808bcf9b99e3894065ac950747dac84ecd415a2387454e7c'.

Illegal instruction

Hello. I installed chatdocs as per the docs.

After installing, I ran chatdocs download to get the model and then added the docs with
chatdocs add /root/docs
Then, when I run chatdocs ui,
it reports
Illegal instruction

root@9bb4ee29f890:# chatdocs add /root/docs
Creating new vectorstore
Loading documents from /root/docs
Loading new documents: 100%|██████████████████████| 1/1 [00:00<00:00, 41.39it/s]
Loaded 1 new documents from /root/docs
Creating embeddings. May take a few minutes...
load INSTRUCTOR_Transformer
max_seq_length 512
root@9bb4ee29f890:# chatdocs ui

load INSTRUCTOR_Transformer

max_seq_length 512
Illegal instruction

Not using chatdocs.yml

chatdocs download --config C:\Users\Administrator\Documents\chatdocs\chatdocs.yml

Tried this variation

chatdocs download --config chatdocs.yml

Still does not use it.

How do I pass this to the chatdocs ui command to ensure it uses the GPU and the model specified there? Is this a recent change?

How do I install?

I think the instructions are too vague...

I put the commands in, but it says chatdocs isn't an available command.

Sorry I am a noob.
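For reference, a minimal sketch of the flow pieced together from commands that appear in other issues on this page (assuming pip and Python's scripts directory are on your PATH, which is the usual reason a command shows up as "not available"):

pip install chatdocs
chatdocs download                  # fetch the default models
chatdocs add /path/to/documents    # index your documents into the db directory
chatdocs ui                        # start the web UI (localhost:5000 per the default config shown above)

If chatdocs is still not recognized after installation, the Python scripts directory is likely missing from PATH.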

Doesn't fully use CPU?

I have noticed it never uses more than 400% in glances (the equivalent of 4 cores at 100%); that's only 25% of what my CPU has to offer. Is that normal, or do I have something configured wrong?

$ cat chatdocs.yml
llm: ctransformers

ctransformers:
  model: /mnt/dev/ai/oobabooga_linux/text-generation-webui/models/Wizard-Vicuna-7B-Uncensored/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama

download: false

I tried other (ggml) models, but they behaved pretty much the same.

I remember from trying oobabooga's text-generation webui that "Transformers" had similarly poor performance and I had to switch to "llama.cpp" to get better CPU utilization.

Sadly, ROCm (5.5?) is still not available for my GPU on Manjaro, so the CPU is currently my only option :(.
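The ctransformers Config shown in the tracebacks elsewhere on this page has a threads field (default -1), so a hedged sketch of pinning it explicitly in chatdocs.yml would be:

ctransformers:
  model: /mnt/dev/ai/oobabooga_linux/text-generation-webui/models/Wizard-Vicuna-7B-Uncensored/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
  config:
    threads: 8   # assumption: set to the number of physical cores you want used

download: false

Whether -1 resolves to only 4 threads by default is not confirmed here; this is just the obvious knob to try.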

Cannot delete DB

There are duplicate entries in my DB which I would like to delete.

Where is the DB stored on Windows 10? The documentation says that "The processed documents will be stored in db directory by default" and "Note: When you change the embeddings model, delete the db directory and add documents again."

But where is the DB directory stored? I have tried endless googling and searching for every directory in my computer named "db" and I still cannot find it.

Uninstalling chatdocs / chromadb and reinstalling does nothing; the duplicate entries in chatdocs remain.
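For what it's worth, the config dumps in the tracebacks on this page show chroma.persist_directory set to the relative path db, so the folder is created inside whichever directory chatdocs add was run from, which is why a global search for "db" is hard to narrow down. A hedged sketch of pinning it to a fixed location in chatdocs.yml (assuming an absolute path is accepted here) would be:

chroma:
  persist_directory: C:\chatdocs-db   # assumption: absolute paths work; the default is the relative path db

Deleting that directory (rather than reinstalling packages) is what the quoted documentation suggests for clearing entries.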

Do the Falcon models work with chatdocs?

I know the Falcon models are a little different and that they might not work with chatdocs. Has anyone tried and gotten them to run? Which model did you use: GGML, GPTQ, or maybe the standard models?

Running chatdocs ui on headless server

I'm attempting to run this on a headless server (Ubuntu 22.04) where I have considerably more resources, and I can't get access to it. I've attempted to modify the IP/port numbers in chatdocs.yml and tried modifying ui.py; nothing works. It keeps returning the

Local firewalls are disabled and the device is reachable via ping.

Any suggestions?

Invisible output

I asked two things in Italian; the first time it worked, but the second time I got this (no error log):
Input (Italian): Nella fase di stima del valore di mercato, l'income approach impiega due calcoli differenti, quali, puoi descriverli? (roughly: "In the market value estimation phase, the income approach uses two different calculations; which ones are they, can you describe them?")
Output:
(about 30 blank lines)
helpful il mettere e una o Comment on micro-

(Blank lines, then at the end an incoherent partial sentence, and then the sources.) It's as if the output is there in those lines but invisible, not printed.

Console: [2023-06-16 15:14:19 +0200] [66608] [INFO] 127.0.0.1:53010 GET /favicon.ico 1.1 404 207 997
I'm using the GPU (RTX 3060) on Windows 11:

ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
  config:
    context_length: 1024
    gpu_layers: 50

huggingface:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-HF
  pipeline_kwargs:
    max_new_tokens: 256
  device: 0

gptq:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ
  model_file: Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors
  pipeline_kwargs:
    max_new_tokens: 256
  device: 0

how to ... ?

  1. deploy
  2. update db
  3. hide document/file name source
  4. install to specific folder

Feature: --listen

Is there a way to add a --listen flag to access the UI from other computers on the network? Would it be as simple as changing line 63 in util.py
from app.run(host="localhost", port=config["port"], use_reloader=False)
to app.run(host="0.0.0.0", port=config["port"], use_reloader=False)?

GPTQ model seems slow

I've been using this chatdocs project with a GGML model, which has worked really well, if a bit slowly. I have read a lot online about GPTQ models delivering significantly better speeds, but when I trialled this I'm only getting roughly a 2x speed-up.

When I run the chatdocs ui command, it prints the message "CUDA extension not installed", but I have installed just about every CUDA-related package (several of which looked to be CUDA extensions) I can find online, and the message is still present. Is this likely to be slowing the model down? If so, any idea exactly which package this message wants installed?

I'm also getting the message "skip module injection for FusedLlamaMLPForQuantizedModel not support integrate without triton yet", but again, I have the triton package installed in my env. Any ideas on a likely cause, and whether this issue is likely to affect the speed?

Just to round off, I am very pleased with this project in general. It looks good, works nicely, and was relatively easy to install (I just had to find a few other packages online, such as cuDNN).

Crash on low end PC

It is crashing after adding a lot of PDFs on a low-end PC.

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 258032 bytes for Chunk::new
# Possible reasons:
#   The system is out of physical RAM or swap space
#   The process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
#   JVM is running with Unscaled Compressed Oops mode in which the Java heap is
#     placed in the first 4GB address space. The Java Heap base address is the
#     maximum limit for the native heap growth. Please use -XX:HeapBaseMinAddress
#     to set the Java Heap base and to place the Java Heap above 4GB virtual address.
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (arena.cpp:189), pid=13844, tid=23816
#
# JRE version: OpenJDK Runtime Environment JBR-17.0.6+10-829.9-jcef (17.0.6+10) (build 17.0.6+10-b829.9)
# Java VM: OpenJDK 64-Bit Server VM JBR-17.0.6+10-829.9-jcef (17.0.6+10-b829.9, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# No core dump will be written. Minidumps are not enabled by default on client versions of Windows
#
---------------  S U M M A R Y ------------

Command Line: exit -XX:ErrorFile=C:\Users\support2\\java_error_in_pycharm64_%p.log -XX:HeapDumpPath=C:\Users\support2\\java_error_in_pycharm64.hprof -Xms128m -Xmx750m -XX:ReservedCodeCacheSize=512m -XX:+UseG1GC -XX:SoftRefLRUPolicyMSPerMB=50 -XX:CICompilerCount=2 -XX:+HeapDumpOnOutOfMemoryError -XX:-OmitStackTraceInFastThrow -XX:+IgnoreUnrecognizedVMOptions -XX:CompileCommand=exclude,com/intellij/openapi/vfs/impl/FilePartNodeRoot,trieDescend -ea -Dsun.io.useCanonCaches=false -Dsun.java2d.metal=true -Djbr.catch.SIGABRT=true -Djdk.http.auth.tunneling.disabledSchemes="" -Djdk.attach.allowAttachSelf=true -Djdk.module.illegalAccess.silent=true -Dkotlinx.coroutines.debug=off -Xmx987m -Djb.vmOptionsFile=C:\Users\support2\AppData\Roaming\\JetBrains\\PyCharmCE2023.1\pycharm64.exe.vmoptions -Djava.system.class.loader=com.intellij.util.lang.PathClassLoader -Didea.vendor.name=JetBrains -Didea.paths.selector=PyCharmCE2023.1 -Djna.boot.library.path=C:\Program Files\JetBrains\PyCharm Community Edition 2023.1.2/lib/jna/amd64 -Dpty4j.preferred.native.folder=C:\Program Files\JetBrains\PyCharm Community Edition 2023.1.2/lib/pty4j -Djna.nosys=true -Djna.noclasspath=true -Didea.platform.prefix=PyCharmCore -Dsplash=true --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.ref=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.nio.charset=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/jdk.internal.vm=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.fs=ALL-UNNAMED --add-opens=java.base/sun.security.ssl=ALL-UNNAMED --add-opens=java.base/sun.security.util=ALL-UNNAMED --add-opens=java.base/sun.net.dns=ALL-UNNAMED --add-opens=java.desktop/java.awt=ALL-UNNAMED --add-opens=java.desktop/java.awt.dnd.peer=ALL-UNNAMED --add-opens=java.desktop/java.awt.event=ALL-UNNAMED --add-opens=java.desktop/java.awt.image=ALL-UNNAMED --add-opens=java.desktop/java.awt.peer=ALL-UNNAMED --add-opens=java.desktop/java.awt.font=ALL-UNNAMED --add-opens=java.desktop/javax.swing=ALL-UNNAMED --add-opens=java.desktop/javax.swing.plaf.basic=ALL-UNNAMED --add-opens=java.desktop/javax.swing.text.html=ALL-UNNAMED --add-opens=java.desktop/sun.awt.datatransfer=ALL-UNNAMED --add-opens=java.desktop/sun.awt.image=ALL-UNNAMED --add-opens=java.desktop/sun.awt.windows=ALL-UNNAMED --add-opens=java.desktop/sun.awt=ALL-UNNAMED --add-opens=java.desktop/sun.font=ALL-UNNAMED --add-opens=java.desktop/sun.java2d=ALL-UNNAMED --add-opens=java.desktop/sun.swing=ALL-UNNAMED --add-opens=jdk.attach/sun.tools.attach=ALL-UNNAMED --add-opens=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-opens=jdk.internal.jvmstat/sun.jvmstat.monitor=ALL-UNNAMED --add-opens=jdk.jdi/com.sun.tools.jdi=ALL-UNNAMED -Dide.native.launcher=true -Djcef.sandbox.ptr=0000020E90174FA0 

Host: 12th Gen Intel(R) Core(TM) i7-12700, 20 cores, 7G,  Windows 10 , 64 bit Build 19041 (10.0.19041.2913)
Time: Wed Jun  7 12:36:17 2023 Arabian Standard Time elapsed time: 32.528581 seconds (0d 0h 0m 32s)

---------------  T H R E A D  ---------------

Current thread (0x0000020ebc83f590):  JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=23816, stack(0x000000f0b5800000,0x000000f0b5900000)]


Current CompileTask:
C2:  32528 25800   !   4       com.intellij.ide.IdeEventQueue::dispatchByCustomDispatchers (110 bytes)

Stack: [0x000000f0b5800000,0x000000f0b5900000]
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [jvm.dll+0x683c5a]
V  [jvm.dll+0x842764]
V  [jvm.dll+0x843f5e]
V  [jvm.dll+0x8445c3]
V  [jvm.dll+0x249b75]
V  [jvm.dll+0xabcac]
V  [jvm.dll+0xac27c]
V  [jvm.dll+0x368857]
V  [jvm.dll+0x1bd0b8]
V  [jvm.dll+0x21c359]
V  [jvm.dll+0x21b621]
V  [jvm.dll+0x1a4fdd]
V  [jvm.dll+0x22b098]
V  [jvm.dll+0x229159]
V  [jvm.dll+0x7f81ac]
V  [jvm.dll+0x7f270a]
V  [jvm.dll+0x682a95]
C  [ucrtbase.dll+0x21bb2]
C  [KERNEL32.DLL+0x17614]
C  [ntdll.dll+0x526a1]

---------------  S Y S T E M  ---------------

OS:
 Windows 10 , 64 bit Build 19041 (10.0.19041.2913)
OS uptime: 7 days 4:23 hours
Hyper-V role detected

CPU: total 20 (initial active 20) (10 cores per cpu, 2 threads per core) family 6 model 151 stepping 2 microcode 0x1f, cx8, cmov, fxsr, ht, mmx, 3dnowpref, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, lzcnt, tsc, tscinvbit, avx, avx2, aes, erms, clmul, bmi1, bmi2, adx, sha, fma, vzeroupper, clflush, clflushopt, clwb, hv

Memory: 4k page, system-wide physical 7897M (408M free)
TotalPageFile size 32473M (AvailPageFile size 0M)
current process WorkingSet (physical memory assigned to process): 1128M, peak: 1133M
current process commit charge ("private bytes"): 1004M, peak: 1008M

vm_info: OpenJDK 64-Bit Server VM (17.0.6+10-b829.9) for windows-amd64 JRE (17.0.6+10-b829.9), built on 2023-04-09 by "builduser" with MS VC++ 16.10 / 16.11 (VS2019)

END.


couldn't ingest docs & typer incompatible

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
openapi-python-client 0.13.4 requires typer<0.8.0,>=0.6, but you have typer 0.9.0 which is incompatible.
spacy 3.5.3 requires typer<0.8.0,>=0.3.0, but you have typer 0.9.0 which is incompatible.

1. I had this error when I was trying to pip install chatdocs.
Does it affect my app overall?

2. And when I use it, it behaves like a plain ChatGPT; I couldn't ingest any PDF file using chatdocs add /path/to/documents (adding a screenshot).
Screenshot 2023-06-09 124423

Something is wrong with 0.2.5 - chatdocs download command

Hi @marella

Something broke with 0.2.5. I have tried many different ways to get it to work.

When I run 0.2.4 everything works flawlessly.

With 0.2.5, something goes wrong any time I do chatdocs download; it does not seem to respect changes to chatdocs.yml.

When I go into the site-packages folder and change to the correct model file path, it gives me an error that ctransformers.dll was not found.

Is anyone able to fix this?
