
deepseek-coder-v2's People

Contributors

guoday · luofuli


deepseek-coder-v2's Issues

About base model choice

May I ask how the 4.2T-token checkpoint was chosen as the base model for Coder? Why not the final checkpoint?

deepseek-coder-v2 starts producing wildly garbled output after a certain amount of use

It works fine at first, but after a certain number of exchanges the model's output becomes incoherent.

I am using ollama, and I am not sure whether this is the chat-template colon issue I have seen mentioned elsewhere.

That issue is reportedly fixed, but I don't know how to reinstall; the problem persists even after running ollama pull.

Below is an example of the nonsense output (translated from Chinese; after the first few steps it degenerates into incoherent, repetitive filler):

There is a problem with the code you provided. Please follow these steps:

1. First, install the necessary environment and libraries:
```bash
pip install opencv-python pytesseract pillow
```
2. Make sure the Tesseract engine is installed and set the TESSDATA_PREFIX environment variable as required by the official documentation.
3. Run the OCR service. Please follow the relevant prompts in the application's UI, or consult us via the specified input method (example: http://t.csdnimg.cn/120vH)
4) Your question does not describe the specific requirements and environment setup in detail. Please provide more information as follows:

[From here the output collapses entirely: it loops on fragments such as "please follow the steps below", "please describe your problem in detail", and "thank you", jumps between unrelated topics (OCR, installing Django, Windows XP/7 support lifecycles, Markdown, and a "GTA5 Chinese Tutorial" heading), and emits unclosed bash and json code fences, with no recoverable meaning.]

OutOfMemoryError: CUDA out of memory on RunPod

Description:
While running DeepSeek Coder v2 on RunPod, I encountered a CUDA out of memory error. The error message indicated that the system attempted to allocate 20.00 MiB of memory, but GPU 0 had only 2.25 MiB free, despite having a total capacity of 23.67 GiB. The process in question had 23.66 GiB of memory in use, with 19.78 GiB allocated by PyTorch and 3.69 GiB reserved but unallocated by PyTorch.

Error Message:

OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 23.67 GiB of which 2.25 MiB is free. Process 4063579 has 23.66 GiB memory in use. Of the allocated memory 19.78 GiB is allocated by PyTorch, and 3.69 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Steps to Reproduce:

  1. Deploy DeepSeek Coder v2 on RunPod.
  2. Begin the model training or inference process.
  3. Monitor GPU memory allocation.

Expected Behavior:
The model should run without hitting memory limits, and ideally, it should handle memory allocation more efficiently or provide a mechanism to limit memory usage.

Environment:

  • RunPod Platform
  • DeepSeek Coder Version: v2
  • CUDA Version: 11.8.0
  • PyTorch Version: 2.1.0
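The error message itself points at allocator fragmentation and suggests tuning max_split_size_mb. A minimal sketch of that mitigation (the 128 MiB value is an illustrative assumption, not a tuned recommendation; the variable must be set before torch initializes CUDA):

```python
import os

# Limit the size of allocator blocks that may be split, to reduce
# fragmentation of reserved-but-unallocated memory. Set this before
# importing torch / initializing CUDA; 128 is illustrative only.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

If fragmentation is not the cause, the usual alternatives are reducing batch size or sequence length, or serving the model across more than one GPU.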

Model always responds in Chinese, ignores system prompts stating to only reply in English

DeepSeek Coder V2 seems to respond only in Chinese; this occurs even when the system prompt explicitly states to respond only in English:

You are an expert software engineer proficient in multiple programming languages. Your task is to generate, complete, and refactor code snippets based on the given instructions. Provide clean, efficient, and well-commented code.

IMPORTANT: Always respond in English.

Still results in Chinese rather than English:

[screenshot: the response is still in Chinese]

DeepSeek-Coder-V2-Lite model GPU/RAM requirement

Hi, thank you for the amazing work! In the README you say "DeepSeek-Coder-V2 in BF16 format for inference, 80GB*8 GPUs are required".
How much GPU memory/RAM is needed for inference with the DeepSeek-Coder-V2-Lite model?
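As a rough back-of-envelope sketch (not an official figure, and the ~16B total parameter count for the Lite model is an assumption): BF16 stores two bytes per parameter, so the weights alone need about 30 GiB, before KV cache and activations are counted.

```python
def bf16_weight_gib(num_params: float) -> float:
    """Approximate GiB needed just to hold BF16 weights (2 bytes/param)."""
    return num_params * 2 / 2**30

# Assumption: DeepSeek-Coder-V2-Lite has roughly 16e9 total parameters.
print(round(bf16_weight_gib(16e9), 1))  # ~29.8 GiB for weights alone
```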

When will the vllm PR be merged to the main branch?

Thank you for your impressive work on this project. I'm eager to try this model, but I've noticed that the vllm deployment pull request has conflicts with the main branch, and building vllm from scratch is challenging for my development environment.

Is there an active effort to resolve these conflicts and merge the PR into the main branch? If possible, could you provide an estimated timeline for this merge? I greatly appreciate your work and look forward to using this implementation. Thank you for your time.

Cannot finetune deepseek-coder-v2-lite via modeling_deepseek.py

There may be a bug in the dtype handling: the model cannot be finetuned via DeepSpeed (BF16 mixed precision).

File "/deepseek_v2/modeling_deepseek.py", line 1252, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/deepseek_v2/modeling_deepseek.py", line 953, in forward
q = self.q_proj(hidden_states)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16
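The traceback shows a float32 tensor meeting a BF16 weight in q_proj. A common workaround, sketched below, is to cast the input to the weight's dtype before the matmul (the DtypeSafeLinear class name is hypothetical, and whether this is the right fix for modeling_deepseek.py itself is an assumption):

```python
import torch
import torch.nn as nn

class DtypeSafeLinear(nn.Linear):
    """Linear layer that casts its input to the weight dtype before F.linear."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return super().forward(x.to(self.weight.dtype))

layer = DtypeSafeLinear(4, 4).to(torch.bfloat16)
out = layer(torch.randn(2, 4, dtype=torch.float32))  # plain nn.Linear raises here
print(out.dtype)  # torch.bfloat16
```

Equivalently, casting hidden_states to the projection weight's dtype at the call site in the model code avoids the mismatch without subclassing.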

Reward model in the reinforcement learning process

Hello DeepSeek Team, thanks for your great work!

I fine-tuned your previous DeepSeek-Coder 33B model and got a model that performs well on the HumanEval benchmark: https://github.com/bin123apple/AutoCoder. However, when testing on the HumanEval+ benchmark, the new model's performance is not as strong.

I suspect this may be because, for all the data entries with execution feedback in my dataset, I covered only a small number of test cases. I noticed that in your paper you mention that your reward model is trained using data provided by the compiler.


Is it possible for you to disclose whether the data used to train the reward model included test cases, or if it only required the code to pass the compiler? If test cases were included, could you please provide how many test cases each data entry typically contains?

Thanks again for your great work!

Any plans to release the 1B model as well?

Amazing work! I noticed that in Section 2 you provide a series of ablation studies for the 1B model. I am curious whether there are any plans to update the deepseek-coder-1.3b model series as well?

mismatch between example code and model files

I found that this repo and the Hugging Face model card both contain the line:

# tokenizer.eos_token_id is the id of <|EOT|> token

But in tokenizer_config.json inside the model repo, the eos_token is set to <|end▁of▁sentence|>:

"eos_token": {
    "__type": "AddedToken",
    "content": "<|end▁of▁sentence|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }

which is correct?
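When checking this yourself, note that the eos_token entry may be either a plain string or an AddedToken object. A small sketch (the inline config string below is a trimmed stand-in for the real tokenizer_config.json) showing how to read the effective EOS token from either form:

```python
import json

# Trimmed stand-in for the relevant fragment of tokenizer_config.json.
cfg = json.loads(
    '{"eos_token": {"__type": "AddedToken", "content": "<|end\\u2581of\\u2581sentence|>"}}'
)

entry = cfg["eos_token"]
# AddedToken entries are dicts with a "content" field; plain tokens are strings.
eos = entry["content"] if isinstance(entry, dict) else entry
print(eos)  # <|end▁of▁sentence|>
```

For a given checkpoint, the loaded tokenizer object itself (tokenizer.eos_token / tokenizer.eos_token_id) is the authoritative answer, regardless of what a code comment claims.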

Any plans to release a finetune example?

Great work and congratulations! Are there any plans to release example finetuning code for DeepSeek-Coder-V2?
I noticed you mentioned finetuning this model on 8*A100 GPUs with some tricks; could you be more specific? Thanks!
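For reference, the tricks for fitting finetuning of a large model onto 8×A100 typically involve DeepSpeed ZeRO stage 3 sharding with CPU offload. A hedged config sketch (the values are illustrative assumptions, not the authors' actual setup):

```json
{
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu" },
    "offload_param": { "device": "cpu" }
  },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```

The "auto" values follow the Hugging Face Trainer + DeepSpeed integration convention, where the trainer fills them in from its own arguments.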
