Comments (2)

djsaber commented on June 8, 2024

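Found the cause. During streaming output, the model's response is built by decoding each step's token on its own and appending the result to the response so far, which then becomes the response for the current step. This breaks for characters that span several tokens. For example, "淩" corresponds to the tokens [233, 186, 172]; decoding 233, 186, and 172 separately yields "�" each time, so the concatenated output is "���".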
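The effect can be reproduced without the model. The snippet below is a minimal sketch that works directly on the UTF-8 bytes of "淩" rather than on real tokenizer ids (an assumption made for illustration; it does not call the InternLM tokenizer). It shows that decoding the pieces one at a time produces three replacement characters, while decoding them together recovers the character.

```python
char = "淩"
raw = char.encode("utf-8")  # one character, three UTF-8 bytes

# Decode every byte on its own, the way the streamer decodes every token on
# its own. Each fragment is invalid UTF-8 by itself, so each one turns into
# U+FFFD (the replacement character).
pieces = [bytes([b]).decode("utf-8", errors="replace") for b in raw]
per_step = "".join(pieces)

# Decode all three bytes in one call instead.
whole = raw.decode("utf-8", errors="replace")

print(repr(per_step))
print(repr(whole))
```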

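My fix: if decoding produces garbled text, cache the current step's token and move on to the next step. Once the cache grows past the limit (5) or decodes to readable text, emit the result and clear the cached tokens.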
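The caching idea can be sketched in isolation as follows. This is an illustration under stated assumptions, not the InternLM code: `toy_decode` is a hypothetical stand-in for `tokenizer.decode` that treats each token id as one raw byte, and `BufferedDecoder` and `max_cached` are names invented for this example.

```python
REPLACEMENT = "\ufffd"  # what a decoder emits for an incomplete byte sequence


def toy_decode(token_ids):
    """Hypothetical stand-in for tokenizer.decode: one token id = one byte."""
    return bytes(token_ids).decode("utf-8", errors="replace")


class BufferedDecoder:
    """Hold tokens back until they decode cleanly or the cache limit is hit."""

    def __init__(self, decode, max_cached=5):
        self.decode = decode
        self.max_cached = max_cached
        self.cache = []

    def push(self, token_id):
        """Return newly decoded text, or "" while waiting for more tokens."""
        self.cache.append(token_id)
        text = self.decode(self.cache)
        if REPLACEMENT in text and len(self.cache) <= self.max_cached:
            return ""  # still garbled: keep the tokens and wait
        self.cache = []  # clean text, or limit exceeded: flush
        return text


decoder = BufferedDecoder(toy_decode)
stream = list("a淩b".encode("utf-8"))  # 1 + 3 + 1 byte-level "tokens"
chunks = [decoder.push(t) for t in stream]
result = "".join(chunks)
print(chunks, result)
```

The limit keeps a genuinely undecodable token from stalling the stream: after more than `max_cached` garbled tokens the cache is flushed as-is, replacement characters included.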

Modify the ChatStreamer class inside the stream_chat() method in modeling_internlm.py:

class ChatStreamer(BaseStreamer):
    # response_queue, query, and history come from the enclosing stream_chat() scope.
    def __init__(self, tokenizer) -> None:
        super().__init__()
        self.tokenizer = tokenizer
        self.queue = response_queue
        self.query = query
        self.history = history
        self.response = ""
        self.cache = []  # token ids held back until they decode cleanly
        self.received_inputs = False
        self.queue.put((self.response, history + [(self.query, self.response)]))

    def put(self, value):
        if len(value.shape) > 1 and value.shape[0] > 1:
            raise ValueError("ChatStreamer only supports batch size 1")
        elif len(value.shape) > 1:
            value = value[0]

        if not self.received_inputs:
            # The first received value is input_ids, ignore here
            self.received_inputs = True
            return

        self.cache.extend(value.tolist())
        token = self.tokenizer.decode(self.cache, skip_special_tokens=True)
        # Wait for more tokens while the decoded text still contains U+FFFD
        # and the cache is within the limit of 5 tokens.
        if "\ufffd" in token and len(self.cache) <= 5:
            return
        self.cache = []
        if token.strip() != "<eoa>":
            self.response = self.response + token
            history = self.history + [(self.query, self.response)]
            self.queue.put((self.response, history))
        else:
            self.end()

gaoyang07 commented on June 8, 2024

Brilliant! Would you like to create a PR to fix it as a new contributor?
