
Comments (3)

puellanivis commented on August 18, 2024

You haven’t provided any error messages, examples of the text containing a ZWNBSP, or where in the string it appears.

However, if this is happening at the start of your string, then it is probably the result of a byte-order mark (BOM): the text starts with 0xfeff, and since 0xfffe is defined as an invalid Unicode codepoint, the order of the first two bytes tells you whether you’re dealing with UTF-16LE or UTF-16BE. This is especially likely if the data is being pulled from lines of a Windows text file, like a .BAT file, as Windows is known to add these BOMs to files saved as Unicode.
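As a rough illustration of the byte-order check described above, here is a minimal Go sketch. The function name `detectUTF16Order` is made up for this example and is not part of any library; it only looks at the first two bytes of a buffer.

```go
package main

import "fmt"

// detectUTF16Order is a hypothetical helper: it inspects the first two
// bytes of b for a byte-order mark (U+FEFF) and reports the byte order.
func detectUTF16Order(b []byte) string {
	if len(b) < 2 {
		return "unknown"
	}
	switch {
	case b[0] == 0xFE && b[1] == 0xFF:
		return "UTF-16BE" // U+FEFF stored high byte first
	case b[0] == 0xFF && b[1] == 0xFE:
		return "UTF-16LE" // U+FEFF stored low byte first
	}
	return "unknown"
}

func main() {
	fmt.Println(detectUTF16Order([]byte{0xFF, 0xFE, 'h', 0x00}))
	fmt.Println(detectUTF16Order([]byte{0xFE, 0xFF, 0x00, 'h'}))
}
```

Note that in UTF-8 the same codepoint is encoded as the three bytes EF BB BF, so that is the sequence to look for in the bytes of a UTF-8 string.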

from protobuf.

agruetz commented on August 18, 2024

Sorry for the missing information. The data comes from a MySQL select query. Essentially, there is a gRPC API server that node.GetWork calls, and the server returns this data.

I have confirmed that it is not being added inside the server. It is being added somewhere in the encoding and transfer across the wire, or in the subsequent decode on the client side.

Yes, I have been able to work around it by specifically stripping the 0xfeff character from the string, but it seems odd that it is there in the first place.
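For reference, a workaround of this kind might look like the sketch below. This is not the actual client code from this thread; `stripBOM` and the sample value are hypothetical, and the sketch assumes the string is UTF-8, as Go strings decoded from protobuf string fields are.

```go
package main

import (
	"fmt"
	"strings"
)

// bom is U+FEFF, which occupies three bytes (EF BB BF) in UTF-8.
const bom = "\ufeff"

// stripBOM is a hypothetical helper that removes a single leading U+FEFF.
// It leaves any U+FEFF elsewhere in the string untouched.
func stripBOM(s string) string {
	return strings.TrimPrefix(s, bom)
}

func main() {
	in := bom + "work-item" // made-up sample value
	out := stripBOM(in)
	fmt.Println(len(in), len(out)) // prints "12 9"
}
```

Trimming only the prefix is a deliberate choice here: a U+FEFF in the middle of a string is a zero-width no-break space rather than a byte-order mark, so removing it everywhere could change the data.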

I also agree that this is likely the result of Byte-Order Marking because it is at the very start of the string.

What I find most odd is that it only happens sometimes; it is not every string. It almost feels as if it is being used as padding for the encode/decode for the wire transfer but is not properly being stripped off in all cases.

I am happy to provide more detail, code, or debug output; I just was not sure what would be helpful, or whether this was some known, expected behavior I was not aware of.


puellanivis commented on August 18, 2024

Protobuf doesn’t typically use any padding, let alone 0xfeff specifically.

Have you tried looking at the raw MySQL query values directly? Maybe someone is copy-pasting in from a Windows text file somewhere? It can be a notoriously difficult character to notice because it’s zero-width, and thus might not seem to show up normally.
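Since the character is invisible when printed normally, one way to spot it is to print the value escaped or as hex. The following is only a sketch with a made-up value; `findBOM` is a hypothetical helper, and in practice the value would be whatever comes back from the query or from the decoded message.

```go
package main

import (
	"fmt"
	"strings"
)

// findBOM is a hypothetical helper that returns the byte offset of the
// first U+FEFF in s, or -1 if there is none.
func findBOM(s string) int {
	return strings.Index(s, "\ufeff")
}

func main() {
	s := "\ufeffjob-42" // made-up sample value

	fmt.Println(findBOM(s))

	// %q escapes non-printable runes, so U+FEFF shows up as \ufeff.
	fmt.Printf("%q\n", s)

	// "% x" prints the raw bytes in hex, separated by spaces; a leading
	// BOM in UTF-8 appears as "ef bb bf".
	fmt.Printf("% x\n", s)
}
```

Running the same check on the server side just before the message is sent, and on the client side just after it is decoded, should narrow down where the character first appears.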

Could you share a short copy of an encoded Work message that triggers the issue? I maybe wouldn’t jump straight to copy-pasting an excerpt of the MySQL data for that message here, but it probably wouldn’t hurt either.

