
Comments (5)

urnathan commented on July 3, 2024

Ah, right, I'd forgotten/ignored non-ASCII collating (for the does-this-need-quoting pieces). I was also implicitly assuming that the two ends use the same character set (for the message comparison pieces), which I guess wouldn't be true when cross-compiling. I'm unfamiliar with how EBCDIC and UTF-8 interact. I had thought that EBCDIC also happened to use 7 bits, so the UTF-8 bit-8 scheme would also work there without changing the underlying character encoding. IIUC you're saying u8'a' is 0x61, regardless. Is that right? So UTF-8 is both (a) a specific mapping between characters and integers and (b) an encoding of those integers into 8-bit octets?

ETA: also, I need to figure out why GitHub doesn't email me about issues ...

from libcody.

commented on July 3, 2024

IIUC you're saying u8'a' is 0x61, regardless. Is that right? So UTF-8 is both (a) a specific mapping between characters and integers and (b) an encoding of those integers into 8-bit octets?

There is a very deep rabbit hole once we start talking about "source character set" and "source character encoding", which is a very hot topic in SG16 right now. Since I assume that the source code of your library is in ASCII or UTF-8, we can sidestep it and simplify many things: let's assume that the source code is written in UTF-8 and consumed by the compiler as UTF-8.

Then two things are certain:

auto a = 'a';
std::cout << static_cast<int>(a) << '\n'; // implementation-defined

auto b = u8'a';
std::cout << static_cast<int>(b) << '\n'; // always 97 (0x61)

At least as of right now, the understanding is that char literals are converted to the "execution character set" regardless of the character set/encoding the source file is written in. On z/OS the default execution character set is EBCDIC, so in our case the compiler will consume the UTF-8 source file as UTF-8 (in reality this may require passing explicit compiler flags), go through the phases of translation (which includes converting the source code to the "internal compiler encoding used during translation"), see the literal 'a', and convert it from that internal encoding to EBCDIC. The literal u8'a' is treated differently, but its integer value will always be 97 in the end.

The standardese will most likely change wildly in C++23, but the basic principle will stay the same.
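
For the message comparison case discussed above, here is a minimal sketch of what the distinction means in practice. The helper names `is_wire_a` and `is_native_a` are hypothetical and not taken from libcody; the sketch assumes the peer sends UTF-8 (or ASCII) octets.

```cpp
// Hypothetical helpers for illustration only; not part of libcody.

// True if `byte`, an octet read from a UTF-8/ASCII wire protocol, is the
// letter 'a'. The value 0x61 is fixed by the encoding, so this check gives
// the same answer whether the program is built for ASCII or EBCDIC.
constexpr bool is_wire_a(unsigned char byte) {
    return byte == 0x61;
}

// True if `c` is the letter 'a' in the execution character set of this
// build. The numeric value of 'a' is implementation-defined (0x61 with an
// ASCII execution character set, 0x81 with common EBCDIC code pages), so
// this is only suitable for text produced by the same program.
constexpr bool is_native_a(char c) {
    return c == 'a';
}
```

Bytes received from the other end would go through the first kind of check, while strings built locally from ordinary literals go through the second.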


urnathan commented on July 3, 2024

Ah great, thanks for clarifying -- whenever I google this the results say 'you don't have to care about EBCDIC anymore', if they mention it at all!


urnathan commented on July 3, 2024

I've just pushed a patch for this. Excitingly, C++11 doesn't have UTF-8 character literals, only UTF-8 string literals. I've probably flubbed something.


commented on July 3, 2024

You are correct, I forgot about that. Yeah, it's a shame that this requires more tricks with literals to get sane code.
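
One possible shape for such a trick, as a sketch only: the names below are hypothetical, and this is not necessarily what the pushed patch does. It relies on u8"..." string literals being available since C++11 (u8'x' character literals arrived in C++17) and indexes the string literal to get a single code unit.

```cpp
// Hypothetical names for illustration; not taken from the libcody patch.

// Index a UTF-8 string literal to obtain one code unit. The cast keeps the
// result an unsigned char whether the element type is char (C++11..17) or
// char8_t (C++20).
constexpr unsigned char utf8_a = static_cast<unsigned char>(u8"a"[0]);
static_assert(utf8_a == 0x61, "UTF-8 encodes 'a' as 0x61 on every platform");

// The same trick used inside a range check on an incoming octet.
// In UTF-8, 'a'..'z' are contiguous (0x61..0x7A), unlike in EBCDIC.
constexpr bool is_utf8_lower(unsigned char byte) {
    return byte >= static_cast<unsigned char>(u8"a"[0])
        && byte <= static_cast<unsigned char>(u8"z"[0]);
}
```

Writing the numeric constant directly (for example 0x61) is the blunter alternative; the string-literal form keeps the intended character visible in the source.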

