milesj / emojibase

🎮 A collection of lightweight, up-to-date, pre-generated, specification compliant, localized emoji JSON datasets, regex patterns, and more.

Home Page: https://emojibase.dev

License: MIT License

JavaScript 1.88% TypeScript 92.58% CSS 0.66% MDX 4.88%

emoji emoji-database emoji-db emojibase emojibase-data emojibase-regex unicode-technical-standard

emojibase's Issues

Including `src/` into the bundled package

The npm package emojibase includes index.js.map but doesn't include src/*.ts. This will make a bundler unable to find the source code according to the map files. I propose that we include src/ inside the npm package or remove *.map from the npm package.

$ yarn run parcel index.html
Server running at http://localhost:1234
⚠️  Could not load source file "../src/appendSkinToneIndex.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/fetchFromCDN.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/constants.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/fetchShortcodes.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/joinShortcodesToEmoji.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/flattenEmojiData.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/joinShortcodes.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/fetchEmojis.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/fetchMetadata.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/fromCodepointToUnicode.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/fromHexcodeToCodepoint.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/fromUnicodeToHexcode.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/generateEmoticonPermutations.ts" in source map of "node_modules/emojibase/umd/index.js".
⚠️  Could not load source file "../src/stripHexcode.ts" in source map of "node_modules/emojibase/umd/index.js".
✨  Built in 35.55s.

Emoji v13

Hi @milesj! It seems Unicode has finalized the list https://unicode.org/emoji/charts-13.0/emoji-released.html ... wink wink :)

Supporting Unicode Emoji 14?

Hi @milesj, as usual we are wondering whether and when emojibase will be supporting Unicode Emoji 14? :) Any timeline? Sorry to bother and many thanks for all the good work! 🙏

Is there an API to get the unicode for a given shortcode?

Alias "tada" to "hooray"

(and maybe "confetti" as well?)

emojibase-data with TypeScript: property 'tags' is missing on CompactEmoji.skins

Hi, thanks for all the effort put into this.

I'm using emojibase-data with typescript and --resolveJsonModule and I'm getting a compile-time error.

Type '{ "annotation": string; "group": number; "hexcode": string; "order": number; "shortcodes": string[]; "unicode": string; }[]' is not assignable to type 'CompactEmoji[]'.

Property 'tags' is missing in type '{ "annotation": string; "group": number; "hexcode": string; "order": number; "shortcodes": string[]; "unicode": string; }' but required in type 'CompactEmoji'.

I had a look, and this is the type definition

export interface CompactEmoji {
  ...
  skins?: CompactEmoji[];
  tags: string[];
  ...
}

And looking at the data in en/compact.json, none of the skins seem to have the tags property.

This doesn't seem to be an issue with the full data.json dataset because it uses a scalar SkinType value to represent variants.

Possible solution

Depending on how you intended skins to be used, we could

duplicate the parent emoji's tags on all of its skins
or make the tags property optional (probably a bad idea?)
or create a new CompactSkinEmoji type (could use TS 3.5's Omit type?)

I'm happy to give this a go. Cheers.
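The third option could look roughly like the sketch below. The field list is a simplified stand-in built from the compiler error above, `CompactSkinEmoji` is a hypothetical name, and the sample values (order, shortcodes, tags) are illustrative rather than copied from the dataset.

```typescript
// Simplified stand-in for the published interface; only the fields named in
// the compiler error above are listed.
interface CompactEmoji {
  annotation: string;
  group: number;
  hexcode: string;
  order: number;
  shortcodes: string[];
  tags: string[];
  unicode: string;
  // Skins use the narrower type, so `tags` is no longer required on them.
  skins?: CompactSkinEmoji[];
}

// Hypothetical type: a skin is a CompactEmoji without `tags` or nested skins.
type CompactSkinEmoji = Omit<CompactEmoji, 'skins' | 'tags'>;

// Illustrative values only; this object type-checks even though the skin
// entry carries no `tags` property.
const thumbsUp: CompactEmoji = {
  annotation: 'thumbs up',
  group: 1,
  hexcode: '1F44D',
  order: 1,
  shortcodes: ['thumbsup'],
  tags: ['+1', 'hand', 'thumb', 'up'],
  unicode: '\u{1F44D}',
  skins: [
    {
      annotation: 'thumbs up: light skin tone',
      group: 1,
      hexcode: '1F44D-1F3FB',
      order: 2,
      shortcodes: ['thumbsup_tone1'],
      unicode: '\u{1F44D}\u{1F3FB}',
    },
  ],
};
```

With this shape, consumers that need tags for a skin would read them from the parent emoji instead of duplicating them in the JSON.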

emoji meta for zh-Hant is showing zh

https://github.com/milesj/emojibase/blob/master/packages/data/zh-hant/meta.raw.json

The content is showing simplified Chinese (zh) instead of Traditional (zh-hant)

probably https://github.com/milesj/emojibase/pull/87/files#diff-d97ea88b1ef5b93bd52dff5ebd3cf282c6d2ab509147fd0358e52df8de0753e7L11 ?

Or does Unicode simply not have the data?

Missing annotations

Some skins of emojis are missing annotations in /en/data.json (from NPM package) and also seem to be missing in packages/data/en/raw.json in this repository:

1F9D1-1F3FB-200D-1F91D-200D-1F9D1-1F3FC
1F9D1-1F3FB-200D-1F91D-200D-1F9D1-1F3FD
1F9D1-1F3FB-200D-1F91D-200D-1F9D1-1F3FE
1F9D1-1F3FB-200D-1F91D-200D-1F9D1-1F3FF
1F9D1-1F3FC-200D-1F91D-200D-1F9D1-1F3FD
1F9D1-1F3FC-200D-1F91D-200D-1F9D1-1F3FE
1F9D1-1F3FC-200D-1F91D-200D-1F9D1-1F3FF
1F9D1-1F3FD-200D-1F91D-200D-1F9D1-1F3FE
1F9D1-1F3FD-200D-1F91D-200D-1F9D1-1F3FF
1F9D1-1F3FE-200D-1F91D-200D-1F9D1-1F3FF
1F469-1F3FB-200D-1F91D-200D-1F469-1F3FC
1F469-1F3FB-200D-1F91D-200D-1F469-1F3FD
1F469-1F3FB-200D-1F91D-200D-1F469-1F3FE
1F469-1F3FB-200D-1F91D-200D-1F469-1F3FF
1F469-1F3FC-200D-1F91D-200D-1F469-1F3FD
1F469-1F3FC-200D-1F91D-200D-1F469-1F3FE
1F469-1F3FC-200D-1F91D-200D-1F469-1F3FF
1F469-1F3FD-200D-1F91D-200D-1F469-1F3FE
1F469-1F3FD-200D-1F91D-200D-1F469-1F3FF
1F469-1F3FE-200D-1F91D-200D-1F469-1F3FF
1F468-1F3FB-200D-1F91D-200D-1F468-1F3FC
1F468-1F3FB-200D-1F91D-200D-1F468-1F3FD
1F468-1F3FB-200D-1F91D-200D-1F468-1F3FE
1F468-1F3FB-200D-1F91D-200D-1F468-1F3FF
1F468-1F3FC-200D-1F91D-200D-1F468-1F3FD
1F468-1F3FC-200D-1F91D-200D-1F468-1F3FE
1F468-1F3FC-200D-1F91D-200D-1F468-1F3FF
1F468-1F3FD-200D-1F91D-200D-1F468-1F3FE
1F468-1F3FD-200D-1F91D-200D-1F468-1F3FF
1F468-1F3FE-200D-1F91D-200D-1F468-1F3FF

Single JSON file containing shortcodes

This is a small feature request, and I totally understand if it's out-of-scope for the project.

As of emojibase-data v6, the data.json file and shortcode file need to be stitched together to have both the shortcodes and emoji data in the same file (#64).

However, for my own use case (nolanlawson/emoji-picker-element#47), having a single file was a nice convenience. By default, the library just fetches the JSON from jsdelivr (this is designed for folks not using a build step). Switching to multiple files is possible, but pushes complexity either to me or my consumers (or both).

To ease the process of upgrading from v5 to v6, perhaps a new JSON file could be added – call it data-with-shortcodes.json – which just includes the default Emojibase shortcodes? This would provide the most direct upgrade path to v6, while also allowing users to stitch together the files themselves if they prefer some other shortcode set.

I can only speak for my own library, but this would definitely help with the upgrade process. 🙂 Thanks for creating emojibase!
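For reference, the stitching that consumers currently have to do themselves can be sketched as follows. This is a minimal illustration, not the library's own helper: `withShortcodes`, `EmojiLike`, and `ShortcodesDataset` are hypothetical names, and the dataset shape (hexcode mapped to a string or an array of strings) is an assumption based on the snippets quoted elsewhere in these issues.

```typescript
// Minimal shape of an emoji record for this sketch.
interface EmojiLike {
  hexcode: string;
  shortcodes?: string[];
}

// Assumed shape of a shortcodes file: hexcode -> one shortcode or several.
type ShortcodesDataset = Record<string, string | string[]>;

// Return copies of the emoji records with their shortcodes attached.
function withShortcodes<T extends EmojiLike>(
  emojis: T[],
  shortcodes: ShortcodesDataset,
): T[] {
  return emojis.map((emoji) => {
    const found = shortcodes[emoji.hexcode];
    if (found === undefined) {
      return emoji; // no shortcode known for this emoji
    }
    return { ...emoji, shortcodes: Array.isArray(found) ? found : [found] };
  });
}

const joined = withShortcodes(
  [{ hexcode: '1F4BF' }, { hexcode: '1F4C0' }, { hexcode: '1F9DE' }],
  { '1F4BF': 'cd', '1F4C0': ['dvd'] },
);
```

A pre-built data-with-shortcodes.json would amount to publishing the output of such a join for the default preset.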

`:P` should also be an emoticon for `😛`

Discord and others do this, so it seems to make sense for consistency

irregular/incorrect data?

I noticed that there are no tags for the "keycap ten" emoji in both data.json and compact.json (hex: 1F51F; shortcode: ten). The other keycaps have at least the tag "keycap".

It also seems to be the only character that has no "tags" property at all.

The `:D` emoticon doesn't match what most other platforms have

In Slack and Discord :D gets converted to 😄 whereas with emojibase it gets converted to 😀.

I think it would be better to be consistent with other platforms

emojis in wrong category

Looking at Telegram, Discord and also https://emojipedia.org/nature/ , it looks like some emojis are not in the right category. For example, the fire emoji should be in animals-nature, but in emojibase it's in nature.

New groups?

The website says for the Data Structure of the dataset:

group (number) - The categorical group the emoji belongs to, ranging from 0 (smileys) to 7 (flags). Undefined for uncategorized emojis.

but it seems that the groups range from 0 to 9. One of these groups seems to have only about 10 emojis? Is that part of the website outdated, or am I doing something wrong?

Recognize for example (y)

(y) = 👍 on many platforms

Make native option the "default"/"full" (or a case of two missing shortcodes)

I've noticed that there are two shortcodes missing from the ru/shortcodes/cldr-native, while being present in the "regular" ru/shortcodes/cldr

  "1F4BF": "cd",
  "1F4C0": "dvd",

Then I've noticed this in the docs

cldr-native: Like cldr but shortcodes are not transliterated to Latin characters.
Furthermore, this preset will only include shortcodes that do not contain shortcodes that already exist in the cldr preset

and realised that the issue of missing shortcodes is likely due to the fact that Unicode's CLDR ru/annotations.json are not translated but written in English as "CD" and "DVD", so this triggers a match with the non-native cldr data file and as a result of the above-cited rule an exclusion from the cldr-native data file
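To make the suspected mechanism concrete, here is a minimal sketch of such an exclusion rule. It is a guess at the behaviour described in the docs, not the generator's actual code: `excludeExisting` is a hypothetical name, and the `1F431` entries are invented purely for contrast.

```typescript
type Shortcodes = Record<string, string>;

// Keep a native shortcode only when it differs from the one already present
// in the transliterated preset (assumed rule, per the docs quoted above).
function excludeExisting(native: Shortcodes, transliterated: Shortcodes): Shortcodes {
  const result: Shortcodes = {};
  for (const [hexcode, shortcode] of Object.entries(native)) {
    if (transliterated[hexcode] !== shortcode) {
      result[hexcode] = shortcode;
    }
  }
  return result;
}

// "cd" and "dvd" are identical in both presets because the CLDR annotation is
// already Latin; the 1F431 pair is an invented example of a real difference.
const cldr: Shortcodes = { '1F4BF': 'cd', '1F4C0': 'dvd', '1F431': 'kot' };
const nativeCandidates: Shortcodes = { '1F4BF': 'cd', '1F4C0': 'dvd', '1F431': 'кот' };

const cldrNative = excludeExisting(nativeCandidates, cldr);
```

Under this rule `cldrNative` keeps only the `1F431` entry, which would explain why "cd" and "dvd" disappear from the native file.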

If that's the case, I'd like to suggest that, at least for non-Latin-based languages, the native data file represents the default/full version, as it simply makes no sense to use the transliterated version: no user (or even a developer) should ever be exposed to it and be forced to change layouts just to enter an emoji

(for Latin-based languages this might be complicated by the fact that popular apps force English-only shortcodes anyway, though that's also an unnecessary complication and it should be easier for the users of your awesome dataset to use native shortcodes as well instead of checking whether there is an extra shortcode in the non-native file)

Exclude "repeated" emojis

Hi,

I was trying to find a common pattern to exclude the first version of an emoji that got extended in later ones. For example, the police officer has a version that is only "officer", followed by two more: "officer male" and "officer female".

Without doing it manually, have you seen anything in the emoji data to filter the old versions?

Thanks

Property "tone" inconsistencies (of emojibase-data 7.0.0 )

Hi @milesj,

many thanks for the new release! Yay!!

Tiny thing: I believe the property "tone" should consistently be either a number or an array, shouldn't it?

A bad case

The character code of the 👁‍🗨 emoji on iOS is "\uD83D\uDC41\u200D\uD83D\uDDE8".
emojibase-regex cannot match it.

the code is different between 👁️‍🗨️ and 👁‍🗨 though they look the same
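The difference becomes visible when both strings are converted to hexcodes. The sketch below assumes the second string is the fully-qualified form with U+FE0F variation selectors; `toHexcode` is a hypothetical helper written for this illustration.

```typescript
// Convert a string to a dash-separated list of uppercase hex code points.
function toHexcode(text: string): string {
  return Array.from(text, (char) =>
    (char.codePointAt(0) ?? 0).toString(16).toUpperCase(),
  ).join('-');
}

// The sequence reported from iOS (no variation selectors).
const fromIos = '\uD83D\uDC41\u200D\uD83D\uDDE8';
// Assumed fully-qualified form: each half is followed by U+FE0F.
const fullyQualified = '\uD83D\uDC41\uFE0F\u200D\uD83D\uDDE8\uFE0F';

console.log(toHexcode(fromIos)); // 1F441-200D-1F5E8
console.log(toHexcode(fullyQualified)); // 1F441-FE0F-200D-1F5E8-FE0F
```

A pattern built only from the second hexcode would not match the first string, which is consistent with the report.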

Wrestlers (People, Men, Women) should have skintones?

Hi @milesj!

I think the Wrestlers (People, Men, Women) should have skintones as of Unicode 3.0/4.0 (at least Emojipedia says that). The raw.json(https://github.com/milesj/emojibase/tree/master/packages/data/en) doesn't have them. Is this a mistake or is Emojipedia wrong?

https://emojipedia.org/men-wrestling/
https://emojipedia.org/women-wrestling/
https://emojipedia.org/wrestlers/

PS. Many thanks for the 12.1 update! Keep up the good work! 🙏 Really great to have the emojibase project!

Emoji regex doesn't capture the variation selector after textual symbols

Thanks for this excellent resource! I've found it incredibly useful. I have run into one problem though.

The emoji regex doesn't capture the \uFE0F variation selector that's used to indicate that a non-emoji character like \u2764 (heavy black heart) should be treated as an emoji. Here's a simple repro case:

const regex = require('emoji-database/regex');

'❤️'.match(regex);
// => [ '❤', index: 0, input: '❤️' ]

Here it is again with codepoint escapes instead of literal characters, in case your browser/OS combo actually renders a standalone \u2764 as a red heart (Chrome on OS X doesn't, at least):

const regex = require('emoji-database/regex');

'\u2764\uFE0F'.match(regex);
// => [ '\u2764', index: 0, input: '\u2764\uFE0F' ]

This issue seems to be present for all characters that don't have the Emoji_Presentation property (meaning that they default to a text representation rather than an emoji representation unless followed by \uFE0F).

The fix seems to be to match an optional \uFE0F at the end of the regex:

// Wrap the source in a group so the optional selector applies to every alternative.
const fixedRegex = new RegExp('(?:' + require('emoji-database/regex').source + ')\uFE0F?');

'❤️'.match(fixedRegex);
// => [ '❤️', index: 0, input: '❤️' ]

I could see an argument for not capturing \uFE0F by default since that may not always be desirable, so I wasn't sure if this was intentional or not. If it is intentional, it might be helpful to mention this caveat (and the workaround) in the readme.

Thanks again!

Is it possible to add additional tags?

Hello.
Can you add additional tags to pleading_face 🥺 U+1F97A in Japanese?
Most Japanese call the emoji ぴえん (PIEN).
This is a famous meme and is written about in Wikipedia.
Thank you!

Group and subgroup seem to have the same value

Hey there,
I just stumbled upon your emojibase lists some days ago – such a nicely ordered and useful list of all the emojis – great work!
However, we realized that the group and subgroup numbers are equal to each other. The group numbers seem to be perfectly fine, but the subgroups turn out to be faulty.
It would be really great if that issue could be fixed! We're so glad to have found your database and it would make us even happier if we were able to make use of the subgroups in near future, too!

Best regards,
Kai

😊 shows up under :anxious:

http://unicode.org/emoji/charts/emoji-list.html shows the closest emoji to anxious is 😰.

More emoji list data in test utils?

Some devices can input "minimally-qualified" or "unqualified" emoji,
so I suggest providing a function that returns the full emoji data, including "unqualified" and "minimally-qualified" emoji.

e.g. #114

Documentation: User-friendly browser

The documentation links to https://github.com/milesj/emojibase/blob/master/packages/generator/src/resources/shortcodes.ts. I personally think a more user-friendly page with a simple in-page filter, similar to here might be nice.

Would you be open to a PR that includes this, or would this kind of thing be best in a separate repo?

Thanks!

Add next datasets

Datasets that point to the next upcoming non-released version.

Regex fails to match many emoji symbols

The emojibase regex fails to match quite a few symbols. Here's a quick example showing all the unmatched symbols I've been able to find:

const emojibaseRegex = require('emojibase-regex');
const emojis = '😐👽👍👎👈👉👆👇🖐👂👁🗣🕴👪🎓⛑👓🕶🐦🕷🕸🐟🐕🐈🕊🐿🌎🌍🌏🌕🌜🌤🌥🌦🌧⛈🌩🌨🌬🌪🌫🌶🍸🍽⛸⛷🏂🏄🏊🎖🏆🏵🎗🎟🎭🎬🎧🎮🏎🚑🚲🏍🚔🚍🚘🚇🛩🛰🛥🛳⛴🗺🏟⛱🏖🏝⛰🏔🏜🏕🛤🛣🏗🏭🏠🏘🏚🏛⛩🏞🏙💻🖥🖨🖱🖲🕹🗜💿📷📹📽🎞📟📺📻🎙🎚🎛⏱⏲🕰⏳🕯🗑🛢💰💳⚒🛠⛏⛓💣🗡🛡🕳🌡🛎🗝🛋🛏🖼🛍📥📤📦🏷📪📫📬📭🗒🗓🗃🗳🗄📋🗂🗞📚🖇🖊🖋🖌🖍🔍🔒🔓🕉🚭❓🚹🚺🚼⏸⏯⏹⏺⏭⏮⏩⏪🔈👁‍🗨🗯🕐🕑🕒🕓🕔🕕🕖🕗🕘🕙🕚🕛🕜🕝🕞🕟🕠🕡🕢🕣🕤🕥🕦🕧';

emojis.match(emojibaseRegex);
// => null

I spent some time poking around to see if I could find the root cause and open a PR, but ran out of time to work on this today so I figured I'd at least report it.

Localize "shortcodes.ts"

It is somewhat awkward to type an english description of an emoji to enter it, or know its "meaning" when typing in another language. Moreover, people not proficient at English are unduly penalized when using otherwise-translated pieces of software that leverage this database (I'm thinking of riot.im at least). People are not going to teach English to their relatives (grandparents, etc) because those would like to type 🌛 or something similar.

I guess the easiest way would be to provide more shortcodes.ts files, but crowd-sourcing could also be handled through Weblate (ideally providing the emoji as context).

See: element-hq/element-web#11013 and possibly https://github.com/vector-im/riot-web/issues/9298

more locales?

Hey, Miles. I'd like to use your emoji datasets in a project, but I need support for locales you haven't converted CLDR data for. I've examined your conversion scripts, and they seem to reference some libraries not included in the repo, so I haven't attempted to run them myself. Could I make a request for some locales?

Emoji Table page is broken

I’m getting the following error at https://emojibase.dev/emojis/:

Uncaught (in promise) TypeError: (0 , i.t) is not a function
    at c5e36b27.1064ba1b.js:1
    at s (3749.e06c569f.js:2)
    at Generator._invoke (3749.e06c569f.js:2)
    at Generator.next (3749.e06c569f.js:2)
    at e (3749.e06c569f.js:2)
    at u (3749.e06c569f.js:2)
    at 3749.e06c569f.js:2
    at new Promise (<anonymous>)
    at 3749.e06c569f.js:2
    at c5e36b27.1064ba1b.js:1

Emoticon docs

Add to data structure section
Add permutation API
Add emoticons section

Feature request: make the CDN url configurable

Would you be open to (a PR) adding the CDN url to the FetchFromCDNOptions?

Because of security policies, we want to host the emoji data on our own CDN. It would be great if I can keep using this package with its caching capabilities and simply give it a different URL to load the data from.

Something along the lines of:

// types.ts
export interface FetchFromCDNOptions extends RequestInit {
	/** Cache the response in local storage instead of session storage. Defaults to `false`. */
	local?: boolean;
	/** The url from which to load the JSON files */
	cdnUrl?: string;
	/** The release version to fetch. Defaults to `latest`. */
	version?: string;
}

// fetchFromCDN.ts
export async function fetchFromCDN<T>(path: string, options: FetchFromCDNOptions = {}): Promise<T> {
	const {
		local = false,
		cdnUrl = 'https://cdn.jsdelivr.net/npm/emojibase-data@',
		version = 'latest',
		...opts
	} = options;

	...

	// eslint-disable-next-line compat/compat
	const response = await fetch(`${cdnUrl}${version}/${path}`, {
		credentials: 'omit',
		mode: 'cors',
		redirect: 'error',
		...opts,
	});

It could also be worth considering making it a function that receives version and path, so developers get more flexibility of how to organize their own CDN.
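The function variant could be sketched like this. It is only an illustration of the proposal: `CdnUrlBuilder`, `CdnUrlOptions`, and `resolveCdnUrl` are hypothetical names, and `cdn.example.com` is a placeholder host.

```typescript
// Hypothetical option type: either a URL prefix or a function that builds the
// full URL from the version and the dataset path.
type CdnUrlBuilder = (version: string, path: string) => string;

interface CdnUrlOptions {
  cdnUrl?: string | CdnUrlBuilder;
  version?: string;
}

// Prefix taken from the snippet above.
const DEFAULT_CDN_URL = 'https://cdn.jsdelivr.net/npm/emojibase-data@';

// Resolve the final URL that would be passed to fetch().
function resolveCdnUrl(path: string, options: CdnUrlOptions = {}): string {
  const { cdnUrl = DEFAULT_CDN_URL, version = 'latest' } = options;

  return typeof cdnUrl === 'function'
    ? cdnUrl(version, path)
    : `${cdnUrl}${version}/${path}`;
}

// Default behaviour is unchanged.
const defaultUrl = resolveCdnUrl('en/data.json');

// A self-hosted layout can order the segments however it likes.
const customUrl = resolveCdnUrl('en/data.json', {
  version: '7.0.0',
  cdnUrl: (version, path) => `https://cdn.example.com/emoji/${version}/${path}`,
});
```

Accepting both a string and a function keeps the simple case simple while leaving the directory layout of a private CDN up to the caller.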

Detecting gender-neutral emoji

There are a few emoji that are available both with a gender and in a neutral version:

  {
    "annotation": "genie",
    "name": "GENIE",
    "hexcode": "1F9DE",
    "shortcodes": [
      "genie"
    ],
    "tags": [
      "djinn"
    ],
    "emoji": "🧞",
    "text": "",
    "type": 1,
    "order": 1507,
    "group": 1,
    "subgroup": 25,
    "version": 5
  },
  {
    "annotation": "man genie",
    "name": "GENIE, MALE SIGN",
    "hexcode": "1F9DE-200D-2642-FE0F",
    "shortcodes": [
      "man_genie"
    ],
    "tags": [
      "djinn"
    ],
    "emoji": "🧞‍♂️",
    "text": "",
    "type": 1,
    "order": 1508,
    "group": 1,
    "subgroup": 25,
    "version": 5,
    "gender": 1
  },
  {
    "annotation": "woman genie",
    "name": "GENIE, FEMALE SIGN",
    "hexcode": "1F9DE-200D-2640-FE0F",
    "shortcodes": [
      "woman_genie"
    ],
    "tags": [
      "djinn"
    ],
    "emoji": "🧞‍♀️",
    "text": "",
    "type": 1,
    "order": 1510,
    "group": 1,
    "subgroup": 25,
    "version": 5,
    "gender": 0
  },

It would be fantastic, if you could mark emoji that have gendered versions available, but are neutral. For skintones, that's possible by checking whether a skins field is present (if there is, it's the neutral version), but for genders that's not possible.

Regarding notation, I'm unsure what would be best. Maybe "gender": null, "gender": -1 or "gender_neutral": true?
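Until such a field exists, a consumer could approximate it from the hexcodes. The sketch below assumes gendered variants are ZWJ sequences ending in the female sign (2640) or male sign (2642), as in the genie entries above; `findNeutralWithGenderedVariants` is a hypothetical helper and does not cover emoji whose gendered forms use a different base character.

```typescript
interface EmojiLike {
  hexcode: string;
  gender?: number;
}

// Trailing "-200D-2640" or "-200D-2642", optionally followed by "-FE0F".
const GENDER_SIGN_SUFFIX = /-200D-(2640|2642)(-FE0F)?$/;

// Return hexcodes of emoji that have no gender themselves but do have at
// least one gendered ZWJ variant in the same list.
function findNeutralWithGenderedVariants(emojis: EmojiLike[]): string[] {
  const basesOfGendered = new Set<string>();

  for (const emoji of emojis) {
    if (emoji.gender !== undefined) {
      basesOfGendered.add(emoji.hexcode.replace(GENDER_SIGN_SUFFIX, ''));
    }
  }

  return emojis
    .filter((emoji) => emoji.gender === undefined && basesOfGendered.has(emoji.hexcode))
    .map((emoji) => emoji.hexcode);
}

const neutral = findNeutralWithGenderedVariants([
  { hexcode: '1F9DE' },
  { hexcode: '1F9DE-200D-2642-FE0F', gender: 1 },
  { hexcode: '1F9DE-200D-2640-FE0F', gender: 0 },
  { hexcode: '1F4BF' },
]);
```

An explicit field in the dataset would still be preferable, since this heuristic depends on how the sequences happen to be encoded.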

Supporting Unicode Emoji 13.1 ?

Hi @milesj,

Hope you've had a nice start into 2021 despite the depressing times!

As usual ... any timeline on supporting Unicode Emoji 13.1? :)

Support multiple shortcode presets

There is no standard for shortcodes (:smile:), so every website/platform (Slack, GitHub, etc) has a different implementation. Furthermore, emojibase implements its own version (since there is no standard to reference), which results in common emojis having different shortcodes (people don't like this).

Ideally I would like to support shortcode presets from multiple sources, but I'm not entirely sure where I could get this information. In an ideal world, it would look like the following:

Remove shortcodes from all emojibase-data.
Add new emojibase-data/shortcodes/*.json datasets that map hexcodes to an array of custom shortcodes for each emoji.
Have the consumer load and stitch them together.
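On the consumer side, combining several presets could be sketched as below. The preset contents are invented for illustration (they reuse the "tada"/"hooray" example from an earlier issue), and `mergePresets` is a hypothetical helper, not an existing emojibase function.

```typescript
// Assumed dataset shape: hexcode -> one shortcode or an array of shortcodes.
type ShortcodesDataset = Record<string, string | string[]>;

// Merge presets in order, keeping the first occurrence of each shortcode.
function mergePresets(...presets: ShortcodesDataset[]): Record<string, string[]> {
  const merged: Record<string, string[]> = {};

  for (const preset of presets) {
    for (const [hexcode, value] of Object.entries(preset)) {
      const list = merged[hexcode] ?? (merged[hexcode] = []);

      for (const shortcode of Array.isArray(value) ? value : [value]) {
        if (!list.includes(shortcode)) {
          list.push(shortcode);
        }
      }
    }
  }

  return merged;
}

// Invented presets: two sources that name the same emoji differently.
const presetA: ShortcodesDataset = { '1F389': 'tada' };
const presetB: ShortcodesDataset = { '1F389': ['hooray', 'tada'], '1F4BF': 'cd' };

const combined = mergePresets(presetA, presetB);
```

The order of the arguments decides which preset's shortcode comes first, which matters if a UI displays only the first entry.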

Flags are missing demonym tags

First off, thank you for making emojibase. It is an incredible library that simplifies so many aspects of creating emoji pickers and emoji search databases. My project emoji-picker-element would not have been possible without it. I wouldn't have even attempted it if emojibase did not exist. So thank you! 🙂

I noticed that, in English, the flag emoji tags/annotations do not include demonyms. So for instance:

"Denmark" ✔️ / "Danish" ❌
"Ireland" ✔️ / "Irish" ❌
"England" ✔️ / "English" ❌

You can test this by searching for e.g. "english" or "danish" on emojibase.dev.

I understand if it's the responsibility of emojibase users to do stemming/expansion/etc. E.g. the 🙏 emoji has keywords for pray, but not praying (nor prays, prayer, other conjugations, etc.). Totally understandable, especially for languages with tons of conjugations. Just wanted to raise the issue in case it's of interest. Thanks! 🙂

Categories all over the place?

It seems that the categories are all mixed up, the first category (smiley and faces) goes all the way across several other ones, including animals and objects like a diamond jewel.

Actually, every single category in this file is different from what you will find in emoji keyboards, websites, etc. Is there a reason for this?

Missing emojis

I can't find any details about these emojis:

Minify published data

Published JSON data should be minified to further reduce the overall file size.

Supported locale questions

Thanks for the library!

This is the current supported locale
https://emojibase.dev/docs/datasets/#supported-locales

But judging from https://github.com/unicode-org/cldr-json, there seem to be many more emoji locales available in the GitHub JSON repository.

Wondering what are the main differences between http://unicode.org/Public/emoji/ and the github repo one

Add more locales

May I ask to add support for Ukrainian language?
Also, based on translation status of Element Web and according to occurrence of particular language speakers in Element translation room, I would suggest adding Estonian, Finnish, Hungarian, Lithuanian and Norwegian languages as well if that's possible.
Thanks.

Emoji v12

http://blog.unicode.org/2019/02/unicode-emoji-12-final-for-2019.html

"hopeful" emoji looks sad / crying

emojibase/packages/generator/src/resources/shortcodes.ts

Lines 103 to 104 in 31c189b

 // 😥 sad but relieved face 

 '1F625': ['hopeful'],

Via element-hq/element-web#13334 (comment)

Add skin tone translations

These would be useful within meta.

  // Skin tones
  skinNone: 'No skin tone',
  skinLight: 'Light skin tone',
  skinMediumLight: 'Medium-light skin tone',
  skinMedium: 'Medium skin tone',
  skinMediumDark: 'Medium-dark skin tone',
  skinDark: 'Dark skin tone',

Add emoji reference to meta files

First thanks a lot for your great library!

While working with its meta json files, I've noticed I would find it very useful to have objects instead of arrays for shortcodes.json. The keys would be the actual shortcode and as a value I would expect a reference to the actual emoji information. I've thought about a simple integer with the emoji's order property, so that you can get the emoji from your (already ordered) data.json simply using the index.

I think this could also be useful for the hexcodes.json and unicode.json files. If you agree, I would be happy to make a PR with the necessary changes, but I wanted to discuss it here first.
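The proposed lookup can already be derived on the client, which may help when judging the shape. This sketch uses the array position as the reference and assumes the data array is ordered; `indexByShortcode` is a hypothetical name.

```typescript
interface EmojiLike {
  hexcode: string;
  shortcodes?: string[];
}

// Build a shortcode -> array index lookup over an ordered emoji list.
function indexByShortcode(emojis: EmojiLike[]): Map<string, number> {
  const index = new Map<string, number>();

  emojis.forEach((emoji, position) => {
    for (const shortcode of emoji.shortcodes ?? []) {
      index.set(shortcode, position);
    }
  });

  return index;
}

const data: EmojiLike[] = [
  { hexcode: '1F9DE', shortcodes: ['genie'] },
  { hexcode: '1F9DE-200D-2642-FE0F', shortcodes: ['man_genie'] },
  { hexcode: '1F9DE-200D-2640-FE0F', shortcodes: ['woman_genie'] },
];

const lookup = indexByShortcode(data);
const position = lookup.get('woman_genie');
const match = position === undefined ? undefined : data[position];
```

Whether the stored number should be the array index or the `order` property is exactly the open question of the proposal; the two only coincide if the dataset has no gaps.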

Consider moving all dependencies to devDependencies

As far as I can tell, none of the generated datasets or regexes actually rely on any of the production dependencies listed in package.json. Could they be moved to devDependencies to prevent them from being installed in production?

I ask because EmojiOne in particular is really huge (149MB!). In environments like Heroku where there are limits on how large an app and its dependencies can be, this is unfortunate.

I'd be happy to send a PR for this, but I wanted to make sure I'm not missing something first.

No skintones for 🤝 Handshake?

If I'm not mistaken, there are no skintones for 🤝 "Handshake" 1F91D in emojibase-data, right? But https://emojipedia.org/search/?q=handshake and other vendors have them ... is this again a non-official Unicode thing like #40?

It seems that the default regex (import EMOJI_REGEX from 'emojibase-regex'), the emoji regex (import EMOJI_REGEX from 'emojibase-regex/emoji') and also the text regex (import EMOJI_REGEX from 'emojibase-regex/text') all match non-qualified characters like the copyright sign.

As far as I know, the copyright sign is text-default (https://emojipedia.org/emoji/%C2%A9/), so there should be a "non-greedy" regex that does not match the copyright sign without an emoji variant selector, right? Is there such a regex in this project?

Other text-default emoji include the warning sign or the biohazard sign. See also: http://xahlee.info/comp/text_vs_emoji.html

Have I imported the wrong regex?
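As a stopgap, a stricter check for the three characters named above can be written by hand. This is only a sketch for those examples, not a replacement for the published patterns: `findEmojiPresentation` is a hypothetical helper, and a complete list of text-default characters would have to come from the Unicode data files.

```typescript
// Copyright sign, warning sign and biohazard sign, matched only when they are
// immediately followed by the emoji variation selector U+FE0F.
const TEXT_DEFAULT_AS_EMOJI = /[\u00A9\u26A0\u2623]\uFE0F/g;

// Return every text-default character that explicitly requests emoji style.
function findEmojiPresentation(text: string): string[] {
  return text.match(TEXT_DEFAULT_AS_EMOJI) ?? [];
}

const plain = findEmojiPresentation('\u00A9 2021 Example Corp');
const styled = findEmojiPresentation('\u26A0\uFE0F hot surface');
```

Here `plain` is empty because the bare copyright sign carries no selector, while `styled` contains the warning sign together with its selector.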

Mappings are wrong on Apple platform

I have been redirected here from http://github.com/vector-im/riot-web:

It seems the current mappings break Apple platforms:
For example, :( is ☹︎, which is not rendered on Apple platforms as an Emoji, while 🙁 is an Emoji.
:D is rendered as 😁, when it is more commonly rendered as 😀 (not really a breakage, just inconsistency).
Things like 👍 result in 👍️, which is rendered as a thumb up followed by a white box, while 👍 renders correctly.

Basically, look at the entire list on an iOS device or in Safari on a Mac, and most of them are broken.
