mathiasbynens / emoji-regex

A regular expression to match all Emoji-only symbols as per the Unicode Standard.

Home Page: https://mths.be/emoji-regex

License: MIT License

JavaScript 100.00%

emoji regex regexp regular-expression unicode

emoji-regex's Introduction

emoji-regex

emoji-regex offers a regular expression to match all emoji symbols and sequences (including textual representations of emoji) as per the Unicode Standard. It’s based on emoji-test-regex-pattern, which generates (at build time) the regular expression pattern based on the Unicode Standard. As a result, emoji-regex can easily be updated whenever new emoji are added to Unicode.

Installation

Via npm:

npm install emoji-regex

In Node.js:

const emojiRegex = require('emoji-regex');
// Note: because the regular expression has the global flag set, this module
// exports a function that returns the regex rather than exporting the regular
// expression itself, to make it impossible to (accidentally) mutate the
// original regular expression.

const text = `
\u{231A}: ⌚ default emoji presentation character (Emoji_Presentation)
\u{2194}\u{FE0F}: ↔️ default text presentation character rendered as emoji
\u{1F469}: 👩 emoji modifier base (Emoji_Modifier_Base)
\u{1F469}\u{1F3FF}: 👩🏿 emoji modifier base followed by a modifier
`;

const regex = emojiRegex();
for (const match of text.matchAll(regex)) {
  const emoji = match[0];
  console.log(`Matched sequence ${ emoji } — code points: ${ [...emoji].length }`);
}

Console output:

Matched sequence ⌚ — code points: 1
Matched sequence ↔️ — code points: 2
Matched sequence 👩 — code points: 1
Matched sequence 👩🏿 — code points: 2

For maintainers

How to update emoji-regex after new Unicode Standard releases

Update emoji-test-regex-pattern as described in its repository.
Bump the emoji-test-regex-pattern dependency to the latest version.

Update the Unicode data dependency in package.json by running the following commands:

# Example: updating from Unicode v13 to Unicode v14.
npm uninstall @unicode/unicode-13.0.0
npm install @unicode/unicode-14.0.0 --save-dev

Generate the new output:
```
npm run build
```
Verify that tests still pass:
```
npm test
```

How to publish a new release

On the main branch, bump the emoji-regex version number in package.json:
```
npm version patch -m 'Release v%s'
```
Instead of patch, use minor or major as needed.

Note that this produces a Git commit + tag.
Push the release commit and tag:
```
git push && git push --tags
```
Our CI then automatically publishes the new release to npm.

Author


Mathias Bynens

License

emoji-regex is available under the MIT license.

emoji-regex's Issues

Avoid matching emoji followed by text variation selector (U+FE0E)

First of all, thanks for this project! It's very useful.

It appears that the regex even matches codepoints that are followed by a text variant selector (FE0E).

The exclamation mark is an emoji with emoji-default representation. It should be matched both without a variant selector and with an emoji variant selector (FE0F).

However, it should not be matched when followed by a text variant selector (FE0E).

let m: RegExpExecArray | null;
console.info('no variation');
const r1 = emojiRegex();
while ((m = r1.exec('\u2757')) !== null) {
    console.log('match', m, 'lastIndex', r1.lastIndex);
}
const r2 = emojiRegex();
console.info('text variation');
while ((m = r2.exec('\u2757\ufe0e')) !== null) {
    console.log('match', m, 'lastIndex', r2.lastIndex);
}
const r3 = emojiRegex();
console.info('emoji variation');
while ((m = r3.exec('\u2757\ufe0f')) !== null) {
    console.log('match', m, 'lastIndex', r3.lastIndex);
}

This will match the emoji 3 times, each time with length 1.

My expectation would be that the version without variant selector is matched with length 1, that the version with emoji variant selector is matched with length 2, and that the version with text variant selector is not matched at all.
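
Until the pattern itself excludes such cases, the expectation described in this report can be approximated on the caller's side. The following is a minimal sketch, assuming a hypothetical helper `matchesWithoutTextSelector` (not part of emoji-regex) and a small stand-in pattern instead of the real one: it drops any match that is immediately followed by U+FE0E.

```javascript
// Hypothetical helper (not part of emoji-regex): keep only matches that are
// NOT immediately followed by U+FE0E VARIATION SELECTOR-15 (text presentation).
function matchesWithoutTextSelector(text, regex) {
  const results = [];
  // matchAll requires a global regex and works on an internal copy of it.
  for (const match of text.matchAll(regex)) {
    const end = match.index + match[0].length;
    if (text[end] !== '\uFE0E') {
      results.push(match[0]);
    }
  }
  return results;
}

// Stand-in pattern for illustration only (an assumption); in real code this
// would be the regex returned by emojiRegex().
const standIn = /\p{Extended_Pictographic}\uFE0F?/gu;

console.log(matchesWithoutTextSelector('\u2757', standIn));       // one match, length 1
console.log(matchesWithoutTextSelector('\u2757\uFE0E', standIn)); // no matches
console.log(matchesWithoutTextSelector('\u2757\uFE0F', standIn)); // one match, length 2
```

The same filter could also be expressed as a negative lookahead `(?!\uFE0E)` appended to the pattern, at the cost of rebuilding the regex.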

this one is not included

‍🗨'

Failing to match emoji keycap numbers 0-9, # and *

I tested all emoji available on iOS 9.1 / OS X 10.11.1, and the regex fails to detect the keycap number emoji:

\u0030\ufe0f\u20e3
\u0031\ufe0f\u20e3
\u0032\ufe0f\u20e3
\u0033\ufe0f\u20e3
\u0034\ufe0f\u20e3
\u0035\ufe0f\u20e3
\u0036\ufe0f\u20e3
\u0037\ufe0f\u20e3
\u0038\ufe0f\u20e3
\u0039\ufe0f\u20e3
\u0023\ufe0f\u20e3
\u002a\ufe0f\u20e3

As typed

0️⃣
1️⃣
2️⃣
3️⃣
4️⃣
5️⃣
6️⃣
7️⃣
8️⃣
9️⃣
#️⃣
*️⃣

Missing Unicode 9.0 "Rolling on the floor laughing" and more

This otherwise wonderful and highly useful regular expression seems to be missing a number of modern / new Emoji included in Unicode 9 / Emoji 3.0. The missing emojis include the popular Rolling on the floor laughing (:rofl:) and Nauseated Face (:nauseated_face:), as well as others.

Any chance we can get these new ones added to the regex?

Thanks for an awesome library!

- Joe

Emoji 🖼 not matched

🖼 doesn't seem to be matched by the regex.
Thanks for this wonderful library btw.

Option of flagless search

Hi Mathias,

We currently have a use case that runs through an array to check whether each item is an emoji:

const emojiRegex = require('emoji-regex');
const textList = ['😁', '😂', '😃', '😄', '😅', '😆'];
const isEmoji = char => emojiRegex().test(char);
textList.every(char => isEmoji(char));

Because the regex has the global flag set, its search state (lastIndex) has to be reset between calls, so we need to call emojiRegex() every time instead of creating the regex once, like this:

const emojiRegex = require('emoji-regex');
const REGEX = emojiRegex();
const textList = ['😁', '😂', '😃', '😄', '😅', '😆'];
const isEmoji = char => REGEX.test(char);
textList.every(char => isEmoji(char));

This is probably minor, but there is some performance difference when the use case is just to match a single string.

https://jsperf.com/create-regex/1

Would like to hear about your idea and suggestion.

Thanks and Cheers!
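
One possible workaround, sketched here under assumptions rather than as a feature of the library, is to derive a non-global copy of the regex once and reuse it. `withoutGlobalFlag` is a hypothetical helper, and the pattern is a small stand-in for the one returned by `emojiRegex()`.

```javascript
// Hypothetical helper: copy a regex without its global flag, so repeated
// .test() calls no longer depend on (or mutate) lastIndex.
function withoutGlobalFlag(regex) {
  return new RegExp(regex.source, regex.flags.replace('g', ''));
}

// Stand-in pattern for illustration only (an assumption).
const standIn = /\p{Extended_Pictographic}/gu;
const textList = ['😁', '😂', '😃', '😄', '😅', '😆'];

// Reusing the global regex: lastIndex carries over from one call to the next,
// so every second call starts past the end of the string and fails.
const sharedResults = textList.map((char) => standIn.test(char));
console.log(sharedResults); // [true, false, true, false, true, false]

// Reusing a non-global copy: every call starts at index 0.
const single = withoutGlobalFlag(standIn);
const fixedResults = textList.map((char) => single.test(char));
console.log(fixedResults); // [true, true, true, true, true, true]
```

Setting `REGEX.lastIndex = 0` before each call would be an alternative that keeps the original regex object.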

Does not match all combinations of emojis that support multiple skin tones

For instance, the regex will match 🧑🏼‍🤝‍🧑🏻 ('\u{1F9D1}\u{1F3FC}\u200D\u{1F91D}\u200D\u{1F9D1}\u{1F3FB}'), but it doesn't match 🧑🏻‍🤝‍🧑🏼 ('\u{1F9D1}\u{1F3FB}\u200D\u{1F91D}\u200D\u{1F9D1}\u{1F3FC}'). It seems to only work if the darker skin tone is on the left side. There are some more emojis that support multiple skin tones coming, so it would be great if this could match against all possible combinations.

Text version of regexp captures numbers and other symbols

Regexp does not match all emoji but match digits

Hello!
I found two issues with v7.0.1 regular expression.

The regexp matches digits like 0, 1, ... 9, but it does not match some emoji code points like \u271d (Latin Cross emoji). It seems the problem is here: (?:[#*0-9\xA9\xAE\u203C\
The regexp does not match some long emoji sequences like \uD83D\uDC73\u200D\u2640\uFE0F (👳‍♀️).

missing code points for some emojies

Using the library, I'm trying to find out whether a text consists only of emoji.
This is how I try to do it:

import emojiRegex from 'emoji-regex/es2015/text'; // can be the regular version as well (not the text one)

Later, I go through all the matches and accumulate each match's length (to sum up the code points).
At the end, I compare the original text's length with the accumulated length, as follows:

let totalEmojiesLength = 0;
let match;
while ((match = regex.exec(this.data.body))) {
        const emoji = match[0];
        totalEmojiesLength += [...emoji].length;
}

if (this.data.body.length === totalEmojiesLength) {
        return true;
}

return false;

However, for some emoji, e.g. 🎃, ⛄️ and some more, only the first code point is returned, so the length of the emoji is wrong.

It looks like a bug: for those emoji, the match doesn't return all of their code points.

A follow-up question: is there any other way to test whether a text contains only emoji using the library? I didn't find one, and my solution isn't really efficient performance-wise.

related to #35
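
One thing worth checking in a comparison like this is the unit of measurement: `String.prototype.length` counts UTF-16 code units, while spreading a string (`[...str]`) counts code points, so the two sums can differ even when every emoji is matched in full. A minimal sketch, assuming a hypothetical helper `isOnlyMatches` and a stand-in pattern, that measures both sides in code units:

```javascript
const pumpkin = '\u{1F383}'; // 🎃 is one code point, stored as a surrogate pair
console.log(pumpkin.length);      // 2 (UTF-16 code units)
console.log([...pumpkin].length); // 1 (code points)

// Hypothetical helper: true when the matches cover the whole text.
// Both sides of the comparison are measured in UTF-16 code units.
function isOnlyMatches(text, regex) {
  let matchedUnits = 0;
  for (const match of text.matchAll(regex)) {
    matchedUnits += match[0].length;
  }
  return text.length > 0 && matchedUnits === text.length;
}

// Stand-in pattern for illustration only (an assumption); in real code this
// would be the regex returned by emojiRegex().
const standIn = /\p{Extended_Pictographic}\uFE0F?/gu;

console.log(isOnlyMatches('\u{1F383}\u26C4\uFE0F', standIn)); // true
console.log(isOnlyMatches('\u{1F383} hi', standIn));          // false
```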

emoji-regex/text thinks "1" is an emoji

As suggested here, I started using emoji-regex/text to detect all emoji. However, with emoji-regex/text the regular expression starts failing because it treats numbers as emoji as well.

Some basic emojis does not match

Hi, I'm using your library to match emoji. The problem is that it fails to match some emoji that I think are standard. Some of them are listed below:

🏳️ White Flag U+1F3F3
☺️ Smiling Face U+263A
☹️ Frowning Face U+2639

Does the library match them? Is there any way to add them to the regex?

Number is an emoji

How can I fix it? I already read it.

const emojiRegex = require('emoji-regex/text.js')
console.log(emojiRegex().test('1')) // true

Missing skin tone?

Skin tone modifiers seem to break this. 👍🏿 will not be matched by the regex <_<

Some emojis lost

I try to parse the string ♿️🎇🏖🌎🗺🍌🐯 and get an array of emoji. When I join that array, I get a string of 13 symbols, whereas the original string has 14 symbols.

So 'VARIATION SELECTOR-16' was lost after parsing.

I used text.js to handle variations.

Published version’s es2015 output is incorrect

Hi there,

It looks like the release cut after #22 (v6.5.0) was merged perhaps didn’t have dependencies bumped correctly when it was built, as the es2015 modules don’t match the output I expected!

They should start with;

module.exports = () => {
	// https://mathiasbynens.be/notes/es-unicode-property-escapes#emoji
	return (/\u{1F3F4}\u{E0067}\u{E0062}(?:\u{E0065}\u{E006E}\u{E0067}|\u{E0077}\u{E006C}\u{E0073}|\u{E0073}\u{E0063}\u{E0074})\u{E007F}|\u{1F469}\u200D\u{1F469}\u200D.../gu
	);
};

But instead begin;

module.exports = () => {
	// https://mathiasbynens.be/notes/es-unicode-property-escapes#emoji
	return (/\uD83C\uDFF4\uDB40\uDC67\uDB40\uDC62(?:\uDB40\uDC65\uDB40\uDC6E\uDB40\uDC67|\uDB40\uDC77\uDB40\uDC6C\uDB40\uDC73|\uDB40\uDC73\uDB40\uDC63\uDB40\uDC74)\uDB40\uDC7F|\uD83D\uDC69\u200D\uD83D\uDC69\u200D(?:\uD83D\uDC66\u200D\uD83D\uDC66|\uD83D\uDC67\u200D(?:\uD83D[\uDC66\uDC67]))|…/gu
	);
};

Could you do a new release and double check the dependencies (regexgen in particular!) are up to date? Unfortunately there’s no spec checking the format of the regex! 😅

🎖 is not matched

Hey there,

Love this library, but I've been seeing some new(er) emojis that are not matched correctly?

import emojiRegex from 'emoji-regex'
const e = emojiRegex()
e.test('🎖') // false

Thanks,

Brekk

Handle 👁‍🗨

Looks like the script doesn’t match the character properly:

'<p>Foo 👁‍🗨 Bar</p>'.replace(emojiRegex(), function(match) {
  return '<b>' + match + '</b>';
});

// → <p>Foo <b>👁</b>‍🗨 Bar</p>

this will match 0 as if were a 0️⃣ :P

This regexp is not useful when you want to match just emoji, because when you have a 0, this library thinks "oh yeah, 0 could be an emoji, as in 0️⃣", but that's not what I was expecting.

emojiRegex().test('0')   // expected: false
emojiRegex().test('0️⃣') // expected: true
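
For background: in the Unicode emoji data, the digits 0-9, # and * carry the Emoji property because they are the bases of keycap sequences, which is consistent with a text-presentation pattern matching a bare digit. The sketch below uses only built-in Unicode property escapes, not emoji-regex itself, and the `keycap` pattern is an illustrative assumption for callers who want to accept only complete keycap sequences.

```javascript
// A bare digit has the Emoji property but not Emoji_Presentation.
console.log(/\p{Emoji}/u.test('0'));              // true
console.log(/\p{Emoji_Presentation}/u.test('0')); // false

// Illustrative pattern (an assumption, not taken from emoji-regex):
// a keycap base, an optional U+FE0F, then U+20E3 COMBINING ENCLOSING KEYCAP.
const keycap = /^[#*0-9]\uFE0F?\u20E3$/u;

console.log(keycap.test('0'));             // false
console.log(keycap.test('0\uFE0F\u20E3')); // true
```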

Some Emoji no longer match after 6.1.3

Thanks for providing the library. We noticed that some emoji no longer match the regex as of the latest published version.

Not sure if this is because the Unicode spec changed?
http://www.unicode.org/reports/tr51/

Test case

const emojiRegex = require('emoji-regex');

const emojis = '😁,😂,😃,😄,😅,😆,😉,😊,😋,😌,😍,😏,😒,😓,😔,😖,😘,😚,😜,😝,😞,😠,😡,😢,😣,😤,😥,😨,😩,😪,😫,😭,😰,😱,😲,😳,😵,😷,😸,😹,😺,😻,😼,😽,😾,😿,🙀,🙅,🙆,🙇,🙈,🙉,🙊,🙋,🙌,🙍,🙎,🙏,✂,✅,✈,✉,✊,✋,✌,✏,✒,✔,✖,✨,✳,✴,❄,❇,❌,❎,❓,❔,❕,❗,❤,➕,➖,➗,➡,➰,🚀,🚃,🚄,🚅,🚇,🚉,🚌,🚏,🚑,🚒,🚓,🚕,🚗,🚙,🚚,🚢,🚤,🚥,🚧,🚨,🚩,🚪,🚫,🚬,🚭,🚲,🚶,🚹,🚺,🚻,🚼,🚽,🚾,🛀,Ⓜ,🅰,🅱,🅾,🅿,🆎,🆑,🆒,🆓,🆔,🆕,🆖,🆗,🆘,🆙,🆚,🇩🇪,🇬🇧,🇨🇳,🇯🇵,🇰🇷,🇫🇷,🇪🇸,🇮🇹,🇺🇸,🇷🇺,🈁,🈂,🈚,🈯,🈲,🈳,🈴,🈵,🈶,🈷,🈸,🈹,🈺,🉐,🉑,©,®,‼,⁉,8⃣,9⃣,7⃣,6⃣,1⃣,0⃣,2⃣,3⃣,5⃣,4⃣,#⃣,™,ℹ,↔,↕,↖,↗,↘,↙,↩,↪,⌚,⌛,⏩,⏪,⏫,⏬,⏰,⏳,▪,▫,▶,◀,◻,◼,◽,◾,☀,☁,☎,☑,☔,☕,☝,☺,♈,♉,♊,♋,♌,♍,♎,♏,♐,♑,♒,♓,♠,♣,♥,♦,♨,♻,♿,⚓,⚠,⚡,⚪,⚫,⚽,⚾,⛄,⛅,⛎,⛔,⛪,⛲,⛳,⛵,⛺,⛽,⤴,⤵,⬅,⬆,⬇,⬛,⬜,⭐,⭕,〰,〽,㊗,㊙,🀄,🃏,🌀,🌁,🌂,🌃,🌄,🌅,🌆,🌇,🌈,🌉,🌊,🌋,🌌,🌏,🌑,🌓,🌔,🌕,🌙,🌛,🌟,🌠,🌰,🌱,🌴,🌵,🌷,🌸,🌹,🌺,🌻,🌼,🌽,🌾,🌿,🍀,🍁,🍂,🍃,🍄,🍅,🍆,🍇,🍈,🍉,🍊,🍌,🍍,🍎,🍏,🍑,🍒,🍓,🍔,🍕,🍖,🍗,🍘,🍙,🍚,🍛,🍜,🍝,🍞,🍟,🍠,🍡,🍢,🍣,🍤,🍥,🍦,🍧,🍨,🍩,🍪,🍫,🍬,🍭,🍮,🍯,🍰,🍱,🍲,🍳,🍴,🍵,🍶,🍷,🍸,🍹,🍺,🍻,🎀,🎁,🎂,🎃,🎄,🎅,🎆,🎇,🎈,🎉,🎊,🎋,🎌,🎍,🎎,🎏,🎐,🎑,🎒,🎓,🎠,🎡,🎢,🎣,🎤,🎥,🎦,🎧,🎨,🎩,🎪,🎫,🎬,🎭,🎮,🎯,🎰,🎱,🎲,🎳,🎴,🎵,🎶,🎷,🎸,🎹,🎺,🎻,🎼,🎽,🎾,🎿,🏀,🏁,🏂,🏃,🏄,🏆,🏈,🏊,🏠,🏡,🏢,🏣,🏥,🏦,🏧,🏨,🏩,🏪,🏫,🏬,🏭,🏮,🏯,🏰,🐌,🐍,🐎,🐑,🐒,🐔,🐗,🐘,🐙,🐚,🐛,🐜,🐝,🐞,🐟,🐠,🐡,🐢,🐣,🐤,🐥,🐦,🐧,🐨,🐩,🐫,🐬,🐭,🐮,🐯,🐰,🐱,🐲,🐳,🐴,🐵,🐶,🐷,🐸,🐹,🐺,🐻,🐼,🐽,🐾,👀,👂,👃,👄,👅,👆,👇,👈,👉,👊,👋,👌,👍,👎,👏,👐,👑,👒,👓,👔,👕,👖,👗,👘,👙,👚,👛,👜,👝,👞,👟,👠,👡,👢,👣,👤,👦,👧,👨,👩,👪,👫,👮,👯,👰,👱,👲,👳,👴,👵,👶,👷,👸,👹,👺,👻,👼,👽,👾,👿,💀,💁,💂,💃,💄,💅,💆,💇,💈,💉,💊,💋,💌,💍,💎,💏,💐,💑,💒,💓,💔,💕,💖,💗,💘,💙,💚,💛,💜,💝,💞,💟,💠,💡,💢,💣,💤,💥,💦,💧,💨,💩,💪,💫,💬,💮,💯,💰,💱,💲,💳,💴,💵,💸,💹,💺,💻,💼,💽,💾,💿,📀,📁,📂,📃,📄,📅,📆,📇,📈,📉,📊,📋,📌,📍,📎,📏,📐,📑,📒,📓,📔,📕,📖,📗,📘,📙,📚,📛,📜,📝,📞,📟,📠,📡,📢,📣,📤,📥,📦,📧,📨,📩,📪,📫,📮,📰,📱,📲,📳,📴,📶,📷,📹,📺,📻,📼,🔃,🔊,🔋,🔌,🔍,🔎,🔏,🔐,🔑,🔒,🔓,🔔,🔖,🔗,🔘,🔙,🔚,🔛,🔜,🔝,🔞,🔟,🔠,🔡,🔢,🔣,🔤,🔥,🔦,🔧,🔨,🔩,🔪,🔫,🔮,🔯,🔰,🔱,🔲,🔳,🔴,🔵,🔶,🔷,🔸,🔹,🔺,🔻,🔼,🔽,🕐,🕑,🕒,🕓,🕔,🕕,🕖,🕗,🕘,🕙,🕚,🕛,🗻,🗼,🗽,🗾,🗿,😀,😇,😈,😎,😐,😑,😕,😗,😙,😛,😟,😦,😧,😬,😮,😯,😴,😶,🚁,🚂,🚆,🚈,🚊,🚍,🚎,🚐,🚔,🚖,🚘,🚛,🚜,🚝,🚞,🚟,🚠,🚡,🚣,🚦,🚮,🚯,🚰,🚱,🚳,🚴,🚵,🚷,🚸,🚿,🛁,🛂,🛃,🛄,🛅,🌍,🌎,🌐,🌒,🌖,🌗,🌘,🌚,🌜,🌝,🌞,🌲,🌳,🍋,🍐,🍼,🏇,🏉,🏤,🐀,🐁,🐂,🐃,🐄,🐅,🐆,🐇,🐈,🐉,🐊,🐋,🐏,🐐,🐓,🐕,🐖,🐪,👥,👬,👭,💭,💶,💷,📬,📭,📯,📵,🔀,🔁,🔂,🔄,🔅,🔆,🔇,🔉,🔕,🔬,🔭,🕜,🕝,🕞,🕟,🕠,🕡,🕢,🕣,🕤,🕥,🕦,🕧'.split(',');

const exception = [];
emojis.forEach((emoji) => {
  const match = emojiRegex().exec(emoji);
  if (!match) { exception.push(emoji) }
});
console.log('Exception length', exception.length);
console.log(JSON.stringify(exception));

6.1.0

Exception length 0

[]

6.1.3

Exception length 72

["✂","✈","✉","✏","✒","✔","✖","✳","✴","❄","❇","❤","➡","Ⓜ","🅰","🅱","🅾","🅿","🈂","🈷","©","®","‼","⁉","8⃣","9⃣","7⃣","6⃣","1⃣","0⃣","2⃣","3⃣","5⃣","4⃣","#⃣","™","ℹ","↔","↕","↖","↗","↘","↙","↩","↪","▪","▫","▶","◀","◻","◼","☀","☁","☎","☑","☺","♠","♣","♥","♦","♨","♻","⚠","⤴","⤵","⬅","⬆","⬇","〰","〽","㊗","㊙"]

Using https://github.com/Kikobeats/emojis-list as spec

6.1.0

Exception length 118

["🇦","🇧","🇨","🇩","🇪","🇫","🇬","🇭","🇮","🇯","🇰","🇱","🇲","🇳","🇴","🇵","🇶","🇷","🇸","🇹","🇺🇳","🇺","🇻","🇼","🇽","🇾","🇿","🕺","🖤","🗨","🛑","🛒","🛴","🛵","🛶","🤙","🤚","🤛","🤜","🤝","🤞","🤠","🤡","🤢","🤣","🤤","🤥","🤦‍♀️","🤦‍♂️","🤦","🤧","🤰","🤳","🤴","🤵","🤶","🤷‍♀️","🤷‍♂️","🤷","🤸‍♀️","🤸‍♂️","🤸","🤹‍♀️","🤹‍♂️","🤹","🤺","🤼‍♀️","🤼‍♂️","🤼","🤽‍♀️","🤽‍♂️","🤽","🤾‍♀️","🤾‍♂️","🤾","🥀","🥁","🥂","🥃","🥄","🥅","🥇","🥈","🥉","🥊","🥋","🥐","🥑","🥒","🥓","🥔","🥕","🥖","🥗","🥘","🥙","🥚","🥛","🥜","🥝","🥞","🦅","🦆","🦇","🦈","🦉","🦊","🦋","🦌","🦍","🦎","🦏","🦐","🦑","♀","♂","⚕",""]

6.1.3

Exception length 209

["🅰","🅱","🅾","🅿","🈂","🈷","🌡","🌤","🌥","🌦","🌧","🌨","🌩","🌪","🌫","🌬","🌶","🍽","🎖","🎗","🎙","🎚","🎛","🎞","🎟","🏍","🏎","🏔","🏕","🏖","🏗","🏘","🏙","🏚","🏛","🏜","🏝","🏞","🏟","🏳","🏵","🏷","🐿","👁‍🗨","👁","📽","🕉","🕊","🕯","🕰","🕳","🕶","🕷","🕸","🕹","🖇","🖊","🖋","🖌","🖍","🖥","🖨","🖱","🖲","🖼","🗂","🗃","🗄","🗑","🗒","🗓","🗜","🗝","🗞","🗡","🗣","🗨","🗯","🗳","🗺","🛋","🛍","🛎","🛏","🛠","🛡","🛢","🛣","🛤","🛥","🛩","🛰","🛳","‼","⁉","™","ℹ","↔","↕","↖","↗","↘","↙","↩","↪","#⃣","⌨","⏏","⏭","⏮","⏯","⏱","⏲","⏸","⏹","⏺","Ⓜ","▪","▫","▶","◀","◻","◼","☀","☁","☂","☃","☄","☎","☑","☘","☠","☢","☣","☦","☪","☮","☯","☸","☹","☺","♀","♂","♠","♣","♥","♦","♨","♻","⚒","⚔","⚕","⚖","⚗","⚙","⚛","⚜","⚠","⚰","⚱","⛈","⛏","⛑","⛓","⛩","⛰","⛱","⛴","⛷","⛸","✂","✈","✉","✏","✒","✔","✖","✝","✡","✳","✴","❄","❇","❣","❤","➡","⤴","⤵","*⃣","⬅","⬆","⬇","0⃣","〰","〽","1⃣","2⃣","㊗","㊙","3⃣","4⃣","5⃣","6⃣","7⃣","8⃣","9⃣","©","®",""]

does not match trailing VARIATION SELECTOR-16

Following up on this tweet:

I'm using emoji-regex to identify if the last symbol of a string is an emoji. While this works for most emojis, it does not for ⚽️. As it turns out this is happening because macOS inserts U+26BD followed by U+FE0F and that trailing variation selector is not part of the emoji-regex match.

While I don't think this is a bug in emoji-regex I do believe emoji-regex could help avoid this situation by including the unnecessary variation selector in the match.
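
As a caller-side workaround for the "last symbol" check described here, a trailing U+FE0F can be tolerated explicitly. This is a minimal sketch, assuming a hypothetical helper `endsWithMatch` and a stand-in pattern that deliberately does not consume the variation selector, mirroring the reported behavior:

```javascript
// Hypothetical helper: does the text end with a match, optionally followed
// by a single U+FE0F VARIATION SELECTOR-16 that the pattern did not consume?
function endsWithMatch(text, regex) {
  let last = null;
  for (const match of text.matchAll(regex)) {
    last = match;
  }
  if (last === null) {
    return false;
  }
  let end = last.index + last[0].length;
  if (text[end] === '\uFE0F') {
    end += 1; // tolerate the trailing variation selector
  }
  return end === text.length;
}

// Stand-in pattern for illustration only (an assumption).
const standIn = /\p{Extended_Pictographic}/gu;

console.log(endsWithMatch('goal \u26BD\uFE0F', standIn)); // true
console.log(endsWithMatch('goal \u26BD!', standIn));      // false
```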

Some Apple emoji do not match

☺️, 1️⃣, and ❤️ for example.

I'm sure there's a reason for this, but my app needs them to match.

Would you consider an optional parameter that would also match these characters? Maybe something like:

emojiRegex({includeNotTechnicallyEmojiCharacters: true});

I'd be happy to put the PR in, just wanted to check if this is a desirable feature.

README update for ES6

Would be awesome if you could add ES6 syntax to the README for convenience.

Update?

This project seems to be out of date. Any plans to update it to the latest emoji standard? If not, know of an alternative library?

Does not match 🛎 or 🕹

See example here https://runkit.com/4ver/issue-with-emoji-regex-and-matching

Don't work some emojies

🅰️🅱 - \uD83C\uDD70\uFE0F\uD83C\uDD71\uFE0F
👁 - \uD83D\uDC41

Why?

does not match 👁 and 🕶 on MacOS Mojave

Tested the regex expression using http://regex101.com/. Can closeout if it is not an actual issue.

Some profession emojis don't match

Hi Mathias,

Profession emojis which include gender type don't match, such as:
https://emojipedia.org/female-construction-worker/
https://emojipedia.org/female-health-worker/

Is it about the generated regex or emoji version or anything else?

Thanks!

browser compatibility

Since the project specifically mentions Node.js, I wanted to confirm whether this can be used on the client side (i.e. major desktop and mobile browsers).

My use case is to insert a space next to each character the user types, as soon as they type it. Can this module help me with that, or am I on the wrong track?

Emoji with U+fe0f match only first character

Another emoji that doesn't match properly:

☝️ Index Pointing Up U+261d U+fe0f

It matches only the first character, ☝.

The regex matches number string like "1"

import emojiRegex from 'emoji-regex';
console.log(emojiRegex().test('1'));     // output: true

add a regexp to just match emojis, complete emojis

@mathiasbynens, would it be possible to have a regexp that matches only complete emoji? For example:
0 is not 0️⃣
🐱‍👤 is 🐱‍👤
🐱‍👤 is not 🐱+👤
Thanks!

IE 10 IE 11 doesn't support this regex

Optimize output for multi-code-point emoji

I have a patch ready for this, that I’ll commit once @alexeld bumps the regex-trie version number.

With this change, the index.js file size goes from 7079 bytes down to 2867 bytes! 👍

Not sure if this regex does what you want

I was looking through some project regex and found this one here. It seems to be trying to use \p{ to express a unicode value but this notation is not supported in the JS regex dialect and will most likely not behave as expected when interacting with the u flag.
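
For reference, `\p{…}` property escapes have been part of ECMAScript since ES2018, but only when the `u` (or `v`) flag is set. Without that flag the same source text is parsed differently, which is likely what a checker flagging this notation is warning about. A short sketch using only built-in syntax:

```javascript
// Without the u flag, "\p" is just an escaped "p" and the braces are literal.
const withoutU = /\p{Emoji_Presentation}/;
console.log(withoutU.test('p{Emoji_Presentation}')); // true
console.log(withoutU.test('\u{1F600}'));             // false

// With the u flag, \p{...} is a Unicode property escape.
const withU = /\p{Emoji_Presentation}/u;
console.log(withU.test('p{Emoji_Presentation}')); // false
console.log(withU.test('\u{1F600}'));             // true
```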

emoji-regex/text match string contains number

When the string contains a number, emoji-regex/text matches it.

version 6.4.0

var emojiRegex = require('emoji-regex/text');
const matchExpected = emojiRegex().exec('foo');
console.log(matchExpected)
// null

const matchUnExpected = emojiRegex().exec('foo123');
console.log(matchUnExpected)
// [ '1', index: 3, input: 'foo123' ]

does not match ⌫

the regex does not match this special character: https://www.compart.com/en/unicode/U+232B
Julien

Some emojis ending with `\ufe0f` are not completely matched

The male detective emoji, 🕵️ "\u{1f575}\ufe0f", is not fully consumed when matched with the emoji regex: \ufe0f is left behind. The emoji was typed with the Control+Cmd+Space shortcut on a Mac.

"\u{1f575}\ufe0f".replace(emojiRegex(), "").length
//> 1

Google Sheets functionality?

Hi Mathias — I have scoured the internet and this project seems very close to what I am trying to solve with Google Sheets. When I export my spreadsheets to a third party app that crunches the data, it will always spit out an error code if my written content has emojis. The app isn't coded to parse them!

I'm looking for a way to regex Find/Replace all emojis from Google Sheets. I am so sorry if this is not exactly an "issue," but I am very new to this—please take pity!

Using regex to replace emoji characters

When trying to use emoji-regex to replace certain characters, some data is left in the resulting string:

"❤️".replace(emojiRegex(), ' ').length; // 2

Although it will match, it doesn't appear to match the entire "string".

Many sequences are only being partially matched

This seems to be a similar case to #13, so possibly it's a more-specific instance of devongovett/regexgen#10.

Sequences such as 👩‍👧‍👦 (U+1F469 U+200D U+1F467 U+200D U+1F466 aka "family: woman, girl, boy") match all but the last symbol:

const wgb = '\u{1F469}\u200D\u{1F467}\u200D\u{1F466}';
wgb.match(emojiRegex());
/*
Result:
[
  '👩‍👧',  U+1F469 U+200D U+1F467
  '👦'   U+1F466
]
*/
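
One way a result like this can arise in general is alternation order: a JavaScript regex tries alternatives left to right and takes the first one that matches, so a shorter sequence listed before a longer one that starts the same way wins. The sketch below uses two tiny hand-written patterns (assumptions for illustration, not the generated emoji-regex pattern) to show the effect; it does not claim that this is the root cause of the failures reported here.

```javascript
const wgb = '\u{1F469}\u200D\u{1F467}\u200D\u{1F466}'; // family: woman, girl, boy

// Shorter alternative listed first: it wins, and the rest is matched separately.
const shortFirst =
  /\u{1F469}\u200D\u{1F467}|\u{1F469}\u200D\u{1F467}\u200D\u{1F466}|\u{1F466}/gu;

// Longer alternative listed first: the whole sequence is one match.
const longFirst =
  /\u{1F469}\u200D\u{1F467}\u200D\u{1F466}|\u{1F469}\u200D\u{1F467}|\u{1F466}/gu;

const partial = wgb.match(shortFirst);
const whole = wgb.match(longFirst);

console.log(partial.length); // 2 (U+1F469 U+200D U+1F467, then U+1F466)
console.log(whole.length);   // 1 (the full ZWJ sequence)
```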

I decided to check all sequences using the same looping tests as the other symbols:

        // Test a ZWJ emoji sequence (`emoji-zwj-sequences.txt`).
        test('\u{1F3CA}\u{1F3FD}\u200D\u2640\uFE0F');

+       // Test all emoji sequences
+       const sequences = require('unicode-tr51/sequences.js');
+       for (const sequence of sequences) {
+               test(sequence);
+       }

This produces 86 failures, all to do with partial matches.

doesn't match 🌶

ES6 import

Is it possible to write

import emojiRegex from 'emoji-regex'

instead of

const emojiRegex = require('emoji-regex')

?
If so, could you add this to the readme! Cheers!!

wrapping emojis, possible?

Hi! I'm trying to wrap emoji in a span tag, but some emoji break.

When I do

console.log('🐱‍👤'.replace(emoji_text_regexp, '$1'))

I get

🐱‍👤

instead of

🐱‍👤

Is it possible to use this library to wrap emoji together with their modifiers? Thanks!
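
A note on the replacement string: `$1` refers to the first capture group, while `$&` refers to the whole match, so `$&` is the usual choice when the pattern has no groups. A minimal sketch, assuming a hypothetical helper `wrapMatches`, a hypothetical `emoji` class name, and a stand-in pattern that keeps ZWJ sequences together:

```javascript
// Hypothetical helper: wrap every match in a span. "$&" inserts the whole match.
function wrapMatches(text, regex) {
  return text.replace(regex, '<span class="emoji">$&</span>');
}

// Stand-in pattern for illustration only (an assumption): a pictographic
// character, an optional U+FE0F, and any number of ZWJ-joined continuations.
const standIn =
  /\p{Extended_Pictographic}\uFE0F?(?:\u200D\p{Extended_Pictographic}\uFE0F?)*/gu;

const input = 'a \u{1F431}\u200D\u{1F464} b'; // cat + ZWJ + bust in silhouette
console.log(wrapMatches(input, standIn));
// The whole ZWJ sequence ends up inside a single span.
```

Whether the sequence stays in one span depends entirely on the pattern matching it as a single unit; the wrapping step itself does not split matches.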

Match doesn't include entire emoji for some emojis.

When matching a string for emojis, the returned match does not include the entire emoji.

For example,

let emojiString = 'peace hand -> ✌️';
let match = EmojiRegex.exec(emojiString);

console.log('✌️' === match[0]); // false

Instead match[0] returns

✌

The match doesn't include the entire emoji.
This emoji consists of two 16-bit code units (U+270C followed by U+FE0F), but for some emoji only the first one is matched and returned, which produces these odd-looking emoji in browsers.

(A couple of minutes in https://thekevinscott.com/emojis-in-javascript/ , and I can pretend to know anything about emojis 👍 )

support for modifiers/variations

hi,

is there any chance this project will support variations?

What we are using it for is to check whether a text has emoji in it, and then wrap them in spans with the proper fonts so they render properly in more browsers (e.g. Edge).

Because variations are not supported, it treats them as two characters, with mixed results in various browsers (Firefox displays it properly, Chrome renders two characters).

cheers

Package name dependency

First, I want to thank you for this great project :)
I am using it for emoji usage analysis!

There is a small problem when installing, caused by the module and my own package having the same name (emoji-regex).

When I tried to install with "npm install emoji-regex", I got:
code ENOSELF
npm ERR! Refusing to install package with name "emoji-regex" under a package
npm ERR! also called "emoji-regex". Did you name your project the same
npm ERR! as the dependency you're installing?
npm ERR!
npm ERR! For more information, see:
npm ERR! https://docs.npmjs.com/cli/install#limitations-of-npms-install-algorithm

npm ERR! A complete log of this run can be found in:

I'm writing this because there is no existing issue about it, and for anyone experiencing the same problem:
Changing your own project's name in package.json to something other than emoji-regex (or simply cloning again under a different project name, which is easiest) makes the install work.

Free to comment!

some emoji can't match

You can try '\u263a'.match(regex) and '\u263a\ufe0f'.match(regex).

Result:

'\u263a\ufe0f' matches, but '\u263a' doesn't.

A common emoji (U+2764) cannot match

This heart is rendered black, but in WeChat, when you type "xin" (which means "heart"), you get a red heart. Its code point is U+2764.

How can I use this library to match all-emoji strings? i.e. strings consisting of emoji characters only

I'm writing a chat-like application, and I need to check if a string consists only of Emoji characters, so that I could render the emoji in large text.

Currently I'm using a function like the following to do this:

function isEmojiOnly(text, emojiRegex) {
  const textOnly = text.replace(/\s/g, "")
  return textOnly.replace(emojiRegex, "") === ""
}

isEmojiOnly("Hello", emojiRegex())    // false
isEmojiOnly("Hello 🙂", emojiRegex()) // false (since it also contains other non-emoji characters)
isEmojiOnly("👍", emojiRegex())       // true
isEmojiOnly("👍👏", emojiRegex())     // true

Does the emoji-regex library offer any provision to do this more optimally?
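
The source does not describe a built-in helper for this. One alternative to stripping matches is to wrap the pattern's source in an anchored group, so a single `test` call answers the question. This is a sketch under assumptions: `isEmojiOnly` here is a hypothetical reimplementation, and the pattern is a small stand-in for the one returned by `emojiRegex()`.

```javascript
// Hypothetical helper: true when the text (ignoring whitespace) consists of
// one or more matches of emojiPattern and nothing else.
function isEmojiOnly(text, emojiPattern) {
  const stripped = text.replace(/\s/g, '');
  // Reuse the pattern's flags minus "g", so test() is stateless.
  const flags = emojiPattern.flags.replace('g', '');
  const anchored = new RegExp(`^(?:${emojiPattern.source})+$`, flags);
  return anchored.test(stripped);
}

// Stand-in pattern for illustration only (an assumption).
const standIn = /\p{Extended_Pictographic}(?:\uFE0F|\p{Emoji_Modifier})?/gu;

console.log(isEmojiOnly('Hello', standIn));              // false
console.log(isEmojiOnly('Hello \u{1F642}', standIn));    // false
console.log(isEmojiOnly('\u{1F44D}', standIn));          // true
console.log(isEmojiOnly('\u{1F44D}\u{1F44F}', standIn)); // true
```

Unlike the replace-based version, this sketch returns false for an empty or whitespace-only string. In real code the anchored regex would be built once and cached rather than on every call.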