Comments (10)
@NeelShah18
Hey
I'm running into the same error.
Code throwing the error:
def replace_emoticon(line):
    emoticons = emot.emoticons(line)
    try:
        values = emoticons["value"]
        while len(values) != 0:
            value = values.pop(0)
            emoji = render_emoji(value)  # other function; replaces emoticon with emoji if possible
            if emoji is not None:
                line = line.replace(value, emoji)
    except TypeError:
        print("emoji error")
        print(line)
        input()
    return line
The input lines have the shape: id \t sentiment \t text (German)
Example:
375830740166246400 \t neutral \t Experte befürchtet höhere Benzinpreise wegen Meldestelle URL
As stated, emot.emoticons(text) returns only [{'flag': False}].
The same happens if you pass just the text part, even if you copy it into an editor and pass it to emot directly in the Python interpreter:
emot.emoticons("neutral Experte befürchtet höhere Benzinpreise wegen Meldestelle URL")
[{'flag': False}]
I'm using Python 3.6 and everything is encoded as UTF-8, so encoding should not be the problem.
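The TypeError caught in the snippet above is consistent with the result being a list ([{'flag': False}]) while the code indexes it like a dict. A minimal caller-side sketch, assuming the success case is a dict with a "value" list, as the snippet implies; `emoticon_values` is a hypothetical helper, not part of emot:

```python
def emoticon_values(result):
    """Return the list of detected emoticons, or [] if none were found.

    `result` is whatever emot.emoticons(text) returned. Based on this thread,
    it is assumed to be either a dict with a "value" list or the list
    [{'flag': False}] when nothing matched.
    """
    if isinstance(result, dict):
        # Copy the list so the caller can pop from it safely.
        return list(result.get("value", []))
    return []


# The two result shapes discussed in this thread:
print(emoticon_values([{"flag": False}]))
print(emoticon_values({"value": [":)"], "flag": True}))
```

With a helper like this, the loop in replace_emoticon would simply iterate over an empty list when nothing is detected, and the try/except would no longer be needed for this case.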
from emot.
Thank you for the issue. Can you please provide the code to reproduce the error?
from emot.
@PaprikaSteiger Yes, that's true. I just checked the Unicode library and "neutral" is not in the list. I believe unicode_library is a little old and I need to add more data to it. The only reason it cannot detect "neutral" is that it is not in the list. If you can tell me which emoticons are most commonly used in German, I can add them.
At present I am working on speeding up this library for large-scale analysis, and I am also planning to add more data there.
To solve this issue, you can add the keyword to the dictionary in https://github.com/NeelShah18/emot/blob/master/emot/emo_unicode.py and it will be detected automatically. One dictionary is for emoticons and one is for emoji.
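As a rough illustration of that suggestion, the sketch below adds entries to an emoticon-to-description dictionary. `EMOTICONS_EXAMPLE` is a stand-in with an assumed shape, not the real dictionary in emo_unicode.py, and the descriptions are placeholders:

```python
# Stand-in for an emoticon -> description dictionary (assumed shape only).
EMOTICONS_EXAMPLE = {
    ":)": "Happy face or smiley",
}

# Hypothetical additions; the descriptions are illustrative guesses.
extra_emoticons = {
    "-.-": "Annoyed",
    ":S": "Confused",
}

# Merge the new entries into the dictionary.
EMOTICONS_EXAMPLE.update(extra_emoticons)
print(sorted(EMOTICONS_EXAMPLE))
```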
from emot.
@ViajeroHerrante this is the same issue for you as well. If you can give me the most common sources of emoticons, I can add them to the dictionary. I am also working on optimizing the library for large-scale analysis in real time. Please suggest a good source of emoticons and emoji. Thank you.
I really appreciate your interest in the library. I am working to solve that problem as well.
from emot.
@NeelShah18
The Tweet collection I used can be found here:
https://github.com/WladimirSidorenko/CGSA/blob/master/data/SB10k/corpus_v1.0.cgsa.tsv
The tweets contain some emoticons (like -.- o0 :S), which I personally use as well, that are not in the dictionary. I guess you could just feed in the text of the tweets and see which emoticons it detects. As a check, I used this regex to detect suspicious patterns:
r"(^.*?)([[\]^`'?´¨}$£ö=)(/&%*\"+¦@#¬|¢{\-_:;,.<>\\]{3,7})((?:\s.*$)|$)"
# The "ö" appeared in the linked Twitter corpus, though I don't think it is used that often in emoticons.
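A minimal sketch of how that regex could be applied to a single line of text. The pattern is the one quoted above, except that the leading "[" inside the character class is escaped, an adjustment made here to avoid Python's nested-set warning; `find_suspicious` is a hypothetical helper name:

```python
import re

# Pattern from the comment above; only the first "[" inside the character
# class has been escaped (an adjustment made for this sketch).
SUSPICIOUS = re.compile(
    r"(^.*?)([\[\]^`'?´¨}$£ö=)(/&%*\"+¦@#¬|¢{\-_:;,.<>\\]{3,7})((?:\s.*$)|$)"
)


def find_suspicious(text):
    """Return the first run of 3 to 7 symbol characters that is followed by
    whitespace or the end of the line, or None if there is no such run."""
    match = SUSPICIOUS.match(text)
    return match.group(2) if match else None


print(find_suspicious("Das nervt -.- total"))
print(find_suspicious("Experte befürchtet höhere Benzinpreise"))
```

Note that the pattern only covers runs of symbol characters, so emoticons containing letters or digits, such as o0 or :S, are not caught by it.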
In the end, I chose a different approach for my personal task. Now I use a json file mapping emoticons to corresponding emoji, in an attempt to standardize emoticons. Something like that would be a nice feature, too. I could provide you with the small json file I started.
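The mapping approach can be sketched roughly as follows. The JSON entries are illustrative only and are not the contents of the file mentioned above; `standardize` is a hypothetical helper that replaces whole whitespace-delimited tokens:

```python
import json

# Illustrative entries only; not the actual mapping file from this thread.
MAPPING_JSON = '{":D": "😀", ":(": "🙁", "<3": "❤"}'


def standardize(text, mapping):
    """Replace each space-delimited token that is a known emoticon."""
    return " ".join(mapping.get(token, token) for token in text.split(" "))


mapping = json.loads(MAPPING_JSON)
print(standardize("Das ist super :D", mapping))
```

Replacing whole tokens rather than substrings is a design choice in this sketch: it avoids touching character sequences that merely occur inside ordinary words.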
from emot.
@PaprikaSteiger yes, that would be a great help.
from emot.
Hey, what's up! Sorry for the absence and for not answering; I was too busy and forgot about it. Below are all the words or character sequences that give the error. Best regards, and I hope you are all OK. :D
good
tooth
fooled
book
poor
bluetooth
too
looks
cool
)
d807
expect
bluetooths
:
-good
expectations
oozes
experienced
tool
look
bluetoooth
good7
room
expected
experience
poorly
sooner
inexpensive
boost
wood
looking
expensive
tools
waterproof
took
shooters
loop
reboots.overall
smoothly
explain
good.4
soon
supertooth
indoors
smoother
wooden
floor
boot
booking
good..
loops
looses
loose
hook
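Many of the strings above contain "oo" or "xp" inside ordinary words. One plausible explanation, not confirmed in this thread, is that short emoticon patterns are matched anywhere in the text. A sketch of matching emoticons only as standalone tokens, using a hypothetical emoticon list rather than emot's data:

```python
import re

# Hypothetical emoticon list for illustration; not emot's actual data.
EMOTICON_LIST = ["oo", "xp", ":)", ":D"]

# Match an emoticon only when it is not attached to other non-space
# characters on either side.
STANDALONE = re.compile(
    r"(?<!\S)(?:" + "|".join(re.escape(e) for e in EMOTICON_LIST) + r")(?!\S)"
)


def find_standalone(text):
    """Return all emoticons that appear as whitespace-delimited tokens."""
    return STANDALONE.findall(text)


print(find_standalone("bluetooth looks good, as expected"))
print(find_standalone("nice oo :)"))
```

The first call finds nothing because every "oo" and "xp" sits inside a word; the second finds both standalone tokens.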
from emot.
@NeelShah18 I think the issue is in the exception handling here and here: instead of using append, we could set __entities["flag"] = False,
or do what was done here.
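The idea of setting a flag on the result instead of appending to it can be sketched as follows. This is a hypothetical stand-in, not emot's code: `detect` and its `patterns` argument are assumptions made for illustration:

```python
def detect(text, patterns):
    """Return a dict of the same shape whether or not detection succeeds."""
    entities = {"value": [], "flag": False}
    try:
        tokens = text.split()
        entities["value"] = [p for p in patterns if p in tokens]
        entities["flag"] = bool(entities["value"])
    except (AttributeError, TypeError):
        # Keep the dict shape on failure; only the flag signals the problem.
        entities["value"] = []
        entities["flag"] = False
    return entities


print(detect("hi :)", [":)"]))
print(detect(None, [":)"]))
```

Because both paths return a dict, callers can always read result["value"] and result["flag"] without checking the type first.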
from emot.
@PaprikaSteiger yes, that would be a great help.
Sorry for my delay as well. Here is the emoticon-to-emoji dictionary I ended up working with.
The original is an emoticon-to-emoji map for JavaScript: https://github.com/banyan/emoji-emoticon-to-unicode [06.10.2020]
I changed the formatting and added some other emoticons.
I hope it is of some help.
All the best
paprikasteiger_emoticon_emoji.zip
from emot.
@PaprikaSteiger , @KevinTeukengBecho @ViajeroHerrante Yep, I investigated and decided to move to a new type of detection to avoid this issue. We also have a new emoticon dataset and a new template that makes it easy to add new emoji or emoticons.
For now, I have pushed some changes to the new branch "version_3.0".
I will try to finish it this week and publish the new version.
Hope this helps. Sorry for the late reply.
from emot.