Comments (28)
@melones did you ever resolve this? I am having the same exact issue.
from gammu.
See also http://serverfault.com/questions/549277/postgresql-crashes-gammu-smsd-with-encoding-error
from gammu.
The problem probably lies in the way Gammu stores Unicode characters internally: unfortunately it stores them as UCS-2-BE, which cannot represent newer characters that need more than two bytes.
from gammu.
Can we set some replacement char (e.g. "?") in gammu to prevent the infinite loop when such a situation happens?
from gammu.
Actually, UCS2 is used in the SMS itself as well, so the character is probably encoded somehow, and Gammu should probably detect this when encoding the string to UTF-8...
from gammu.
Okay, even though the specification says UCS2, it seems that current phones actually use UTF-16 instead. Gammu stores this internally just fine; the only problem is in converting it to UTF-8.
from gammu.
So basically the EncodeUTF8 function needs to be adjusted to actually decode UTF-16 surrogate pairs, see
https://en.wikipedia.org/wiki/UTF-16#Code_points_U.2B010000_to_U.2B10FFFF
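For illustration, the adjustment described above could look roughly like the following. This is a minimal Python sketch of the general technique (combine a surrogate pair into one code point, then emit UTF-8), not Gammu's actual C code; the function name `utf16_units_to_utf8` and the choice of U+FFFD for unpaired surrogates are assumptions made for the example.

```python
def utf16_units_to_utf8(units):
    """Convert a sequence of 16-bit UTF-16 code units to UTF-8 bytes.

    Hypothetical helper for illustration only; not part of Gammu.
    """
    out = bytearray()
    i = 0
    while i < len(units):
        cp = units[i]
        i += 1
        if 0xD800 <= cp <= 0xDBFF and i < len(units) and 0xDC00 <= units[i] <= 0xDFFF:
            # High surrogate followed by low surrogate: one code point
            # in the range U+10000..U+10FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        elif 0xD800 <= cp <= 0xDFFF:
            # Unpaired surrogate: substitute U+FFFD (an assumption of this sketch).
            cp = 0xFFFD
        if cp < 0x80:
            out.append(cp)
        elif cp < 0x800:
            out += bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
        elif cp < 0x10000:
            out += bytes([0xE0 | (cp >> 12),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
        else:
            out += bytes([0xF0 | (cp >> 18),
                          0x80 | ((cp >> 12) & 0x3F),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
    return bytes(out)
```

With this, the code units D83D DC4D (the thumbs-up emoji) become the four bytes F0 9F 91 8D rather than two separately encoded surrogates.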
from gammu.
Ok, can we count on you with that adjustment? :)
from gammu.
I will try to find time to look deeper into this next week, but I cannot promise anything right now.
from gammu.
Should be fixed in git; it would be great if somebody could verify this in the real world.
from gammu.
Great! Thanks nijel! I will verify that next week.
from gammu.
Hello nijel,
it seems that the bug wasn't completely fixed. I have set up a test environment, compiled the latest gammu-smsd 1.34.0 and tested it again.
After sending the "thumbs up" icon (the same as in the first report in this bug) from a smartphone to gammu-smsd, I get the following error:
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Received message from: +481234567
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Read 1 messages
Thu 2015/01/29 09:51:59 gammu-smsd[4424]: Execute SQL: INSERT INTO inbox ("ReceivingDateTime", "Text", "SenderNumber", "Coding", "SMSCNumber", "UDH", "Class", "TextDecoded", "RecipientID") VALUES ('2015-01-29 08:41:54 GMT', 'E00E', '+481234567', 'Unicode_No_Compression', '+48790998250', '', -1, 'Ç ', 'phone1')
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Error: ERROR: invalid byte sequence for encoding "UTF8": 0xf08e808e
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: SQL failure: 2
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Error writing to database (SMSDSQL_SaveInboxSMS)
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Error processing SMS: Unknown error (UNKNOWN[27])
And gammu-smsd goes into an infinite loop trying to save the SMS message to the database, without success (the same error is continuously written to the log).
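As a side note on the log above: the rejected bytes 0xf08e808e have the shape of a four-byte UTF-8 sequence whose payload bits are the same code point as the 'E00E' in the "Text" column. The sketch below only does that arithmetic; the helper name is hypothetical, and what it suggests about the encoder (a three-byte character written in an overlong four-byte form) is an inference from the log, not something confirmed in this thread.

```python
def payload_of_4byte_sequence(b):
    """Extract the 21 payload bits of a 4-byte UTF-8-shaped sequence.

    Deliberately performs no validity checks, so it also accepts
    overlong forms that a strict decoder rejects.
    """
    if len(b) != 4:
        raise ValueError("expected exactly four bytes")
    return (((b[0] & 0x07) << 18) | ((b[1] & 0x3F) << 12)
            | ((b[2] & 0x3F) << 6) | (b[3] & 0x3F))


rejected = bytes.fromhex("f08e808e")            # bytes from the PostgreSQL error
code_point = payload_of_4byte_sequence(rejected)
canonical = chr(code_point).encode("utf-8")     # shortest valid UTF-8 form

print(hex(code_point), canonical.hex())
```

A strict UTF-8 decoder refuses the four-byte form because the same code point fits in three bytes, which is consistent with the "invalid byte sequence" error.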
from gammu.
Well, with MySQL it's a different issue: its utf8 type cannot handle all UTF-8 chars. They introduced the utf8mb4 type to "fix" this, see https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html
Gammu should probably default to this type...
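To make the limitation concrete: MySQL's utf8 type stores at most three bytes per character, so it covers only code points up to U+FFFF, while anything above that needs utf8mb4. A small sketch of that check follows; the function name `needs_utf8mb4` is made up for this example and is not part of Gammu or MySQL.

```python
def needs_utf8mb4(text):
    """Return True if text contains a character outside the BMP,
    i.e. one that takes four bytes in UTF-8 and so does not fit
    MySQL's three-byte utf8 type."""
    return any(ord(ch) > 0xFFFF for ch in text)


print(needs_utf8mb4("hello"))        # plain ASCII
print(needs_utf8mb4("\U0001F44D"))   # thumbs-up emoji, U+1F44D
```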
from gammu.
Hi nijel,
that's right. But that solution is limited to MySQL databases. All others (especially Postgres) will still suffer from the infinite-loop problem. Maybe this could be fixed more easily with a replacement char '?' for all non-UTF-8 chars?
Something like this (pseudocode):
if (!IsUTF8(character)) then character = '?'
What do you think?
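A slightly more concrete version of that idea, as a Python sketch rather than anything from Gammu itself: decode the bytes leniently, then swap every undecodable piece for the replacement character before the text reaches the database. The function name `sanitize_utf8` is hypothetical, and note one side effect of this simple approach: a genuine U+FFFD already present in the input is also turned into '?'.

```python
def sanitize_utf8(raw, replacement="?"):
    """Return valid UTF-8 bytes, with every undecodable part of
    raw replaced by the given replacement string."""
    # errors="replace" substitutes U+FFFD for each invalid sequence.
    text = raw.decode("utf-8", errors="replace")
    return text.replace("\ufffd", replacement).encode("utf-8")


cleaned = sanitize_utf8(b"ok \xed\xb1\x8d")   # bytes from the second error log
print(cleaned)
```

This would keep the daemon from retrying the same INSERT forever, at the cost of losing the character.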
from gammu.
Postgres will not get this error at all...
This problem is MySQL only: the char is valid UTF-8, but MySQL supports only a subset of UTF-8 by default...
from gammu.
Nijel, I am experiencing this error on a Postgres database. Both error logs attached above (from gammu 1.33 and gammu 1.34) come from gammu-smsd with a Postgres database, not MySQL.
from gammu.
Hmm, actually there is a bug in the code, working on that...
from gammu.
Hi nijel,
I cloned the master branch from the git repo, compiled it and tested the fix. There is still a problem with the infinite loop in gammu-smsd.
I tested this with 2 different smartphones, sending a thumbs-up emoji from each to the gammu-smsd powered device. With the earlier release (1.34.0), gammu-smsd went into an infinite loop after receiving the emoji from either smartphone. With the fixed version 1.34.90 (including the latest commit), the message from the first smartphone is now received successfully, but the message from the second smartphone again causes an infinite loop in gammu-smsd. So we have progress, but the bug is not eliminated completely :)
Here are the details:
Gammu-smsd version 1.34.90
Built 09:32:57 on Feb 13 2015 using GCC 4.1
Gammu-Log from the first smartphone (when the message is received correctly):
Fri 2015/02/13 09:43:15 gammu-smsd[15693]: Read 1 messages
Fri 2015/02/13 09:43:15 gammu-smsd[15693]: Execute SQL: INSERT INTO inbox ("ReceivingDateTime", "Text", "SenderNumber", "Coding", "SMSCNumber", "UDH", "Class", "TextDecoded", "RecipientID") VALUES ('2015-02-13 08:43:10 GMT', 'E00E', '+482234567', 'Unicode_No_Compression', '+482234567', '', -1, '', 'phone1')
Fri 2015/02/13 09:43:15 gammu-smsd[15693]: Execute SQL: UPDATE phones SET "Received" = "Received" + 1 WHERE "IMEI" = '012142342342342'
Fri 2015/02/13 09:43:15 gammu-smsd[15730]: Starting run on receive: /mnt/ramdisk/htdocs/scripts/daemon.sh 39
Fri 2015/02/13 09:43:22 gammu-smsd[15693]: Process finished successfully
Gammu-Log from second smartphone (when gammu-smsd is going crazy):
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Read 1 messages
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Execute SQL: INSERT INTO inbox ("ReceivingDateTime", "Text", "SenderNumber", "Coding", "SMSCNumber", "UDH", "Class", "TextDecoded", "RecipientID") VALUES ('2015-02-13 08:44:13 GMT', 'D83DDC4D', '+481234567', 'Unicode_No_Compression', '+481234567', '', -1, '👍���', 'phone1')
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Error: ERROR: invalid byte sequence for encoding "UTF8": 0xedb18d
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: SQL failure: 2
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Error writing to database (SMSDSQL_SaveInboxSMS)
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Error processing SMS: Unknown error. (UNKNOWN[27])
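For readers following along, the bytes in this second error can be checked by hand. The "Text" column holds the UTF-16 code units D83D DC4D, and the rejected bytes 0xedb18d are exactly what one gets when the low surrogate DC4D is encoded to UTF-8 on its own instead of being combined with D83D first. The Python sketch below only reproduces those byte values; that Gammu produced them this way is an inference from the log.

```python
raw_text = bytes.fromhex("D83DDC4D")   # "Text" column from the INSERT above

# Correct path: decode the surrogate pair as UTF-16, then encode as UTF-8.
correct = raw_text.decode("utf-16-be").encode("utf-8")

# What the error message shows: the low surrogate U+DC4D encoded by itself.
# "surrogatepass" is used only to reproduce the invalid bytes.
lone_low = chr(0xDC4D).encode("utf-8", "surrogatepass")

print(correct.hex(), lone_low.hex())
```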
from gammu.
Have you also changed the MySQL table to be utf8mb4? I believe this will be the problem here now, as the character is valid UTF-8 but cannot fit into MySQL's standard utf8 data type...
from gammu.
No, I'm using PostgreSQL database.
from gammu.
Ah, I've found another bug in the code, fixed it and added a test case for that...
from gammu.
I compiled and tested the package in several scenarios and now all my integration tests pass. You can close the issue. Thanks a lot!
from gammu.
Thanks for verifying that!
from gammu.
Hi @nijel,
Preface: This is more of a request for comments than anything else, but you're welcome to do whatever you want with this patch. Saw the mention of UTF-16 and figured I should post.
A while back, I noticed that the multipart splitting code (in the outgoing Unicode-required case) wasn't UTF-16 aware: it would happily split a UTF-16 surrogate pair in the middle, which means the character is visually destroyed if the recipient is unable to reassemble the message (say, if the UDH was stripped by a telco; I've seen this happen in the wild between the US carriers AT&T and T-Mobile). Most emoji end up as a four-byte UTF-16 surrogate pair, so this issue was pretty easy for us to hit.
There was also a similar set of issues with a subset of joinable characters, specifically where the non-joined components don't make much sense in isolation: for instance, the "regional indicator symbol" flags, or the "combining diacritical marks"; all cases where multiple characters really only make sense when viewed as a single glyph.
This is admittedly a bit of an edge case, but we've been using libgammu in developing-country situations where telco behavior/filtering is highly unpredictable, so it made sense at the time for us to patch it. What we've done is decrease the segment size for a single message segment iff splitting it at exactly 70 bytes would cut a (UTF-16) character in half. It seems to be working well for us so far.
Here's a raw diff against an old version, just so you can see what we're doing:
https://raw.githubusercontent.com/browndav/medic-os/master/platform/source/medic-core-1.6.0/patches/gammu-utf16-sms-multipart.diff
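Independently of that diff, the splitting rule described above can be sketched in a few lines. This is an illustrative Python version, not the linked patch: the names `to_utf16_units` and `split_utf16_units` are hypothetical, the limit is counted in 16-bit code units and left as a parameter, and the sketch only protects surrogate pairs (not regional indicators or combining marks).

```python
import struct


def to_utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return list(struct.unpack(">%dH" % (len(data) // 2), data))


def split_utf16_units(units, limit):
    """Split code units into segments of at most limit units,
    shortening a segment by one unit when the cut would otherwise
    fall between a high and a low surrogate."""
    if limit < 2:
        raise ValueError("limit must be at least 2 to hold a surrogate pair")
    segments = []
    start = 0
    while start < len(units):
        end = min(start + limit, len(units))
        if (end < len(units)
                and 0xD800 <= units[end - 1] <= 0xDBFF
                and 0xDC00 <= units[end] <= 0xDFFF):
            end -= 1  # keep the pair together in the next segment
        segments.append(units[start:end])
        start = end
    return segments


parts = split_utf16_units(to_utf16_units("abc\U0001F44Ddef"), 4)
print([len(p) for p in parts])
```

Each resulting segment decodes on its own, so a recipient that cannot reassemble the parts still sees whole characters.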
I need to port this forward to the latest version for our internal use regardless; if you're at all interested, I'd be happy to create a new issue and send a PR, or fix it up in whatever way's needed to merge.
Thanks!
from gammu.
Let's discuss this separately: #95
from gammu.
The problem came back for multipart messages. More info here: #95
from gammu.
See #281 for that. There is no reason to comment on that in multiple issues...
from gammu.
I got this error when using the MySQL ODBC Unicode Driver on a Windows machine. Solved by changing it to the ANSI Driver.
from gammu.