Comments (28)

DanielViglione commented on May 30, 2024

@melones did you ever resolve this? I am having the same exact issue.

from gammu.

nijel commented on May 30, 2024

See also http://serverfault.com/questions/549277/postgresql-crashes-gammu-smsd-with-encoding-error

nijel commented on May 30, 2024

The problem probably lies in the way Gammu stores Unicode characters internally: it unfortunately stores them as UCS-2-BE, which cannot correctly represent newer characters that need more than two bytes.

melones commented on May 30, 2024

Can we set some replacement char (e.g. "?") in gammu to prevent the infinite loop when such a situation happens?

nijel commented on May 30, 2024

Actually, UCS2 is used in the SMS itself as well, so the character is probably encoded somehow, and Gammu should probably detect this when encoding the string to UTF-8...

nijel commented on May 30, 2024

Okay, even though the specification says UCS2, it seems that current phones actually use UTF-16 instead. Gammu stores this internally just fine; the only problem is in converting it to UTF-8.

nijel commented on May 30, 2024

So basically the EncodeUTF8 function needs to be adjusted to actually decode UTF-16, see
https://en.wikipedia.org/wiki/UTF-16#Code_points_U.2B010000_to_U.2B10FFFF
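
As a rough illustration of the adjustment described above (a sketch only, not Gammu's actual EncodeUTF8; the names put_utf8 and utf16be_to_utf8 are hypothetical), a converter can combine a high surrogate (D800 to DBFF) with the following low surrogate (DC00 to DFFF) into one code point before emitting UTF-8. Replacing an unpaired surrogate with '?' is an assumption made here, borrowed from the replacement-char idea earlier in the thread:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Gammu function): write one code point as UTF-8.
 * Returns the number of bytes written (1 to 4). */
static size_t put_utf8(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Hypothetical converter: reads `units` big-endian UTF-16 code units from
 * `src` and writes NUL-terminated UTF-8 to `dst`, which must hold at least
 * 3 * units + 1 bytes. Unpaired surrogates are replaced with '?' (assumption).
 * Returns the number of UTF-8 bytes written, not counting the NUL. */
size_t utf16be_to_utf8(const unsigned char *src, size_t units, unsigned char *dst)
{
    size_t i = 0, n = 0;

    while (i < units) {
        uint32_t u = ((uint32_t)src[2 * i] << 8) | src[2 * i + 1];
        i++;

        /* High surrogate: try to combine it with the next unit. */
        if (u >= 0xD800 && u <= 0xDBFF && i < units) {
            uint32_t lo = ((uint32_t)src[2 * i] << 8) | src[2 * i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
                i++;
                n += put_utf8(u, dst + n);
                continue;
            }
        }

        /* Any surrogate still standing alone here is unpaired. */
        if (u >= 0xD800 && u <= 0xDFFF) {
            u = '?';
        }
        n += put_utf8(u, dst + n);
    }
    dst[n] = 0;
    return n;
}
```

For the thumbs-up payload D83D DC4D, the pair combines to U+1F44D, which this sketch writes as the four bytes F0 9F 91 8D.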

melones commented on May 30, 2024

Ok, can we count on you with that adjustment? :)

nijel commented on May 30, 2024

I will try to find time to look deeper into this next week, but I can not promise anything right now.

nijel commented on May 30, 2024

Should be fixed in git; it would be great if somebody could verify this in the real world.

melones commented on May 30, 2024

Great! Thanks nijel! I will verify that next week.

melones commented on May 30, 2024

Hello nijel,

It seems that the bug wasn't completely fixed. I have set up a test environment, compiled the latest gammu-smsd 1.34.0 and tested it again.

After sending a "thumbs up" icon (the same as in the first report in this bug) from a smartphone to gammu-smsd, I get the following error:

Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Received message from: +481234567
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Read 1 messages
Thu 2015/01/29 09:51:59 gammu-smsd[4424]: Execute SQL: INSERT INTO inbox ("ReceivingDateTime", "Text", "SenderNumber", "Coding", "SMSCNumber", "UDH", "Class", "TextDecoded", "RecipientID") VALUES ('2015-01-29 08:41:54 GMT', 'E00E', '+481234567', 'Unicode_No_Compression', '+48790998250', '', -1, 'Ç ', 'phone1')
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Error: ERROR: invalid byte sequence for encoding "UTF8": 0xf08e808e
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: SQL failure: 2
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Error writing to database (SMSDSQL_SaveInboxSMS)
Thu 2015/01/29 09:42:51 gammu-smsd[3934]: Error processing SMS: Unknown error (UNKNOWN[27])

And gammu-smsd goes into an infinite loop trying to save the SMS message to the database, without success (the same error is continuously written to the log).

nijel commented on May 30, 2024

Well, with MySQL it's a different issue: its utf8 type cannot handle all UTF-8 characters. They have introduced the utf8mb4 type to "fix" this, see https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html

Gammu should probably default to this type...

melones commented on May 30, 2024

Hi nijel,
that's right. But that solution is limited to MySQL databases. All others (especially Postgres) will still suffer from the infinite-loop problem. Maybe this could be fixed more easily with a replacement char '?' for all non-UTF-8 chars?

Something like this (pseudocode):
if (!IsUTF8(character)) then character = '?'
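
One way to read the pseudocode above, sketched in C under stated assumptions: the check runs on the already-encoded byte string, and every byte that does not begin a well-formed UTF-8 sequence is overwritten with '?'. The names utf8_seq_len and utf8_replace_invalid are hypothetical and are not part of Gammu.

```c
#include <stddef.h>

/* Hypothetical helper: length (1 to 4) of the well-formed UTF-8 sequence
 * starting at s, or 0 if the bytes there are not well-formed. Rejects
 * overlong forms, surrogates (D800 to DFFF) and values above 10FFFF. */
static size_t utf8_seq_len(const unsigned char *s, size_t avail)
{
    unsigned char c = s[0];
    unsigned long cp;
    size_t need, i;

    if (c < 0x80) {
        return 1;
    } else if (c >= 0xC2 && c <= 0xDF) {
        need = 2; cp = c & 0x1F;
    } else if (c >= 0xE0 && c <= 0xEF) {
        need = 3; cp = c & 0x0F;
    } else if (c >= 0xF0 && c <= 0xF4) {
        need = 4; cp = c & 0x07;
    } else {
        return 0; /* stray continuation byte or invalid lead byte */
    }
    if (avail < need) {
        return 0; /* truncated sequence */
    }
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            return 0;
        }
        cp = (cp << 6) | (unsigned long)(s[i] & 0x3F);
    }
    if (need == 3 && cp < 0x800) {
        return 0; /* overlong three-byte form */
    }
    if (need == 4 && (cp < 0x10000 || cp > 0x10FFFF)) {
        return 0; /* overlong four-byte form or out of range */
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return 0; /* surrogates are not valid in UTF-8 */
    }
    return need;
}

/* Hypothetical sanitizer: replaces, in place, each byte that does not start
 * a well-formed sequence with '?'. Returns the number of bytes replaced. */
size_t utf8_replace_invalid(unsigned char *buf, size_t len)
{
    size_t i = 0, replaced = 0;

    while (i < len) {
        size_t n = utf8_seq_len(buf + i, len - i);
        if (n == 0) {
            buf[i] = '?';
            replaced++;
            i++;
        } else {
            i += n;
        }
    }
    return replaced;
}
```

A filter like this would only hide the symptom: the text would reach the database, but the original character would be lost.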

What do you think?

nijel commented on May 30, 2024

Postgres will not get this error at all...

This problem is MySQL only: the char is valid UTF-8, but MySQL supports only a subset of UTF-8 by default...

melones commented on May 30, 2024

Nijel, I experience this error on a Postgres database. Both error logs attached above (from gammu 1.33 and gammu 1.34) come from gammu-smsd with a Postgres database, not MySQL.

nijel commented on May 30, 2024

Hmm, actually there is a bug in the code, working on that...

melones commented on May 30, 2024

Hi nijel,

I cloned the master branch from the git repo, compiled it and tested the fix. There is still a problem with the infinite loop in gammu-smsd.

I tested this issue with 2 different smartphones: I sent a thumbs-up emoji from both smartphones to the gammu-smsd powered device. With the earlier version (release 1.34.0), gammu-smsd went into an infinite loop after receiving the emoji from either smartphone. With the fixed version 1.34.90 (including the latest commit), the message from the first smartphone is now received successfully, but the message from the second smartphone again causes an infinite loop in gammu-smsd. So we have progress, but the bug is not eliminated completely :)

Here are the details:
Gammu-smsd version 1.34.90
Built 09:32:57 on Feb 13 2015 using GCC 4.1

Gammu log from the first smartphone (when the message is received correctly):
Fri 2015/02/13 09:43:15 gammu-smsd[15693]: Read 1 messages
Fri 2015/02/13 09:43:15 gammu-smsd[15693]: Execute SQL: INSERT INTO inbox ("ReceivingDateTime", "Text", "SenderNumber", "Coding", "SMSCNumber", "UDH", "Class", "TextDecoded", "RecipientID") VALUES ('2015-02-13 08:43:10 GMT', 'E00E', '+482234567', 'Unicode_No_Compression', '+482234567', '', -1, '', 'phone1')
Fri 2015/02/13 09:43:15 gammu-smsd[15693]: Execute SQL: UPDATE phones SET "Received" = "Received" + 1 WHERE "IMEI" = '012142342342342'
Fri 2015/02/13 09:43:15 gammu-smsd[15730]: Starting run on receive: /mnt/ramdisk/htdocs/scripts/daemon.sh 39
Fri 2015/02/13 09:43:22 gammu-smsd[15693]: Process finished successfully

Gammu log from the second smartphone (when gammu-smsd is going crazy):
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Read 1 messages
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Execute SQL: INSERT INTO inbox ("ReceivingDateTime", "Text", "SenderNumber", "Coding", "SMSCNumber", "UDH", "Class", "TextDecoded", "RecipientID") VALUES ('2015-02-13 08:44:13 GMT', 'D83DDC4D', '+481234567', 'Unicode_No_Compression', '+481234567', '', -1, '👍���', 'phone1')
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Error: ERROR: invalid byte sequence for encoding "UTF8": 0xedb18d
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: SQL failure: 2
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Error writing to database (SMSDSQL_SaveInboxSMS)
Fri 2015/02/13 09:44:24 gammu-smsd[15693]: Error processing SMS: Unknown error. (UNKNOWN[27])
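
The rejected sequence 0xedb18d in this log is consistent with the low surrogate DC4D from the Text column being encoded on its own as a three-byte sequence, instead of being combined with D83D first. A minimal sketch of that arithmetic (naive_encode_unit is a hypothetical name used for illustration, not a Gammu function):

```c
#include <stdint.h>

/* Hypothetical illustration: encode one 16-bit unit in the range
 * 0x0800 to 0xFFFF as three UTF-8-style bytes WITHOUT checking whether
 * the unit is a surrogate. For surrogates the result is not valid UTF-8. */
void naive_encode_unit(uint16_t u, unsigned char out[3])
{
    out[0] = (unsigned char)(0xE0 | (u >> 12));         /* top 4 bits    */
    out[1] = (unsigned char)(0x80 | ((u >> 6) & 0x3F)); /* middle 6 bits */
    out[2] = (unsigned char)(0x80 | (u & 0x3F));        /* low 6 bits    */
}
```

Feeding DC4D through this yields ED B1 8D, the exact bytes PostgreSQL complains about; a strict UTF-8 decoder rejects them because code points D800 to DFFF are reserved for surrogates.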

nijel commented on May 30, 2024

Have you also changed the MySQL table to utf8mb4? I believe this will be the problem now, as the character is valid UTF-8 but cannot fit into MySQL's standard utf8 data type...

melones commented on May 30, 2024

No, I'm using PostgreSQL database.

nijel commented on May 30, 2024

Ah, I've found another bug in the code, fixed it and added a test case for that...

melones commented on May 30, 2024

I compiled and tested the package in several scenarios and now all my integration tests pass. You can close the issue. Thanks a lot!

nijel commented on May 30, 2024

Thanks for verifying that!

 avatar commented on May 30, 2024

Hi @nijel,

Preface: This is more of a request for comments than anything else, but you're welcome to do whatever you want with this patch. Saw the mention of UTF-16 and figured I should post.

A while back, I noticed that the multipart splitting code (in the outgoing Unicode-required case) wasn't UTF-16 aware – specifically, it would happily split a UTF-16 surrogate pair in the middle, which meant that the character is visually destroyed if the recipient is unable to reassemble the message (say, if the UDH was stripped by a telco – I've seen this happen in the wild between the US carriers AT&T and T-Mobile). Most emoji end up as a four-byte UTF-16 surrogate pair, so this issue was pretty easy for us to hit.

There was also a similar set of issues with a subset of joinable characters, specifically where the non-joined components don't make much sense in isolation: for instance, the "regional indicator symbol" flags, or the "combining diacritical marks"; all cases where multiple characters really only make sense when viewed as a single glyph.

This is admittedly a bit of an edge case, but we've been using libgammu in developing-country situations where telco behavior/filtering is highly unpredictable, so it made sense at the time for us to patch it. What we've done is decrease the segment size for a single message segment iff splitting it at exactly 70 bytes would cut a (UTF-16) character in half. It seems to be working well for us so far.

Here's a raw diff against an old version, just so you can see what we're doing:
https://raw.githubusercontent.com/browndav/medic-os/master/platform/source/medic-core-1.6.0/patches/gammu-utf16-sms-multipart.diff
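
The idea of shrinking a segment when the cut would land inside a surrogate pair can be sketched as below. This is not the linked patch: next_segment_len is a hypothetical name, the limit is passed in as a count of UTF-16 code units (an assumption), and the sketch handles only surrogate pairs, not the combining marks or regional indicator pairs mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many UTF-16 code units, starting at text[pos],
 * go into the next segment of at most max_units units. Requires pos <= len.
 * If the cut would separate a high surrogate from its low surrogate, the
 * segment is shortened by one unit so the pair moves to the next segment. */
size_t next_segment_len(const uint16_t *text, size_t len, size_t pos,
                        size_t max_units)
{
    size_t n = len - pos;

    if (n > max_units) {
        n = max_units;
    }
    /* n > 1 keeps the segment non-empty even when max_units is 1. */
    if (n > 1 && pos + n < len &&
        text[pos + n - 1] >= 0xD800 && text[pos + n - 1] <= 0xDBFF &&
        text[pos + n] >= 0xDC00 && text[pos + n] <= 0xDFFF) {
        n--;
    }
    return n;
}
```

A caller would loop, advancing pos by the returned length until pos reaches len, and build one message part per step.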

I need to port this forward to the latest version for our internal use regardless; if you're at all interested, I'd be happy to create a new issue and send a PR, or fix it up in whatever way's needed to merge.

Thanks!

nijel commented on May 30, 2024

Let's discuss this separately: #95

melones commented on May 30, 2024

The problem came back for multipart messages. More info here: #95

nijel commented on May 30, 2024

See #281 for that. There is no reason to comment on this in more issues...

handiwijoyo commented on May 30, 2024

I got this error when using the MySQL ODBC Unicode Driver on a Windows machine. Solved it by changing to the ANSI Driver.
