
Comments (9)

beckchr commented on September 18, 2024

Could you please check your ID3 version (i.e. which branch in your code is executed)? For ID3v1, text is assumed to be encoded with ISO-8859-1.

from mp3agic.

daniilshevelev commented on September 18, 2024

It's ID3v2.
From what I understand, strings in Java are Unicode (UTF-16) internally...
Is it possible to fix this? I like the library, but this is a showstopper for me, and jig3lib reads the tag properly, although it does have some other issues.


mpatric commented on September 18, 2024

If you supply an extract of the mp3 from the beginning (making sure you get the whole ID3v2 tag), I can take a look. You can mail it directly to me at [email protected]. To get the first 100k (say) from the file, you can use head on Linux or OS X:

head -c 100000 myfile.mp3 > extract.mp3


mpatric commented on September 18, 2024

Hi - I'm looking at the mp3 extract you provided. On a Mac, both Preview and iTunes fail to show the title and artist in Russian. I'm getting "Ñêîëüçêèå Óëèöû" for the title and "Áè-2" for the artist (see attached image too). So I'd think something is not quite right with the encoding of these fields. I'll investigate further.

[Attached image: extract-01]


mpatric commented on September 18, 2024

There are all sorts of things wrong with the ID3v2 tag on this file.

The artist and title fields in this file declare their encoding as 0, which is ISO-8859-1. The byte that determines the encoding is the 11th byte of the frame, i.e. the first byte after the 10-byte frame header: 0 = ISO-8859-1, 1 = UTF-16 with a BOM, 2 = UTF-16BE without a BOM, 3 = UTF-8. ISO-8859-1 cannot represent Cyrillic characters.

The ID3v2 tag itself is declared as version 2.3, which does not support UTF-16BE or UTF-8 (only ISO-8859-1 and UTF-16 with a BOM); the other two were only added in version 2.4.
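
As a rough illustration of the encoding byte described above, here is a minimal Java sketch that maps it to a charset. The class and method names are made up for this example and are not mp3agic's API. It assumes, per the ID3v2.3 and ID3v2.4 specs, that value 1 means UTF-16 with a BOM and that values 2 and 3 are only defined from version 2.4 on.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of mp3agic.
final class Id3TextEncoding {

    /**
     * Maps the text-encoding byte of an ID3v2 text frame to a Java Charset.
     *
     * @param encodingByte first byte of the frame body (0..3)
     * @param minorVersion 3 for ID3v2.3, 4 for ID3v2.4
     */
    static Charset charsetFor(int encodingByte, int minorVersion) {
        switch (encodingByte) {
            case 0:
                return StandardCharsets.ISO_8859_1;
            case 1:
                // UTF-16 with a BOM; Java's "UTF-16" decoder reads the BOM
                // to pick the byte order.
                return StandardCharsets.UTF_16;
            case 2:
                requireV24(encodingByte, minorVersion);
                return StandardCharsets.UTF_16BE;
            case 3:
                requireV24(encodingByte, minorVersion);
                return StandardCharsets.UTF_8;
            default:
                throw new IllegalArgumentException(
                        "Unknown text encoding byte: " + encodingByte);
        }
    }

    // Encodings 2 and 3 are not valid in an ID3v2.3 tag.
    private static void requireV24(int encodingByte, int minorVersion) {
        if (minorVersion < 4) {
            throw new IllegalArgumentException(
                    "Encoding " + encodingByte + " is only defined from ID3v2.4 on");
        }
    }

    public static void main(String[] args) {
        System.out.println(charsetFor(0, 3)); // prints ISO-8859-1
    }
}
```

Note that nothing in this mapping can produce windows-1251, which is why a frame that declares encoding 0 gets decoded as ISO-8859-1 regardless of what the bytes really are.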

I thought perhaps the string was in one of the other encodings and this 11th byte had been set incorrectly; however, there is no byte order mark (BOM), which Unicode strings in an ID3v2.3 tag should include. I tried decoding it as UTF-16LE, UTF-16BE and UTF-8, but it's not any of those.
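
The checks described above could be sketched in Java roughly as follows. The class and method names are hypothetical, and the sample bytes are an assumption: they are what the artist frame would hold if "Áè-2" is the ISO-8859-1 rendering of its content.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of mp3agic.
final class EncodingProbe {

    // True if the text starts with a UTF-16 byte order mark (FF FE or FE FF).
    static boolean hasUtf16Bom(byte[] text) {
        if (text.length < 2) {
            return false;
        }
        int b0 = text[0] & 0xFF;
        int b1 = text[1] & 0xFF;
        return (b0 == 0xFF && b1 == 0xFE) || (b0 == 0xFE && b1 == 0xFF);
    }

    // True if the bytes form well-formed UTF-8 (strict decoding, no replacement).
    static boolean isValidUtf8(byte[] text) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(text));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed raw bytes of the artist field ("Áè-2" when read as ISO-8859-1).
        byte[] artist = {(byte) 0xC1, (byte) 0xE8, '-', '2'};
        System.out.println(hasUtf16Bom(artist)); // prints false
        System.out.println(isValidUtf8(artist)); // prints false
    }
}
```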

I really don't know how jig3lib manages to make sense of this header. There may be some unofficial extension to the spec being used, but it's probably not in accordance with the proper id3v2 spec at http://id3.org.


mpatric commented on September 18, 2024

Ah ha! The encoding appears to be windows-1251 (also known as cp1251).

There is no mention in the official id3 specification of this encoding being valid for id3v2. After a quick search I see there are a few mentions of id3v2 tags encoded with windows-1251 (even though not officially correct). We could build in support for it, but the question is how to detect it.


mpatric commented on September 18, 2024

Building in support for windows-1251 is probably not the right thing to do (even if it's just read support). ID3v2 tags with windows-1251 encoded strings (or any other encoding that's not one of the 4 supported encodings) are not valid.

Furthermore, it appears that programmatically differentiating windows-1251 from ISO-8859-1 is not easy: every byte sequence is valid in both encodings.

Some interesting comments here: http://superuser.com/questions/495775/how-to-translate-wacky-metadata-to-readable-format
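
To illustrate why detection is hard, here is a deliberately crude Java heuristic. It is only a sketch under stated assumptions, not something mp3agic does, and the class name and threshold are invented for this example. It guesses windows-1251 when bytes in the Cyrillic letter range (0xC0 to 0xFF, plus 0xA8 and 0xB8 for Ё and ё) outnumber ASCII letters. Western European text that is mostly accented letters would fool it.

```java
// Hypothetical heuristic for illustration only; it can give false positives.
final class Cp1251Guess {

    // Guesses whether raw tag bytes are more plausibly windows-1251 Cyrillic
    // than ISO-8859-1, by comparing Cyrillic-range bytes with ASCII letters.
    static boolean looksLikeCp1251(byte[] text) {
        int cyrillicRange = 0;
        int asciiLetters = 0;
        for (byte raw : text) {
            int b = raw & 0xFF;
            if (b >= 0xC0 || b == 0xA8 || b == 0xB8) {
                cyrillicRange++;
            } else if ((b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')) {
                asciiLetters++;
            }
        }
        return cyrillicRange > 0 && cyrillicRange > asciiLetters;
    }

    public static void main(String[] args) {
        byte[] artist = {(byte) 0xC1, (byte) 0xE8, '-', '2'}; // "Би-2" in windows-1251
        byte[] cafe = {'C', 'a', 'f', (byte) 0xE9};           // "Café" in ISO-8859-1
        System.out.println(looksLikeCp1251(artist)); // prints true
        System.out.println(looksLikeCp1251(cafe));   // prints false
    }
}
```

A heuristic like this would have to be opt-in, since a valid ISO-8859-1 string made up mostly of accented letters would be misread as Cyrillic.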


daniilshevelev commented on September 18, 2024

Well, to be honest, the song did not have tags to begin with, and I just edited it in Windows Explorer by changing its properties. It does show up correctly in Media Player but not in VLC.
I think it'd be a rather common issue for Windows users. I remember having similar issues with other files in iTunes (which I no longer use after it erased my library after some random update).


mpatric commented on September 18, 2024

You should be able to transcode the windows-1251 string into a proper (UTF-16) Java string. I don't have time to try it now, but something like this, where s is the string as returned by the library:

String utfString = new String(s.getBytes("ISO-8859-1"), "windows-1251");
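
A self-contained version of that one-liner might look like the sketch below. The class and method names are made up for this example. It assumes the library decoded the frame bytes as ISO-8859-1, so converting the string back with the same charset recovers the original bytes unchanged. The input uses Unicode escapes for "Áè-2" to avoid source-file encoding problems.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of mp3agic.
final class Cp1251Repair {

    private static final Charset WINDOWS_1251 = Charset.forName("windows-1251");

    // Re-interprets a string that was wrongly decoded as ISO-8859-1
    // as windows-1251 text.
    static String repair(String misdecoded) {
        // ISO-8859-1 maps every char 0x00..0xFF to the same byte value,
        // so this step recovers the raw bytes from the tag.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, WINDOWS_1251);
    }

    public static void main(String[] args) {
        String artist = repair("\u00C1\u00E8-2"); // "Áè-2" as shown in the thread
        // Equals "\u0411\u0438-2", i.e. "Би-2"; console output depends on
        // the terminal supporting Cyrillic.
        System.out.println(artist);
    }
}
```

Using Charset objects instead of charset names also avoids the checked UnsupportedEncodingException that the String-based overloads declare.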

