Comments (10)
I wonder. #4 was a similar problem that was fixed using a different "locale". You can change the locale using an environment variable. You can set an environment variable for a single program call on the command line:
$ LC_CTYPE=UTF-8 python3 mastodon-backup-to-text.py [email protected]
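For anyone scripting this rather than typing it, here is a minimal sketch of the same idea in Python: the variable is set for one child process only, and the parent environment stays untouched. The helper name `run_with_env` and the `DEMO_VAR` variable are made up for illustration; they are not part of mastodon-archive.

```python
import os
import subprocess
import sys


def run_with_env(extra_env, code):
    """Run a short Python snippet in a child process with extra environment variables."""
    env = dict(os.environ)   # copy, so the parent environment is not modified
    env.update(extra_env)    # overrides apply to this one call only
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


# Hypothetical usage: the child sees the variable, the parent does not.
# run_with_env({"LC_CTYPE": "UTF-8"}, "import os; print(os.environ['LC_CTYPE'])")
```

This mirrors the `VAR=value command` shell form above: nothing is exported, so later commands are unaffected.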
Some background: "locale" is supposed to control how users want to see dates, characters, and the like. The default for your terminal could be C, which is the dumbest variant of them all. Evidently, it is sometimes the default. Check your locale settings! Here's mine:
$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
Thus, if you can't print UTF-8 characters, you probably have LC_CTYPE set to C. When I set it to C on my system, I get a similar error:
$ LC_CTYPE=C mastodon-archive text [email protected]
Traceback (most recent call last):
  File "/usr/local/bin/mastodon-archive", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/mastodon_archive/__init__.py", line 65, in main
    args.command(args)
  File "/usr/local/lib/python3.6/site-packages/mastodon_archive/text.py", line 70, in text
    print("%s boosted" % status["account"]["display_name"])
UnicodeEncodeError: 'ascii' codec can't encode character '\U0001f41d' in position 15: ordinal not in range(128)
I will add this to the README.
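To see the failure without touching any locale settings, here is a small sketch that encodes the same character from the traceback with both codecs, plus a helper that reports which encoding Python picked for output. The names `try_encode` and `describe_output_encoding` are hypothetical. As far as I know, Python 3.7 and later may coerce a C locale to UTF-8, so the reported values can differ from the 3.6 behaviour shown above.

```python
import locale
import sys

bee = "\U0001f41d"  # the character named in the traceback


def try_encode(text, encoding):
    """Return the encoded bytes, or the error reason if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as err:
        return err.reason


ascii_result = try_encode(bee, "ascii")   # fails: ASCII only covers code points 0-127
utf8_result = try_encode(bee, "utf-8")    # succeeds: four bytes


def describe_output_encoding():
    """Report the encodings Python derived from the environment."""
    return {
        "stdout": sys.stdout.encoding,
        "preferred": locale.getpreferredencoding(False),
    }
```

If `describe_output_encoding()` reports an ASCII variant for `stdout`, printing a display name with an emoji in it will fail the same way.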
from mastodon-archive.
New troubleshooting section with macOS-specific information. Let me know if this helps.
Interesting.
My "International" settings look the same as yours in the Terminal profile...
Text encoding: Unicode (UTF-8)
But I still have this:
locale
LANG="en_US.utf8"
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"
What does it look like in your Terminal > Preferences > Encoding tab? I have Unicode (UTF-8) checked there. Should I turn off all others?
No, I have many selected. The ones that are selected here simply appear in "text encoding menus" whatever these are. The important part might be the checkbox below: Set locale environment variables on startup. You have it checked? When I uncheck it and open a new Terminal window:
$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
(I got the idea from here, but perhaps there's more involved?)
The ones that are selected here simply appear in "text encoding menus"
Ah, right. I see that now.
The important part might be the checkbox below: Set locale environment variables on startup. You have it checked?
Yes.
But curiously, if I uncheck the box, and open a new Terminal window, I still have the same category settings:
locale
LANG="en_US.utf8"
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"
Even if I change the order of my system language choices...
- EN / EN-US / FR-FR
- EN-US / EN / FR-FR
- FR-FR / EN / EN-US
I still get the same Terminal values as above. Maybe I need to reboot the machine?
I'm not sure why my LC_ALL value is set to "C", but I suspect that might be a problem here because it overrides all the others, according to the man locale docs. I can't figure out how to edit that to empty, LC_ALL="".
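The override rule can be sketched in a few lines. This is only an illustration of the usual POSIX lookup order (LC_ALL first, then the specific LC_* variable, then LANG, else "C"); the function name `effective_locale` is made up, and it reads from a plain dictionary rather than the real environment.

```python
def effective_locale(category, env):
    """Return the value that wins for one category, following the POSIX lookup order."""
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # unset or empty values are skipped
            return value
    return "C"     # nothing set at all: fall back to the C locale


# With LC_ALL set, the per-category value is ignored.
settings = {"LANG": "en_US.UTF-8", "LC_CTYPE": "en_US.UTF-8", "LC_ALL": "C"}
winner = effective_locale("LC_CTYPE", settings)
```

So as long as LC_ALL is "C", changing LC_CTYPE alone would not be expected to help.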
This might be answering my question.
Crud. Not working. To summarize my status...
Sys Prefs > Language & Region > Preferred languages: EN-US, FR-FR, EN
Terminal > Preferences > Profile > Advanced > International > Text encoding: Unicode (UTF-8)
and "Set locale environment variables on startup" is checked.
On the command line, I used the export command to create these:
env | grep '^LC_'
LC_ALL=en_US.utf8
LC_CTYPE=en_US.utf8
But locale still just outputs this:
locale
LANG="en_US.utf8"
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"
Finally, this does not work, apparently:
LC_CTYPE=UTF-8 python3 mastodon-backup-to-text.py [email protected]
Traceback (most recent call last):
  File "mastodon-backup-to-text.py", line 87, in <module>
    status["created_at"]))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)
Nor does it work if I use LC_CTYPE=en_US.utf8 instead.
It's a can of encod-a-lingo worms.
The only thing I haven't tried is putting the following at the end of my .bash_profile, which I think was suggested somewhere I read:
LC_ALL=en_US.utf8
LC_CTYPE=en_US.utf8
And this seems to suggest a reboot is indeed needed on system language changes, so I'll explore that later.
A system reboot should not be necessary, as long as you are fiddling with environment variables. If you put it in your .bash_profile (or your .bashrc) I think simply opening a new Terminal window should do it. My intuition tells me that capitalization and spelling might be more important, though. You use LC_CTYPE=en_US.utf8 but what if you used LC_CTYPE=en_US.UTF-8 instead, in your experiments?
Too bad you already verified that LC_CTYPE=UTF-8 python3 mastodon-backup-to-text.py [email protected] doesn't work for you. This was my only hope.
I don't think setting up the Mac System Language will make a difference.
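One way to test the capitalization question is to ask the C library directly whether it knows a given locale name. The sketch below does that with Python's `locale.setlocale`, which raises `locale.Error` for names the system does not support. The function name `locale_is_available` is hypothetical, and which names are accepted depends entirely on the system, so no results are claimed here.

```python
import locale


def locale_is_available(name):
    """Return True if the system accepts `name` for the LC_CTYPE category."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Restore whatever was active before the probe.
        locale.setlocale(locale.LC_CTYPE, saved)


# Hypothetical usage: compare the two spellings from this thread.
# for candidate in ("en_US.utf8", "en_US.UTF-8"):
#     print(candidate, locale_is_available(candidate))
```

If one spelling comes back False, that would be consistent with the locale command still reporting "C" despite the exported variables.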
I decided to get rid of string printing once and for all. 5c2c21f introduces a different solution which will always force UTF-8 output, no matter what the system says about your terminal. If you want to give it a try, you need to install version 0.0.3. Please be aware that the installation instructions changed, and the calling conventions changed! I think you'll need to do the following:
# delete all the mastodon-backup*.py files you previously installed
pip3 install mastodon-archive
# in the correct directory
mastodon-archive text [email protected]
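The commit itself is not reproduced here. As a rough sketch of what "always force UTF-8 output" can look like in general, the function below bypasses the locale-dependent text layer and writes encoded bytes to the binary stream underneath standard output. The name `print_utf8` is made up for illustration and is an assumption, not the code in mastodon-archive.

```python
import sys


def print_utf8(text, stream=None):
    """Write `text` plus a newline as UTF-8 bytes, ignoring the locale's encoding."""
    if stream is None:
        stream = sys.stdout.buffer  # the binary stream underneath sys.stdout
    stream.write((text + "\n").encode("utf-8"))
    stream.flush()


# Hypothetical usage, mirroring the line that failed in the traceback:
# print_utf8("%s boosted" % "some display name \U0001f41d")
```

Because the bytes are produced explicitly, LC_CTYPE no longer decides whether the write succeeds. Whether the terminal then renders the characters correctly is a separate matter.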
That worked. Thanks! Nice changes.