rexport's Introduction

Export your personal Reddit data: saves, upvotes, submissions etc. as JSON.

Setting up

  1. The easiest way is pip3 install --user git+https://github.com/karlicoss/rexport.

    Alternatively, use git clone --recursive, or git pull && git submodule update --init. After that, you can run pip3 install --editable . from the repository root.

  2. To use the API, you need to register a custom 'personal script' app and get the client_id and client_secret parameters.

    See more here.

  3. To access a user's personal data (e.g. saved posts/comments), the Reddit API also requires a username and password.

    Yes, unfortunately it wants your plaintext Reddit password; you can read more about it here.

Exporting

Usage:

Recommended: create secrets.py keeping your api parameters, e.g.:

username = "USERNAME"
password = "PASSWORD"
client_id = "CLIENT_ID"
client_secret = "CLIENT_SECRET"

If you have two-factor authentication enabled, append the six-digit 2FA token to the password, separated by a colon:

password = "PASSWORD:343642"

The token will, however, be short-lived.

After that, use:

python3 -m rexport.export --secrets /path/to/secrets.py

That way you type less and have control over where you keep your plaintext secrets.
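
Before running a long export, it can help to check that the secrets file defines everything it should. The following is a minimal sketch and not how rexport itself reads the file: load_secrets and REQUIRED are hypothetical names, and the only thing taken from above is the four variable names.

```python
import runpy
from pathlib import Path

# The four variable names shown in the secrets.py example above.
REQUIRED = ("username", "password", "client_id", "client_secret")


def load_secrets(path):
    """Execute a secrets.py-style file and return its four API parameters.

    Hypothetical helper for illustration; rexport may load the file differently.
    """
    # run_path executes the file and returns its top-level names as a dict.
    namespace = runpy.run_path(str(Path(path).expanduser()))
    missing = [name for name in REQUIRED if not namespace.get(name)]
    if missing:
        raise ValueError(f"missing or empty in {path}: {', '.join(missing)}")
    return {name: namespace[name] for name in REQUIRED}
```

Calling load_secrets("/path/to/secrets.py") either returns a dict with the four values or raises a ValueError naming what is missing. Note that it executes the file, so only point it at files you wrote yourself.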

Alternatively, you can pass parameters directly, e.g.

python3 -m rexport.export --username <username> --password <password> --client_id <client_id> --client_secret <client_secret>

However, this is verbose and prone to leaking your keys/tokens/passwords in shell history.

You can also import export.py as a module and call the get_json function directly to get the raw JSON.

I highly recommend checking exported files at least once just to make sure they contain everything you expect from your export. If not, please feel free to ask or raise an issue!

API limitations

WARNING: the Reddit API limits your queries to 1000 entries.

I highly recommend backing up regularly and keeping old exports. An easy way to do this is a command like:

python3 -m rexport.export --secrets /path/to/secrets.py >"export-$(date -I).json"

Alternatively, you can use arctee, which automates this.
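
Keeping old exports pays off because items that have dropped out of the newest 1000 may still be present in an older file. Here is a rough sketch of combining several exports. It assumes each export is a JSON object with one list per category (such as "saved") and that every item carries an "id" field; both are assumptions about the file layout, and merge_exports is a hypothetical helper rather than part of rexport.

```python
import json


def merge_exports(paths, what="saved"):
    """Combine one category across several export files.

    Assumes each file is a JSON object like {"saved": [{"id": ...}, ...]}.
    When the same id appears more than once, the copy from the later file wins.
    """
    merged = {}
    # Date-stamped names such as export-2020-01-31.json sort chronologically.
    for path in sorted(paths):
        with open(path, encoding="utf-8") as fo:
            data = json.load(fo)
        for item in data.get(what, []):
            merged[item["id"]] = item  # later files overwrite earlier copies
    return list(merged.values())
```

For example, merge_exports(glob.glob("export-*.json")) would return every saved item seen in any of the files, once each.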

Check out these links if you're interested in getting older data that's inaccessible via the API:

Example output

See example-output.json; it contains some example data you might find in your export. I've cleaned it up a bit, as the raw output has lots of different fields, many of which are probably not relevant.

However, this is pretty API-dependent and changes all the time, so it's better to check the Reddit API documentation if you are looking for something specific.

Using the data

You can use rexport.dal (stands for "Data Access/Abstraction Layer") to access your exported data, even offline. I elaborate on the motivation behind it here.

  • the main use case is to be imported as a Python module to allow for programmatic access to your data.

    You can find some inspiration in the my. package that I'm using as an API to all my personal data.

  • to test it against your export, simply run: python3 -m rexport.dal --source /path/to/export
  • you can also try it interactively: python3 -m rexport.dal --source /path/to/export --interactive

Example output:

Your most saved subreddits:
[('orgmode', 50),
 ('emacs', 36),
 ('QuantifiedSelf', 33),
 ('AskReddit', 33),
 ('selfhosted', 29)]
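
A tally like the one above can be approximated with collections.Counter. This is only a sketch, not rexport.dal's actual code: most_saved_subreddits is a hypothetical function, and the default subreddit_of accessor assumes each saved item stores its subreddit as a nested object with a "display_name" field. Check your own export and pass a different accessor if the layout differs.

```python
from collections import Counter


def most_saved_subreddits(
    saved_items,
    top=5,
    # Assumed field layout; replace if your export stores the name elsewhere.
    subreddit_of=lambda item: item["subreddit"]["display_name"],
):
    """Return (subreddit, count) pairs for the most frequent subreddits."""
    counts = Counter(subreddit_of(item) for item in saved_items)
    return counts.most_common(top)
```

The result has the same shape as the output above: a list of (name, count) tuples ordered from most to least saved.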

rexport's People

Contributors

karlicoss, miguelrochefort, redthing1, seanbreckenridge, shreyasminocha

rexport's Issues

Syntax error

When I run the script I get a syntax error

pi@raspberrypi:~/projects/rexport$ python rexport.py --secrets reddit_secrets.py
  File "rexport.py", line 14
    profile: Dict
           ^
SyntaxError: invalid syntax

GitHub Actions build is failing:

Error: Unable to process command '::add-path::/home/runner/.local/bin' successfully.
Error: The `add-path` command is disabled.
Please upgrade to using Environment Files or opt into unsecure command execution by setting the `ACTIONS_ALLOW_UNSECURE_COMMANDS` environment variable to `true`.
For more information see: github.blog/changelog/2020-10-01-github-actions-deprecating-set-env-and-add-path-commands

N00b having Trouble with commands

Hey - I'm not sure how to create the initial secrets.py file. Could you please update the readme with the exact commands I should be using to export the data?

Thanks!

Character encoding error on Windows

I get this error when I run export.py or list(my.reddit.saved()) on Windows:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\migue\.config\my\my\config\repos\rexport\dal.py", line 157, in saved
    for s in self._accumulate(what='saved'):
  File "C:\Users\migue\.config\my\my\config\repos\rexport\dal.py", line 139, in _accumulate
    for f, r in self.raw():
  File "C:\Users\migue\.config\my\my\config\repos\rexport\dal.py", line 133, in raw
    yield f, json.load(fo)
  File "C:\Users\migue\Anaconda3\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Users\migue\Anaconda3\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 38807: character maps to <undefined>
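
A likely cause, judging only from the traceback: the file is being read with the Windows default cp1252 codec, in which byte 0x9d is undefined, while the export is probably UTF-8 (0x9d is, for instance, the last byte of a UTF-8 encoded right double quotation mark). A minimal sketch of a workaround is to pass the encoding explicitly when opening the file. load_export is a hypothetical helper, not the actual code in dal.py.

```python
import json


def load_export(path):
    """Read an export file as UTF-8 regardless of the platform default encoding."""
    # Without encoding=..., open() falls back to the locale encoding,
    # which is cp1252 on many Windows installations.
    with open(path, encoding="utf-8") as fo:
        return json.load(fo)
```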

fails to handle poll data

    raise RuntimeError(f"Unexpected type: {type(d)}")
RuntimeError: Unexpected type: <class 'praw.models.reddit.poll.PollData'>
