
bookmarks-converter's Introduction

Bookmarks Converter



Bookmarks Converter is a package that converts webpage bookmarks between database, HTML, and JSON formats. It can be used as a module or through the CLI.

  • The database files supported are custom SQLite database files created by the SQLAlchemy ORM model found in models.py.

  • The HTML files supported are Netscape-Bookmark files from either Chrome or Firefox. The output HTML files adhere to the Firefox format.

  • The JSON files supported are the Chrome .json bookmarks file, the Firefox .json bookmarks export file, and the custom JSON file created by this package.

To see examples of the structure or layout of the database, HTML, or JSON versions supported by the package, check the corresponding file in the data folder on the GitHub page, or see bookmarks_file_structure.md.


Python and OS Support

The package has been tested on Github Actions with the following OSs and Python versions:

OS: macos-latest, ubuntu-latest, windows-latest
Python: 3.12, 3.11, 3.10, 3.9

Dependencies

The package relies on the following libraries:


Install

Bookmarks Converter is available on PyPI:

python -m pip install bookmarks-converter

Test

To test the package you will need to clone the git repository.

# Cloning with HTTPS
git clone https://github.com/radam9/bookmarks-converter.git

# Cloning with SSH
git clone git@github.com:radam9/bookmarks-converter.git

Then create the virtual environment and install the dependencies using Poetry.

# navigate to repo's folder
cd bookmarks-converter
# install the dependencies
poetry install
# run the tests
poetry run pytest

Usage as Module

from bookmarks_converter import BookmarksConverter

# initialize the class passing in the path to the bookmarks file to convert
bookmarks = BookmarksConverter("/path/to/bookmarks_file")

# parse the file passing the format of the source file; "db", "html" or "json"
bookmarks.parse("html")

# convert the bookmarks to the desired format by passing the format as a string; "db", "html", or "json"
bookmarks.convert("json")

# at this point the converted bookmarks are stored in the 'bookmarks' attribute,
# which can be used directly or exported to a file.
bookmarks.save()

Usage as CLI

# Activate the virtual environment if the "bookmarks-converter" package was installed inside one.

# run bookmarks-converter with the desired settings

# bookmarks-converter input_format output_format filepath
bookmarks-converter db json /path/to/file.db

# use -h to show the help message (shown in the code block below)
bookmarks-converter -h

The help message:

usage: bookmarks-converter [-h] [-V] input_format output_format filepath

Convert your browser bookmarks file from (db, html, json) to (db, html, json).

positional arguments:
  input_format   Format of the input bookmarks file. one of (db, html, json).
  output_format  Format of the output bookmarks file. one of (db, html, json).
  filepath       Path to bookmarks file to convert.

optional arguments:
  -h, --help     show this help message and exit
  -V, --version  show program's version number and exit
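
The interface described by the help message can be reproduced with the standard library's argparse. The following is only a sketch of a parser that accepts the same arguments; it is not taken from the package's source, and the version string is a placeholder.

```python
import argparse

# The three formats named in the help message.
FORMATS = ("db", "html", "json")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser accepting the arguments listed in the help message."""
    parser = argparse.ArgumentParser(
        prog="bookmarks-converter",
        description=(
            "Convert your browser bookmarks file from (db, html, json) "
            "to (db, html, json)."
        ),
    )
    # Placeholder version string, not the package's real version.
    parser.add_argument("-V", "--version", action="version", version="%(prog)s 0.0.0")
    parser.add_argument(
        "input_format",
        choices=FORMATS,
        help="Format of the input bookmarks file. one of (db, html, json).",
    )
    parser.add_argument(
        "output_format",
        choices=FORMATS,
        help="Format of the output bookmarks file. one of (db, html, json).",
    )
    parser.add_argument("filepath", help="Path to bookmarks file to convert.")
    return parser


# Same invocation as the CLI example: bookmarks-converter db json /path/to/file.db
args = build_parser().parse_args(["db", "json", "/path/to/file.db"])
print(args.input_format, args.output_format, args.filepath)
```

Restricting both format arguments with choices means an unsupported format is rejected by the parser itself, before any file is opened.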

License

MIT License

bookmarks-converter's People

Contributors

dependabot[bot], github-actions[bot], radam9


bookmarks-converter's Issues

Feature Request: back-convert from JSON/db format to Chrome/Firefox format

This is a feature request to implement converting back to Chrome or Firefox-formatted JSON from the currently-implemented JSON and/or sqlite3 format.

The reason for this is simply to keep bookmarks in sync. If someone is using your utility to manage and back up bookmarks, it would be amazing to be able to convert directly to the Chrome and Firefox JSON formats in order to replace their respective Bookmarks files programmatically, without having to go through the browser and import an HTML file manually (which could also result in duplicates).

An example workflow would be as follows:

  • User exports Bookmarks JSON file from Chrome/Firefox to bookmarks.db
  • User manages/updates bookmarks directly in bookmarks.db (which can be easily implemented in other applications that could require the bookmarks-converter project in requirements.txt)
  • User converts bookmarks.db back to Chrome/Firefox JSON format using bookmarks-converter
  • User replaces Chrome/Firefox Bookmarks JSON file with resulting file from bookmarks-converter

Bingo-bango, bookmarks are kept in sync.

This could also be used as a sync mechanism to keep Chrome and Firefox bookmarks in-sync by running a cron script using bookmarks-converter to convert from the Firefox native JSON format to the universal Bookmarkie JSON format, and then from the Bookmarkie format to the Chrome native JSON format (and vice-versa).

I feel like this could be an absolute game-changer and would cause this project to absolutely explode, as it is currently the only project on the Internet that implements its current abilities, and with the addition of being backward-compatible, it would make this project absolutely unstoppable.

I also feel like this could be implemented fairly-easily since you have the knowledge of the various necessary JSON formats. I would try and help, but I feel like you could do this in a fraction of the time it would take me. The basic changes (at least for Chrome), would be to [re-]re-structure the root object (rename to roots), remove the extraneous fields, add a checksum (hashlib.sha256(f"fake_placeholder_hash".encode('utf-8')).hexdigest()), convert the 3 child items to dictionary objects, so that roots.children is a dictionary instead of a list, and renaming the appropriate keys back to the Chrome-specific naming conventions.
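
The restructuring steps listed above could look roughly like the sketch below. The input layout (a root folder with exactly three top-level folders, and the keys "type", "title", "url", "children") is an assumption made for illustration and may not match the package's real custom JSON. The checksum is the placeholder from this issue, not the checksum Chrome itself computes, and node ids and dates are left out.

```python
import hashlib

# Chrome's three root keys, in the order the top-level folders are assumed to appear.
ROOT_KEYS = ("bookmark_bar", "other", "synced")


def to_chrome_node(node: dict) -> dict:
    """Rename keys to Chrome's naming and drop every field Chrome does not use."""
    out = {"name": node.get("title", ""), "type": node["type"]}
    if node["type"] == "url":
        out["url"] = node["url"]
    else:
        out["children"] = [to_chrome_node(c) for c in node.get("children", [])]
    return out


def to_chrome_file(root: dict) -> dict:
    """Turn the root's list of three folders into Chrome's 'roots' dictionary."""
    top_level = root.get("children", [])
    if len(top_level) != len(ROOT_KEYS):
        raise ValueError("expected exactly three top-level folders")
    roots = {key: to_chrome_node(child) for key, child in zip(ROOT_KEYS, top_level)}
    # Placeholder checksum as suggested in the issue text.
    checksum = hashlib.sha256("fake_placeholder_hash".encode("utf-8")).hexdigest()
    return {"checksum": checksum, "roots": roots}


# Hypothetical input in the assumed "universal" layout.
universal = {
    "type": "folder",
    "title": "root",
    "children": [
        {"type": "folder", "title": "Bookmarks bar", "index": 0, "children": [
            {"type": "url", "title": "Example", "url": "https://example.com", "index": 0},
        ]},
        {"type": "folder", "title": "Other Bookmarks", "index": 1, "children": []},
        {"type": "folder", "title": "Mobile Bookmarks", "index": 2, "children": []},
    ],
}

chrome = to_chrome_file(universal)
print(list(chrome["roots"]))
```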

By implementing this back-conversion functionality, you could additionally implement direct Chrome-to-Firefox and Firefox-to-Chrome functionalities that essentially do the exact same thing but hide the middle step from the user. You would convert Firefox JSON to universal JSON/sqlite, then universal JSON/sqlite to Chrome JSON (and vice-versa).

I sincerely hope that you take this into serious consideration, as I believe it could be implemented quite easily. Please let me know of any potential caveats that you could see that could prevent this functionality.

Import custom JSON based on file system hierarchy?

I am building a bookmark manager and would like to include your project, but when I generate custom JSON and try to convert it, it messes everything up.

I have a feeling this comes from the IDs, which appear to determine folder hierarchy.

Would there be a way to generate custom IDs to auto-generate the hierarchy? The folder stack appears completely correct, but the final output is unable to be imported without looking messed up.
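
One way to auto-generate consistent IDs is a depth-first walk over the nested folder structure that numbers each node and records its parent. The field names used here ("id", "parent_id", "index", "children") are assumptions for illustration; the names the package actually expects should be taken from bookmarks_file_structure.md.

```python
from itertools import count


def assign_ids(node: dict, parent_id=None, counter=None) -> dict:
    """Number nodes depth-first and record each node's parent and position."""
    counter = counter if counter is not None else count(1)
    node["id"] = next(counter)
    node["parent_id"] = parent_id
    for index, child in enumerate(node.get("children", [])):
        child["index"] = index  # position among its siblings
        assign_ids(child, node["id"], counter)
    return node


# Hypothetical tree built from a file system hierarchy.
tree = {
    "title": "root",
    "children": [
        {"title": "A", "children": [
            {"title": "x", "url": "https://example.com/x"},
        ]},
        {"title": "y", "url": "https://example.com/y"},
    ],
}

assign_ids(tree)
folder_a = tree["children"][0]
print(folder_a["id"], folder_a["children"][0]["parent_id"])
```

Because every child receives its ID only after its parent, a parent's ID is always smaller than the IDs of its descendants, which keeps the hierarchy unambiguous when it is rebuilt from a flat list.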

I have attached a few example files for reference.

bookmarks_troubleshooting.zip

json input file from Android Chromium browser, wrong file structure

My best regards. I've been using Adblock Browser (a Chromium browser for Android) for a long time and I wanted to export its bookmarks. Unfortunately, unlike other Chromium browsers available for Android, it has this feature disabled (one of the reasons behind the switch), so I manually saved the bookmarks file (bookmarks_unedited.json).
It clearly looks like JSON to me, but bookmarks-converter doesn't accept it as valid. I then checked the expected file structure here and found that it isn't the same, so I tried editing the original file to obtain bookmarks.json, but it still complains.
I then found this JavaScript (thanks to one of the pieces of advice given here), which worked like a charm and solved my problem, but I'd still like to understand what I'm doing wrong here. This is clearly not an issue with bookmarks-converter itself, and I want to stress that.
Thanks in advance and have a nice day!! :)

Adding support for Safari Bookmarks

I'm interested in helping add support for Safari bookmarks.

I have done some research, and here's what I know so far:

  • The default location where Safari stores bookmarks on macOS is ~/Library/Safari/Bookmarks.plist (ref)
  • The file is packed in Apple's binary property list format, which can be broken down following this awesome guide
  • Luckily, we also have plistlib which should be very helpful for parsing the file
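
As a starting point, plistlib can decode the binary property list directly. The sketch below walks an in-memory sample instead of a real file. The key names ("WebBookmarkType", "Children", "Title", "URIDictionary", "URLString") reflect the commonly observed layout of Bookmarks.plist and are an assumption that should be verified against a real file.

```python
import plistlib


def iter_safari_bookmarks(node: dict, path: tuple = ()):
    """Yield (folder_path, title, url) for every bookmark leaf under node."""
    kind = node.get("WebBookmarkType")
    if kind == "WebBookmarkTypeLeaf":
        title = node.get("URIDictionary", {}).get("title", "")
        yield path, title, node.get("URLString", "")
    elif kind == "WebBookmarkTypeList":
        title = node.get("Title", "")
        child_path = path + (title,) if title else path
        for child in node.get("Children", []):
            yield from iter_safari_bookmarks(child, child_path)


# Hypothetical sample mimicking the assumed layout, round-tripped through
# the binary plist format to show that plistlib handles it.
sample = {
    "WebBookmarkType": "WebBookmarkTypeList",
    "Title": "",
    "Children": [
        {
            "WebBookmarkType": "WebBookmarkTypeList",
            "Title": "BookmarksBar",
            "Children": [
                {
                    "WebBookmarkType": "WebBookmarkTypeLeaf",
                    "URLString": "https://example.com",
                    "URIDictionary": {"title": "Example"},
                },
            ],
        },
    ],
}

binary = plistlib.dumps(sample, fmt=plistlib.FMT_BINARY)
parsed = plistlib.loads(binary)  # the format is detected automatically
safari_bookmarks = list(iter_safari_bookmarks(parsed))
print(safari_bookmarks)
```

For a real file, the same walk would start from plistlib.load(f) with the file opened in binary mode.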

Looking at the code, I can see that the library has a well-organized structure that coordinates its different components and classes. I was hoping there is a guide on the steps needed to add support for this.

Thank you

Cannot parse Chrome Bookmarks file, and _iterate_folder_html doesn't appear to be recursive

core.py looks for a list in the roots dict if the browser is Chrome. However, the Chrome Bookmarks file looks like this:

{ 
  "checksum": "example",
  "roots": {
    "bookmark_bar": { "name": "Bookmarks bar",  "id": "1", "children": [(stuff)], ...},
    "other": { "name": "Other Bookmarks", "id": "2", "children": [], ...},
    "synced": { "name": "Mobile Bookmarks", "id": "3", "children": [], ...}
  }
}

Trying to parse this file yields an error because, as noted above, the code looks for a list instead of a dict.
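
For reference, a reader that treats roots as a dict only needs to iterate over its values and recurse into each folder's children. This is a sketch of that idea and not the code in core.py. It assumes each node carries "type", "name", and, for bookmarks, "url", and it defensively skips any entry under roots that is not a dict.

```python
import json


def walk(node: dict, path: tuple = ()):
    """Yield (folder_path, name, url) for every bookmark below node."""
    if node.get("type") == "url":
        yield path, node.get("name", ""), node.get("url", "")
    else:
        child_path = path + (node.get("name", ""),)
        for child in node.get("children", []):
            yield from walk(child, child_path)


def iter_chrome_bookmarks(data: dict):
    """Iterate over the values of the 'roots' dict instead of expecting a list."""
    for root in data["roots"].values():
        if isinstance(root, dict):
            yield from walk(root)


# Hypothetical file content following the layout shown above.
raw = """
{
  "checksum": "example",
  "roots": {
    "bookmark_bar": {"name": "Bookmarks bar", "id": "1", "type": "folder", "children": [
      {"name": "Example", "id": "4", "type": "url", "url": "https://example.com"}
    ]},
    "other": {"name": "Other Bookmarks", "id": "2", "type": "folder", "children": []},
    "synced": {"name": "Mobile Bookmarks", "id": "3", "type": "folder", "children": []}
  }
}
"""

chrome_bookmarks = list(iter_chrome_bookmarks(json.loads(raw)))
print(chrome_bookmarks)
```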

However, if I try to pass in roots['bookmark_bar']['children'] as the actual JSON to parse, this "works", but the html output is nothing that can be imported in any logical order because of a possible oversight (?) in core.py in the _iterate_folder_html method. A short snippet of the final html output I get is as follows (note that my "categories" folder in my "Bookmarks bar" is quite extensive with lots of nested folders):

<TITLE>Bookmarks</TITLE>
<H1>Bookmarks Menu</H1>

<DL><p>
<DT><H3 ADD_DATE="1655413626" LAST_MODIFIED="0" PERSONAL_TOOLBAR_FOLDER="true">Bookmarks bar</H3>
<DL><p>
<DT><H3 ADD_DATE="1655413626" LAST_MODIFIED="0">categories</H3>
<DL><p>
<folder4><folder5><folder6><folder7><folder10><folder14><folder15><folder16><folder17><folder18><folder19><folder20><folder22><folder23><folder76><folder94><folder95><folder122><folder123><folder124><folder125><folder126><folder128><folder132><folder133><folder135><folder136><folder137><folder138><folder139><folder140><DT><H3 ADD_DATE="1655413626" LAST_MODIFIED="0">mon</H3>
<DL><p>
<DT><H3 ADD_DATE="1655413626" LAST_MODIFIED="0">alertsite</H3>
<DL><p>
<DT><A HREF="http://www.alertsite.com/cgi-bin/helpme.cgi?page=monitoring_locations.html" ADD_DATE="1655413626" LAST_MODIFIED="0" ICON_URI="None" ICON="None">AlertSite Monitoring Locations-IPs</A>
</DL><p>

Based on the code, it is my understanding that these placeholders are supposed to be replaced with the actual information that resides in the bookmark stack (which is absolutely correct when viewed with a debugger). However, they are not being replaced, possibly because the method does not recurse into nested folders.

That being said, initially parsing a bookmarks.html file into the Bookmarkie formatted JSON, and subsequently parsing back to HTML works just fine.

The trouble seems to happen when parsing the native Chrome Bookmarks JSON file, where things seem to get all out of order.
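
For comparison, a recursive renderer needs no placeholders at all: each folder emits its own header, recurses into its children, and then closes its list. The sketch below illustrates that approach and is not the package's _iterate_folder_html. The node keys ("type", "title", "url", "children") are assumed, and attributes such as ADD_DATE are omitted.

```python
from html import escape


def render_folder(node: dict, indent: int = 0) -> list:
    """Return Netscape-style lines for a folder and everything nested in it."""
    pad = "    " * indent
    lines = [f"{pad}<DT><H3>{escape(node['title'])}</H3>", f"{pad}<DL><p>"]
    for child in node.get("children", []):
        if child["type"] == "folder":
            # Recurse so nested folders are written in place.
            lines.extend(render_folder(child, indent + 1))
        else:
            href = escape(child["url"], quote=True)
            lines.append(f'{pad}    <DT><A HREF="{href}">{escape(child["title"])}</A>')
    lines.append(f"{pad}</DL><p>")
    return lines


# Hypothetical tree using the folder names from the output above.
tree = {
    "type": "folder",
    "title": "categories",
    "children": [
        {"type": "folder", "title": "mon", "children": [
            {"type": "folder", "title": "alertsite", "children": [
                {"type": "url", "title": "AlertSite Monitoring Locations-IPs",
                 "url": "https://example.com/"},
            ]},
        ]},
    ],
}

html_out = "\n".join(render_folder(tree))
print(html_out)
```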
