sorter-gpm-takeout's Introduction

Sorter for Google Play Music Takeout

Overview

Google Play Music files can be exported via Google Takeout. However, the file structure in the resulting archives is flattened compared with the original library (see below for details; notably, albums are not listed), and many, if not most, of the files have truncated names.

In order to bring back structure, this project will iterate through the tracks and rebuild the albums list, possibly renaming the files as desired. To do so, the files in Takeout/Google Play Music/Tracks will be read for clues to piece the files together.

Furthermore, a pattern may be supplied in the configuration file to rename files in a way that better conveys a track's metadata.

This project is never intended to modify metadata. It strictly organizes the tracks into separate directories based on the album and artist(s) in the tracks' metadata.

⚠ Please ensure all your music is intact before deleting the original Takeout archive files!

File structure

When you extract the files from the archives, you will end up with a file structure like this:

Takeout/
├── archive_browser.html
└── Google Play Music
    ├── Playlists
    │   ├── Playlist A
    │   │   ├── Metadata.csv
    │   │   └── Tracks
    │   │       ├── Track1.csv
    │   │       └── Track2.csv
    │   ├── Shuffle
    │   │   ├── Metadata.csv
    │   │   └── Tracks
    │   │       ├── Track1.csv
    │   │       └── Track2.csv
    │   └── Thumbs Up
    │       ├── Track1.csv
    │       └── Track2.csv
    ├── Radio Stations
    │   ├── My Stations
    │   │   └── Station1.csv
    │   └── Recent Stations
    │       ├── Station1.csv
    │       └── Station2.csv
    └── Tracks
        ├── Track1.csv
        ├── Track2.csv
        ├── Track1.mp3
        └── Track2.mp3

Format Structure

In the configuration file, "format" dictates how a file will be renamed. This value is a Python format string, to be used with str.format().

The following fields are available:

  • artist: the primary artist(s) of the track
  • album: the album in which this track belongs
  • title: title of the track
  • disc_num: disc number; goes from 1 to disc_max (*)
  • disc_max: number of discs that belong to the album
  • track_num: track number; goes from 1 to track_max (*)
  • track_max: number of tracks on an album's disc containing the track

To use these fields, simply insert the field name anywhere you prefer and surround the field with curly braces, like so: {artist}.

* Unfortunately, the disc and track numbers can be missing/stripped from Takeout. A missing disc number is set to the default of 1, but a missing track number can't be filled in. In both cases, you will be alerted to the missing numbers and can take action later. Make sure to search the logs for "has missing" to pin down the troublesome files.
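
As a rough illustration of how such a pattern expands, the sketch below calls str.format() on a made-up pattern with made-up metadata. The pattern, the values, and the assumption that the numeric fields are integers (so that a width such as 02d works) are illustrative only and are not taken from the project's code.

```python
# Hypothetical pattern; any of the documented fields may appear in it.
pattern = "{disc_num}-{track_num:02d} {artist} - {title}"

# Made-up metadata for one track, using the field names listed above.
# Assumption: numeric fields are integers, which the 02d width requires.
fields = {
    "artist": "Example Artist",
    "album": "Example Album",
    "title": "Example Title",
    "disc_num": 1,
    "disc_max": 1,
    "track_num": 3,
    "track_max": 12,
}

# str.format() ignores keyword arguments that the pattern does not use.
new_name = pattern.format(**fields) + ".mp3"
print(new_name)  # 1-03 Example Artist - Example Title.mp3
```

Fields that the pattern does not mention (album, disc_max, and track_max here) are simply left out of the resulting name.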

Usage

  1. Configure config.json by copying the example and renaming it. You may use tildes (~) to represent your home directory. You can also change the file name format.
  2. Optionally, configure corrections.json. This file is a dictionary with up to 3 keys ("Artist", "Album", "Title"); each key maps to a dictionary whose keys are the values to replace and whose values are their replacements. For example, if a track spells an artist's name in all caps and the name isn't normally spelled that way, you can put the all-caps name under "Artist" and assign it the expected value.
  3. Run python main.py.

Requirements

This code is designed around the following:

  • Python 3.7+

Setup

config.json

  • "format": a Python format string; see Format Structure
  • "takeout_dir": a string that represents the path to Takeout (must end in Takeout)
  • "dest_dir": a string that represents the destination directory, where subdirectories Artists and Albums will reside
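
A minimal sketch of what a config.json with these three keys might look like, and of how it could be read, follows. The paths, the pattern, and the load_config helper are hypothetical; the project's own config.py may do this differently.

```python
import json
import os

# Illustrative config.json contents; the paths and pattern are made up.
example = """
{
    "format": "{track_num} {title}",
    "takeout_dir": "~/Downloads/Takeout",
    "dest_dir": "~/Music"
}
"""


def load_config(text):
    """Parse the JSON text, expand tildes, and check takeout_dir."""
    config = json.loads(text)
    for key in ("takeout_dir", "dest_dir"):
        # A leading ~ stands for the user's home directory.
        config[key] = os.path.expanduser(config[key])
    # The README requires that takeout_dir end in "Takeout".
    last_part = os.path.basename(os.path.normpath(config["takeout_dir"]))
    if last_part != "Takeout":
        raise ValueError("takeout_dir must end in Takeout")
    return config


config = load_config(example)
print(config["takeout_dir"])
```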

corrections.json (optional)

  • "Artist"
  • "Album"
  • "Title"

These corrections do not take into account any other context, only their associated field. This means that when a value matches, it will be unconditionally substituted with its replacement.

For instance, if you put "abc": "ABC" under "Artist", an artist (or group artist) whose name contains abc will now be changed to ABC, even if abc is in the middle of the name.

If any of these keys is missing, config.py substitutes an empty dictionary for it, so no corrections are applied to that field.
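
The behaviour described above can be sketched as plain substring replacement. The apply_corrections function and the lower-case metadata field names are assumptions made for illustration; they are not the project's actual implementation.

```python
def apply_corrections(metadata, corrections):
    """Return a copy of metadata with each field's corrections applied."""
    corrected = dict(metadata)
    for key in ("Artist", "Album", "Title"):
        field = key.lower()  # assumed mapping from JSON key to metadata field
        if field not in corrected:
            continue
        # A key missing from corrections.json behaves like an empty dictionary.
        for wrong, right in corrections.get(key, {}).items():
            # Unconditional substring replacement, with no other context.
            corrected[field] = corrected[field].replace(wrong, right)
    return corrected


corrections = {"Artist": {"abc": "ABC"}}
track = {"artist": "The abc Band", "album": "abc", "title": "abc song"}
print(apply_corrections(track, corrections))
# {'artist': 'The ABC Band', 'album': 'abc', 'title': 'abc song'}
```

Only the artist changes here, because the "abc" entry sits under "Artist" and the other two keys are absent.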

Disclaimer

This project is not affiliated with or endorsed by Google. See LICENSE for more detail.

sorter-gpm-takeout's Issues

Some potential bugs with the current file sorting algorithm

The following bugs may negatively affect library management.

For the time being, I strongly recommend keeping the original archives until these bugs are fixed!

Tracks without album tag

Tracks without an album tag may cause an issue in Sorter.sort. In particular, the following snippet assumes the track has an album:

album_dir = self.make_dirs('Album', self.metadata['album'])

On a related note, the code assumes that every track has an album and has therefore already been moved to its album directory before the track is linked into each individual artist's directory (and into a directory named after the mutual artists, if applicable). Because the above bug prevents a track from being moved to an album directory, the artists' directories will not contain a link to the file.

To mitigate both of these issues, the library may undergo another change: create a Tracks directory and link tracks from there to their respective albums and artists.
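
One way the proposed layout could work is sketched below: the file is moved into a Tracks directory under the destination root, and symbolic links are created in whichever album or artist directories apply. The store_and_link function and its arguments are hypothetical, and symbolic links are assumed to be available on the platform.

```python
import os
import shutil


def store_and_link(src, dest, link_dirs):
    """Move src into dest/Tracks and symlink it into each directory given."""
    tracks_dir = os.path.join(dest, "Tracks")
    os.makedirs(tracks_dir, exist_ok=True)
    target = os.path.join(tracks_dir, os.path.basename(src))
    shutil.move(src, target)

    links = []
    for directory in link_dirs:
        os.makedirs(directory, exist_ok=True)
        link = os.path.join(directory, os.path.basename(target))
        # Skip links that already exist so a second run does not fail.
        if not os.path.lexists(link):
            os.symlink(target, link)
        links.append(link)
    return target, links
```

With this layout, a track without an album tag still has a home in Tracks, and link_dirs simply contains fewer entries.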

Sorter.handle_suffix() and suffixes and lists

Sorter.handle_suffix() may have issues with lists of artists that contain both commas and name suffixes. Perhaps replacing the comma in , Jr. beforehand can help correct this issue. For now, though, the function incorrectly refuses to split a list of artists on commas if , Jr. is present.
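
The idea of replacing the comma beforehand could look roughly like this. The split_artists function, the suffix list, and the placeholder character are all assumptions for illustration; this is not the existing Sorter.handle_suffix() code.

```python
import re

# Suffixes that follow a comma but belong to the preceding name.
# This list is illustrative, not the project's.
SUFFIXES = ("Jr.", "Sr.")
# Stands in for the protected comma; assumed never to appear in a tag.
PLACEHOLDER = "\x00"


def split_artists(artists):
    """Split a comma-separated artist list, keeping ', Jr.' with its name."""
    alternatives = "|".join(re.escape(suffix) for suffix in SUFFIXES)
    # Match a comma only when a suffix follows and then a comma or the end.
    pattern = r",\s*(?=(?:%s)(?:,|$))" % alternatives
    protected = re.sub(pattern, PLACEHOLDER, artists)
    names = [name.strip() for name in protected.split(",")]
    return [name.replace(PLACEHOLDER, ", ") for name in names if name]


print(split_artists("John Doe, Jr., Jane Roe"))
# ['John Doe, Jr.', 'Jane Roe']
```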

Add file-check utility to ensure number of files in destination matches the source Takeout directory

Right now, there is no easy way to verify all the tracks have been successfully sorted into their respective directories. My proposed solution is to count the number of tracks in each subdirectory of dest, where dest is the destination directory root.

Some potential issues, though:

  1. Symbolic links should not be counted, as they are considered duplicate data/shortcuts.
  2. Tracks that don't have an album but have an artist should be counted in a way that ignores symbolic links.
  3. I don't remember if this is accurate, but I recall some files in the Takeout archive were duplicate tracks. I will have to keep track of how many duplicates were encountered.
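
A minimal sketch of such a check, under a few assumptions: tracks are identified by the .mp3 extension, symbolic links are skipped, and the number of duplicates encountered during sorting is supplied by the caller. The function names are hypothetical.

```python
import os


def count_tracks(root, extension=".mp3"):
    """Count regular files under root with the extension, skipping symlinks."""
    total = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.lower().endswith(extension) and not os.path.islink(path):
                total += 1
    return total


def check_counts(tracks_dir, dest_dir, duplicates=0):
    """Compare source and destination counts, allowing for known duplicates."""
    source = count_tracks(tracks_dir)
    sorted_count = count_tracks(dest_dir)
    return source - duplicates == sorted_count, source, sorted_count
```

Because links are ignored, a track that lives in an artist directory as a real file (no album) is still counted once, while the links to album tracks are not counted at all.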

Improve handling of tracks with no album or artist tags

This enhancement improves upon issue #1 - specifically to sort tracks without an album tag.

Tracks that lack both album and artist tags should be placed somewhere, rather than left untouched.

Relevant lines in question:

try:
    album_dir = self.make_dirs('Album', self.metadata['album'])
except KeyError:
    # This isn't an album track, but when no album is present,
    # the group artists are used instead while retaining
    # the original variable name.
    album_dir = self.make_dirs('Artist', self.metadata['artist'])
    has_album = False

This may also fix KeyErrors with the current try: except KeyError: block, but I am uncertain.

Things to do with this issue:

  1. Check each tag for emptiness along with the try: block. (Not sure if eyed3 treats literal empty tags as missing.)
  2. If a track doesn't have an album or artist tag, move it to a separate directory. (Something like dest/no_tags, where dest is the destination directory root, will do.)
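
The two items above can be sketched as a single lookup that treats a missing tag and an empty tag the same way and falls back to dest/no_tags. The pick_directory function is hypothetical and takes a plain dictionary; how eyed3 reports empty tags is not assumed here.

```python
import os


def pick_directory(metadata, dest):
    """Choose a directory from the album tag, then artist, then no_tags.

    Returns the directory path and whether an album tag was found.
    """
    # Treat a missing tag, None, and a blank string the same way.
    album = (metadata.get("album") or "").strip()
    artist = (metadata.get("artist") or "").strip()
    if album:
        return os.path.join(dest, "Albums", album), True
    if artist:
        return os.path.join(dest, "Artists", artist), False
    return os.path.join(dest, "no_tags"), False


print(pick_directory({"artist": "Example Artist", "album": ""}, "dest"))
```

Using dict.get() instead of indexing avoids the KeyError path entirely, so the emptiness check and the missing-tag check share one branch.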

Playlists are not managed by this project.

Because the project only seeks to sort files by metadata and end users can choose any music player afterwards, playlists are not supported.

Please make sure to copy your Playlists directory from Takeout if you intend to remove the archives and originals.

Improve handling of missing (zeroed) disc and track numbers

If disc number is missing, simply use 1 in place of it.

However, this can't be applied to track numbers, because there is no safe default to assign. I think the best thing that can be done with regard to missing track numbers is to let the user know about them.

Regardless, in both cases, the user should be alerted to these specific metadata fields.
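
A minimal sketch of this handling, assuming the numbers arrive in a dictionary and that a missing number shows up as an absent key, None, or 0. The normalize_numbers function and the exact log wording are illustrative; only the phrase "has missing" is taken from the README.

```python
import logging

logger = logging.getLogger(__name__)


def normalize_numbers(metadata, filename):
    """Default a missing disc number to 1 and report missing numbers."""
    fixed = dict(metadata)
    missing = []
    if not fixed.get("disc_num"):  # covers absent, None, and 0
        fixed["disc_num"] = 1
        missing.append("disc_num")
    if not fixed.get("track_num"):
        # Left as-is: there is no safe default for a track number.
        missing.append("track_num")
    for field in missing:
        logger.warning("%s has missing %s", filename, field)
    return fixed, missing
```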
