import-mailbox-to-gmail's Introduction

Import .mbox files to Google Workspace (formerly G Suite / Google Apps)

This script allows Google Workspace admins to import mbox files in bulk for their users.

DISCLAIMER: This is not an official Google product.

If you want to migrate from Mozilla Thunderbird, try mail-importer.

You only authorize it once using a service account, and then it can import mail into the mailboxes of all users in your domain.

A. Creating and authorizing a service account for Gmail API

  1. Go to the Developers Console and log in as a domain super administrator.

  2. Create a new project.

  • If you have not used the API console before, select Create a project from the Select a project dropdown list.
  • If this is not your first project, use the Create Project button.
  3. Enter "Gmail API" (or any name you prefer) as the project name and press the Create button. If this is your first project you must agree to the Terms of Service at this point.

  4. Click the Enable and manage APIs link in the Use Google APIs box.

  5. Enable the Gmail API - Select the Gmail API link and press the Enable API button. You can leave the default APIs enabled - it doesn't matter.

  6. Click the 3-line menu icon in the top left corner of the console.

  7. Click IAM & Admin and select Service accounts.

  8. Click Create service account.

  9. Enter a name (for example, "import-mailbox") in the Name field.

  10. Check the Furnish a new private key box and ensure the key type is set to JSON.

  11. Check the Enable G Suite Domain-wide Delegation box and enter a name in the Product name for the consent screen field.

  12. Click Create. You will see a confirmation message advising that the Service account JSON file has been downloaded to your computer. Make a note of the location and name of this file. This JSON file contains a private key that potentially allows access to all users in your domain. Protect it like you'd protect your admin password. Don't share it with anyone.

  13. Click Close.

  14. Click the View Client ID link in the Options column.

  15. Copy the Client ID value. You will need this later.

  16. Go to the Domain-wide Delegation page of the Admin console for your Google Workspace domain.

  17. Under Client ID, enter the Client ID collected in step 15.

  18. Under OAuth Scopes, enter the following:

    https://www.googleapis.com/auth/gmail.insert, https://www.googleapis.com/auth/gmail.labels

  19. Click Authorize.

You can now use the JSON file to authorize programs to access the Gmail API "insert" and "label" scopes of all users in your Google Workspace domain.

B. Importing mbox files using import-mailbox-to-gmail.py

Important: If you're planning to import mail from Apple Mail.app, see the notes below.

  1. Download the script - import-mailbox-to-gmail.py.

  2. Download and install Python 2.7 (not Python 3.x) for your operating system if needed.

  3. Open a Command Prompt (CMD) window (on Windows) or a Terminal window (on Mac/Linux).

  4. Install the Google API Client Libraries for Python and their dependencies by running, all in one line:

    Mac/Linux:

    sudo pip install --upgrade google-api-python-client PyOpenSSL
    

    Windows:

    C:\Python27\Scripts\pip install --upgrade google-api-python-client PyOpenSSL
    

    Note: On Windows, you may need to do this on a Command Prompt window that was run as Administrator.

  5. Create a folder for the mbox files, for example C:\mbox.

  6. Under that folder, create a folder for each of the users into which you intend to import the mbox files. The folder names should be the users' full email addresses.

  7. Into each of the folders, copy the mbox files for that user. Make sure the file name format is <LabelName>.mbox. For example, if you want the messages to go into a label called "Imported messages", name the file "Imported messages.mbox".

Your final folder and file structure should look like this (for example):

C:\mbox
C:\mbox\[email protected]
C:\mbox\[email protected]\Imported messages.mbox
C:\mbox\[email protected]\Other imported messages.mbox
C:\mbox\[email protected]
C:\mbox\[email protected]\Imported messages.mbox
C:\mbox\[email protected]\Other imported messages.mbox
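
Before running a real import, it can help to check that the folder layout matches the structure shown above. The following is a minimal sketch, not part of the script itself: list_import_plan is a hypothetical helper that assumes every sub-folder of the root is a user and every file ending in exactly ".mbox" becomes a label.

```python
import os


def list_import_plan(root_dir):
    """Return {user_folder_name: [label names]} for a root folder laid out as above."""
    plan = {}
    for user in sorted(os.listdir(root_dir)):
        user_dir = os.path.join(root_dir, user)
        if not os.path.isdir(user_dir):
            continue  # only folders are treated as users (assumption)
        labels = []
        for name in sorted(os.listdir(user_dir)):
            base, ext = os.path.splitext(name)
            if ext == ".mbox":
                labels.append(base)  # "<LabelName>.mbox" -> "<LabelName>"
        plan[user] = labels
    return plan
```

Printing the result of list_import_plan for your root folder gives a quick preview of which labels would be created for which user folder.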

IMPORTANT: It's essential to test the migration before migrating into the real users' mailboxes. First, migrate the mbox files into a test user, to make sure the messages are imported correctly.

  8. To start the migration, run the following command (one line):

    Mac/Linux:

    python import-mailbox-to-gmail.py --json Credentials.json --dir C:\mbox
    

    Windows:

    C:\Python27\python import-mailbox-to-gmail.py --json Credentials.json --dir C:\mbox
    
  • Replace import-mailbox-to-gmail.py with the full path of import-mailbox-to-gmail.py - usually ~/Downloads/import-mailbox-to-gmail.py on Mac/Linux or %USERPROFILE%\Downloads\import-mailbox-to-gmail.py on Windows.
  • Replace Credentials.json with the path to the JSON file from step 12 above.
  • Replace C:\mbox with the path to the folder you created in step 5.

The mbox files will now be imported, one by one, into the users' mailboxes. You can monitor the migration by looking at the output, and inspect errors by viewing the import-mailbox-to-gmail.log file.

Options and notes

  • Use the --from_message parameter to start the upload from a particular message. This allows you to resume an upload if the process previously stopped. (Affects all users and all mbox files)

    e.g. ./import-mailbox-to-gmail.py --from_message 74336

  • If any of the folders have a ".mbox" extension, it will be dropped when creating the label for it in Gmail.

  • To import mail from Apple Mail.app, make sure you export it first - the raw Apple Mail files can't be imported. You can export a folder by right clicking it in Apple Mail and choosing "Export Mailbox".

  • This script can import nested folders. In order to do so, it is necessary to preserve the email folders' hierarchy when exporting them as mbox files. In Apple Mail.app, this can be done by expanding all subfolders, selecting both parents and subfolders at the same time, and exporting them by right clicking the selection and choosing "Export Mailbox".

  • If any of the folders have a ".mbox" extension and a file named "mbox" in them, the contents of the "mbox" file will be imported to the label named as the folder. This is how Apple Mail exports are structured.

  • To run under Docker:

    1. Build the image:
     docker build -t google/import-mailbox-to-gmail .
    
    2. Run the import command:
     docker run --rm -it \
         -v "/local/path/to/auth.json:/auth.json" \
         -v "/local/path/to/mbox:/mbox" \
         google/import-mailbox-to-gmail --json "/auth.json" --dir "/mbox"
    

    Note: -v mounts a local file or directory (for example, /local/path/to/auth.json) into the container (as /auth.json). The command then refers to the path inside the container, as in --json "/auth.json". For more help, see Volume in Docker Run.

import-mailbox-to-gmail's People

Contributors

bryant1410, charles-rumley, eesheesh, gdoucet, isleshocky77, jnnnthnn, mohan3d, morrislaptop, pmox, prosbloom225

import-mailbox-to-gmail's Issues

cannot import name SignedJwtAssertionCredentials

python ./import-mailbox-to-gmail.py --json credentials.json --dir ./mbox/
Traceback (most recent call last):
File "./import-mailbox-to-gmail.py", line 33, in
from oauth2client.client import SignedJwtAssertionCredentials
ImportError: cannot import name SignedJwtAssertionCredentials

I'm no developer, but Google seems to think SignedJwtAssertionCredentials is deprecated? Running on Mac OS X 10.11.3

os.walk error

(Python2.7) root@LT2226:/mnt/c/python# python import-mailbox-to-gmail.py --json /MBOX/Credentials.json --dir /MBOX --logging_level INFO
15:31:08 INFO [email protected] *** Starting import-mailbox-to-gmail 1.5 on Python 2.7.12 (default, Dec 4 2017, 14:50:18)
[GCC 5.4.0 20160609] ***
15:31:08 INFO [email protected] Arguments:
15:31:08 INFO [email protected] auth_host_name: 'localhost'
15:31:08 INFO [email protected] auth_host_port: [8080, 8090]
15:31:08 INFO [email protected] dir: '/MBOX'
15:31:08 INFO [email protected] fix_msgid: True
15:31:08 INFO [email protected] from_message: 0
15:31:08 INFO [email protected] httplib2debuglevel: 0
15:31:08 INFO [email protected] json: '/MBOX/Credentials.json'
15:31:08 INFO [email protected] log: 'import-mailbox-to-gmail-535.log'
15:31:08 INFO [email protected] logging_level: 'INFO'
15:31:08 INFO [email protected] noauth_local_webserver: False
15:31:08 INFO [email protected] num_retries: 10
15:31:08 INFO [email protected] replace_quoted_printable: True
Traceback (most recent call last):
File "import-mailbox-to-gmail.py", line 415, in
main()
File "import-mailbox-to-gmail.py", line 341, in main
for username in next(os.walk(args.dir))[1]:
StopIteration

any ideas?
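
One possible explanation, offered as an assumption rather than a confirmed diagnosis: os.walk yields nothing when the top directory does not exist or cannot be read, so next(os.walk(...)) raises StopIteration instead of a clearer error. The sketch below shows a guarded way to list the user folders; list_user_dirs is a hypothetical helper and not a function in the script.

```python
import os


def list_user_dirs(root_dir):
    """Return the immediate sub-folder names of root_dir, or raise a clear error."""
    if not os.path.isdir(root_dir):
        raise ValueError("--dir %r does not exist or is not a directory" % root_dir)
    # With a default value, next() returns the fallback instead of raising StopIteration.
    _, dirnames, _ = next(os.walk(root_dir), (root_dir, [], []))
    return sorted(dirnames)
```

If this is the cause, checking the exact spelling and case of the --dir path (here /MBOX) and its permissions would be a reasonable first step.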

Make a new release to include the fix for issue #9

I read about nested labels support in issue #9 and expected that by downloading and running the script linked from README.md, such support would work.

However, this is currently not the case, as the latest release linked from README.md is 1.3, which was built before the fix for #9.

Please consider making a new release and updating the download link in README.md.

Stuck at last step: No module named apiclient

I am stuck at step B.8. (last step)

XXX-MacBook-Air:Downloads XXX$ ./import-mailbox-to-gmail.py --json Credentials.json --dir ~/Downloads/mbox/
Traceback (most recent call last):
File "./import-mailbox-to-gmail.py", line 31, in
from apiclient import discovery
ImportError: No module named apiclient

I followed the fix proposed by http://stackoverflow.com/questions/18267749/importerror-no-module-named-apiclient-discovery:
sudo pip install --upgrade google-api-python-client

XXX-MacBook-Air:Downloads XXX$ sudo pip install --upgrade google-api-python-client
Password:
The directory '/Users/lguillaumont/Library/Caches/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/Users/lguillaumont/Library/Caches/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Requirement already up-to-date: google-api-python-client in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
Requirement already up-to-date: oauth2client<4.0.0,>=1.5.0 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from google-api-python-client)
Requirement already up-to-date: six<2,>=1.6.1 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from google-api-python-client)
Requirement already up-to-date: httplib2<1,>=0.8 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from google-api-python-client)
Requirement already up-to-date: uritemplate<1,>=0.6 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from google-api-python-client)
Requirement already up-to-date: pyasn1>=0.1.7 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from oauth2client<4.0.0,>=1.5.0->google-api-python-client)
Requirement already up-to-date: pyasn1-modules>=0.0.5 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from oauth2client<4.0.0,>=1.5.0->google-api-python-client)
Requirement already up-to-date: rsa>=3.1.4 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from oauth2client<4.0.0,>=1.5.0->google-api-python-client)
Requirement already up-to-date: simplejson>=2.5.0 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from uritemplate<1,>=0.6->google-api-python-client)

However, I believe the code should be upgraded (see second answer):

bad

from apiclient.discovery import build

good

from googleapiclient.discovery import build

Could you help fix that?
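
A way to tolerate both module names is to try the newer one first and fall back to the older one. This is a minimal sketch under that assumption; import_first_available is a hypothetical helper and is not taken from the script.

```python
import importlib


def import_first_available(module_names):
    """Import and return the first module in module_names that is installed."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate name
    raise ImportError(
        "None of these modules could be imported: %s" % ", ".join(module_names))


# Usage (requires google-api-python-client to be installed):
# discovery = import_first_available(
#     ["googleapiclient.discovery", "apiclient.discovery"])
```

If neither name imports, the package is probably installed for a different Python interpreter than the one running the script.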

Issue when execute the commands

Hello,

I have a problem with Python 2. I use a Mac and have installed both Python 2 and Python 3. Under Python 3 your script runs OK, but after some time it stops, and the log shows no error.

When I try to execute the command with Python 2, I receive this error:
python /Users/jacobogarrido/import_mbox_to_gmail/import-mailbox-to-gmail.py --json Credentials.json --dir /Users/jacobogarrido/import_mbox_to_gmail
Traceback (most recent call last):
File "/Users/jacobogarrido/import_mbox_to_gmail/import-mailbox-to-gmail.py", line 31, in
from apiclient import discovery
ImportError: No module named apiclient

Could you help me please???

Many thanks

Restart hung process

Hi, my computer hibernated overnight and the process has hung after loading 15000 mail messages. Is there any way to re-start the process from the point it has hung?
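
The README's --from_message option is described as covering this kind of resume. The sketch below only illustrates the idea of skipping messages that were already handled; it assumes 0-based counting (the logs above show from_message: 0 as the default), and messages_to_upload is a hypothetical name, not a function in the script.

```python
def messages_to_upload(messages, from_message=0):
    """Yield (index, message) pairs, skipping every index below from_message."""
    for index, message in enumerate(messages):
        if index < from_message:
            continue  # assumed to have been handled in an earlier run
        yield index, message
```

Note that, according to the README, the option affects all users and all mbox files, so resuming is simplest when only the interrupted mbox file is left in the folder.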

Generate .EXEs to make life easier for Windows admins

PyInstaller makes it pretty simple to generate single-file .EXEs (32 and 64 bit) for Windows users. This is much easier than walking them through installing Python, pip-installing dependencies and then running the script.

Basic steps to set up a Windows build machine would be:

  1. On an x64 machine, install Python 2.7.11 for both x64 and x32. Our .bat file expects the paths c:\python27 and c:\python27-32 respectively (that can be changed as needed).
  2. Run

c:\python27\scripts\pip install google-api-python-client
c:\python27-32\scripts\pip install google-api-python-client

     to install the API Client, oauth2client and other necessary libraries.
  3. Install 7-zip to handle .zip of .exe and license so it can be added to a release on GitHub. http://www.7-zip.org/download.html
  4. Now from the import-mailbox-to-gmail source folder it should be possible to run:

build.bat 1.3

     to compile the .EXEs and zip them so that they are ready for release.

Pull requests inbound...

Label Creation & Skipping Issue

I suspect I am having some label creation and skipping issues due to the structure of the .mbox files. I would love to get your feedback.

When I run the import-mailbox-to-gmail script, I noticed 2 main things:

  1. The below 4 mbox labels are not created in Gmail. Instead individual labels for each .emlx file are created.
  2. The emails themselves are not being uploaded (messages or attachments).

I basically have 4 .mbox "files" which I copied from the Apple Mail library. I only want these to be uploaded to Gmail under these labels for one particular account - [email protected]
Deleted Messages.mbox
Drafts.mbox
INBOX.mbox
Sent Messages.mbox

I have these mbox files stored in the following directory:
C:\mbox\[email protected]

I'm not sure how .mbox should be structured for this script to work. When I open up any of the above .mbox "files", there is a sub-folder structure of info.plist, a Messages folder (.emlx files) and an Attachments folder (raw files).

I also had a look at the log and posted below all warnings, errors and (sample) skipping I found.
It would be great to get your input and your suggestions for fixing. Thanks so much.

WARNING autodetect@__init__.py file_cache is unavailable when using oauth2client >= 4.0.0
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/googleapiclient/discovery_cache/__init__.py", line 41, in autodetect
    from . import file_cache
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/googleapiclient/discovery_cache/file_cache.py", line 41, in
    'file_cache is unavailable when using oauth2client >= 4.0.0')
ImportError: file_cache is unavailable when using oauth2client >= 4.0.0

ERROR [email protected] Can't create label 'Drafts' for user [email protected]
Traceback (most recent call last):
  File "/Users/accounts/Documents/user/gmail_import_files/import-mailbox-to-gmail.py", line 153, in get_label_id_from_name
    body=label_object).execute(num_retries=args.num_retries)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/oauth2client/_helpers.py", line 133, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/googleapiclient/http.py", line 842, in execute
    raise HttpError(resp, content, uri=self.uri)
HttpError: <HttpError 400 when requesting https://www.googleapis.com/gmail/v1/users/user1%40domain.com/labels?alt=json returned "Invalid label name">
15:35:47 ERROR [email protected] Labels under 'Drafts.mbox' may not nest correctly
15:35:47 INFO [email protected] Skipping '/Users/accounts/Documents/user1/mbox/[email protected]/.DS_Store' because it doesn't have a .mbox extension
15:35:47 INFO [email protected] Skipping '/Users/accounts/Documents/user/mbox/[email protected]/Deleted Messages.mbox/.DS_Store' because it doesn't have a .mbox extension
15:35:47 INFO [email protected] Skipping '/Users/accounts/Documents/user/mbox/[email protected]/Deleted Messages.mbox/Info.plist' because it doesn't have a .mbox extension

15:35:47 INFO [email protected] Done importing user [email protected]. Labels: 0 succeeded, 0 with some errors, 0 failed. Messages: 0 succeeded, 0 failed.
15:35:47 INFO [email protected] *** Done importing all users from directory '/Users/accounts/Documents/user/mbox'
15:35:47 INFO [email protected] *** Import summary:
15:35:47 INFO [email protected]     1 users imported with no failures
15:35:47 INFO [email protected]     0 users imported with some failures
15:35:47 INFO [email protected]     0 users failed
15:35:47 INFO [email protected]     0 labels (mbox files) imported with no failures
15:35:47 INFO [email protected]     0 labels (mbox files) imported with some failures
15:35:47 INFO [email protected]     0 labels (mbox files) failed
15:35:47 INFO [email protected]     0 messages imported successfully
15:35:47 INFO [email protected]     0 messages failed

Instructions are out of date or n/a

I have a googlealumni.com account (but obviously I am not an admin of the whole googlealumni.com domain) and would like to import some mbox files into that account. Following the instructions for this project breaks down completely quite early on. Either they are outdated, or things work very differently if you are just a user rather than an admin of a G Suite account.

It would be very nice if the instructions were updated or adapted for regular users too.

Filename too long

Hi there,

I'm setting this up on a macOS Sierra system and have an issue when running the actual import command.

I get the following trace:

$ ./import-mailbox-to-gmail.py --json "/Users/tal/Downloads/Chrome/Josephine\ Email\ Import-1185bda2bfc5.json" --dir mbox
./import-mailbox-to-gmail.py: line 18: Import mbox files to a specified label for many users.

Liron Newman [email protected]

Copyright 2015 Google Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the License);
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an AS: File name too long
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
from: can't read /var/mail/apiclient
from: can't read /var/mail/apiclient.http
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
from: can't read /var/mail/apiclient.http
from: can't read /var/mail/oauth2client.service_account
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9202.
./import-mailbox-to-gmail.py: line 38: APPLICATION_NAME: command not found
./import-mailbox-to-gmail.py: line 39: APPLICATION_VERSION: command not found
./import-mailbox-to-gmail.py: line 41: SCOPES: command not found
./import-mailbox-to-gmail.py: line 42: https://www.googleapis.com/auth/gmail.labels]: No such file or directory
./import-mailbox-to-gmail.py: line 45: syntax error near unexpected token `('
./import-mailbox-to-gmail.py: line 45: `parser = argparse.ArgumentParser('

Here is my tree structure, which I believe follows the instructions:

tal at Tals-MacBook-Air in ~/Projects/josephine-gmail-import
$ tree .
.
├── import-mailbox-to-gmail.py
└── mbox
    └── [email protected]
        └── All mail Including Spam and Trash.mbox

2 directories, 2 files

I have Python and Pip installed via Homebrew.

tal at Tals-MacBook-Air in ~/Projects/josephine-gmail-import
$ pip --version
pip 9.0.1 from /usr/local/lib/python2.7/site-packages (python 2.7)
tal at Tals-MacBook-Air in ~/Projects/josephine-gmail-import
$ python --version
Python 2.7.12
tal at Tals-MacBook-Air in ~/Projects/josephine-gmail-import
$ which python
/usr/local/bin/python

Any ideas? Happy to provide more system background etc. Thanks!

How to show original sent date

Hi, after importing mbox files, the Gmail message does not show the original sent date but the import date instead. Is there a way to fix this issue? I read this old post on StackOverflow, but I don't know if it is still possible to implement that solution.
Thanks

Import Failed Messages?

Hey, I just used this software and it worked pretty much perfectly! I have a question though. My internet dropped during part of the sync, and thus, quite a few messages failed to sync. Anyone have an idea of how to sync these failed messages? I have the log file so I could filter/find which ones failed, but I'm not sure as to how to remove the successful ones from the mbox file in order to re-sync without doubling the messages.
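
One way to re-sync only the failed messages is to copy them into a new mbox file and import that file alone. The sketch below uses Python's standard mailbox module; copy_selected_messages is a hypothetical helper, and it assumes you can map the failures in the log to 0-based positions in the original mbox, which is not confirmed here.

```python
import mailbox


def copy_selected_messages(src_path, dst_path, wanted_indexes):
    """Copy the messages at the given 0-based positions from src_path into dst_path."""
    wanted = set(wanted_indexes)
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    copied = 0
    try:
        for index, message in enumerate(src):
            if index in wanted:
                dst.add(message)
                copied += 1
        dst.flush()  # write the added messages to disk
    finally:
        dst.close()
        src.close()
    return copied
```

For example, copy_selected_messages("All mail.mbox", "Retry.mbox", [120, 121, 305]) would write three messages to Retry.mbox, which could then be placed in the user's folder on its own. The file names and positions here are illustrative only.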

Duplicate labels created when importing nested mboxes

Thanks for the incredibly useful tool!

Issue

While importing nested folders is not documented in README.md, it seems to work nonetheless, with a small caveat.

While using this tool to migrate a user's multi-level nested mboxes from a macOS Mail install to Gmail, I noticed that while the hierarchy was ultimately correctly reproduced on Gmail, a duplicate of each label was also created, with the entire hierarchy flattened at the root of the label list on Gmail. This means that one has to manually go in and delete all of these duplicates or write a script to do so programmatically.

The source of this seems to be the call to get_label_id_from_name() in process_mbox_files() while looping on dirs before files (at line 185). Since the dirs returned by os.walk only contain the name of the dir, not its full path, a bunch of 0-level labels are created which are not later populated. Indeed, the second call to get_label_id_from_name(), which properly includes the nested hierarchy, re-creates the label and populates those.

In my specific case, commenting out the loop on dirs enabled me to avoid this problem. However, the first call to get_label_id_from_name() in this label may have another purpose that I have not encountered (e.g. triggering "Labels under '%s' may not nest correctly" error logging, or dealing with the way other clients export mboxes). As such, it should be possible to edit this first call to resemble the second one. I have opened a PR for this issue at #37.

Note on nested folders exporting with macOS Mail

One can export macOS Mail folders while preserving their hierarchy by expanding all subfolders in the UI, selecting both subfolders and parents at the same time, and exporting all of those together. If one does so, the resulting file hierarchy looks as follows (for 2 levels of depth):

Parent_Name (dir)
-- Child_Name (dir)
---- GrandChild_Name.mbox
------ mbox (file)
------ table_of_contents (file)
-- Child_Name.mbox (dir)
---- mbox (file)
---- table_of_contents (file)
Parent_Name.mbox (dir)
-- mbox (file)
-- table_of_contents (file)

macOS Mail creates mbox files even if a folder has no emails stored inside of it (i.e. the mbox file for the parents, in the corresponding .mbox directory, will have a size of 0 bytes).

An interesting property of this for users of this tool is that given that os.walk is used for listing mbox files, mbox files contained in a nested folder will be detected. Since the get_label_id_from_name function is called with the var labelname, which itself is based on the full path minus the folder path, the nested labels created by the get_label_id_from_name() function will be named like so: Parent_Name/Child_Name. The result in the Gmail UI is that if both the Parent_Name and Parent_Name/Child_Name labels are created (which will happen even if the parent has 0 messages, since a mbox file is still created by macOS Mail), the two labels are properly collapsed both in the Gmail UI and over IMAP. This looks intentional on the author's part but is not documented in the README.md.
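
The naming rule described above can be sketched as follows. This is an illustration of the idea, not the script's actual code: label_for_path is a hypothetical helper that assumes the label is the path relative to the user folder, with ".mbox" suffixes dropped and the inner "mbox" file of an Apple Mail export ignored.

```python
import os


def label_for_path(user_dir, mbox_path):
    """Derive a nested label such as 'Parent_Name/Child_Name' from an mbox path."""
    relative = os.path.relpath(mbox_path, user_dir)
    parts = relative.split(os.sep)
    # Apple Mail layout: "<Label>.mbox/mbox" -> the folder names the label,
    # so the inner file called "mbox" is dropped from the path.
    if len(parts) > 1 and parts[-1] == "mbox" and parts[-2].endswith(".mbox"):
        parts = parts[:-1]
    # Drop a trailing ".mbox" from every path component.
    cleaned = [p[:-len(".mbox")] if p.endswith(".mbox") else p for p in parts]
    return "/".join(cleaned)
```

With the hierarchy shown earlier, the path Parent_Name/Child_Name.mbox/mbox under the user folder would map to the label Parent_Name/Child_Name, and Parent_Name.mbox/mbox would map to Parent_Name.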

Add Dockerfile or instructions

Any appetite for adding Dockerfile or at least documenting in the README? I had conflicts while loading this in my current system so decided to do it through docker instead.

Setup

$ mkdir -p /opt/import-mailbox-to-gmail
$ cd /opt/import-mailbox-to-gmail
$ git clone https://github.com/google/import-mailbox-to-gmail.git build

Create Dockerfile

# ./build/Dockerfile

FROM python:2

WORKDIR /usr/src/app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

COPY import-mailbox-to-gmail.py .

ENTRYPOINT [ "python", "./import-mailbox-to-gmail.py" ]

Build

NOTE I had to add oauth2client to requirements.txt for it to work properly

$ cd ..
$ docker build -t import-mbox-to-gmail build/

Get Data

$ mkdir -p data/auth
$ cp ~/Downloads/mail-import-013546-135458687.json data/auth/ # Authentication file
$ mkdir -p data/mbox/[email protected]
$ cp ~/Downloads/label-1.mbox data/mbox/[email protected]/

Run

$ docker run --rm -it -v "$PWD/data:/data" \
       import-mbox-to-gmail \
          --json /data/auth/mail-import-013546-135458687.json \
          --dir /data/mbox

Package 'rsa' requires a different Python: 2.7.18 not in '>=3.5, <4'

Steps to reproduce:

  • checkout master
  • docker build -t google/import-mailbox-to-gmail .
  • Build stops with #8 9.284 ERROR: Package 'rsa' requires a different Python: 2.7.18 not in '>=3.5, <4'

My suspicion is that some of the dependencies have dropped support for Python 2. Any way to specify a version of the required libraries or run with Python 3?

Imports Duplicate Messages

It seems like this tool imports duplicate messages? If a message already exists in the account and in the mbox file, the message is imported anyways. Does it do any checks on the destination account when importing?

Lots of 400 errors with no further details

I just imported about 15 years of mail history using this script, except I only imported about half of it because I got so many 400 errors. Here's an example:

2017-07-10T01:19:04 (+0000) 7625 ERROR process_mbox_files (import-mailbox-to-gmail.py:257) Failed to import mbox message
Traceback (most recent call last):
  File "./import-mailbox-to-gmail.py", line 250, in process_mbox_files
    media_body=media).execute(num_retries=args.num_retries)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/_helpers.py", line 133, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
HttpError: <HttpError 400 when requesting https://www.googleapis.com/upload/gmail/v1/users/autarch%40urth.org/messages/import?internalDateSource=dateHeader&neverMarkSpam=true&uploadType=multipart&processForCalendar=false&fields=id&alt=json returned "Bad Request">

This doesn't really tell me anything useful.

Was this a transient error? Should I retry my entire import? Is something broken in the mbox file? What's going on.

Overall the google importing experience has been disappointing. I wrote a bit about this at https://blog.urth.org/2017/07/04/going-full-gmail/ if you're interested in some of the other problems I had.

Apple mail export .mbox problem?

I am trying to migrate mbox files that I exported from the Apple mail software.
However, the system says can't process mbox files for user [email protected]
And then says:

IOError: [Errno 21] Is a directory: 'Users/.../..../ xxxxx.mbox'

I have the fear that the Apple does not extract compatible mbox files for the migration. Any help is welcome.

Seems not to install correctly!

Hi there,

I hope my post is not too stupid for you guys.

I am writing to ask what is wrong in the installation as I get this error at the command line:
fatal error C1083

What could have gone wrong?

Is there a way I can just reinstall the whole thing?

Jalal

oauth2client deprecated

Symptom

Nowadays pip install --upgrade google-api-python-client PyOpenSSL will cause our script to fail like this:

Traceback (most recent call last):
  File "./import-mailbox-to-gmail.py", line 36, in <module>
    from oauth2client.service_account import ServiceAccountCredentials
ImportError: No module named oauth2client.service_account

Cause

The google-api-python-client package no longer requires or installs the oauth2client package.

Workaround

pip install --upgrade google-api-python-client PyOpenSSL oauth2client

Discussion

https://google-auth.readthedocs.io/en/latest/oauth2client-deprecation.html

It looks like oauth2client.service_account.ServiceAccountCredentials.from_json_keyfile_name() has moved to google.oauth2.service_account.Credentials.from_service_account_file() but the new Credentials class doesn't define an authorize() function. It appears we'll need to use google_auth_httplib2.AuthorizedHttp(credentials) instead. But I'm not sure.
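
A rough sketch of what that migration might look like, assuming the google-auth and google-auth-httplib2 packages are installed. build_authorized_http is a hypothetical helper, and this has not been checked against the script's other code paths (for example, its use of oauth2client.tools.argparser).

```python
# Scopes taken from the README's domain-wide delegation step.
SCOPES = [
    "https://www.googleapis.com/auth/gmail.insert",
    "https://www.googleapis.com/auth/gmail.labels",
]


def build_authorized_http(json_key_path, user_email):
    """Return an httplib2-style object that acts as user_email via domain-wide delegation."""
    # Imported lazily so the sketch can be loaded without the packages installed.
    from google.oauth2 import service_account
    import google_auth_httplib2

    credentials = service_account.Credentials.from_service_account_file(
        json_key_path, scopes=SCOPES)
    # with_subject() returns a copy of the credentials that impersonates the user.
    delegated = credentials.with_subject(user_email)
    return google_auth_httplib2.AuthorizedHttp(delegated)
```

The returned object could then be passed as the http argument to googleapiclient.discovery.build("gmail", "v1", http=...), in place of the object previously produced by credentials.authorize().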

Additionally, I can't find any replacement for oauth2client.tools.argparser so I guess a workaround would be to disable the inheritance of the --auth* --noauth* args? Yuk!

Can't manage to get it installed on Fedora 22

Hi,
Your tool is amazing and it works fine for me on Windows 7.
Thanks for that.
Nevertheless, it doesn't work on Fedora 22. It seems that I can't get sudo pip install --upgrade google-api-python-client PyOpenSSL to complete.

Thanks for your help.

This is my error trace:

gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -DUSE__THREAD -I/usr/include/ffi -I/usr/include/libffi -I/usr/include/python2.7 -c c/_cffi_backend.c -o build/temp.linux-x86_64-2.7/c/_cffi_backend.o

c/_cffi_backend.c:2:20: fatal error: Python.h: No such file or directory

compilation terminated.

error: command 'gcc' failed with exit status 1

----------------------------------------
Command "/usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build-BUYb8R/cffi/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-1FFo7X-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-BUYb8R/cff
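The trace points at a missing Python.h, which usually means the Python development headers are not installed; on Fedora they normally come from the python-devel package, and building cffi may also need libffi-devel. As a quick diagnostic sketch (the helper names are hypothetical), the check below reports where the running interpreter expects the header and whether it is there.

```python
import os
import sysconfig


def python_header_path():
    """Path where this interpreter expects to find Python.h."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def has_python_headers():
    """True if the C headers needed to compile extensions are present."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    print(python_header_path(), "found" if has_python_headers() else "MISSING")
```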

No emails get imported

I tried to import an mbox file downloaded from a user's content archive. No errors were reported, but no emails were imported either.
Is it not recognizing the Google mbox archive?

2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:326) *** Starting import-mailbox-to-gmail 1.5 on Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 29 2018, 20:59:26)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] ***
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:327) Arguments:
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) auth_host_name: 'localhost'
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) auth_host_port: [8080, 8090]
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) dir: '/mbox_import/[email protected]/'
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) fix_msgid: True
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) from_message: 0
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) httplib2debuglevel: 0
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) json: 'mboximport-api-c0cc83f43193.json'
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) log: 'import-mailbox-to-gmail-13278.log'
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) logging_level: 'INFO'
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) noauth_local_webserver: False
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) num_retries: 10
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:329) replace_quoted_printable: True
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:389) *** Done importing all users from directory '
/mbox_import/[email protected]/'
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:390) *** Import summary:
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:392) 0 users imported with no failures
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:394) 0 users imported with some failures
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:396) 0 users failed
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:398) 0 labels (mbox files) imported with no failures
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:400) 0 labels (mbox files) imported with some failures
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:402) 0 labels (mbox files) failed
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:404) 0 messages imported successfully
2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:406) 0 messages failed

2018-10-05T10:01:42 (-0400) 13278 INFO main (import-mailbox-to-gmail.py:410) Finished.
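One possible reading of this log, stated as an assumption: --dir points at the user's own folder rather than at its parent. The script appears to treat each subfolder of --dir as a username, so a folder with no subfolders yields 0 users and no errors. The sketch below (plan_import is a hypothetical helper, and the .mbox filter is assumed) lists what would be picked up from a given root.

```python
import os


def plan_import(root):
    """Map each user folder under `root` to the .mbox files inside it."""
    plan = {}
    # Only the immediate subfolders of root are treated as usernames.
    user_dirs = next(os.walk(root))[1]
    for user in sorted(user_dirs):
        user_path = os.path.join(root, user)
        plan[user] = sorted(
            name for name in os.listdir(user_path)
            if name.lower().endswith(".mbox"))
    return plan
```

An empty result for the path passed as --dir would match the "0 users" summary above; pointing --dir one level up should then list the user folder.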

Error encoding mail

Hello,

(Sorry for my bad English)
For some emails I get a UnicodeDecodeError; it seems the email is already in UTF-8.

13:07:01 INFO [email protected] Processing message 17 in label 'dlabbe-import'
13:07:01 ERROR [email protected] Failed to import mbox message
Traceback (most recent call last):
  File "import-mailbox-to-gmail.py", line 213, in process_mbox_files
    message_data = io.BytesIO(message.as_string().encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2147: ordinal not in range(128)
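On Python 2, message.as_string() already returns a byte string, so calling .encode('utf-8') on it first decodes it implicitly as ASCII, which fails on the 0xc3 byte. A hedged sketch of a conversion that avoids the double step follows; message_to_bytes is a hypothetical helper, not the script's actual fix.

```python
import email


def message_to_bytes(message):
    """Serialize an email message to bytes without re-encoding raw bytes."""
    if hasattr(message, "as_bytes"):
        # Python 3: serialize straight to bytes.
        return message.as_bytes()
    data = message.as_string()
    if isinstance(data, bytes):
        # Python 2: as_string() is already a byte string, use it as-is.
        return data
    return data.encode("utf-8")
```

The result can be wrapped in io.BytesIO in place of the expression on line 213.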

Expected a single 'From' header

Multiple messages fail to import with the following traceback:

2017-01-13T16:50:15 (-0500) 65284 ERROR process_mbox_files (import-mailbox-to-gmail.py:233) Failed to import mbox message
Traceback (most recent call last):
  File "import-mailbox-to-gmail.py", line 226, in process_mbox_files
    media_body=media).execute(num_retries=args.num_retries)
  File "/Library/Python/2.7/site-packages/oauth2client/_helpers.py", line 133, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/googleapiclient/http.py", line 838, in execute
    raise HttpError(resp, content, uri=self.uri)
HttpError: <HttpError 400 when requesting https://www.googleapis.com/upload/gmail/v1/users/**********/messages/import?internalDateSource=dateHeader&neverMarkSpam=true&uploadType=multipart&processForCalendar=false&fields=id&alt=json returned "Expected a single 'From' header">
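To find the affected messages before uploading, one option is to count the From headers in each message. This is only a sketch with hypothetical helper names, and it covers just one possible cause (a missing or repeated From header).

```python
import email


def from_header_count(message):
    """Number of From headers present in a parsed message."""
    return len(message.get_all("From", []))


def find_bad_from(messages):
    """Indexes of messages that do not have exactly one From header."""
    return [index for index, message in enumerate(messages)
            if from_header_count(message) != 1]
```

Passing a mailbox.mbox object for the file in question as `messages` should give the positions of the messages worth inspecting.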

NameError: name 'unicode' is not defined

When running the script, I receive the following error:

PS D:\import> python import-mailbox-to-gmail.py --json Credentials.json --dir D:\import\mbox
20:11:09 INFO [email protected] *** Starting import-mailbox-to-gmail 1.5 on Python 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC v.1916 64 bit (AMD64)] ***
20:11:09 INFO [email protected] Arguments:
20:11:09 INFO [email protected]   auth_host_name: 'localhost'
20:11:09 INFO [email protected]   auth_host_port: [8080, 8090]
20:11:09 INFO [email protected]   dir: 'D:\\import\\mbox'
20:11:09 INFO [email protected]   fix_msgid: True
20:11:09 INFO [email protected]   from_message: 0
20:11:09 INFO [email protected]   httplib2debuglevel: 0
20:11:09 INFO [email protected]   json: 'Credentials.json'
20:11:09 INFO [email protected]   log: 'import-mailbox-to-gmail-1712.log'
20:11:09 INFO [email protected]   logging_level: 'INFO'
20:11:09 INFO [email protected]   noauth_local_webserver: False
20:11:09 INFO [email protected]   num_retries: 10
20:11:09 INFO [email protected]   replace_quoted_printable: True
Traceback (most recent call last):
  File "import-mailbox-to-gmail.py", line 416, in <module>
    main()
  File "import-mailbox-to-gmail.py", line 342, in main
    for username in next(os.walk(unicode(args.dir)))[1]:
NameError: name 'unicode' is not defined

Is it possible I get the error because I try running the script with a Gmail account, not Workspace?
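The traceback itself suggests a different cause: the log shows the script running on Python 3.8.0, and the unicode built-in only exists in Python 2, so the failure happens before any account is contacted. A minimal compatibility sketch (text_type and list_user_dirs are hypothetical names) would be:

```python
import os

try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3: str is already a Unicode string


def list_user_dirs(root):
    """Immediate subfolders of `root`, as the failing line tries to list."""
    return sorted(next(os.walk(text_type(root)))[1])
```

Running this version of the script under Python 2.7 would sidestep the error without code changes. Separately, the service account setup described in the instructions is meant for Google Workspace domains.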

Problem on Mac

Hello there,

First of all, I want to thank you for this. I hope I can get it to work. Secondly, I'd like to apologise, as I'm sure there is nothing wrong with it; it's just that I am a total novice and don't understand simple terminal commands.

I have set everything up; step A is complete and I think working. Section B, I have all my mbox files ready and the file structure in place. I've created the test user.

Now, when I try to upload I get this error:

-bash: ./import-mailbox-to-gmail.py: No such file or directory

What could be causing that? Have I set up step A correctly?

Could you possibly give me the exact terminal command? This is currently what I am trying to use:

./import-mailbox-to-gmail.py --json /Users/jasonbradbury/Desktop/MBOXTEST/Gmail API-c1434271fb44.json --dir /Users/jasonbradbury/Desktop/MBOXTEST/jason@h*********t.co.uk/CVs.mbox

Thanks so much in advance!

Jason
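Three things in that command look suspect, none of them related to step A. The shell cannot find the script in the current folder, so either cd into the folder holding it or give its full path. The JSON file name contains a space, so it needs quoting. And --dir seems to expect the folder that contains the user folders, not a single .mbox file (an assumption based on how the script walks --dir). The sketch below uses a hypothetical helper and made-up paths to show the quoting.

```python
import shlex


def build_command(script, json_key, mbox_root):
    """Return a shell-safe command line for the import script."""
    argv = [script, "--json", json_key, "--dir", mbox_root]
    # shlex.quote wraps any argument containing spaces in single quotes.
    return " ".join(shlex.quote(arg) for arg in argv)


print(build_command(
    "./import-mailbox-to-gmail.py",
    "/Users/example/Desktop/MBOXTEST/Gmail API-key.json",
    "/Users/example/Desktop/MBOXTEST"))
```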

Attachments

After testing a few messages, I noticed that attachments (PDFs, CSVs, XLSX files, etc.) aren't attached to the messages after import to Gmail.

Is this by design, or am I missing a step? Or is it a restriction in how the Gmail API handles attachments, for example excluding them from messages for security reasons?

Make import faster

Problem

Importing large volumes of email takes very long, since emails are imported one by one, sequentially. The Gmail API takes up to 5 seconds to handle a single email. This is independent of network conditions; I tested from EC2 and Google Compute Engine.

Possible solutions

Batch requests could be used. The upside is that it would still work somewhat sequentially, by importing chunks of mail. The downside is that this lib would require a serious rework, since we could no longer rely on the handy service wrapper.

Or API calls could be fired in parallel (see Gmail quotas). This, of course, requires additional error and retry logic but is still the "lower hanging fruit", IMO.
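A rough sketch of the parallel option, assuming a caller-supplied import_one(message) function that performs one API call and raises on failure; every name here is hypothetical and nothing below is taken from the script. Since httplib2 is not thread-safe, each worker would need its own Http object or service instance in a real implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def import_with_retry(import_one, message, retries=3, base_delay=1.0):
    """Call import_one(message), retrying with exponential backoff."""
    for attempt in range(retries):
        try:
            return import_one(message)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))


def import_parallel(import_one, messages, workers=4, retries=3, base_delay=1.0):
    """Import messages concurrently; return (message, result, error) tuples."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(import_with_retry, import_one, m, retries, base_delay)
            for m in messages]
        # Collect in submission order so results line up with the input.
        for message, future in zip(messages, futures):
            try:
                results.append((message, future.result(), None))
            except Exception as exc:
                results.append((message, None, exc))
    return results
```

The worker count would have to stay within the per-user quota, and messages may arrive out of order, which should be harmless if Gmail dates them from the Date header.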

I'm aware that this project is not actively maintained. Still, this might be a good starting point for an intern joining the Gmail team.

Syntax error line 5

Hi.
There is no log. The stdout shows the following (Debian Jessie)

eas@debianMaster:~$ ./import-mailbox-to-gmail.py --json /home/eas/TJemail-842823h79543.json --dir /home/eas/mbox/
./import-mailbox-to-gmail.py: línea 5: error sintáctico cerca del elemento inesperado `newline'
./import-mailbox-to-gmail.py: línea 5: `<!DOCTYPE html>'

In English:

./import-mailbox-to-gmail.py: line 5: syntax error near unexpected token `newline'
./import-mailbox-to-gmail.py: line 5: `<!DOCTYPE html>'

Any ideas?

thanks!
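The `<!DOCTYPE html>` on line 5 suggests the saved file is a web page rather than the script, for example the GitHub HTML view instead of the raw file. A small check along those lines, with a hypothetical helper name:

```python
def looks_like_html(path, max_lines=20):
    """True if one of the first lines of the file starts an HTML document."""
    with open(path, "rb") as handle:
        head = [handle.readline() for _ in range(max_lines)]
    markers = (b"<!doctype html", b"<html")
    return any(line.strip().lower().startswith(markers) for line in head)
```

If it returns True, downloading the file again through GitHub's Raw view (or cloning the repository) should give the actual Python source.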

Access Token / Credentials Issue

Please help with what might seem like a simple issue. I followed the directions without trouble, but as you can see below, I am getting an error saying the credentials cannot be accessed. Please advise if I am doing something incorrectly.

C:\Users\Sandrews>C:\Python27\import-mailbox-to-gmail.py --json C:\Users\Sandrews\Downloads\115607657907XXXXXXXXX.json --dir C:\mbox
13:40:20 INFO [email protected] *** Starting import-mailbox-to-gmail 1.3 on Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:42:59) [MSC v.1500 32 bit (Intel)] ***
13:40:20 INFO [email protected] Arguments:
13:40:20 INFO [email protected] auth_host_name: 'localhost'
13:40:20 INFO [email protected] auth_host_port: [8080, 8090]
13:40:20 INFO [email protected] dir: 'C:\mbox'
13:40:20 INFO [email protected] fix_msgid: True
13:40:20 INFO [email protected] httplib2debuglevel: 0
13:40:20 INFO [email protected] json: 'C:\Users\Sandrews\Downloads\115607657907XXXXXXXXX.json'
13:40:20 INFO [email protected] log: 'import-mailbox-to-gmail-29668.log'
13:40:20 INFO [email protected] logging_level: 'INFO'
13:40:20 INFO [email protected] noauth_local_webserver: False
13:40:20 INFO [email protected] num_retries: 10
13:40:20 INFO [email protected] replace_quoted_printable: True
13:40:20 INFO [email protected] Processing user [email protected]
13:40:20 ERROR [email protected] Can't get access token for user [email protected]
13:40:20 ERROR [email protected] Can't process user [email protected]
Traceback (most recent call last):
File "C:\Python27\import-mailbox-to-gmail.py", line 303, in main
credentials = get_credentials(username)
File "C:\Python27\import-mailbox-to-gmail.py", line 122, in get_credentials
scopes=SCOPES).create_delegated(username)
File "C:\Python27\lib\site-packages\oauth2client\service_account.py", line 223, in from_json_keyfile_name
revoke_uri=revoke_uri)
File "C:\Python27\lib\site-packages\oauth2client\service_account.py", line 172, in _from_parsed_json_keyfile
'Expected', client.SERVICE_ACCOUNT)
ValueError: ('Unexpected credentials type', None, 'Expected', 'service_account')
13:40:20 INFO [email protected] *** Done importing all users from directory 'C:\mbox'
13:40:20 INFO [email protected] *** Import summary:
13:40:20 INFO [email protected] 0 users imported with no failures
13:40:20 INFO [email protected] 0 users imported with some failures
13:40:20 INFO [email protected] 1 users failed
13:40:20 INFO [email protected] 0 labels (mbox files) imported with no failures
13:40:20 INFO [email protected] 0 labels (mbox files) imported with some failures
13:40:20 INFO [email protected] 0 labels (mbox files) failed
13:40:20 INFO [email protected] 0 messages imported successfully
13:40:20 INFO [email protected] 0 messages failed

13:40:20 INFO [email protected] *** Check log file import-mailbox-to-gmail-29668.log for detailed errors.
13:40:20 INFO [email protected] Finished.
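The ValueError says the JSON file has no "type": "service_account" entry, which usually means a different kind of credentials file was downloaded (for example an OAuth client ID file) instead of the service account key from section A. A sketch of a pre-flight check follows; key_file_problem is a hypothetical helper.

```python
import json


def key_file_problem(path):
    """Return None if `path` looks like a service account key, else a reason."""
    with open(path) as handle:
        data = json.load(handle)
    if data.get("type") == "service_account":
        missing = [key for key in ("client_email", "private_key")
                   if key not in data]
        return "missing fields: " + ", ".join(missing) if missing else None
    if "installed" in data or "web" in data:
        return "this looks like an OAuth client ID file, not a service account key"
    return "unexpected credentials type: %r" % data.get("type")
```

If the check reports a problem, creating a new JSON key for the service account and passing that file to --json is the likely fix.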

readme.md has stale instructions for setting up credentials

Steps 4-15 seem stale because the UI I'm seeing doesn't match the steps. There's an "API Manager", and on the left there are Overview and Credentials. Clicking on Overview shows Popular APIs, including Gmail.

Clicking on Credentials shows a "New credentials" drop-down, then "Service account key". To create a service account key, there's a service account drop-down with the options "App Engine default service account", "Compute Engine default service account" and "New Service Account". The readme doesn't make it clear which of those three I'm supposed to pick, so I chose "New Service Account" and then Close, as stated in step 10.

Step 11 makes no sense because there is no service account email address where I'm left at the end of step 10. I'm still in Credentials, and there is a list of service account keys, with one key that has an ID with a long number, but there's no email address.

In step 12, there is nothing called "Client ID" on this page. There is an "ID" field with a long number, but this number plugged into step 14 yields "This client name has not been registered with Google yet."

If I go back to API Manager > Credentials, on the right is "Manage service accounts". If I click that, there's a list of accounts: "App Engine default service account", "Compute Engine default service account" and "gmailmigrate", which is the one I created in step 9. Here there is an email address, but it's not clickable. Nothing on this line is clickable. There is still no Client ID or Client Name anywhere on this page.

So... stuck.

OK, finally this is what worked:

Go to API Manager > Credentials > Manage service accounts, find "gmailmigrate", and on the far right side open the pull-down menu (the dots icon) to edit the service account. Check "Enable Google Apps Domain-wide Delegation", give it a name, and click Save. NOW there is a "View Client ID" option for "gmailmigrate", and that number works in step 13.
