munir131 / attachment-downloader
Gmail attachment downloader
License: MIT License
I get this error with any search with more than a few emails:
GaxiosError: Too many concurrent requests for user
at Gaxios.<anonymous> (/attachment-downloader/node_modules/gaxios/build/src/gaxios.js:72:27)
at Generator.next (<anonymous>)
at fulfilled (/attachment-downloader/node_modules/gaxios/build/src/gaxios.js:16:58)
at processTicksAndRejections (internal/process/task_queues.js:89:5) {
It would be great if this library could throttle itself to stay within Gmail's API limits!
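One way to stay under a per-user concurrency limit is to cap how many requests are in flight at once. The following is only a minimal sketch of that idea, not the project's code: mapWithConcurrency is a hypothetical helper, and the right limit for Gmail is an assumption you would have to tune.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit`
// calls in flight at any time. Results keep the order of `items`.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner pulls the next unclaimed item until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(Math.max(1, limit), items.length);
  const runners = [];
  for (let i = 0; i < runnerCount; i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}
```

In the downloader this could wrap whatever function fetches a single message, for example mapWithConcurrency(messageIds, 5, fetchOne), where fetchOne is a stand-in name for that function.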
Hello!
Is it possible to download attachments linked from Google Drive?
It seems that this doesn't work at the moment.
I get the following error:
node index.js --label TODL
TypeError: Cannot read property 'length' of undefined
at /home/micha/src/attachment-downloader/index.js:186:34
at arrayMap (/home/micha/src/attachment-downloader/node_modules/lodash/lodash.js:639:23)
at Function.map (/home/micha/src/attachment-downloader/node_modules/lodash/lodash.js:9554:14)
at pluckAttachment (/home/micha/src/attachment-downloader/index.js:180:22)
at /home/micha/src/attachment-downloader/index.js:52:30
This fixed the error, but I don't know if it's the "correct" fix:
diff --git a/index.js b/index.js
index 418aeb0..4985763 100644
--- a/index.js
+++ b/index.js
@@ -178,7 +178,7 @@ function saveFile(fileName, content) {
function pluckAttachment(mails) {
return _.compact(_.map(mails, (m) => {
- if (!m.data) {
+ if (!m.data || !m.data.payload.parts) {
return undefined;
}
const attachment = {
Use case: I only want to save PDFs. Right now, I'm running cd files && rm -- ^*.pdf in zsh after running this script, in order to remove the unwanted files.
This could instead be filtered during the query using filename:pdf, as per the docs.
There is, however, one problem with only modifying the query: if an email has more than one attachment, both will still get downloaded. To make this work as expected, there would also need to be logic in function fetchAndSaveAttachment() to filter out unwanted files.
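The per-file filtering described above could look roughly like this. It is a sketch under assumptions: filterByExtension is a hypothetical helper, and the attachment objects are assumed to carry the file name in a name property, which may not match the real shape used in index.js.

```javascript
// Hypothetical helper: keep only attachments whose file name ends in one
// of the allowed extensions (case-insensitive, with or without a leading dot).
function filterByExtension(attachments, extensions) {
  const allowed = extensions.map((e) => e.toLowerCase().replace(/^\./, ''));
  return attachments.filter((a) => {
    const name = (a.name || '').toLowerCase(); // `name` is an assumed field
    const dot = name.lastIndexOf('.');
    return dot !== -1 && allowed.includes(name.slice(dot + 1));
  });
}
```

Calling it just before the files are saved would drop the non-PDF attachments of a message that also contains a PDF.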
It would be nice to combine date range filtering with labels, e.g. before:2004/04/18, after:04/16/2004. This might require some refactoring for the CLI to support a combination of filters, perhaps as a series of questions:
before
after
If none are selected, the user gets 'all'.
Docs:
https://developers.google.com/gmail/api/guides/filtering
https://support.google.com/mail/answer/7190
Since the goal is to only get emails with attachments, the query can be structured with has:attachment, as per these docs.
The result shouldn't be any different, but including it will probably improve performance significantly.
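The search operators mentioned in these requests (label:, before:, after:, has:attachment, filename:) can be combined into a single query string. Here is a minimal sketch; buildQuery and its option names are hypothetical, dates are passed through as given, and labels containing spaces are not handled.

```javascript
// Hypothetical helper: combine optional filters into one Gmail search query.
// Every query starts with has:attachment, since only mails with
// attachments are of interest to the downloader.
function buildQuery({ label, from, before, after, fileType } = {}) {
  const terms = ['has:attachment'];
  if (label) terms.push(`label:${label}`);
  if (from) terms.push(`from:${from}`);
  if (after) terms.push(`after:${after}`);
  if (before) terms.push(`before:${before}`);
  if (fileType) terms.push(`filename:${fileType}`);
  return terms.join(' ');
}
```

The resulting string would be passed as the q parameter of the message list request. Filters that are left out simply do not appear, so an empty call yields only has:attachment.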
The plan is to provide a UI-based application for end users, so that no coding is needed and the UX improves. Currently, I plan to use Electron, as we can get two benefits
My Gmail currently has ~300 emails. The files directory was empty even after the script said Done.
I modified return getAllMails(auth, 500) to return getAllMails(auth, 100), and was able to get files.
I see two problems with the code:
I think that this is causing the bug.
https://github.com/munir131/attachment-downloader/blob/master/index.js#L285-L292
If there's only one page, messageIds will be an empty array. It also looks like the last page is being skipped even when a user has over 500 emails, since messageIds.concat( is only being called when there's a nextPageToken.
Perhaps:
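if (response.data) {
  // Always collect this page's ids, including the last (or only) page.
  messageIds = messageIds.concat(response.data.messages || [])
  if (response.data.nextPageToken) {
    spinner.text = "Reading page: " + ++pageCounter
    // Return here so the "All pages are read" branch below is not reached early.
    return resolve(getAllMails(auth, maxResults, response.data.nextPageToken))
  }
}
spinner.text = "All pages are read"
resolve(messageIds)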
function getListOfMailIdByLabel(...) looks like it may not have the same issue, since message ids are being added to the array in both conditions.
When I tried my above code, there were more attachments, which confirms that the current code is missing a page.
maxResults isn't being passed down to the recursive function; instead, 500 is being used for all subsequent pages after the first.
https://github.com/munir131/attachment-downloader/blob/master/index.js#L288
Should be resolve(getAllMails(auth, maxResults, response.data.nextPageToken))
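Both problems (the skipped last page and the hard-coded page size) go away if the paging is written as a loop that always collects the current page before checking for a next one. The sketch below illustrates that pattern only. getAllMessageIds is a hypothetical name, and listPage is a stand-in for the real Gmail list call, assumed to take { maxResults, pageToken } and resolve to { messages, nextPageToken }.

```javascript
// Hypothetical sketch: read every page of message ids.
// `listPage` is an assumed wrapper around the Gmail list request.
async function getAllMessageIds(listPage, maxResults) {
  const ids = [];
  let pageToken; // undefined requests the first page
  do {
    const page = await listPage({ maxResults, pageToken });
    // Collect the page first, so a single page or the last page is kept.
    ids.push(...(page.messages || []));
    pageToken = page.nextPageToken;
  } while (pageToken);
  return ids;
}
```

Because maxResults is an ordinary parameter used on every iteration, the same page size applies to the first page and to all following ones.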
Is there a download command? I checked the help, but it doesn't show one.
Enter the code from that page here: 4/twF41gClfys6SrAwY7WFACfLEFMsSH1JnIWZMrJW5tEbQkxMISh_Nj4
Error retrieving access token { FetchError: request to https://oauth2.googleapis.com/token failed, reason: connect ETIMEDOUT 172.217.160.74:443
at ClientRequest.<anonymous> (E:\迅雷下载\gmail_bulk\attachment-downloader\node_modules\node-fetch\lib\index.js:1444:11)
at ClientRequest.emit (events.js:189:13)
at TLSSocket.socketErrorListener (_http_client.js:392:9)
at TLSSocket.emit (events.js:189:13)
at emitErrorNT (internal/streams/destroy.js:82:8)
at emitErrorAndCloseNT (internal/streams/destroy.js:50:3)
at process._tickCallback (internal/process/next_tick.js:63:19)
message:
'request to https://oauth2.googleapis.com/token failed, reason: connect ETIMEDOUT 172.217.160.74:443',
type: 'system',
errno: 'ETIMEDOUT',
code: 'ETIMEDOUT',
config:
{ method: 'POST',
url: 'https://oauth2.googleapis.com/token',
data:
'code=4%2FtwF41gClfys6SrAwY7WFACfLEFMsSH1JnIWZMrJW5tEbQkxMISh_Nj4&client_id=280431205846-gtd3mva42iv230vhas4do6hsdj5f0jbv.apps.googleusercontent.com&client_secret=X1tQT4lWJsKYAKjLCiTzF1A4&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&grant_type=authorization_code&code_verifier=',
headers:
{ 'Content-Type': 'application/x-www-form-urlencoded',
'User-Agent': 'google-api-nodejs-client/3.1.0',
Accept: 'application/json' },
params: [Object: null prototype] {},
paramsSerializer: [Function: paramsSerializer],
body:
'code=4%2FtwF41gClfys6SrAwY7WFACfLEFMsSH1JnIWZMrJW5tEbQkxMISh_Nj4&client_id=280431205846-gtd3mva42iv230vhas4do6hsdj5f0jbv.apps.googleusercontent.com&client_secret=X1tQT4lWJsKYAKjLCiTzF1A4&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&grant_type=authorization_code&code_verifier=',
validateStatus: [Function: validateStatus],
responseType: 'json' }
The initial plan is to organise the code better.
Hello, firstly I'd like to say nice project!
I tried using it. It found the label I was looking for and all the emails under that label, but it didn't find any attachments.
I traced the problem to index.js, line 189, in the function pluckAllAttachments:
if (!p.body || !p.body.attachmentId) {
return undefined;
}
It seems that not all messages that have attachments have an attachmentId. Sometimes the content is found in the same object but in the data key. See the Gmail API: https://developers.google.com/gmail/api/reference/rest/v1/users.messages.attachments
It seems the messages that are causing the problem have nested parts, which contain the relevant attachmentId.
The messages in question were forwarded with an iPad, FWIW.
EDIT: removed something I was wrong about, added what the actual problem was.
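Since a Gmail message payload is a tree of parts (each part may have its own parts array), nested attachments can be found by walking that tree instead of looking only at the top level. This is a sketch of the idea, not the project's implementation. collectAttachmentParts is a hypothetical helper, and treating "has a filename and either body.attachmentId or body.data" as the test for an attachment is an assumption.

```javascript
// Hypothetical helper: walk a message payload depth-first and return every
// part that looks like an attachment, at any nesting depth.
function collectAttachmentParts(payload) {
  const found = [];
  (function walk(part) {
    if (!part) return; // tolerates a missing payload
    const body = part.body || {};
    // Assumption: an attachment has a file name and either an attachmentId
    // (content fetched separately) or inline base64url data.
    if (part.filename && (body.attachmentId || body.data)) {
      found.push(part);
    }
    (part.parts || []).forEach(walk); // a missing `parts` array is fine
  })(payload);
  return found;
}
```

A walk like this returns all attachments of a message rather than only the first, and it does not throw when payload.parts is undefined.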
I received a Google email with the subject line "Migrate your OAuth out-of-band flow to an alternative method before Oct. 3, 2022" on May 3, 2022. It identified the OAuth credentials created for this tool as being impacted.
https://developers.googleblog.com/2022/02/making-oauth-flows-safer.html#disallowed-oob
Please consider updating the tool to use an allowed OAuth mechanism.
I tried both the "from:" and the label approach; the result is the same.
If the email has multiple attachments (in my case, they are all images), only the first one will be downloaded.
Also, when I'm using the "from:" filter, it's even worse: I found that at least one email wasn't downloaded at all (it's exactly the 51st one out of 51 emails that isn't downloaded, not sure if relevant).
Is it possible or easy to add an option to select multiple labels and create folders under ./files?
We have a few hundred labels to go through one by one and create a folder for each and then download the files to be sorted.
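One piece of such a feature would be mapping each label to its own folder under ./files. The sketch below shows only that mapping, under stated assumptions: folderForLabel is a hypothetical helper, nested labels are assumed to use "/" as the separator in their names, and the set of characters replaced is a conservative guess at what is unsafe in file names.

```javascript
const path = require('path');

// Hypothetical helper: turn a label name into a folder path below `baseDir`.
// Each "/"-separated segment becomes one directory level.
function folderForLabel(baseDir, labelName) {
  const segments = labelName
    .split('/')
    .map((s) => s.replace(/[<>:"\\|?*]/g, '_').trim()) // replace unsafe characters
    .filter((s) => s && s !== '.' && s !== '..'); // drop empty and relative segments
  return path.join(baseDir, ...segments);
}
```

The folder could then be created with fs.mkdirSync(dir, { recursive: true }) before the attachments for that label are written.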