rholder / esthree
An S3 client that just works
License: Apache License 2.0
This looks similar to what's happening in aws/aws-sdk-java#1092:
java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
MacBook-Pro:$ java -version
java version "10.0.2" 2018-07-17
Java(TM) SE Runtime Environment 18.3 (build 10.0.2+13)
Java HotSpot(TM) 64-Bit Server VM 18.3 (build 10.0.2+13, mixed mode)
I'll update .travis.yaml to include JDK 9, 10, and 11, and I'll add a test to show this breaking for the missing modules if it's not immediately obvious from that update.
This seems like a silly error to see when you don't have any credentials available:
java.util.concurrent.ExecutionException: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
Handle the case where we just want to hit a public AWS bucket.
Add Javadoc to the gh-pages branch for the 0.1.1 and 0.2.0 releases.
The progress bar is pretty simple right now and doesn't do any estimation to determine how long anything might take. Add something that gives an ETA.
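One simple way to get an ETA is to assume the average transfer rate so far holds for the rest of the transfer. The sketch below is only an illustration of that arithmetic, not esthree's progress bar code; the class and method names are hypothetical.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical helper, not part of esthree.
final class EtaSketch {

    // Estimates the remaining time by assuming the average rate so far continues.
    // Returns empty when no bytes have moved yet, since no rate can be derived.
    static Optional<Duration> estimateRemaining(long bytesDone, long bytesTotal, Duration elapsed) {
        if (bytesDone <= 0 || bytesTotal < bytesDone) {
            return Optional.empty();
        }
        double millisPerByte = (double) elapsed.toMillis() / bytesDone;
        long remainingMillis = Math.round(millisPerByte * (bytesTotal - bytesDone));
        return Optional.of(Duration.ofMillis(remainingMillis));
    }

    // Formats a duration as HH:MM:SS, the same shape as the timer the progress bar prints.
    static String format(Duration d) {
        long s = d.getSeconds();
        return String.format("%02d:%02d:%02d", s / 3600, (s % 3600) / 60, s % 60);
    }
}
```

An average-rate estimate jumps around early in a transfer; smoothing over a recent window would be a reasonable refinement.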
esthree get -v s3://potato
Produces:
java.lang.NullPointerException
at com.github.rholder.esthree.cli.GetCommand.run(GetCommand.java:67)
at com.github.rholder.esthree.Main.main(Main.java:66)
That's not very helpful. Make this message more helpful.
A 0 byte file is being created whenever even a bogus get command is issued. Don't create a file unless we're sure something on the other end actually exists.
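One way to avoid the stray file is to open the remote source first, write to a temporary file, and move it into place only after the copy succeeds. This is a minimal sketch under that assumption, not the actual esthree implementation; the Source interface and the ".esthree-" prefix are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of esthree.
final class SafeDownloadSketch {

    // Stand-in for whatever opens the remote object; it throws if the object is missing.
    interface Source {
        InputStream open() throws IOException;
    }

    static void download(Source source, Path target) throws IOException {
        // Opening the source first means nothing touches the disk for a bogus request.
        try (InputStream in = source.open()) {
            Path dir = target.toAbsolutePath().getParent();
            Path temp = Files.createTempFile(dir, ".esthree-", ".part");
            try {
                Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
                // The target only appears (or is replaced) once the data is complete.
                Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
            } finally {
                // No-op after a successful move; cleans up after a failed copy.
                Files.deleteIfExists(temp);
            }
        }
    }
}
```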
Add a command to list all buckets (or just bolt it on to ls).
When I esthree get a file with the same filename as an existing local file, it opens the existing file for writing without truncating or otherwise removing the contents. It then writes data starting at the beginning of the file. This means that if the S3 file is smaller than the existing file, the end of the original file stays around. This makes it rather confusing when looking at log files that suddenly contain the tail end of an old log file with the same name. 🦀
Example:
# aaa contains 400 `a`'s. bbb contains 200 `b`'s
$ ls -tlr
-rw-rw-r-- 1 megs megs 400 Jun 26 17:25 aaa.txt
-rw-rw-r-- 1 megs megs 200 Jun 26 17:25 bbb.txt
# push bbb up with the same filename as aaa
$ esthree put bbb.txt s3://<blah>/aaa.txt
[==================================================] 100% 200 B / 200 B 00:00:00
$ esthree get s3://<blah>/aaa.txt
[==================================================] 100% 200 B / 200 B 00:00:00
# note aaa still is 400 bytes
$ ls -tlr
-rw-rw-r-- 1 megs megs 200 Jun 26 17:25 bbb.txt
-rw-rw-r-- 1 megs megs 400 Jun 26 17:26 aaa.txt
# and there you have it. 200 bytes of b's and the rest of the file untouched
$ cat aaa.txt
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
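The output above is what you would expect if the destination is opened for writing without truncation. The sketch below contrasts the two behaviors using only the standard library; it is an illustration, not the code esthree actually uses, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of esthree.
final class TruncateSketch {

    // Overwrites from offset 0 but leaves any bytes beyond data.length in place,
    // which reproduces the "200 b's followed by 200 a's" result.
    static void writeWithoutTruncate(Path target, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(target.toFile(), "rw")) {
            raf.write(data);
        }
    }

    // Discards the existing contents before writing, so the file ends up
    // exactly data.length bytes long.
    static void writeWithTruncate(Path target, byte[] data) throws IOException {
        try (OutputStream out = Files.newOutputStream(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            out.write(data);
        }
    }
}
```

If random access is needed for chunked writes, calling setLength(0) on the RandomAccessFile before writing is another option.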
Add the ability to create a bucket.
esthree help gives help
esthree --help throws an error
esthree put help throws an error
esthree put --help gives help
Also seems to do this for get and get-multi; didn't try the others.
It looks like esthree is susceptible to aws/aws-sdk-java#444 as evidenced by this error:
java.util.concurrent.ExecutionException: com.amazonaws.services.s3.model.AmazonS3Exception: AWS authentication requires a valid Date or x-amz-date header (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: AAAAREDACTEDAAAA), S3 Extended Request ID: AAAAREDACTEDAAAA
This was observed from:
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
Update joda-time or the SDK to fix this in the next release.
"Tried 5 times and failed" tells me very little about what went wrong.
ORIGINAL ISSUE BELOW
Ran into this while testing a script locally that does pushd /mnt/PLACE
Evidence:
catalog ❯ cat ~/.aws/credentials
[default]
aws_secret_access_key = lolnope
aws_access_key_id = STILL NOPE
catalog-import git:master ❯ esthree get s3://SECRET_BUCKET/SECRET FILE
[===> ] 7% 5.99 MB / 83.72 MB 00:01:07
# Look it works
git:master ❯ pushd /mnt/PLACE ⏎ ✭
/mnt/catalog ~/work/ops-bucket/catalog-import ~
catalog ❯ esthree get s3://SECRET_BUCKET/SECRET FILE
com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 5 attempts.
catalog ❯ popd ⏎
~/work/ops-bucket/catalog-import ~
catalog-import git:master ❯ esthree get s3://SECRET_BUCKET/SECRET FILE
[===> ] 7% 5.99 MB / 83.72 MB 00:01:07
# Look it works again
It'd be great to be able to pass the AWS key/secret along with the region (if that's not already an option) via the CLI. Having to set up a config file on remote managed nodes is just one more hoop I have to jump through when using Chef or Fabric.
Thanks for your contributions to the community!
Would be a nice feature to have for certain things (like retrieving redshift results for instance).
Right now I'm using s3cmd's sync option, but would love my s3 client of choice to cover this use case.
Long time listener, first time caller, here ...
I am currently working around this easily enough by checking for an empty stdout.
Here's the behavior I'd like it to emulate:
$ ls filename-that-does-not-exist ; echo $?
ls: filename-that-does-not-exist: No such file or directory
1
And here's what esthree does:
$ esthree ls s3:/my-bucket-name/filename-that-does-not-exist ; echo $?
0
To be clear, I don't care about the error message (which may be printed to stdout or stderr, I don't care) ... I only care about the return code.
Apologies that this isn't a pull request ... :(
But I just wanted to record this nit somewhere.
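For what it's worth, the return-code behavior requested above could be as small as mapping an empty listing to a non-zero status. A minimal sketch, assuming the ls command ends up with a list of matched keys; the names and the choice of 1 (mirroring ls) are hypothetical.

```java
import java.util.List;

// Hypothetical helper, not part of esthree.
final class LsExitCodeSketch {

    static final int EXIT_OK = 0;
    static final int EXIT_NOT_FOUND = 1; // mirrors what ls returns for a missing file

    // An empty listing is treated as "nothing matched" and reported as a failure.
    static int exitCodeFor(List<String> matchedKeys) {
        return matchedKeys.isEmpty() ? EXIT_NOT_FOUND : EXIT_OK;
    }
}
```

The CLI entry point would then finish with something like System.exit(LsExitCodeSketch.exitCodeFor(keys)).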
When the AWS region isn't known it starts trying to guess based on jodatime's ZoneInfoMap. But that file doesn't exist inside the binary and this appears:
java.io.IOException: Resource not found: "org/joda/time/tz/data/ZoneInfoMap" ClassLoader: sun.misc.Launcher$AppClassLoader@2396649c
at org.joda.time.tz.ZoneInfoProvider.openResource(ZoneInfoProvider.java:210)
at org.joda.time.tz.ZoneInfoProvider.<init>(ZoneInfoProvider.java:127)
at org.joda.time.tz.ZoneInfoProvider.<init>(ZoneInfoProvider.java:86)
at org.joda.time.DateTimeZone.getDefaultProvider(DateTimeZone.java:514)
at org.joda.time.DateTimeZone.getProvider(DateTimeZone.java:413)
at org.joda.time.DateTimeZone.forID(DateTimeZone.java:216)
at org.joda.time.DateTimeZone.getDefault(DateTimeZone.java:151)
at org.joda.time.chrono.ISOChronology.getInstance(ISOChronology.java:79)
at org.joda.time.DateTimeUtils.getChronology(DateTimeUtils.java:266)
at org.joda.time.format.DateTimeFormatter.selectChronology(DateTimeFormatter.
at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:
at com.amazonaws.util.DateUtils.parseRFC822Date(DateUtils.java:193)
at com.amazonaws.services.s3.internal.ServiceUtils.parseRfc822Date(ServiceUti
at com.amazonaws.services.s3.internal.AbstractS3ResponseHandler.populateObjec
at com.amazonaws.services.s3.internal.S3ObjectResponseHandler.handle(S3Object
at com.amazonaws.services.s3.internal.S3ObjectResponseHandler.handle(S3Object
at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:9
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.jav
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:46
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3714)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:114
at com.github.rholder.esthree.command.Get$1.call(Get.java:113)
at com.github.rholder.esthree.command.Get$1.call(Get.java:108)
at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(Attem
at com.github.rholder.retry.Retryer.call(Retryer.java:110)
at com.github.rholder.esthree.command.Get.retryingGet(Get.java:108)
at com.github.rholder.esthree.command.Get.call(Get.java:78)
at com.github.rholder.esthree.cli.GetCommand.run(GetCommand.java:96)
at com.github.rholder.esthree.Main.execute(Main.java:89)
at com.github.rholder.esthree.Main.main(Main.java:42)
Let's add that to the build so we never see this again.
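A build-time check could be as simple as asserting that the resource named in the stack trace is visible on the classpath of the packaged binary. A minimal sketch using only the standard library; the class and method names are hypothetical, and the check is only meaningful when run against the final jar rather than the development classpath.

```java
// Hypothetical helper, not part of esthree.
final class ResourceCheckSketch {

    // True when the named resource can be found by the system class loader.
    static boolean isOnClasspath(String resourcePath) {
        return ClassLoader.getSystemResource(resourcePath) != null;
    }
}
```

A test run against the packaged jar could then fail the build when isOnClasspath("org/joda/time/tz/data/ZoneInfoMap") returns false.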
When using get-multi, the percentage appears correctly, but the current / total number of bytes shown appears to reflect only the current chunk.
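One way to make the byte counts reflect the whole transfer is to have every chunk report into a single shared counter. This is a sketch of that idea, not how get-multi is actually structured; the class and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of esthree.
final class AggregateProgressSketch {

    private final long totalBytes;
    private final AtomicLong transferred = new AtomicLong();

    AggregateProgressSketch(long totalBytes) {
        this.totalBytes = totalBytes;
    }

    // Called by any chunk, from any thread, as bytes arrive.
    long add(long bytes) {
        return transferred.addAndGet(bytes);
    }

    int percent() {
        return totalBytes == 0 ? 100 : (int) (transferred.get() * 100 / totalBytes);
    }

    // Current / total across all chunks, in the same "N B / M B" shape the bar prints.
    String summary() {
        return transferred.get() + " B / " + totalBytes + " B";
    }
}
```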
e.g.
esthree put path/to/local/file.zip s3://bucket/contained-in/
currently keys the object as s3://bucket/contained-in/
would like it keyed as s3://bucket/contained-in/file.zip
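The requested behavior amounts to treating a key that ends in a slash as a prefix and appending the local file name. A minimal sketch of that rule, not the actual put implementation; the class and method names are hypothetical.

```java
import java.nio.file.Path;

// Hypothetical helper, not part of esthree.
final class PutKeySketch {

    // A key that is empty or ends with "/" is treated as a prefix, so the local
    // file name is appended; any other key is used exactly as given.
    static String resolveKey(String requestedKey, Path localFile) {
        if (requestedKey.isEmpty() || requestedKey.endsWith("/")) {
            return requestedKey + localFile.getFileName().toString();
        }
        return requestedKey;
    }
}
```

With this rule, a key of contained-in/ and a local path of path/to/local/file.zip resolve to contained-in/file.zip, while an explicit key such as contained-in/other.zip is left alone.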
Add the ability to remove a bucket. Also add a --fast option or some such to perform a multi-threaded delete of every item in a bucket.
Bump up the version from 1.10 to 2.2.1.
Don't rely on the JCommander usage() message for displaying hierarchical help menus because it's hard to read.
Since joda-time uses a pile of timezone data files that are loaded at runtime, they need to be explicitly listed in order for autojar to include them. To work around this, a portion of the most up-to-date joda-time jar is extracted and then checked directly into a known location in this repository. Every time joda-time updates (as per the AWS SDK's upstream dependency), the timezone data has to be updated manually. Automate this process so we never have to think about it again.
Add the ability to remove an object from a bucket.