juliacloud / awss3.jl
AWS S3 Simple Storage Service interface for Julia.
License: Other
AWS provides the delimiter query parameter to separate "folders" into a "CommonPrefixes" list, so you can optionally perform a recursive search. AWSS3 is neither recursing nor returning the CommonPrefixes, so anything in subfolders is mysteriously hidden.
Based on the current interface, it seems to me like we should list all descendant objects, which means removing the delimiter argument. What do you think?
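To make the tradeoff concrete, here is roughly what the delimiter does, sketched in plain Julia over a flat key list (`list_with_delimiter` is a hypothetical helper for illustration, not part of AWSS3):

```julia
# Simulate S3's delimiter behaviour over a flat list of object keys:
# keys directly under `prefix` come back as objects, anything deeper
# is collapsed into a CommonPrefixes-style "folder" entry.
function list_with_delimiter(keys, prefix; delimiter = "/")
    objects = String[]
    common = Set{String}()
    for k in keys
        startswith(k, prefix) || continue
        rest = k[length(prefix)+1:end]
        r = findfirst(delimiter, rest)
        if r === nothing
            push!(objects, k)       # direct child object
        else
            # everything up to and including the first delimiter
            push!(common, prefix * rest[1:last(r)])
        end
    end
    return objects, sort(collect(common))
end
```

Dropping the delimiter would mean every key comes back in `objects`; keeping it means subfolder contents only appear via the CommonPrefixes list, which AWSS3 currently discards.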
For comparison, here's the aws-cli behaviour:
$ aws s3 sync path/to/dir s3://mybucket-13puwo3rvcn01/does/not/exist --dryrun
(dryrun) upload: path/to/dir/file1.jl to s3://mybucket-13puwo3rvcn01/does/not/exist/file1.jl
(dryrun) upload: path/to/dir/file2.jl to s3://mybucket-13puwo3rvcn01/does/not/exist/file2.jl
$ aws s3 sync help
DESCRIPTION
Syncs directories and S3 prefixes. Recursively copies new and updated
files from the source directory to the destination. Only creates fold-
ers in the destination if they contain one or more files.
$ aws --version
aws-cli/1.17.3 Python/3.7.3 Darwin/17.7.0 botocore/1.14.3
Compare the behaviour with S3Path:
julia> local_path = abs(p"path/to/dir");
julia> s3path = p"s3://mybucket-13puwo3rvcn01/does/not/exist";
julia> sync(local_path, s3path)
ERROR: ArgumentError: S3Path folders must end with '/': s3://mybucket-13puwo3rvcn01/does/not/exist
^ I'm not sure if we want this behaviour or not. I think S3Path is consistent on this, but it is inconsistent with aws-cli.
Anyway, this brings us on to:
julia> s3path = p"s3://mybucket-13puwo3rvcn01/does/not/exist/";
julia> sync(local_path, s3path)
ERROR: The parent of s3://mybucket-13puwo3rvcn01/does/not/exist/ does not exist. Pass `recursive=true` to create it.
^ This, I think, is just a bug?
Using AWSS3 v0.6.9, FilePathsBase v0.7.0
I'm using asyncmap with concurrency 20 to download about 10,000 files with an average size of 100 KB from S3. Sometimes it finishes, but more often than not it just hangs forever. I've created a public bucket to reproduce the issue:
using AWSCore
using AWSS3

function main()
    aws = aws_config()
    get(i) = begin
        try
            data = s3_get(aws, "public-test-bucket-quatrix", "signals/10737/1574792548712.wav")
            @info "got" i=i data=length(data)
        catch e
            @error "error" exception=e
        end
    end
    asyncmap(get, 1:6000, ntasks=20)
end

main()
You might need to run this more than once for it to fail; the first time it finishes pretty fast, and the second run is a lot slower. I'm assuming some S3 throttling takes place, and eventually the connection becomes unusable and the process hangs on epoll_pwait forever.
Here's the stacktrace when hitting ctrl+c:
┌ Info: got
│ i = 5995
└ data = 155990
┌ Info: got
│ i = 6000
└ data = 155990
┌ Info: got
│ i = 5998
└ data = 155990
... seem to hang forever
^C
signal (2): Interrupt
in expression starting at /data/repos/research_infra/scripts/reproduce.jl:20
epoll_pwait at /build/glibc-OTsEL5/glibc-2.27/misc/../sysdeps/unix/sysv/linux/epoll_pwait.c:42
uv__io_poll at /workspace/srcdir/libuv/src/unix/linux-core.c:270
uv_run at /workspace/srcdir/libuv/src/unix/core.c:359
jl_task_get_next at /buildworker/worker/package_linux64/build/src/partr.c:448
poptaskref at ./task.jl:660
wait at ./task.jl:667
wait at ./condition.jl:106
_trywait at ./asyncevent.jl:110
sleep at ./asyncevent.jl:128 [inlined]
macro expansion at /home/quatrix/.julia/packages/HTTP/nMACo/src/TimeoutRequest.jl:26 [inlined]
#2 at ./task.jl:333
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2135 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2305
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1631 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:659
unknown function (ip: 0xffffffffffffffff)
unknown function (ip: 0xffffffffffffffff)
Allocations: 38111606 (Pool: 38086378; Big: 25228); GC: 63
I tried setting readtimeout=10, which doesn't seem to have an effect in this case. I also tried @async workers that take jobs from a channel, to make sure it's not an asyncmap issue, and was able to reproduce. I'm using Julia 1.3.1 with the latest AWSCore and AWSS3:
(v1.3) pkg> status
Status `~/.julia/environments/v1.3/Project.toml`
[4f1ea46c] AWSCore v0.6.8
[1c724243] AWSS3 v0.6.8
[cd3eb016] HTTP v0.8.12
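For reference, the channel-based worker pattern mentioned above looks roughly like this (sketched with an arbitrary `work` function in place of `s3_get`, so it runs without AWS; `run_workers` is a hypothetical helper):

```julia
# Spawn `ntasks` async workers that pull job ids from a channel,
# mirroring the asyncmap(get, 1:6000, ntasks=20) setup above.
function run_workers(njobs, ntasks, work)
    jobs = Channel{Int}(njobs)
    results = Channel{Any}(njobs)
    foreach(i -> put!(jobs, i), 1:njobs)
    close(jobs)                        # workers stop once drained
    @sync for _ in 1:ntasks
        @async for i in jobs           # each worker takes jobs until closed
            put!(results, work(i))     # here `work` stands in for s3_get
        end
    end
    close(results)
    collect(results)
end
```

Since the hang reproduces with this pattern too, the problem is unlikely to be in asyncmap itself.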
Had a really helpful offline conversation with @rofinn about this, in the context of #76 and differing from other interfaces such as aws-cli.
Some points I took away:
We shouldn't special-case S3Paths in order to match aws-cli or any other interface for interacting with S3. As Rory put it:
My guess is that the cli is doing a bunch of special casing to make that work nicely with S3. The goal is to avoid that as much as possible with the path API.
From the S3 docs:
The Amazon S3 console treats all objects that have a forward slash ("/") character as the last (trailing) character in the key name as a folder, for example examplekeyname/. You can't upload an object that has a key name with a trailing "/" character using the Amazon S3 console. However, you can upload objects that are named with a trailing "/" with the Amazon S3 API by using the AWS CLI, AWS SDKs, or REST API.
An object that is named with a trailing "/" appears as a folder in the Amazon S3 console. The Amazon S3 console does not display the content and metadata for such an object. When you use the console to copy an object named with a trailing "/", a new folder is created in the destination location, but the object's data and metadata are not copied.
I don't think the goal of an S3Path should be to emulate a completely different command line tool over the parent type and existing filepaths API.
It's particularly important to be watchful / consistent with S3Paths, given the weirdness of S3 "directories".
https://docs.aws.amazon.com/AmazonS3/latest/user-guide/using-folders.html
I am attempting to use s3_put with a bucket that requires AES256 SSE (server side encryption). I am using the following syntax but suspect I have it malformed somehow. The error I get back from AWS is AccessDenied, but I am not sure the put request is properly formed.
s3_put(aws, "<bucket_name>", "<folder>/<file_name>", data; metadata = Dict("x-amz-server-side-encryption" => "AES256"))
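One thing worth double-checking: S3 user metadata is transmitted with an `x-amz-meta-` prefix, so if the `metadata` keyword maps onto user metadata (as its name suggests), the SSE header would go out as `x-amz-meta-x-amz-server-side-encryption` and be ignored by S3. A pure sketch of that distinction (`as_user_metadata` is a hypothetical helper for illustration, not the package's code):

```julia
# User metadata keys are prefixed on the wire; real HTTP headers are not.
as_user_metadata(meta) = Dict("x-amz-meta-" * k => v for (k, v) in meta)

wire = as_user_metadata(Dict("x-amz-server-side-encryption" => "AES256"))
# The header S3 actually looks for never appears under its real name:
haskey(wire, "x-amz-server-side-encryption")             # false
haskey(wire, "x-amz-meta-x-amz-server-side-encryption")  # true
```

If that's what is happening here, the SSE header would need to be sent as a real HTTP header (e.g. via a lower-level request path) rather than through `metadata`.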
When I run the tests of AWSS3, the credential ID was printed out in the terminal, even though the key was hidden. It might be better to hide the ID too.
Link to job: https://travis-ci.org/JuliaCloud/AWSS3.jl/jobs/637529647#L291
Local: Error During Test at /home/travis/build/JuliaCloud/AWSS3.jl/test/s3path.jl:257
Got exception outside of a @test
SystemError (with /tmp3f62dbb2-569a-4d65-8380-39f976e2c405): mkdir: Permission denied
@samoconnor Not sure if this is the right place to ask, but is there support available for reading compressed files on S3?
If there is not, do I treat the response to the call
s3_get(aws, "bucket", "key")
as a binary stream and process it as such?
Hi!
I'm unfortunately having some trouble with a local endpoint we're using here. This is what I tried:
aws = AWSCore.aws_config(creds = AWSCore.AWSCredentials("XXXXXXXXXXX",
"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"),
endpoint = "https://s3.my.local.endpoint.de")
s3_create_bucket(aws, "test.bucket")
s3_list_buckets(aws) # works well, shows me "test.bucket"
s3_put(aws, "test.bucket", "testkey", "Hello!") # fails
The error message I get is:
IOError(EOFError() during request(https://s3.my.local.endpoint.de/test.bucket/testkey))
in s3_put at AWSS3/src/AWSS3.jl:619
in s3_put at AWSS3/src/AWSS3.jl:619
in #s3_put#27 at AWSS3/src/AWSS3.jl:651
in at base/<missing>
in #s3#1 at AWSS3/src/AWSS3.jl:81
in macro expansion at Retry/src/repeat_try.jl:206
in macro expansion at AWSS3/src/AWSS3.jl:107
in do_request at AWSCore/src/AWSCore.jl:399
in macro expansion at Retry/src/repeat_try.jl:206
in macro expansion at AWSCore/src/AWSCore.jl:421
in http_request at AWSCore/src/http.jl:22
in macro expansion at Retry/src/repeat_try.jl:206
in macro expansion at AWSCore/src/http.jl:31
in at base/<missing>
in #request#1 at HTTP/src/MessageRequest.jl:44
in at base/<missing>
in #request#1 at HTTP/src/ExceptionRequest.jl:19
in at base/<missing>
in #request#1 at HTTP/src/ConnectionRequest.jl:32
in at base/<missing>
in #request#1 at HTTP/src/StreamRequest.jl:53
in macro expansion at base/task.jl:302
in macro expansion at HTTP/src/StreamRequest.jl:57
in startread at HTTP/src/Streams.jl:146
in readheaders at HTTP/src/Messages.jl:468
in readuntil at HTTP/src/IOExtras.jl:171
It looks like the package is using path-style buckets, while our endpoint uses DNS-style buckets. Is that something that could easily be changed? Or might the problem be something else?
Any help is appreciated :-) Thanks a lot!
Don't know if it's the right place for this, but when looking through my pull request JuliaCloud/AWSCore.jl#79 and the subsequent merge, I noticed that the package verification of AWSS3 fails. I checked the logs there and saw it apparently timed out. Curious whether it was due to my commits, I checked the last merges back until JuliaCloud/AWSCore.jl#72 and for all of them, the same thing happened during the verification of AWSS3.
Hello all,
when I run the following code:
using AWSS3
using AWSCore
aws = aws_config(region="us-west-2")
objects = s3_list_objects(aws, "landsat-pds")
it gets stuck and I have to restart the REPL. If I run
aws s3 ls landsat-pds
in the Linux Terminal it gives me the correct output:
PRE L8/
PRE c1/
PRE runs/
PRE tarq/
PRE tarq_corrupt/
PRE test/
2017-05-17 15:42:27 23767 index.html
2016-08-19 19:12:04 105 robots.txt
2019-04-08 13:07:12 39 run_info.json
2019-04-08 16:59:25 52 run_list.txt
2018-08-29 02:45:15 45603307 scene_list.gz
This URL (https://landsat-pds.s3.us-west-2.amazonaws.com/) is also working and open to the public. I was trying for a while and couldn't pinpoint the issue here, since debugging inside the s3() function also didn't work for me.
When I replace the bucket with a private one that I created on my AWS account for testing purposes, the Julia code works just fine.
Can anyone reproduce or explain this behaviour?
Thanks in advance,
Martin
PS: Credentials were set up beforehand by running
aws configure
and they are loaded by aws_config correctly.
When first looking at the README I thought s3_get would take a source and a destination (similar to aws s3 cp). When that didn't work I tried s3_get(config, bucket, key, dest), which resulted in the following StackOverflowError:
julia> using AWSCore, AWSS3
julia> aws = AWSCore.aws_config();
julia> s3_get(aws, "", "", "")
ERROR: StackOverflowError:
Stacktrace:
[1] s3_get(::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Vararg{Any,N} where N) at /Users/omus/.julia/v0.6/AWSS3/src/AWSS3.jl:174
[2] #s3_get#3(::Array{Any,1}, ::Function, ::Dict{Symbol,Any}, ::Vararg{Any,N} where N) at /Users/omus/.julia/v0.6/AWSS3/src/AWSS3.jl:174
[3] s3_get(::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Vararg{Any,N} where N) at /Users/omus/.julia/v0.6/AWSS3/src/AWSS3.jl:174
[4] #s3_get#3(::Array{Any,1}, ::Function, ::Dict{Symbol,Any}, ::Vararg{Any,N} where N) at /Users/omus/.julia/v0.6/AWSS3/src/AWSS3.jl:174
[5] s3_get(::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Vararg{Any,N} where N) at /Users/omus/.julia/v0.6/AWSS3/src/AWSS3.jl:174
[6] #s3_get#3(::Array{Any,1}, ::Function, ::Dict{Symbol,Any}, ::Vararg{Any,N} where N) at /Users/omus/.julia/v0.6/AWSS3/src/AWSS3.jl:174
[7] s3_get(::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Dict{Symbol,Any}, ::Vararg{Any,N} where N) at /Users/omus/.julia/v0.6/AWSS3/src/AWSS3.jl:174
[8] #s3_get#3(::Array{Any,1}, ::Function, ::Dict{Symbol,Any}, ::Vararg{Any,N} where N) at /Users/omus/.julia/v0.6/AWSS3/src/AWSS3.jl:174
...
julia> Path("foo") / "bar" / "baz"
p"foo/bar/baz"
julia> S3Path("s3://foo") / "bar" / "baz"
p"s3://foobar/baz"
# expected p"s3://foo/bar/baz"
I think this is happening because it's just dispatching to join
julia> @which S3Path("s3://foo") / "bar" / "baz"
/(root::AbstractPath, pieces::Union{AbstractString, AbstractPath}...) in FilePathsBase at /Users/nick/.julia/packages/FilePathsBase/oi7XZ/src/path.jl:215
https://github.com/rofinn/FilePathsBase.jl/blob/v0.7.0/src/path.jl#L214-L216
and for join we identify S3 "directories" by the trailing /:
julia> @which join(S3Path("s3://foo"), "bar", "baz")
join(prefix::S3Path, pieces::AbstractString...) in AWSS3 at /Users/nick/.julia/packages/AWSS3/VMuUL/src/s3path.jl:94
https://github.com/JuliaCloud/AWSS3.jl/blob/v0.6.8/src/s3path.jl#L93-L97
As the S3Path docstring explains, this is to do with S3 "faking" directories:
https://github.com/JuliaCloud/AWSS3.jl/blob/v0.6.8/src/s3path.jl#L18-L23
But while the join behaviour is arguably correct (although I think it's arguable):
julia> join(S3Path("s3://foo"), "bar", "baz")
p"s3://foobar/baz"
it leads to surprising behaviour when using the / syntax.
I wonder whether it is worth special-casing the / syntax for S3Paths to behave like
julia> S3Path("s3://foo") / "bar" / "baz"
p"s3://foo/bar/baz"
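A sketch of the proposed semantics in plain string terms (`slash_join` is a hypothetical helper, not the FilePathsBase method):

```julia
# Insert "/" between pieces, but keep any trailing "/" on the final
# piece so S3's "directory" marker convention survives the join.
function slash_join(prefix::AbstractString, pieces::AbstractString...)
    isempty(pieces) && return String(prefix)
    parts = String[rstrip(prefix, '/')]
    append!(parts, rstrip(p, '/') for p in pieces[1:end-1])
    push!(parts, String(pieces[end]))       # last piece keeps its slash
    join(parts, "/")
end
```

This gives `/` the "insert a separator" behaviour people expect from filepaths, while still letting a trailing slash mark an S3 "directory".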
Seems inconvenient, can we have this collected by default?
Syncing an S3 directory to a local directory is erroring with:
julia> using AWSS3
julia> using FilePathsBase
julia> src = Path("s3::bucket/folder/")
julia> dest = Path(pwd())
julia> sync(src, dest)
ERROR: ArgumentError: s3::bucket/folder/ does not exist
https://github.com/rofinn/FilePathsBase.jl/blob/v0.8.0/src/path.jl#L538 checks the src directory existence and that returns false for directories on S3:
julia> using AWSS3
julia> using FilePathsBase
julia> src = Path("s3::bucket/folder/")
julia> exists(src)
false
I've no idea what behaviour is expected or intended for joining S3Paths together. There is no test case for it, but here's what happens right now. We should consider whether this is what is wanted, then either add a test or fix it.
julia> abc = S3Path("s3://a/b/c/"); # trailing slash on `c`
julia> xyz = S3Path("s3://x/y/z");
julia> abc / xyz
p"s3://a/b/c/y/z/" # trailing slash
julia> abc = S3Path("s3://a/b/c"); # no trailing slash on `c`
julia> abc / xyz
p"s3://a/b/c/y/z" # no trailing slash
julia> a = S3Path("s3://a");
julia> x = S3Path("s3://x/");
julia> a / x
p"s3://a"
Hi @samoconnor, after a few minutes of reading from S3, I got some HTTP.ClosedError. I have some retry logic, but it looks to hang forever. Do you have any idea how to reproduce this?
catch an error while getindex in BigArray: HTTP.ClosedError:
Exception: ErrorException("")
error receiving response; connection was closed prematurely with type of HTTP.ClosedError
catch an error while getindex in BigArray: HTTP.ClosedError:
Exception: ErrorException("")
error receiving response; connection was closed prematurely with type of HTTP.ClosedError
catch an error while getindex in BigArray: HTTP.ClosedError:
Exception: ErrorException("")
error receiving response; connection was closed prematurely with type of HTTP.ClosedError
GET on a URL causes Requests.jl to parse the resulting data by its content type. Using GET with a byte range of 0-0 will return a single byte (the first byte in the file), causing such parsing to fail.
Better to use HEAD instead of GET when testing for object existence.
If I set raw=true in s3_get, will the file be decompressed on the S3 server side or the client side?
Also, when I set the "Content-Encoding" in s3_put, will the data be automatically compressed on the client side, or do I need to compress it beforehand?
E.g. my bucket has a directory called images in it, and a file as well. But readdir shows nothing:
julia> using AWSS3, FilePathsBase
julia> readdir(S3Path("s3://oxinaboxpublic/"))
0-element Array{SubString{String},1}
julia> readdir(S3Path("s3://oxinaboxpublic/images/"))
4-element Array{SubString{String},1}:
"chainloop.svg"
"chainwithborder.png"
"chainwithborder.svg"
"chainwithborder_grey.png"
And sync fails:
julia> sync(S3Path("s3://oxinaboxpublic/images/"), p"out")
julia> readdir(p"out")
4-element Array{String,1}:
"chainloop.svg"
"chainwithborder.png"
"chainwithborder.svg"
"chainwithborder_grey.png"
julia> mkdir("outall")
"outall"
julia> sync(S3Path("s3://oxinaboxpublic/"), p"outall")
ERROR: ArgumentError: s3://oxinaboxpublic// does not exist
Stacktrace:
[1] sync(::typeof(FilePathsBase.should_sync), ::S3Path, ::PosixPath; delete::Bool, overwrite::Bool) at /Users/
oxinabox/.julia/packages/FilePathsBase/Oyg1p/src/path.jl:537
[2] sync at /Users/oxinabox/.julia/packages/FilePathsBase/Oyg1p/src/path.jl:537 [inlined]
[3] #sync#16 at /Users/oxinabox/.julia/packages/FilePathsBase/Oyg1p/src/path.jl:532 [inlined]
[4] sync(::S3Path, ::PosixPath) at /Users/oxinabox/.julia/packages/FilePathsBase/Oyg1p/src/path.jl:532
[5] top-level scope at REPL[16]:1
I think something is wrong, I guess, with how we are interpreting prefixes as directories.
Related to #34 maybe
I'm not sure whether to file this issue with HTTP.jl or AWSS3.jl. When using HTTP.jl 0.6.8, everything works fine and I can upload files to S3 (even from docker/AWS Batch). However, upgrading HTTP.jl to 0.6.9 causes this error. It seems that it's related to my application having authenticated earlier (to retrieve data from S3) and then that connection times out later before it has a chance to upload results back to S3.
I am using latest AWSS3 (0.3.7) and AWSCore (0.3.8) on Julia 0.6.2.
ERROR: LoadError: RequestTimeTooSkewed -- The difference between the request time and the current time is too large.
HTTP.ExceptionRequest.StatusError(403, HTTP.Messages.Response:
"""
HTTP/1.1 403 Forbidden
x-amz-request-id: DA48EXXXXXXX
x-amz-id-2: eLxPBqGgXXXXXXXX
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Mon, 07 May 2018 14:30:32 GMT
Connection: close
Server: AmazonS3
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>RequestTimeTooSkewed</Code><Message>The difference between the request time and the current time is too large.</Message><RequestTime>20180507T141448Z</RequestTime><ServerTime>2018-05-07T14:30:34Z</ServerTime><MaxAllowedSkewMilliseconds>900000</MaxAllowedSkewMilliseconds><RequestId>DA48EXXXXXXX</RequestId><HostId>eLxPBqGgXXXXXXXXXXX=</HostId></Error>""")
Stacktrace:
[1] #request#1(::Array{Any,1}, ::Function, ::Type{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}, ::HTTP.URIs.URI, ::Vararg{Any,N} where N) at /root/.julia/v0.6/HTTP/src/ExceptionRequest.jl:22
[2] (::HTTP.#kw##request)(::Array{Any,1}, ::HTTP.#request, ::Type{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}, ::HTTP.URIs.URI, ::HTTP.Messages.Request, ::String) at ./<missing>:0
[3] #request#1(::VersionNumber, ::String, ::Void, ::Void, ::Array{Any,1}, ::Function, ::Type{HTTP.MessageRequest.MessageLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::String) at /root/.julia/v0.6/HTTP/src/MessageRequest.jl:44
[4] (::HTTP.#kw##request)(::Array{Any,1}, ::HTTP.#request, ::Type{HTTP.MessageRequest.MessageLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::String) at ./<missing>:0
[5] macro expansion at /root/.julia/v0.6/AWSCore/src/http.jl:31 [inlined]
[6] macro expansion at /root/.julia/v0.6/Retry/src/repeat_try.jl:206 [inlined]
[7] http_request(::Dict{Symbol,Any}) at /root/.julia/v0.6/AWSCore/src/http.jl:22
[8] macro expansion at /root/.julia/v0.6/AWSCore/src/AWSCore.jl:421 [inlined]
[9] macro expansion at /root/.julia/v0.6/Retry/src/repeat_try.jl:206 [inlined]
[10] do_request(::Dict{Symbol,Any}) at /root/.julia/v0.6/AWSCore/src/AWSCore.jl:399
[11] macro expansion at /root/.julia/v0.6/AWSS3/src/AWSS3.jl:107 [inlined]
[12] macro expansion at /root/.julia/v0.6/Retry/src/repeat_try.jl:206 [inlined]
[13] #s3#1(::Dict{String,String}, ::String, ::Dict{String,String}, ::String, ::String, ::Bool, ::Bool, ::Function, ::Dict{Symbol,Any}, ::String, ::String) at /root/.julia/v0.6/AWSS3/src/AWSS3.jl:81
[14] (::AWSS3.#kw##s3)(::Array{Any,1}, ::AWSS3.#s3, ::Dict{Symbol,Any}, ::String, ::String) at ./<missing>:0
…
Since s3_get_meta uses a HEAD request, it does not get S3 error codes, which are provided in the body of a response. The caller can get back an HTTP status code but has no further information.
The solution is to undo the change that implemented #12 and go back to 0-0 range requests.
I'm finding I can't list buckets, because of this missing method.
julia> s3_list_buckets(aws)
ERROR: MethodError: `getindex` has no method matching getindex(::LightXML.XMLDocument, ::ASCIIString)
in s3_list_buckets at /Users/kevin/.julia/v0.4/AWSS3/src/AWSS3.jl:192
Maybe AWSS3.jl is depending on an untagged (or unavailable?) version of LightXML?
The previous TravisCI setup no longer exists; we need to create a new one:
After upgrading to Julia 0.6, I got an array-size error after decompression. @samoconnor do you think this is related to the previous HTTP.jl issue?
seung-lab/BigArrays.jl#28
My S3Dicts test works though.
https://github.com/seung-lab/S3Dicts.jl/blob/master/test/runtests.jl
Not sure what I am doing wrong.
julia> using AWSCore, AWSS3
julia> aws = aws_config(creds = "li-istvan", region="eu-west-1")
Dict{Symbol,Any} with 2 entries:
:creds => "li-istvan"
:region => "eu-west-1"
julia> s3_list_buckets(aws)
ERROR: MethodError: no method matching check_credentials(::String)
Closest candidates are:
check_credentials(::AWSCredentials; force_refresh) at /Users/l1x/.julia/packages/AWSCore/BzoMV/src/AWSCredentials.jl:108
Stacktrace:
[1] sign_aws4!(::Dict{Symbol,Any}, ::Dates.DateTime) at /Users/l1x/.julia/packages/AWSCore/BzoMV/src/sign.jl:73
[2] sign!(::Dict{Symbol,Any}, ::Dates.DateTime) at /Users/l1x/.julia/packages/AWSCore/BzoMV/src/sign.jl:18
[3] sign! at /Users/l1x/.julia/packages/AWSCore/BzoMV/src/sign.jl:15 [inlined]
[4] macro expansion at /Users/l1x/.julia/packages/AWSCore/BzoMV/src/AWSCore.jl:403 [inlined]
[5] macro expansion at /Users/l1x/.julia/packages/Retry/0jMye/src/repeat_try.jl:192 [inlined]
[6] do_request(::Dict{Symbol,Any}) at /Users/l1x/.julia/packages/AWSCore/BzoMV/src/AWSCore.jl:393
[7] macro expansion at /Users/l1x/.julia/packages/AWSS3/yrZcs/src/AWSS3.jl:111 [inlined]
[8] macro expansion at /Users/l1x/.julia/packages/Retry/0jMye/src/repeat_try.jl:192 [inlined]
[9] #s3#1(::Dict{String,String}, ::String, ::Dict{String,String}, ::String, ::String, ::Bool, ::Bool, ::typeof(AWSS3.s3), ::Dict{Symbol,Any}, ::String, ::String) at /Users/l1x/.julia/packages/AWSS3/yrZcs/src/AWSS3.jl:85
[10] #s3 at ./none:0 [inlined] (repeats 2 times)
[11] s3_list_buckets(::Dict{Symbol,Any}) at /Users/l1x/.julia/packages/AWSS3/yrZcs/src/AWSS3.jl:466
[12] top-level scope at REPL[3]:1
julia>
Version: Version 1.2.0 (2019-08-20)
Hi @samoconnor, I am trying to use your package. It is pretty handy: the download and upload work great. I still need to set the content-encoding of the files to gzip. It seems that this is not supported?
I did find an option to set the content-type though.
It appears that the intent is for files with certain well-known extensions (e.g. .html) to automatically have their Content-Type set. However, as currently implemented, it will set types on all files that end in those letters, even if there is no dot before them.
For example, rather than ".html", it seems like the pattern should be "\\.html".
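The difference is easy to demonstrate with plain regexes (these patterns are illustrative, not the package's actual lookup table):

```julia
loose  = r"html$"      # current behaviour: matches any key ending in "html"
strict = r"\.html$"    # proposed: require the dot before the extension

occursin(loose,  "page.html")  # true  — intended match
occursin(loose,  "xhtml")      # true  — false positive
occursin(strict, "page.html")  # true
occursin(strict, "xhtml")      # false — fixed
```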
AWSSQS.jl had a similar issue where bors was not a required check before merging.
TODO:
It is weird that some data cannot be put to S3.
using AWSCore
using AWSS3
cre = AWSCore.aws_config()
v = UInt8[0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x01, 0x32]
s3_put(cre, "bkt", "key", v[1:end]) # this does not work
s3_put(cre, "bkt", "key", v[2:end]) # this works
r = AWSS3.s3(aws, "GET", "mybucket"; query = q)
where aws[:region] = "us-east-1" and "mybucket" is in another region (us-west-2) returns:
"Code" => "PermanentRedirect"
"Message" => "The bucket you are attempting to access must be addressed using the specified endpoint. Please send all fu…
"Bucket" => "mybucket"
"Endpoint" => "mybucket.s3-us-west-2.amazonaws.com"
"RequestId" => ...
"HostId" => ...
In s3_sign_url, it gathers credentials with:
query = SSDict("AWSAccessKeyId" => aws[:creds].access_key_id,
"x-amz-security-token" => get(aws, "token", ""),
"Expires" => string(expires),
"response-content-disposition" => "attachment")
token should probably be retrieved from aws[:creds] instead of just aws. And it should probably be added to the query dict optionally, like is done in sign.jl. Perhaps this?
query = SSDict("AWSAccessKeyId" => aws[:creds].access_key_id,
"Expires" => string(expires),
"response-content-disposition" => "attachment")
if aws[:creds].token != ""
query["x-amz-security-token"] = aws[:creds].token
end
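The conditional shape can be sanity-checked in isolation (`FakeCreds` is a stand-in for `AWSCore.AWSCredentials` and `sign_query` a hypothetical helper, used here only so the sketch runs without AWS):

```julia
struct FakeCreds                 # stands in for AWSCore.AWSCredentials
    access_key_id::String
    token::String
end

# Mirror of the proposed s3_sign_url change: only attach the security
# token to the query when the credentials actually carry one.
function sign_query(creds, expires)
    query = Dict("AWSAccessKeyId" => creds.access_key_id,
                 "Expires" => string(expires),
                 "response-content-disposition" => "attachment")
    if creds.token != ""
        query["x-amz-security-token"] = creds.token
    end
    return query
end
```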
It would be good to get a way to set metadata with uploads, as per http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#UserMetadata
@samoconnor thoughts on the best way to implement this?
The compat section of the Project.toml for this package is very restrictive and is wreaking havoc on my environment. Are there specific reasons for all these restrictions? (The reason for my skepticism is that this package is largely a simple HTTP wrapper.) If not, I'd beg that we remove them. Unnecessary compat restrictions are really harmful: they cause packages to be downgraded unnecessarily, and if more packages were this restrictive, there'd be a high risk of unsatisfiable requirements.
Hello,
I have checked out the latest master of AWSCore and AWSS3, because of the issue with Requests
JuliaWeb/Requests.jl#171
Unfortunately, even the simplest listing the content of the bucket as
using AWSCore
using AWSS3
aws = AWSCore.aws_config()
AWSS3.s3_list_objects("bucket_name")
fails with
ERROR: MethodError: no method matching escape(::Array{Pair,1})
Closest candidates are:
escape(::Any, ::Any, ::Array{String,1}) at /.julia/v0.6/HTTP/src/uri.jl:173
escape(::Any, ::Any, ::Any) at /.julia/v0.6/HTTP/src/uri.jl:171
escape(::Array{UInt8,1}) at /.julia/v0.6/HTTP/src/uri.jl:169
...
Stacktrace:
[1] macro expansion at /.julia/v0.6/AWSS3/src/AWSS3.jl:484 [inlined]
[2] macro expansion at /.julia/v0.6/Retry/src/repeat_try.jl:198 [inlined]
[3] s3_list_objects(::Dict{Symbol,Any}, ::String, ::String) at /.julia/v0.6/AWSS3/src/AWSS3.jl:468
[4] s3_list_objects(::Dict{Symbol,Any}, ::String) at /.julia/v0.6/AWSS3/src/AWSS3.jl:453
[5] s3_list_objects(::String) at /.julia/v0.6/AWSS3/src/AWSS3.jl:491
when I have checked out the latest master of HTTP, I got this error
ERROR: MethodError: no method matching request(::HTTP.Client, ::HTTP.Request, ::HTTP.RequestOptions; stream=false, verbose=false)
Closest candidates are:
request(::HTTP.Client, ::HTTP.Request, ::HTTP.RequestOptions, ::Bool, ::Array{HTTP.Response,1}, ::Int64, ::Bool) at /.julia/v0.6/HTTP/src/client.jl:299 got unsupported keyword arguments "stream", "verbose"
request(::HTTP.Client, ::HTTP.Request; opts, stream, history, retry, verbose, args...) at /.julia/v0.6/HTTP/src/client.jl:341
request(::HTTP.Client, ::Any, ::HTTP.URIs.URI; headers, body, stream, verbose, args...) at /.julia/v0.6/HTTP/src/client.jl:357
...
Stacktrace:
[1] macro expansion at /.julia/v0.6/AWSS3/src/AWSS3.jl:484 [inlined]
[2] macro expansion at /.julia/v0.6/Retry/src/repeat_try.jl:198 [inlined]
[3] s3_list_objects(::Dict{Symbol,Any}, ::String, ::String) at /.julia/v0.6/AWSS3/src/AWSS3.jl:468
It seems to me that the versioning is a little bit messed up. Would it be possible to tag new stable versions?
Thanks a lot,
Tomas
Hi, I encountered an error when using a signed URL generated by the s3_sign_url function. The response to the signed URL is this:
<Error>
<Code>InvalidArgument</Code>
<Message>Requests specifying Server Side Encryption with AWS KMS managed keys require AWS Signature Version 4.</Message>
<ArgumentName>Authorization</ArgumentName>
<ArgumentValue>null</ArgumentValue>
<RequestId> ... the request ID ... </RequestId>
<HostId>... the host ID ...</HostId>
</Error>
Correct me if I'm wrong, but from the code it looks like we're using AWS Signature V2?
Lines 777 to 800 in c881edf
Maybe we can have a keyword parameter like signature_version that defaults to v4?
We're about to start using this to replace our use of AWSSDK for S3, so it would be nice to move this into the org, if you (@samoconnor) are okay with that.
Using AWSS3 together with FTPClient seems to trigger a problem (in HTTP).
using AWSS3, AWSCore
using FTPClient
aws = AWSCore.default_aws_config()
s3_list_buckets(aws)
gives the following problem
ERROR: error compiling #sslconnection#18: error compiling Type: could not load library "C:\Users\retracted_id\.julia\packages\MbedTLS\X4xar\deps\usr\bin\libmbedtls.dll"
The specified procedure could not be found.
Stacktrace:
[1] #getconnection#11(::Int64, ::Int64, ::Int64, ::Int64, ::Bool, ::Base.Iterators.Pairs{Symbol,Union{Nothing, Int64},Tuple{Symbol,Symbol},NamedTuple{(:iofunction, :verbose),Tuple{Nothing,Int64}}}, ::Function, ::Type{HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}}, ::SubString{String}, ::SubString{String}) at .\none:0
[2] (::getfield(HTTP.ConnectionPool, Symbol("#kw##getconnection")))(::NamedTuple{(:reuse_limit, :iofunction, :verbose, :require_ssl_verification),Tuple{Int64,Nothing,Int64,Bool}}, ::typeof(HTTP.ConnectionPool.getconnection), ::Type{HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}}, ::SubString{String}, ::SubString{String}) at .\none:0
[3] #request#1(::Nothing, ::Type, ::Int64, ::Base.Iterators.Pairs{Symbol,Any,Tuple{Symbol,Symbol,Symbol},NamedTuple{(:iofunction, :verbose, :require_ssl_verification),Tuple{Nothing,Int64,Bool}}}, ::Function, ::Type{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}, ::HTTP.URIs.URI, ::HTTP.Messages.Request, ::String) at C:\Users\retracted_id\.julia\packages\HTTP\U2ZVp\src\ConnectionRequest.jl:41
[4] (::getfield(HTTP, Symbol("#kw##request")))(::NamedTuple{(:iofunction, :verbose, :require_ssl_verification),Tuple{Nothing,Int64,Bool}}, ::typeof(HTTP.request), ::Type{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}, ::HTTP.URIs.URI, ::HTTP.Messages.Request, ::String) at .\none:0
[5] #request#1(::Base.Iterators.Pairs{Symbol,Any,Tuple{Symbol,Symbol,Symbol},NamedTuple{(:iofunction, :verbose, :require_ssl_verification),Tuple{Nothing,Int64,Bool}}}, ::Function, ::Type{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}, ::HTTP.URIs.URI, ::Vararg{Any,N} where N) at C:\Users\retracted_id\.julia\packages\HTTP\U2ZVp\src\ExceptionRequest.jl:19
[6] #request at .\none:0 [inlined]
[7] #request#1(::VersionNumber, ::String, ::Nothing, ::Nothing, ::Base.Iterators.Pairs{Symbol,Integer,Tuple{Symbol,Symbol},NamedTuple{(:verbose, :require_ssl_verification),Tuple{Int64,Bool}}}, ::Function, ::Type{HTTP.MessageRequest.MessageLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::String) at C:\Users\retracted_id\.julia\packages\HTTP\U2ZVp\src\MessageRequest.jl:47
[8] (::getfield(HTTP, Symbol("#kw##request")))(::NamedTuple{(:verbose, :require_ssl_verification),Tuple{Int64,Bool}}, ::typeof(HTTP.request), ::Type{HTTP.MessageRequest.MessageLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer}}}}, ::String, ::HTTP.URIs.URI, ::Array{Pair{SubString{String},SubString{String}},1}, ::String) at .\none:0
[9] http_request(::Dict{Symbol,Any}) at C:\Users\retracted_id\.julia\packages\AWSCore\BzoMV\src\http.jl:36
[10] macro expansion at C:\Users\retracted_id\.julia\packages\AWSCore\BzoMV\src\AWSCore.jl:410 [inlined]
[11] macro expansion at C:\Users\retracted_id\.julia\packages\Retry\0jMye\src\repeat_try.jl:192 [inlined]
[12] do_request(::Dict{Symbol,Any}) at C:\Users\retracted_id\.julia\packages\AWSCore\BzoMV\src\AWSCore.jl:393
[13] macro expansion at C:\Users\retracted_id\.julia\packages\AWSS3\eYu6G\src\AWSS3.jl:108 [inlined]
[14] macro expansion at C:\Users\retracted_id\.julia\packages\Retry\0jMye\src\repeat_try.jl:192 [inlined]
[15] #s3#1(::Dict{String,String}, ::String, ::Dict{String,String}, ::String, ::String, ::Bool, ::Bool, ::Function, ::Dict{Symbol,Any}, ::String, ::String) at C:\Users\retracted_id\.julia\packages\AWSS3\eYu6G\src\AWSS3.jl:82
[16] #s3 at .\none:0 [inlined] (repeats 2 times)
[17] s3_list_buckets(::Dict{Symbol,Any}) at C:\Users\retracted_id\.julia\packages\AWSS3\eYu6G\src\AWSS3.jl:463
[18] top-level scope at none:0
whereas the following code works just fine:
using AWSS3, AWSCore
aws = AWSCore.default_aws_config()
s3_list_buckets(aws)
Trouble is that in my code I need to download some data from FTP and then work with S3. Any ideas on how to fix this, or a workaround?
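One workaround that sometimes helps with hangs like this (an assumption, not a confirmed fix) is to bound the concurrency of asyncmap via its ntasks keyword and retry failed downloads with exponential backoff. The bucket name and key list below are placeholders:

```julia
using AWSS3, AWSCore

# Exponential backoff delay in seconds for a given attempt number.
backoff_delay(attempt) = 2.0^attempt

# Hypothetical workaround sketch: cap concurrency via `ntasks` and retry
# each download a few times. `keys_to_fetch` and "mybucket" are placeholders.
function fetch_all(aws, keys_to_fetch; ntasks=5, attempts=3)
    asyncmap(keys_to_fetch; ntasks=ntasks) do key
        for attempt in 1:attempts
            try
                return s3_get(aws, "mybucket", key)
            catch
                attempt == attempts && rethrow()
                sleep(backoff_delay(attempt))
            end
        end
    end
end
```

Lowering ntasks from 20 to 5 reduces pressure on the connection pool, which is the usual suspect when downloads hang rather than error.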
I have the following config:
[4f1ea46c] AWSCore v0.6.0
[1c724243] AWSS3 v0.5.0
[0d499d91] AWSSDK v0.4.0
[c52e3926] Atom v0.8.5
[3e78a19a] Bukdu v0.4.4
[336ed68f] CSV v0.5.9
[a93c6f00] DataFrames v0.19.0
[01fcc997] FTPClient v1.0.1
[5789e2e9] FileIO v1.0.7
[59287772] Formatting v0.3.5
[cd3eb016] HTTP v0.8.4
[033835bb] JLD2 v0.1.2
[682c06a0] JSON v0.21.0
[e5e0dc1b] Juno v0.7.0
[f269a46b] TimeZones v0.9.1
Julia 1.1.1
Win10 64
This is probably related to a bug in the HTTP module.
We can leverage the signing code from AWSCore (https://github.com/JuliaCloud/AWSCore.jl/blob/master/src/sign.jl#L64) for s3_sign_url.
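For reference, the core of the SigV4 signing-key derivation that sign.jl implements can be sketched with SHA.jl. This is a simplified illustration of the AWS Signature Version 4 spec, not the exact AWSCore code:

```julia
using SHA

# Derive the SigV4 signing key: chained HMAC-SHA256 over date, region,
# service, and the literal "aws4_request", seeded with "AWS4" * secret.
function signing_key(secret::String, date::String, region::String, service::String)
    k = hmac_sha256(Vector{UInt8}("AWS4" * secret), date)
    k = hmac_sha256(k, region)
    k = hmac_sha256(k, service)
    hmac_sha256(k, "aws4_request")
end
```

The presigned-URL case then signs a canonical request string with this key rather than adding an Authorization header.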
It'd be nice if we could set defaults for s3_put (e.g., encryption, ACL rules) once rather than passing the same arguments on every call. These settings could perhaps be encapsulated in the S3Path type, so that paths operating on a specific bucket propagate the settings for us.
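As a rough sketch of what bucket-level defaults might look like (the struct and helper names are hypothetical, not an existing AWSS3 API):

```julia
# Hypothetical container for per-bucket s3_put defaults.
struct S3PutDefaults
    kwargs::Dict{Symbol,String}
end

# Merge per-call keyword arguments over the defaults; per-call values win.
with_defaults(d::S3PutDefaults; kwargs...) = merge(d.kwargs, Dict{Symbol,String}(kwargs...))

defaults = S3PutDefaults(Dict(:acl => "private", :encryption => "AES256"))
# with_defaults(defaults; acl = "public-read") keeps :encryption but overrides :acl
```

Attaching such a struct to S3Path would let every write through that path pick up the bucket's policy without repeating it at each call site.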
Currently, the package tests begin by cleaning up all buckets that begin with "ocaws.jl.test". If CI tests are all run on the same account, there is a chance that one set of tests may delete buckets being used in another set, causing unexpected behavior.
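One common fix (an assumption about how the tests could be restructured, not current behavior) is to give each CI run its own unique bucket prefix and clean up only buckets under that prefix:

```julia
using Dates, Random

# Generate a per-run test bucket prefix so parallel CI runs never collide:
# the shared "ocaws.jl.test." root plus a timestamp and a random suffix.
test_run_prefix() = string("ocaws.jl.test.",
                           Dates.format(now(UTC), "yyyymmddHHMMSS"),
                           ".", randstring('a':'z', 6))
```

Teardown would then delete only buckets starting with this run's prefix, leaving concurrent runs untouched.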
@samoconnor It seems that your AWSS3 and AWSSDK packages are undergoing a lot of changes. Even the released version is not working now. Do you have any workaround?
Do we prefer AWSSDK.jl? There is a pretty nice API there too.
Here is the error I get from Pkg.test:
Warning: AWSCore.do_request() exception: UndefVarError
ERROR: LoadError: UndefVarError: host not defined
Stacktrace:
[1] macro expansion at /Users/jpwu/.julia/v0.6/Retry/src/repeat_try.jl:162 [inlined]
[2] #s3#1(::Dict{String,String}, ::String, ::Dict{String,String}, ::String, ::String, ::Bool, ::Bool, ::Function, ::Dict{Symbol,Any}, ::String, ::String) at /Users/jpwu/.julia/v0.6/AWSS3/src/AWSS3.jl:83
[3] (::AWSS3.#kw##s3)(::Array{Any,1}, ::AWSS3.#s3, ::Dict{Symbol,Any}, ::String, ::String) at ./<missing>:0
[4] s3_list_buckets(::Dict{Symbol,Any}) at /Users/jpwu/.julia/v0.6/AWSS3/src/AWSS3.jl:433
[5] anonymous at ./<missing>:?
[6] include_from_node1(::String) at ./loading.jl:576
[7] include(::String) at ./sysimg.jl:14
[8] process_options(::Base.JLOptions) at ./client.jl:305
[9] _start() at ./client.jl:371
while loading /Users/jpwu/.julia/v0.6/AWSS3/test/runtests.jl, in expression starting on line 43
Julia: v1.2.0
AWSS3.jl: v0.6.4
When you have a file on S3 in a directory whose name is repeated in a subdirectory (e.g. subdir1/subdir1/), a readdir() works fine, but a sync() call fails with the following error:
ERROR: LoadError: LoadError: ArgumentError: Source path does not exist: s3://mattbr-sample-s3-bucket/subdir1/
Below is a sample script to replicate this issue:
using Test, AWSS3, FilePathsBase
bucket_name = "mattbr-sample-s3-bucket"
file_name = "subdir1/subdir1/test.txt"
s3_create_bucket(bucket_name)
s3_put(bucket_name, file_name, "test")
@test readdir(S3Path("s3://$bucket_name/subdir1/subdir1/")) == ["test.txt"]
temp_path = p"temp-directory"
mkdir(temp_path)
#@test_throws ArgumentError sync(S3Path("s3://$bucket_name/"), temp_path)
sync(S3Path("s3://$bucket_name/"), temp_path)
#rm(temp_path, force=true, recursive=true)
type Array has no field headers
Stacktrace:
[1] getproperty(::Array{UInt8,1}, ::Symbol) at ./Base.jl:33
[2] s3_upload_part(::Dict{Symbol,Any}, ::XMLDict.XMLDictElement, ::Int64, ::Array{UInt8,1}) at /home/keno/.julia/dev/AWSS3/src/AWSS3.jl:697
Looks like this function expects to get a raw response object back from the API, but instead only gets the body.
read for SystemPath, IO, and String all return a Vector{UInt8}, while read(::S3Path) returns a String. Should the S3Path behavior be changed? Strings would still be accessible through read(::S3Path, String).
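For comparison, Base's read conventions, demonstrated on an IOBuffer, show the behavior read(::S3Path) would match under the proposed change:

```julia
# Base.read returns bytes by default and a String only when asked:
bytes = read(IOBuffer("hello"))          # Vector{UInt8}
text  = read(IOBuffer("hello"), String)  # "hello"
```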
julia> s3_put(aws, bucket, "empty", "")
ERROR: MissingContentLength -- You must provide the Content-Length HTTP header.
HTTP.StatusError: received a '411 - Length Required' status in response
Stacktrace:
[1] macro expansion at /home/sachs/.julia/v0.6/Retry/src/repeat_try.jl:162 [inlined]
[2] #s3#1(::Dict{String,String}, ::String, ::Dict{String,String}, ::String, ::String, ::Bool, ::Bool, ::Function, ::Dict{Symbol,Any}, ::String, ::String) at /home/sachs/.julia/v0.6/AWSS3/src/AWSS3.jl:83
[3] (::AWSS3.#kw##s3)(::Array{Any,1}, ::AWSS3.#s3, ::Dict{Symbol,Any}, ::String, ::String) at ./<missing>:0
[4] #s3_put#27(::Dict{String,String}, ::Dict{String,String}, ::Function, ::Dict{Symbol,Any}, ::String, ::String, ::String, ::String, ::String) at /home/sachs/.julia/v0.6/AWSS3/src/AWSS3.jl:617
[5] s3_put(::Dict{Symbol,Any}, ::String, ::String, ::String, ::String, ::String) at /home/sachs/.julia/v0.6/AWSS3/src/AWSS3.jl:586 (repeats 2 times)
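Since S3 requires a Content-Length header on every PUT, a client-side sketch of a fix would be to set the header explicitly when the body is empty. The helper below is hypothetical; whether s3_put can be made to forward such headers is an assumption:

```julia
# Ensure a Content-Length header is present; sizeof("") == 0, so empty
# bodies get "Content-Length" => "0", which S3 requires for a PUT.
function ensure_content_length(headers::Dict{String,String}, body::AbstractString)
    haskey(headers, "Content-Length") && return headers
    merge(headers, Dict("Content-Length" => string(sizeof(body))))
end
```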
The current state of this package has all tests in one file, with no real structure, and with tests dependent on previous tests.
The testing here needs to be refactored and restructured.
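A minimal sketch of the direction this could take: independent @testset blocks that each set up and tear down their own state, so no set depends on an earlier one. The testset names and placeholder assertions are illustrative:

```julia
using Test

# Each testset owns its fixtures, so sets can run (or fail) independently.
@testset "AWSS3.jl" begin
    @testset "bucket lifecycle" begin
        bucket = "ocaws.jl.test.$(rand(UInt32))"  # unique per set
        @test !isempty(bucket)
        # create bucket ... exercise ... delete bucket
    end
    @testset "object operations" begin
        @test 1 + 1 == 2  # placeholder for put/get/delete tests
    end
end
```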