Comments (15)
The current documented display options are:

- Add `--nl` to collapse these to single lines as valid newline-delimited JSON.
- Add `--array` to output a valid JSON array of objects instead.

Defaulting to pretty-printed invalid newline-delimited JSON was a weird design choice that I made!
```
% s3-credentials list-buckets
{
  "Name": "aws-cloudtrail-logs-462092780466-f2c900d3",
  "CreationDate": "2021-03-25 22:19:54+00:00"
}
{
  "Name": "simonw-test-bucket-for-s3-credentials",
  "CreationDate": "2021-11-03 21:46:12+00:00"
}
```
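As a quick illustration of why that default is awkward (a minimal check, not part of the tool): the concatenated pretty-printed objects are not one valid JSON document, while compact one-object-per-line output can be parsed line by line.

```python
import json

# Two pretty-printed objects concatenated, as the current default emits them
stream = '{\n  "Name": "bucket-one"\n}\n{\n  "Name": "bucket-two"\n}\n'

# Parsing the whole stream as a single JSON document fails with "Extra data"
try:
    json.loads(stream)
    parsed = True
except json.JSONDecodeError:
    parsed = False
print(parsed)  # False

# Newline-delimited JSON (one compact object per line) parses line by line
ndjson = '{"Name": "bucket-one"}\n{"Name": "bucket-two"}\n'
rows = [json.loads(line) for line in ndjson.splitlines()]
print(len(rows))  # 2
```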
Fixing this would require a major version bump if I had hit 1.0 already.
from s3-credentials.
In that case I think the safest default would be to turn the above default into a pretty-printed streaming JSON array:
```
% s3-credentials list-buckets
[
  {
    "Name": "aws-cloudtrail-logs-462092780466-f2c900d3",
    "CreationDate": "2021-03-25 22:19:54+00:00"
  },
  {
    "Name": "simonw-test-bucket-for-s3-credentials",
    "CreationDate": "2021-11-03 21:46:12+00:00"
  }
]
```
Then drop the `--array` option but keep `--nl`, which would output this (already implemented):

```
% s3-credentials list-buckets
{"Name": "aws-cloudtrail-logs-462092780466-f2c900d3", "CreationDate": "2021-03-25 22:19:54+00:00"}
{"Name": "simonw-test-bucket-for-s3-credentials", "CreationDate": "2021-11-03 21:46:12+00:00"}
```
And add a new `--csv` option.
Getting this right will mean I can pipe into `sqlite-utils insert` easily to create a SQLite database, which would be fun.
Actually this works already:

```
s3-credentials list-buckets --nl | sqlite-utils insert /tmp/s3.db buckets - --nl
```
Here's how the current `list-buckets` implementation works: `s3-credentials/s3_credentials/cli.py` lines 556 to 564 in `aa69024`.
I need a new abstraction I can call that knows how to turn an iterator of rows into one of the desired formats.
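A sketch of what that abstraction could look like. The function name, signature, and flags here are hypothetical, not the eventual s3-credentials API: a single generator that takes an iterator of row dicts plus format flags and yields output lines.

```python
import csv
import io
import json


def format_rows(iterator, headers, nl=False, csv_format=False):
    # Hypothetical dispatcher (not the actual s3-credentials code):
    # turn an iterator of dicts into output lines in one of the formats.
    if csv_format:
        buf = io.StringIO()
        writer = csv.DictWriter(buf, headers, lineterminator="\n")
        writer.writeheader()
        for row in iterator:
            writer.writerow(row)
        yield buf.getvalue().rstrip("\n")
    elif nl:
        # Newline-delimited JSON: one compact object per line
        for row in iterator:
            yield json.dumps(row)
    else:
        # Pretty-printed JSON array - buffered here for simplicity;
        # a streaming version is worked out later in this thread
        yield json.dumps(list(iterator), indent=2)


buckets = [{"Name": "bucket-one"}, {"Name": "bucket-two"}]
print("\n".join(format_rows(iter(buckets), ["Name"], nl=True)))
# {"Name": "bucket-one"}
# {"Name": "bucket-two"}
```

Each command would then only need to provide the iterator and the headers, and the format flags could be handled in one place.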
Most interesting new implementation here will be the code that knows how to output something like this in a streaming fashion, without buffering it all in an array first:
```
[
  {
    "Name": "aws-cloudtrail-logs-462092780466-f2c900d3",
    "CreationDate": "2021-03-25 22:19:54+00:00"
  },
  {
    "Name": "aws-sam-cli-managed-default-samclisourcebucket-1ksajo4h62s07",
    "CreationDate": "2020-06-16 23:13:34+00:00"
  },
  {
    "Name": "blah-bucket-blah",
    "CreationDate": "2021-11-10 23:50:08+00:00"
  }
]
```
Trick will be to output `[` at the start, then two-space indented (with `textwrap`) `json.dumps(..., indent=2)` rows with commas after each except the last one - and then a `]` at the end.
Good code to imitate from `sqlite-utils`: https://github.com/simonw/sqlite-utils/blob/74586d3cb26fa3cc3412721985ecdc1864c2a31d/sqlite_utils/cli.py#L1589-L1623 - in particular this CSV/TSV bit:

```python
writer = csv_std.writer(sys.stdout, dialect="excel-tab" if tsv else "excel")
writer.writerow(headers)
for row in cursor:
    writer.writerow(row)
```
```python
def output_rows(iterator, headers, nl, arrays, json_cols):
    # We have to iterate two-at-a-time so we can know if we
    # should output a trailing comma or if we have reached
    # the last row.
    current_iter, next_iter = itertools.tee(iterator, 2)
    next(next_iter, None)
    first = True
    for row, next_row in itertools.zip_longest(current_iter, next_iter):
        is_last = next_row is None
        data = row
        if json_cols:
            # Any value that is a valid JSON string should be treated as JSON
            data = [maybe_json(value) for value in data]
        if not arrays:
            data = dict(zip(headers, data))
        line = "{firstchar}{serialized}{maybecomma}{lastchar}".format(
            firstchar=("[" if first else " ") if not nl else "",
            serialized=json.dumps(data, default=json_binary),
            maybecomma="," if (not nl and not is_last) else "",
            lastchar="]" if (is_last and not nl) else "",
        )
        yield line
        first = False
    if first:
        # We didn't output any rows, so yield the empty list
        yield "[]"
```
Actually I probably want to use `csv.DictWriter` here:

```python
writer = csv.DictWriter(sys.stdout, headers)
writer.writeheader()
writer.writerows(iterator_of_dictionaries)
```
This outputs 2-indented JSON in a streaming fashion:
```python
import itertools
import json
import textwrap


def output_rows_json(iterator):
    # We have to iterate two-at-a-time so we can know if we
    # should output a trailing comma or if we have reached
    # the last row.
    current_iter, next_iter = itertools.tee(iterator, 2)
    next(next_iter, None)
    first = True
    for row, next_row in itertools.zip_longest(current_iter, next_iter):
        is_last = next_row is None
        data = row
        line = "{firstchar}{serialized}{maybecomma}{lastchar}".format(
            firstchar="[\n" if first else "",
            serialized=textwrap.indent(json.dumps(data, indent=2, default=repr), "  "),
            maybecomma="," if not is_last else "",
            lastchar="\n]" if is_last else "",
        )
        yield line
        first = False
    if first:
        # We didn't output any rows, so yield the empty list
        yield "[]"
```
Demo:

```python
print("\n".join(output_rows_json([{"id": 1, "name": "Simon"}, {"id": 2, "name": "Cleo"}, {"id": 3, "name": "Azi"}])))
```

```
[
  {
    "id": 1,
    "name": "Simon"
  },
  {
    "id": 2,
    "name": "Cleo"
  },
  {
    "id": 3,
    "name": "Azi"
  }
]
```
Turned that into a TIL: https://til.simonwillison.net/python/output-json-array-streaming
Ran into a problem applying this to `list-users`:

```
% s3-credentials list-users --csv
Path,UserName,UserId,Arn,CreateDate
... many rows follow ...
```
```
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/s3-credentials-J8M1ChYK/bin/s3-credentials", line 33, in <module>
    sys.exit(load_entry_point('s3-credentials', 'console_scripts', 's3-credentials')())
  File "/Users/simon/.local/share/virtualenvs/s3-credentials-J8M1ChYK/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/s3-credentials-J8M1ChYK/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/simon/.local/share/virtualenvs/s3-credentials-J8M1ChYK/lib/python3.10/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/simon/.local/share/virtualenvs/s3-credentials-J8M1ChYK/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/simon/.local/share/virtualenvs/s3-credentials-J8M1ChYK/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/simon/Dropbox/Development/s3-credentials/s3_credentials/cli.py", line 495, in list_users
    output(iterate(), nl, csv, tsv)
  File "/Users/simon/Dropbox/Development/s3-credentials/s3_credentials/cli.py", line 789, in output
    writer.writerows(itertools.chain([first], iterator))
  File "/Users/simon/.pyenv/versions/3.10.0/lib/python3.10/csv.py", line 157, in writerows
    return self.writer.writerows(map(self._dict_to_list, rowdicts))
  File "/Users/simon/.pyenv/versions/3.10.0/lib/python3.10/csv.py", line 149, in _dict_to_list
    raise ValueError("dict contains fields not in fieldnames: "
ValueError: dict contains fields not in fieldnames: 'PasswordLastUsed'
```
CSV output failed because one of the later rows had a new unexpected column.
Options for fixing this:

- Silently ignore columns that were not in the first record. Easiest fix.
- Watch out for these errors and show a warning at the end, after ignoring the offending columns during output. Bit ugly.
- For CSV mode, load everything into memory first to determine the full set of headers. This breaks the goal of having this work efficiently with streamed data.
- Figure out the full set of possible columns and hard-code that into the application. Probably the best solution?
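For the first option, Python's `csv.DictWriter` supports this directly: passing `extrasaction="ignore"` drops unexpected keys instead of raising. A minimal sketch with made-up data:

```python
import csv
import io

rows = [
    {"UserName": "alice", "UserId": "AIDA123"},
    # A later row with a column the first row didn't have
    {"UserName": "bob", "UserId": "AIDA456", "PasswordLastUsed": "2021-11-03"},
]

buf = io.StringIO()
# extrasaction="ignore" silently drops keys not in the fieldnames list,
# instead of raising "dict contains fields not in fieldnames"
writer = csv.DictWriter(
    buf, ["UserName", "UserId"], extrasaction="ignore", lineterminator="\n"
)
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
# UserName,UserId
# alice,AIDA123
# bob,AIDA456
```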
I considered an option where it spots the error, runs to the end to capture all possible headers, then runs the entire command again - but that wouldn't work, because we would already have written the headers and previous rows to stdout.
I'm going to hard-code in the list of known columns. This also gives me control over the order in which they are output.

For `list-users` that's https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.list_users

- `UserName`
- `UserId`
- `Arn`
- `Path`
- `CreateDate`
- `PasswordLastUsed`
- `PermissionsBoundary`
- `Tags`
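With a hard-coded fieldnames list, `csv.DictWriter` covers both the column order and rows that lack some of those columns (`restval` fills the blanks). A sketch with made-up data, not the actual s3-credentials code:

```python
import csv
import io

# Hard-coded column order for list-users, per the boto3 docs
FIELDS = [
    "UserName", "UserId", "Arn", "Path", "CreateDate",
    "PasswordLastUsed", "PermissionsBoundary", "Tags",
]

rows = [
    {
        "UserName": "alice",
        "UserId": "AIDA123",
        "Arn": "arn:aws:iam::123:user/alice",
        "Path": "/",
        "CreateDate": "2021-11-03",
        # No PasswordLastUsed, PermissionsBoundary or Tags
    },
]

buf = io.StringIO()
# restval="" writes an empty cell for any column a row doesn't have
writer = csv.DictWriter(buf, FIELDS, restval="", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```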
Fun trick with this:

```
% s3-credentials list-users --tsv | sqlite-utils memory stdin:tsv 'select * from stdin' -t
UserName                    UserId                 Arn                                                        Path  CreateDate                 PasswordLastUsed  PermissionsBoundary  Tags
--------------------------  ---------------------  ---------------------------------------------------------  ----  -------------------------  ----------------  -------------------  ----
custom-policy               AIDAWXFXAIOZNQQMEOHUA  arn:aws:iam::462092780466:user/custom-policy               /     2021-11-03 18:31:22+00:00
dogsheep-photos-simon-read  AIDAWXFXAIOZKDDGOUY5H  arn:aws:iam::462092780466:user/dogsheep-photos-simon-read  /     2020-04-18 19:56:54+00:00
```
OK, this is done for `list-users`, `list-buckets` and `list-bucket`.

`list-user-policies` doesn't output JSON at all - it has a weird custom output format - so I'm leaving it for the moment.