miqdigital / aws-utils
This repository provides utilities which are used at MiQ.
License: GNU General Public License v3.0
If you use this instance backup script on an instance that belongs to an Auto Scaling group, the instance carries a default tag that cannot be created manually (for example, the Auto Scaling group tag aws:autoscaling:groupName). The script tries to copy that tag onto the backup and fails.
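Tag keys that start with `aws:` are reserved by AWS and cannot be created by users, which would explain the failure described above. One possible workaround is to drop those tags before copying. This is a minimal sketch, not the repository's code: it assumes the tags arrive in the boto3 EC2 shape (a list of `Key`/`Value` dicts), and `copyable_tags` and the sample values are hypothetical.

```python
# Assumption: tags use the boto3 EC2 shape, a list of {"Key": ..., "Value": ...} dicts.
RESERVED_PREFIX = "aws:"  # AWS reserves this prefix; users cannot create such tags


def copyable_tags(tags):
    """Return only the tags that a user is allowed to create on another resource."""
    return [tag for tag in tags if not tag["Key"].startswith(RESERVED_PREFIX)]


# Hypothetical tags as they might appear on an instance in an Auto Scaling group.
instance_tags = [
    {"Key": "Name", "Value": "web-1"},
    {"Key": "aws:autoscaling:groupName", "Value": "web-asg"},
]

print(copyable_tags(instance_tags))
```

The filtered list could then be passed to whatever tagging call the backup script already makes.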
I have a requirement to scan a particular JSON file in an S3 bucket. The bucket has thousands of folders, and each of those folders has subfolders that contain a file named receipt.json. The structure is as follows:
bucket
----- folder
      ----- subfolder
            ----- receipt.json
The subfolders have multiple files which I do not want to query. With a slight modification to your s3_executor script, I was able to get the script to check only receipt.json:
    for obj in contents:
        key = obj['Key']
        if key.startswith(prefix) and key.endswith("receipt.json"):
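The same filter can be isolated into a small helper so it is easy to check without touching S3. This is only a sketch: `matching_keys` and the sample keys are hypothetical, and `contents` is assumed to be a list of dicts with a `Key` field, as in a `list_objects_v2` response.

```python
def matching_keys(contents, prefix, suffix="receipt.json"):
    """Yield the keys under `prefix` whose names end with `suffix`."""
    for obj in contents:
        key = obj["Key"]
        if key.startswith(prefix) and key.endswith(suffix):
            yield key


# Hypothetical listing entries, shaped like the "Contents" of a list_objects_v2 response.
sample = [
    {"Key": "folder/subfolder/receipt.json"},
    {"Key": "folder/subfolder/data.bin"},
    {"Key": "other/subfolder/receipt.json"},
]

print(list(matching_keys(sample, "folder/")))
```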
When I run the script with a simple SQL query like SELECT * FROM s3object s, the script scans the receipt.json files and prints the output to the console. However, if I modify the SQL query to filter the fields, I get the following error message:
SQL - SELECT * FROM s3object[*].manifest s
Error
Performing, with: anr-cribl-data smartstore/cribl_json/db/03/bd/3~19F003DD-45C9-45AC-A3EC-C640F4A52D5A/receipt.json SELECT * FROM s3object[*].manifest s None CSV {'AllowQuotedRecordDelimiter': True, 'QuoteCharacter': '', 'FieldDelimiter': ','}
Traceback (most recent call last):
  File "s3_executor.py", line 165, in
    perform(bucket, item, sql, comp, content_type, content_options, output_file)
  File "/mnt/c/Users/anoop/Documents/query_s3/select_runner.py", line 73, in perform
    r = s3.select_object_content(
  File "/home/anoop/.local/lib/python3.8/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/anoop/.local/lib/python3.8/site-packages/botocore/client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidDataSource) when calling the SelectObjectContent operation: Data source type path is not supported. Please check the service documentation and try again.
I have attached a sample of the JSON file I am querying.
Any help on this is really appreciated.
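The log line above shows the request being sent with the content type CSV. Path expressions such as s3object[*].manifest apply to JSON input, so a likely cause of the InvalidDataSource error is that the object is being read with CSV input serialization. The sketch below builds the arguments for `select_object_content` with JSON input instead. It is an assumption about the fix, not a confirmed one: `build_select_kwargs` and the bucket and key names are hypothetical, and `"Type": "DOCUMENT"` assumes each file holds a single JSON document.

```python
def build_select_kwargs(bucket, key, sql):
    """Build keyword arguments for an S3 Select call that reads a JSON object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        # JSON input is what makes path expressions like s3object[*].manifest usable.
        "InputSerialization": {"JSON": {"Type": "DOCUMENT"}, "CompressionType": "NONE"},
        "OutputSerialization": {"JSON": {}},
    }


# Hypothetical bucket and key; the SQL is the query from the report above.
kwargs = build_select_kwargs(
    "my-bucket",
    "folder/subfolder/receipt.json",
    "SELECT * FROM s3object[*].manifest s",
)

# With boto3 installed and credentials configured, the call would look like:
#   import boto3
#   response = boto3.client("s3").select_object_content(**kwargs)
```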
Hi, this is an interesting and useful tool to use with S3 Select (the main issue with the native awscli is the lack of directory support!). However, the str() function is problematic with non-ASCII characters on a system without the correct locale (locally I've fixed that by setting UTF-8 for the sys). The general suggestion is to use .encode().
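One way to avoid depending on the system locale is to handle the encoding explicitly. The sketch below assumes Python 3 and that record payloads arrive as byte chunks, as they do in S3 Select event streams. `decode_payload` is a hypothetical helper, not a function from the repository.

```python
def decode_payload(chunks):
    """Join raw byte chunks first, then decode once as UTF-8.

    Joining before decoding matters because a multi-byte character
    can be split across two chunks.
    """
    return b"".join(chunks).decode("utf-8")


# "caf\u00e9" encodes to b"caf\xc3\xa9"; the two-byte character is split on purpose.
chunks = [b"caf\xc3", b"\xa9,1\n"]
text = decode_payload(chunks)

# Encode explicitly when writing, instead of relying on the locale default.
encoded = text.encode("utf-8")
```

Calling str() on a bytes chunk in Python 3 would instead produce its repr (for example `b'caf\xc3'`), which is rarely what is wanted in the output.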
Also, regarding the wiki entry that led me here: https://medium.com/miq-tech/https-medium-com-nagaraj-mediaiq-how-we-use-s3-select-for-schema-validation-and-filtering-data-at-miq-52cf036bf9be . The file is select_runner.py, not s3_runner.py (a minor issue, but it might throw people off).
cheers!
The scripts can be used to automate Route 53 (R53) creation via a Python script and to set up disaster recovery for RDS in AWS. The RDS script creates a manual snapshot of the current DB in the production (source) region, then copies that snapshot to the DR (destination) region. If a disaster occurs, the DB can be restored from the snapshots in the DR region.
The pull request for this is #6.
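The snapshot-and-copy flow described above could look roughly like the following. This is a sketch under stated assumptions, not the code from the pull request: `snapshot_and_copy` and its parameters are hypothetical, the two clients are assumed to be boto3 RDS clients created for the source and DR regions, and encrypted snapshots (which need a `KmsKeyId` on the copy) are not handled.

```python
def snapshot_and_copy(src_rds, dst_rds, db_instance_id, snapshot_id, src_region):
    """Create a manual snapshot in the source region and copy it to the DR region.

    src_rds / dst_rds: RDS clients for the source and destination regions.
    Returns the identifier of the copied snapshot.
    """
    created = src_rds.create_db_snapshot(
        DBInstanceIdentifier=db_instance_id,
        DBSnapshotIdentifier=snapshot_id,
    )
    # Wait until the snapshot is available before starting the cross-region copy.
    src_rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=snapshot_id)

    # A cross-region copy refers to the source snapshot by its ARN.
    copied = dst_rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier=created["DBSnapshot"]["DBSnapshotArn"],
        TargetDBSnapshotIdentifier=snapshot_id,
        SourceRegion=src_region,
    )
    return copied["DBSnapshot"]["DBSnapshotIdentifier"]


# Example wiring (requires boto3 and AWS credentials; region names are placeholders):
#   import boto3
#   src = boto3.client("rds", region_name="us-east-1")
#   dst = boto3.client("rds", region_name="us-west-2")
#   snapshot_and_copy(src, dst, "prod-db", "prod-db-dr-copy", "us-east-1")
```

Passing the clients in as arguments keeps the function easy to exercise with stand-in objects before running it against a real account.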
Since Python 2 has been sunset and is no longer maintained, we need to migrate all the code written in Python 2 to Python 3.
A PR has been raised for this: #9
@tan31989