
scar's People

Contributors

adolfo24, amcaar, asalic, dianamariand92, gmolto, imerica, jbern16, micafer, sergiolangaritabenitez, shatgupt, srisco, winstonn


scar's Issues

Pass Down the IAM Credentials to the Containers

The Lambda function has credentials granted from the IAM role to interact with other AWS services.
These credentials should be made available to the underlying executed container (for example as the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) so that the code running in the container (say AWS CLI) can use these credentials to interact with other AWS services.
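
A minimal sketch of how the supervisor could forward these credentials (the udocker invocation and the helper name are illustrative, not the current implementation):

import os
import subprocess

# AWS Lambda exposes the execution-role credentials through these variables.
CRED_VARS = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN"]

def run_container_with_credentials(container, command):
    # Build -e VAR=value flags so the code inside the container sees the credentials.
    env_flags = []
    for var in CRED_VARS:
        value = os.environ.get(var)
        if value:
            env_flags += ["-e", "{0}={1}".format(var, value)]
    return subprocess.check_output(["udocker", "run"] + env_flags + [container] + command)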

Docker images with ENTRYPOINT cannot be used to run user-defined shell-scripts

Docker images created out of Dockerfiles with an ENTRYPOINT cannot be employed to run a shell-script with SCAR.

udocker's --nometa parameter could be used to achieve this:

udocker run --nometa -e AWS_ACCESS_KEY_ID=<AK> -e AWS_SECRET_ACCESS_KEY=<SK> $IMG_CONT_NAME /usr/local/bin/python /usr/local/bin/aws ec2 describe-instances

However, that option prevents volumes from being mounted in the container.

Options:

  1. Modify uDocker to support overriding the entrypoint and push the changes upstream.
  2. Create specific images without an ENTRYPOINT.

Including a Configuration File

Some values are currently hardcoded, such as the IAM Role passed to the Lambda function.

A configuration file (say $HOME/.scar.cfg, to mimic the IM client, or $HOME/.scar/config.yml, to mimic EC3) is required to easily define these values.

Since the AWS CLI uses a typical INI-formatted file, it may be appropriate to adopt that style.
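
A minimal sketch of reading such a file, assuming an INI-style $HOME/.scar.cfg with a [scar] section (file location and option names are assumptions):

import os
try:
    import configparser                      # Python 3
except ImportError:
    import ConfigParser as configparser      # Python 2

config = configparser.ConfigParser()
config.read(os.path.expanduser("~/.scar.cfg"))

# Hypothetical options; the actual keys would be defined by SCAR.
iam_role = config.get("scar", "iam_role")
region = config.get("scar", "region")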

Fail Gracefully when Running Non-Existent Lambda Function

When attempting to run a non-existent Lambda function, the following traceback is thrown:

Traceback (most recent call last):
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 321, in <module>
    Scar().execute()
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 210, in execute
    args.func(args)
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 288, in run
    self.lambda_env_variables = self.boto3_client.get_function(FunctionName=args.name)['Configuration']['Environment']
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 557, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ResourceNotFoundException: An error occurred (ResourceNotFoundException) when calling the GetFunction operation: Function not found: arn:aws:lambda:us-east-1:XXXXXX:function:non-existent-function

A friendly error message should be shown instead.
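
One possible way to handle this, catching the boto3 ClientError and printing a short message (the helper name is illustrative):

import sys
from botocore.exceptions import ClientError

def get_function_environment(lambda_client, function_name):
    try:
        return lambda_client.get_function(FunctionName=function_name)['Configuration']['Environment']
    except ClientError as error:
        if error.response['Error']['Code'] == 'ResourceNotFoundException':
            print("Error: the function '%s' does not exist." % function_name)
            sys.exit(1)
        raise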

Support Lambda Runtime Python 3.6

uDocker initially supported Python 2.7 (branch udocker-fr), but the fork in grycap/udocker already supports Python 3.6.

The SCAR branch [gmolto-test-python-3] includes the support introduced by @micafer for the Python 3.6 Lambda runtime.

Further testing is required.

Unknown parameter in input: "Tags", must be one of: FunctionName, Runtime

Issuing the command:
./scar.py init -n test-alpine-00 -m 128 -t 100 alpine:latest
results in the following error:

Traceback (most recent call last):
  File "./scar.py", line 261, in <module>
    Scar().execute()
  File "./scar.py", line 166, in execute
    args.func(args)
  File "./scar.py", line 195, in init
    Tags=self.lambda_tags)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 518, in _make_api_call
    api_params, operation_model, context=request_context)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 573, in _convert_to_request_dict
    api_params, operation_model)
  File "/Library/Python/2.7/site-packages/botocore/validate.py", line 291, in serialize_to_request
    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in input: "Tags", must be one of: FunctionName, Runtime, Role, Handler, Code, Description, Timeout, MemorySize, Publish, VpcConfig, DeadLetterConfig, Environment, KMSKeyArn
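
A possible workaround, assuming a boto3/botocore version that supports the Lambda TagResource API, is to create the function without Tags and tag it in a separate call (role ARN and deployment package below are placeholders):

import boto3

lambda_client = boto3.client('lambda')

iam_role_arn = 'arn:aws:iam::XXXXXXXXXXXX:role/scar-lambda-role'   # placeholder
with open('function.zip', 'rb') as package_file:                   # placeholder package
    deployment_package = package_file.read()

response = lambda_client.create_function(
    FunctionName='test-alpine-00',
    Runtime='python2.7',
    Role=iam_role_arn,
    Handler='scarsupervisor.lambda_handler',
    Code={'ZipFile': deployment_package},
    Timeout=100,
    MemorySize=128,
)

# Tag the function afterwards instead of passing Tags to create_function.
lambda_client.tag_resource(
    Resource=response['FunctionArn'],
    Tags={'createdby': 'scar'},
)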

Reduce Default Log Expiration in CloudWatch

The Log Groups currently created automatically by AWS Lambda set the expiration to "Never". It would be appropriate to set this to a more reasonable value (say, 30 days) or make it a configurable parameter in the CLI.
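
A minimal sketch using the CloudWatch Logs API (the function name is a placeholder; 30 days is just an example value):

import boto3

logs_client = boto3.client('logs')
logs_client.put_retention_policy(
    logGroupName='/aws/lambda/' + 'lambda-function-name',   # placeholder
    retentionInDays=30,
)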

SCAR reports the error incorrectly when the function times out

If the function terminates due to a timeout, SCAR shows the following error:

Traceback (most recent call last):
  File "scar.py", line 518, in <module>
    CmdParser().execute()        
  File "scar.py", line 515, in execute
    args.func(args)        
  File "scar.py", line 207, in run
    response['LogStreamName'] = parsed_output[2][23:]

Passing environment variables to run function

It should be possible to specify environment variables as input to the run function. These variables should be passed to the Lambda function and, in turn, made available to the Docker container.
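
One possible approach (a sketch, not the current implementation): the CLI serializes the user-supplied variables into the invocation payload, and the supervisor turns them into -e flags for udocker:

import json

def build_payload(script_text, env_vars):
    # env_vars is a dict such as {'INPUT_BUCKET': 'scar-test'}
    return json.dumps({'script': script_text, 'env': env_vars})

def env_flags(event):
    # On the Lambda side: convert the 'env' entry into udocker -e arguments.
    flags = []
    for name, value in event.get('env', {}).items():
        flags += ['-e', '{0}={1}'.format(name, value)]
    return flags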

Input Files Not Retrieved from S3

Scenario:
scar init -s test/test-file-processor.sh -n lambda-event-test-00 -es scar-test ubuntu:16.04
where test-file-processor.sh is:

INPUT_DIR="/tmp/$REQUEST_ID"
OUTPUT_DIR="/tmp/$REQUEST_ID/output"
echo "BEGIN: Invoked File Processor. Files available in $INPUT_DIR"
echo "`ls -l $INPUT_DIR`"
echo "Creating output folder in $OUTPUT_DIR"
cp $INPUT_DIR/* $OUTPUT_DIR
echo "END: Invoked File Processor. Files generated in $OUTPUT_DIR"

and scar-test is an existing bucket.

When uploading a file:

aws s3 cp requirements.txt s3://scar-test/input/req5.txt

The Lambda function is triggered but the only file found in /tmp/$REQUEST_ID is event.json.
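
For reference, a minimal sketch of what staging the input file could look like in the supervisor (the directory layout follows the script above; the helper name is illustrative):

import os
import boto3

def download_input_file(event, request_id):
    # Bucket and key extraction follows the standard S3 event format.
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    key = record['object']['key']
    input_dir = '/tmp/{0}'.format(request_id)
    if not os.path.exists(input_dir):
        os.makedirs(input_dir)
    local_path = os.path.join(input_dir, os.path.basename(key))
    boto3.client('s3').download_file(bucket, key, local_path)
    return local_path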

Removing a Lambda with Deleted Logs Throws an Error

Deleting a Lambda function whose Logs in CloudWatch were deleted throws the error:

Traceback (most recent call last):
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 496, in <module>
    CmdParser().execute()
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 493, in execute
    args.func(args)
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 225, in rm
    cw_response = AwsClient().get_log().delete_log_group(logGroupName='/aws/lambda/' + args.name)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 557, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ResourceNotFoundException: An error occurred (ResourceNotFoundException) when calling the DeleteLogGroup operation: The specified log group does not exist.

Illegal unquoted character ((CTRL-CHAR, code 9)): has to be escaped using backslash to be included in string value

When trying to use scar run passing the following script (via -p): https://gist.github.com/gmolto/863c9f53e22715972959a3a7b5a82ee4

The following error raises:

Traceback (most recent call last):
  File "./scar.py", line 278, in <module>
    Scar().execute()
  File "./scar.py", line 167, in execute
    args.func(args)
  File "./scar.py", line 258, in run
    Payload=script)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 557, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidRequestContentException: An error occurred (InvalidRequestContentException) when calling the Invoke operation: Could not parse request body into json: Illegal unquoted character ((CTRL-CHAR, code 9)): has to be escaped using backslash to be included in string value
 at [Source: [B@1fa37422; line: 1, column: 493]
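
Building the payload with json.dumps would escape tabs, newlines and quotes automatically instead of doing it by hand; a minimal sketch (the script path is a placeholder):

import json

with open('script.sh') as script_file:       # path is a placeholder
    payload = json.dumps({'script': script_file.read()})

# payload can then be passed to lambda_client.invoke(..., Payload=payload)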

Allow Specifying a Shell-Script During Init

The scar init function should also support the -p option, as in scar run. This would allow specifying a shell-script that would be executed anytime that the Lambda function is invoked.

The shell-script could be uploaded in the deployment package and an environment variable should be defined so that SCAR Supervisor knows that this shell-script has to be executed upon each invocation.

This will be useful when using, for example, S3 as a source of events, so that a shell-script can process the triggered events.

Concurrent Executions from the CLI Cause a Hang in One of the Requests

Steps to reproduce:

  1. Open two terminals
  2. Run scar run lambda-docker-cowsay from both terminals at roughly the same time

Results
One of the executions hangs. When stopped with CTRL-C, the error is:

^CTraceback (most recent call last):
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 404, in <module>
    Scar().execute()
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 210, in execute
    args.func(args)
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 335, in run
    LogType=log_type)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 544, in _make_api_call
    operation_model, request_dict)
  File "/Library/Python/2.7/site-packages/botocore/endpoint.py", line 141, in make_request
    return self._send_request(request_dict, operation_model)
  File "/Library/Python/2.7/site-packages/botocore/endpoint.py", line 168, in _send_request
    request, operation_model, attempts)
  File "/Library/Python/2.7/site-packages/botocore/endpoint.py", line 204, in _get_response
    proxies=self.proxies, timeout=self.timeout)
  File "/Library/Python/2.7/site-packages/botocore/vendored/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Python/2.7/site-packages/botocore/vendored/requests/adapters.py", line 370, in send
    timeout=timeout
  File "/Library/Python/2.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/Library/Python/2.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 372, in _make_request
    httplib_response = conn.getresponse(buffering=True)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1132, in getresponse
    response.begin()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 453, in begin
    version, status, reason = self._read_status()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 409, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 480, in readline
    data = self._sock.recv(self._rbufsize)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 734, in recv
    return self.read(buflen)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 621, in read
    v = self._sslobj.read(len or 1024)

N.B. Lambda function created with:

scar init -n lambda-docker-cowsay -m 128 -t 300 chuanwen/cowsay

Boto3 should support multi-threading when configured as indicated in: https://boto3.readthedocs.io/en/latest/guide/resources.html#multithreading
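
A minimal sketch following that guidance, creating one Session and client per thread instead of sharing a module-level client (the function name is taken from the example above):

import threading
import boto3

def invoke_function(function_name):
    session = boto3.session.Session()        # one session per thread
    lambda_client = session.client('lambda')
    return lambda_client.invoke(FunctionName=function_name,
                                InvocationType='RequestResponse')

threads = [threading.Thread(target=invoke_function, args=('lambda-docker-cowsay',))
           for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()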

Container name should be unique

Currently, the container name in the scarsupervisor.py function is hardcoded. This may cause trouble with the sometimes-shared /tmp directory across Lambda function invocations.

The name should be a UUID. Probably something like:

import uuid

# e.g. 'bd65600d-8669-4903-8a14-af88203add38'
container_name = str(uuid.uuid4())

Create init function

Add functionality to initialize the Lambda functions and load the containers.

Error When Omitting the Function Name

When omitting the function name in scar init the following error arises:

Error: Function name 'scar_function' is not valid

Either a default name should be generated and shown to the user, or a message should clearly state that a function name must be chosen by the user.

Creating a lambda function when the corresponding log group exists throws an error

Traceback (most recent call last):
  File "scar.py", line 504, in <module>
    CmdParser().execute()
  File "scar.py", line 501, in execute
    args.func(args)
  File "scar.py", line 79, in init
    'createdby' : 'scar' }
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 557, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ResourceAlreadyExistsException: An error occurred (ResourceAlreadyExistsException) when calling the CreateLogGroup operation: The specified log group already exists

Error when Omitting Lambda Function Name

To reproduce:

scar init ubuntu:14.04

The following error arises.

Traceback (most recent call last):
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 496, in <module>
    CmdParser().execute()
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 493, in execute
    args.func(args)
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 77, in init
    logGroupName='/aws/lambda/' + args.name,
TypeError: cannot concatenate 'str' and 'NoneType' objects

Modify Default Values of Memory and TimeOut

The current default values are:

  • Memory: 128 MB
  • Timeout: 3 seconds

The default memory is appropriate to cut down costs, but the default timeout should be increased to the maximum value (300 seconds / 5 minutes) to avoid hitting the timeout on the first invocation.

Pass the Event to the Container

The Lambda function receives an Event and Context object. At least the Event should be passed down to the running container to allow the following scenario:

  • A bunch of files are uploaded to an S3 bucket. Notifications are triggered and Lambda functions are invoked. The application in the running container receives the Event and processes the file.

Ideas for implementation. The SCAR supervisor Lambda function ...

  • stores the Event in /tmp/$REQUEST_ID/event.txt (to avoid collisions due to /tmp sharing)
  • passes an environment variable SCAR_EVENT_FILE to the running container pointing to that file (see the sketch below)
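
A minimal sketch of this idea (the paths and the helper name are assumptions):

import json
import os

def stage_event(event, request_id):
    event_dir = '/tmp/{0}'.format(request_id)
    if not os.path.exists(event_dir):
        os.makedirs(event_dir)
    event_file = os.path.join(event_dir, 'event.txt')
    with open(event_file, 'w') as handle:
        json.dump(event, handle)
    # Then run: udocker run -e SCAR_EVENT_FILE=<event_file> ...
    return event_file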

Manage the Input and Output Files in S3 Event from the Lambda Function

In order to support use cases where command-line applications retrieve and upload data from S3 into the execution environment, the AWS CLI should be made available inside the deployment package and, in turn, exposed to the container, so that aws commands can be used properly in conjunction with the credentials defined in the environment variables.

This would require having Python installed, unless the aws tool from Tim Kay is integrated (which, by the way, requires Perl).

Client-Side Validation of Lambda Function Name

Lambda function names have to comply with a regular expression ((arn:(aws|aws-us-gov):lambda:)?([a-z]{2}(-gov)?-[a-z]+-\d{1}:)?(\d{12}:)?(function:)?([a-zA-Z0-9-]+)(:($LATEST|[a-zA-Z0-9-]+))?)

A client-side validation should be performed to avoid cases in which, for example, a dot is specified as part of the name and the error is only raised after waiting for the deployment package to be uploaded.
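
A minimal sketch of such a check, run before building and uploading the deployment package (the helper name is illustrative; the pattern mirrors the regular expression above):

import re
import sys

FUNCTION_NAME_PATTERN = (r"(arn:(aws|aws-us-gov):lambda:)?"
                         r"([a-z]{2}(-gov)?-[a-z]+-\d{1}:)?"
                         r"(\d{12}:)?(function:)?"
                         r"([a-zA-Z0-9-]+)(:(\$LATEST|[a-zA-Z0-9-]+))?$")

def validate_function_name(name):
    if not re.match(FUNCTION_NAME_PATTERN, name):
        print("Error: function name '%s' is not valid" % name)
        sys.exit(1)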

Define S3 as a Source of Events to a Lambda Function

It would be convenient to define an S3 bucket as a source of events to invoke the Lambda function.

From the CLI one can achieve this via:

aws s3api put-bucket-notification-configuration --bucket scar-data --notification-configuration file://notif-config.json

Where the file is:

{
    "LambdaFunctionConfigurations": [
        {
            "Id": "scar-bucket-configuration",
            "LambdaFunctionArn": "arn:aws:lambda:us-east-1:XXXXXXX:function:<lambda-function>",
            "Events": [
                "s3:ObjectCreated:*"
            ],
            "Filter": {
                "Key": {
                    "FilterRules": [
                        {
                            "Name": "prefix",
                            "Value": "input/"
                        }
                    ]
                }
            }
        }
    ]
}

Similar functionality can be achieved from Boto3 using: https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.put_bucket_notification_configuration
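
The equivalent call from boto3 could look like this (the ARN is a placeholder; note that the Lambda function must also grant s3.amazonaws.com permission to invoke it):

import boto3

s3_client = boto3.client('s3')
s3_client.put_bucket_notification_configuration(
    Bucket='scar-data',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [{
            'Id': 'scar-bucket-configuration',
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:XXXXXXX:function:<lambda-function>',
            'Events': ['s3:ObjectCreated:*'],
            'Filter': {'Key': {'FilterRules': [{'Name': 'prefix', 'Value': 'input/'}]}},
        }]
    },
)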

This functionality provides a simple mechanism: upload a file to a bucket (into the input folder) and trigger the execution of the Lambda function, which will in turn execute the container.

Error when using alpine Docker image

The execution of the alpine:latest Docker image in AWS Lambda results in the following error:

Error: command not found or has no execute bit set:  ['/bin/sh', '/tmp/udocker/script.sh']
Info: creating repo: /tmp/home/.udocker
Warning: running as uid 0 is not supported by this engine
Warning: non-existing user will be created

The same Lambda function works well with ubuntu:16.04 and centos:7

Oddly, udocker works fine when running a container from alpine:latest on a local machine.

Scar fails when an error occurs in Lambda execution

When executing a Lambda function (i.e. scar run) whose execution fails, SCAR terminates abruptly instead of showing an informative message to the user.

An example: I have executed a lambda function that failed due to no space left on device (Lambda space restrictions). I received this in the SCAR CLI:

$ ./scar.py run scar-mesos-test -p sleep.sh

Traceback (most recent call last):
  File "./scar.py", line 393, in <module>
    Scar().execute()
  File "./scar.py", line 208, in execute
    args.func(args)
  File "./scar.py", line 334, in run
    response['LogStreamName'] = parsed_output[1][23:]
IndexError: list index out of range

A friendly and informative message for the user would be appreciated.

Avoid Double Pulling of the Docker Image

Apparently, the udocker pull command downloads the image even if the Docker image has already been pulled. Since /tmp may eventually be shared across invocations of the Lambda function, it would be worthwhile to check whether the image is already registered before pulling it again.
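
A minimal sketch of such a check, assuming the udocker CLI is on the PATH and that udocker images lists the locally registered images:

import subprocess

def ensure_image(image_name):
    local_images = subprocess.check_output(['udocker', 'images']).decode()
    if image_name not in local_images:
        subprocess.check_call(['udocker', 'pull', image_name])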

Unable to Install a Package inside the Docker Container

(This is not an issue with SCAR itself but most probably due to the execution mode used in uDocker. Included here for reference purposes.)

Trying to execute the following script

#! /bin/sh
echo "Hello world from the container"
apt-get update && apt-get install -y cowsay fortune
export PATH=$PATH:/usr/games
fortune | cowsay

on an ubuntu:16.04 fails with the following error:

debconf: delaying package configuration, since apt-utils is not installed
Fetched 8733 kB in 1s (4403 kB/s)
E: Can not write log (Is /dev/pts mounted?) - posix_openpt (2: No such file or directory)
dpkg: error: requested operation requires superuser privilege
E: Sub-process /usr/bin/dpkg returned an error code (2)
/tmp/udocker/script.sh: 5: /tmp/udocker/script.sh: fortune: not found
/tmp/udocker/script.sh: 5: /tmp/udocker/script.sh: cowsay: not found
Info: creating repo: /tmp/home/.udocker
Warning: running as uid 0 is not supported by this engine
Warning: non-existing user will be created

Steps to reproduce this issue:
scar.py init -t 60 -m 128 -n test-ubuntu-1604 ubuntu:16.04
scar.py run -p test/test-script.sh test-ubuntu-1604

Allow Running the Lambda Function with no shell-script

The run function currently requires a user-supplied shell-script. If omitted, the following error arises:

Traceback (most recent call last):
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 321, in <module>
    Scar().execute()
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 210, in execute
    args.func(args)
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 296, in run
    script = "{ \"script\" : \"" + escape_string(args.payload.read()) + "\"}"
AttributeError: 'NoneType' object has no attribute 'read'

Passing this script should be optional. An example of a Docker image not requiring the execution of a script, useful for testing, is chuanwen/cowsay.
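
A minimal sketch of making the script optional in the run code shown above (escape_string is the existing helper in scar.py):

if args.payload:
    script = "{ \"script\" : \"" + escape_string(args.payload.read()) + "\"}"
else:
    script = "{}"     # invoke the function with an empty payload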

Error when Removing Non-Existent Lambda Function

Attempting to remove a non-existent Lambda function results in a nasty error:

scar rm i-do-not-exist
Traceback (most recent call last):
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 496, in <module>
    CmdParser().execute()
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 493, in execute
    args.func(args)
  File "/Users/gmolto/Documents/GitHub/grycap/scar/scar.py", line 223, in rm
    lambda_response = AwsClient().get_lambda().delete_function(FunctionName=args.name)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Library/Python/2.7/site-packages/botocore/client.py", line 557, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ResourceNotFoundException: An error occurred (ResourceNotFoundException) when calling the DeleteFunction operation: Function not found: arn:aws:lambda:us-east-1:974349055189:function:test2

Adapt Lambda Function Code to Python 3.6

The current Lambda function does not work under the Python 3.6 runtime environment. It throws this error:

AttributeError: module 'urllib' has no attribute 'URLopener'

We should deprecate Python 2.7 and stick with Python 3.6 unless there is a sound reason not to.
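
For reference, a sketch of one way to make the download code run on both runtimes, assuming the current code relies on urllib.URLopener().retrieve:

try:
    from urllib.request import urlretrieve   # Python 3.6
except ImportError:
    from urllib import urlretrieve           # Python 2.7

# URL and destination path are placeholders.
urlretrieve('https://example.com/udocker', '/tmp/udocker')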

Support Passing Arguments to the Docker Container

The run operation should allow specifying arguments to be passed to the Docker container.

Just as Docker / uDocker support this:

docker run --rm chuanwen/cowsay /usr/games/cowsay "Hello"

SCAR should support:

scar run lambda-docker-cowsay /usr/games/cowsay "Hello"

Two-level Caching to Speed Up Container Deployment

Images are currently always retrieved from Docker Hub.
A two-level caching mechanism involving Amazon S3 and the /tmp directory in the Lambda function is required to accelerate the staging of Docker images into the Lambda function runtime environment before uDocker can start the container.

The Lambda function should check whether the Docker image is already available in /tmp; otherwise, it should check whether it is in a user-defined S3 bucket; otherwise, it should fall back to Docker Hub while populating the intermediate caches for future hits. For parameter-sweep applications this is a must, since many cache hits are expected. A naming scheme for the Docker images is required.
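
A high-level sketch of the proposed lookup order (the bucket name, key scheme and the udocker pull/save subcommands are assumptions, not the current implementation):

import os
import subprocess
import boto3
from botocore.exceptions import ClientError

def stage_image(image_name, cache_bucket):
    # Return a local tarball of the image, using /tmp and S3 as caches.
    tarball = '/tmp/' + image_name.replace('/', '_').replace(':', '_') + '.tar'
    if os.path.exists(tarball):                              # level 1: /tmp cache
        return tarball
    s3_client = boto3.client('s3')
    key = 'image-cache/' + os.path.basename(tarball)
    try:
        s3_client.download_file(cache_bucket, key, tarball)  # level 2: S3 cache
    except ClientError:
        # Cache miss: fall back to Docker Hub, then populate both caches.
        subprocess.check_call(['udocker', 'pull', image_name])
        subprocess.check_call(['udocker', 'save', '-o', tarball, image_name])
        s3_client.upload_file(tarball, cache_bucket, key)
    return tarball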
