aws-lambda-r's People

Contributors

anatofan, mikebadescu, teodorciuraru

aws-lambda-r's Issues

use a bash ini processor to load & save settings

Problem

  • if we have scripts that create resources on AWS (e.g. AMIs), we need to save their IDs

Solution

Note: remember to add the third party references in README and NOTICE files, see
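
A minimal sketch of what saving and loading such IDs could look like, assuming a plain KEY=VALUE file rather than a third-party ini processor. The file name `settings/settings_ids.ini` and the function names `save_setting` and `load_setting` are hypothetical and not part of the existing scripts.

```shell
#!/bin/bash
# Hypothetical helpers: file name and function names are illustrative only.
SETTINGS_FILE="${SETTINGS_FILE:-settings/settings_ids.ini}"

# save_setting KEY VALUE: insert or replace a KEY=VALUE line in the settings file.
save_setting() {
    local key="$1" value="$2" tmp
    mkdir -p "$(dirname "$SETTINGS_FILE")"
    touch "$SETTINGS_FILE"
    tmp="$(mktemp)"
    # Keep every line except an existing entry for this key, then append the new one.
    grep -v "^${key}=" "$SETTINGS_FILE" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$SETTINGS_FILE"
}

# load_setting KEY: print the stored value, or nothing if the key is absent.
load_setting() {
    local key="$1"
    [ -f "$SETTINGS_FILE" ] || return 0
    sed -n "s/^${key}=//p" "$SETTINGS_FILE" | tail -n 1
}

# Example usage (the AMI ID would come from the script that created the resource):
#   save_setting EC2_AMI_ID "$new_ami_id"
#   EC2_AMI_ID="$(load_setting EC2_AMI_ID)"
```

A real ini processor would add sections and comments handling; this sketch only covers flat keys.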

open list

Open list / wishes / references. Once fully defined, move them to separate issues

reconsider building packages via docker?

I think the tooling has improved considerably, and it's fairly straightforward to compile binaries within a Docker image without having to spin up an EC2 instance. I'm considering using this with some colleagues who work exclusively in R, and was hoping to make it as easy and painless as possible for them to get an Rscript scraper deployed with precompiled binaries.

I used this repo to build some Python library binaries in the past: https://github.com/AlJohri/aws-lambda-lxml. The whole thing boiled down to:

version="4.2.4"
docker run -v "$(pwd)/$version/py36/":/outputs -it lambci/lambda:build-python3.6 pip install lxml=="$version" -t /outputs/
tar -czvf "$version/py36/lxml-$version.tgz" "$version/py36/lxml"
aws s3 cp "$version/py36/lxml-$version.tgz" s3://mybucket/lambda-compiled-binaries/py36/

see build.sh

would you consider a PR switching to docker instead of ec2?

setup lambda / api scripts

Problem

  • the scripts rely on certain AWS roles, the Lambda authentication function, etc. (one-time setup)

TODO

  • decide what to automate
    • e.g., no need to create VPC and subnet --> write instructions
  • one-time scripts to create roles (everything related to IAM)
    • instructions / doc / script how to remove
  • one-time scripts to setup lambda
    • instructions / doc / script how to remove
  • one-time scripts to setup API gateway
    • instructions / doc / script how to remove

secrets in echo/log

Problem:

  • what is printed to console might get logged

TODO:

  • make a list of all places where we display secrets; the most important being AWS secrets
  • remove secrets from echo
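
One possible approach, sketched below, is to pass any sensitive value through a masking helper before it reaches `echo`. The function name `mask_secret` and the default of four visible characters are assumptions made for illustration.

```shell
#!/bin/bash
# mask_secret VALUE [VISIBLE]: replace all but the last VISIBLE characters with '*'.
# Hypothetical helper; not part of the existing scripts.
mask_secret() {
    local value="$1" visible="${2:-4}" len masked
    len=${#value}
    if [ "$len" -le "$visible" ]; then
        # Too short to reveal any part safely: mask everything.
        printf '%*s' "$len" '' | tr ' ' '*'
        echo
        return 0
    fi
    # Build a run of '*' as long as the hidden part, then append the visible tail.
    masked="$(printf '%*s' "$((len - visible))" '' | tr ' ' '*')"
    echo "${masked}${value:len-visible}"
}

# Example usage (AWS_ACCOUNT_ID is whatever variable holds the value):
#   echo "INFO : AWS Account ID: $(mask_secret "$AWS_ACCOUNT_ID")"
```

Keys and tokens are probably better not printed at all; a masked tail mainly suits identifiers such as account IDs.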

fail if AWS cannot update lambda / API gateway

Problem

  • not all AWS CLI commands are checked to see if execution was successful

TODO

  • add exit status check after each AWS command (original script 14)
  • fail w/ message if not successful
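
A sketch of how the check could be wrapped once and reused, assuming a hypothetical helper named `run_checked`. The variable `LAMBDA_FUNCTION_NAME` and the zip file name in the usage comment are placeholders.

```shell
#!/bin/bash
# run_checked DESCRIPTION COMMAND [ARGS...]
# Runs the command, and prints an error with the exit status if it fails.
# Hypothetical helper; not part of the existing scripts.
run_checked() {
    local description="$1"
    shift
    "$@"
    local status=$?
    if [ "$status" -ne 0 ]; then
        echo "ERROR: ${description} failed (exit status ${status})." >&2
        return "$status"
    fi
}

# Example usage:
#   run_checked "Update Lambda code" \
#       aws lambda update-function-code \
#           --function-name "$LAMBDA_FUNCTION_NAME" \
#           --zip-file "fileb://lambda.zip" || exit 1
```

Keeping `|| exit 1` at the call site leaves the decision to stop with the calling script, which may want to terminate the EC2 instance first.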

fail if R cannot install packages

Problem:

  • if R fails to install any package, the script does not stop

TODO

  • check if installation of R packages was successful; fail if any package fails installation
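
Since `install.packages()` reports a failed package as a warning rather than stopping, one way to check is to look at the library directory afterwards. The sketch below assumes each installed package has a `DESCRIPTION` file under `<library>/<package>/`; the function name `check_r_packages` is hypothetical.

```shell
#!/bin/bash
# check_r_packages LIB_DIR PKG...
# Succeeds only if every listed package has a DESCRIPTION file in LIB_DIR.
# Hypothetical helper; not part of the existing scripts.
check_r_packages() {
    local lib_dir="$1" pkg missing=0
    shift
    for pkg in "$@"; do
        if [ ! -f "${lib_dir}/${pkg}/DESCRIPTION" ]; then
            echo "ERROR: R package '${pkg}' not found in ${lib_dir}." >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example usage (library path taken from the R output quoted in a later issue):
#   check_r_packages /usr/lib64/R/library jsonlite || exit 1
```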

Which is the best way to install R packages?

On the Amazon AMI I can install from inside R, but then I have this problem:
Installing package into ‘/usr/lib64/R/library’
(as ‘lib’ is unspecified)
Warning in install.packages("jsonlite", repos = "http://cran.r-project.org") :
'lib = "/usr/lib64/R/library"' is not writable
Would you like to use a personal library instead? (y/n)

I can install it this way also:
wget https://cran.r-project.org/src/contrib/jsonlite_1.5.tar.gz
sudo R CMD INSTALL jsonlite_1.5.tar.gz

which is probably the best way, or I could run R with sudo.

Understand EC2 needs

Hello,

Very nice job!

I'm confused about EC2. Reading your documentation, I can't figure out why I need an EC2 instance to build the package. Is it possible to do that locally? Is there some reason for that?

As I understand, after setting up all variables, I just need to run 01_main.sh script, right?

Best regards,
Clovis

use Python 3.6

Problem:

  • This version uses Python 2.7 and rpy2==2.8.4

Solution:

  • Switch to Python 3.6

TODO:

  • install Python 3.6, pip3, update pip, install rpy2 for Python 3
  • make it work / test with R 3.4.x

check size of files within zip

Problem

  • there is no check of the file size; the package may exceed the 250 MB limit for running on Lambda
  • the script does not stop

TODO

  • check total deployment package size (outside zip)
  • fail if too large
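
A sketch of the size check, assuming the unzipped package lives in a single directory. The function name `check_package_size` and the variable `MAX_PACKAGE_MB` are hypothetical; 250 is the limit named in the problem statement.

```shell
#!/bin/bash
# Limit in MB for the unzipped deployment package (hypothetical setting name).
MAX_PACKAGE_MB="${MAX_PACKAGE_MB:-250}"

# check_package_size DIR: fail if the total size of DIR exceeds the limit.
check_package_size() {
    local dir="$1" size_kb limit_kb
    # du -sk prints "<size in KB><TAB><path>"; keep only the size.
    size_kb="$(du -sk "$dir" | cut -f1)"
    limit_kb=$((MAX_PACKAGE_MB * 1024))
    if [ "$size_kb" -gt "$limit_kb" ]; then
        echo "ERROR: ${dir} is ${size_kb} KB, over the ${MAX_PACKAGE_MB} MB limit." >&2
        return 1
    fi
    echo "INFO : ${dir} is ${size_kb} KB (limit ${limit_kb} KB)."
}

# Example usage, before creating the zip:
#   check_package_size "$HOME/$PRJ_NAME" || exit 1
```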

AMI management

Problem:

  • lambda deployment too slow

Solution:

  • use 2 AMIs: Amazon & project

TODO:

  • new feature branch
  • in default settings, use 2 variables
    • ALX_AMI_ID="ami-657bd20a" : last version of Amazon Linux, w/o updates
    • EC2_AMI_ID=""
  • if EC2_AMI_ID not defined or there is no AMI with this ID -->
    • check if there is a file settings/settings_ami.txt and whether it contains a string --> EC2_AMI_ID=string
    • again, if EC2_AMI_ID not defined or there is no AMI with this ID --> EC2_AMI_ID=""
  • if EC2_AMI_ID is defined --> done, skip the steps below
  • consider options not to update with yum or update R packages
  • a script based on scripts 11 and 13:
    • if ALX_AMI_ID is defined but there is no such AMI --> exit 1
    • start with ALX_AMI_ID + update + yum install + virtualenv + pip + cp + cp + install R packages + cp
    • use $PRJ_NAME
    • do not copy files from project; do not create zip
    • create an AMI from this instance
    • write AMI ID in file settings/settings_ami.txt
    • terminate this EC2
  • main script: check if file exists settings/settings_ami.txt and if it contains a string --> EC2_AMI_ID=string
    • if EC2_AMI_ID not defined or no AMI --> exit 1
  • deployment package
    • connect to EC2_AMI_ID
    • copy files (script 06)
    • create zip
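
The lookup order described above (variable first, then `settings/settings_ami.txt`, then fail) could be sketched as follows. The function name `resolve_ami_id` is hypothetical, and the sketch does not verify that the AMI actually exists on AWS; that would need an extra call such as `aws ec2 describe-images --image-ids "$ami_id"`.

```shell
#!/bin/bash
AMI_SETTINGS_FILE="${AMI_SETTINGS_FILE:-settings/settings_ami.txt}"

# resolve_ami_id: print EC2_AMI_ID from the environment, else from the
# settings file; fail if neither provides a value.
# Hypothetical helper; not part of the existing scripts.
resolve_ami_id() {
    local ami_id="${EC2_AMI_ID:-}"
    if [ -z "$ami_id" ] && [ -f "$AMI_SETTINGS_FILE" ]; then
        # Accept a line of the form EC2_AMI_ID=string, with or without quotes.
        ami_id="$(sed -n 's/^EC2_AMI_ID=//p' "$AMI_SETTINGS_FILE" | tail -n 1 | tr -d '"')"
    fi
    if [ -z "$ami_id" ]; then
        echo "ERROR: EC2_AMI_ID is not defined." >&2
        return 1
    fi
    echo "$ami_id"
}

# Example usage in the main script:
#   EC2_AMI_ID="$(resolve_ami_id)" || exit 1
```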

lambda only, no api

Problem

  • not everybody wants to use API Gateway; lambda is sufficient

TODO

  • split Lambda and API script into two
  • option in settings to (or not to) continue with API Gateway setup
  • test of lambda function from AWS CLI

install R packages from github

Problem

  • we have only one list of packages to install from CRAN

TODO

  • another array option with a list of packages to install from github
    • need to use name/repo format
  • install devtools if github array not empty
  • determine size overhead due to devtools
  • script to install from github
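
A sketch of the array option, assuming entries are written as `owner/repo`. The function names and the example entry are hypothetical; the generated lines use `devtools::install_github()`, which would be passed to R by the install script.

```shell
#!/bin/bash
# validate_github_packages ENTRY...: fail on the first entry not in owner/repo format.
# Hypothetical helper; not part of the existing scripts.
validate_github_packages() {
    local entry
    for entry in "$@"; do
        if [[ ! "$entry" =~ ^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$ ]]; then
            echo "ERROR: '${entry}' is not in owner/repo format." >&2
            return 1
        fi
    done
}

# github_install_commands ENTRY...: print one R install call per entry.
github_install_commands() {
    local entry
    for entry in "$@"; do
        echo "devtools::install_github(\"${entry}\")"
    done
}

# Example usage with a hypothetical settings array:
#   R_GITHUB_PACKAGES=("cloudyr/aws.s3")
#   if [ "${#R_GITHUB_PACKAGES[@]}" -gt 0 ]; then
#       validate_github_packages "${R_GITHUB_PACKAGES[@]}" || exit 1
#       github_install_commands "${R_GITHUB_PACKAGES[@]}"
#   fi
```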

Running setup_user.sh PRJ_NAME: command not found

Great project!

I am trying to configure the setup_user.sh script file by following the example file.

I added:

#!/bin/bash

# user settings and secrets
# customize and rename to `setup_user.sh`
# overwrites `secrets_default.sh`, `settings_default.sh`, `setup_auto.sh`
PRJ_NAME ="myproj"
PRJ_BRANCH="myproj"

Then I run scripts/01_main.sh

Got error:

\e[32mINFO :\e[39m Checking project directories
\e[32mINFO :\e[39m Loading default settings and secrets
/Users/macos/Documents/r-ec2/myproj/settings/setup_user.sh: line 6: PRJ_NAME: command not found
\e[31mERROR:\e[39m PRJ_NAME is \e[95mMISSING\e[39m. Exiting.

Is there a simple way to configure the setup_user script?
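
The error on line 6 most likely comes from the space before `=` in `PRJ_NAME ="myproj"`. Bash does not allow spaces around `=` in an assignment, so it tries to run a command called `PRJ_NAME`. A corrected version of the same file, assuming nothing else needs to change:

```shell
#!/bin/bash

# user settings and secrets
# customize and rename to `setup_user.sh`
# overwrites `secrets_default.sh`, `settings_default.sh`, `setup_auto.sh`

# No spaces on either side of '=':
# with a space, bash treats PRJ_NAME as a command and ="myproj" as its argument.
PRJ_NAME="myproj"
PRJ_BRANCH="myproj"
```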

fail if yum cannot install linux packages

Problem:

  • script continues if yum install fails
  • no option in settings to specify what to install

TODO:

  • put yum packages in an array in settings
  • read array, generate install commands
    • best if we have only one yum install command
  • check for errors and fail if error after install
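
A sketch of a single `yum install` driven by an array, assuming a hypothetical settings array `YUM_PACKAGES` and a hypothetical `YUM_CMD` override (defaulting to `sudo yum`) that makes the function easy to try without installing anything.

```shell
#!/bin/bash
# install_yum_packages PKG...: install all packages with one yum command
# and fail with a message if yum reports an error.
# Hypothetical helper; not part of the existing scripts.
install_yum_packages() {
    if [ "$#" -eq 0 ]; then
        echo "INFO : No yum packages requested."
        return 0
    fi
    # YUM_CMD is left unquoted on purpose so "sudo yum" splits into two words.
    if ! ${YUM_CMD:-sudo yum} install -y "$@"; then
        echo "ERROR: yum install failed for: $*" >&2
        return 1
    fi
}

# Example usage with an illustrative package list in settings:
#   YUM_PACKAGES=(gcc make)
#   install_yum_packages "${YUM_PACKAGES[@]}" || exit 1
```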

detailed documentation

Problem

  • documentation is not complete; some setup takes place outside the scripts

TODO

  • Instructions for each secret
    • 2-3 lines explaining where to find the info/value, similar to code comments
    • 1 (max 2) screenshots to help visualize the issue
  • Instructions for what is not in script
    • big picture first
    • most important: do not forget to include any major configuration areas
  • instructions to terminate VMs using AWS web console

Call resources from within the R script?

I assume that the R script in the Lambda has access to the AWS OS it's running on, so theoretically it could use, for example, cloudyr's R library aws.s3 to fetch files from S3.

# get file as raw vector
get_object("mtcars.Rdata", bucket = "my_bucket")

Then you don't have to pass it through Python, and you also avoid rewriting your existing R code base. Or am I missing something? Is this not possible?

Error on transferring Python 3.6 to custom image

Hi

I'm running through a test deployment after setting up AWS, and am seeing this:

INFO : Copy files to local lambda/
INFO : Copy file: python/lambda_get.py to lambda/
INFO : Copy file: python/lambda_post.py to lambda/
INFO : Copy file: example.R to lambda/
INFO : Remove EC2 directory ~/rtest-aws
INFO : Copying local lambda/ to EC2 ~/rtest-aws ...
.gitignore        100%   84     2.8KB/s   00:00
example.R         100% 1281    42.3KB/s   00:00
lambda_get.py     100%  672    22.2KB/s   00:00
lambda_post.py    100%  673    22.3KB/s   00:00
README.md         100%  267     8.8KB/s   00:00
INFO : Configure EC2, create Lambda package/function & API method ...
INFO : Configure AWS on EC2
INFO : Check AWS configuration on EC2 ...
INFO : AWS Account ID: ********3729
INFO : Lambda Function Name: rtest-aws-master-alpha-resource-v1-get
INFO : Creating deployment package
INFO : PWD: /home/ec2-user/rtest-aws
INFO : Transferring Python 3.6 packages to the deployment package
bash: line 392: /home/ec2-user/env/bin/activate: No such file or directory
ERROR: Cannot create Lambda package/function & API method. Terminating end exiting ...

One thought is that perhaps my custom AMI is not right, or perhaps there is a bug somewhere?

error creating Lambda Role

Hi

In script 24-setup_lambda.sh I can't seem to create the Lambda Authorizer Function.
It fails with the following (the IAM user running this has full access to Lambda):

INFO : Creating Lambda Authorizer Function ...

An error occurred (InvalidParameterValueException) when calling the CreateFunction operation: The role defined for the function cannot be assumed by Lambda.

An error occurred (ResourceNotFoundException) when calling the GetFunctionConfiguration operation: Function not found: arn:aws:lambda:ap-southeast-2:11111111111:function:aws-lambda-r-lambda-authorizer
ERROR: Failed to obtain ARN of Lambda Authorizer Function aws-lambda-r-lambda-authorizer. Exiting.
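
This message usually points at the role rather than at the user's permissions: either the role's trust policy does not list `lambda.amazonaws.com` as a principal, or the role was created only seconds earlier and has not yet propagated. Whether either applies here is a guess. The sketch below writes a trust policy that lets Lambda assume a role; the file name and function name are hypothetical, and the role name in the usage comment is a placeholder.

```shell
#!/bin/bash
TRUST_POLICY_FILE="${TRUST_POLICY_FILE:-lambda-trust-policy.json}"

# write_lambda_trust_policy: write a trust policy allowing the Lambda
# service to assume the role it is attached to.
# Hypothetical helper; not part of the existing scripts.
write_lambda_trust_policy() {
    cat > "$TRUST_POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Example usage (ROLE_NAME is a placeholder for the role used by the script):
#   write_lambda_trust_policy
#   aws iam update-assume-role-policy --role-name "$ROLE_NAME" \
#       --policy-document "file://$TRUST_POLICY_FILE"
#   sleep 10   # give IAM a moment to propagate before creating the function
```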
