Automating infrastructure provisioning is key to reducing development time.
No one is better placed to do this than the developer, so that the DevOps engineer has less heavy lifting to do during a time-crunched release window.
In this talk, we will:
- Cover how to build a Python NLP app in the cloud using replicable infrastructure provisioning code
- Give transferable learnings on building serverless apps on any cloud infrastructure (here, AWS is chosen)
- Use a combination of CLI commands and YAML Taskfiles to provision the AWS infrastructure
application code development
applied to their infrastructure provisioning codes
The right DevOps culture ultimately helps you deliver better products faster.
“It is not the strongest of the species that survive, nor the most intelligent, but the one most responsive to change.” Charles Darwin
The reason why we chose AWS CLI is that
*arguable personal opinion
/path/where/lambda_codes/located % cat aws_cli_command_for_lambda_creation.bash
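#!/bin/bash
# Usage: bash aws_cli_command_for_lambda_creation.bash <function_name> <iam_role_arn> <layer_arn>
# Arguments are quoted so that empty values or values with spaces fail loudly instead of shifting flags.
aws lambda create-function \
  --function-name "$1" \
  --zip-file "fileb://${1}.zip" \
  --runtime python3.8 \
  --role "$2" \
  --handler lambda_function.lambda_handler \
  --timeout 60 \
  --memory-size 256 \
  --layers "$3" \
  --architectures x86_64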
task <task_name>
/path/where/lambda_codes/located % cat Taskfile.yml
...
...
tasks:
  create_lambda_name:
    cmds:
      - zip -r ${LAMBDA_FUNCTION_NAME}.zip lambda_function.py
      - bash aws_cli_command_for_lambda_creation.bash $LAMBDA_FUNCTION_NAME $IAM_ROLE_ARN $SPACY_LAYER
# how to create IAM policy and roles
/path/where/IAM_Taskfile/is/located/ % task create_policy && task create_role && task attach_role_to_policy
# how lambda function is created
/path/where/LAMBDA_Taskfile/is/located/ % task create_lambda_name && task update_lambda_environment
# how to test the lambda
/path/where/Testing_Taskfile/is/located/ % task run_test_event_1
{
"StatusCode": 200, "ExecutedVersion": "$LATEST"
},
{
"output_bucket_name": "pycon-$USER-nlp-output-bkt", "file_key": "email_1.txt",
"message": "PII Redaction Pipeline worked successfully"
}
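The two JSON objects above appear to be the invocation metadata followed by the function's own response payload. As a rough illustration of how the same pair could be read programmatically, here is a minimal Python sketch. It assumes a boto3-style Lambda client is passed in; the talk itself drives everything through the AWS CLI and Taskfiles, and the helper name `invoke_and_parse` is hypothetical.

```python
import json


def invoke_and_parse(lambda_client, function_name, event):
    """Invoke a Lambda synchronously and return (status_code, decoded_payload).

    `lambda_client` is assumed to behave like boto3.client("lambda");
    `function_name` and `event` are whatever the test task would send.
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function's result
        Payload=json.dumps(event).encode("utf-8"),
    )
    # The response payload is a readable stream holding the handler's JSON result.
    payload = json.loads(response["Payload"].read())
    return response["StatusCode"], payload
```

With boto3 installed and credentials configured, this might be called as `invoke_and_parse(boto3.client("lambda"), "<function_name>", test_event)`, where both arguments are placeholders for the values used by the test task.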
Alternate Options:
Tasks
Broadly, there are two major types of NLP Pipelines:
Of these two ways of defining an NLP pipeline, the second, the data-engineering-based pipeline, is what we will build in this talk.
Note: This pipeline is intentionally kept simple. Real-world serverless pipelines can be much more complex.
- Create the S3 Trigger bucket (s3_1 in pic), intermediary S3 bucket and Output S3 Bucket (s3_2 in pic)
- With no special or extra packages, in a standard Python 3.8 Lambda environment:
  - create a Lambda that replaces phone numbers and email addresses, and
  - test it with a sample CSV file
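The standard-library-only step above can be sketched roughly as follows. This is an illustrative guess, not the talk's actual `lambda_function.py`: the regular expressions, the placeholder strings `<EMAIL>` and `<PHONE>`, the function names, and the sample CSV are all assumptions, and real phone and email formats vary far more than these patterns cover.

```python
import csv
import io
import re

# Deliberately simple, assumed patterns; not exhaustive for real-world data.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d\s().-]{7,}\d(?!\w)")


def redact_pii(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    return PHONE_RE.sub("<PHONE>", text)


def redact_csv(csv_text: str) -> str:
    """Redact every cell of a CSV document and return the new CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([redact_pii(cell) for cell in row])
    return out.getvalue()


# Hypothetical sample data, only for trying the functions locally.
sample = "name,contact\nJane,jane.doe@example.com\nRavi,+1 415-555-0100\n"
print(redact_csv(sample))
```

Because only `re`, `csv`, and `io` are used, a function like this needs no layer; heavier NLP dependencies are what would require packaging beyond the standard runtime.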
amazon/aws-lambda-python:3.8
and publish as a layer
# set up the temporary AWS credentials
## task executes the task from `Taskfile.yml` in the `/path/to/serverless_nlp_app`
/path/to/serverless_nlp_app/src/aws/2.stepfunctions_invoke_lambda/c.testing % task run_test_event_1
Taskfile.yml
AWS CLI + Taskfile.yml approach
A DevOps mindset ensures a better software development cycle