Developer Associate - Study Notes

Notes taken assuming the SA and SysOps material is already completed


Developer Associate

S3 Performance

Historically

  • When you had > 100 TPS, S3 performance could degrade

  • Behind the scenes each object goes to an S3 partition and for best perf you want high partition distribution

  • In the exam, and in life historically, it was recommended to have random characters in front of your key name to optimise perf (partition distribution)

    • <my_bucket>/5r4d_my_folder/my_file1.txt

    • <my_bucket>/a91e_my_folder/my_File2.txt

  • It was recommended never to use dates to prefix keys

Current State

  • As of July 17 2018 it scales up to 3500 TPS for PUT and 5500 TPS for GET for EACH PREFIX

  • Negates previous guidance to randomize object prefixes to achieve faster perf

Performance

  • For faster upload of large objects (>=100MB), use multipart upload (see the boto3 sketch at the end of this list):

    • parallelizes PUTs for greater throughput

    • maximize your network bandwidth and efficiency

    • decreases time to retry in case a part fails

    • must use multi-part upload if object size is greater than 5GB

  • Use CloudFront to cache S3 objects around the world (improves reads)

  • S3 Transfer Acceleration (use edge locations, improves writes) - just need to change the endpoint you write to, not the code

  • If using SSE-KMS encryption you may be limited by your AWS KMS usage limits (~100s to 1000s of downloads/uploads per second; request a limit increase if needed)
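
A minimal boto3 sketch of a multipart upload, assuming a hypothetical bucket and file name; the high-level transfer API switches to parallel multipart PUTs above the configured threshold:

    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client("s3")
    # ~100MB parts are uploaded in parallel once the object crosses the threshold
    config = TransferConfig(multipart_threshold=100 * 1024 * 1024,
                            multipart_chunksize=100 * 1024 * 1024,
                            max_concurrency=8)
    s3.upload_file("big_file.bin", "my-bucket", "big_file.bin", Config=config)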

S3 Select & Glacier Select

  • When you retrieve data from S3 or Glacier you may only want a subset of it

  • If you retrieve all the data the network costs may be high

  • With S3 Select / Glacier Select you can use SQL SELECT queries to tell S3 or Glacier exactly which attributes / filters you want (columns / rows); a boto3 sketch follows this section

    • select * from s3object s where s."Country (Name)" like '%United States%'

  • Save up to 80% and increase perf by up to 400%

  • the "SELECT" happens within S3 or Glacier

  • Works with files in CSV, JSON, or Parquet

  • Files can be compressed with GZIP or BZIP2

  • No subqueries or Joins are supported
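
A hedged boto3 sketch of the query above, assuming a hypothetical bucket and a gzipped CSV with headers; the filtering runs inside S3 and only matching rows cross the network:

    import boto3

    s3 = boto3.client("s3")
    resp = s3.select_object_content(
        Bucket="my-bucket",      # hypothetical bucket / key
        Key="data.csv.gz",
        ExpressionType="SQL",
        Expression="select * from s3object s where s.\"Country (Name)\" like '%United States%'",
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}, "CompressionType": "GZIP"},
        OutputSerialization={"CSV": {}},
    )
    for event in resp["Payload"]:        # results stream back as events
        if "Records" in event:
            print(event["Records"]["Payload"].decode())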

CLI

  • Dry Runs: Some AWS commands (not all) contain a --dry-run option to simulate API calls

  • Multiple Profiles

    • aws configure --profile xxx

    • aws s3 ls --profile xxx


Developing on AWS

AWS CLI STS Decode Errors

  • When you run API calls and they fail, you can decode the encoded authorization error message using the STS command line:

  • aws sts decode-authorization-message --encoded-message <value>

EC2 Instance Metadata

  • Allows EC2 Instances to learn about themselves without using an IAM Role for that purpose

  • http://169.254.169.254/latest/meta-data/

  • Can retrieve the IAM Role name from the metadata but CANNOT retrieve the IAM policy

AWS SDK

  • Perform actions on AWS directly from your application's code using an SDK (the AWS CLI is itself a wrapper around boto3/botocore)

    • Java, .NET, Node.js, PHP, Python (boto3/botocore), Go, Ruby, C++

  • Recommended to use the default credential provider chain

    • Which works seamlessly with:

      • AWS credentials at ~/.aws/credentials (only on our own machines / on premises)

      • Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY), not really recommended

  • Best practice is for credentials to be inherited from the mechanisms above, and to use IAM roles 100% of the time when working from within AWS services (see the sketch below)
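
A minimal sketch of relying on the default credential provider chain - no keys in code; boto3 finds credentials from the environment, ~/.aws/credentials, or the attached IAM role:

    import boto3

    # No credentials passed explicitly: the default provider chain resolves them
    s3 = boto3.client("s3", region_name="us-east-1")
    for bucket in s3.list_buckets()["Buckets"]:
        print(bucket["Name"])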

Exponential Backoff

  • Any API that fails because of too many calls needs to be retried with Exponential Backoff

  • This applies to rate-limited APIs

  • Retry mechanism included in SDK API calls

  • 2ms, 4ms, 8ms, 16ms, 32ms, 64ms, etc (the wait doubles each retry; see the sketch below)
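
A sketch of the idea with a hypothetical helper name (the AWS SDKs already retry with exponential backoff for you by default):

    import random
    import time

    def call_with_backoff(fn, max_retries=5):
        # Double the wait on each attempt, plus jitter to avoid synchronized retries
        for attempt in range(max_retries):
            try:
                return fn()
            except Exception:  # in practice, catch only the SDK's throttling error
                time.sleep((2 ** attempt) * 0.1 + random.uniform(0, 0.1))
        return fn()  # final attempt; let any error propagate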


AWS Elastic Beanstalk

  • Uses CloudFormation under the hood

  • Managed service

    • Instance configuration / OS is handled by beanstalk

    • Deployment strategy is configurable but performed by Elastic Beanstalk

  • Just the application code is the responsibility of the developer

  • Three architecture models:

    • Single Instance deployment: good for dev

    • LB + ASG: great for production or pre-production web applications

    • ASG only: great for non-web applications in production (workers, etc)

  • Elastic Beanstalk has three components

    • Application

    • Application Version: Each deployment gets assigned a new version

    • Environment name (dev, test, prod): free naming

  • You deploy application versions to environments and can promote application versions to the next environment

  • Rollback feature to previous application versions

  • Full control over lifecycle of environments

Deployment Options for Updates

  • All at once (deploy all in one go): fastest, but instances aren't available to serve traffic during the downtime

  • Rolling: Update a few instances at a time (bucket), and then move on to the next bucket once the first bucket is healthy

  • Rolling with additional batches: like rolling, but spins up new instances for the batch (so the app always runs at max capacity)

  • Immutable: spins up new instances in a new ASG, deploys the version to them, then moves them into the old ASG when everything is healthy and terminates the old instances. Highest cost; quick rollback (just terminate the new ASG); good for prod.

Blue/Green Deployment

  • Create a new "stage" env, deploy v2 there

  • New env (green) can be fully validated and roll back if issues

  • Route 53 can be set up using weighted policies to redirect traffic bit by bit to the new env

  • With Beanstalk, use "swap URLs" when done testing the new environment


Elastic Beanstalk Extensions

  • A zip file containing our code must be deployed to Elastic Beanstalk

  • All the parameters set in the UI can be configured with code using files

  • Requirements:

    • in the .ebextensions/ directory in the root of the source code

    • YAML / JSON format

    • .config extension (ex: logging.config)

    • Able to modify some default settings using: option_settings (see the sketch after this list)

    • Ability to add resources such as RDS, ElastiCache, DynamoDB, etc

  • Resources managed by .ebextensions get deleted if the environment goes away
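
A hypothetical .ebextensions file as a sketch - the option_settings namespaces below are real Beanstalk namespaces, the values are made up:

    # .ebextensions/options.config  (YAML, .config extension)
    option_settings:
      aws:elasticbeanstalk:application:environment:
        API_URL: https://api.example.com     # an environment variable for the app
      aws:autoscaling:asg:
        MinSize: 2
        MaxSize: 4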

Elastic Beanstalk CLI

  • We can install an additional CLI called the "EB CLI" which makes working with Beanstalk from the CLI easier

  • Basic Commands are:

    • eb create, eb status, eb health, eb events, eb logs, eb open, eb deploy, eb config, eb terminate

  • Helpful for your automated deployment pipelines

Elastic Beanstalk Deployment Mechanism

  • Describe dependencies

    • (requirements.txt for Python, package.json for Node.js)

  • Package code as zip

  • Zip file is uploaded to each EC2 machine

  • Each EC2 machine resolves dependencies (SLOW)

  • Optimization in case of long deployments: package dependencies with the source code to improve deployment performance and speed

Exam Tips

  • Beanstalk with HTTPS

    • Load the SSL certificate onto the LB (see the config sketch after this list)

    • Can be done from the console (EB console - LB config)

    • Can be done via code: .ebextensions/securelistener-alb.config

      • (for a Classic Load Balancer, the docs sample is securelistener-clb.config)

    • SSL Certificate can be provisioned using ACM or CLI

    • Must configure a security group rule to allow incoming port 443

  • Beanstalk redirect HTTP to HTTPS

    • Configure your instance to redirect HTTP to HTTPS

    • OR configure the ALB (only) with a rule

    • Make sure health checks are not redirected (so they keep returning 200 OK)
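
A sketch of the HTTPS listener config, based on the AWS docs sample for an ALB; the certificate ARN is a placeholder:

    # .ebextensions/securelistener-alb.config
    option_settings:
      aws:elbv2:listener:443:
        ListenerEnabled: 'true'
        Protocol: HTTPS
        SSLCertificateArns: arn:aws:acm:us-east-1:123456789012:certificate/example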

Beanstalk Lifecycle Policy

  • Elastic Beanstalk can store at most 1000 application versions

  • If you don't remove old versions you won't be able to deploy any more

  • To phase out old application versions use a lifecycle policy

    • Based on time (old versions are removed)

    • Based on space (when you have too many versions)

  • Versions that are currently in use won't be deleted

  • Option not to delete the source bundle in S3 to prevent data loss

Web Server vs Worker Environment

  • If your application performs long-running tasks, offload them to a dedicated worker environment

  • Decoupling your application into two tiers is common

  • Example: processing a video, generating a zip file, etc.

  • You can define periodic tasks in a file - cron.yaml (see the sketch below)
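
A hypothetical cron.yaml sketch - the worker tier POSTs to the given URL on the schedule; the name, url, and schedule are made up:

    # cron.yaml
    version: 1
    cron:
      - name: "nightly-cleanup"
        url: "/tasks/cleanup"
        schedule: "0 3 * * *"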


RDS with Elastic Beanstalk

  • RDS can be provisioned with Beanstalk, which is great for dev/test

    • Not so great for Prod, as the database lifecycle is tied to the Beanstalk environment lifecycle

    • The best for Prod is to separately create an RDS database and provide our EB application with the connection string

  • Steps to migrate from RDS coupled in EB to standalone RDS:

    • Take an RDS DB snapshot (backup)

    • Enable deletion protection in RDS

    • Create a new Beanstalk env without an RDS, point to existing old RDS

    • Perform a blue/green deployment and swap old and new environments

    • Terminate the old environment (RDS will not be deleted due to protection)

    • Delete the CloudFormation stack (will be in DELETE_FAILED due to not deleting RDS, that's ok)


CI/CD


CodeCommit

  • Version control is the ability to understand the various changes that happen to code over time (and possibly roll back)

  • All of this is enabled by using a version control system such as Git (can be local, but is usually centralized online)

  • Benefits are:

    • Collaborate with other devs

    • Make sure code is backed up somewhere

    • Make sure it's fully viewable and auditable

  • AWS CodeCommit:

    • Private Git repositories

    • No size limit on repositories (and scales seamlessly)

    • Fully managed, highly available

    • Code remains in AWS Cloud account -> increased security and compliance

    • Secure (encrypted, access control, etc)

    • Integrates with Jenkins, CodeBuild, other CI tools

CodeCommit Security

  • Interactions are done using Git (standard)

  • Authentication in Git

    • SSH Keys: AWS Users can configure SSH keys in their IAM Console

    • HTTPS: Done through the AWS CLI Authentication helper or Generating HTTPS credentials

    • MFA can be enabled

  • Authorization in Git

    • IAM Policies manage user / role rights to repositories

  • Encryption

    • Repositories are automatically encrypted at rest using KMS

    • Encrypted in transit (can only use HTTPS or SSH)

  • Cross account access

    • Do not share SSH keys

    • Do not share AWS credentials

    • Use IAM Role in your AWS Account and AWS STS in recipient account (with AssumeRole API)

CodeCommit Notifications

  • You can trigger notifications in CodeCommit using AWS SNS or AWS Lambda or AWS CloudWatch Event Rules

  • Use cases for SNS / AWS Lambda notifications (reacting to code modifications)

    • Deletion of branches

    • Trigger for pushes that happen in master branch

    • Notify external Build System

    • Trigger AWS Lambda function to perform codebase analysis (maybe credentials got committed in code, etc)

  • Use cases for CloudWatch Event Rules (more around pull requests)

    • Trigger for pull request updates (created, updated, deleted, commented)

    • Commit comment events

    • CloudWatch Event Rules goes into an SNS topic

  • Triggers vs Notifications


CodePipeline

  • Continuous Delivery

  • Visual Workflow

  • Source: GitHub / CodeCommit / S3

  • Build: CodeBuild / Jenkins / etc

  • Load Testing: 3rd party tools

  • Deploy: AWS CodeDeploy / BeanStalk / CloudFormation / ECS

  • Made of stages

    • Each stage can have sequential actions and/or parallel actions

    • Stage examples: Build / Test / Deploy / LoadTest / etc

    • Manual approval can be defined at any stage

CodePipeline Artifacts

  • Each pipeline stage can create 'Artifacts'

  • Artifacts are passed and stored in S3 and on to the next stage


CodePipeline Troubleshooting

  • CodePipeline state changes happen in AWS CloudWatch Events, which can in turn create SNS notifications

    • ex: you can create events for failed pipelines

    • ex: you can create events for cancelled stages

  • If CodePipeline fails a stage, your pipeline stops and you can get information in the console

  • AWS CloudTrail can be used to audit AWS API calls

  • If Pipeline can't perform an action, make sure the "IAM Service Role" attached has enough permissions (IAM Policy)

  • Pipeline stages can have multiple action groups

  • Can have sequential and parallel action groups

CodeBuild

  • Fully managed build service, an alternative to tools like Jenkins

  • Continuous scaling (no servers to manage, no build queue)

  • Pay for usage: the time it takes to complete the builds

  • Leverages Docker under the hood for reproducible builds

  • Possibility to extend capabilities by leveraging our own base Docker images

  • Secure: integration with KMS for encryption of build artifacts, IAM for build permissions, VPC for network security, and CloudTrail for API call logging

CodeBuild Overview

  • Source Code from GitHub / CodeCommit / CodePipeline / S3 etc

  • Build instructions can be defined in code (buildspec.yml file)

  • Output logs to Amazon S3 & AWS CloudWatch Logs (go look there to find build errors)

  • Metrics to monitor CodeBuild Statistics

  • Use CloudWatch Alarms to detect failed builds and trigger notifications

  • CloudWatch Events / AWS Lambda as glue

  • SNS notifications

  • Ability to reproduce CodeBuild locally to troubleshoot in case of errors

    • In case of trouble shooting beyond available logs

    • Install Docker on desktop

    • Leverage CodeBuild Agent

  • Builds can be defined within CodePipeline or CodeBuild itself

  • Java, Ruby, Python, Go, Node.js, Android, .NET Core, PHP; use Docker to extend to any environment you like (fully extensible)


Buildspec.yml

  • Must be at the root of the code (see the sketch after this list)

  • Define environment variables

    • Plaintext variables

    • Secure secrets: use SSM Parameter Store

  • Phases (specify commands to run):

    • Install: install dependencies you may need for your build

    • Pre build: final commands to execute before build

    • Build: actual build commands

    • Post build: finishing touches (zip output for example)

  • Artifacts: what to upload to S3 (encrypted with KMS)

  • Cache: Files to cache (usually dependencies) to S3 for future build speedup
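
A minimal buildspec.yml sketch for a hypothetical Python project - the parameter-store path, commands, and cache path are placeholders:

    # buildspec.yml (at the root of the code)
    version: 0.2
    env:
      parameter-store:
        DB_PASSWORD: /myapp/db/password    # secret pulled from SSM Parameter Store
    phases:
      install:
        commands:
          - pip install -r requirements.txt
      build:
        commands:
          - pytest
    artifacts:
      files:
        - '**/*'                           # uploaded to S3 (encrypted with KMS)
    cache:
      paths:
        - '/root/.cache/pip/**/*'          # cached to S3 to speed up future builds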

CodeDeploy

  • Each EC2 or On-Prem server must be running the CodeDeploy Agent

  • The agent is continually polling AWS CodeDeploy for work to do

  • CodeDeploy sends the appspec.yml file

  • Application is pulled from GitHub or S3

  • EC2 will run the deployment instructions

  • CodeDeploy Agent will report on the success or failure of deployment on the instance

  • EC2 instances are grouped by deployment group (dev/test/prod)

  • Lots of flexibility to define any type of deployment

  • CodeDeploy can be chained into CodePipeline and use artifacts from there

  • CodeDeploy can re-use existing setup tools, works with any application, auto-scaling integration

  • Note: Blue / Green only works with EC2, not on prem

  • Support for AWS Lambda deployments

  • CodeDeploy does not provision resources

  • Only In-Place and Blue/Green deployment types

Primary Components (don't need to memorize)

  • Application: unique name

  • Compute Platform: EC2/On-prem or Lambda

  • Deployment configuration: Deployment rules for success/failure

    • EC2/On-Prem: you can specify the minimum number of healthy instances for the deployment

    • AWS Lambda: specify how traffic is routed to your updated Lambda function versions

  • Deployment Group: Group of tagged instances (allows to deploy gradually)

  • Deployment Type: In-place deployment or blue/green deployment

  • IAM instance profile: need to give EC2 the permissions to pull from S3 / GitHub

  • Application Revision: application code + appspec.yml

  • Service role: role for CodeDeploy to perform what it needs

  • Target revision: Target deployment application version

CodeDeploy appspec.yml (must know for exam; lives in the root of the app source code)

  • Files section: how to source and copy files from S3 / GitHub to the filesystem

  • Hooks: sets of instructions to run to deploy the new version (hooks can have timeouts). They run in these steps - remember the order (a sketch follows this list):

    • ApplicationStop

    • DownloadBundle

    • BeforeInstall

    • AfterInstall

    • ApplicationStart

    • ValidateService (quite important)
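
A sketch of an appspec.yml for an EC2/On-Prem deployment; the destination path and script names are hypothetical:

    # appspec.yml (at the root of the app source code)
    version: 0.0
    os: linux
    files:
      - source: /
        destination: /var/www/myapp
    hooks:
      BeforeInstall:
        - location: scripts/stop_server.sh
          timeout: 300
      ApplicationStart:
        - location: scripts/start_server.sh
          timeout: 300
      ValidateService:
        - location: scripts/health_check.sh   # the important final check
          timeout: 300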

Deployment Config

  • Configs

    • One at a time: one instance at a time; if one instance fails, the deployment stops

    • Half at a time: 50%

    • All at once: quick but no healthy host = downtime. Good for Dev.

    • Custom: ex.: min healthy host = 75%

  • Failures:

    • Instances stay in "failed state"

    • New Deployments will first be deployed to "failed state" instances

    • To rollback: redeploy old deployment or enable automated rollback for failures

  • Deployment Targets:

    • Set of EC2 instances with designated tags

    • Directly to an ASG

    • Mix of ASG / Tags so you can build deployment segments

    • Customization in scripts with DEPLOYMENT_GROUP_NAME environment variable

    • CodeDeploy only deploys to EC2 instances

    • CodeDeploy doesn't require SecurityGroups

AWS CodeStar

  • CodeStar is an integrated solution that regroups: GitHub, CodeCommit, CodeBuild, CodeDeploy, CloudFormation, CodePipeline, CloudWatch

  • Helps quickly create "CI/CD-ready" projects for EC2, Lambda, Beanstalk

  • Ability to integrate with Cloud9

  • One dashboard to view all components

  • Free; you only pay for the underlying resources

  • Limited customization


CloudFormation

  • Update stack button (edit in the designer or upload a new template) lets you preview changes

Resources

  • Core of Template, mandatory

  • Represent components that will be created and configured

  • They are declared and can reference each other

  • Over 224 types, in the form of AWS::aws-product-name::data-type-name

  • Cannot create a dynamic amount of resources (CDK?)

Parameters

  • A way to provide inputs to your templates

  • Important if:

    • You want to reuse templates

    • Some inputs cannot be determined ahead of time

  • Extremely powerful: they give you control and can prevent errors thanks to types

  • No need to re-upload the template each time you want to change something; just change the parameter values

  • Fn::Ref is leveraged to reference parameters; shorthand !Ref (see the sketch at the end of this section)

  • PseudoParameters enabled by default:

    • AWS::AccountId

    • AWS::NotificationARNs

    • AWS::NoValue

    • AWS::Region

    • AWS::StackId

    • AWS::StackName
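
A minimal template sketch showing a typed parameter referenced with !Ref; the parameter name and AMI ID are placeholders:

    Parameters:
      InstanceTypeParam:
        Type: String
        Default: t2.micro
        AllowedValues: [t2.micro, t2.small]   # type + allowed values prevent bad input
    Resources:
      MyInstance:
        Type: AWS::EC2::Instance
        Properties:
          InstanceType: !Ref InstanceTypeParam
          ImageId: ami-12345678               # placeholder AMI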

Mappings

  • Mappings are fixed, hardcoded variables within your template

  • They're handy to differentiate between different environments (dev vs prod), regions, AMI types, etc

  • Example:

    • Mappings:

      • Mapping01:

        • Key01:

          • Name: Value01

        • Key02:

          • Name: Value02

      • RegionMap:

        • us-east-1:

          • "32" : "ami-641120d"

          • "64" : "ami-1241212"

        • us-east-2:

          • "32" : "ami-1231231"

  • Good when you know in advance all the values that could be entered, and they can be deduced from variables such as region, AZ, Account, etc

  • Safer control over the template

  • Only use Parameters when the values are user specific and have to be hand entered

  • Use Fn::FindInMap to return a named value from a specific key

    • !FindInMap [ MapName, TopLevelKey, SecondLevelKey ]

    • !FindInMap [RegionMap, !Ref "AWS::Region", "32"]

Outputs

  • Declares optional output values that we can import into other stacks (if you export them first)

  • Useful, for example, if you define a network CloudFormation and output the variables such as VPC ID and your Subnets IDs

  • Best way to perform cross-stack collaboration, as you let each expert handle their own part of the stack (see the sketch below)

  • Can't delete a stack if its outputs are being referenced by another stack
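
A sketch of the export/import handoff between two hypothetical stacks (all names are made up):

    # Stack A exports a value...
    Outputs:
      MyVpcId:
        Value: !Ref MyVPC
        Export:
          Name: MyVpcId

    # ...and Stack B imports it (Stack A can't be deleted while B references the export)
    Resources:
      MySubnet:
        Type: AWS::EC2::Subnet
        Properties:
          VpcId: !ImportValue MyVpcId
          CidrBlock: 10.0.0.0/24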

Conditions

  • Used to control the creation of resources or outputs based on a condition

  • Each condition can reference another condition, parameter value, or mapping

  • Ex:

    • Conditions:

      • CreateProdResources: !Equals [ !Ref EnvType, prod]

  • Conditions can be applied to resources, outputs, etc

    • Ex:

      • Resources:

        • MountPoint:

          • Type: "AWS::EC2::VolumeAttachment"

          • Condition: CreateProdResources

Intrinsic Functions

  • Fn::Ref = !Ref

    • Parameters -> returns the value of the parameter

    • Resources -> returns the physical ID of the underlying resource

  • Fn::GetAtt = !GetAtt

    • Attributes are attached to any resource you create (check the docs for each resource's available attributes)

  • Fn::FindInMap = !FindInMap

    • !FindInMap [ MapName, TopLevelKey, SecondLevelKey]

  • Fn::ImportValue = !ImportValue

    • Import values that have been exported from other templates

  • Fn::Join

    • Join values with a delimiter

    • !Join [ delimiter, [ comma-delimited list of values ] ]

    • "a:b:c" = !Join [ ":", [ a, b, c ] ]

  • Fn::Sub = !Sub

    • Substitute variables in a text; can combine with references or pseudo variables. The string must contain ${VariableName}

    • !Sub

      - String

      - { Var1Name: Var1Value, Var2Name: Var2Value }

  • Condition Functions (Fn::If, Fn::Not, Fn::Equals, Fn::Or, Fn::And)

Rollback on failures

  • Stack Creation fails: (CreateStack API) - Stack Creation Options

    • Default: everything rolls back (gets deleted)

      • OnFailure=ROLLBACK

    • Troubleshoot: Option to disable rollback to manually troubleshoot

      • OnFailure=DO_NOTHING

    • Delete: Get rid of stack entirely, don't keep anything

      • OnFailure=DELETE

  • Stack Update Fails: (UpdateStack API)

    • The stack automatically rolls back to the previous known working state

    • Ability to see in the logs what happened
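
For example, the OnFailure behaviour can be set from the CLI at creation time (stack and template names are placeholders):

    aws cloudformation create-stack --stack-name my-stack \
        --template-body file://template.yaml --on-failure DO_NOTHING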


AWS Monitoring, Troubleshooting, and Auditing

CloudWatch Metrics

  • CloudWatch provides metrics for every service in AWS

  • Metric is a variable to monitor (CPUUtilization, NetworkIn)

  • Metrics belong to namespaces

  • Dimension is an attribute of a metric (instance id, environment, etc)

  • Up to 10 dimensions per metric

  • Metrics have timestamps

  • Can create a CloudWatch dashboard of metrics

EC2 Detailed Monitoring

  • EC2 instances have metrics every 5 minutes by default

  • With detailed monitoring you get data every 1 minute, for a cost

  • The Free Tier allows us to have 10 detailed monitoring metrics

Custom Metrics

  • Possibility to define and send your own custom metrics to CloudWatch

  • Ability to use dimensions (attributes) to segment metrics

    • Instance.id

    • Environment.name

  • Metric resolution:

    • Standard: 1 minute

    • High resolution: up to 1 second (StorageResolution API parameter), for higher cost

  • Use the PutMetricData API call (see the sketch after this list)

  • Use exponential backoff in case of throttle errors

  • In an ASG, group metrics collection is not enabled by default
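
A minimal boto3 sketch of PutMetricData with one dimension - the namespace, metric name, and values are hypothetical:

    import boto3

    cw = boto3.client("cloudwatch")
    cw.put_metric_data(
        Namespace="MyApp",
        MetricData=[{
            "MetricName": "ActiveSessions",
            "Dimensions": [{"Name": "Environment", "Value": "prod"}],
            "Value": 42.0,
            "StorageResolution": 1,    # high resolution (1 second), higher cost
        }],
    )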

CloudWatch Alarms

  • Alarms are used to trigger notifications for any metric

  • Alarms can go to Auto Scaling, EC2 Actions, SNS Notifications

  • Various options (sampling, %, max, min, etc)

  • Alarm States:

    • OK

    • INSUFFICIENT_DATA

    • ALARM

  • Period:

    • Length of time in seconds to evaluate the metric

    • High resolution custom metrics: can only choose 10 sec or 30 sec

  • Alarm Targets (exam)

    • Stop, Terminate, Reboot, or Recover an EC2 instance

    • Trigger autoscaling action

    • Send notification to SNS (from which you can do almost anything)

  • Good to know

    • Alarms can be created based on CloudWatch Logs Metrics Filters

    • CloudWatch doesn't test or validate the actions that are assigned

    • To test alarms and notifications, set the alarm state to Alarm using CLI

      • aws cloudwatch set-alarm-state --alarm-name "myalarm" --state-value ALARM --state-reason "testing purposes"

CloudWatch Events

  • Source + Rule -> Target

  • Schedule: Like a cron job (same format)

  • Event Pattern: Event rules to react to a service doing something (Ex: CodePipeline state changes)

  • Triggers to Lambda functions, SQS/SNS/Kinesis Messages

  • CloudWatch Event creates a small JSON document to give info on the change

CloudWatch Logs

  • Applications can send logs to CloudWatch via the SDK

  • CloudWatch can collect logs from:

    • Elastic Beanstalk: Collects from application

    • ECS: Collects from containers

    • Lambda: Collects from functions

    • VPC Flow Logs

    • API Gateway

    • CloudTrail based on filter

    • CloudWatch Logs Agents: For example on EC2 machines

    • Route53: Logs DNS queries

  • CloudWatch logs can go to:

    • Batch exporter to S3 for archival

    • Stream to ElasticSearch cluster for further analytics

  • Never expire by default

  • CloudWatch Logs can use filter expressions

  • Log storage architecture:

    • Log groups: arbitrary name, usually representing an application

    • Log streams: instances within the application / log files / containers

  • Can define log expiration policies

  • Using the CLI we can tail CloudWatch Logs

  • To send logs to CloudWatch, make sure IAM permissions are correct!

  • Security: encryption of logs using KMS at the Group level

X-Ray (exam)

  • Debugging in Production, the good old way:

    • Test Locally

    • Add log statements everywhere

    • Re-deploy in production

  • Log formats differ across applications, so analyzing them with CloudWatch is hard

  • Debugging: monolith "easy", distributed services "hard"

  • No common view of your entire architecture

  • X-Ray gives a visual analysis of our applications

  • Troubleshooting performance (bottlenecks)

  • Understand dependencies in a microservice architecture

  • Pinpoint service issues

  • Review request behaviour

  • Find errors and exceptions

  • Are we meeting time SLA?

  • Where am I throttled?

  • Identify users that are impacted

  • Compatibility

    • Lambda

    • Beanstalk

    • ECS

    • ELB

    • API GW

    • EC2 instances or any application server (even on-prem)

  • X-Ray Leverages Tracing

    • Tracing is an end-to-end way to follow a "request"

    • Each component dealing with the request adds its own "trace"

    • Tracing is made of segments (+ sub-segments)

    • Annotations can be added to traces to provide extra information

    • Ability to trace

      • Every request

      • Sample of requests (as a % for example or a rate per minute)

    • Security

      • IAM for authorization

      • KMS for encryption at rest

How to enable?

  1. Your code (Java, Python, Go, Node.js, .NET) must import the X-Ray SDK

  • Very little code modification needed

  • The application SDK will then capture:

    • Calls to AWS services

    • HTTP/HTTPS requests

    • Database Calls (MySQL, PostgreSQL, DynamoDB)

    • Queue calls (SQS)

  2. Install the X-Ray daemon or enable X-Ray AWS Integration

  • X-Ray Daemon works as a low level UDP packet interceptor (Win, Lin, Mac)

  • AWS Lambda / other services already run the daemon for you (done via .ebextensions/xray-daemon.config for Beanstalk)

  • Each application must have the IAM rights to write data to X-Ray
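
A minimal Python sketch using the aws-xray-sdk package; the service and segment names are placeholders (on Lambda/Beanstalk the segment is usually managed for you):

    import boto3
    from aws_xray_sdk.core import patch_all, xray_recorder

    patch_all()  # instruments supported libraries (boto3, requests, DB clients, ...)
    xray_recorder.configure(service="my-service")

    # On EC2/on-prem you open segments yourself; the daemon forwards them to X-Ray
    with xray_recorder.in_segment("my-segment"):
        boto3.client("s3").list_buckets()  # recorded as a subsegment of the trace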


  • X-Ray service collects data from all the different services

  • Visual Service map is computed from all the segments and traces

  • X-Ray is graphical, so even non-technical people can help troubleshoot

Troubleshooting

  • If X-Ray is not working on EC2

    • Ensure the EC2 IAM Role has the proper permissions

    • Ensure the EC2 instance is running the X-Ray Daemon

  • To enable on AWS Lambda

    • Ensure it has an IAM execution role with the proper policy (AWSXRayWriteOnlyAccess)

    • Ensure that X-Ray is imported in the code

X-Ray Additional Exam Tips

  • The X-Ray daemon/agent (which must be configured) has a config to send traces cross account:

    • Make sure the IAM permissions are correct - the agent will assume the role

    • This allows you to have a central account for all your application tracing

  • Segments: each application / service sends its own segment

  • Trace: segments collected together to form an end-to-end trace

  • Sampling: decrease the amount of requests sent to X-Ray, to reduce cost or avoid flooding

  • Annotations: Key Value pairs used to index traces and use with filters to be able to search through them (filter based on key/value pair)

  • Metadata: Key Value pairs not indexed, not used for searching

  • Code must be instrumented to use the AWS X-Ray SDK (interceptors, handlers, http clients)

  • IAM role must be correct to send traces to X-Ray

  • X-Ray on EC2 / On-Prem:

    • Linux system must run the X-Ray daemon

    • IAM instance role if EC2, other AWS credentials for on-prem instance

  • X-Ray on Lambda:

    • Make sure X-Ray integration is ticked on Lambda (Lambda runs the daemon)

    • IAM role is Lambda role

  • X-Ray on Beanstalk:

    • Set configuration on EB console

    • Or use a Beanstalk extension (.ebextensions/xray-daemon.config)

  • X-Ray on ECS / EKS / Fargate (Docker):

    • Create a Docker image that runs the Daemon / or use the official X-Ray Docker Image

    • Ensure port mapping network settings are correct and IAM task roles are defined

CloudTrail

  • Provides governance, compliance, and audit for your AWS account

  • Get a history of events / API calls made by Console, SDK, CLI, Services

  • Enabled by default

  • Can put logs from CloudTrail into CloudWatch logs

CloudTrail vs CloudWatch vs X-Ray

  • CloudTrail

    • Audit API calls made by users / services / AWS Console

    • Useful to detect unauthorized calls or root cause of changes

  • CloudWatch

    • CloudWatch Metrics over time for monitoring

    • CloudWatch Logs for storing application logs

    • CloudWatch Alarm to send notifications in case of unexpected metrics

  • X-Ray

    • Automated Trace Analysis and Central Service Map Visualization

    • Latency, Errors, and Fault analysis

    • Request tracking across distributed systems

AWS Integration and Messaging

  • Synchronous communications

  • Asynchronous communications

AWS SQS

  • Producers (send) -> Queue -> Consumers (poll)

  • Scales from 1 message per second to 10,000s per second (nearly unlimited)

  • Default retention of messages: 4 days, maximum of 14 days

  • No limit to how many messages in queue

  • Low latency (<10ms on publish and receive)

  • Horizontal scaling in terms of numbers of consumers

  • Can have duplicate messages occasionally (at-least-once delivery)

  • Can have out of order messages (best effort)

  • Limitation of 256KB per message

SQS Delay Queue

  • Delay a message (consumers don't see it immediately) up to 15 minutes

  • Default is 0 seconds

  • Can set a default at queue level

  • Can override the default using the DelaySeconds parameter

SQS Producing Messages

  • Define Body (up to 256KB string)

  • Add message attributes (metadata, optional)

  • Provide Delay Delivery (optional)

  • Get Back

    • Message identifier

    • MD5 hash of body

SQS Consuming Messages

  • Consumers Poll SQS for messages (receive up to 10 messages at a time)

  • Process the message within the visibility timeout

  • Delete the message using the message ID and receipt handle (see the sketch below)
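
A boto3 sketch of the produce/consume/delete cycle; the queue URL is a placeholder:

    import boto3

    sqs = boto3.client("sqs")
    queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"

    # Produce: body + optional attributes + optional per-message delay
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody="hello world",
        DelaySeconds=0,
        MessageAttributes={"env": {"DataType": "String", "StringValue": "prod"}},
    )

    # Consume: long-poll up to 10 messages, process, then delete each one
    resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10,
                               WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        # ... process msg["Body"] within the visibility timeout ...
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])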

SQS Visibility Timeout

  • When a consumer polls a message from a queue, the message is "invisible" to other consumers for the Visibility Timeout period

    • Set between 0 seconds and 12 hours (default 30 seconds)

    • If too high (15 minutes) and consumer fails to process the message you must wait a long time before processing the message again

    • If too low (30 seconds) and the consumer needs more time to process the message (2 minutes), another consumer will receive the message and it will be processed more than once

  • ChangeMessageVisibility API to change the visibility while processing the message (called by the consumer)

  • DeleteMessage API to tell SQS the message was successfully processed

SQS Dead Letter Queue

  • If a consumer fails to process a message within the Visibility Timeout the message goes back to the queue

  • We can set a threshold for how many times a message can go back: the redrive policy

  • After the threshold is exceeded, the message goes into a Dead Letter Queue (DLQ)

  • We have to create a DLQ first, and then designate it as the dead letter queue

  • Make sure to process messages in the DLQ before they expire

SQS Long Polling

  • When a consumer requests a message from the queue it can optionally wait for messages to arrive if there are none in the queue

  • This is called Long Polling

  • Long polling decreases the number of API calls made to SQS while increasing the efficiency and decreasing the latency of your application

  • The wait time can be between 1 and 20 seconds (20 sec preferable)

  • Long polling is preferable to Short Polling

  • Long Polling can be enabled at the queue level or at the API level using WaitTimeSeconds

  • In the console this setting is called "Receive Message Wait Time"

SQS FIFO Queue

  • First in First out

  • Name of queue must end in .fifo

  • Lower throughput, 300 msg/s or 3000 msg/s in batch

  • Messages are processed in order by the consumer

  • Messages are sent exactly once

  • No per-message delay (only per-queue delay, otherwise FIFO ordering would break)

FIFO Features

  • Deduplication (to not send the same msg twice); a send_message sketch follows this section:

    • Provide a MessageDeduplicationId with your message

    • De-duplication interval is 5 minutes

    • Content-based deduplication: the MessageDeduplicationId is generated as the SHA-256 hash of the message body (not the attributes)

  • Sequencing:

    • To ensure strict ordering between messages you must specify a MessageGroupId

    • Messages with different GroupId may be received out of order

    • Eg. to order messages for a user you could use the "user_id" as a group id

    • Messages with the same Group ID are delivered to one consumer at a time
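
A boto3 sketch of sending to a FIFO queue; the queue URL, group ID, and deduplication ID are placeholders:

    import boto3

    sqs = boto3.client("sqs")
    fifo_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue.fifo"

    # Same MessageGroupId => strict ordering within that group;
    # the deduplication ID suppresses duplicates within the 5-minute interval
    sqs.send_message(
        QueueUrl=fifo_url,
        MessageBody="order-created",
        MessageGroupId="user-1234",
        MessageDeduplicationId="order-5678",
    )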
