IAM & S3

Benjamin
5 min read · Aug 30, 2021

Identity and Access Management (IAM)

  • Consists of: Users, Groups, Roles, Policies
  • IAM is universal (global). It is not scoped to regions at this time
  • The root account is the account created when an AWS account is first set up. It has complete admin access
  • New Users have no permissions when first created
  • New Users are assigned an Access Key ID & Secret Access Key when first created. These are not the same as a password, and they can only be viewed once
  • Always set up Multi-Factor Authentication (MFA) on the root account
  • Create and customize a password rotation policy
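
Since a new user starts with no permissions, access has to be granted through a policy. As a minimal sketch of what such a policy document looks like, the snippet below builds a read-only policy for a single bucket as plain JSON. The helper name `read_only_bucket_policy` and the bucket name `example-notes-bucket` are hypothetical and only serve the illustration.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build an identity-based policy that allows read-only access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# "example-notes-bucket" is a made-up name for illustration.
policy = read_only_bucket_policy("example-notes-bucket")
print(json.dumps(policy, indent=2))
```

A document like this could be attached to a User, a Group, or a Role; attaching it to a Group is usually easier to manage than attaching it to each User.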

S3

  • S3 is Object-based: i.e., allows file uploads
  • Files can be from 0 bytes to 5TB
  • There is unlimited storage
  • Files are stored in Buckets
  • S3 is a universal namespace: i.e., names must be unique globally
  • Not suitable to install an OS on
  • Successful uploads will generate HTTP 200 status code
  • By default, all newly created buckets are PRIVATE. Access controls can be set up using: Bucket Policies, ACLs
  • Can be configured to create access logs, which record all requests made to the S3 bucket. These logs can be sent to another bucket, even a bucket in another account
  • Key fundamentals of S3: Key (name of the object), Value (the data, made up of a sequence of bytes)
  • version ID (important for versioning)
  • Metadata (data about data being stored)
  • Subresources: ACLs and Torrents
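  • Read-after-write consistency for PUTs of new objects
  • Overwrite PUTs and DELETEs were historically eventually consistent (changes could take some time to propagate). Since December 2020, S3 provides strong read-after-write consistency for these operations as well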
  • Bucket names share a common name space
  • You can have buckets in individual regions but can view them globally
  • You can replicate contents of one bucket to another bucket automatically by using cross region replication
  • You can change storage classes and encryption of your objects on the fly
  • Storage Classes: S3 Standard, S3-IA, S3 One Zone-IA, S3 Intelligent-Tiering, S3 Glacier, S3 Glacier Deep Archive (the pricing tiers are in this order, from most costly to least costly)
  • Restricting Bucket Access: Bucket policies (apply across the whole bucket), Object ACLs (apply to individual files), IAM policies for Users & Groups (apply to Users & Groups)
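
The fundamentals listed above (key, value, version ID, metadata) can be pictured as a small record. The class below is only a toy model for study purposes, not how S3 stores data; the name `S3ObjectSketch` is hypothetical, and the 5 TB limit is assumed to be expressed in binary units.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Upper bound on object size; assumes "5 TB" means 5 * 1024^4 bytes.
MAX_OBJECT_BYTES = 5 * 1024**4


@dataclass
class S3ObjectSketch:
    key: str                           # name of the object
    value: bytes                       # the data, a sequence of bytes
    version_id: Optional[str] = None   # only meaningful when versioning is enabled
    metadata: Dict[str, str] = field(default_factory=dict)  # data about the data

    def __post_init__(self) -> None:
        # Objects may be empty (0 bytes) but not larger than the limit.
        if len(self.value) > MAX_OBJECT_BYTES:
            raise ValueError("object exceeds the 5 TB limit")


obj = S3ObjectSketch(
    key="folder1/photo.jpg",
    value=b"",  # a 0-byte object is allowed
    metadata={"content-type": "image/jpeg"},
)
print(obj.key, len(obj.value), obj.metadata)
```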

Security And Encryption

  • Encryption In Transit is achieved by: SSL/TLS
  • Encryption At Rest (Server Side) is achieved by: S3 Managed Keys (SSE-S3), AWS Key Management Service Managed Keys (SSE-KMS), Server-Side Encryption with Customer-Provided Keys (SSE-C)
  • Client Side Encryption
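
One common way to enforce both kinds of encryption is a bucket policy that denies anything else. The sketch below builds such a policy; the bucket name and the helper `enforce_encryption_policy` are hypothetical, while `aws:SecureTransport` and `s3:x-amz-server-side-encryption` are standard condition keys.

```python
import json


def enforce_encryption_policy(bucket_name: str) -> dict:
    """Bucket policy sketch: require TLS in transit and SSE headers on upload."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject any request that did not arrive over SSL/TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Reject uploads that do not ask for server-side encryption.
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
            },
        ],
    }


# "example-secure-bucket" is a made-up name for illustration.
encryption_policy = enforce_encryption_policy("example-secure-bucket")
print(json.dumps(encryption_policy, indent=2))
```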

S3 Versioning

  • Stores all versions of an object (including all writes and even if an object is deleted)
  • Great backup tool
  • Once enabled, it cannot be disabled, only suspended
  • Integrates with Lifecycle rules
  • Versioning’s MFA Delete capability, which uses multi-factor authentication, can be used to provide an additional layer of security
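
To see why versioning works as a backup tool, here is a toy in-memory model of a versioned bucket. It is not the real service and all names are hypothetical; it only shows that a delete adds a marker on top of the history instead of erasing earlier versions.

```python
import itertools
from typing import Dict, List, Optional, Tuple


class VersionedBucketSketch:
    """Toy in-memory model of S3 versioning; not the real service."""

    def __init__(self) -> None:
        # key -> list of (version_id, data); data is None for a delete marker
        self._versions: Dict[str, List[Tuple[str, Optional[bytes]]]] = {}
        self._ids = itertools.count(1)

    def put(self, key: str, data: bytes) -> str:
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key: str) -> str:
        # A plain delete only adds a delete marker; older versions remain.
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, None))
        return version_id

    def get(self, key: str, version_id: Optional[str] = None) -> Optional[bytes]:
        history = self._versions.get(key, [])
        if not history:
            return None
        if version_id is None:
            # Latest version; None when the latest entry is a delete marker.
            return history[-1][1]
        for vid, data in history:
            if vid == version_id:
                return data
        return None


bucket = VersionedBucketSketch()
first = bucket.put("report.txt", b"draft")
bucket.put("report.txt", b"final")
bucket.delete("report.txt")

print(bucket.get("report.txt"))         # None: the latest entry is a delete marker
print(bucket.get("report.txt", first))  # b'draft': the old version is still there
```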

Lifecycle Management with S3

  • Automates moving objects between the different storage tiers
  • Can be used in conjunction with versioning
  • Can be applied to current and previous versions
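
A lifecycle rule is essentially a small configuration document. The sketch below shows one possible rule in the shape the S3 API accepts for a lifecycle configuration, with transitions for both current and previous (noncurrent) versions. The rule ID, the `logs/` prefix, and the day counts are assumptions chosen for illustration.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-logs",            # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},   # hypothetical prefix
            # Current versions: move to cheaper tiers as they age.
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Previous versions (requires versioning): archive, then expire.
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "GLACIER"},
            ],
            "NoncurrentVersionExpiration": {"NoncurrentDays": 365},
        }
    ]
}


def transition_days(rule: dict) -> list:
    """Return the ages (in days) at which current versions change tier."""
    return [t["Days"] for t in rule["Transitions"]]


days = transition_days(lifecycle_configuration["Rules"][0])
print(days)  # [30, 90]
```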

S3 Object Lock & Glacier Vault Lock

  • Use S3 Object Lock to store objects using a write once, read many (WORM) model. It can help prevent objects from being deleted or modified for a fixed amount of time or indefinitely
  • Object locks can be placed on individual objects or applied across the bucket as a whole
  • S3 Object Lock Modes: Governance Mode (only users with special permissions can overwrite or delete an object version or alter its lock settings), Compliance Mode (no user, including the AWS account root user, can overwrite or delete object versions)
  • S3 Glacier Vault Lock allows you to easily deploy and enforce compliance controls for individual S3 Glacier vaults with a Vault Lock policy. Controls such as WORM can be specified in a Vault Lock policy and locked against future edits. Once locked, the policy can no longer be changed
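
The difference between the two Object Lock modes can be expressed as a small decision function. This is a toy model of the rule, not an AWS API; the function name and arguments are hypothetical. The "special permission" in Governance Mode corresponds to `s3:BypassGovernanceRetention`.

```python
from datetime import datetime, timezone


def can_delete_version(
    mode: str,
    retain_until: datetime,
    now: datetime,
    has_bypass_permission: bool = False,
) -> bool:
    """Toy decision: may a locked object version be deleted right now?"""
    if now >= retain_until:
        return True  # retention period is over, the lock no longer applies
    if mode == "GOVERNANCE":
        # Only callers with the special bypass permission may delete early.
        return has_bypass_permission
    if mode == "COMPLIANCE":
        # Nobody may delete early, not even the root user.
        return False
    raise ValueError(f"unknown Object Lock mode: {mode}")


retain_until = datetime(2030, 1, 1, tzinfo=timezone.utc)
now = datetime(2025, 1, 1, tzinfo=timezone.utc)

governance_result = can_delete_version("GOVERNANCE", retain_until, now, has_bypass_permission=True)
compliance_result = can_delete_version("COMPLIANCE", retain_until, now, has_bypass_permission=True)
print(governance_result)  # True
print(compliance_result)  # False
```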

S3 Performance

  • An S3 prefix is the path between the bucket name and the object name: bucketname/folder1/subfolder1/filename.jpg -> /folder1/subfolder1
  • A high number of requests can be achieved: 3500 PUT/COPY/POST/DELETE and 5500 GET/HEAD requests per second per prefix
  • Better performance can be obtained by spreading reads across different prefixes. With two prefixes, 11000 GET/HEAD requests per second can be achieved
  • When using SSE-KMS to encrypt objects in S3: uploads and downloads count toward the KMS request quota. The quota is region-specific: 5,500, 10,000, or 30,000 requests per second. Currently, a quota increase for KMS cannot be requested
  • Use multipart uploads to increase performance when uploading files to S3
  • Multipart upload is recommended for any file over 100MB and is a must for files over 5GB
  • Use S3 byte-range fetches to increase performance when downloading files from S3
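
The arithmetic behind these points is simple enough to sketch. The helpers below (all hypothetical names) derive the prefix of a key, scale the per-prefix GET limit across several prefixes, and split a download into `Range` header values that could be fetched in parallel. The 100 MB chunk size is an assumption for the example, using binary megabytes.

```python
from typing import List

GET_REQUESTS_PER_PREFIX = 5500  # GET/HEAD requests per second per prefix
MB = 1024 * 1024


def prefix_of(key: str) -> str:
    """Everything between the bucket name and the object name."""
    folder, _, _ = key.rpartition("/")
    return folder


def max_get_rate(prefix_count: int) -> int:
    """Aggregate GET/HEAD rate when reads are spread over several prefixes."""
    return GET_REQUESTS_PER_PREFIX * prefix_count


def byte_ranges(object_size: int, chunk_size: int) -> List[str]:
    """Split an object into HTTP Range header values for byte-range fetches."""
    ranges = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(prefix_of("folder1/subfolder1/filename.jpg"))  # folder1/subfolder1
print(max_get_rate(2))                               # 11000
ranges = byte_ranges(250 * MB, 100 * MB)
print(ranges)
```

The same splitting idea applies to multipart uploads: each chunk is uploaded as its own part, so a failed part can be retried without restarting the whole file.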

S3 Select and Glacier Select

  • S3 Select is used to retrieve only a subset of data from an object by using simple SQL expressions
  • Get data by rows or columns using simple SQL expressions
  • Save money on data transfer and increase speed
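
As a rough sketch, an S3 Select request pairs a SQL expression with a description of the object's format. The function below assembles request parameters in the shape used by the S3 `SelectObjectContent` operation for a CSV object with a header row. The bucket, key, and column names are made up, and the call to AWS itself is left out.

```python
def select_request(bucket: str, key: str, expression: str) -> dict:
    """Assemble parameters for an S3 Select query over a CSV object with headers."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Treat the first CSV line as column names so the query can use them.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


# Hypothetical bucket, key, and columns: return two columns of matching rows only.
params = select_request(
    bucket="example-data-bucket",
    key="customers.csv",
    expression="SELECT s.name, s.city FROM S3Object s WHERE s.city = 'Nairobi'",
)
print(params["Expression"])
```

Only the matching rows and columns leave S3, which is where the savings in transfer cost and time come from.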

AWS Organizations & Consolidated Billing

  • Always enable multi-factor authentication on root account
  • Always use a strong and complex password on root account
  • Paying account should be used for billing purposes only. Do not deploy resources into the paying account
  • Enable/Disable AWS services using Service Control Policies (SCPs), either on organizational units (OUs) or on individual accounts

Sharing S3 Buckets Across Accounts

  • Using Bucket Policies & IAM (applies across the entire bucket). Programmatic access only
  • Using Bucket ACLs & IAM (individual objects). Programmatic access only
  • Cross-account IAM Roles. Programmatic AND Console access
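
Two of these options come down to short policy documents. The sketch below builds a bucket policy that lets another account read a bucket, and a trust policy that lets that account assume a role. The account ID `111122223333`, the bucket name, and the function names are placeholders for illustration.

```python
import json


def cross_account_bucket_policy(bucket_name: str, other_account_id: str) -> dict:
    """Bucket policy sketch: let another AWS account read the whole bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CrossAccountRead",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{other_account_id}:root"},
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


def cross_account_trust_policy(other_account_id: str) -> dict:
    """Role trust policy sketch: let another AWS account assume this role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{other_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder bucket name and account ID.
shared_bucket_policy = cross_account_bucket_policy("example-shared-bucket", "111122223333")
trust_policy = cross_account_trust_policy("111122223333")
print(json.dumps(shared_bucket_policy, indent=2))
print(json.dumps(trust_policy, indent=2))
```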

Cross Region Replication

  • Versioning must be enabled on both the source and destination buckets
  • Files in an existing bucket are not replicated automatically
  • All subsequent updated files will be replicated automatically
  • Delete markers are not replicated
  • Deleting individual versions or delete markers will not be replicated
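
For reference, a replication rule is also just a configuration document. The sketch below follows the shape of the S3 replication configuration, with delete marker replication disabled to match the behaviour described above. The role ARN, rule ID, and destination bucket are hypothetical, and versioning is assumed to be enabled on both buckets already.

```python
replication_configuration = {
    # IAM role that S3 assumes to copy objects (hypothetical ARN).
    "Role": "arn:aws:iam::111122223333:role/example-replication-role",
    "Rules": [
        {
            "ID": "replicate-everything",   # hypothetical rule name
            "Priority": 1,
            "Status": "Enabled",
            "Filter": {},                   # empty filter: applies to all new objects
            # Delete markers stay in the source bucket only.
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": "arn:aws:s3:::example-destination-bucket"},
        }
    ],
}

rule = replication_configuration["Rules"][0]
print(rule["Status"], rule["DeleteMarkerReplication"]["Status"])  # Enabled Disabled
```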

S3 Transfer Acceleration

  • Users upload big files to edge locations, which are then transported across AWS backbone networks to your S3 bucket

AWS DataSync

  • Used to move large amounts of data from on-premises to AWS
  • Used with NFS- and SMB-compatible file systems
  • Replication can be done hourly, daily or weekly
  • Install the DataSync agent to start the replication
  • Can be used to replicate EFS to EFS

CloudFront

  • Edge Location: the location where content will be cached. It is separate from an AWS Region/AZ
  • Origin: the origin of all the files that the CDN will distribute, e.g., an S3 Bucket, an EC2 Instance, an Elastic Load Balancer, or Route 53
  • Distribution: the name given to the CDN, which consists of a collection of Edge Locations
  • Web Distribution: typically used for websites
  • RTMP: used for media streaming
  • Edge locations are not just READ only; one can write to them too (i.e., put an object on to them)
  • Objects are cached for the life of the TTL (Time To Live)
  • Cached objects can be cleared, but this action comes with a charge
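
The TTL and invalidation behaviour can be pictured with a toy cache. The class below is a simplified model of a single edge location, not how CloudFront is implemented; all names are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
from typing import Callable, Dict, List, Tuple


class EdgeCacheSketch:
    """Toy model of one edge location's cache; not CloudFront's real behaviour."""

    def __init__(self, ttl_seconds: float, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._ttl = ttl_seconds
        self._fetch = fetch_from_origin
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expiry time)

    def get(self, path: str, now: float) -> bytes:
        entry = self._store.get(path)
        if entry is not None and now < entry[1]:
            return entry[0]  # cache hit: still within the TTL
        body = self._fetch(path)  # miss or expired: go back to the origin
        self._store[path] = (body, now + self._ttl)
        return body

    def invalidate(self, path: str) -> None:
        # Clearing a cached object before its TTL expires.
        self._store.pop(path, None)


origin_hits: List[str] = []


def origin(path: str) -> bytes:
    origin_hits.append(path)
    return f"content of {path}".encode()


cache = EdgeCacheSketch(ttl_seconds=60, fetch_from_origin=origin)
cache.get("/index.html", now=0)    # miss: fetched from the origin
cache.get("/index.html", now=30)   # hit: served from the edge
cache.get("/index.html", now=61)   # TTL expired: fetched again
cache.invalidate("/index.html")
cache.get("/index.html", now=62)   # miss after invalidation
print(len(origin_hits))  # 3
```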

CloudFront Signed URLs and Cookies

  • Use signed URLs/cookies to secure content so that only authorized people are able to access it
  • A signed URL is for individual files. 1 file=1 URL
  • A signed cookie is for access to multiple files. 1 cookie = multiple files
  • If the origin is EC2, use a CloudFront signed URL. If the origin is S3, use an S3 presigned URL
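
Behind a CloudFront signed URL sits a small policy statement that names the file and an expiry time. The sketch below only builds that canned policy as compact JSON; the signing step with the CloudFront key pair is deliberately left out. The function name, the distribution domain, and the expiry timestamp are made up for the example.

```python
import json


def canned_policy(url: str, expires_epoch: int) -> str:
    """Build the canned policy for one URL, valid until the given Unix time."""
    policy = {
        "Statement": [
            {
                "Resource": url,  # one signed URL covers one file
                "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
            }
        ]
    }
    # Compact separators: the policy is serialized without whitespace.
    return json.dumps(policy, separators=(",", ":"))


# Hypothetical distribution domain and expiry time.
policy_json = canned_policy("https://d111111abcdef8.cloudfront.net/image.jpg", 1767225600)
print(policy_json)
```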

Snowball

  • Snowball can import to S3, export from S3

Storage Gateway

  • File Gateway- For flat files, stored directly on S3
  • Volume Gateway- Stored Volumes: Entire dataset is stored on site and is asynchronously backed up to S3, Cached Volumes: Entire dataset is stored on S3 and the most frequently accessed data is cached on site
  • Gateway Virtual Tape Library

Athena vs. Macie

  • Athena is an interactive query service
  • It allows you to query data located in S3 using standard SQL
  • Serverless
  • Commonly used to analyse log data stored in S3
  • Macie uses AI to analyse data in S3 and helps identify PII (Personally Identifiable Information)
  • Can also be used to analyse CloudTrail logs for suspicious API activity
  • Includes Dashboards, Reports and Alerting
  • Great for PCI-DSS compliance and preventing ID theft

Benjamin

... practising Software Engineering and DevOps.