S3 - 101
1. S3 - Simple Storage Service
2. Object storage - files, videos, photos, media etc. Two types of storage - block & object.
3. File size can be 0 bytes to 5 terabytes; storage is unlimited - Amazon monitors storage availability in each region & provisions SANs as needed
4. Files are stored in buckets - a bucket is a folder in Amazon terms
5. CloudBerry - provides Explorer-type apps to access S3
6. S3 uses a universal namespace - bucket names must be unique globally
https://s3-eu-west-1.amazonaws.com/acloudguru
7. When you upload a file to S3, an HTTP 200 OK code means it was successful
8. Data consistency model for s3 -
a) Read-after-write consistency for PUTs of new objects
b) Eventual consistency for overwrite PUTs & DELETEs (takes time to propagate)
9. S3 is a key-value store - lexicographic design, so objects are indexed in alphabetical order of key name
Key - name of the file
Value - the data itself, a sequence of bytes
Version ID - identifies each version of the object
Metadata - data about the data, like the upload date etc.
Subresources - do not exist on their own; they exist under an object
a) Access control lists - users/groups that have access to the object. This access can be defined for a single file/object or for a whole bucket
b) Torrent - S3 supports the BitTorrent protocol
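The lexicographic ordering of keys can be seen with a quick sketch (the key names below are made-up examples, not from the course):

```python
# Hypothetical S3 object keys -- S3 indexes keys in lexicographic (byte) order,
# which is why file names appear "alphabetically" in the console.
keys = [
    "2024-01-15/report.csv",
    "2024-01-02/report.csv",
    "images/photo.png",
    "Images/logo.png",   # uppercase sorts before lowercase in byte order
]

for key in sorted(keys):
    print(key)
```

Note that digits sort before uppercase letters, which sort before lowercase, so "2024-…" keys come first and "Images/…" precedes "images/…".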
10. S3 specs
a) built for 99.99% availability - SLA
b) guarantees 99.999999999% (11 9s) durability of stored information - can survive the concurrent failure of 2 data centers
c) Tiered storage available - e.g. files older than 30 days go in one tier, newer ones in another etc.
d) Lifecycle management - configuration settings for moving objects between tiers
e) versioning
f) encryption
g) secure your data using access control lists & bucket policies
11. S3 Storage Tiers/Classes
a) S3 (Standard) - 99.99% availability & 99.999999999% (11 9s) durability
b) S3-IA (Infrequent Access) - priced lower than Standard but charged per retrieval
c) Reduced Redundancy Storage - much cheaper, but durability is only 99.99%
d) Glacier - archival only; restoration is slow, may take 3-5 hours. Cheapest
Refer S3 Tiered Storage specs
Refer S3 vs Glacier
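The 11 9s durability figure can be put in perspective with quick arithmetic (the 10 million object count is just an illustrative assumption):

```python
durability = 0.99999999999            # 11 nines
annual_loss_prob = 1 - durability     # chance of losing a given object in a year
objects = 10_000_000                  # assumed number of stored objects

expected_losses_per_year = objects * annual_loss_prob
years_per_single_loss = 1 / expected_losses_per_year

print(round(years_per_single_loss))   # roughly one lost object every 10,000 years
```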
12. S3 Charges -
Charged for
a)Storage
b)Requests
c)Storage management
d) Data transfer pricing - data coming into S3 is free, but transferring it out or between regions costs
e) Transfer Acceleration
S3-BUCKET Lab
1. Created a bucket: Services -> Storage -> S3 -> Create bucket
2. Uploaded a file.
3. Added users to the bucket
4. Can set encryption - client-side encryption; server-side encryption with Amazon-managed keys (SSE-S3); server-side encryption with KMS (SSE-KMS); server-side encryption with customer-provided keys (SSE-C)
5. Security is through ACL - Access control lists & bucket policies
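For reference, a minimal sketch of what a bucket policy looks like - this one grants public read on every object (the bucket name examplebucket is a placeholder, not from the lab):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::examplebucket/*"
    }
  ]
}
```

ACLs attach permissions per object or bucket, while a policy like this is a single JSON document applied to the whole bucket.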
6. By default all buckets & objects in it are private.
For a bucket there are 4 tabs shown - Overview, Properties, Permissions & Management
similarly for a file also.
Added tags to the object itself - a tag added to the bucket doesn't get passed on to its objects
S3 Versioning Lab
1. Once versioning is enabled it can only be suspended, not removed.
2. It writes/stores every change to a file as a separate object - even a delete is stored (as a delete marker)
3. Provides an additional level of security - MFA (multi-factor authentication) can be required to change versioning state and to delete versions
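As a sketch, the versioning state is itself a small configuration document - this is the shape of the payload `aws s3api put-bucket-versioning` accepts (turning MFA Delete on additionally requires supplying your MFA device serial & a current code on the command line):

```json
{
  "Status": "Enabled",
  "MFADelete": "Enabled"
}
```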
Cross Region Replication
- Enabling CRR only copies future changes from the source to the destination bucket
- existing contents must be copied with the AWS CLI:
aws configure - will ask for access key ID, secret access key, region
aws s3 ls - lists your buckets
aws s3 cp --recursive s3://sourcebucketname s3://destinationbucketname
- delete markers are replicated, but deleting individual versions or delete markers is not replicated
- understand cross-region replication at a high level (exam tip)
- the two buckets must be in different regions, and versioning must be enabled on both
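A sketch of the replication configuration document that `aws s3api put-bucket-replication` expects - the IAM role ARN and destination bucket name here are placeholders:

```json
{
  "Role": "arn:aws:iam::123456789012:role/example-crr-role",
  "Rules": [
    {
      "ID": "ReplicateEverything",
      "Status": "Enabled",
      "Prefix": "",
      "Destination": {
        "Bucket": "arn:aws:s3:::destinationbucketname"
      }
    }
  ]
}
```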
Lifecycle management /rules -lab
Glacier is not available in Singapore & South America - so create your buckets for this lab in some other region
AWS console -> Services -> Storage -> S3 -> Create bucket (no capitals allowed in bucket names) - in the Properties tab enable versioning
on selecting the bucket you get the bucket screen with the Overview, Properties, Permissions & Management tabs
click on the Management tab to find the Lifecycle section
clicking Lifecycle lets you create a lifecycle rule; a pop-up dialog takes you through the rule-setting process
The rule can be set on a whole bucket or an individual file
1. current versions -
a) setting to transition to Infrequent Access (IA) - has to be a minimum of 30 days after creation
b) setting to transition to Glacier archival - has to be in IA for a minimum of 30 days - so 60 days from creation is the minimum here
2. after a file becomes a previous version -
a) setting to transition to IA
b) setting to transition to Glacier archival - here no limit on days
c) setting to expire - when this is set, only a delete marker is added against the current version at the expiry date. If a real delete has to happen, it must be combined with the permanent-delete option
Once you create the rule, the summary screen shows the transitions & expirations.
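The rule built in the console maps to a lifecycle configuration document like this sketch (days chosen to match the minimums above; the rule ID and the 365-day expiry are made-up examples):

```json
{
  "Rules": [
    {
      "ID": "tier-then-archive",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 60, "StorageClass": "GLACIER" }
      ],
      "NoncurrentVersionTransitions": [
        { "NoncurrentDays": 30, "StorageClass": "STANDARD_IA" }
      ],
      "NoncurrentVersionExpiration": { "NoncurrentDays": 365 }
    }
  ]
}
```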
Exam tips
- lifecycle rule configs can be done in conjunction with versioning
- can be applied on current versions / previous versions
- transition rules - minimum 30 days to IA & 60 days (from creation) to Glacier archiving
- previous versions can be permanently deleted
CDN Overview -
Content delivery network - a system of distributed servers that delivers webpages and other web content to users based on their geographic location, the origin of the content & the content delivery server.
Edge location - the location where content is cached; different from an AWS Region
Origin - the source of all the files distributed by the CDN, e.g. a website hosted in London, Europe
Distribution - a CDN which consists of a collection of edge locations.
a) Web distribution - for websites only
b) RTMP - for media streaming - for Adobe Flash media files using the RTMP protocol
Exam tips
Edge locations are not read-only - write/PUT of objects is allowed, and the write is passed back to the object on the origin server
TTL (time to live) - objects stay in the cache until the TTL expires
You can clear (invalidate) cached objects, but you will be charged
CloudFront Distribution - Lab
Services -> Networking & Content Delivery -> CloudFront
create distribution
Origin Domain Name - prefilled with your bucket names
Origin Path - optional path within the origin to request content from; the distribution itself gets a random string of letters & numbers as its domain unless you set a user-friendly alternate domain name
Origin ID/description - must be unique within a distribution
a distribution may have multiple origins
- cache behaviour settings
restrict the distribution to read-only access to the bucket/files
allowed HTTP methods - just GET/HEAD, or also PUT, OPTIONS etc.
restrict viewer access - apply security so only logged-in/signed-in users can access - via signed URLs
use origin cache headers
configure min, max & default TTLs
- distribution settings
price class - use all edge locations or only specific ones
alternate domain name (CNAME)
SSL certificate - default CloudFront certificate or a custom SSL certificate
you can apply geo-restriction also
Security & encryption
Security
-Bucket policies & access control lists
-access control lists can drill down to specific objects in a bucket also
-objects in a bucket are by default private
- access logs - give a log of all requests/access made to your bucket. The logs can be delivered to another bucket or even another account
Encryption
In transit -
when an object is transferred into or out of S3
SSL/TLS - HTTPS
At rest -
Server-side encryption - SSE
- SSE with Amazon-managed keys - SSE-S3
- SSE with the AWS Key Management Service - SSE-KMS - provides envelope management of your encryption keys & also an audit trail for key usage
- SSE with customer-provided keys - SSE-C
Client side encryption
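The SSE options above can also be set as a bucket default; a sketch of the configuration document `aws s3api put-bucket-encryption` takes for SSE-KMS (the KMS key ARN is a placeholder):

```json
{
  "Rules": [
    {
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "arn:aws:kms:us-east-1:123456789012:key/example-key-id"
      }
    }
  ]
}
```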
Storage Gateway
- connects an on-premises IT environment with cloud storage to provide secure & scalable data transfer & storage from on-premises to the AWS cloud
Your data center -> asynchronous replication -> AWS (S3 or Glacier)
- Storage Gateway (SG) is available as downloadable software in the form of a VM image. It supports VMware ESXi or Microsoft Hyper-V. Once installed in your data center
and connected to your AWS account through the activation process, the gateway can be set up from the AWS console with options that work for you
4 different types of gateways
- File Gateway - uses NFS (Network File System) to store flat files in S3. All data is stored only in S3; nothing on-site
- Volume Gateway - for block storage - takes point-in-time snapshots and stores them in S3 as Amazon EBS (Elastic Block Store) snapshots
Blocks/snapshots are stored incrementally - so only the latest changes are stored
a) Stored volumes - data is asynchronously backed up via iSCSI block storage. All data is stored on-premises and backed up to S3
b) Cached volumes - all data is stored in S3; only the most frequently accessed data is cached on-site
- Tape Gateway (virtual tape library) -
used for backup; works with popular backup applications, but with virtual tape cartridges
Snowball
A petabyte-scale physical data transfer device - used to transfer large amounts of data into and out of AWS without the high network cost
Snowball - 80 TB of data, onboard storage capabilities
Snowball Edge - 100 TB - onboard storage & computing capabilities - basically on-premises data processing with Lambda functions
1024 TB is 1 petabyte
& 1024 petabytes is 1 exabyte
Snowmobile - exabyte-scale data transfer - you can transfer up to 100 PB at a time with Snowmobile
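The device capacities above can be sanity-checked with quick arithmetic:

```python
TB_PER_PB = 1024                  # 1 petabyte = 1024 terabytes
PB_PER_EB = 1024                  # 1 exabyte  = 1024 petabytes

snowball_tb = 80                  # Snowball onboard storage
snowmobile_pb = 100               # max per Snowmobile transfer

snowmobile_tb = snowmobile_pb * TB_PER_PB
snowballs_to_match = snowmobile_tb // snowball_tb

print(snowmobile_tb)              # TB carried in one Snowmobile load
print(snowballs_to_match)         # Snowballs needed to move the same data
```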
Snowball lab
It's under Migration in Services
- click Create Job to create a job for AWS to send you the Snowball; keep pressing Next to enter your address details etc.
the workflow block diagram is shown - the job is midway when the Snowball has been delivered to you
- open the flaps on the right & left narrow sides of the cuboid - one side is the Snowball's Kindle-style display; the other end has the Ethernet ports etc.; the top has the power jack, which needs to be connected to the power cable
log on to AWS and download the Snowball client & install it on your PC; get your credentials & download the manifest
in the CLI:
power up the Snowball
snowball start -i <Snowball IP> -m <path to manifest file> -u <unlock code from credentials>
snowball cp <filename> s3://bucketname (the bucket link can be found in your Create Job request)
copying starts
once done, power it off & create a job for AWS to pick it up
Transfer Acceleration
- instead of uploading directly to a bucket, it lets you upload to an edge location, which then uploads the data to your bucket faster. This service comes at an additional cost. When it is enabled, a unique endpoint URL is given which has s3-accelerate in its domain; through this link you write to the edge location instead of directly to the bucket.
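A sketch of the accelerated endpoint pattern (the bucket name acloudguru is a made-up example):

```python
def accelerate_endpoint(bucket: str) -> str:
    # Transfer Acceleration endpoint pattern: the bucket name is prepended
    # to the fixed s3-accelerate domain.
    return f"https://{bucket}.s3-accelerate.amazonaws.com"

print(accelerate_endpoint("acloudguru"))
```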
Static website hosting
http://pri-staticwebsite.s3-website-us-east-1.amazonaws.com/
create an S3 bucket and in Properties enable static website hosting. Give public read access - only then will the website work. Add the index & error HTML file names in the static website properties, create those index & error files and upload them to the bucket. Now go to the static website settings, click on the endpoint URL, and the website displays.
the URL is always of the form http://bucketname.s3-website-regionname.amazonaws.com
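The endpoint pattern above can be sketched as a small helper (note some newer regions use a dot instead of the hyphen, i.e. s3-website.region):

```python
def website_endpoint(bucket: str, region: str) -> str:
    # Static website endpoint pattern noted above (hyphenated form).
    return f"http://{bucket}.s3-website-{region}.amazonaws.com"

print(website_endpoint("pri-staticwebsite", "us-east-1"))
```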