Saturday, April 09, 2016

How to Post to Blogger - Blogspot using Google API in Python automatically

This post explains, end to end, the steps needed to write a Python program that publishes a "post" to a blog on Google's blogger.com.
Why? This is one step towards automating blog posts. For example, you might want to post to a blog every minute to say:
    Today - at this exact minute 12:05:00, the temperature is 27'C
In this example the time and temperature data are obtained from somewhere, and you can programmatically construct the sentence above.
However, it would be very tedious to log in to Blogger, click through many buttons, write the content and publish the post every minute.
A program can be set up to log in and post the sentence above to the blog automatically - that is the goal here.

GOAL:
To write a Python program that will automatically publish a Blogger/Blogspot post

METHOD:
A combination of the following technologies is used:
- Python programming language
- Blogger/Blogspot blog providers
- Google API for Python
- OAuth 2.0 Google Authorization


PREREQUISITE:
1. Python v2.x - it is assumed you have this installed; instructions on how to install and use Python are beyond the scope of this post.
2. A Blogger.com / Blogspot.com account - these two names refer to the same free blog service owned by Google. It is assumed you already have an account and know how to post blogs by logging in and publishing a post manually. Instructions on how to set up and use a blog in the normal way are not covered here.


PREPARATION:

1. At a CMD prompt / terminal, run:
pip install --upgrade oauth2client
pip install --upgrade google-api-python-client
Reference pages with install and sample instructions:
https://developers.google.com/api-client-library/python/apis/blogger/v3#sample
https://developers.google.com/blogger/docs/3.0/api-lib/python#sample

2. Go to the website below to get the 'api_name' and 'api_version' values (a usage sketch follows this list)
Supported Google APIs: https://developers.google.com/api-client-library/python/apis/
api_name=blogger    api_version=v3
api_name=drive      api_version=v3
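These api_name / api_version values are what get passed to the client library's discovery build() call used later in this post. A minimal sketch, assuming an already-authorized httplib2.Http object (the make_service name is just for illustration):

<code><pre>
from apiclient.discovery import build

def make_service(http_auth, api_name='blogger', api_version='v3'):
    # api_name / api_version come from the Supported Google APIs list above;
    # http_auth is an httplib2.Http() instance already authorized with OAuth 2.0 credentials.
    return build(api_name, api_version, http=http_auth)
</pre></code>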




OAUTH AUTHORIZATION:

These two websites are good for getting a reasonable understanding, but ultimately do not provide direct instructions:
    https://developers.google.com/blogger/docs/3.0/using#auth
    https://developers.google.com/identity/protocols/OAuth2#basicsteps

The three main types of Credentials are:
1. API Keys
2. OAuth 2.0 client IDs
3. Service account keys

Go to https://console.developers.google.com
The interface is not very good at directing what to do next.
On the left panel are: Overview,  Credentials
On the top right horizontal menu there is a drop-down icon.
--- Click the drop-down icon to "Manage all projects" or "Create Project".
--- If you are new here, choose Create Project.
--- Note that the Project Name and the Project ID can be given quite different values.

Coming back to the main page, click Overview
--- Now the main area has: "Google APIs" and "Enabled APIs"
--- Click "Google APIs" to show a list of popular APIs.
--- If the Blogger API is not listed, type Blogger in the search text field.
--- Then click Blogger and click Enable (if it is not enabled already)
--- Going back some pages, now click "Enabled APIs". On this page, check that "Blogger API v3" is listed.

Coming back to the main page, click Credentials (just below Overview)
--- The top submenu has: "Credentials", "OAuth consent Screen", "Domain Verification"
--- Click on Credentials.
--- For OAuth 2.0 client IDs:
------ To create one manually, click "Create Credentials", choose "OAuth client ID", and choose "Other" as the Application type.
------ To create one with guidance, click "Create Credentials" and choose "Help me choose".
--------- Then in the drop-down, choose Blogger API, choose calling the API from "Other UI (Windows)", and choose "User data".
------ At the end, a credentials file is downloaded, e.g. client_secret_<client_id>.json, and its content looks like:
                 {"installed":{"client_id":"
--- For Service account keys:
------ To create one manually, click "Service account key", choose a service account (create one if needed), choose JSON, and click Create.
------ To create one with guidance, click "Create Credentials" and choose "Help me choose".
--------- Then in the drop-down, choose Blogger API, choose calling the API from "Other UI (Windows)", and choose "Application data".
------ At the end, a credentials file is downloaded, e.g. <project>-<keyID>.json, and its content looks like:
{
  "type": "service_account",
  "project_id": ".......",
  "private_key_id": "..................",
  "private_key": "-----BEGIN PRIVATE

Since this project is about an automated program that writes to the blog, the credentials should not require any interaction - so create and use a Service account key.
The OAuth 2.0 client ID approach also works, but it requires clicking a button in a browser and copying a "code" into the Python program. It is shown to work below, but it is less suitable for this goal.


SCOPE:
Before the actual coding against the APIs, the authorization process also needs to know the scopes your program will use.
Here are some scopes for the Blogger API and the Google Drive API:

Blogger API, v3
https://www.googleapis.com/auth/blogger Manage your Blogger account
https://www.googleapis.com/auth/blogger.readonly View your Blogger account
Drive API, v3
https://www.googleapis.com/auth/drive View and manage the files in your Google Drive
https://www.googleapis.com/auth/drive.appdata View and manage its own configuration data in your Google Drive
https://www.googleapis.com/auth/drive.file View and manage Google Drive files and folders that you have opened or created with this app
https://www.googleapis.com/auth/drive.metadata View and manage metadata of files in your Google Drive
https://www.googleapis.com/auth/drive.metadata.readonly View metadata for files in your Google Drive
https://www.googleapis.com/auth/drive.photos.readonly View the photos, videos and albums in your Google Photos
https://www.googleapis.com/auth/drive.readonly View the files in your Google Drive
https://www.googleapis.com/auth/drive.scripts Modify your Google Apps Script scripts' behavior

*** A complete list of scopes for all Google APIs (not just Blogger and Drive) can be found at:
https://developers.google.com/identity/protocols/googlescopes
- title "OAuth 2.0 Scopes for Google APIs"


CODE in Python:
1. The code below is complete end to end. When run, it handles the authorization, then sends some text to be published as a post on a Blogger site, and the JSON response is captured.
2. There may be more import statements than necessary.
3. The code will NOT work as-is, because the __blogID__ is not real and the <blah>.json filenames are placeholders. If these values are replaced with proper values, the code should work.
4. The code also shows two options for OAuth. The first uses "OAuth 2.0 for Server to Server Applications", which needs no interaction. The second, which is commented out, uses "Client Secrets": a browser pops up, and the operator must copy a "code" and paste it into the terminal where the Python program is running.

<code><pre>
import json
import webbrowser

import httplib2
from httplib2 import Http

# OAuth 2.0 for Installed Applications (Client Secrets flow, commented out below)
from oauth2client.client import flow_from_clientsecrets
# OAuth 2.0 for Server to Server Applications (service account)
from oauth2client.service_account import ServiceAccountCredentials

from apiclient.discovery import build

__blogID__='99999999999999999999'

# Using OAuth 2.0 for Server to Server Applications
# https://developers.google.com/api-client-library/python/auth/service-accounts#authorizingrequests

# Inserting posts needs the full blogger scope; blogger.readonly is only enough for reading.
scopes = ['https://www.googleapis.com/auth/blogger']
credentials = ServiceAccountCredentials.from_json_keyfile_name('<project>-<keyID>.json', scopes)
http_auth = credentials.authorize(Http())
# This returns no errors - the authorized http_auth object is passed to build() below.

"""
# Client Secrets
# https://developers.google.com/api-client-library/python/guide/aaa_client_secrets
flow = flow_from_clientsecrets('client_secret_<client_ID>.json',
                               scope='https://www.googleapis.com/auth/blogger',
                               redirect_uri='urn:ietf:wg:oauth:2.0:oob')
# This section copied from https://developers.google.com/api-client-library/python/auth/installed-app#example
auth_uri = flow.step1_get_authorize_url()
webbrowser.open(auth_uri)
auth_code = raw_input('Enter the auth code: ')
credentials = flow.step2_exchange(auth_code)
http_auth = credentials.authorize(httplib2.Http())
# This returns no errors - the authorized http_auth object is passed to build() below.
"""

service = build('blogger', 'v3', http=http_auth)   # (api_name, api_version)

# Build the post body as a Python dict; the API client serializes it to JSON.
body = {
    "kind": "blogger#post",
    "blog": {"id": __blogID__},
    "title": "A new post",
    "content": "With <b>new</b> content...",
}
print json.dumps(body, indent=4, separators=(',', ': '))

request = service.posts().insert(blogId=__blogID__, body=body)
response = request.execute()
print json.dumps(response , indent=4, separators=(',', ': '))
</pre></code>





APPENDIXES:
There is a lot of information on this subject, to the extent that it becomes confusing and difficult to understand.
The instructions in this blog post give just the minimum steps, without too much explanation, with the aim of reducing the confusion.
However, to gain an understanding after the fact, this list of websites is given here as a reference:

https://developers.google.com/api-client-library/python/guide/aaa_oauth#oauth-20-explained
- EXCELLENT page - title "OAuth 2.0" - part of "API Client Library for Python"
- examples of Python code to use OAuth 2.0 to connect to Google APIs.

https://developers.google.com/identity/protocols/OAuth2InstalledApp#choosingredirecturi
- talks about "redirect_uri" in the code. The main page is called "Using OAuth 2.0 for Installed Applications"

https://support.google.com/cloud/answer/6158849?hl=en#serviceaccounts
- not that useful

https://developers.google.com/api-client-library/python/auth/service-accounts
- title "Using OAuth 2.0 for Server to Server Applications", part of the "API Client Library for Python"
- OAuth 2.0 can be for "Web Server Application", "Installed Applications", "Server to Server Applications".
- For this goal of automatically posting to Blogger, the "Server to Server Applications" is the most relevant.

https://developers.google.com/blogger/
- Intro to the Blogger API

https://developers.google.com/blogger/docs/3.0/reference/#Pages
- Blogger API reference

About accessing Google APIs for Blogger
https://developers.google.com/blogger/docs/3.0/libraries
https://developers.google.com/resources/api-libraries/documentation/blogger/v3/python/latest/
https://developers.google.com/api-client-library/python/
https://developers.google.com/apis-explorer/#p/blogger/v3/

Friday, March 11, 2016

Twitter - How to change names

In Twitter there are multiple names. The names can be displayed to all users as follows:
----------------------
My Display Name
@myname
----------------------

On some mobiles, the Twitter app only allows you to change the first line of the name.
To change that first-line name, here are the steps.
1. Open the Twitter app
2. Touch the 'triple vertical dots' on the top right corner - this displays a drop-down list
3. In that drop-down list, touch the first line - which shows the current name.
4. On the new page, touch "Edit Profile"
5. On another new page, touch the "Name" field to edit it.

The steps above only allow the first line to be changed. So now it can be something like:
----------------------
My NEW Display Name
@myname
----------------------

So the second line with the @ sign is not changed yet. To change this, go to a computer and open a browser.
1. In the browser, go to Twitter's website.
2. Login to your Twitter account.
3. Click on your icon/photo on the top right corner
4. In the Drop Down list that appears, click Settings.
5. In the Settings page, click Account on the left navigation panel
6. The main panel in the middle will show Account, and under it is a text field for Username.
7. Enter the new "Username"; it will appear as @Username.

8. In addition, go back to the Home Page
9. Click on your icon/photo on the top right corner.
10. Click on the first line, which shows your name.
11. This opens a new page. Click on "Edit Profile".
12. Then fill in the text field that pops up - this is the first-line name.

----------------------
My NEW Display Name
@NEWusername
----------------------

Wednesday, February 10, 2016

Outlook Gmail IMAP stuck in syncing endlessly

I'm trying to help someone who is using a newish Outlook version to read their email from Gmail. I think the account is set up as IMAP rather than POP. The problem is that when Outlook is opened, the bar at the bottom says it is syncing and it never stops, even after an overnight session. Another symptom is that new email cannot be received.

After some googling, (going to visit soon), I'm just putting a few links here which may be helpful.







Friday, January 29, 2016

Notes Raspberry Pi

NotesRaspberryPi
================

How to setup a brand new Raspberry Pi
https://www.raspberrypi.org/help/quick-start-guide/

Various resources to teach/learn Raspberry Pi. Also contains some Python lessons
https://www.raspberrypi.org/resources/learn/

Official Documentation
https://www.raspberrypi.org/documentation/

Other interesting setup articles
http://www.everydaylinuxuser.com/2015/03/setting-up-raspberry-pi-2.html
http://www.everydaylinuxuser.com/2014/03/connect-to-raspberry-pi-from-hp.html

Sunday, January 03, 2016

Windows 10 - Adding / Creating Local User Instead of Microsoft User

Back in the days (not apologizing for reminiscing) of Windows XP (or maybe even Windows 7) or earlier, adding a user was a simple process of adding a user.

With Windows 8 (I hope not 7) onwards, there seems to be a deliberate attempt to coerce (force) people into opening Microsoft accounts and creating such cloud users, even when they are just trying to use their Windows computers locally (not in the cloud). In Windows 8 you must read the fine print and choose to create or add a "Local User". A "Local User" is actually what a normal user was in previous Windows versions - but now we have to remember to ask for a "Local User" if we just want a normal user on a computer.

In Windows 10, this is much more difficult, frustrating and annoying. Previous Windows versions just needed: go to Control Panel -> click Users -> Add user.
But now the process is so complicated, bordering on deceiving, that I just cannot stand it and have to share the process here so that people can get help.

Step 1: Use the normal process of going to Control Panel, or searching for "User", to land on the PC Settings accounts dialog. Or: Control Panel -> User Accounts -> User Accounts (not a typo) -> Manage Accounts -> click on "Add a new user in PC Settings"

Then click "Family and Other Users".
Why why why can't they just say "users"?

Then click on "Add someone else to this PC"
(What if this is my family member, but they don't have a Microsoft account - sic)

Don't put an email address, and don't click Next.
Instead click "I don't have this person's sign-in information"
(Why why why - well I have this person's sign in - but I refuse to use their Microsoft account. I just want to create them as a "Local User" - phethhhh  :p  )


Finally - almost there. Still don't fall into the trap of giving your details and email.
Click "Add a user without a Microsoft Account"

Home Run !!! (Exactly the fourth / home base)
Now this is more like the usual creation of a normal Windows user.
Go ahead and give it a funny user name - as all computer users have done since the dawn of computing - instead of giving your email account.

Missing Windows 10 Start Key functionality

Beware - Windows 10 - Broken Windows Start Key functionality.
Normally, the Start button on the lower left corner of the Desktop, when pressed, will bring up a list of apps, and some tiles.

When this problem hits (out of nowhere), pressing (left-clicking) the Windows Start button has no effect. Double-clicking the Search bar next to it still sort of works sometimes. Right-clicking the Start button reveals a text-based list of things that you can do - including power off. So all is not lost, but it is very frustrating to see the normal functionality disappear. (The text-based list is what appears when you right-click.)



Solved - but I don't know which of the following did the trick. Googling gave two solutions.
1 ) Open CMD prompt as Administrator and run:  SFC /scannow
2) Open Powershell as Administrator and run the command shown in the webpage in the image below
3) Restart the PC
All is back to normal. I did all three in sequence, so I don't really know whether the restart at the end actually solved the problem or whether all three together did.

Thursday, October 01, 2015

Notes Amazon AWS Developer Certification - Associate

Below are the notes that I made while studying for the Amazon AWS Developer Certification - Associate.

Hope someone will find this useful too.

-------------------------------------

AWS Essentials
sudo yum install python-pip
IAM - role - must assign role during creation/build of instance,
Python Boto SDK
API access credentials, use in code and use in AWS account
- access KeyId, Secret Access Key -> Attach user policy
- boto.connect_s3('accessKey','Secret')
Federated credentials ASSUME role to work on instances

AWS S3 - Simple Storage Service
- limit of 100 buckets (cannot be increased), no limit on the number of objects
- names: lower/upper case letters, numbers, periods, dashes - cannot place symbols next to each other; 3-63 characters
- objects: min 1 byte, max 5 TB; the largest single upload is 5 GB, large objects should use Multipart Upload
- bucket.s3-website.region.amazonaws.com
Error 404 Not Found - the object is not in the bucket
Error 403 - no permission to that bucket
Error 400 - invalid bucket state
Error 409 - the bucket still has something in it, cannot delete yet
Error 500 - internal server error.

Objects are stored in lexicographical key order - introduce RANDOMness by using a hash-key prefix, e.g. bucket/8761-2010-25-05......
Sequentially named files are slower because they are likely stored in the same partition; with different random name prefixes they are sent to different storage partitions.
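A small sketch of the random-prefix idea (the 4-character MD5 prefix and the filename are just illustrations, not a required format):

<code><pre>
import hashlib

def randomized_key(filename):
    # Prefix the key with a few hex chars of a hash so that sequentially
    # named files spread across different S3 partitions.
    prefix = hashlib.md5(filename).hexdigest()[:4]
    return '%s-%s' % (prefix, filename)

print randomized_key('2010-25-05-log.txt')   # e.g. '87b1-2010-25-05-log.txt'
</pre></code>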

- host static website on buckets
- have index, error, redirect documents
- route 53 allow bucket to redirect to website yourdomain.com

Static Hosted Website
- infinitely scalable
- create static HTML - create redirect from blog.
- if use route53 to redirect webpage to S3, need same bucket AND domain name.
- use AWS nameservers and put it on nameserver IP with website host.


CORS - Cross origin resource sharing
- to load content from another bucket, it needs to be loaded from another domain name
- used with AJAX and JavaScript.
- set up CORS to allow JavaScript to perform AJAX calls into the other domain. Go to the bucket the content is shared from -> Permissions - add CORS - manually specify the URL of the site the calls will come from. Use the CORS Configuration Editor.
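A sketch of setting a CORS rule with boto 2 (the bucket name and allowed origin are made-up examples):

<code><pre>
import boto
from boto.s3.cors import CORSConfiguration

conn = boto.connect_s3()
bucket = conn.get_bucket('my-shared-content-bucket')   # hypothetical bucket name

cors_cfg = CORSConfiguration()
# Allow GET AJAX calls from the site that will load this bucket's content
cors_cfg.add_rule('GET', 'http://www.example.com', allowed_header='*', max_age_seconds=3000)
bucket.set_cors(cors_cfg)
</pre></code>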

IAM and Bucket policies
-restrict based on IP address or agents
IAM - user/account level
Bucket policy - resource level - 20 KB size limit, only the bucket owner can set a bucket policy
object / bucket / namespace / subdirectories etc.
bucket ownership cannot be transferred - need to remove everything first.
ACLs are cross-account object/bucket resource-level permissions
Owner of a bucket has full permissions, but if no IAM or bucket policy applies he can still be denied - even the owner can be denied.
- explicit deny always overrides allow.
- permissions are applied to S3 ARNs
- apply a policy to a user's ARN (each user has one)
- explicit deny overrides allow.


S3 Error Messages
----
import boto
# connect using credentials from the environment / boto config
conn = boto.connect_s3()
bucket = conn.create_bucket('test')   # an S3 error response (e.g. 409 Conflict) is raised if the name is taken

Server Side Encryption - SSE
- AWS handle most - just need to choose which needs to be encrypted - the bucket level and object level.
- x-amz-server-side-encryption request header to upload request.
- AES256, bucket policies require all objs to SSE, enable SSE vi AWS console
- Python SDK does not support encryption for now.
- go to Object -> Properties -> details


DynamoDB - NoSQL

- you only create tables, not a DB
- 256 tables / region -> contact AWS to increase the limit
- read throughput, write throughput - determine resources - auto-provisions the resources needed for the load - stored on SSD (fast)
- Primary Range Key - Primary Hash Key
- no need to manage a DB server
- Forum - Reply - Thread tables work together
- Product Catalog table - has a Hash key only - this specifies what we can search on. Can search on Id only, since we only have a Hash key.
Can still search other columns if a) there is a range key, b) the primary key is set up on another column, or c) secondary indexes are set up.
- the primary key should be unique, otherwise lookups will be SLOWER
- Reply table has Hash key and Range Key.
-- combine Hash / Range in search
-- cannot do table join
- Create a table with a Secondary Index (a boto sketch follows this list)
-- create the table, select Hash key as Id (an unordered index), Range key as PostedBy (an ordered index is created),
- need to create an index for each column that we want to search on.
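A rough sketch with boto's dynamodb2 module of creating a table with a hash + range key and writing one item (the table and attribute names mirror the notes above but are otherwise made up):

<code><pre>
from boto.dynamodb2.table import Table
from boto.dynamodb2.fields import HashKey, RangeKey

# Hash key = Id (unordered index), Range key = PostedBy (ordered index),
# with provisioned read/write throughput.
replies = Table.create('Reply',
                       schema=[HashKey('Id'), RangeKey('PostedBy')],
                       throughput={'read': 5, 'write': 5})

replies.put_item(data={'Id': 'Thread-1', 'PostedBy': 'UserA', 'Message': 'hello'})
</pre></code>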

Limits & Overview
- fully managed; read/write scale without downtime; can change provisioned throughput by calling UpdateTable. Data spread across servers, stored on SSD, replicated across different zones.
- fault tolerance by synchronous replication
- 64 KB limit per item (item = row) (attributes = columns)
- integrates with MapReduce and Redshift (Data Warehouse)
- Scalable - no limits on storage
- provisioned throughput - specify read/write during create or update
- Hash index - an index on the PK attribute allows apps to retrieve data by specifying the PK values. ONLY the PK can be queried.
- Hash PK - a value that uniquely identifies an item in the table
- Hash-and-Range PK - a pair of attributes that together form the unique identifier for each item in the table.
- Hash is unordered, Range is ordered
- Secondary indexes - data structures with a subset of attributes from the table, along with an alternate key.
- Local secondary index - same hash key as the table but a different range key; "local" because its scope stays in the same hash-key partition as the table.
- Global secondary index - hash and range key are different from those of the table, therefore queries on the index can span all data in the table, across partitions.
- Limits: 256 tables per region by default, but this can be increased
- range PK 1024 bytes, hash PK 2048 bytes, item 64 KB including attribute names, 5 local and 5 global secondary indexes per table,

Provision Throughput
- Read - 1 strongly consistent read/sec, or 2 eventually consistent reads/sec, for items up to 4 KB. Eventually consistent: a write may land in two places; if you need to read the most recent data, use a strongly consistent read.
- Write - 1 write per second for items up to 1 KB
- 1 unit of throughput shared by 7 items of 1 KB -> 7 secs
- 4 units of throughput shared by 7 items of 1 KB -> ceil(7/4) = 2 secs
- Read example: an item of 3 KB rounds up to 4 KB; 80 items/sec to read. Throughput required = 80 * (one 4 KB item) = 80 strongly consistent reads,
or 80/2 = 40 eventually consistent reads. (Worked out in the snippet below.)
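The read-capacity arithmetic above, written out as a quick calculation (same numbers as the example in the notes):

<code><pre>
import math

item_size_kb = 3            # rounds up to the 4 KB read unit
items_per_sec = 80
read_unit_kb = 4.0

units_per_item = math.ceil(item_size_kb / read_unit_kb)    # = 1 unit per item
strongly_consistent = int(items_per_sec * units_per_item)  # = 80 read units
eventually_consistent = strongly_consistent / 2            # = 40 read units

print strongly_consistent, eventually_consistent
</pre></code>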

Queries vs Scan API calls
- Query (Get) - only the PK and Secondary Index keys can be used for the search; efficient because it searches the index only.
- Scan - reads every item, so it is inefficient. It uses filters and looks through all rows, returns only the items matching the filter (not other attributes), and only does eventually consistent reads.
Conditional Writes and Atomic counters
- someone updates the table while another client tries to write the same row.
- Conditional write - only write if the current attribute meets a condition, e.g. ONLY IF price = $10, then update to $12. If another client already updated it to $8, the write won't happen.
- Atomic counter - allows increasing / decreasing a value without interfering with other write requests; all write requests are applied in the order received. Use UpdateItem to increment/decrement.
- Eventually Consistent - multiple copies across servers, BUT a read may not return the most recent data. THUS use strongly consistent reads, at the cost of doubling throughput.

Temp Access to AWS resource (eg DynamoDB)
- eg mobile app - need to use DynamoDB.
- don't want to put API credentials in the code.
- create temp access on user end, and AWS end
- Federated ID providers and IAM role.
- create new role for each identity provider, eg facebook, Google, ...
- when the user logs in, Facebook gives temporary credentials (a token).
- In the role, define what permissions the role has, e.g. read/write access, which tables can be accessed.
- assumeRoleWithWebIdentity() requests temporary AWS security credentials using the provider token; specify the ARN (Amazon Resource Name) of the IAM role.
- Create Role - "Role for Identity Provider Access" - Grant access to web identity - (see the sketch below)
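A hedged sketch of the temporary-credentials step using boto's STS module (the role ARN is a placeholder, and provider_token is whatever token the identity provider's login returned):

<code><pre>
import boto.sts

def get_temp_credentials(provider_token):
    # provider_token is returned by the identity provider (e.g. Facebook/Google) login.
    sts = boto.sts.connect_to_region('us-east-1')
    assumed = sts.assume_role_with_web_identity(
        role_arn='arn:aws:iam::123456789012:role/mobile-app-role',   # hypothetical role ARN
        role_session_name='mobile-user-session',
        web_identity_token=provider_token)
    # Temporary access key / secret key / session token for subsequent SDK calls.
    return assumed.credentials
</pre></code>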


SNS - Simple Notification Service
- subscription = designation of an endpoint
- send email, notice to apps, google cloud messaging, integrate with Cloud watch
- push notification service - Apple Push.. APNS, Google Messaging, Amazon Messaging
- need to subscribe to topics of SNS, from say DynamoDB, CloudWatch, S3 RRS (Reduced Redundancy Storage 99.99%).
- eg notified if CPU > 80%, DB needs more provisioning.
- Create new Topic -> topic ARN-> send to www.xxx.notify.php where the webpage listens for SNS email/json, etc. Endpoint = ARN
SQS -  Amazon queue, when send a message, add its ARN to SNS. In SQS, select permissions, receive messages on all resources.
- SQS - usually EC2 will poll SQS in order to do something after getting something in SQS.
- SNS sends messages to ALL endpoints. Each message can be different.
- HTTP/s, SMS, EMail/JSON, SQS, Apps

- SNS - a message created by a Publisher -> SNS Topic -> Subscriber endpoints
- When registering each mobile device as an endpoint, you receive a Device Token (APNS, Apple) or Registration ID (Google GCM, Amazon ADM)
1) Receive the Token / RegID from the notification service when the app is registered
2) Tokens are unique for each app / mobile device
3) Amazon SNS uses the token to create a mobile endpoint
4) Register the app with Amazon SNS by giving the App ID and credentials
5) Add the returned device tokens / RegIDs to create mobile endpoints:
i) manually add, ii) migrate from CSV, iii) CreatePlatformEndpoint, iv) register tokens from devices that will install your app in the future.

SNS Message Data
- message posted to subscriber endpoint - key/val pair in JSON format
- Signature - Base64-encoded 'SHA1withRSA' signature of the message,
- SignatureVersion
- MessageId,
- Subject type,
- timestamp,
- Topic ARN for the topic this message was published to
- Type - General Notification
- SigningCertURL
- Unsubscribe URL
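A minimal boto sketch of creating a topic and publishing to it (the topic name and message are examples; the nested dict lookup follows the JSON response boto returns):

<code><pre>
import boto.sns

conn = boto.sns.connect_to_region('us-east-1')

# Create (or fetch) a topic and pull its ARN out of the response dict.
resp = conn.create_topic('image-alerts')   # hypothetical topic name
topic_arn = resp['CreateTopicResponse']['CreateTopicResult']['TopicArn']

# Publish a message; SNS pushes it to every subscribed endpoint.
conn.publish(topic=topic_arn, message='An RRS object was lost', subject='S3 alert')
</pre></code>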

S3, SNS, Python Hands On, LOL Cats
- E.g. a website where people upload an image and our app applies a filter.
- Store the original source file in Standard S3 - 11 9s durability
- don't store the filtered images there - put them in RRS (99.99%) - saves cost.
- SNS on S3: say an object in RRS is lost, AWS will send an SNS notification that the image was lost. We want to automate this so the image is re-processed and re-uploaded.
- Case Study
- EC2: create worker instances to poll the SQS message queue, apply the filter, upload to S3
--- SQS to deploy images - poll the queue and process messages (a worker sketch follows this list)
- SNS uses SQS as the subscription endpoint
- Create role - AWS service roles = Amazon EC2 (allows EC2 instances to call AWS services on our behalf)
- set up SQS - create a new queue - the visibility timeout hides a message so a node has time to work on it before other instances try to work on it.
- set up SNS - create a new topic - choose the SQS endpoint to subscribe -
sudo yum install python-pip
sudo pip install boto
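A rough sketch of the worker loop that polls SQS with boto (the queue name is made up; wait_time_seconds enables long polling):

<code><pre>
import boto.sqs

conn = boto.sqs.connect_to_region('us-east-1')
queue = conn.get_queue('image-filter-jobs')   # hypothetical queue name

while True:
    messages = queue.get_messages(num_messages=1, wait_time_seconds=20)  # long poll
    for msg in messages:
        body = msg.get_body()        # e.g. the S3 location of the image to process
        # ... apply the filter and upload the result to S3 here ...
        queue.delete_message(msg)    # delete only after the work succeeded
</pre></code>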


Cloud Formation
- limit of 20 stacks; submit a form to increase it
- deploy resources through a template in JSON.
- e.g. create dev, test, staging environments
- able to version control, roll back, monitor changes
- need to save the template as *.template
- A template has: Resources, Parameters, Outputs, Description, Mappings, AWSTemplateFormatVersion
- "Resources" - which AWS resources to use, e.g. an S3Bucket whose type is AWS::S3::Bucket. Go to the Template Reference to see the properties.
- Outputs: Fn::GetAtt -> WebsiteURL or Fn::GetAtt -> DomainName
- Parameters: define inputs for users, e.g. the bucket name.
E.g. KeyName, VpcId, SubnetId, SSHLocation
- Mappings eg "RegionMap" : { "us-east-1" : {"AMI":"ami-13141"}}
- Can update the stack, update the template code, and add resources without downtime. (A minimal template is sketched below.)
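A tiny illustration of the template structure, generated from Python so it stays in the same language as the other examples (resource names are arbitrary):

<code><pre>
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal stack with one S3 bucket",
    "Resources": {
        "S3Bucket": {"Type": "AWS::S3::Bucket"}
    },
    "Outputs": {
        "WebsiteURL": {"Value": {"Fn::GetAtt": ["S3Bucket", "WebsiteURL"]}}
    }
}

# Save as *.template and upload through the CloudFormation console or API.
with open('minimal-s3.template', 'w') as f:
    f.write(json.dumps(template, indent=2))
</pre></code>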


SQS - scalable messaging system
- loosely decoupled: elasticity, scaling, layering, protection against loss of data.
- message size up to 256 KB
- guarantees a message will be delivered at least once (so there can be duplicate messages)
- delay queues - delay delivery, e.g. 30 secs; min 0, max 12 hrs
- Message retention period - how long a message lives in the queue if not deleted; default is 4 days, min 1 minute, max 14 days
- Visibility timeout - seconds during which a message received from the queue is invisible to other components polling SQS, so that other instances don't try to work on a task while one instance is already working on it. Min 0, max 12 hours. Default: 30 secs
- Receive Message Wait Time - if the value is > 0, this activates long polling. It is the maximum time long polling will wait for a message, if there is none, before returning empty.
- ACL - who can retrieve/send messages
- multiple writers/readers, i.e. multiple EC2 instances consistently polling the queue - allows auto scaling when needed.
- messages can carry instructions, e.g. tell where an uploaded image is stored.
- Lifecycle - i) our component sends Message A, then SQS stores multiple copies. ii) Component 2 retrieves a message from the queue and Message A is returned. Message A stays in the queue while being processed and is not returned to subsequent receive requests for the duration of the visibility timeout. iii) Component 2 then deletes Message A from the queue.
- no downtime, high availability; fault tolerance via the visibility timeout.
- Short polling (default) - queries only a subset of the SQS servers, so continuous polling is needed to ensure every server is polled; you sometimes get false empty responses.
- Long polling - reduces empty responses; it may wait until there is a message in the queue or until the timeout. Each poll is charged, so long polling is cheaper.
- SQS guarantees a message arrives at least ONCE, but there can be duplicates. It does NOT guarantee the delivery order of messages.
- if order is needed, use a sequence number handled by the instances.

SQS  Developer Requirements
- extend a single message's visibility timeout - ChangeMessageVisibility() - changes the visibility of a single message
- change a queue's default visibility timeout - API SetQueueAttributes, VisibilityTimeout attribute
- enable long polling on a queue - API SetQueueAttributes, ReceiveMessageWaitTimeSeconds attribute
- enable a delay queue - API SetQueueAttributes, DelaySeconds attribute
- GetQueueAttributes(), ChangeMessageVisibilityBatch(), DeleteMessageBatch(), GetQueueUrl()
(boto equivalents sketched below)
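The same calls sketched with boto (the queue name and values are placeholders; the attribute names match the SQS attributes listed above):

<code><pre>
import boto.sqs

conn = boto.sqs.connect_to_region('us-east-1')
queue = conn.get_queue('image-filter-jobs')   # hypothetical queue name

# Queue-level defaults via SetQueueAttributes
conn.set_queue_attribute(queue, 'VisibilityTimeout', 60)              # default visibility timeout
conn.set_queue_attribute(queue, 'ReceiveMessageWaitTimeSeconds', 20)  # enable long polling
conn.set_queue_attribute(queue, 'DelaySeconds', 30)                   # delay queue

# Extend the visibility timeout of a single in-flight message
msgs = queue.get_messages(num_messages=1)
if msgs:
    conn.change_message_visibility(queue, msgs[0].receipt_handle, 120)
</pre></code>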

AWS Documentation - AmazonSQS - API Reference -

SWF - Simple Workflow Service
- define a task-by-task workflow - code executes each task - a distributed service, so components are in pieces and scalable.
- applications can be in the cloud or on-premise,
- a workflow can consist of human events and last up to 1 year,
- Domains - determine the scope of a workflow; multiple workflows can live in one domain, but workflows cannot interact with workflows in another domain
- Workers and Decider - activity workers perform activities - workers poll to see if there are tasks to do. After doing a task, they report back to SWF.
- Activity task - do something
- Decision task - occurs when the state has changed - tells the decider the state of the workflow has changed and lets the decider choose what is next.
- SQS can deliver duplicate tasks, so a workflow like video transcoding, where order is important, CANNOT use SQS. Need to use SWF.
- SQS/SWF similarities: distributed systems, scalable,
- SQS is best effort and can duplicate; order is not guaranteed; messages live up to 14 days
- SWF guarantees order, can include human tasks, tasks can last up to 1 year, allows asynchronous and synchronous processes.

EC2
- instances launched in a VPC - allows provisioning your own cloud area - internal static IP addresses, private subnets, secure routing between instances, network layer protocols, etc.
- a subnet lives in a single availability zone; use subnets in multiple zones for availability
- create a new key pair, download it, then launch the instance
- can share an AMI with other users, or make it public - it can be used in different regions
- can copy an AMI from one region to another.
- EBS-backed vs instance store. EBS-backed is stored on a storage device, i.e. it maintains state in the data; it shows up under VOLUMES. Instance store uses temporary storage; the instance can be rebuilt, but if it is stopped the changes are not saved.

EC2 Classic
- virtual servers in Cloud.
- Spot Instances - bid for unused EC2 capacity; not for critical services
- Reserved Instances - pay upfront for a lower EC2 price and guaranteed compute time in an AZ
- On-Demand Instances - hourly price, no upfront cost,
- Service limits - 20 instances per account, 5 EIPs (Elastic IP Addresses)
- S3 - Simple Storage Service
- Instance store volumes (virtual devices) are attached to the actual hardware (like USB); data is gone when the instance is stopped.
- EBS volumes (remote elastic block storage devices) are attached to network storage; the root volume is /dev/sda1
--- attached to only 1 instance at a time, min 1 GB, max 1 TB
--- Pre-warming ensures data is not lost when attaching/detaching volumes; pre-warming touches every single block of the device.
--- Snapshots are incremental
- IP addresses - each instance has a public IP address, a public CNAME, and a private IP
- IOPS - I/O operations per sec - measured in 16 KB chunks; provisioned IOPS x 16 KB / 1024 = MB of transfer /sec (worked example after this list)
- ELB load balancers - distribute traffic, stop serving traffic to unhealthy instances, store SSL certificates.
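A worked example of the IOPS-to-throughput formula in the list above (the 4000 IOPS figure is just an example):

<code><pre>
# IOPS are measured in 16 KB chunks, so MB/s = IOPS * 16 KB / 1024
provisioned_iops = 4000
mb_per_sec = provisioned_iops * 16 / 1024.0
print mb_per_sec    # 62.5 MB of transfer per second
</pre></code>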

Maintain session state on ELB
- by default session state is file-based on each instance; with multiple load-balanced instances, a user who leaves and comes back may get sent to another instance and lose the session.
- Solution 1 - enable stickiness - application-generated cookie stickiness.
- Solution 2 - app-controlled stickiness: ELB issues a cookie to associate the session with the original server.
- ElastiCache - ELB balances and distributes across EC2; maintain session state in a DB or keep session memory in Memcached

VPC
- allows you to define a network in AWS that resembles a traditional network
- like on-premise, it has internal IP addresses (a private network).
- private network, public/private subnets
- can define custom IP address ranges inside each subnet.
- can configure route tables between subnets, configure gateways and attach them to subnets
- able to extend a corporate/home/on-premise network to the cloud as if it were part of your network
- a NAT within a private subnet allows instances to download from the internet, but not to serve things out
- VPN to the cloud: extend the home network to the cloud with a VPN - a VPG with an IPsec VPN tunnel - layered security within a private subnet
- Default VPC - makes instances look like EC2 Classic: they have a public IP/subnet, everything is pre-configured, an internet gateway is connected,
- non-default VPC - instances have a private but not a public IP address. Subnets do not have a gateway attached by default. Connect via Elastic IP, NAT or VPN
- VPC peering - allows setting up a direct network route; peers can access each other with private IP addresses, as if on the same private network.
- Peering two VPCs - multiple VPCs connected as if in one private network. Peering TO a VPC - multiple VPCs connect to a central VPC but not to each other.
- Limits: 5 VPCs per region, number of gateways = number of VPCs, 5 internet gateways (can request more), 1 gateway attached to a subnet at a time.
- 50 customer gateways per region, 50 VPN connections/region, 200 route tables/region, 5 Elastic IPs, 100 security groups, 50 rules per security group.


Building a Non-Default VPC
- VPC = the network; a subnet lives inside a specific VPC
- Create private subnets
- CIDR - Classless Inter-Domain Routing, e.g. 10.0.0.0/16, up to 256 subnets
- Tenancy - Default (shared) or Dedicated (single tenant) - this tenancy option takes preference over the tenancy selected at instance launch.
- Subnet - has its own CIDR range; think about multi-AZ availability,
e.g. 1st subnet: 10.0.1.0/24 -> us-east-1a
e.g. 2nd subnet: 10.0.2.0/24 -> us-east-1b; use a load balancer between the 1st and 2nd subnets
The VPC automatically allows subnets to communicate with each other.
--- cannot connect to these from the outside; there is no internet gateway yet, so they cannot send/receive to the internet.
--- if you create a NAT - they can download patches, but cannot serve the outside, and you cannot connect to the instances from outside.
--- create a gateway on a public subnet, launch an instance inside the public subnet, attach an Elastic IP, then connect from it to the private instances in the private subnet.
--- or create a VPN - use OpenVPN or Amazon VPN

Route Table for 10.0.0.0/16
- all subnets are associated so traffic can be routed to each one.
- one route table is assigned to a subnet at a time.

Internet Gateway
- one gateway is attached to the default VPC
- to communicate with the outside, launch the instance into a subnet with an internet gateway attached AND attach an Elastic IP or put it in an ELB group.
- attach the gateway to the VPC (the subnets are still private)
- assign the gateway to a route, then change the route on the public subnet.
- go to Route Tables, choose a route table, add the gateway to it: 0.0.0.0/0 <-> gateway.
- then attach the route table to the subnet. To make 10.0.1.0 public, go to SUBNET, select Route Table, and choose Route Table #2 created above.
Route Table #2 has:
Destination    Target
10.0.0.0/16    local
0.0.0.0/0      gateway
Route Table #1 (default) has:
10.0.0.0/16    local
(A boto sketch of this build is below.)
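The same non-default VPC build, sketched with boto's VPC module (the CIDR ranges follow the notes; names and region are arbitrary, and error handling is omitted):

<code><pre>
import boto.vpc

c = boto.vpc.connect_to_region('us-east-1')

vpc = c.create_vpc('10.0.0.0/16')                        # the VPC / network
public_subnet = c.create_subnet(vpc.id, '10.0.1.0/24')   # will become public
private_subnet = c.create_subnet(vpc.id, '10.0.2.0/24')  # stays private

igw = c.create_internet_gateway()
c.attach_internet_gateway(igw.id, vpc.id)

# Public route table: default route 0.0.0.0/0 points at the internet gateway
public_rt = c.create_route_table(vpc.id)
c.create_route(public_rt.id, '0.0.0.0/0', gateway_id=igw.id)
c.associate_route_table(public_rt.id, public_subnet.id)
</pre></code>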

Elastic IP address
- can be attached to any instance, but if the instance is not in a subnet with a gateway, then you CANNOT connect to it.

Security
Network ACLs - on network level- protects subnet
Security Group - on instance level

- Public Subnets
- attach a gateway to it


VPC Security
Router VPC - 10.0.0.0/16
- has Virtual Priv Gateway
- has Internet Gateway
- has TWO Routing table
Routing Table -> Network ACL -> Subnet 10.0.0.0/24 -> Security Group -> Instances
Network ACL = firewall protecting Subnets; can deny a DDoS source with a rule
- STATELESS -> return traffic (outbound) must be specified explicitly, e.g. port 80
- Explicit DENY OVERRULES ALLOW
- Rules from low number to high.
- LAST RULE: * 0.0.0.0/0 DENY

Security Group = firewall protecting Instances
- STATEFUL -> outbound return traffic is allowed automatically
- an instance can belong to multiple Security Groups


Create VPC NAT Instance
- the private instance remains protected
- one private, one public instance
- Create a Security Group called NATsec, launched inside the VPC
- for every private subnet, add its CIDR to NATsec,
e.g. INBOUND 10.0.3.0/24, OUTBOUND 0.0.0.0/0
- Create an instance from the Linux AMI NAT image, create it in the VPC, in the public subnet, and select the NATsec security group.
- associate an Elastic IP with the new NAT instance. This will communicate on behalf of the private subnet.
- Right-click the NAT instance - "Change Source/Dest Check" - Disable
- go to Route Tables - go to the route table associated with the private subnet, enter Destination=0.0.0.0/0, target=the NAT instance id.
- In Route Tables - check the subnet associations to attach it to the private subnet


VPC Networking
- VPN - doesn't need an Internet Gateway.
- the VPN goes through the internet - use a Virtual Private Gateway instead of an Internet Gateway
- create a Virtual Private Gateway - this connects to the VPN on the cloud side
- On the customer side, connect the VPN to a Customer Gateway.
- if using OpenVPN, it may be different from the Customer Gateway setup.


Elastic IP Addresses and Elastic Network Interfaces
- can attach a private IP to an Elastic IP
- create an Elastic Network Interface - this gives a new private IP - it can be attached to / detached from any instance, and can have an Elastic IP attached to it.
- reasons: creating HA architectures, etc.
- an instance automatically has a primary private IP
- allows reassociation
- when the instance stops, the EIP stays attached, because the EIP is attached to the Elastic Network Interface, which belongs to the VPC


Create a WIKI in VPC
- create wikiVPC - CIDR 10.0.0.0/16
- need public x2 (to use load balancer) and private subnets x2
- subnet public1 CIDR 10.0.0.0/24 -> us-east1b -> apache
- subnet public2 CIDR 10.0.1.0/24 -> us-east1c
- subnet private1 CIDR 10.0.3.0/24 -> us-east1b -> DB, failover, redundancy
- subnet private2 CIDR 10.0.4.0/24 -> us-east1c
- RouteTables - create route wikiPublic
- Internet Gateway - create wikiGateway -> attach to wikiVPC
- Route Tables - for the wikiPublic route table, add route 0.0.0.0/0 -> wikiGateway
- Route Tables -> Subnet associations -> 4 subnets to choose from; select the public1 and public2 subnets
- Go to RDS Dashboard -> Subnet Groups. It has the default VPC associated.
--- Need to create a DB subnet group - attach it to wikiVPC.
--- add subnets - only the private subnets: i) private1, ii) private2
--- Go to Instances -> launch a DB instance, enable Multi-AZ (Availability Zone) for DB failover; it switches DNS to the other zone; data is replicated.
- Go to EC2 -
- launch an instance called websetup - choose wikiVPC, choose public1, create a security group wikiSecGroup (add HTTP rule, source=0.0.0.0/0)
- launch an instance called amisetup - choose wikiVPC, choose public1, use security group wikiSecGroup (add HTTP rule source=0.0.0.0/0, add SSH rule source=0.0.0.0/0) - create a key pair
- go to Elastic IP - create new one - associate with instance amisetup
- change the permissions on the PEM file to 400 or 600
- download php/apache - sudo apt-get .....
- go to Route 53 - add the domain name with nameservers pointed to the delegation set. Create a new record set - add the public address,
- go to RDS -> Security Group -> Inbound: all traffic from the wiki-app security group; Outbound: traffic to the wiki-app security group
- Instances -> create Image of amisetup for HA, called wiki-ami
- Load Balancers - create a Load Balancer -> use wikiVPC -> protocol = HTTP/80 -> select public1, public2 -> use the wikiapp Security Group
- Instances - terminate the instance (now that the amisetup instance image is completed)
- go to Auto Scaling - create an auto scaling group -> new launch config -> select image MyAMIs -> wiki-ami -> call this wiki-as -> launch in the security group called wiki-app (wikiSecGroup) ->
- Create Auto Scaling Group - name wiki-group - 2 instances needed -> use wikiVPC - add public1, public2 -> use the wiki Load Balancer
- Scaling policy between 2 and 4, integrated with CloudWatch -> create an alarm: CPU utilization > 50% for 1 minute, then scale up by 50% of the group. Use tag Name:wiki-as
- Self-healing - if an instance is terminated, after 60 seconds a replacement instance relaunches



AWS compliance:
PCI DSS, SOC IRAP, ISO 9001, ISO 27001 MTCS, HIPAA, FERPA, ITAR, FedRAMP, DIACAP, FISMA, NIST, CJIS, FIPS, DOD CSM, G-CLOUD, IT-GRUNSHUTZ, MPAA, CSA,

DynamoDB:
- Actions: BatchGetItem, BatchWriteItem CreateTable DeleteItem DeleteTable DescribeTable GetItem ListTables PutItem Query Scan UpdateItem UpdateTable
- Item collection - if an item collection exceeds the 10 GB limit, DynamoDB will return an ItemCollectionSizeLimitExceededException and you won't be able to add more items to the item collection or increase the sizes of items that are in it.
-- uses optimistic concurrency control, uses conditional writes
- NO CROSS JOIN support
- Local secondary index — an index that has the same hash key as the table, but a different range key. A local secondary index is "local" in the sense that every partition of a local secondary index is scoped to a table partition that has the same hash key.
- Global secondary index — an index with a hash or a hash-and-range key that can be different from those on the table. A global secondary index is considered "global" because queries on the index can span all items in a table, across all partitions.
- scan operations return data in 1 MB chunks



EC2 Instances
-- an AMI with a Product Code cannot be made public
-- an AMI can only be launched in the same region as the AMI is stored.
- limit of 20 EC2 instances per account (by default)

EBS Volumes
-- in the stopped state, EBS volumes can be attached/detached
-- when the instance is terminated, the root volume is deleted (by default)
-- charged for the volume and instance usage, in addition to the AMI

CloudTrail
-- captures API calls made to the SQS API from the Console or from API calls, and delivers them to an S3 bucket.
-- from CloudTrail, you can determine what SQS request was made, the source IP, who requested it, when, etc.

IAM
- AWS temporary credentials associated with an IAM role are rotated many times a day
- Cannot change the IAM role of a running EC2 instance, but can change the role's permissions, which take effect immediately.
- IAM roles can have up to 250 policies; if more are needed, submit a form to AWS.

SQS
-- can view counts of messages that are Visible and NotVisible
-- valid identifiers for queues and messages are: QueueUrl, MessageId, ReceiptHandle
-- SQSBufferedAsyncClient - prefetches messages into a local buffer; automatic batching of SendMessage / DeleteMessage
-- in a message, the IP address of the sender is given by SenderId
- DLQ (dead letter queue) - an SQS queue that can be set up to receive messages from other queues which have reached their receive limits

S3
- multipart upload - can stop and resume uploads, and can start the upload while the file is still being created.

EC2 - Relational Database AMIs
- stores data in EBS - fast, reliable, persistent
- avoids the friction of infrastructure provisioning while gaining access to standard DB engines
- enables complete control over administration and tuning of the DB server

VPC
- allows up to 200 subnets in a VPC.
- allows 5 Virtual Private Gateways per region

SWF, IAM, RDS