Week of November 20th

“Climb🧗‍♀️ in the back with your head👤 in the Clouds☁️☁️… And you’re gone

Hi All –

Happy Name Your PC💻 Day!

Forward yesterday makes me wanna stay…”

“Welcome back, to that same old place that you laughed 😂 about”. So, after a short recess we made our splendiferous return this week. To where else? …But to none other than Google Cloud Platform a.k.a. GCP☁️, of course! 😊 After completing our three-part Cloud Journey, we were feeling the need for a little refresher… Also, we still had a few loose ends to sew🧵 up. The wonderful folks at Google Cloud☁️ put together an amazing compilation on GCP☁️ through their Google Cloud Certified Associate Cloud Engineer Path, but we were feeling the need for a little more coverage of the GCP CLI, i.e. “gcloud”, “gsutil”, and “bq”. In addition, we had a great zest to learn a little more about some of the service offerings like GCP Development Services and APIs. Fortunately, we knew exactly who could deliver tremendous content on GCP☁️ as well as hit the sweet spot on some of the areas where we felt we were lacking a bit. That would be, of course, one of our favorite Canucks 🇨🇦: Mattias Andersson.

For those who are not familiar with Mattias, he is one of the legendary instructors on A Cloud Guru. Mattias is especially well-known for his critically acclaimed Google Certified Associate Cloud Engineer 2020 course.

In this brilliantly produced course Mattias delivers the goods and then some! The goal of the course is to prepare those interested in Google’s Associate Cloud Engineer (ACE) Certification exam, but it’s structured to efficiently provide you with the skills to troubleshoot GCP through a better understanding of “Data flows”. Throughout the course Mattias emphasizes the “see one, do one, teach one” technique in order to get the best ROI out of the tutorial.

So, after some warm salutations and a great overview of the ACE exam, Mattias takes us right into an introduction of all the Google Cloud products and services. He accentuates the importance of Data Flow in fully understanding how all GCP solutions work. “Data Flow is taking data or information and it’s moving it around, processing it and remembering it.”

Data flows – are the foundation of every system

  • Moving, Processing, Remembering
    • Not just Network, Compute, Storage
  • Build mental models
    • Helps you make predictions
  • Identify and think through data flows
    • Highlights potential issues
  • Requirements and options not always clear
    • Especially in the real world🌎
  • Critical skills for both real world🌎 and exam📝 questions

“Let’s get it started, in here…And the bass keep runnin’ 🏃‍♂️ runnin’ 🏃‍♂️, and runnin’ 🏃‍♂️ runnin’ 🏃‍♂️, and runnin’ 🏃‍♂️ runnin’ 🏃‍♂️, and runnin’ 🏃‍♂️ runnin’ 🏃‍♂️, and runnin’ 🏃‍♂️ runnin’ 🏃‍♂️, and runnin’ 🏃‍♂️ runnin’ 🏃‍♂️, and runnin’ 🏃‍♂️ runnin’ 🏃‍♂️, and runnin’ 🏃‍♂️ runnin’ 🏃‍♂️, and…”

After walking🚶‍♀️ us through how to create a free account, it was time ⏰ to kick 🦵 us off with a little Billing and Billing Export.

“Share it fairly, but don’t take a slice of my pie 🥧”

Billing Export to BigQuery enables you to export your daily usage and cost estimates automatically throughout the day to a BigQuery dataset (a sample query follows the list below).

  • Export must be set up per billing account
  • Resources should be placed into appropriate projects
  • Resources should be tagged with labels🏷
  • Billing export is not real-time
    • Delay is hours
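To give a flavor of what you can do once the export lands in BigQuery, here is a minimal “bq” sketch that totals cost by service. The project, dataset, and table names here are placeholders, not real ones; your actual export table name is generated when you configure the export.

# hypothetical names: my-project, billing_dataset, and the export table suffix
bq query --use_legacy_sql=false \
'SELECT service.description AS service, ROUND(SUM(cost), 2) AS total_cost
 FROM `my-project.billing_dataset.gcloud_billing_export_v1_XXXXXX`
 GROUP BY service
 ORDER BY total_cost DESC'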

Billing IAM – Role: Billing Account User

  • Link🔗 projects to billing accounts
  • Restrictive permissions
  • Along with the Project Creator role, allows a user to create new projects linked to billing

Budgets – Help with project planning and controlling costs

  • Setting a budget lets you track spend
  • Apply budget to billing account or a Project

Alerts 🔔 – notify billing administrators when spending exceeds a percentage of your budget

Google Cloud Shell 🐚 – provides CLI access to Cloud☁️ Resources directly from your browser.

  • Command-line tool🔧 to interact with GCP☁️
  • Basic Syntax
 gcloud --project=myprojid compute instances list
 gcloud compute instances create myvm
 gcloud services list --available
 gsutil ls
 gsutil mb -l northamerica-northeast1 gs://storage-lab-cli
 gsutil label set bucketlabels.json gs://storage-lab-cli

GCS via gsutil in Command Line

 gcloud config list
 gcloud config set project igneous-visitor-293922
 gsutil ls
 gsutil ls gs://storage-lab-console-088/
 gsutil ls gs://storage-lab-console-088/**
 gsutil mb --help
 gsutil mb -l northamerica-northeast1 gs://storage-lab-cli-088
 
 gsutil label get gs://storage-lab-console-088/
 gsutil label get gs://storage-lab-console-088/ > bucketlabels.json
 cat bucketlabels.json
 gsutil label get gs://storage-lab-cli-088
 gsutil label set bucketlabels.json gs://storage-lab-cli-088
 gsutil label ch -l "extralabel:extravalue" gs://storage-lab-cli-088
 gsutil versioning get gs://storage-lab-cli-088
 gsutil versioning set on gs://storage-lab-cli-088
 gsutil versioning get gs://storage-lab-cli-088
 gsutil cp README-Cloudshell.txt gs://storage-lab-cli-088
 gsutil ls -a gs://storage-lab-cli-088
 gsutil rm gs://storage-lab-cli-088/README-Cloudshell.txt
 gsutil cp gs://storage-lab-console-088/** gs://storage-lab-cli-088/
 gsutil acl ch -u AllUsers:R gs://storage-lab-cli-088/shutterstock.jpg 

Create VM via gcloud in Command Line

 gcloud config get-value project
 gcloud compute instances list
 gcloud services list
 gcloud services list --enabled
 gcloud services list --help
 gcloud services list --available
 gcloud services list --available |grep compute
 gcloud services -h
 gcloud compute instances create myvm
 gcloud compute instances delete myvm 

Security🔒 Concepts

Confidentiality, Integrity, and Availability (CIA)

  • You cannot view data you shouldn’t
  • You cannot change data you shouldn’t
  • You can access data you should

Authentication, Authorization, Accounting (AAA)

  • Authentication – Who are you?
  • Authorization – What are you allowed to do?
  • Accounting – What did you do?
  • Resiliency – Keep it running 🏃‍♂️
  • Security🔒 Products
  • Security🔒 Features
  • Security🔒 Mindset
    • Includes Availability Mindset

Key🔑 Security🔒 Mindset (Principles)

  • Least privilege
  • Defense in depth
  • Fail Securely

Key🔑 Security🔒 Products/Features

  • Identity hierarchy👑 (Google Groups)
  • Resource⚙️ hierarchy👑 (Organization, Folders📂, Projects)
  • Identity and Access Management (IAM)
    • Permissions
    • Roles
    • Bindings
  • GCS ACLs
  • Billing management
  • Networking structure & restrictions
  • Audit / Activity Logs (provided by Stackdriver)
  • Billing export
    • To BigQuery
    • To file (in GCS bucket🗑)
      • Can be JSON or CSV format
  • GCS object Lifecycle Management

IAM – Resource Hierarchy👑

  • Resource⚙️
    • Something you create in GCP☁️
  • Project
    • Container for a set of related resources
  • Folder📂
    • Contains any number of Projects and Subfolders📂
  • Organization
    • Tied to G Suite or Cloud☁️ Identity domain

IAM – Permissions & Roles

Permissions – allow you to perform a certain action

  • Each one follows the form Service.Resource.Verb
  • Usually correspond to REST API methods
    • pubsub.subscriptions.consume
    • pubsub.topics.publish

Roles – collections of permissions to use or manage GCP☁️ resources

  • Primitive Roles – Project-level and often too broad
    • Viewer is read-only
    • Editor can view and change things
    • Owner can also control access & billing
  • Predefined Roles
    • roles/bigquery.dataEditor, roles/pubsub.subscriber
    • Read through the list of roles for each product! Think about why each exists
  • Custom Role – Project or Org-Level collection you define of granular permissions

IAM – Members & Groups

Members – some Google-known identity

  • Each member is identified by a unique email📧 address
  • Can be:
    • user: Specific Google account
      • G Suite, Cloud☁️ Identity, gmail, or validated email
    • serviceAccount: Service account for apps/services
    • group: Google group of users and services accounts
    • domain: whole domain managed by G Suite or Cloud☁️ Identity
    • allAuthenticatedUsers: Any Google account or service account
    • allUsers: Anyone on the internet (Public)

Groups – a collection of Google accounts and service accounts

  • Every group has a unique email📧 address that is associated with the group
  • You never act as the group
    • But membership in a group can grant capabilities to individuals
  • Use them for everything
  • Can be used for owner when within an organization
  • Can nest groups in an organization
    • One group for each department, and all of those groups in one group for all staff

IAM – Policies

Policies – bind members to roles for some scope of resources

  • Enforce who can do what to which thing(s)
  • Attached to some level in the Resource⚙️ hierarchy👑
    • Organization
    • Folder📂
    • Project Resource⚙️
  • Roles and Members listed in policy, but Resources identified by attachment
  • Always additive (Allow) and never subtractive (no Deny)
  • One policy per Resource⚙️
  • Max 1,500 member bindings per policy
 gcloud [GROUP] add-iam-policy-binding [RESOURCE-NAME] \
 --role [ROLE-ID-TO-GRANT] --member user:[USER-EMAIL]
 gcloud [GROUP] remove-iam-policy-binding [RESOURCE-NAME] \
 --role [ROLE-ID-TO-REVOKE] --member user:[USER-EMAIL]
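For instance, granting a user read-only access to a project might look roughly like this (the project ID and email are made-up examples):

# grant the primitive Viewer role on a hypothetical project
gcloud projects add-iam-policy-binding my-project-id \
  --role roles/viewer --member user:alice@example.com
# and revoke it again
gcloud projects remove-iam-policy-binding my-project-id \
  --role roles/viewer --member user:alice@example.com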

Billing Accounts – represent some way to pay for GCP☁️ service usage

  • Type of Resource⚙️ that lives outside of Projects
  • Can belong to an Organization
    • Inherits Org-level IAM policies
  • Can be linked to projects
    • Not the Owner
      • No impact on project IAM
  • Billing Account Creator – Create new self-service billing accounts (Scope: Org)
  • Billing Account Administrator – Manage billing accounts (Scope: Billing Account)
  • Billing Account User – Link projects to billing accounts (Scope: Billing Account)
  • Billing Account Viewer – View billing account cost information and transactions (Scope: Billing Account)
  • Project Billing Manager – Link/unlink the project to/from a billing account (Scope: Project)
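As a sketch of the Billing Account User role in action, linking a project to a billing account from the CLI looks roughly like this (the IDs are placeholders; these billing commands lived under gcloud’s beta surface at the time of writing):

gcloud beta billing accounts list
gcloud beta billing projects link my-project-id \
  --billing-account 0X0X0X-0X0X0X-0X0X0X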

Monthly Invoiced Billing – Billed monthly and pay by invoice due date

  • Pay via check or wire transfer
  • Increase project and quota limits
  • Billing administrator of org’s current billing account contacts Cloud☁️ Billing Support
    • To Determine eligibility
    • To apply to switch to monthly invoicing
  • Eligibility depends on
    • Account age
    • Typical monthly spend
    • Country

Networking

            Choose the right solution to get data to the right Resource⚙️

  • Latency reduction – Use Servers physically close to clients
  • Load Balancing – Separate from auto-scaling
  • System design – Different servers may handle different parts of the system
  • Cross-Region Load Balancing – with Global🌎 Anycast IPs
  • Cloud☁️ Load Balancer 🏋️‍♀️ – all types; internal and external
  • HTTP(S) Load Balancer 🏋️‍♀️ (With URL Map)

Unicast vs Anycast

Unicast – There is only one unique device in the world that can handle this; send it there.

Anycast – There are multiple devices that could handle this; send it to any one of them – but ideally the closest.

            Load Balancing – Layer 4 vs Layer 7

  • TCP is usually called Layer 4 (L4)
  • HTTP and HTTPS work at Layer 7 (L7)
  • Each layer is built on the one below it
    • To route based on URL paths, routing needs to understand L7
    • L4 cannot route based on the URL paths defined in L7

DNS – Name resolution (via the Domain Name System) can be the first step in routing

  • Some known issues with DNS
    • Layer 4 – Cannot route L4 based on L7’s URL paths
    • Chunky – DNS queries often cached and reused for huge client sets
    • Sticky – DNS lookup “locks on” and refreshing per request has high cost
      • Extra latency because each request includes another round-trip
      • More money for additional DNS request processing
    • Not Robust – Relies on the client always doing the right thing
  • Premium tier routing with Global🌎 anycast IPs avoids these problems

Options for Data from one Resource to another

  • VPC (Global🌎) Virtual Private Cloud☁️ – Private SDN space in GCP☁️
    • Not just Resource-to-Resource – also manages the doors to outside & peers
  • Subnets (regional) – create logical spaces to contain resources
    • All Subnets can reach all others – Globally without any need for VPNs
  • Routes (Global🌎) define “next hop” for traffic🚦 based on destination IP
    • Routes are Global🌎 and apply by Instance-level Tags, not by Subnet
    • No route to the Internet gateway means no such data can flow
  • Firewall🔥 Rules (Global🌎) further filter data flow that would otherwise route
    • All FW Rules are Global🌎 and apply by Instance-level Tags or Service Acct.
    • Default Firewall🔥 Rules are restrictive inbound and permissive outbound

IPs and CIDRS

  • IP address is a dotted quad like 255.255.255.255, where each piece is 0-255
  • CIDR block is a group of IP addresses specified in <IP>/xy notation
    • Turn IP address into 32-bit binary number
    • 10.10.0.254 -> 00001010 00001010 00000000 11111110
    • /xy in CIDR notation locks highest (leftmost) bits in IP address (0-32)
    • abc.efg.hij.klm/32 is a single IP address (255.255.255.255) because all 32 bits are locked
    • abc.efg.hij.klm/24 is 256 IP addresses (255.255.255.0) because the last 8 bits can vary
    • 0.0.0.0/0 means “any IP address” because no bits are locked
  • RFC 1918 defines private (i.e. non-internet) address ranges you can use:
    • 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16

Subnet CIDR Ranges

  • You can edit a subnet to increase its CIDR range
  • No need to recreate subnet or instances
  • New range must contain old range (i.e. old range must be a subset)
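A quick sketch of that with gcloud (names and ranges are made up); expand-ip-range grows the /24 into a /23 that contains it, in place:

gcloud compute networks subnets create my-subnet \
  --network my-vpc --region us-central1 --range 10.0.0.0/24
# later, widen the range without recreating the subnet or instances
gcloud compute networks subnets expand-ip-range my-subnet \
  --region us-central1 --prefix-length 23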

Shared VPC

  • In an Organization, you can share VPCs among multiple projects
    • Host Project: One project owns the Shared VPC
    • Service Projects: Other projects granted access to use all/part of Shared VPC
  • Lets multiple projects coexist on same local network (private IP space)
  • Lets a centralized team manage network security🔒

“Ride, captain👨🏿‍✈️ ride upon your mystery ship⛵️

GKE

A Kubernetes ☸️ cluster is a set of nodes that run containerized applications. Containerizing an application packages it with its dependencies and some necessary services.

In K8s ☸️, the control plane consists of the kube-apiserver, kube-scheduler, kube-controller-manager, and an etcd datastore.

Deploy and manage clusters on-prem

Step 1: The container runtime

Step 2: Installing kubeadm

Step 3: Starting the Kubernetes cluster ☸️

Step 4: Joining a node to the Kubernetes cluster ☸️
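As a rough sketch, those four steps map to commands like the following (package repository setup is omitted; the address, token, and hash are placeholders that kubeadm init prints for you):

# Step 1: install a container runtime (e.g. Docker or containerd) on every node
# Step 2: install the Kubernetes tooling
sudo apt-get update && sudo apt-get install -y kubeadm kubelet kubectl
# Step 3: initialize the control plane on the first node
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# Step 4: join a worker node using the token printed by kubeadm init
sudo kubeadm join 10.0.0.10:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>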

Deploy and manage clusters in the Cloud☁️

To deploy and manage your containerized applications and other workloads on your Google Kubernetes Engine (GKE) cluster, you use the K8s ☸️ system to create K8s ☸️  controller objects. These controller objects represent the applications, daemons, and batch jobs running 🏃‍♂️ on your clusters.

            Cloud Native Application Properties

  • Use Cloud☁️ platform services.
  • Scale horizontally.
  • Scale automatically, using proactive and reactive actions.
  • Handle node and transient failures without degrading.
  • Feature non-blocking asynchronous communication in a loosely coupled architecture.

Kubernetes fits into the Cloud-native ecosystem

K8s ☸️ native technologies (tools/systems/interfaces) are those that are primarily designed and built for Kubernetes ☸️.

  • They don’t support any other container or infrastructure orchestration systems.

K8s ☸️ accommodative technologies are those that embrace multiple orchestration mechanisms, K8s ☸️ being one of them.

  • They generally existed in the pre-Kubernetes☸️ era and then added support for K8s ☸️ in their design.

Non-Kubernetes ☸️ technologies are Cloud☁️ native but don’t support K8s ☸️.

Deploy and manage applications on Kubernetes ☸️

K8s ☸️ deployments can be managed via the Kubernetes ☸️ command-line interface, kubectl. kubectl uses the Kubernetes ☸️ API to interact with the cluster.

When creating a deployment, you will need to specify the container image for your application and the number of replicas that you need in your cluster.

  • Create Application
    • create the application we will be deploying to our cluster
  • Create a Docker🐳 container image
    • create an image that will contain the built app
  • Create a K8s ☸️ Deployment
    • K8s ☸️ deployments are responsible for creating and managing pods
    • K8s ☸️ pod is a group of one or more containers, tied together for the purpose of administration and networking. 
    • K8s ☸️ Deployments can be created in two ways
      •  kubectl run command
      •  YAML configuration
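Here is a minimal sketch of both ways (the deployment name and image are made up for illustration):

# imperative: a one-liner creates the Deployment for you
kubectl create deployment hello-web --image=gcr.io/my-project/hello-app:1.0
kubectl scale deployment hello-web --replicas=3
# declarative: the same thing expressed as YAML
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: hello-app
        image: gcr.io/my-project/hello-app:1.0
EOF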

Declarative Management of Kubernetes☸️ Objects Using Configuration Files

K8s ☸️ objects can be created, updated, and deleted by storing multiple object configuration files in a directory and using kubectl apply to recursively create and update those objects as needed.

This method retains writes made to live objects without merging the changes back into the object configuration files. kubectl diff also gives you a preview of what changes apply will make.
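A hedged example of that flow, assuming your manifests live in a configs/ directory:

# preview what would change, recursing through the directory
kubectl diff -R -f configs/
# create/update everything to match the files
kubectl apply -R -f configs/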

DaemonSet

A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.

Some typical uses of a DaemonSet are:

  • Running 🏃‍♂️ a cluster storage🗄 daemon on every node
  • Running 🏃‍♂️ a logs collection daemon on every node
  • Running 🏃‍♂️ a node monitoring🎛 daemon on every node
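A minimal DaemonSet sketch along the lines of the log-collection use case (the name and image are illustrative, not from the course):

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
spec:
  selector:
    matchLabels:
      name: log-agent
  template:
    metadata:
      labels:
        name: log-agent
    spec:
      containers:
      - name: log-agent
        image: fluent/fluentd:v1.11   # runs on every node as nodes come and go
EOF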

Cloud Load Balancer 🏋️‍♀️ that GKE created

Google Kubernetes☸️ Engine (GKE) offers integrated support for two types of Cloud☁️ Load Balancing for a publicly accessible application:

When you specify type:LoadBalancer 🏋️‍♀️ in the Resource⚙️ manifest:

  1. GKE creates a Service of type LoadBalancer 🏋️‍♀️. GKE makes appropriate Google Cloud API calls to create either an external network load balancer 🏋️‍♀️ or an internal TCP/UDP load balancer 🏋️‍♀️.
  2. GKE creates an internal TCP/UDP load balancer 🏋️‍♀️ when you add the cloud.google.com/load-balancer-type: “Internal” annotation; otherwise, GKE creates an external network load balancer 🏋️‍♀️.

Although you can use either of these types of load balancers 🏋️‍♀️ for HTTP(S) traffic🚦, they operate in OSI layers 3/4 and are not aware of HTTP connections or individual HTTP requests and responses.
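To make that concrete, here is a sketch of a Service manifest that triggers that machinery (names and ports are made up; the commented annotation flips it to the internal variant):

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: hello-lb
  # annotations:
  #   cloud.google.com/load-balancer-type: "Internal"   # internal TCP/UDP LB instead
spec:
  type: LoadBalancer        # GKE provisions an external network LB by default
  selector:
    app: hello-web
  ports:
  - port: 80
    targetPort: 8080
EOF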

Imagine all the people👥 sharing all the world🌎

GCP Services

Compute

Compute Engine (GCE) – (Zonal) (IaaS) – Fast-booting Virtual Machines (VMs) for rent, on demand

  • Pick a set machine type – standard, high memory, high CPU – or custom CPU/RAM
  • Pay by the second (60 second min.) for CPUs, RAM
  • Automatically cheaper if you keep running 🏃‍♂️ it (“sustained use discount”)
  • Even cheaper for “preemptible” or long-term use commitment in a region
  • Can add GPUs and paid OSes for extra cost*
  • Live Migration: Google seamlessly moves instance across hosts, as needed

Kubernetes Engine (GKE) – (Regional) (IaaS/PaaS) – Managed Kubernetes ☸️ cluster for running 🏃‍♂️ Docker🐳 containers (with autoscaling)

  • Kubernetes☸️ DNS on by default for service discovery
  • NO IAM integration (unlike AWS ECS)
  • Integrates with Persistent Disk for storage
  • Pay for underlying GCE instances
    • Production cluster should have 3+ nodes*
  • No GKE management fee, no matter how many nodes in cluster

  App Engine (GAE) – (Regional) (PaaS) – Takes your code and runs it

  • Much more than just compute – Integrates storage, queues, NoSQL
  • Flex mode (“App Engine Flex”) can run any container & access VPC
  • Auto-Scales⚖️ based on load
    • Standard (non-Flex) mode can turn off last instance when no traffic🚦
  • Effectively pay for underlying GCE instances and other services

Cloud Functions – (Regional) (FaaS, “Serverless”) – Runs your event-driven code, with no servers to manage

  • Runs code in response to an event – Node.js, Python🐍, Java☕️, Go🟢
  • Pay for CPU and RAM assigned to function, per 100ms (min. 100ms)
  • Each function automatically gets an HTTP endpoint
  • Can be triggered by GCS objects, Pub/Sub messages, etc.
  • Massively Scalable⚖️ (horizontally) – Runs🏃‍♂️ many copies when needed
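A hedged sketch of deploying one (the function and entry-point names are made up; runtime names change over time):

# assumes main.py in the current directory defines hello_http(request)
gcloud functions deploy hello-http \
  --runtime python37 --trigger-http --entry-point hello_http
# invoke it once to check the wiring
gcloud functions call hello-http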

Storage

Persistent Disk (PD) – (Zonal) Flexible🧘‍♀️, block-based🧱 network-attached storage; boot disk for every GCE instance

  • Perf Scales⚖️ with volume size; max way below Local SSD, but still plenty fast🏃‍♂️
  • Persistent disks persist, and are replicated (zone or region) for durability
  • Can resize while in use (up to 64TB), but will need file system update with VM
  • Snapshots (and machine images🖼 ) add even more capability and flexibility
  • Not file based NAS, but can mount to multiple instances if all are read-only
  • Pay for GB/mo provisioned depending on perf. class; plus snapshot GB/mo used

  Cloud Filestore – (Zonal) Fully managed file-based storage

  • “Predictably fast🏃‍♂️ performance for your file-based workloads”
  • Accessible to GCE and GKE through your VPC, via NFSv3 protocol
  • Primary use case is application migration to Cloud☁️ (“lift and shift”)🚜
  • Fully manages file serving, but not backups
  • Pay for provisioned TBs in “Standard” (slow) or “Premium” (fast🏃‍♂️) mode
  • Minimum provisioned capacity of 1TB (Standard) or 2.5TB (Premium)

Cloud Storage (GCS) – (Regional, Multi-Regional) Infinitely Scalable⚖️, fully managed, versioned, and highly durable object storage

  • Designed for 99.999999999% (11 9’s) durability
  • Strongly consistent💪 (even for overwrite PUTs and DELETEs)
  • Integrated site hosting and CDN functionality
  • Lifecycle♻️ transitions across classes: Multi-Regional, Regional, Nearline, Coldline🥶
    • Diffs in cost & availability (99.5%, 99.9%, 99%, 99%), not latency (no thaw delay)
  • All Classes have same API, so can use gsutil and gcsfuse
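For example, those lifecycle transitions can be configured from the CLI with a small JSON policy; a sketch using the lab bucket from earlier (the ages chosen here are arbitrary):

cat > lifecycle.json <<EOF
{
  "rule": [
    {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
     "condition": {"age": 30}},
    {"action": {"type": "Delete"}, "condition": {"age": 365}}
  ]
}
EOF
gsutil lifecycle set lifecycle.json gs://storage-lab-cli-088
gsutil lifecycle get gs://storage-lab-cli-088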

Databases

            Cloud SQL – (Regional) Fully managed and reliable MySQL and PostgreSQL databases

  • Supports automatic replication, backup, failover, etc.
  • Scaling is manual (both vertically and horizontally)
  • Effectively pay for underlying GCE instances and PDs
    • Plus, some baked-in service fees

Cloud Spanner – (Regional, Multi-Regional) “Global🌎, horizontally Scalable⚖️, strongly consistent 💪, relational database service”

  • “From 1 to 100s or 1000s of nodes”
    • “A minimum of 3 nodes is recommended for production environments.”
  • Chooses Consistency and Partition-Tolerance (CP in the CAP theorem)
  • But still high Availability: SLA has 99.999% SLO (five nines) for multi-region
    • Nothing is actually 100%, really
    • Not based on fail-over
  • Pay for provisioned node time (by region/multi-region) plus used storage-time

BigQuery (BQ) – (Multi-Regional) Serverless column-store data warehouse for analytics using SQL

  • Scales⚖️ internally (TB in seconds and PB in minutes)
  • Pay for GBs actually considered (scanned) during queries
    • Attempts to reuse cached results, which are free
  • Pay for data stored (GB-months)
    • Relatively inexpensive
    • Even cheaper when table not modified for 90 days (reading still fine)
  • Pay for GBs added via streaming inserts
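Since you pay per GB scanned, a dry run is a cheap sanity check before a big query; this sketch uses one of Google’s public datasets:

# --dry_run reports how many bytes the query would scan, without running it
bq query --use_legacy_sql=false --dry_run \
'SELECT name, SUM(number) AS total
 FROM `bigquery-public-data.usa_names.usa_1910_2013`
 WHERE state = "TX" GROUP BY name'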

Cloud Datastore – (Regional, Multi-Regional) Managed & autoscaling⚖️ NoSQL DB with indexes, queries, and ACID transaction support

  • No joins or aggregates and must line up with indexes
  • NOT, OR, and NOT EQUALS (<>,!=) operations not natively supported
  • Automatic “built-in” indexes for simple filtering and sorting (ASC, DESC)
  • Manual “composite” indexes for more complicated needs – but beware of them “exploding”
  • Pay for GB-months of storage🗄 used (including indexes)
  • Pay for IO operations (deletes, reads, writes) performed (i.e. no pre-provisioning)

Cloud Bigtable – (Zonal) Low latency & high throughput NoSQL DB for large operational & analytical apps

  • Supports open-source HBase API
  • Integrates with Hadoop, Dataflow, Dataproc
  • Scales⚖️ seamlessly and unlimitedly
    • Storage🗄 autoscales⚖️
    • Processing nodes must be scaled manually
  • Pay for processing node hours
  • GB-hours used for storage 🗄 (cheap HDD or fast🏃‍♂️ SSD)

Firebase Realtime DB & Cloud Firestore 🔥 – (Regional, Multi-Regional) NoSQL document📃 stores with ~real-time client updates via managed WebSockets

  • Firebase DB is single (potentially huge) JSON doc, located only in central US
  • Cloud☁️ Firestore has collection, documents📃, and contained data
  • Free tier (Spark⚡️), flat tier (Flame🔥), or usage-based pricing (Blaze)
    • Realtime DB: Pay for GB/month stored and GB downloaded
    • Firestore: Pay for operations and much less for storage🗄 and transfer

Data Transfer ↔️

Data Transfer Appliance – Rackable, high-capacity storage 🗄 server to physically ship data to GCS

  • Ingest only; not a way to avoid egress charges
  • 100 TB or 480 TB/week is faster than a saturated 6 Gbps link🔗

Storage Transfer Service – (Global) Copies objects for you, so you don’t need to set up a machine to do it

  • Destination is always GCS bucket 🗑
  • Source can be S3, HTTP/HTTPS endpoint, or another GCS bucket 🗑
  • One-time or scheduled recurring transfers
  • Free to use, but you pay for its actions

External Networking

Google Domains – (Global) Google’s registrar for domain names

  • Private Whois records
  • Built-in DNS or custom nameservers
  • Supports DNSSEC
  • Email📧 forwarding with automatic setup of SPF and DKIM (for built-in DNS)

Cloud DNS – (Global) Scalable⚖️, reliable, & managed authoritative Domain Name System (DNS) service

  • 100% uptime guarantee
  • Public and private managed zones
  • Low latency Globally
  • Supports DNSSEC
  • Manage via UI, CLI, or API
  • Pay fixed fee per managed zone to store and distribute DNS records
  • Pay for DNS lookups (i.e. usage)

Static IP Addresses – (Regional, Global🌎) Reserve static IP addresses in projects and assign them to resources

  • Regional IPs used for GCE instances & Network Load Balancers🏋️‍♀️
  • Global IPs used for Global load balancers🏋️‍♀️: HTTP(S), SSL proxy, and TCP proxy
  • Pay for reserved IPs that are not in use, to discourage wasting them

Cloud Load Balancing (CLB) – (Regional, Global🌎) High-perf, Scalable ⚖️ traffic🚦 distribution integrated with autoscaling & Cloud☁️ CDN

  • SDN naturally handles spikes without any prewarming, no instances or devices
  • Regional Network Load Balancer 🏋️‍♀️: health checks, round robin, session affinity
    • Forwarding rules based on IP, protocol (e.g. TCP, UDP), and (optionally) port
  • Global load balancers 🏋️‍♀️ w/ multi-region failover for HTTP(S), SSL proxy, & TCP proxy
    • Prioritize low-latency connection to region near user, then gently fail over in bits
    • Reacts quickly (unlike DNS) to changes in users, traffic🚦, network, health, etc.
  • Pay by making ingress traffic🚦 billable (Cheaper than egress) plus hourly per rule

Cloud CDN – (Global) Low-latency content delivery based on HTTP(S) CLB integrated w/ GCE & GCS

  • Supports HTTP/2 and HTTPS, but no custom origins (GCP☁️ only)
  • Simple checkbox ✅ on HTTP(S) Load Balancer 🏋️‍♀️ config turns this on
  • On cache miss, pay origin->POP “cache fill” egress charges (cheaper for in-region)
  • Always pay POP->client egress charges, depending on location
  • Pay for HTTP(S) request volume
  • Pay per cache invalidation request (not per Resource⚙️ invalidated)
  • Origin costs (e.g. CLB, GCS) can be much lower because cache hits reduce load

Virtual Private Cloud (VPC) – (Regional, Global) Global IPv4 unicast Software-Defined Network (SDN) for GCP☁️ resources

  • Automatic mode is easy; custom mode gives control
  • Configure subnets (each with a private IP range), routes, firewalls🔥, VPNs, BGP, etc.
  • VPC is Global🌎 and subnets are regional (not zonal)
  • Can be shared across multiple projects in same org and peered with other VPCs
  • Can enable private (internal IP) access to some GCP☁️ services (e.g. BQ, GCS)
  • Free to configure VPC (container)
  • Pay to use certain services (e.g. VPN) and for network egress

Cloud Interconnect – (Regional, Multi-Regional) Options for connecting external networks to Google’s network

  • Private connections to VPC via Cloud VPN or Dedicated/Partner Interconnect
  • Public Google services (incl. GCP) accessible via External Peering (no SLAs)
    • Direct Peering for high volume
    • Carrier Peering via a partner for lower volume
  • Significantly lower egress fees
    • Except Cloud VPN, which remains unchanged

Internal Networking

Cloud Virtual Private Network (VPN) – (Regional) IPsec VPN to connect to VPC via public internet for low-volume data connections

  • For persistent, static connections between gateways (i.e. not for a dynamic client)
    • Peer VPN gateway must have static (unchanging) IP
  • Encrypted 🔐 link🔗 to VPC (as opposed to Dedicated interconnect), into one subnet
  • Supports both static and dynamic routing
  • 99.9% availability SLA
  • Pay per tunnel-hour
  • Normal traffic🚦 charges apply

Dedicated Interconnect – (Regional, Multi-Regional) Direct physical link 🔗 between VPC and on-prem for high-volume data connections

  • VLAN attachment is private connection to VPC in one region: no public GCP☁️ APIs
    • Region chosen from those supported by particular Interconnect Location
  • Links are private but not Encrypted 🔐; can layer your own encryption 🔐
    • Redundancy achieves 99.99% availability: otherwise, 99.9% SLA
  • Pay fee per 10 Gbps link, plus (relatively small) fee per VLAN attachment
  • Pay reduced egress rates from VPC through Dedicated Interconnect

Cloud Router 👮‍♀️ – (Regional) Dynamic routing (BGP) for hybrid networks linking GCP VPCs to external networks

  • Works with Cloud VPN and Dedicated Interconnect
  • Automatically learns subnets in VPC and announces them to on-prem network
  • Without Cloud Router👮‍♀️ you must manage static routes for VPN
    • Changing the IP addresses on either side of VPN requires recreating it
  • Free to set up
  • Pay for usual VPC egress

CDN Interconnect – (Regional, Multi-Regional) Direct, low-latency connectivity to certain CDN providers, with cheaper egress

  • For external CDNs, not Google’s Cloud CDN service
    • Supports Akamai, Cloudflare, Fastly, and more
  • Works for both pull and push cache fills
    • Because it’s for all traffic🚦 with that CDN
  • Contact CDN provider to set up for GCP☁️ project and which regions
  • Free to enable, then pay less for the egress you configured

Machine Learning/AI 🧠

Cloud Machine Learning (ML) Engine – (Regional) Massively Scalable ⚖️ managed service for training ML models & making predictions

  • Enables apps/devs to use TensorFlow on datasets of any size, endless use cases
  • Integrates: GCS/BQ (storage), Cloud Datalab (dev), Cloud Dataflow (preprocessing)
  • Supports online & batch predictions, anywhere: desktop, mobile, own servers
  • HyperTune🎶 automatically tunes 🎶model hyperparameters to avoid manual tweaking
  • Training: Pay per hour depending on chosen cluster capabilities (ML training units)
  • Prediction: Pay per provisioned node-hour plus by prediction request volume made

Cloud Vision API👓 – (Global) Classifies images🖼 into categories, detects objects/faces, & finds/reads printed text

  • Pre-trained ML model to analyze images🖼 and discover their contents
  • Classifies into thousands of categories (e.g., “sailboat”, “lion”, “Eiffel Tower”)
  • Upload images🖼 or point to ones stored in GCS
  • Pay per image, based on detection features requested
    • Higher price for OCR of Full documents📃 and finding similar images🖼 on the web🕸
    • Some features are priced together: Labels🏷 + SafeSearch, ImgProps + Cropping
    • Other features priced individually: Text, Faces, Landmarks, Logos

Cloud Speech API🗣 – (Global) Automatic Speech Recognition (ASR) to turn spoken word audio files into text

  • Pre-trained ML model for recognizing speech in 110+ languages/variants
  • Accepts pre-recorded or real-time audio, & can stream results back in real-time
  • Enables voice command-and-control and transcribing user microphone dictations
  • Handles noisy source audio
  • Optionally filters inappropriate content in some languages
  • Accepts contextual hints: words and names that will likely be spoken
  • Pay per 15 seconds of audio processed

Cloud Natural Language API 💬 – (Global) Analyzes text for sentiment, intent, & content classification, and extracts info

  • Pre-trained ML model for understanding what text means, so you can act on it
  • Excellent with Speech API (audio), Vision API (OCR), & Translation API (or built-ins)
  • Syntax analysis extracts tokens/sentences, parts of speech & dependency trees
  • Entity analysis finds people, places, things, etc., labels🏷 them & links🔗 to Wikipedia
  • Analysis for sentiment (overall) and entity sentiment detect +/- feelings & strength
  • Content classification puts each document📃 into one of 700+ predefined categories
  • Charged per request of 1000 characters, depending on analysis types requested

Cloud Translation API –(Global) Translate text among 100+ languages; optionally auto-detects source language

  • Pre-trained ML model for recognizing and translating semantics, not just syntax
  • Can let people support multi-regional clients in non-native languages, 2-way
  • Combine with Speech, Vision, & Natural Language APIs for powerful workflows
  • Send plain text or HTML and receive translation in kind
  • Pay per character processed for translation
  • Also pay per character for language auto-detection

Dialogflow – (Global) Build conversational interfaces for websites, mobile apps, messaging, IoT devices

  • Pre-trained ML model and service for accepting, parsing, lexing input & responding
  • Enables useful chatbot and other natural user interactions with your custom code
  • Train it to identify custom entity types by providing a small dataset of examples
  • Or choose from 30+ pre-built agents (e.g. car🚙, currency฿, dates) as starting templates
  • Supports many different languages and platforms/devices
  • Free plan has unlimited text interactions and capped voice interactions
  • Paid plan is unlimited but charges per request: more for voice, less for text

Cloud Video Intelligence API 📹 – (Regional, Global) Annotates videos in GCS (or directly uploaded) with info about what they contain

  • Pre-trained ML model for video scene analysis and subject identification
  • Enables you to search a video catalog the same way you search text documents📃
  • “Specify a region where processing will take place (for regulatory compliance)”
  • Label Detection: Detect entities within the video, such as “dog” 🐕, “flower” 🌷 or “car”🚙
  • Shot Change Detection: Detect scene changes within the video🎞
  • SafeSearch Detection: Detect adult content within the video🎞
  • Pay per minute of video🎞 processed, depending on requested detection modes

Cloud Job Discovery– (Global) Helps career sites, company job boards, etc. to improve engagement & conversion

  • Pre-trained ML model to help job seekers search job posting databases
  • Most job sites rely on keyword search to retrieve content, which often omits relevant jobs and overwhelms the job seeker with irrelevant ones. “For example, a keyword search with any spelling error returns 0 results, and a keyword search for ‘dental assistant’ returns any ‘assistant’ role that offers dental benefits.”
  • Integrates with many job/hiring systems
  • Lots of features, such as commute distance and recognizing abbreviations/jargon
  • “Show me jobs with a 30-minute commute on public transportation from my home”

Big Data and IoT

            Four Different Stages:

  1. Ingest – Pull in all the raw data
  2. Store – Store the data without loss and with easy retrieval
  3. Process – Transform the raw data into actionable information
  4. Explore & Visualize – Turn the results of the analysis into something valuable for your business

Cloud Internet of Things (IoT) Core – (Global) Fully managed service to connect, manage, and ingest data from devices Globally🌎

  • Device Manager handles device identity, authentication, config & control
  • Protocol Bridge publishes incoming telemetry to Cloud☁️ Pub/Sub for processing
  • Connect securely using IoT industry standard MQTT or HTTPS protocols
  • CA signed certificates can be used to verify device ownership on first connect
  • Two-way device communication enables configuration & firmware updates
  • Device shadows enable querying & making control changes while devices offline
  • Pay per MB of data exchanged with devices; no per-device charge

Cloud Pub/Sub– (Global) Infinitely Scalable⚖️ at-least-once messaging for ingestion, decoupling, etc.

  • “Global🌎 by default: Publish… and consume from anywhere, with consistent latency”.
  • Messages can be up to 10 MB and undelivered ones stored for 7 days – but no DLQ
  • Push mode delivers to HTTPS endpoints & succeeds on HTTP success status code
    • “Slow-start” algorithm ramps up on success and backs off & retries, on failures
  • Pull mode delivers messages to requesting clients and waits for ACK to delete
    • Lets clients set rate of consumption, and supports batching and long-polling
  • Pay for data volume
    • Min 1KB per publish/Push/Pull request (not by message)
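A quick end-to-end sketch with gcloud (topic and subscription names are made up):

gcloud pubsub topics create my-topic
gcloud pubsub subscriptions create my-sub --topic my-topic
gcloud pubsub topics publish my-topic --message "hello"
# pull mode: the client requests messages and ACKs them
gcloud pubsub subscriptions pull my-sub --auto-ack --limit 1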

Cloud Dataprep– (Global) Visually explore, clean, and prepare data for analysis without running 🏃‍♂️ servers

  • “Data Wrangling” (i.e. “ad-hoc ETL”) for business analysts, not IT pros
    • Who might otherwise spend 80% of their time cleaning data
  • Managed version of Trifacta Wrangler – and managed by Trifacta, not Google
  • Source data from GCS, BQ, or file upload – formatted in CSV, JSON, or relational
  • Automatically detects schemas, datatypes, possible joins, and various anomalies
  • Pay for underlying Dataflow job, plus management overhead charge
  • Pay for other accessed services (e.g. GCS, BQ)

Cloud Dataproc– (Zonal) Batch MapReduce processing via configurable, managed Spark & Hadoop clusters

  • Handles being told to scale (adding or removing nodes) even while running 🏃‍♂️ jobs
  • Integrated with Cloud☁️ Storage, BigQuery, Bigtable, and some Stackdriver services
  • “Image versioning” switches between versions of Spark, Hadoop, & other tools
  • Pay directly for underlying GCE servers used in the cluster – optionally preemptible
  • Pay a Cloud Dataproc management fee per vCPU-hour in the cluster
  • Best for moving existing Spark/Hadoop setups to GCP☁️
    • Prefer Cloud Dataflow for new data processing pipelines – “Go with the flow”

Cloud Datalab 🧪– (Regional) Interactive tool 🔧 for data exploration🔎, analysis, visualization📊 and machine learning

  • Uses Jupyter Notebook📒
    • “[A]n open-source web🕸 application that allows you to create and share documents📃 that contain live code, equations, visualizations and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.”
  • Supports iterative development of data analysis algorithms in Python🐍/ SQL/~JS
  • Pay for GCE/GAE instance hosting and storing (on PD) your notebook📒
  • Pay for any other resources accessed (e.g. BigQuery)

Cloud Data Studio– (Global) Big Data Visualization📊 tool 🔧 for dashboards and reporting

  • Meaningful data stories/presentations enable better business decision making
  • Data sources include BigQuery, Cloud SQL, other MySQL, Google Sheets, Google Analytics, Analytics 360, AdWords, DoubleClick, & YouTube channels
  • Visualizations include time series, bar charts, pie charts, tables, heat maps, geo maps, scorecards, scatter charts, bullet charts, & area charts
  • Templates for quick start; customization options for impactful finish
  • Familiar G Suite sharing and real-time collaboration

Cloud Genomics 🧬– (Global) Store and process genomes🧬 and related experiments

  • Query complete genomic🧬 information of large research projects in seconds
  • Process many genomes🧬 and experiments in parallel
  • Open Industry standards (e.g. From Global🌎 Alliance for Genomics🧬 and Health)
  • Supports “Requester Pays” sharing

Identity and Access – Core Security🔒

Roles– (Global) collections of Permissions to use or manage GCP☁️ resources

  • Permissions allow you to perform certain actions: Service.Resource.Verb
  • Primitive Roles: Owner, Editor, Viewer
    • Viewer is read-only; Editor can change things; Owner can control access & billing
    • Pre-date IAM service, may still be useful (e.g., dev/test envs), but often too broad
  • Predefined Roles: Give granular access to specific GCP☁️ resources (IAM)
    • E.g.: roles/bigquery.dataEditor, roles/pubsub.subscriber
  • Custom Roles: Project- or Org-level collections you define of granular permissions

Cloud Identity and Access Management (IAM)– (Global) Control access to GCP☁️ resources: authorization, not really authentication/identity

  • Member is user, group, domain, service account, or the public (e.g. “allUsers”)
    • Individual Google account, Google group, G Suite/ Cloud Identity domain
    • Service account belongs to application/instance, not individual end user
    • Every identity has a unique e-mail address, including service accounts
  • Policies bind Members to Roles at a hierarchy👑 level: Org, Folder📂, Project, Resource⚙️
    • Answer: Who can do what to which thing(s)?
  • IAM is free; pay for authorized GCP☁️ service usage

Service Accounts – (Global) Special type of Google account that represents an application, not an end user

  • Can be “assumed” by applications or individual users (when so authorized)
  • “Important: For almost all cases, whether you are developing locally or in a production application, you should use service accounts, rather than user accounts or API keys🔑.”
  • Consider resources and permissions required by application; use least privilege
  • Can generate and download private keys🔑 (user-managed keys🔑), for non-GCP☁️
  • Cloud-Platform-managed keys🔑 are better, for GCP☁️ (i.e. GCF, GAE, GCE, and GKE)
    • No direct downloading: Google manages private keys🔑 & rotates them once a day
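A sketch of creating one and granting it a narrowly scoped role (the project, account name, and role are illustrative; only generate a user-managed key if you really must run outside GCP):

gcloud iam service-accounts create app-sa --display-name "App service account"
gcloud projects add-iam-policy-binding my-project-id \
  --member serviceAccount:app-sa@my-project-id.iam.gserviceaccount.com \
  --role roles/storage.objectViewer
# user-managed key for non-GCP environments (avoid when possible)
gcloud iam service-accounts keys create key.json \
  --iam-account app-sa@my-project-id.iam.gserviceaccount.com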

Cloud Identity– (Global) Identity as a Service (IDaaS, not DaaS) to provision and manage users and groups

  • Free Google Accounts for non-G-Suite users, tied to a verified domain
  • Centrally manage all users in Google Admin console; supports compliance
  • 2-Step verification (2SV/MFA) and enforcement, including security🔒 keys🔑
  • Sync from Active Directory and LDAP directories via Google Cloud☁️ Directory Sync
  • Identities work with other Google services (e.g. Chrome)
  • Identities can be used to SSO with other apps via OIDC, SAML, OAuth2
  • Cloud Identity is free; pay for authorized GCP☁️ service usage

Security Key Enforcement– (Global) USB or Bluetooth 2-step verification device that prevents phishing🎣

  • Not like just getting a code via email📧 or text message…
  • Eliminates man-in-the-middle (MITM) attacks against GCP☁️ credentials

Cloud Resource Manager– (Global) Centrally manage & secure organization’s projects with custom Folder📂 hierarchy👑

  • Organization Resource⚙️ is root node in hierarchy👑, folders📂 per your business needs
  • Tied 1:1 to a Cloud Identity / G Suite domain, then owns all newly created projects
  • Without this organization, specific identities (people) must own GCP☁️ projects
  • “Recycle bin” allows undeleting projects
  • Define custom IAM policies at organization, Folder📂, or project levels
  • No charge for this service

Cloud Identity-Aware Proxy (IAP)– (Global) Guards apps running 🏃‍♂️ on GCP☁️ via identity verification, not VPN access

  • Based on CLB & IAM, and only passes authed requests through
  • Grant access to any IAM identities, incl. group & service accounts
  • Relatively straightforward to set up
  • Pay for load balancing🏋️‍♀️ / protocol forwarding rules and traffic🚦

Cloud Audit Logging– (Global) “Who did what, where and when?” within GCP☁️ projects

  • Maintains non-tamperable audit logs for each project and organization:
    • Admin Activity and System Events (400-day retention)
    • Access Transparency (400-day retention)
      • Shows actions by Google support staff
    • Data Access (30-day retention)
      • For GCP-visible services (e.g. can’t see into MySQL DB on GCE)
  • Data Access logs priced through Stackdriver Logging; rest are free

Security Management – Monitoring🎛 and Response

Cloud Armor🛡 – (Global) Edge-level protection from DDoS & other attacks on Global🌎 HTTP(S) LB🏋️‍♀️

  • Offload work: Blocked attacks never reach your systems
  • Monitor: Detailed request-level logs available in Stackdriver Logging
  • Manage IPs with CIDR-based allow/block lists (aka whitelist/blacklist)
  • More intelligent rules forthcoming (e.g. XSS, SQLi, geo-based🌎, custom)
  • Preview effect of changes before making them live
  • Pay per policy and rule configured, plus for incoming request volume

Cloud Security Scanner– (Global) Free but limited GAE app vulnerability scanner with “very low false positive rates”

  • “After you set up a scan, Cloud☁️ Security🔒 Scanner automatically crawls your application, following all links🔗 within the scope of your starting URLs, and attempts to exercise as many user inputs and event handlers as possible.”
  • Can identify:
    • Cross-site-scripting (XSS)
    • Flash🔦 injection💉
    • Mixed content (HTTP in HTTPS)
    • Outdated/insecure libraries📚

Cloud Data Loss Prevention API (DLP) – (Global) Finds and optionally redacts sensitive info in unstructured data streams

  • Helps you minimize what you collect, expose, or copy to other systems
  • 50+ sensitive data detectors, including credit card numbers, names, social security🔒 numbers, passport numbers, driver’s license numbers (US and some other jurisdictions), phone numbers, and other personally identifiable information (PII)
  • Data can be sent directly, or API can be pointed at GCS, BQ, or Cloud☁️ DataStore
  • Can scan both text and images🖼
  • Pay for amount of data processed (per GB) – and it gets cheaper at large volumes
    • Pricing for storage 🗄 now very simple (June 2019), but for streaming is still a mess

Event Threat Detection (ETD)– (Global) Automatically scans your Stackdriver logs for suspicious activity

  • Uses industry-leading threat intelligence, including Google Safe Browsing
  • Quickly detects many possible threats, including:
    • Malware, crypto-mining, outgoing DDoS attacks, port scanning, brute-force SSH
    • Also: Unauthorized access to GCP☁️ resources via abusive IAM access
  • Can export parsed logs to BigQuery for forensic analysis
  • Integrates with SIEMs like Google’s Cloud☁️ SCC or via Cloud Pub/Sub
  • No charge for ETD, but charged for its usage of other GCP☁️ services (like SD Logging)

Cloud Security Command Center (SCC) – (Global)

  • “Comprehensive security🔒 management and data risk platform for GCP☁️”
  • Security🔒 Information and Event Management (SIEM) software
  • “Helps you prevent, detect & respond to threats from a single pane of glass”
  • Use “Security🔒 Marks” (aka “marks”) to group, track, and manage resources
  • Integrate ETD, Cloud☁️ Scanner, DLP, & many external security🔒 finding sources
  • Can alert 🔔 to humans & systems; can export data to external SIEM
  • Free! But charged for services used (e.g. DLP API, if configured)
  • Could also be charged for excessive uploads of external findings

Encryption Key Management 🔐

Cloud Key Management Services (KMS)– (Regional, Multi-Regional, Global) Low-latency service to manage and use cryptographic keys🔑

  • Supports symmetric (e.g. AES) and asymmetric (e.g. RSA, EC) algorithms
  • Move secrets out of code (and the like) and into the environment, in a secure way
  • Integrated with IAM & Cloud☁️ Audit Logging to authorize & track key🔑 usage
  • Rotate keys🔑 used for new encryption 🔐 either automatically or on demand
    • Still keeps old active key🔑 versions, to allow decrypting
  • Key🔑 deletion has 24-hour delay, “to prevent accidental or malicious data loss”
  • Pay for active key🔑 versions stored over time
  • Pay for key🔑 use operations (i.e. encrypt/decrypt; admin operation are free)
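A minimal sketch of the key lifecycle and an encrypt/decrypt round trip (key ring and key names are made up):

gcloud kms keyrings create my-keyring --location global
gcloud kms keys create my-key --keyring my-keyring \
  --location global --purpose encryption
gcloud kms encrypt --keyring my-keyring --key my-key --location global \
  --plaintext-file secret.txt --ciphertext-file secret.enc
gcloud kms decrypt --keyring my-keyring --key my-key --location global \
  --ciphertext-file secret.enc --plaintext-file -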

Cloud Hardware Security Module (HSM)– (Regional, Multi-Regional, Global) Cloud KMS keys🔑 managed by FIPS 140-2 Level 3 certified HSMs

  • Device hosts encryption 🔐 keys🔑 and performs cryptographic operations
  • Enables you to meet compliance that mandates hardware environment
  • Fully integrated with Cloud☁️ KMS
    • Same API, features, IAM integration
  • Priced like Cloud KMS: Active key🔑 versions stored & key🔑 operations
    • But some key🔑 types more expensive: RSA, EC, Long AES

Operations and Management

Google Stackdriver– (Global) Family of services for monitoring, logging & diagnosing apps on GCP/AWS/hybrid

  • Service integrations add lots of value – among Stackdriver and with GCP☁️
  • One Stackdriver account can track multiple:
    • GCP☁️ projects
    • AWS☁️ accounts
    • Other resources
  • Simple usage-based pricing
    • No longer previous system of tiers, allotments, and overages

Stackdriver Monitoring– (Global) Gives visibility into perf, uptime, & overall health of Cloud☁️ apps (based on collectd)

  • Includes built-in/custom metrics, dashboards, Global🌎 uptime monitoring, & alerts
  • Follow the trail: Links🔗 from alerts to dashboards/charts to logs to traces
  • Cross-Cloud☁️: GCP☁️, of course, but monitoring🎛 agent also supports AWS
  • Alerting policy config includes multi-condition rules & Resource⚙️ organization
  • Alert 🔔 via email, GCP☁️ Mobile App, SMS, Slack, PagerDuty, AWS SNS, webhook, etc.
  • Automatic GCP☁️/Anthos metrics always free
  • Pay for API calls & per MB for custom or AWS metrics

Stackdriver Logging – (Global) Store, search🔎, analyze, monitor, and alert 🔔 on log data & events (based on Fluentd)

  • Collection built into some GCP☁️, AWS support with agent, or custom send via API
  • Debug issues via integration with Stackdriver Monitoring, Trace & Error Reporting
  • Create real-time metrics from log data, then alert 🔔 or chart them on dashboards
  • Send real-time log data to BigQuery for advanced analytics and SQL-like querying
  • Powerful interface to browse, search, and slice log data
  • Export log data to GCS to cost-effectively store log archives
  • Pay per GB ingested & stored for one month, but first 50GB/project free

Stackdriver Error Reporting– (Global) Counts, analyzes, aggregates, & tracks crashes in helpful centralized interface

  • Smartly aggregates errors into meaningful groups tailored to language/framework
  • Instantly alerts when a new app error cannot be grouped with existing ones
  • Link🔗 directly from notifications to error details:
    • Time chart, occurrences, affected user count, first/last seen dates, cleaned stack
  • Exception stack trace parser knows Java☕️, Python🐍, JavaScript, Ruby💎, C#, PHP, & Go🟢
  • Jump from stack frames to source to start debugging
  • No direct charge; pay for source data in Stackdriver Logging

Stackdriver Trace– (Global) Tracks and displays call tree 🌳 & timings across distributed systems, to debug perf

  • Automatically captures traces from Google App Engine
  • Trace API and SDKs for Java, Node.js, Ruby, and Go capture traces from anywhere
  • Zipkin collector allows Zipkin tracers to submit data to Stackdriver Trace
  • View aggregate app latency info or dig into individual traces to debug problems
  • Generate reports on demand and get daily auto reports per traced app
  • Detects app latency shift (degradation) over time by evaluating perf reports
  • Pay for ingesting and retrieving trace spans

Stackdriver Debugger– (Global) Grabs program state (callstack, variables, expressions) in live deploys, low impact

  • Logpoints repeat for up to 24h; fuller snapshots run once but can be conditional
  • Source view supports Cloud Source Repository, Github, Bitbucket, local, & upload
  • Java☕️ and Python🐍 supported on GCE, GKE, and GAE (Standard and Flex)
  • Node.js and Ruby💎supported on GCE, GKE, and GAE Flex; Go only on GCE & GKE
  • Automatically enabled for Google App Engine apps, agents available for others
  • Share debugging sessions with others (just send URL)

Stackdriver Profiler– (Global) Continuous CPU and memory profiling to improve perf & reduce cost

  • Low overhead (typical: 0.5%; Max: 5%) – so use it in prod, too!
  • Supports Go, Java, Node.js, and Python (3.2+)
  • Agent-based
  • Saves profiles for 30 days
  • Can download profiles for longer-term storage

Cloud Deployment Manager– (Global) Create/manage resources via declarative templates: “Infrastructure as Code”

  • Declarative allows automatic parallelization
  • Templates written in YAML, Python🐍, or Jinja2
  • Supports input and output parameters, with JSON schema
  • Create and update of deployments both support preview
  • Free service: Just pay for resources involved in deployments
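A tiny sketch of a deployment config and the preview-then-commit flow (all names, the zone, and the image are placeholders):

cat > vm.yaml <<EOF
resources:
- name: my-vm
  type: compute.v1.instance
  properties:
    zone: us-central1-a
    machineType: zones/us-central1-a/machineTypes/f1-micro
    disks:
    - deviceName: boot
      boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-9
    networkInterfaces:
    - network: global/networks/default
EOF
gcloud deployment-manager deployments create my-deployment --config vm.yaml --preview
gcloud deployment-manager deployments update my-deployment   # commit the previewed changes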

Cloud Billing API 🧾– (Global) Programmatically manage billing for GCP☁️ projects and get GCP☁️ pricing

  • Billing 🧾 Config
    • List billing🧾 accounts; get details and associated projects for each
    • Enable (associate), disable (disassociate), or change project’s billing account
  • Pricing
    • List billable SKUs; get public pricing (including tiers) for each
    • Get SKU metadata like regional availability
  • Export of current bill to GCS or BQ is possible – but configured via console, not API

Development and APIs

Cloud Source Repositories – (Global) Hosted private Git repositories, with integrations to GCP☁️ and other hosted repos

  • Support standard Git functionality
  • No enhanced workflow support like pull requests
  • Can set up automatic sync from GitHub or Bitbucket
  • Natural integration with Stackdriver debugger for live-debugging deployed apps
  • Pay per project-user active each month (not prorated)
  • Pay per GB-month of data storage 🗄(prorated), Pay per GB of Data egress

Cloud Build 🏗 – (Global) Continuously takes source code and builds, tests and deploys it – CI/CD service

  • Trigger from Cloud Source Repository (by branch, tag or commit) or zip🤐 in GCS
    • Can trigger from GitHub and Bitbucket via Cloud☁️ Source Repositories RepoSync
  • Runs many builds in parallel (currently 10 at a time)
  • Dockerfile: super-simple build+push – plus scans for package vulnerabilities
  • JSON/YAML file: Flexible🧘‍♀️ & Parallel Steps
  • Push to GCR & export artifacts to GCS – or anywhere your build steps write
  • Maintains build logs and build history
  • Pay per minute of build time – but free tier is 120 minutes per day
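A hedged sketch of a Dockerfile-driven build config and its submission (the image name is made up; $PROJECT_ID is substituted by Cloud Build itself):

cat > cloudbuild.yaml <<'EOF'
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/hello-app:1.0', '.']
images:
- 'gcr.io/$PROJECT_ID/hello-app:1.0'
EOF
gcloud builds submit --config cloudbuild.yaml .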

Container Registry (GCR) 📦– (Regional, Multi-Regional) Fast🏃‍♂️, private Docker🐳 image storage 🗄 (based on GCS) with Docker🐳 V2 Registry API

  • Creates & manages a multi-regional GCS bucket 🗑, then translates GCR calls to GCS
  • IAM integration simplifies builds and deployments within GCP☁️
  • Quick deploys because of GCP☁️ networking to GCS
  • Directly compatible with standard Docker🐳 CLI; native Docker🐳 Login Support
  • UX integrated with Cloud☁️ Build & Stackdriver Logs
  • UI to manage tags and search for images🖼
  • Pay directly for storage 🗄and egress of underlying GCS (no overhead)

Cloud Endpoints – (Global) Handles authorization, monitoring, logging, & API keys🔑 for APIs backed by GCP☁️

  • Proxy instances are distributed and hook into Cloud Load Balancer 🏋️‍♀️
  • Super-fast🏃‍♂️ Extensible Service Proxy (ESP) container based on nginx: <1 ms /call
  • Uses JWTs and integrates with Firebase 🔥, Auth0, & Google Auth
  • Integrates with Stackdriver Logging and Stackdriver Trace
  • Extensible Service Proxy (ESP) can transcode HTTP/JSON to gRPC
    • But API needs to be Resource⚙️-oriented (i.e. RESTful)
  • Pay per call to your API

Apigee API Platform – (Global) Full-featured & enterprise-scale API management platform for whole API lifecycle

  • Transform calls between different protocols: SOAP, REST, XML, binary, custom
  • Authenticate via OAuth/SAML/LDAP: authorize via Role-Based Access Control
  • Throttle traffic🚦 with quotas, manage API versions, etc.
  • Apigee Sense identifies and alerts administrators to suspicious API behaviors
  • Apigee API Monetization supports various revenue models/rate plans
  • Team and Business tiers are flat monthly rate with API call quotas & feature sets
  • “Enterprise” tier and special feature pricing are “Contact Sales”

Test Lab for Android – (Global) Cloud☁️ infrastructure for running 🏃‍♂️ test matrix across variety of real Android devices

  • Production-grade devices flashed with Android version and locale you specify
  • Robo🤖 test captures log files, saves annotated screenshots & video to show steps
    • Default completely automatic but still deterministic, so can show regressions
    • Can record custom script
  • Can also run Espresso and UI Automator 2.0 instrumentation tests
  • Firebase Spark and Flame plans have daily allotment of physical and virtual tests
  • Blaze (PAYG) plan charges per device-hour – much less for virtual devices
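As a sketch, a Robo🤖 test can be launched straight from the CLI – the APK path and device spec below are assumptions, so swap in your own (the models list command shows what devices are available):

$gcloud firebase test android models list
$gcloud firebase test android run --type robo --app app-debug.apk --device model=Pixel2,version=28,locale=en,orientation=portrait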

“Well, we all shine☀️ on… Like the moon🌙 and the stars🌟 and the sun🌞

Thanks –

–MCS

Week of October 23rd

Part III of a Cloud️ Journey

“I’m learning to fly ✈️, around the clouds ☁️

Hi All –

Happy National Mole Day! 👨‍🔬👩🏾‍🔬

“Open your eyes 👀, look up to the skies🌤 and see 👓

As we have all learned, Cloud computing ☁️ empowers us all to focus our time 🕰 on dreaming 😴 up and creating the next great scalable ⚖️ applications. In addition, Cloud computing☁️ means we spend less time 🕰 worrying😨 about infrastructure, managing and maintaining deployment environments, or agonizing😰 over security🔒. Google evangelizes these principles more strongly💪 than any other company in the world 🌎.

Google’s strategy for cloud computing☁️ is differentiated by providing open source runtime systems and a high-quality developer experience, where organizations can easily move workloads from one cloud☁️ provider to another.

Once again, this past week we continued our exploration of GCP by finishing the last two courses in the Google Cloud Certified Associate Cloud Engineer Path on Pluralsight. Our ultimate goal was to have a better understanding of the various GCP services and features and be able to apply this knowledge to better analyze requirements and evaluate the numerous options available in GCP. Fortunately, we gained this knowledge and a whole lot more! 😊

Guiding us through another great introduction on Elastic Google Cloud Infrastructure: Scaling and Automation were well-known friends Phillip Maier and Mylene Biddle. Then taking us through the rest of the way through this amazing course was the always passionate Priyanka Vergadia.

Then finally taking us down the home stretch 🏇 with Architecting with Google Kubernetes Engine – Foundations (which was the last of this amazing series of Google Goodness 😊) were famous Googlers Evan Jones and Brice Rice.  …And just to put the finishing touches 👐 on this magical 🎩💫 mystery tour was Eoin Carrol, who gave us an in-depth look at Google’s game changer for modernizing existing applications and building cloud-native☁️ apps anywhere with Anthos.

After a familiar introduction by Philip and Mylene we began delving into the comprehensive and flexible 🧘‍♀️ infrastructure and platform services provided by GCP.

“Across the clouds☁️ I see my shadow fly✈️… Out of the corner of my watering💦 eye👁

Interconnecting Networks – There are 5 ways of connecting your infrastructure to GCP:

  1. Cloud VPN
  2. Dedicated interconnect
  3. Partner interconnect
  4. Direct peering
  5. Carrier peering

Cloud VPN – securely connects your on-premises network to your GCP VPC network. In order to connect to your on-premises network via Cloud VPN, you configure a Cloud VPN gateway, a peer (on-premises) VPN gateway, and the VPN tunnels between them.

  • Useful for low-volume connections
  • 99.9% SLA
  • Supports:
    • Site-to-site VPN
    • Static routes
    • Dynamic routes (Cloud Router)
    • IKEv1 and IKEv2 ciphers

Please note: The maximum transmission unit (MTU) for your on-premises VPN gateway cannot be greater than 1460 bytes.

$gcloud compute --project "qwiklabs-gcp-02-9474b560327d" target-vpn-gateways create "vpn-1" --region "us-central1" --network "vpn-network-1"

$gcloud compute --project "qwiklabs-gcp-02-9474b560327d" forwarding-rules create "vpn-1-rule-esp" --region "us-central1" --address "35.192.18.39" --ip-protocol "ESP" --target-vpn-gateway "vpn-1"

$gcloud compute --project "qwiklabs-gcp-02-9474b560327d" forwarding-rules create "vpn-1-rule-udp500" --region "us-central1" --address "35.192.18.39" --ip-protocol "UDP" --ports "500" --target-vpn-gateway "vpn-1"

$gcloud compute --project "qwiklabs-gcp-02-9474b560327d" forwarding-rules create "vpn-1-rule-udp4500" --region "us-central1" --address "35.192.18.39" --ip-protocol "UDP" --ports "4500" --target-vpn-gateway "vpn-1"

# --peer-address and --shared-secret are required when creating a tunnel; the values below are assumed for illustration
$gcloud compute --project "qwiklabs-gcp-02-9474b560327d" vpn-tunnels create "tunnel1to2" --region "us-central1" --peer-address "34.78.144.99" --shared-secret "gcprocks" --ike-version "2" --target-vpn-gateway "vpn-1"

$gcloud compute --project "qwiklabs-gcp-02-9474b560327d" vpn-tunnels create "vpn-1-tunnel-2" --region "us-central1" --peer-address "34.78.144.99" --shared-secret "gcprocks" --ike-version "2" --local-traffic-selector "0.0.0.0/0" --target-vpn-gateway "vpn-1"

Cloud Interconnect and Peering

Dedicated connections provide a direct connection to Google’s network, while shared connections provide a connection to Google’s network through a partner.

Comparison of Interconnect Options

  • IPsec VPN Tunnel – Encrypted tunnel to VPC networks through the public internet
    • Capacity 1.5-3 Gbps/Tunnel
    • Requirements VPN Gateway
    • Access Type – Internal IP Addresses
  • Cloud Interconnect – Dedicated interconnect provides direct physical connections
    • Capacity 10 Gbps/link – 100 Gbps/link
    • Requirements – connection in colocation facility
    • Access Type– Internal IP Addresses
  • Partner Interconnect– provides connectivity through a supported service provider
    • 50 Mbps -10 Gbps/connection
    • Requirements – Service provider
    • Access Type– Internal IP Addresses

Peering

Direct Peering provides a direct connection between your business network and Google.

  • Broad-reaching edge network locations
  • Capacity 10 Gbps/link
  • Exchange BGP routes
  • Reach all of Google’s services
  • Peering requirement (Connection in GCP PoPs)
  • Access Type: Public IP Addresses

Carrier Peering provides connectivity through a supported partner

  • Carrier Peering partner
  • Capacity varies based on parent offering
  • Reach all of Google’s services
  • Partner requirements
  • No SLA
  • Access Type: Public IP Addresses

Choosing the right connection – Decision Tree 🌳

Shared VPC and VPC Peering

Shared VPC allows an organization to connect resources from multiple projects to a common VPC network.

VPC Peering is a decentralized or distributed approach to multi project networking because each VPC network may remain under the control of separate administrator groups and maintains its own global firewall and routing tables.

Load Balancing 🏋️‍♂️ and Autoscaling

Cloud Load Balancing 🏋️‍♂️ – distributes user traffic 🚦across multiple instances of your applications. By spreading the load, load balancing 🏋️‍♂️ reduces the risk that your applications experience performance issues. There are 2 basic categories of Load balancers:  Global load balancing 🏋️‍♂️ and Regional load balancing 🏋️‍♂️.

Global load balancers – when workloads are distributed across the world 🌎 Global load balancers route traffic🚦 to a backend service in the region closest to the user, to reduce latency.

They are software-defined, distributed systems that use the Google Front End (GFE), which resides in Google’s PoPs and is distributed globally.

Types of Global Load Balancers

  • External HTTP and HTTPS (Layer 7)
  • SSL Proxy (Layer 4)
  • TCP Proxy (Layer 4)

Regional load balancers – when all workloads are in the same region

  • Regional load balancing 🏋️‍♂️ routes traffic🚦 within a given region.
  • Regional Load balancing 🏋️‍♂️ uses internal and network load balancers.

Internal load balancers are software-defined, distributed systems (using Andromeda), while network load balancers use the Maglev distributed system.

Types of Regional Load Balancers

  • Internal TCP/UDP (Layer 4)
  • Internal HTTP and HTTPS (Layer 7)
  • TCP/UDP Network (Layer 4)

Managed instance groups – a collection of identical virtual machine instances that you control as a single entity. (Same as creating a VM but applying specific rules to an instance group)

  • Deploy identical instances based on instance template
  • Instance group can be resized
  • Manager ensures all instances are Running
  • Typically used with autoscaling ⚖️
  • Can be single zone or regional

Regional managed instance groups are usually recommended over zonal managed instance groups because they allow you to spread the application’s load across multiple zones through replication and protect against zonal failures.

         Steps to create a Managed Instance Group:

  1. Decide the location and whether the instance group will be single or multi-zone
  2. Choose the ports that you are going to allow load balancing🏋️‍♂️ across
  3. Select an instance template
  4. Decide on autoscaling ⚖️ and the criteria for its use
  5. Create a health ⛑ check to determine instance health and how traffic🚦 should route
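Translated into the CLI, those steps might look like this minimal sketch (template, group, and region names are placeholders; defaults fill in the rest):

$gcloud compute instance-templates create my-template --machine-type n1-standard-1
$gcloud compute instance-groups managed create my-mig --region us-central1 --template my-template --size 3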

Autoscaling and health checks

Managed instance groups offer autoscaling ⚖️ capabilities

  • Dynamically add/remove instances
    • Increases in load
    • Decrease in load
  • Autoscaling policy
    • CPU Utilization
    • Load balancing 🏋️‍♂️capacity
    • Monitoring 🎛 metrics
    • Queue-based workload
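A hedged example of attaching an autoscaling ⚖️ policy to that managed instance group (names and thresholds are assumptions):

$gcloud compute instance-groups managed set-autoscaling my-mig --region us-central1 --min-num-replicas 2 --max-num-replicas 10 --target-cpu-utilization 0.75 --cool-down-period 90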

HTTP/HTTPs load balancing

  • Target HTTP/HTTPS proxy
  • One signed SSL certificate installed (minimum)
  • Client SSL session terminates at the load balancer 🏋️‍♂️
  • Supports the QUIC transport layer protocol
  • Global load balancing🏋️‍♂️
  • Anycast IP address
  • HTTP on port 80 or 8080
  • HTTPS on port 443
  • IPv4 or IPv6
  • Autoscaling⚖️
  • URL maps 🗺

Backend Services

  • Health ⛑ check
  • Session affinity (Optional)
  • Timeout setting (30-sec default)
  • One or more backends
    • An instance group (managed or unmanaged)
    • A balancing mode (CPU utilization or RPS)
    • A capacity scaler ⚖️ (ceiling % of CPU/Rate targets)

SSL certificates

  • Required for HTTP/HTTPS load balancing 🏋️‍♂️
  • Up to 10 SSL certificates /target proxy
  • Create an SSL certificate resource 
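Creating that SSL certificate resource from your own cert and key might look like this (file and resource names assumed):

$gcloud compute ssl-certificates create my-cert --certificate my-cert.pem --private-key my-key.pem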

SSL proxy load balancing – global 🌎 load balancing service for encrypted non-HTTP traffic.

  • Global 🌎 load balancing for encrypted non-HTTP traffic 🚦
  • Terminate SSL session at load balancing🏋️‍♂️ Layer
  • IPv4 or IPv6 clients
  • Benefits:
    • Intelligent routing
    • Certificate management
    • Security🔒 patching
    • SSL policies

TCP proxy load balancing – a global load balancing service for unencrypted non-HTTP traffic.

  • Global load balancing for unencrypted non-HTTP traffic 🚦
  • Terminates TCP session at load balancing🏋️‍♂️ Layer
  • IPv4 or IPv6 clients
  • Benefits:
    • Intelligent routing
    • Security🔒 patching

Network load balancing – is a regional, non-proxied load balancing service.

  • Regional, non-proxied load balancer
  • Forwarding rules (IP protocol data)
  • Traffic:
    • UDP
    • TCP/SSL ports
  • Backends:
    • Instance group
    • Target pool🏊‍♂️ resources – define a group of instances that receive incoming traffic 🚦from forwarding rules:
      • Forwarding rules (TCP and UDP)
      • Up to 50 per project
      • One health check
      • Instances must be in the same region

Internal load balancing – is a regional private load balancing service for TCP and UDP based traffic🚦

  • Regional, private load balancing 🏋️‍♂️
    • VM instances in same region
    • RFC 1918 IP address
  • TCP/UDP traffic 🚦
    • Reduced latency, simpler configuration
    • Software-defined, fully distributed load balancing (is not based on a device or a virtual machine.)

“..clouds☁️ roll by reeling is what they say … or is it just my way?”

Infrastructure Automation – Infrastructure as code (IaC)

Automate repeatable tasks like provisioning, configuration, and deployments for one machine or millions.

Deployment Manager – is an infrastructure deployment service that automates the creation and management of GCP resources. By defining templates, you only have to specify the resources once and then you can reuse them whenever you want.

  • Deployment Manager is an infrastructure automation tool🛠
  • Declarative Language (allows you to specify what the configuration should be and let the system figure out the steps to take)
  • Focus on the application
  • Parallel deployment
  • Template-driven

Deployment manager creates all the resources in parallel.
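Here’s a minimal sketch of a declarative Deployment Manager config and the command to deploy it – the VM name, zone, and image are assumptions, just enough to show the template shape:

# Declare the desired resource once, in YAML
$cat > config.yaml <<EOF
resources:
- name: my-vm
  type: compute.v1.instance
  properties:
    zone: us-central1-a
    machineType: zones/us-central1-a/machineTypes/n1-standard-1
    disks:
    - deviceName: boot
      boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-9
    networkInterfaces:
    - network: global/networks/default
EOF

# Deployment Manager figures out the steps (and runs them in parallel)
$gcloud deployment-manager deployments create my-deployment --config config.yaml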

Additional infrastructure as code tools🛠 for GCP

  • Terraform 🌱
  • Chef 👨‍🍳
  • Puppet
  • Ansible
  • Packer 📦

It’s recommended that you provision and manage resources on GCP with the tools🛠 you are already familiar with.

GCP Marketplace 🛒 – lets you quickly deploy functional software packages that run 🏃‍♂️ on GCP.

  • Deploy production-grade solutions
  • Single bill for GCP and third-party services
  • Manage solutions using Deployment Manager
  • Notification 📩 when a security update is available
  • Direct access to partner support

“… Must be the clouds☁️☁️ in my eyes 👀

Managed Services – automates common activities, such as change requests, monitoring 🎛, patch management, security🔒, and backup services, and provides full lifecycle 🔄 services to provision, run🏃‍♂️, and support your infrastructure.

BigQuery is GCP serverless, highly scalable⚖️, and cost-effective cloud Data warehouse

  • Fully managed
  • Petabyte scale
  • SQL interface
  • Very fast🏃‍♂️
  • Free usage tier
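Since BigQuery exposes a SQL interface through the bq CLI, here’s a quick taste – this query runs against a real public dataset, so it should work as-is from Cloud Shell:

$bq query --use_legacy_sql=false 'SELECT name, SUM(number) AS total FROM `bigquery-public-data.usa_names.usa_1910_2013` GROUP BY name ORDER BY total DESC LIMIT 5'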

Cloud Dataflow🚰 executes a wide variety of data processing patterns

  • Server-less, fully managed data processing
  • Batch and stream processing with autoscale ⚖️
  • Open source programming using Apache Beam 🎇
  • Intelligently scale to millions of QPS

Cloud Dataprep – visually explore, clean, and prepare data for analysis and machine learning

  • Serverless, works at any scale⚖️
  • Suggest ideal analysis
  • Integrated partner service operated by Trifacta

Cloud Dataproc – is a service for running Apache Spark 💥 and Apache Hadoop 🐘 clusters

  • Low cost (per-second, preemptible)
  • Super-fast🏃‍♂️ to start, scale ⚖️, and shut down
  • Integrated with GCP
  • Managed Service
  • Simple and familiar
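A hedged end-to-end sketch – spin a cluster up, submit the classic SparkPi 💥 example, and shut it down (cluster name and region are placeholders):

$gcloud dataproc clusters create my-cluster --region us-central1 --num-workers 2
$gcloud dataproc jobs submit spark --cluster my-cluster --region us-central1 --class org.apache.spark.examples.SparkPi --jars file:///usr/lib/spark/examples/jars/spark-examples.jar -- 1000
$gcloud dataproc clusters delete my-cluster --region us-central1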

“Captain Jack will get you by tonight 🌃… Just a little push, and you’ll be smilin’😊 “

Architecting with Google Kubernetes Engine – Foundations

After a stellar🌟 job by Priyanka taking us through the load balancing 🏋️‍♂️ options, infrastructure as code, and some of the managed service options in GCP, it was time to take the helm⛵️ and get our K8s ☸️ hats 🧢 on.

Cloud Computing and Google Cloud

Just to whet our appetite 😋 for Cloud Computing ☁️, Evan takes us through 5 fundamental attributes:

  1. On-demand self-services (No🚫 human intervention needed to get resources)
  2. Broad network access (Access from anywhere)
  3. Resource Pooling🏊‍♂️ (Provider shares resources to customers)
  4. Rapid Elasticity 🧘‍♀️ (Get more resources quickly as needed)
  5. Measured 📏Service (Pay only for what you consume)

Next, Evan introduced some of the GCP Services under Compute like Compute Engine, Google Kubernetes Engine (GKE), App Engine, and Cloud Functions. He then discussed some of Google’s managed services for Storage, Big Data, and Machine Learning.

Resource Management

Network

  • GCP provides resources in multi-regions, regions, and zones.
  • GCP divides the world 🌎 up into 3 multi-regional areas: the Americas, Europe, and Asia Pacific.
  • The 3 multi-regional areas are divided into regions, which are independent geographic areas on the same continent.
  • Regions are divided into zones (like a data center), which are deployment areas for GCP resources.

The network interconnect with the public internet at more than 90 internet exchanges and more than 100 points of presence worldwide🌎 (and growing)

  • Zonal resources operate exclusively in a single zone
  • Regional resources span multiple zones
  • Global resources could be managed across multiple regions.
  • Resources have hierarchy
    • Organization is the root node of a GCP resource hierarchy
    • Folders reflect their hierarchy of your enterprise
    • Projects are identified by unique Project ID and Project Number
  • Cloud Identity Access Management (IAM) allows you to fine tune access controls on all resources in GCP.

Billing

  • Billing accounts pay for project resources
  • A billing account is linked to one or more projects
  • Charged automatically or invoiced every month or at threshold limit
  • Subaccounts can be used for separate billing for projects

How to keep billing under control

  1. Budgets and alerts 🔔
  2. Billing export 🧾
  3. Reports 📊
  4. Quotas are helpful limits
    • Quotas apply at the level of the GCP Project
    • There are two types of quotas:
      • Rate quotas reset after a specific time 🕰
      • Allocation quotas govern the number of resources in projects

GCP implements quotas, which limit unforeseen extra billing charges. Quotas are designed to prevent the overconsumption of resources because of an error or a malicious attack 👿.

Interacting with GCP -There are 4 ways to interact with GCP:

  1. Cloud Console
  • Web-based GUI to manage all Google Cloud resources
  • Executes common tasks using simple mouse clicks
  • Provides visibility into GCP projects and resources

2. SDK

3. Cloud Shell

  • gcloud
  • kubectl
  • gsutil
  • bq
  • cbt
  • Temporary Compute Engine VM
  • Command-line access to the instance through a browser
  • 5 GB of persistent disk storage ($HOME dir)
  • Preinstalled Cloud SDK and other tools🛠
    • gcloud: for working with Compute Engine, Google Kubernetes Engine (GKE) and many Google Cloud services
    • gsutil: for working with Cloud Storage
    • kubectl: for working with GKE and Kubernetes
    • bq: for working with BigQuery
  • Language support for Java ☕️, Go, Python 🐍 , Node.js, PHP, and Ruby♦️
  • Web 🕸preview functionality
  • Built-in authorization for access to resources and instances

4. Console mobile App

“Cloud☁️ hands🤲 reaching from a rainbow🌈 tapping at the window, touch your hair”

Introduction to Containers

Next Evan took us through a history of computing. First, starting with deploying applications on physical servers. This solution wasted resources and took a lot of time to deploy, maintain, and scale. It also wasn’t very portable. Applications were built for a specific operating system, and sometimes even for specific hardware as well.

Next, transitioning to Virtualization. Virtualization made it possible to run multiple virtual servers and operating systems on the same physical computer. A hypervisor is the software layer that removes the dependency of an operating system on its underlying hardware. It allows several virtual machines to share that same hardware.

  • Hypervisors create and manage virtual machines
  • Running multiple apps on a single VM
  • VM-centric way to solve this problem (run a dedicated virtual machine for each application.)

Finally, Evan introduced us to containers, as they solve a lot of the shortcomings of Virtualization like:

  • Applications that share dependencies are not isolated from each other
  • Resource requirements from one application can starve out another application
  • A dependency upgrade for one application might cause another to simply stop working

Containers are isolated user spaces for running application code. Containers are lightweight as they don’t carry a full operating system. They could be scheduled or packed 📦tightly onto the underlying system, which makes them very efficient.

Containerization is the next step in the evolution of managing code.

Benefits of Containers:

  • Containers appeal to developers 👩🏽‍💻
  • Deliver high performing and scalable ⚖️ applications.
  • Containers run the same anywhere
  • Containers make it easier to build applications that use Microservices design pattern
    • Microservices
      • Highly maintainable and testable.
      • Loosely coupled. Independently deployable. Organized around business capabilities.

Containers and Container Images

An Image is an application and its dependencies

A container is simply a running 🏃‍♂️ instance of an image.

Docker 🐳 is an open source technology that allows you to create and run 🏃‍♂️ applications in containers, but it doesn’t offer a way to orchestrate those applications at scale ⚖️

  • Containers use a varied set of Linux technologies
    • Containers use Linux namespaces to control what an application can see
    • Containers use Linux cgroups to control what an application can use
    • Containers use union file systems to efficiently encapsulate applications and their dependencies into a set of clean, minimal layers.
  • Containers are structured in layers
    • Container manifest – the file of instructions that a build tool🛠 reads to build the image
    • For Docker 🐳, that container image manifest is the Dockerfile
      • Each instruction in the Dockerfile specifies a (read-only) layer inside the container image
      • A writable, ephemeral top layer is added when the container runs
  • Containers promote smaller shared images

How to get containers?

  • Download containerized software from a container registry gcr.io
  • Docker 🐳 – Build your own container using the open-source docker 🐳 command
  • Build your own container using Cloud Build
    • Cloud Build is a service that executes your builds on GCP.
    • Cloud Build can import source code from
      • Google Cloud Storage
      • Cloud Source Repositories
      • GitHub,
      • Bitbucket

Introduction to Kubernetes ☸️

Kubernetes ☸️ is an open source platform that helps you orchestrate and manage your container infrastructure on premises or in the cloud☁️.

It’s a container centric management environment. Google originated it and then donated it to the open source community.

K8s ☸️ automates the deployment, scaling ⚖️, load balancing🏋️‍♂️, logging, monitoring 🎛 and other management features of containerized applications.

  • Facilitates the features of an infrastructure as a service
  • Supports Declarative configurations.
  • Allows Imperative configuration
  • Open Source

K8s ☸️ features:

  • Supports both stateful and stateless applications
  • Autoscaling⚖️
  • Resource limits
  • Extensibility

Kubernetes also supports workload portability across on premises or multiple cloud service providers. This allows Kubernetes to be deployed anywhere. You could move Kubernetes ☸️ workloads freely without vendor lock🔒 in

Google Kubernetes Engine (GKE)

GKE easily deploys, manages and scales⚖️ Kubernetes environments for your containerized applications on GCP.

GKE Features:

  • Fully managed
    • Clusters – your containerized applications all run on top of a cluster in GKE
    • Nodes are the virtual machines that host your containers inside of a GKE app cluster
  • Container-optimized OS
  • Auto upgrade
  • Auto repair🛠
  • Cluster Scaling⚖️
  • Seamless Integration
  • Identity and access management (IAM)
  • Integrated logging and monitoring (Stackdriver)
  • Integrated networking
  • Cloud Console

Compute Options Detail

 Compute Engine

  • Fully customizable virtual machines
  • Persistent disks and optional local SSDs
  • Global load balancing and autoscaling⚖️
  • Per-second billing

                  Use Cases

  • Complete control over the OS and virtual hardware
    • Well suited for lift-and-shift migrations to the cloud
    • Most flexible 🧘‍♀️ compute solution, often used when a managed solution is too restrictive

  App Engine

  • Provides a fully managed code-first platform
  • Streamlines application deployment and scalability⚖️
  • Provides support for popular programming languages and application runtimes
  • Supports integrated monitoring 🎛, logging and diagnostics.

Use Cases

  • Websites
    • Mobile app📱 and gaming backends
    • Restful APIs

Google Kubernetes Engine

  • Fully managed Kubernetes Platform
  • Supports cluster scaling ⚖️, persistent disk, automated upgrades, and auto node repairs
  • Built-in integration with GCP
  • Portability across multiple environments
    • Hybrid computing
    • Multi-cloud computing

Use Cases

  • Containerized applications
    • Cloud-native distributed systems
    • Hybrid applications

Cloud Run

  • Enable stateless containers
  • Abstracts away infrastructure management
  • Automatically scales ⚖️ up⬆️ and down⬇️
  • Open API and runtime environment

Use Cases

  • Deploy stateless containers that listen for requests or events
    • Build applications in any language using any frameworks and tools🛠

Cloud Functions

  • Event-driven, serverless compute services
  • Automatic scaling with highly available and fault-tolerant design
  • Charges apply only when your code runs 🏃‍♂️
  • Triggered based on events in GCP, HTTP endpoints, and Firebase

Use Cases

  • Supporting microservice architecture
    • Serverless application backends
      • Mobile and IoT backends
      • Integrate with third-party services and APIs
    • Intelligent applications
      • Virtual assistant and chat bots
      • Video and image analysis

Kubernetes ☸️ Architecture

There are two related concepts in understanding how K8s ☸️ works: the object model and the principle of declarative management

Pods – the basic building block of K8s ☸️

  • Smallest deployable object.
  • Containers in a Pod share resources
  • Pods are not self-healing

Principle of declarative management – declare objects to represent the containers you want K8s ☸️ to run.

  • K8s ☸️ creates and maintains one or more objects.
  • K8s ☸️ compares the desired state to the current state.

The Kubernetes ☸️ Control Plane ✈️ continuously monitors the state of the cluster, endlessly comparing reality to what has been declared and remedying the state as needed.
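A tiny sketch of declarative management in action – declare a desired state in YAML, apply it, and let the Control Plane ✈️ reconcile (the nginx Deployment below is an assumed example, not from the course labs):

# Declare the desired state: 3 replicas of nginx
$cat > nginx-deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.17
EOF

$kubectl apply -f nginx-deployment.yaml
$kubectl get pods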

K8s ☸️ Cluster consists of a Master and Nodes

The Master’s role is to coordinate the entire cluster.

  • View or change the state of the cluster including launching pods.
  • kube-apiserver – the single component that you interact with directly in the cluster
    • the API server interacts with the cluster database (etcd) on behalf of the rest of the system
  • etcd – key-value store for the most critical data of a distributed system
  • kube-scheduler – assigns Pods to Nodes
  • cloud-controller-manager – embeds cloud-specific control logic.
  • kube-controller-manager – daemon that embeds the core control loops

Nodes run Pods.

  • kubelet is the primary “node agent” that runs on each node.
  • kube-proxy is a network proxy that runs on each node in your cluster

Google Kubernetes ☸️ Engine Concepts

GKE makes administration of K8s ☸️ much simpler

  • Master
    • GKE manages all the control plane components
    • GKE provisions and manages all the master infrastructure
  • Nodes
    • GKE manages this by deploying and registering Compute Engine instances as Nodes
    • Use node pools to manage different kinds of nodes
      • A node pool is a subset of nodes within a cluster that share a configuration, such as their amount of memory or their CPU generation.
      • Node pools are a GKE-specific feature that let you:
        • enable automatic node upgrades
        • enable automatic node repairs 🛠
        • enable cluster autoscaling ⚖️

Zonal Cluster – has a single control plane in a single zone.

  • single-zone cluster has a single control plane running in one zone
  • multi-zonal cluster has a single replica of the control plane running in a single zone, and has nodes running in multiple zones.

Regional Cluster – has multiple replicas of the control plane, running in multiple zones within a given region.

Private Cluster – provides the ability to isolate nodes from having inbound and outbound connectivity to the public internet.

Kubernetes ☸️ Object Management – objects are identified by a unique name and a unique identifier.

  • Objects are defined in a YAML file
  • Objects are identified by a name
  • Objects are assigned a unique identifier (UID) by K8s ☸️
  • Labels 🏷 are key value pairs that tag your objects during or after their creation.
  • Labels 🏷 help you identify and organize objects and subsets of objects.
  • Labels 🏷 can be matched by label selectors
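Label 🏷 selectors in practice might look like this (my-pod and env=dev are assumed names; app=nginx matches the sketch above):

$kubectl label pod my-pod env=dev
$kubectl get pods -l app=nginx
$kubectl get pods --selector env=dev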

Pods and Controller Objects

Pods have a life cycle🔄

  • Controller Object types
    • Deployment – ensure that sets of Pods are running
      • To perform the upgrade, the Deployment object will create a second ReplicaSet object, and then increase the number of (upgraded) Pods in the second ReplicaSet while it decreases the number in the first ReplicaSet
    • StatefulSet
    • DaemonSet
    • Job
  • Allocating resource quotas
  • Namespaces – provide scope for naming resources (pods, deployments and controllers.)

There are 3 initial namespaces in the cluster.

  1. default namespace, for objects with no other namespace defined.
  2. kube-system namespace, for objects created by the Kubernetes system itself.
  3. kube-public namespace, for objects that are publicly readable to all users.

Best practice tip: namespace neutral YAML

  • Apply namespaces at the command-line level, which makes YAML files more flexible🧘‍♀️.

Advanced K8s ☸️ Objects

Services

  • A Service groups a set of Pods and assigns a policy by which you can access those Pods
    • Services provide load-balanced 🏋️‍♂️ access to specified Pods. There are three primary types of Services:
      • ClusterIP: Exposes the service on an IP address that is only accessible from within this cluster. This is the default type.
      • NodePort: Exposes the service on the IP address of each node in the cluster, at a specific port number.
      • LoadBalancer 🏋️‍♂️: Exposes the service externally, using a load balancing 🏋️‍♂️ service provided by a cloud☁️ provider.
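For example, exposing the nginx Deployment from the earlier sketch behind a LoadBalancer 🏋️‍♂️ Service (an assumed example):

$kubectl expose deployment nginx --port 80 --type LoadBalancer
$kubectl get services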

Volume

  • A directory that is accessible to all containers in a Pod
    • Requirements of the volume can be specified using Pod specification
    • You must mount these volumes specifically on each container within a Pod
    • Set up Volumes using external storage outside of your Pods to provide durable storage

Controller Objects

  • ReplicaSets – ensures that a specified number of Pod replicas are running at any given time
    • Deployments
      • Provides declarative updates to ReplicaSets and Pods
      • Create, update, roll back, and scale⚖️ Pods, using ReplicaSets
      • Replication Controllers – perform a similar role to the combination of ReplicaSets and Deployments, but their use is no longer recommended.
    • StatefulSets – similar to a Deployment, Pods use the same container specs
    • DaemonSets – ensures that a Pod is running on all or some subset of the nodes.
    • Jobs – creates one or more Pods required to run a task

“Can I get an encore; do you want more?”

Migrate for Anthos – tool🛠 for getting workloads into containerized deployments on GCP

  • Automated process that moves your existing applications into a K8s ☸️ environment.

Migrate for Anthos moves VMs to containers

  • Move and convert workloads into containers
  • Workloads can start as physical servers or VMs
  • Moves workload compute to container immediately (<10 min)
  • Data can be migrated all at once or “streamed” to the cloud☁️ until the app is live in the cloud☁️

Migrate for Anthos Architecture

  • A migration requires an architecture to be built
  • A migration is a multi-step process
  • Configure processing cluster
  • Add migration source
  • Generate and review plan
  • Generate artifacts
  • Test
  • Deploy

Migrate for Anthos Installation -requires a processing cluster

         Installing Migrate for Anthos uses migctl

$migctl setup install

         Adding a source enables migration from a specific environment

$migctl source create ce my-ce-src --project my-project --zone zone

         Creating a migration generates a migration plan

$migctl migration create test-migration --source my-ce-src --vm-id my-id --intent image

         Executing a migration generates resource and artifacts

$migctl migration generate-artifacts my-migration

         Deployment files typically need modification

$migctl migration get-artifacts test-migration

Apply the configuration to deploy the workload

$kubectl apply -f deployment_spec.yaml

“And we’ll bask 🌞 in the shadow of yesterday’s triumph🏆 And sail⛵️ on the steel breeze🌬

Below are some of the destinations I am considering for my travels for next week:

Thanks –

–MCS

Week of October 16th

Part II of a Cloud☁️ Journey

“Cause I don’t want to come back… Down from this Cloud☁️”

Hi All –

Happy Global 🌎 Cat 😺 Day!

Last week, we started our continuous Cloud☁️ journey exploring Google Cloud☁️ to help us better understand the core services and the full value proposition that GCP can offer.

It has been said “For modern enterprise, that Cloud☁️ is the closest thing to magic🎩🐰 that we have.” Cloud☁️ enables companies large🏢 and small🏠 to be more agile and nimble. It also empowers employees of these companies🏭 to focus on being more creative and innovative and not being bogged down in the minutiae and rigors of managing IT infrastructure. In addition, customers of these companies benefit from an overall better Customer experience as applications are more available and scalable ⚖️.

As you might know, “Google’s mission is and has as always been to organize the world’s🌎 information and make it universally accessible and useful and as result playing a meaningful role in the daily lives of billions of people” Google has been able to hold true to this mission statement through its unprecedented success from products and platforms like Search 🔎, Maps 🗺, Gmail 📧, Android📱, Google Play, Chrome and YouTube 📺. Google continues to strive for the same kind of success with their Cloud Computing☁️ offering with GCP.

This week we continued our journey with GCP. Helping us through Google Cloud☁️ Infrastructure Essentials (Essential Google Cloud☁️ Infrastructure: Foundation & Essential Google Cloud☁️ Infrastructure: Core Services), through a combination of lectures and Qwiklabs, were esteemed Googlers Phillip Maier, who seems to have had more cameos in the Google Cloud☁️ training videos than Stan Lee has made in all of the Marvel movies combined, and the very inspirational Mylene Biddle, who exemplifies transformation in both the digital and real worlds.

Phillip and Mylene begin the course discussing Google Cloud☁️ which is a much larger ecosystem than just GCP. This ecosystem consists of open source software providers, partners, developers, third party software and other Cloud☁️ providers.

GCP uses a state-of-the-art software defined networking and distributed systems technologies to host and deliver services around the world🌎. GCP offers over 90 products and Services that continues to expand. GCP spans from infrastructure as a service or (IaaS) to software as a service (SaaS).

Next, Philip presented an excellent analogy comparing IT infrastructure to a city’s 🏙 infrastructure. “Infrastructure is the basic underlying framework of fundamental facilities and systems such as transport 🚆, communications 📞, power🔌, water🚰, fuel ⛽️ and other essential services. The people 👨‍👩‍👧‍👦 in the city 🏙 are like users 👥, and the cars 🚙🚗, bikes 🚴‍♀️🚴‍♂️ and buildings🏬 in the city 🏙 are like applications. Everything that goes into creating and supporting those applications for the users is the infrastructure.”

GCP offers wide range of compute services including:

  • Compute Engine – (IaaS) runs virtual machines on demand
  • Google Kubernetes ☸️ Engine (IaaS/PaaS) – run containerized applications on a Cloud☁️ environment that Google manages under your administrative control.
  • App Engine (PaaS) is a fully managed platform as a service framework. Run code in the Cloud☁️ without having to worry about infrastructure.
  • Cloud Functions (Serverless) – executes your code in response to events, whether those events occur once a day or many times ⏳

There are four ways to interact with GCP

  1. Google Cloud Platform Console or GCP Console
  2. CloudShell and the Cloud SDK
  3. API
  4. Cloud Mobile 📱 App

CloudShell provides the following:

  • Temporary Compute Engine VM
  • Command-line access to the instance via a browser
  • 5 GB of persistent disk storage 🗄 ($HOME dir)
  • Pre-installed Cloud☁️ SDK and other tools 🛠
  • gcloud: for working with Compute Engine and many Google Cloud☁️ services
  • gsutil: for working with Cloud Storage 🗄
  • kubectl: for working with Google Container Engine and Kubernetes☸️
  • bq: for working with BigQuery
  • cbt: for working with Bigtable
  • Language support for Java☕️, Go, Python🐍, Node.js, PHP, and Ruby♦️
  • Web 🕸 preview functionality
  • Built-in authorization for access to resources and instances

“Virtual insanity is what we’re living in”

Virtual Networks

GCP uses a software defined network that is built on a Global 🌎 fiber infrastructure. This infrastructure makes GCP, one of the world’s 🌎 largest and fastest 🏃‍♂️networks.

Virtual Private Cloud☁️

Virtual Private Cloud☁️ (VPC) provides networking functionality to Compute Engine virtual machine (VM) instances, Google Kubernetes Engine (GKE) ☸️ containers, and the App Engine standard and flexible🧘‍♀️ environment. VPC provides networking for your Cloud-based☁️ services that is Global 🌎, scalable⚖️, and flexible🧘‍♀️.

VPC is a comprehensive set of Google managed networking objects:

  • Projects are used to encompass the Network Service as well as all other services in GCP
  • Networks come in three different flavors 🍨:
    • Default
    • Auto mode
    • Custom mode
  • Subnets allow for division or segregation of the environment.
  • Regions and zones (GCP DC) provide continuous data protection and high availability.
  • IP addresses provided are internal or external
  • Virtual machines – instances from a networking perspective.
  • Routes and Firewall🔥 rules allow or deny connections to or from VMs based on specified configuration

Projects, Networks, and Subnets

A Project is the key organizer of infrastructure resources.

  • Associates objects and services with billing🧾
  • Contains networks (up to 5) that can be shared/peered

Networks are global 🌎 and span all available regions.

  • Has no IP address range
  • Contains subnetworks
  • Has three different options:
    • Default
      • Every Project
      • One subnet per region
      • Default firewall rules🔥
    • Auto mode
      • Default network
      • One subnet per region
      • Regional IP allocation
      • Fixed /20 subnetwork per region
      • Expandable up to /16
    • Custom mode
      • No default subnets created
      • Full control of IP ranges
      • Regional IP allocation

VMs, despite being in different geographic locations 🌎, take advantage of Google’s global 🌎 fiber network. VMs appear as though they’re sitting in the same rack when it comes to a network configuration protocol.

  • VMs can be on the same subnet but in different zones
  • A single firewall rule 🔥 can apply to both VMs

A subnet is a range of IP addresses.

  • Every subnet has 4 reserved IP addresses in its primary IP Range.
  • Subnets can be expanded without re-creating instances or any down Time ⏳
    • Cannot overlap with other subnets
    • Must be inside the RFC 1918 address spaces
    • Can be expanded but cannot be shrunk
    • Auto mode can be expanded from /20 to /16
    • Avoid large subnets (don’t scale ⚖️ beyond what is actually needed)

IP addresses

VMs can have internal and external IP addresses

You can also assign a range of IP addresses as aliases to a VM’s network interface using alias IP ranges

Internal IP

  • Allocated from a subnet range to VMs by DHCP
  • DHCP lease is renewed every 24 hours
  • VM name + IP is registered with network-scoped DNS

External IP

  • Assigned from pool (ephemeral)
  • Reserved (static)
  • VMs don’t know their external IP
  • External IPs are mapped to the VM’s internal IP

Mapping 🗺 IP addresses

DNS resolution for internal addresses

  • Each instance has a hostname that can be resolved to an internal IP address:
    • The hostname is the same as the instance name
    • FQDN is [hostname]. [zone].c.[project-id].internal
  • Name resolution is handled by internal DNS resolver
  • Configured for use on instance via DHCP
  • Provides answer for internal and external addresses

DNS resolution for external address

  • Instances with external IP addresses can allow connections from hosts outside the project
    • Users connect directly using external IP address
    • Admins can also publish public DNS records pointing to the instance
    • Public DNS records are not published automatically
  • DNS records for external addresses can be published using existing DNS servers (outside of GCP)
  • DNS zones can be hosted using Cloud DNS.

Host DNS zones using Cloud DNS

  • Google DNS service
  • Translate domain name into IP address
  • Low latency
  • High availability (100% uptime SLA)
  • Create and update millions of DNS records
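A hedged sketch of hosting a zone in Cloud DNS and adding an A record (zone name, domain, and IP are placeholders):

$gcloud dns managed-zones create my-zone --dns-name "example.com." --description "Example zone"
$gcloud dns record-sets transaction start --zone my-zone
$gcloud dns record-sets transaction add "203.0.113.10" --name "www.example.com." --ttl 300 --type A --zone my-zone
$gcloud dns record-sets transaction execute --zone my-zone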

Routes and firewall rules 🔥

Every network has:

  • Routes that let instances in a network send traffic🚦 directly to each other.
  • A default route that directs to destinations that are outside the network.

Routes map🗺 traffic🚦 to destination networks

  • Apply to traffic🚦 egress to a VM
  • Forward traffic🚦 to most specific route
  • Created when a subnet is created
  • Enable VMs on same network to communicate
  • Destination is in CIDR notation
  • Traffic🚦 is delivered only if it also matches a firewall rule 🔥

Firewall rules🔥 protect your VM instances from unapproved connections

  • VPC network functions as a distributed firewall. 🔥
  • Firewall rules🔥 are applied to the network as a whole
  • Connections are allowed or denied at the instance level.
  • Firewall rules🔥 are stateful
  • Implied deny all ingress and allow all egress

Create Network

$gcloud compute networks create privatenet --subnet-mode=custom
$gcloud compute networks subnets create privatesubnet-us --network=privatenet --region=us-central1 --range=172.16.0.0/24
$gcloud compute networks subnets create privatesubnet-eu --network=privatenet --region=europe-west1 --range=172.20.0.0/20
$gcloud compute networks list
$gcloud compute networks subnets list --sort-by=NETWORK

Create firewall Rules 🔥

$gcloud compute firewall-rules create privatenet-allow-icmp-ssh-rdp --direction=INGRESS --priority=1000 --network=privatenet --action=ALLOW --rules=icmp,tcp:22,tcp:3389 --source-ranges=0.0.0.0/0

Common network designs

  • Increased availability with multiple zones
    • A regional managed instance group contains instances from multiple zones across the same region, which provides increased availability.
  • Globalization 🌎 with multiple regions
    • Putting resource is in different regions, provides an even higher degree of failure independence by spreading resources across different failure domains
  • Cloud NAT provides internet access to private instances
    • Cloud NAT is Google’s managed network address translation service. Provision application instances without public IP addresses, while also allowing them to access the Internet in a controlled and efficient manner.
  • Private Google Access to Google APIs and services
    • Private Google access to allow VM instances that only have internal IP addresses to reach the external IP addresses of Google APIs and services.

“Know you’re nobody’s fool… So welcome to the machine”

Compute Engine (IaaS)

Predefined or custom Machines types:

  • vCPUs (cores) and Memory (RAM)
  • Persistent disks: HDD, SDD, and Local SSD
  • Networking
  • Linux or Windows

Compute

Several machine types

  • Network throughput scales⚖️ 2 Gbps per vCPU (small exceptions)
  • Theoretical max of 32 Gbps with 16 vCPU or 100 Gbps with T4 or V100 GPUs

A vCPU is equal to 1 hardware hyper-thread.

Storage 🗄

         Disks

  • Standard, SSD, or Local SSD
  • Standard and SSD persistent disks scale⚖️ in performance for each GB of space allocated

Resize disks or migrate instances with no downtime ⏳

Local SSDs have even higher throughput and lower latency than SSD persistent disks because they are attached to the physical hardware. However, the data that you store on local SSDs persists only until you stop 🛑 or delete the instance.

Networking

         Robust network features:

  • Default, custom networks
  • Inbound/outbound firewall rules🔥
    • IP based
    • Instance/group tags
  • Regional HTTPS load balancing
  • Network load balancing
    • Does not require pre-warming
  • Global 🌎 and multi-regional subnetworks

VM access

Linux🐧 SSH (requires firewall to allow tcp:22)

  • SSH from Console CloudShell via Cloud SDK, computer

Windows RDP (requires firewall to allow tcp:3389)

  • RDP clients, PowerShell terminal

VM Lifecycle

Compute Engine offers live migration to keep your virtual machine instances running even when a host system event, such as a software or hardware update, occurs. Live migration keeps your instances running during the following events:

  • Regular infrastructure maintenance and upgrades.
  • Network and power 🔌 grid maintenance in the data centers.
  • Failed hardware such as memory, CPU, network interface cards, disks, power, and so on. This is done on a best-effort basis; if hardware fails completely or otherwise prevents live migration, the VM crashes and restarts automatically and a hostError is logged.
  • Host OS and BIOS upgrades.
  • Security-related updates, with the need to respond quickly.
  • System configuration changes, including changing the size of the host root partition, for storage 🗄 of the host image and packages.

Compute Options

Machine Types

Predefined machine types – fixed ratio of GB of memory per vCPU

  • Standard machine types
  • High-memory machine types
  • High-CPU machine types
  • Memory-optimized machine types
  • Compute-optimized machine types
  • Shared core machine types

Custom machine types:

  • You specify the amount of memory and number of vCPUs

Special compute configurations

Preemptible (ideal for running batch processing jobs)

  • Lower price for interruptible service (up to 80%)
  • VM might be terminated at any time ⏳
    • No charge if terminated in the first 10 minutes
    • 24 hours max
    • 30-second terminate warning, but not guaranteed
      • Time⏳ for a shutdown script
  • No live migration; no auto restart
  • You can request that CPU quota for a region be split between regular and preemptible VMs
    • Default: preemptible VMs count against region CPU quota
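Creating a preemptible VM is a single flag on instance creation – a minimal sketch (name and zone assumed):

$gcloud compute instances create my-batch-vm --zone us-central1-a --preemptible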

Sole-tenant nodes -physically isolate workloads (Ideal for workloads that require physical isolation)

  • Sole-tenant node is a physical Compute Engine server that is dedicated to hosting VM instances
  • If you have existing operating system licenses, you can bring them to Compute Engine using sole-tenant nodes while minimizing physical core usage with the in-place restart feature

Shielded VMs🛡offer verifiable integrity

  • Secure🔒 Boot
  • Virtual trusted platform module (vTPM)
  • Integrity Monitoring 🎛

Images

  • Boot loader
  • Operating system
  • File System Structure 🗂
  • Software
  • Customizations

Disk options

Boot disk

  • VM comes with a single root persistent disk
  • Image is loaded onto root disk during first boot:
    • Bootable: you can attach to a VM and boot from it
    • Durable: can survive VM terminate
  • Some OS images are customized for Compute Engine
  • Can survive VM deletion if “Delete boot disk when instance is deleted” is disabled.

Persistent disks

  • Network storage 🗄 appearing as a block device 🧱
    • Attached to a VM through the network interface
    • Durable storage 🗄: can survive VM terminate
    • Bootable: you can attach to a VM and boot from it
    • Snapshots: incremental backups
    • Performance: Scales ⚖️ with Size
  • Features
    • HDD or SSD
    • Disk resizing
    • Attached in read-only mode to multiple VMs
    • Encryption keys 🔑

Local SSD disks are physically attached to a VM

  • More IOPS, lower latency and higher throughput
  • 375-GB up to 8 disks (3TB)
  • Data Survives a reset but not VM Stop 🛑 or terminate
  • VM Specific cannot be reattached to a different VM

RAM disk

  • tmpfs
  • Faster than local disk, slower than memory
    • Use when your application expects a file system structure and cannot directly store its data in memory
    • Fast scratch disk, or fast cache
  • Very volatile: erase on stop 🛑 or reset
  • May require a larger machine type if RAM is sized for the application
  • Consider using persistent disk to back up RAM disk data

Common Compute Engine actions

  • Metadata and scripts (Every VM instance stores its metadata on a metadata server)
  • Move an instance to a new zone
    • Automated process (moving within region)
      • gcloud compute instances move
      • updates reference to VM; not automatic
    • Manual process (moving between regions):
      • Snapshots all persistent disks
      • Create new persistent disks in the destination zone, restored from the snapshots
      • Create new VMs in the destination zone and attach the new persistent disks
      • Assign static IP to new VM
      • Update references to VM
      • Delete the snapshots, original disks and original VM
  • Snapshot: Back up critical data
  • Snapshot: Migrate data between zones
  • Snapshot: Transfer to SSD to improve performance
  • Persistent disk snapshots
    • Snapshot is not available for local SSD
    • Creates an incremental backup to Cloud Storage 🗄
      • Not visible in your buckets; managed by the snapshot service
      • Consider cron jobs for periodic incremental backup
    • Snapshots can be restored to a new persistent disk
      • New disk can be in another region or zone in the same project
      • Basis of VM migration: “moving” a VM to a new zone
        • Snapshot doesn’t back up VM metadata, tags, etc.
      • Resize persistent disk

You can grow disks, but never shrink them!
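A few hedged one-liners tying the snapshot and resize actions above together (disk, snapshot, and zone names are assumptions):

# Snapshot a disk, restore it in another zone, and grow a disk in place
$gcloud compute disks snapshot my-disk --zone us-central1-a --snapshot-names my-snapshot
$gcloud compute disks create my-new-disk --zone europe-west1-b --source-snapshot my-snapshot
$gcloud compute disks resize my-disk --zone us-central1-a --size 200GB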

“If you had my love❤️ and I gave you all my trust 🤝. Would you comfort🤗 me?”

Identity Access Management (IAM)

IAM is a sophisticated system built on top of email-like address names, job-type roles, and granular permissions.

Who, Which, What

It is a way of identifying who can do what on which resource. The who could be a person, group, or application; the what refers to specific privileges or actions; and the resource could be any GCP service.

  • Google Cloud☁️ Platform resources are organized hierarchically
  • If you change the resource hierarchy, the policy hierarchy also changes.
  • A best practice is to follow the “principle of least privilege”.
  • Organization node is the root node in this hierarchy. Represents your company.
  • Folders 📂 are the children of the Organization. A Folder📂 could represent a department
  • Projects are the children of the Folders📂. Projects provide a trust boundary for a company
  • Resources are the children of projects. Each resource has exactly one parent.

Organization

  • An organization node is a root node for Google Cloud☁️ resources
  • Organization roles:
  • Organization Admin: Control over all Cloud☁️ resources: useful for auditing
  • Project Creator: control over who can create projects
  • An Organization is created when a Workspace or Cloud Identity account creates a GCP Project. There are two roles assigned to users or groups:
  • Super administrator:
    • Assign the Organization admin role to some users
    • Be the point of contact in case of recovery issues
    • Control the lifecycle 🔄 of the Workspace or Cloud☁️ Identity account and Organization
  • Organization admin:
    • Define IAM policies
    • Determine the structure of the resource hierarchy
    • Delegate responsibility over critical components such as Network, Billing, and Resource Hierarchy through IAM roles

Folders 📂

  • Additional grouping mechanism and isolation boundaries between projects:
    • Different legal entities
    • Departments
    • Teams
  • Folders 📂 allow delegation of administration rights.

Roles

  • There are three types of roles in GCP:
  1. Primitive roles apply across all GCP services in a project
    • Primitive roles offer fixed, coarse-grained levels of access
      • Owner – Full privileges
      • Editor – Deploy, modify & configure
      • Viewer 👓 – Read-only access
      • *Billing Administrator – Manage Billing, Add Administrators
  2. Predefined roles apply to a particular service in a project
    • Predefined roles offer more fine-grained permissions on a particular service
    • Example: Compute Engine IAM roles:
      • Compute Admin – Full control of Compute Engine
      • Network Admin – Create, modify, delete Network Resources (except FW rules and SSL Certs)
      • Storage Admin – Create, modify, delete disks, Images, and Snapshots
  3. Custom roles define a precise set of permissions
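Granting a predefined role, or minting a custom one, from the CLI might look like this (project, user, and permission lists are assumed):

$gcloud projects add-iam-policy-binding my-project --member "user:[email protected]" --role "roles/compute.networkAdmin"
$gcloud iam roles create myCustomRole --project my-project --title "My Custom Role" --permissions compute.instances.get,compute.instances.list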

Members

Defined the “who” part of who can do what on which resource.

There are five different types of members:

  1. Google account represents a developer, an administrator or any other person who interacts with GCP. Any email address can be associated with a Google account
  2. Service account is an account that belongs to your application instead of to an individual end user.
  3. Google group is a named collection of Google accounts and service accounts.
  4. Workspace domains represent your organization’s Internet domain name
  5. Cloud Identity domains manage users and groups using the Google Admin console, but you do not pay for or receive Workspace collaboration products

Google Cloud Directory Sync ↔️ synchronizes ↔️ users and groups from your existing active directory or LDAP system with the users and groups in your Cloud identity domain. Synchronization ↔️ is one way only

Single Sign-On (SSO)

  • Use Cloud Identity to configure SAML SSO
  • If SAML2 isn’t supported, use a third-party solution

Service Accounts

  • Provide an identity for carrying out server-to-server interactions
    • Programs running within Compute Engine instances can automatically acquire access tokens with credentials
    • Tokens are used to access any service API or services in your project granted access to a service account
    • Service accounts are convenient when you’re not accessing user data
  • Service accounts are identified by an email address
    • Three types of Service accounts:
      • User-created (custom)
      • Built-in
        • Compute Engine and App Engine default service accounts
      • Google APIs service account
        • Runs internal Google processes on your behalf
    • Default Compute Engine Service account
      • Automatically created per project with an auto-generated name and email address:
        • Name has -compute suffix: [PROJECT_NUMBER][email protected]
      • Automatically added as a project Editor
      • By default, enabled on all instances created using gcloud or the GCP Console
  • Service account permissions
    • Default service accounts: primitive and predefined roles
    • User-created service accounts: predefined roles
    • Roles for Service accounts can be assigned to groups or users
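As a sketch, creating a service account, granting it a role, and auditing its keys 🔑 (names are placeholders):

$gcloud iam service-accounts create my-sa --display-name "my-sa for nightly backups"
$gcloud projects add-iam-policy-binding my-project --member "serviceAccount:[email protected]" --role "roles/storage.objectViewer"
$gcloud iam service-accounts keys list --iam-account [email protected]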

Authorization is the process of determining what permissions an authenticated identity has on a set of specified resource(s)

Scopes are used to determine whether an authenticated identity is authorized.

Customizing Scopes for a VM

  • Scopes can be changed after an instance is created
  • For user-created service accounts, use IAM roles instead.

IAM Best Practices

  1. Leverage and understand the resource hierarchy
    • Use Projects to group resources that share the same trust boundary
    • Check the policy granted on each resource and make sure you understand inheritance
    • Use the “Principle of Least Privilege” when granting roles
    • Audit policies in Cloud☁️ audit logs: setIamPolicy
    • Audit membership of groups used in policies
  2. Grant roles to Google groups instead of individuals
    • Update group membership instead of changing IAM policy
    • Audit membership of groups used in policies
    • Control the ownership of the Google group used in IAM policies
  3. Service accounts
    • Be very careful granting the serviceAccountUser role
    • When you create a service account, give it a display name that clearly identifies its purpose
    • Establish a naming convention for service accounts
    • Establish key 🔑 rotation policies and methods
    • Audit with the serviceAccount.keys.list() method

Cloud Identity-Aware Proxy (Cloud IAP)

Enforce access control policies for application and resources:

  • Identity-based access control
  • Central authorization layer for applications accessed by HTTPS

IAM policy is applied after authentication

“Never gonna give you up… Never gonna say goodbye.”

Storage 🗄 and Database Services 🛢

Cloud Storage 🗄 (Object Storage 🗄) – It allows worldwide🌎 storage 🗄 and retrieval of any amount of data at any Time ⏳.

  • Scalable ⚖️ to exabytes
  • Time⏳ to first byte in milliseconds
  • Very high availability across all storage 🗄 classes
  • Single API across storage 🗄 classes

Use Cases:

  • Website content
  • Storing data for archiving and disaster recovery
  • Distributing large data objects to users via direct download

Cloud Storage 🗄 has four storage 🗄 classes:

  1. Regional storage 🗄 enables you to store data at lower cost, with the tradeoff of data being stored in a specific regional location.
  2. Multi-Regional storage 🗄 is geo-redundant; Cloud Storage 🗄 stores your data redundantly in at least two geographic locations separated by at least 100 miles within the multi-regional location of the bucket 🗑.
  3. Nearline storage 🗄 is a low-cost, highly durable storage 🗄 service for storing infrequently accessed data.
  4. Coldline storage 🗄 is a very low-cost, highly durable storage 🗄 service for data archival, online backup, and disaster recovery. Data is available within milliseconds, not hours or days.

Buckets🗑

  • Naming requirements
  • Cannot be nested
  • A bucket’s 🗑 location type (Regional vs. Multi-Regional) cannot be changed after creation
  • Objects can be moved from bucket 🗑 to bucket 🗑

Objects

  • Inherit storage 🗄 class of bucket 🗑 when created
  • No minimum size: unlimited storage 🗄

Access

  • gsutil command
  • (RESTful) JSON API or XML API
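
A minimal gsutil sketch covering the basics (bucket and file names are hypothetical):

# Make a bucket, upload an object, and inspect it
$ gsutil mb -c regional -l us-central1 gs://my-example-bucket/
$ gsutil cp report.csv gs://my-example-bucket/
$ gsutil ls -L gs://my-example-bucket/report.csv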

Access control lists (ACLs)

For some applications, it is easier and more efficient to grant limited-time access tokens that can be used by any user, instead of using account-based authentication for controlling resource access.

Signed URLs

“Valet Key” access to buckets 🗑 and objects via a ticket:

  • Ticket is a cryptographically signed URL
  • Time-limited
  • Operations specified in ticket: HTTP GET, PUT, DELETE (not POST)
  • Any user with URL can invoke permitted operations
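
A sketch of minting one of these time-limited tickets with gsutil (assumes a service account private key file; names are hypothetical):

# Generate a signed URL valid for 10 minutes
$ gsutil signurl -d 10m my-sa-key.json gs://my-example-bucket/report.csv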

Cloud Storage 🗄 Features

  • Customer-supplied encryption key (CSEK)
    • Use your own key instead of Google-managed keys 🔑
  • Object Lifecycle 🔄Management
    • Automatically delete or archive objects
  • Object Versioning
    • Maintain multiple versions of objects
      • Objects are immutable
      • Object Versioning:
        • Maintain a history of modification of objects
        • List archived versions of an object, restore an object to an older state, or delete a version
  • Directory synchronization ↔️
    • Synchronizes a VM directory with a bucket 🗑
  • Object change notification
  • Data import
  • Strong 💪 consistency

Object Lifecycle 🔄 Management policies specify actions to be performed on objects that meet certain rules

  • Examples:
    • Downgrade storage 🗄 class on objects older than a year.
    • Delete objects created before a specific date.
    • Keep only the 3 most recent versions of an object
  • Object inspection occurs in asynchronous batches
  • Changes can take 24 hours to apply
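
A hedged sketch of a lifecycle policy matching the examples above, applied with gsutil (bucket name hypothetical):

$ cat lifecycle.json
{
  "rule": [
    {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
     "condition": {"age": 365}},
    {"action": {"type": "Delete"},
     "condition": {"isLive": false, "numNewerVersions": 3}}
  ]
}
# Attach the policy to the bucket
$ gsutil lifecycle set lifecycle.json gs://my-example-bucket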

Object change notification can be used to notify an application when an object is updated or added to a bucket 🗑

Recommended: Cloud Pub/Sub Notifications for Cloud Storage 🗄
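
A one-liner sketch for wiring a bucket 🗑 to Pub/Sub (topic and bucket names are hypothetical):

# Send a JSON notification to my-topic whenever objects change
$ gsutil notification create -t my-topic -f json gs://my-example-bucket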

Data import services

  • Transfer Appliance: Rack, capture and then ship your data to GCP
  • Storage Transfer Service: Import online data (another bucket 🗑, S3 bucket 🗑, Web Service)
  • Offline Media Import: Third-party provider uploads the data from physical media

Cloud Storage 🗄 provides Strong 💪 global consistency

Cloud SQL is a fully managed database 🛢 service (MySQL or PostgreSQL)

  • Patches and updates automatically applied
  • You administer MySQL users
  • Cloud SQL supports many clients
    • gcloud sql
    • App Engine, Workspace scripts
    • Applications and tools 🛠
      • SQL Workbench, Toad
      • External applications using standard MySQL drivers
  • Cloud SQL delivers high performance and scalability ⚖️ with up to 30 TBs of storage 🗄 capacity, 40,000 IOPS and 416 GB of RAM
  • Replica service that can replicate data between multiple zones
  • Cloud SQL also provides automated and on demand backups.
  • Cloud SQL scales ⚖️ up (require a restart)
  • Cloud SQL scales ⚖️ out using read replicas.
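
A sketch of standing up a Cloud SQL instance and scaling reads out with a replica using gcloud (names, tier, and region are hypothetical):

# Create a MySQL instance, then add a read replica
$ gcloud sql instances create my-sql --database-version MYSQL_5_7 --tier db-n1-standard-1 --region us-central1
$ gcloud sql instances create my-sql-replica --master-instance-name my-sql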

Cloud Spanner is a service built for the Cloud☁️ specifically to combine the benefits of relational database 🛢 structure with non-relational horizontal scale⚖️

Data replication is synchronized across zones using Google’s global fiber network

  • Scale⚖️ to petabytes
  • Strong 💪 consistency
  • High Availability
  • Used for financial and inventory applications
  • Monthly uptime ⏳
    • Multi-regional: 99.999%
    • Regional: 99.99%
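
A minimal gcloud sketch for Spanner (instance, config, and database names are hypothetical):

# Create a regional Spanner instance and a database inside it
$ gcloud spanner instances create my-spanner --config regional-us-central1 --nodes 1 --description "Demo instance"
$ gcloud spanner databases create my-db --instance my-spanner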

Cloud Firestore is a fast, fully managed, serverless, Cloud☁️-native NoSQL document database 🛢 that simplifies storing, syncing ↔️ and querying data for your mobile 📱, web 🕸 and IoT applications at global scale⚖️.

  • Simplifies storing, syncing ↔️, and querying data
  • Mobile📱, web🕸, and IoT apps at global scale⚖️
  • Live synchronization ↔️ and offline support
  • Security🔒 features
  • ACID transactions
  • Multi-region replication
  • Powerful query engine

Datastore mode (new server projects):

  • Compatible with Datastore applications
  • Strong 💪 consistency
  • No entity group limits

Native mode (new Mobile 📱 and web🕸 apps):

  • Strongly 💪 consistent storage 🗄 layer
  • Collection and document 📄 data model
  • Real-time updates
  • Mobile📱 and Web🕸 Client libraries📚

Cloud Bigtable (Wide Column DB) is a fully managed NoSQL database 🛢 with petabyte scale⚖️ and very low latency.

  • Petabyte-scale⚖️
  • Consistent sub-10ms latency
  • Seamless scalability⚖️ for throughput
  • Learns and adjusts to access patterns
  • Ideal for Ad Tech, FinTech, and IoT
  • Storage 🗄 engine for ML applications
  • Easy integration with open source big data tools 🛠

Cloud MemoryStore is a fully managed Redis service built on scalable ⚖️, Secure🔒 and highly available infrastructure managed by Google.

  • In-memory data store service
  • Focus on building great apps
  • High availability, failover, patching and Monitoring 🎛
  • Sub-millisecond latency
  • Instances up to 300 GB
  • Network throughput of 12 Gbps
  • “Easy Lift-and-Shift”

Resource Management lets you hierarchically manage resources

  • Resources can be categorized by Project, Folder📂, and Organization
  • Resources are global🌎, regional, or zonal
  • Resource belongs to only one project
  • Resources inherit policies from their parents
  • Resource consumption is measured in quantities like rate of use or Time⏳
  • Policies contain a set of roles and members; policies are set on resources
  • If a parent policy is less restrictive, it overrides a more restrictive resource policy
  • Organization node is root node for GCP resources
  • Organization contains all billing accounts.
  • Project accumulates the consumption of all its resources
    • Track resource and quota usage
    • Enable billing
    • Manage permissions and credentials
    • Enable services and APIs
  • Projects have 3 identifying attributes
    • Project Name
    • Project Number (unique)
    • Project ID (unique)
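
A quick sketch tying those three attributes together with gcloud (the project ID and name are hypothetical):

# Create a project, point the CLI at it, and view its name/number/ID
$ gcloud projects create my-demo-project --name "My Demo Project"
$ gcloud config set project my-demo-project
$ gcloud projects describe my-demo-project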

Quotas

All resources are subject to project quotas or limits

Examples:

  • Total resources you can create per project: 5 VPC networks/project
  • Rate at which you can make API requests in a project: 5 admin actions/second (Cloud☁️ Spanner)
  • Total resources you can create per region: 24 CPUs/region/project

Increase: Quotas page in GCP Console or a support ticket

As your use of GCP expands over time, your quotas may increase accordingly.
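
A sketch of checking current quota usage from the CLI (project and region are hypothetical):

# Project-wide quotas, then quotas for a single region
$ gcloud compute project-info describe --project my-demo-project
$ gcloud compute regions describe us-central1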

Project Quotas:

  • Prevent runaway consumption in case of an error or malicious attack
  • Prevent billing spikes or surprises
  • Force sizing consideration and periodic review

Labels and names

Labels 🏷 are a utility for organizing GCP resources

  • Attached to resources: VM, disk, snapshot, image
    • GCP Console, gcloud, or API
  • Example uses of labels 🏷 :
    • Inventory
    • Filter resources
    • In scripts
      • Help analyze costs
      • Run bulk operations

Comparing labels and tags

  • Labels 🏷 are a way to organize resources across GCP
  • Disks, images, snapshots
  • User-defined strings in key-value format
  • Propagated through billing
  • Tags are applied to instances only
  • User-defined strings
  • Tags are primarily used for networking (applying firewall rules🔥)
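
A sketch showing the difference in practice (instance, labels, tags, and rule names are hypothetical):

# Labels: key-value metadata for organization and billing
$ gcloud compute instances update my-vm --update-labels env=dev,owner=data-team
# Tags: plain strings that firewall rules🔥 can target
$ gcloud compute instances add-tags my-vm --tags web-server --zone us-central1-a
$ gcloud compute firewall-rules create allow-http --allow tcp:80 --target-tags web-server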

Billing

  • Setting a budget lets you track how you spend
  • Budget alerts 🔔 send alerts 🔔 and emails📧 to the Billing Admin
  • Use Cloud☁️ Pub/Sub notifications to programmatically receive spend updates about a budget
  • Optimize your GCP spend by using labels
  • Visualize GCP spend with Data Studio

It’s recommended to label all your resources and to export billing data to BigQuery to analyze spend, as sketched below.
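
A hedged sketch of creating a budget with alert thresholds from the CLI (this may require the beta gcloud component depending on your SDK version; the billing account ID and amounts are hypothetical):

# Budget of $1,000/month with alerts 🔔 at 50% and 90%
$ gcloud billing budgets create --billing-account 0X0X0X-0X0X0X-0X0X0X --display-name "monthly-budget" --budget-amount 1000USD --threshold-rule percent=0.5 --threshold-rule percent=0.9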

“Every single day. Every word you say… Every game you play. Every night you stay. I’ll be watching you”

Resource Monitoring 🎛

Stackdriver

  • Integrated Monitoring 🎛, Logging, Error Reporting, Tracing and Debugging
  • Manages across platforms
    • GCP and AWS
    • Dynamic discovery of GCP with smart defaults
    • Open-source agents and integrations
  • Access to powerful data and analytics tools 🛠
  • Collaboration with third-party software

Monitoring 🎛 is important to Google because it is at the base of Site Reliability Engineering (SRE).

  • Dynamic config and intelligent defaults
  • Platform, system, and application metrics
    • Ingests data: Metrics, events, metadata
    • Generates insights through dashboards, charts, alerts
  • Uptime /health checks⛑
  • Dashboards
  • Alerts 🔔

Workspace is the root entity that holds Monitoring 🎛 and configuration information

  • “Single pane of glass 🍸”
    • Determine your Monitoring 🎛 needs up front
    • Consider using separate Workspace for data and control isolation

Workspaces can monitor all of your GCP projects in a single place; to monitor an AWS account, you must configure a project in GCP to hold the AWS connector.

Stackdriver Monitoring 🎛 allows you to create custom dashboards that contain charts 📊 of the metrics that you want to monitor.

Uptime checks test the availability of your public services

Stackdriver Monitoring 🎛 can access some metrics without the Monitoring 🎛 agent, including CPU utilization, some disk traffic 🚥 metrics, network traffic, and uptime ⏳ information.

Stackdriver Logging lets you store, search, analyze, and alert on log data and events.

  • Platform, systems, and application logs
    • API to write to logs
    • 30-day retention
  • Log search/view/filter
  • Log-based metrics
  • Monitoring 🎛 alerts 🔔can be set on log events
  • Data can be exported to Cloud Storage 🗄, BigQuery, and Cloud Pub/Sub
  • Analyze logs in BigQuery and visualize in Data Studio
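
For the export bullet above, a sketch of routing logs to BigQuery with a log sink (project, dataset, and filter are hypothetical, and the dataset must already exist):

# Route GCE instance logs into a BigQuery dataset for analysis
$ gcloud logging sinks create my-bq-sink bigquery.googleapis.com/projects/my-demo-project/datasets/my_logs --log-filter 'resource.type="gce_instance"'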

Stackdriver Error Reporting counts, analyzes, and aggregates the errors in your running Cloud☁️ services

  • Error notifications
  • Error dashboard
  • Go, Java☕️, .NET, Node.js, PHP, Python 🐍, and Ruby♦️

Stackdriver Trace is a distributed tracing system that collects latency data from your applications and displays it in the GCP Console.

  • Displays data in near real-time ⏳
  • Latency reporting
  • Per-URL latency sampling
  • Collects latency data
    • App Engine
    • Google HTTP(S) load balancers🏋️‍♀️
    • Applications instrumented with the Stackdriver Trace SDKs

Debugging

  • Inspect an application without stopping 🛑 it or slowing it down significantly
  • Debug snapshots:
    • Capture call stack and local variables of a running application
  • Debug logpoints:
    • Inject logging into a service without stopping 🛑 it
  • Java☕️, Python🐍, Go, Node.js and Ruby♦️

“At last the sun☀️ is shining, the clouds☁️ of blue roll by”

We will continue next week with Part III of this series….

Thanks –

–MCS

Week of October 9th

Part I of a Cloud☁️ Journey

“On a cloud☁️ of sound 🎶, I drift in the night🌙”

Hi All –

Ok Happy Hump 🐫 Day!

The last few weeks we spent some quality time ⏰ visiting with Microsoft SQL Server 2019. A few weeks back, we kicked 🦿the tires 🚗 with IQP and the improvements made to TempDB. Then the week after, we were fully immersed in Microsoft’s most ambitious offering in SQL Server 2019 with Big Data Clusters (BDC).

This week we make our triumphant return back to the cloud☁️. If you have been following our travels this past summer☀️, we focused on the core concepts of AWS and then we concentrated on the fundamentals of Microsoft Azure. So, the obvious natural progression of our continuous cloud☁️ journey✈️ would be to further explore the Google Cloud Platform, more affectionately known as GCP. We had spent a considerable amount of time 🕰 before learning many of the exciting offerings in GCP, but our return was long overdue. Besides, we felt the need to give some love ❤️ and oxytocin 🤗 to our friends from Mountain View

“It starts with one☝️ thing…I don’t know why?”

Actually, Google has 10 things when it comes to their philosophy but more on that later. 😊

One of Google’s strong 💪 beliefs is that “in the future every company will be a data company because making the fastest and best use of data is a critical source of a competitive advantage.”

GCP is Google’s Cloud Computing☁️ solution that provides a wide variety of services such as Compute, Storage🗄, Big data, and Machine Learning for managing and getting value from data and doing that at infinite scale⚖️. GCP offers over 90 products and Services.

“If you know your history… Then you would know where you coming from”

In 2006, AWS began offering cloud computing☁️ to the masses; several years later Microsoft Azure followed suit, and shortly after GCP joined the Flexible🧘‍♀️, Agile, Elastic, Highly Available and scalable⚖️ party 🥳. Although Google was a late arrival to the cloud computing☁️ shindig🎉, their approach and strategy to Cloud☁️ Computing is far from a “Johnny-come-lately” 🤠

“Google Infrastructure for Everyone” 😀

Google does not view cloud computing☁️ as a “commodity” cloud☁️. Google’s methodology to cloud computing☁️ is of a “premier💎 cloud☁️”, one that provides the same innovative, high-quality, deluxe set of services, and rich development environment with the advanced hardware that Google has been running🏃‍♂️ internally for years but made available for everyone through GCP.

“No vendor lock-in 🔒. Guaranteed.” 👍

In addition, Google who is certainly no stranger to Open Source software promotes a vision🕶 of the “open cloud☁️”. A cloud☁️ environment where companies🏢🏭 large and small 🏠can seamlessly move workloads from one cloud☁️ provider to another. Google wants customers to have the ability to run 🏃‍♂️their applications anywhere not just in Google.

“Get outta my dreams😴… Get into my car 🚙”

Now that I’ve extolled the virtues of Google’s vision 🕶 and strategy for Cloud computing☁️, it’s time to take this car 🚙 out for a spin. Fortunately, the team at Google Cloud☁️ has put together one of the best compilations since the Zeppelin Box Set 🎸 in their Google Cloud Certified Associate Cloud Engineer Path on Pluralsight.

Since there is so much to unpack📦, we will need to break our learnings down into multiple parts. So, guiding us through the first part of our journey ✈️ to help us put our best foot 🦶 forward will be Googler Brian Rice and former Googler Catherine Gamboa through their Google Cloud Platform Fundamentals – Core Infrastructure course.

In a great introduction, Brian expounds on the definition of Cloud Computing☁️ and gives a brief history of Google’s transformation from the virtualization model to a container‑based architecture: an automated, elastic, third‑wave cloud☁️ built from automated services.

Next, Brian reviews GCP computing architectures:

Infrastructure as a Service (IaaS) – provides raw compute, storage🗄, and network organized in ways that are familiar from data centers. You pay for what you allocate.

Platform as a Service (PaaS) – binds application code you write to libraries📚 that give access to the infrastructure your application needs. You pay 💰 for what you use.

Software as a Service (SaaS) – applications that are consumed directly over the internet by end users. Popular examples: Search 🔎, Gmail 📧, Docs 📄, and Drive 💽

Then we had an overview of Google’s network which according to some estimates carries as much as 40% of the world’s 🌎 internet traffic🚦. The network interconnects at more than 90 internet exchanges and more than 100 points of presence worldwide 🌎 (and growing). One of the benefits of GCP is that it leverages Google’s robust network. Allowing GCP resources to be hosted in multiple locations worldwide 🌎. At granular level these locations are organized by regions and zones. A region is a specific geographical 🌎 location where you can host your resources. Each region has one or more zones (most regions have three or more zones).

All of the zones within a region have fast⚡️ network connectivity among them. A zone is like a single failure domain within a region. A best practice in building a fault‑tolerant application is to deploy resources across multiple zones in a given region to protect against unexpected failures.

Next, we had summary on Google’s Multi-layered approach to security🔒.

Highlights:

  • Server boards and the networking equipment in Google data centers are custom‑designed by Google.
  • Google also designs custom chips, including a hardware security🔒 chip (Titan) deployed on both servers and peripherals.
  • Google Server machines use cryptographic signatures✍️ to make sure they are booting the correct software.
  • Google designs and builds its own data centers (eco friendly), which incorporate multiple layers of physical security🔒 protections. (Access to these data centers is limited to only a few authorized Google Employees)
  • Google’s infrastructure provides cryptographic🔐 privacy🤫 and integrity for remote procedure call (RPC) data on the network, which is how Google Services communicate with each other.
  • Google has multitier, multilayer denial‑of‑service protections that further reduces the risk of any denial‑of‑service 🛡 impact.

Rounding out the introduction was a sneak peek 👀 into Budgets and Billing 💰. Google offers customer-friendly 😊 pricing with per‑second billing for its IaaS compute offering. Fine‑grained billing is a big cost‑savings for workloads that are bursting. GCP provides four tools 🛠 to help with billing:

  • Budgets and alerts 🔔
  • Billing export🧾
  • Reports 📊
  • Quotas

Budgets can be a fixed limit, or you can tie it to another metric, for example a percentage of the previous month’s spend.

Alerts 🔔 are generally set at 50%, 90%, and 100%, but are customizable

Billing export🧾 store detailed billing information in places where it’s easy to retrieve for more detailed analysis

Reports📊 is a visual tool in the GCP console that allows you to monitor your expenditure. GCP also implements quotas, which protect both account owners and the GCP community as a whole 😊.

Quotas are designed to prevent the overconsumption of resources, whether because of error or malicious attack 👿. There are two types of quotas, rate quotas and allocation quotas. Both get applied at the level of the GCP project.

After a great intro, next Catherine kick starts🦵 us with GCP. She begins with a discussion around resource hierarchy 👑 and trust🤝 boundaries.

Projects are the main way you organize the resources (all resources belong to a project) you use in GCP. Projects are used to group together related resources, usually because they have a common business objective. A project consists of a set of users, a set of APIs, billing, quotas, authentication, and monitoring 🎛 settings for those APIs. Projects have 3 identifying attributes:

  1. Project ID (Globally 🌎 unique)
  2. Project Name
  3. Project Number (Globally 🌎 Unique)

Projects may be organized into folders 🗂. Folders🗂 can contain other folders 🗂. All the folders 🗂 and projects used by an organization can be put in organization nodes.

Please Note: If you use folders 🗂, you need to have an organization node at the top of the hierarchy👑.

Projects, folders🗂, and organization nodes are all places where the policies can be defined.

A policy is set on a resource. Each policy contains a set of roles and members👥.

  • A role is a collection of permissions. Permissions determine what operations are allowed on a resource. There are three kinds of roles (primitive):
  1. Owner
  2. Editor
  3. Viewer

Another role made available in IAM is the billing administrator role, which allows controlling the billing for a project without the right to change the resources in the project.

Please note IAM provides finer‑grained types of roles for a project that contains sensitive data, where primitive roles are too generic.

A service account is a special type of Google account intended to represent a non-human user ⚙️ that needs to authenticate and be authorized to access data in Google APIs.

  • A member(s)👥 can be a Google Account(s), a service account, a Google group, or a Google Workspace or Cloud☁️ Identity domain that can access a resource.

Resources inherit policies from the parent.

Identity and Access Management (IAM) allows administrators to manage who (i.e. a Google account, a Google group, a service account, or an entire Workspace) can do what (role) on specific resources. There are four ways to interact with IAM and the other GCP management layers:

  1. Web‑based console 🕸
  2. SDK and Cloud shell (CLI)
  3. APIs
    1. Cloud Client Libraries 📚
    2. Google API Client Library 📚
  4. Mobile app 📱

When it comes to entitlements “The principle of least privilege” should be followed. This principle says that each user should have only those privileges needed to do their jobs. In a least privilege environment, people are protected from an entire class of errors.  GCP customers use IAM to implement least privilege, and it makes everybody happier 😊.

For example, you can designate an organization policy administrator so that only people with that privilege can change policies. You can also assign a project creator role, which controls who can spend money 💵.

Finally, we checked into Marketplace 🛒 which provides an easy way to launch common software packages in GCP. Many common web 🕸 frameworks, databases🛢, CMSs, and CRMs are supported. Some Marketplace 🛒 images charge usage fees, like third parties with commercially licensed software. But they all show estimates of their monthly charges before you launch them.

Please Note: GCP updates the base images for these software packages to fix critical issues 🪲and vulnerabilities, but it doesn’t update the software after it’s been deployed. However, you’ll have access to the deployed system so you can maintain them.

“Look at this stuff🤩 Isn’t it neat? Wouldn’t you think my collection’s complete 🤷‍♂️?”

Now with the basics of GCP covered, it was time 🕰 to explore 🧭 some of the computing architectures made available within GCP.

Google Compute Engine

Virtual Private Cloud (VPC) – manage a networking functionality for your GCP resources. Unlike AWS (natively), GCP VPC is global 🌎 in scope. They can have subnets in any GCP region worldwide 🌎. And subnets can span the zones that make up a region.

  • Provides flexibility🧘‍♀️ to scale️ and control how workloads connect regionally and globally🌎
  • Access VPCs without needing to replicate connectivity or administrative policies in each region
  • Bring your own IP addresses to Google’s network infrastructure across all regions

Much like physical networks, VPCs have routing tables👮‍♂️and Firewall🔥 Rules which are built in.

  • Routing tables👮‍♂️ forward traffic🚦from one instance to another instance
  • Firewall🔥 allows you to restrict access to instances, both incoming(ingress) and outgoing (egress) traffic🚦.

Cloud DNS manages low latency and high availability of the DNS service running on the same infrastructure as Google

Cloud VPN securely connects peer network to Virtual Private Cloud (VPC) network through an IPsec VPN connection.

Cloud Router lets other networks and Google VPC exchange route information over the VPN using the Border Gateway Protocol.

VPC Network Peering enables you to connect VPC networks so that workloads in different VPC networks can communicate internally. Traffic🚦stays within Google’s network and doesn’t traverse the public internet.

Dedicated Interconnect allows direct private connections, providing the highest uptime (99.99% SLA) for interconnection with GCP:

  • Private connection between you and Google for your hybrid cloud☁️
  • Connection through the largest partner network of service providers

Google Compute Engine (IaaS) delivers Linux or Windows virtual machines (VMs) running in Google’s innovative data centers and worldwide fiber network. Compute Engine offers scale ⚖️, performance, and value that lets you easily launch large compute clusters on Google’s infrastructure. There are no upfront investments, and you can run thousands of virtual CPUs on a system that offers quick, consistent performance. VMs can be created via Web 🕸 console or the gcloud command line tool🔧.

For Compute Engine VMs there are two kinds of persistent storage🗄 options:

  • Standard
  • SSD

If your application needs a high‑performance disk, you can attach a local SSD. ⚠️ Beware: store data of permanent value somewhere else, because a local SSD’s content doesn’t last past when the VM terminates.

Compute Engine offers innovative pricing:

  • Per second billing
  • Preemptible instances
  • High throughput to storage🗄 at no additional cost
  • Only pay for hardware you need.

At the time of this post, N2D standard and high-CPU machine types have up to 224 vCPUs and 128 GB of memory which seems like enough horsepower 🐎💥 but GCP keeps upping 🃏🃏 the ante 💶 on maximum instance type, vCPU, memory and persistent disk. 😃

Sample Syntax creating a VM:

$ gcloud compute zones list | grep us-central1

$ gcloud config set compute/zone us-central1-c
$ gcloud compute instances create "my-vm-2" --machine-type "n1-standard-1" --image-project "debian-cloud" --image "debian-9-stretch-v20170918" --subnet "default"

Compute Engine also offers auto Scaling ⚖️ which adds and removes VMs from applications based on load metrics. In addition, Compute Engine VPCs offer load balancing 🏋️‍♀️ across VMs. VPC supports several different kinds of load balancing 🏋️‍♀️:

  • Layer 7 load balancing 🏋️‍♀️ based on load (HTTP/HTTPS)
  • Layer 4 load balancing 🏋️‍♀️ of non-HTTPS SSL traffic🚦 based on load
  • Layer 4 load balancing 🏋️‍♀️ of non-SSL TCP traffic🚦
  • Any Traffic🚦 (TCP, UDP)
  • Traffic🚦 inside a VPC

Cloud CDN – accelerates💥 content delivery 🚚 in your application, allowing users to experience lower network latency. The origins of your content will experience reduced load and cost savings. Once you’ve set up HTTPS load balancing 🏋️‍♀️, simply enable Cloud CDN with a single checkbox.

Next on our plate 🍽 was to investigate the storage🗄 options that are available in GCP

Cloud Storage 🗄 is fully managed, high durability, high availability, scalable ⚖️ service. Cloud Storage 🗄 can be used for lots of use cases like serving website content, storing data for archival and disaster recovery, or distributing large data objects.

Cloud Storage🗄 offers 4 different types of storage 🗄 classes:

  • Regional
  • Multi‑regional
  • Nearline 😰
  • Coldline 🥶

Cloud Storage🗄 is comprised of buckets 🗑 which you create, configure, and use to hold your storage🗄 objects.

Buckets 🗑 are:

  • Globally 🌎 Unique
  • Different storage🗄 classes
  • Regional or multi-regional
  • Versioning enabled (Objects are immutable)
  • Lifecycle 🔄 management Rules

Cloud Storage🗄supports several ways to bring data into Cloud Storage🗄.

  • Use gsutil Cloud SDK.
  • Drag‑and‑drop in the GCP console (with Google Chrome browser).
  • Integrated with many of the GCP products and services:
    • Import and export tables from and to BigQuery and Cloud SQL
    • Store app engine logs
    • Cloud data store backups
    • Objects used by app engine
    • Compute Engine Images
  • Online storage🗄 transfer service (>TB) (HTTPS endpoint)
  • Offline transfer appliance (>PB) (rack-able, high capacity storage🗄 server that you lease from Google)

“Big wheels 𐃏 keep on turning”

Cloud Bigtable is a fully managed, scalable⚖️ NoSQL database🛢 service for large analytical and operational workloads. The databases🛢 in Bigtable are sparsely populated tables that can scale to billions of rows and thousands of columns, allowing you to store petabytes of data. Data encryption in flight and at rest is automatic.

GCP fully manages the service, so you don’t have to configure and tune it. It’s ideal for data that has a single lookup key🔑 and for storing large amounts of data with very low latency.

Cloud Bigtable is offered through the same open source API as HBase, which is the native database🛢 for the Apache Hadoop 🐘 project.

Cloud SQL is a fully managed relational database🛢 service for MySQL, PostgreSQL, and MS SQL Server which provides:

  • Automatic replication
  • Managed backups
  • Vertical scaling ⚖️ (Read and Write)
  • Horizontal Scaling ⚖️ (Read, via read replicas)
  • Google integrated Security 🔒

Cloud Spanner is a fully managed relational database🛢 with unlimited scale⚖️ (horizontal), strong consistency & up to 99.999% high availability.

It offers transactional consistency at a global🌎 scale ⚖️, schemas, SQL, and automatic synchronous replication for high availability, and it can provide petabytes of capacity.

Cloud Datastore is a highly scalable ⚖️ (Horizontal) NoSQL database🛢 for your web 🕸 and mobile 📱 applications.

  • Designed for application backends
  • Supports transactions
  • Includes a free daily quota

Comparing Storage🗄 Options

Cloud Datastore is the best for semi‑structured application data that is used in App Engine applications.

Bigtable is best for analytical data with heavy read/write events like AdTech, Financial 🏦, or IoT📲 data.

Cloud Storage🗄 is best for structured and unstructured binary or object data, like images🖼, large media files🎞, and backups.

Cloud SQL is best for web 🕸 frameworks and existing applications, like storing user credentials and customer orders.

Cloud Spanner is best for large‑scale⚖️ database🛢 applications that are larger than 2 TB, for example, for financial trading and e‑commerce use cases.

“Everybody, listen to me… And return me my ship⛵️… I’m your captain👩🏾️, I’m your captain👩🏾‍✈️”

Containers, Kubernetes ☸️, and Kubernetes Engine

Containers provide the independent, scalable ⚖️ workloads that you would get in a PaaS environment, and an abstraction layer of the operating system and hardware, like you get in an IaaS environment. Containers virtualize the operating system rather than the hardware. The environment scales⚖️ like PaaS but gives you nearly the same flexibility as Infrastructure as a Service.

Kubernetes ☸️ is an open source orchestrator for containers. K8s ☸️ makes it easy to orchestrate many containers on many hosts, scale ⚖️ them, roll out new versions of them, and even roll back to the old version if things go wrong 🙁. K8s ☸️ lets you deploy containers on a set of nodes called a cluster.

A cluster is a set of master components that control the system as a whole, and a set of nodes that run containers.

When K8s ☸️ deploys a container or a set of related containers, it does so inside an abstraction called a pod.

A pod is the smallest deployable unit in Kubernetes.

The kubectl run command starts a deployment with a container running in a pod. A deployment represents a group of replicas of the same pod. It keeps your pods running 🏃‍♂️, even if a node on which some of them run fails.

Google Kubernetes Engine (GKE) ☸️ is Secured and managed Kubernetes service ️ with four-way auto scaling ⚖️ and multi-cluster support.

  • Leverage a high-availability control plane ✈️including multi-zonal and regional clusters
  • Eliminate operational overhead with auto-repair 🧰, auto-upgrade, and release channels
  • Secure🔐 by default, including vulnerability scanning of container images and data encryption
  • Integrated Cloud Monitoring 🎛 with infrastructure, application, and Kubernetes-specific ☸️ views

GKE is like an IaaS offering in that it saves you infrastructure chores and it’s like a PaaS offering in that it was built with the needs of developers 👩‍💻 in mind.

Sample Syntax building a K8s ☸️ cluster:

gcloud container clusters create k1

In GKE to make the pods in your deployment publicly available, you can connect a load balancer🏋️‍♀️ to it by running the kubectl expose command. K8s ☸️ then creates a service with a fixed IP address for your pods.

A service is the fundamental way K8s ☸️ represents load balancing 🏋️‍♀️. K8s ☸️ attaches an external load balancer🏋️‍♀️ with a public IP address to your service so that others outside the cluster can access it.

In GKE, this kind of load balancer🏋️‍♀️ is created as a network load balancer🏋️‍♀️. This is one of the managed load balancing 🏋️‍♀️ services that Compute Engine makes available to virtual machines. GKE makes it easy to use it with containers.

A service groups a set of pods together and provides a stable endpoint for them

Imperative commands

kubectl get services – shows you your service’s public IP address

kubectl scale – scales ⚖️ a deployment

kubectl expose – creates a service

kubectl get pods – watch the pods come online

The real strength 💪 of K8s ☸️ comes when you work in a declarative way. Instead of issuing commands, you provide a configuration file (YAML) that tells K8s ☸️ what you want your desired state to look like, and Kubernetes ☸️ figures out how to do it.

When you choose a rolling update for a deployment and then give it a new version of the software it manages, Kubernetes will create pods of the new version one by one, waiting for each new version pod to become available before destroying one of the old version pods. Rolling updates are a quick way to push out a new version of your application while still sparing your users from experiencing downtime.
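
Here is a hedged sketch of that declarative flow, from a YAML file through a rolling update (the deployment name and image versions are hypothetical):

$ cat nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.17
# Tell K8s ☸️ the desired state and let it reconcile
$ kubectl apply -f nginx-deployment.yaml
# Pushing a new image version triggers a rolling update, pod by pod
$ kubectl set image deployment/nginx-deployment nginx=nginx:1.18
$ kubectl rollout status deployment/nginx-deployment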

“Going where the wind 🌬 goes… Blooming like a red rose🌹”

Introduction to Hybrid and Multi-Cloud Computing (Anthos)

Modern hybrid or multi‑cloud☁️ architectures allow you to keep parts of your system’s infrastructure on‑premises, while moving other parts to the cloud☁️, creating an environment that is uniquely suited to many companies’ needs.

Modern distributed systems allow a more agile approach to managing your compute resources

  • Move only some of your compute workloads to the cloud ☁️
  • Move at your own pace
  • Take advantage of cloud’s☁️ scalability️ and lower costs 💰
  • Add specialized services to compute resources stack

Anthos is Google’s modern solution for hybrid and multi-cloud☁️ systems and services management.

The Anthos framework rests on K8s ☸️ and GKE deployed on‑prem, which provides the foundation for an architecture that is fully integrated with centralized management through a central control plane that supports policy‑based application life cycle🔄 delivery across hybrid and multi‑cloud☁️ environments.

Anthos also provides a rich set of tools🛠 for monitoring 🎛 and maintaining the consistency of your applications across all of your network, whether on‑premises, in the cloud☁️ K8s ☸️, or in multiple clouds☁️☁️.

Anthos Configuration Management provides a single source of truth for your cluster’s configuration. That source of truth is kept in the policy repository, which is actually a Git repository.

“And I discovered🕵️‍♀️ that my castles 🏰 stand…Upon pillars of salt🧂 and pillars of sand 🏖”

App Engine (PaaS) builds a highly scalable ⚖️ application on a fully managed serverless platform.

App Engine makes deployment, maintenance, autoscaling ⚖️ workloads easy allowing developers 👨‍💻to focus on innovation

GCP provides an App Engine SDK in several languages so developers 👩‍💻 can test applications locally before uploaded to the real App Engine service.

App Engine’s standard environment provides runtimes for specific versions of Java☕️, Python🐍, PHP, and Go. The standard environment also enforces restrictions🚫 on your code by making it run in a so‑called sandbox. That’s a software construct that’s independent of the hardware, operating system, or physical location of the server it runs🏃‍♂️ on.

If these constraints don’t work for a given application, that would be a reason to choose the flexible environment.

App Engine flexible environment:

  • Builds and deploys containerized apps with a click
  • No sandbox constraints
  • Can access App Engine resources

App Engine flexible environment apps use standard runtimes and can access App Engine services such as:

  • Datastore
  • Memcache
  • Task Queues
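
A minimal sketch of deploying to App Engine standard with gcloud (the runtime and file layout are hypothetical; assumes your app code sits in the current directory):

$ cat app.yaml
runtime: python38
# Deploy the app in the current directory to App Engine
$ gcloud app deploy app.yaml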

Cloud Endpoints – Develop, deploy, and manage APIs on any Google Cloud☁️ back end.

Cloud Endpoints helps you create and maintain APIs

  • Distributed API management through an API console
  • Expose your API using a RESTful interface

Apigee Edge is also a platform for developing and managing API proxies.

Apigee Edge focuses on business problems like rate limiting, quotas, and analytics

  • A platform for making APIs available to your customers and partners
  • Contains analytics, monetization, and a developer portal

Developing in the Cloud ☁️

Cloud Source Repositories – Fully featured Git repositories hosted on GCP

Cloud Functions – Scalable ⚖️ pay-as-you-go functions as a service (FaaS) to run your code with zero server management.

  • No servers to provision, manage, or upgrade
  • Automatically scale⚖️ based on the load
  • Integrated monitoring 🎛, logging, and debugging capability
  • Built-in security🔒 at role and per function level based on the principle of least privilege
  • Key🔑 networking capabilities for hybrid and multi-cloud☁️☁️ scenarios
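
A sketch of deploying an HTTP-triggered function with gcloud (the function name and runtime are hypothetical; expects a main.py exporting the entry point):

# Deploy an HTTP function that scales ⚖️ to zero when idle
$ gcloud functions deploy hello-http --runtime python38 --trigger-http --allow-unauthenticated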

Deployment: Infrastructure as code

Deployment Manager – creates and manages cloud☁️ resources with simple templates

  • Provides repeatable deployments
  • Create a .yaml template describing your environment and use Deployment Manager to create resources
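
A hedged sketch of a Deployment Manager template and the command to launch it (resource names and zone are hypothetical):

$ cat config.yaml
resources:
- name: my-dm-vm
  type: compute.v1.instance
  properties:
    zone: us-central1-a
    machineType: zones/us-central1-a/machineTypes/f1-micro
    disks:
    - deviceName: boot
      boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-9
    networkInterfaces:
    - network: global/networks/default
      accessConfigs:
      - name: External NAT
        type: ONE_TO_ONE_NAT
# Create the deployment from the template
$ gcloud deployment-manager deployments create my-deployment --config config.yaml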

“Follow my lead, oh, how I need… Someone to watch over me”

Monitoring 🎛: Proactive instrumentation

Stackdriver is GCP’s tool for monitoring 🎛, logging and diagnostics. Stackdriver provides access to many different kinds of signals from your infrastructure platforms, virtual machines, containers, middleware and application tier; logs, metrics and traces. It provides insight into your application’s health ⛑, performance and availability. So, if issues occur, you can fix them faster.

Here are the core components of Stackdriver;

  • Monitoring 🎛
  • Logging
  • Trace
  • Error Reporting
  • Debugging
  • Profiler

Stackdriver Monitoring 🎛 checks the end points of web 🕸 applications and other Internet‑accessible services running on your cloud☁️ environment.

Stackdriver Logging lets you view logs from your applications, and filter and search on them.

Stackdriver Error Reporting tracks and groups the errors in your cloud☁️ applications and notifies you when new errors are detected.

Stackdriver Trace samples the latency of App Engine applications and reports per‑URL statistics.

Stackdriver Debugger connects your application’s production data to your source code so you can inspect the state of your application at any code location in production.

“Whoa oh oh oh oh… Something big I feel it happening”

GCP Big Data Platform – services are fully managed and scalable ⚖️ and Serverless

Cloud Dataproc is a fast, easy, managed way to run🏃‍♂️ Hadoop 🐘 MapReduce 🗺, Spark 🔥, Pig 🐷 and Hive 🐝 Service

  • Create clusters in 90 seconds or less on average
  • Scale⚖️ cluster up and down even when jobs are running 🏃‍♂️
  • Easily migrate on-premises Hadoop 🐘 jobs to the cloud☁️
  • Uses Spark🔥 Machine Learning Libraries📚 (MLib) to run classification algorithms

Cloud Dataflow🚰 – Stream⛲️ and Batch processing; unified and simplified pipelines

  • Processes data using Compute Engine instances.
  • Clusters are sized for you
  • Automated scaling ⚖️, no instance provisioning required
  • Managed expressive data Pipelines
  • Write code once and get batch and streaming⛲️.
  • Transform-based programming model
  • ETL pipelines to move, filter, enrich, shape data
  • Data analysis: batch computation or continuous computation using streaming
  • Orchestration: create pipelines that coordinate services, including external services
  • Integrates with GCP services like Cloud Storage🗄, Cloud Pub/Sub, BigQuery and BigTable
  • Open source Java☕️ and Python 🐍 SDKs

BigQuery🔎 is a fully‑managed, petabyte scale⚖️, low‑cost analytics data warehouse

  • Analytics database🛢; stream data at 100,000 rows/sec
  • Provides near real-time interactive analysis of massive datasets (hundreds of TBs) using SQL syntax (SQL 2011)
  • Compute and storage 🗄 are separated with a terabit network in between
  • Only pay for storage 🗄 and processing used
  • Automatic discount for long-term data storage 🗄
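
A quick bq CLI sketch against a public dataset (the query itself is just an example):

# Standard SQL query from the command line
$ bq query --use_legacy_sql=false 'SELECT name, SUM(number) AS total FROM `bigquery-public-data.usa_names.usa_1910_2013` GROUP BY name ORDER BY total DESC LIMIT 5'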

Cloud Pub/Sub – Scalable ⚖️, flexible🧘‍♀️ and reliable enterprise messaging 📨

Pub in Pub/Sub is short for publishers

Sub is short for subscribers.

  • Supports many-to-many asynchronous messaging📨
  • Application components make push/pull subscriptions to topics
  • Includes support for offline consumers
  • Simple, reliable, scalable ⚖️ foundation for stream analytics
  • Building block🧱 for data ingestion in Dataflow, IoT📲, Marketing Analytics
  • Foundation for Dataflow streaming⛲️
  • Push notifications for cloud-based☁️ applications
  • Connect applications across GCP (push/pull between Compute Engine and App Engine)
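
A minimal gcloud sketch of the topic/subscription round trip (topic and subscription names are hypothetical):

# Create a topic and a pull subscription, publish, then pull
$ gcloud pubsub topics create my-topic
$ gcloud pubsub subscriptions create my-sub --topic my-topic
$ gcloud pubsub topics publish my-topic --message "hello"
$ gcloud pubsub subscriptions pull my-sub --auto-ack --limit 1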

Cloud Datalab🧪 is a powerful interactive tool created to explore, analyze, transform and visualize data and build machine learning models on GCP.

  • Interactive tool for large-scale⚖️ data exploration, transformation, analysis, and visualization
  • Integrated, open source
    • Built on Jupyter

“Domo arigato misuta Robotto” 🤖

Cloud Machine Learning Platform🤖

Cloud☁️ machine‑learning platform🤖 provides modern machine‑learning services🤖 with pre‑trained models and a platform to generate your own tailored models.

TensorFlow 🧮 is an open‑source software library 📚 that’s exceptionally well suited for machine‑learning applications🤖 like neural networks🧠.

TensorFlow 🧮 can also take advantage of Tensor 🧮 processing units (TPU), which are hardware devices designed to accelerate machine‑learning 🤖 workloads with TensorFlow 🧮. GCP makes them available in the cloud☁️ with Compute Engine virtual machines.

Generally, applications that use machine‑learning platform🤖 fall into two categories, depending on whether the data worked on is structured or unstructured.

For structured data, ML 🤖 can be used for various kinds of classification and regression tasks, like customer churn analysis, product diagnostics, and forecasting. In addition, Detection of anomalies like fraud detection, sensor diagnostics, or log metrics.

For unstructured data, ML 🤖 can be used for image analytics, such as identifying damaged shipment, identifying styles, and flagging🚩content. In addition, ML🤖 can be used for text analytics like a call 📞 center log analysis, language identification, topic classifications, and sentiment analysis.

Cloud Vision API 👓 derives insights from your images in the cloud☁️ or at the edge with AutoML Vision👓 or use pre-trained Vision API👓 models to detect emotion, understand text, and more.

  • Analyze images with a simple REST API
  • Logo detection, label detection
  • Gain insights from images
  • Detect inappropriate content
  • Analyze sentiment
  • Extract text

Cloud Natural Language API 🗣extracts information about people, places, events, (and more) mentioned in text documents, news articles, or blog posts

  • Uses machine learning🤖 models to reveal structure and meaning of text
  • Extract information about items mentioned in text documents, news articles, and blog posts

Cloud Speech API 💬 enables developers 👩‍💻 to convert audio to text.

  • Transcribe your content in real time or from stored files
  • Deliver a better user experience in products through voice 🎤 commands
  • Gain insights from customer interactions to improve your service

Cloud Translation API🈴 provides a simple programmatic interface for translating an arbitrary string into a supported language.

  • Translate arbitrary strings between thousands of language pairs
  • Programmatically detect a document’s language
  • Support for dozens of languages

Cloud Video Intelligence API📹 enable powerful content discovery and engaging video experiences.

  • Annotate the contents of videos
  • Detect scene changes
  • Flag inappropriate content
  • Support for a variety of video formats

“Fly away, high away, bye bye…” 🦋

We will continue next week with Part II of this series….

Thanks –

–MCS

Week of June 19th

“I had some dreams, they were clouds ☁️ in my coffee☕️ … Clouds ☁️ in my coffee ☕️ , and…”

Hi All –

Last week, we explored Google’s fully managed “No-Ops” Cloud ☁️ DW solution, BigQuery🔎. So naturally it made sense to drink🍹more of the Google Kool-Aid and further discover the data infrastructure offerings within the Google fiefdom 👑. Besides, we have been wanting to find out what all the hype was about with Datafusion ☢️ for some time now, which we finally did, and happily😊 wound up getting a whole lot more than we bargained for…

To take us through Google’s stratosphere☁️ would be none other than some of the more prominent members of the Google Cloud Team: Evan Jones, Julie Price, and Gwendolyn Stripling. Apparently, these Googlers (all of whom seem to have mastered the art of using their hands👐 while speaking) collaborated with other data aficionados at Google on a 6-course compilation of awesomeness😎 for the Data Engineering on the Google Cloud☁️ Path. The course that fit the bill to start this week’s learning off was Building Batch Data Pipelines on GCP

Before we were able to dive right into DataFusion☢️, we first started off with a brief review of EL (Extract and Load), ELT (Extract, Load and Transform), and ETL (Extract, Transform and Load).

The best way to think of these types of data extraction is the following:

  • EL is like a package📦 delivered right to your door🚪 where the contents can be taken right out of the box and used. (data can be imported “as is”)
  • ELT is like a hand truck 🛒 which allows you to move packages easily, but the packages 📦 📦 still need to be unloaded and items possibly stored a particular way.
  • ETL is like a forklift 🚜 – this is when heavy lifting needs to be done to transform packages and have them fit in the right place

In the case of EL and ELT, our flavor du jour in the Data Warehouse space, BigQuery🔎, is an ideal target 🎯 system, but when you need the heavy artillery (ETL) that’s when you’ve got to bring in an intermediate solution. The best way to achieve these goals is the following:

  • Data pipelines 
  • Manage pipelines 
  • UI to build those pipelines

Google offers several data transformation and streaming pipeline solutions (Dataproc🔧 and Dataflow🚰) and one easy-to-use UI (DataFusion☢️) that makes it easy to build those pipelines. Our first stop was Dataproc🔧, which is a fast, easy-to-use, fully managed cloud☁️ service meant for Apache Spark⚡️ and Apache Hadoop🐘 clusters. Hadoop🐘 solutions are generally not really our area of expertise, but nevertheless we spent some time here to get a good general understanding of how this solution works, and since Datafusion ☢️ sits on top of Dataproc🔧, it was worth our while to understand how it all works.
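
As a rough sketch, spinning up and tearing down one of these Dataproc🔧 clusters from the CLI looks like this (cluster name, region, and job file are hypothetical):

# Create a small cluster, submit a PySpark job, then delete the cluster
$ gcloud dataproc clusters create my-cluster --region us-central1 --num-workers 2
$ gcloud dataproc jobs submit pyspark my_job.py --cluster my-cluster --region us-central1
$ gcloud dataproc clusters delete my-cluster --region us-central1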

Next, we ventured over to the much anticipated Datafusion☢️, which was more than worth our wait! Datafusion☢️ uses ephemeral Dataproc🔧 VMs to perform all the transforms in batch data pipelines (streaming is currently not supported but coming soon through Dataflow🚰 support). Under the hood, Datafusion☢️ leverages five main components:

1.     Kubernetes☸️ Engine (runs in a containerized environment on GKE)

2.     Key🔑 Management Service (For Security)

3.     Persistent Disk

4.     Google Cloud☁️ Storage (GCS) (For long term storage)

5.     Cloud☁️ SQL – (To manages user and Pipeline Data)

The good news is that you don’t really need to muck around with any of these components. In fact, you shouldn’t even concern yourself with them at all. I just mentioned them because I thought it was kind of a cool stack 😎. The most important part of Datafusion☢️ is the Datafusion☢️ Studio, which is the graphical “no code” tool that allows Data Analysts and ETL Developers to wrangle data and build batch data pipelines. Basically, it allows you to build pretty complex pipelines by simple “drag and drop”.

“Don’t reinvent the wheel, just realign it.” – Anthony J. D’Angelo

So now, with a cool 😎 and easy-to-build batch pipeline UI under our belt, what about a way to orchestrate all these pipelines? Well, Google pulled no punches🥊🥊 and gave us Cloud☁️ Composer, which is a fully managed data workflow and orchestration service that allows you to schedule and monitor pipelines. Following the motto of “not reinventing the wheel”, Cloud☁️ Composer leverages Apache Airflow 🌬.

For those who don’t know, Apache Airflow 🌬 is the popular Data Pipeline orchestration tool originally developed by the fine folks at Airbnb. Airflow🌬 is written in Python 🐍 (our new favorite programming language), and workflows are created via Python 🐍 scripts (1 Python🐍 file per DAG). Airflow 🌬 uses directed acyclic graphs (DAGs) to manage workflow orchestration. Not to be confused with an uncool person or an unpleasant sight on a sheep 🐑, a DAG* is simply a collection of all the tasks you want to run, organized in a way that reflects their relationships and dependencies.

“Take a bow for the new revolution… Smile and grin at the change all around me”

Next up on our adventures was onto Dataflow🚰 which is a fully managed streaming 🌊 analytics service that minimizes latency, processing time, and cost through autoscaling as well as batch processing.  So why Dataflow🚰 and not DataProc🔧?

No doubt, Dataproc🔧 is a solid data pipeline solution which meets most requirements for either Batch or Streaming🌊Pipelines but it’s a bit clunky and requires existing knowledge of Hadoop🐘/Spark⚡️ infrastructure. 

Dataproc🔧 is still an ideal solution for those who want to bridge 🌉 the gap by moving their on-premise Big Data Infrastructure to GCP. However, if you have a greenfield project, then Dataflow🚰 definitely seems like the way to go.

DataFlow🚰 is “Server-less” which means the service “just works” and you don’t need to manage it! Once again, Google holds true to form with our earlier mantra (“not reinventing the wheel”) as Cloud Dataflow🚰 is built on top of the popular batch and streaming pipeline solution Apache Beam.

For those not familiar with Apache BEAM (Batch and StrEAM)  it was also developed by Google to ensure the perfect marriage between batch and streaming data-parallel processing pipelines. A true work of art!

The show must go on….

So ya… Thought ya. Might like to go to the show…To feel the warm🔥 thrill of confusion. That space cadet glow

Now that we were on a roll with our journey through GCP’s Data Ecosystem, it seemed logical to continue our path with the next course Building Resilient Streaming Analytics Systems on GCP. This exposition was taught by the brilliant Raiyann Serang, who maintains a well-kempt hairdo throughout his presentations, and the distinguished Nitin Aggarwal, as well as the aforementioned Evan Jones.

First, Raiyann takes us through a brief introduction on streaming 🌊 data (data processing for unbounded data sets). In addition, he provides the reasons for streaming 🌊 data and the value that streaming🌊 data provides to the business by enabling real-time information in a dashboard or another means to see the current state. He touches on the ideal architectural model using Google Pub/Sub and Dataflow🚰 to construct a data pipeline that minimizes latency at each step during the ingestion process.

Next, he laments about the infamous 3Vs in regard to streaming 🌊 data and how a data engineer might deal with these challenges.

Volume

  • How to ingest this data into the system?
  • How to store and organize data to process this data quickly?
  • How will the storage layer be integrated with other processing layers?

Velocity

  • 10,000 records/sec being transferred (Stock market Data, etc.)
  • How do systems need to be able to handle the load change?

Variety

  • Type and format of data and the constraints of processing

Next, he provides a preview of the rest of the course as he unveils Google’s triumvirate for the streaming data challenge: Pub/Sub to deal with variable volumes of data, Dataflow🚰 to process data without undue delays, and BigQuery🔎 to address the need for ad-hoc analysis and immediate insights.

Pure Gold!

After a great introduction, Raiyann takes us right to Pub/Sub. Fortunately, we had been to this rodeo before and were well aware of the value of Pub/Sub. Pub/Sub is a ready-to-use asynchronous distribution system that fully manages data ingestion for both on-cloud ☁️ and on-premise environments. It’s a highly desirable solution when it comes to streaming solutions because of how well it addresses Availability, Durability, and Scalability.

The short story around Pub/Sub is a story of two data structures, the topic and the subscription. The Pub/Sub client that creates the topic is called the publisher, and the Cloud Pub/Sub client that creates the subscription is the subscriber. Pub/Sub provides both Push (messages are delivered to a subscriber endpoint as they arrive) and Pull (clients periodically call for messages and acknowledge each message as a separate step) deliveries.

Now that we covered how to ingest data, it was time to move to the next major piece in our data architectural model, and that is how to process the data without undue delays: Dataflow🚰

Taking us through this part of the journey would be Nitin. We had already covered Dataflow🚰 earlier in the week in the previous course but that was only in regards to batch data (Bound data or unchanging data) pipelines. 

DataFlow🚰, if you remember, is built on Apache Beam; in other words, it has “the need for speed” and can support streams🌊 of data. Dataflow🚰 is highly scalable with low-latency processing pipelines for incoming messages. Nitin further discusses the major challenges with handling streaming or real-time data and how DataFlow🚰 tackles these obstacles.

  • Streaming 🌊 data generally only grows larger and more frequent
  • Fault Tolerance – Maintain fault tolerance despite increasing volumes of data
  • Model – Is it streaming or repeated batch?
  • Time – (Latency) what if data arrives late

Next, he discusses one of DataFlow🚰 key strengths “Windowing” and provides details in the three kinds of Windows.

  • Fixed – Divides Data into time Slices
  • Sliding – Those you use for computing (often required when doing aggregations over unbounded data)
  • Sessions – defined by minimum gap duration and timing is triggered by another element (communication is bursty)

Then Nitin rounds it off with one of the key concepts when it comes to Streaming🌊 data pipelines: the “watermark trigger”. The summit of this module is the lab on Streaming🌊 Data Processing, which requires building a full end-to-end solution using Pub/Sub, Dataflow🚰, and BigQuery🔎. In addition, he gave us a nice introduction to Google Cloud☁️ Monitoring, which we had not seen before.

So much larger than life… I’m going to watch it growing 

We next headed over to another spoke in the data architecture wheel 🎡 with Google’s Bigtable. Bigtable (built on Colossus) is Google’s NoSQL solution for high-performance applications. We hadn’t done much with NoSQL up until this point, so this module offered us a great primer for future travels.

Bigtable is ideal for storing very large amounts of data in a key-value store or non-structured data and it supports high read and write throughput at low latency for fast access to large datasets. However, Bigtable is not a good solution for Structured data, small data (< TB) or data that requires SQL Joins. Bigtable is good for specific use cases like real-time lookups as part of an application, where speed and efficiency are desired beyond that of a database. When Bigtable is a good match for specific workloads “it’s so consistently fast that it is magical 🧙‍♂️”.

“And down the stretch they 🐎 come!”

Next up, Evan takes us down the homestretch by surveying Advanced BigQuery 🔎 Functionality and Performance. He first begins with an overview and a demo of BigQuery 🔎 GIS (Geographic Information Systems) functions, which allow you to analyze and visualize geospatial data in BigQuery🔎. This is a little beyond the scope of our usual musings, but it’s good to know from an informational standpoint. Then Evan covers a critical topic for any data engineer or analyst to understand, which is how to break apart a single data set into groups with Window Functions.

This was followed by a lab that demonstrated some neat tricks on how to reduce I/O, cache results, and perform efficient joins by using the WITH clause, changing the parameter of region location, and denormalizing the data, respectively. Finally, Evan leaves us with a nice parting gift by providing a handy cheatsheet and a quick lab on Partitioned Tables in Google BigQuery🔎

* DAG is a directed graph data structure that uses a topological ordering. The sequence can only go from earlier to later. DAG is often applied to problems related to data processing, scheduling, finding the best route in navigation, and data compression.

“It’s something unpredictable, but in the end it’s right”

Below are some topics I am considering for my travels next week:

  • NoSQL – MongoDB, Cosmos DB
  • More on Google Data Engineering with Google Cloud Path <- Google Cloud Certified Professional Data Engineer
  • Working with JSON Files
  • Working with Parquet files 
  • JDBC Drivers
  • More on Machine Learning
  • ONTAP Cluster Fundamentals
  • Data Visualization Tools (i.e. Looker)
  • Additional ETL Solutions (Stitch, FiveTran) 
  • Processing and transforming data/exploring data through ML (i.e. Databricks)

Stay safe and Be well

—MCS

Week of June 12th

“Count to infinity, ones and zeroes”

Happy Russia 🇷🇺 Day!

Earth🌎 below us, Drifting, falling, Floating, weightless, Coming, coming home

After spending the previous two weeks in the far reaches 🚀 of space 🌌, it was time 🕰 to return to my normal sphere of activity and the more familiar data realm. This week we took a journey into Google’s data warehouse solution, better known as BigQuery🔎. This was our third go-around in GCP, as we had previously looked at Google’s Compute Engine and its data messaging service Pub/Sub.

It seems the more we dive into the Google Cloud Platform ecosystem, the more impressed we become with the GCP offerings. As for data warehouses, we had previously visited Snowflake❄️, which we found to be a nifty SaaS solution for DW. After tinkering around a bit with BigQuery🔎, we found it to be equally utilitarian. BigQuery🔎, like its strong competitor, offers a “No-Ops” approach to data warehousing while adhering to the 3Vs of Big Data. Although we didn’t benchmark either DW solution, both offerings are highly performant based on the 99-query TPC-DS industry benchmark.

In the case of Snowflake❄️, it offers data professionals the flexibility to scale compute and storage resources up and down independently based on workloads, whereas BigQuery🔎 is “serverless” and all scaling is done automatically. BigQuery🔎 doesn’t use indexes; rather, it relies on its awesome clustering technology to make its queries scream 💥.

Both Snowflake❄️ and BigQuery🔎 are quite similar in their low maintenance and minimal task administration, but where both products make your head spin🌪 is trying to make heads or tails of their pricing models. Snowflake’s❄️ pricing is a little easier to interpret, whereas with BigQuery🔎 it seems like you need a Ph.D. in cost models just to read through the fine print.

Which product offers a better TCO really depends on your individual workloads. From what I gather, if you’re running lots of queries sporadically, with high idle time, then BigQuery🔎 is the place to be from a pricing standpoint. However, if your workloads are more consistent, then it’s probably more cost-effective to go with Snowflake❄️, based on their pay-as-you-go model.

Assisting us on our exploration of BigQuery🔎 was the bright and talented Janani Ravi, founder of LoonyCorn, through her excellent Pluralsight course. Janani gives a great overview of the solution and keeps the course flowing at a reasonable pace as we take a deep dive into this complex data technology.

One interesting observation about the course, as it was published about a year and a half ago (15 Oct 2018), is how many improvements Google has made to the product since then, including a refined UI, more options for partition keys, and enhancements to the Python module. The course touches on design, comparisons to RDBMSs and other DWs, and shows us ingestion of different file types, including the popular binary format Avro.

The meat🥩 and potatoes🥔 of the course is the section on Programmatically Accessing BigQuery from Client Programs. This is where Janani goes into some advanced programming options like the UNNEST, ARRAY_AGG, and STRUCT operators and the powerful windowing operations.
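
For a taste of those operators, here is a self-contained example of mine (no real dataset needed) showing ARRAY_AGG, STRUCT, and UNNEST in one round trip through the BigQuery Python client:

```python
from google.cloud import bigquery

client = bigquery.Client()

sql = """
WITH orders AS (
  SELECT 'alice' AS customer, 10 AS amount UNION ALL
  SELECT 'alice', 25 UNION ALL
  SELECT 'bob', 7
),
rollup AS (
  -- ARRAY_AGG + STRUCT pack each customer's orders into a repeated field
  SELECT customer, ARRAY_AGG(STRUCT(amount)) AS items
  FROM orders
  GROUP BY customer
)
-- UNNEST flattens the repeated field back into one row per element
SELECT customer, item.amount
FROM rollup, UNNEST(items) AS item
"""

for row in client.query(sql).result():
    print(row.customer, row.amount)
```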

See log for more details

To round out the course, she takes us through some additional nuggets in GCP like Google Data Studio (https://datastudio.google.com/) for data visualization, and Cloud☁️ Notebooks📓 with Python via Google Datalab🧪.

Stay, ahhh Just a little bit longer Please, please, please, please, please Tell me that you’re going to….

Below are some topics I am considering for my travels next week:

  • Google Cloud Data Fusion (EL/ETL/ELT)
  • More on Google Big Query
  • More on Data Pipelines
  • NoSQL – MongoDB, Cosmos DB
  • Working with JSON Files
  • Working with Parquet files 
  • JDBC Drivers
  • More on Machine Learning
  • ONTAP Cluster Fundamentals
  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Processing and transforming data/exploring data through ML (i.e. Databricks)

Stay safe and Be well –

–MCS

Week of May 22nd

And you know that notion just cross my mind…

Happy Bitcoin Pizza 🍕 Day!

All aboard! This week our travels would take us on the railways far and high, but before we could hop on the knowledge express, we had some unfinished business to attend to.

“Oh, I get by with a little help from my friends”

If you have been following my weekly submissions for the last few weeks, I listed as a future action item: “create/configure a solution that leverages Python to stream market data and insert it into a relational database.”

Well, last week I found just the perfect solution: a true masterpiece by Data Scientist/Physicist extraordinaire AJ Pryor, Ph.D. AJ had created a brilliant multithreaded work of art that continuously queries market data from IEX and then writes it to a PostgreSQL database. In addition, he built a data visualization front end that leverages Pandas and Bokeh so the application can run interactively in a standard web browser. It was like a dream come true! Except that the code was written some three years ago and referenced a deprecated API from IEX.

OK, no problem. We would simply modify AJ’s “Mona Lisa” to reference the new IEX API and off we would go. Well, what seemed like a dream turned into a virtual nightmare. I spent most of last week spinning my wheels trying to get the code to work, but to no avail. I even reached out to the community on Stack Overflow, but all I received was crickets.

Just as I was ready to cut my losses, I reached out to a longtime good friend who happens to be an all-star programmer and a fellow NY Yankees baseball enthusiast. Python wasn’t his specialty (he is really an amazing Java programmer), but he offered to take a look at the code when he had some time… So we set up a Zoom call this past Sunday and I let his wizardry take over… After about an hour or so he was in a state of flow and had a good pulse on what our maestro AJ’s work was all about. After a few modifications, my good chum had the code working and humming along. I ran into a few hiccups along the way with the Bokeh code, but my confidant just referred me to some simpler syntax and then, abracadabra… this masterpiece was now working on the Mac! 💻

As the new week started, I was still basking in the radiance of this great coding victory. So, I decided to be a bit ambitious and move this gem 💎 to the cloud ☁️, which would be the crème de la crème of our learnings thus far. Cloud, Python/Pandas, streaming market data, and Postgres all wrapped up in one! Complete and utter awesomeness!

Now the question was for which cloud platform to go with? We were well versed in the compute area in all 3 of the major providers as a result of our learnings.

So, with a flip of the coin, we decided to go with Microsoft Azure. That, and we had some free credits still available.

With sugar plum fairies dancing in our heads, we spun up our Ubuntu image and followed along with the well-documented steps on AJ’s GitHub project.

Now we were cooking with gasoline! We cloned AJ’s GitHub repo, modified the code with our new changes, and executed the syntax, and just as we were ready to declare victory… a stack overflow error! Oh, the pain.

Fortunately, I didn’t waste any time; I went right back to my ace in the hole, though with some trepidation that I was being too much of an irritant.

I explained my perplexing predicament, and without hesitation my fidus Achates offered some great troubleshooting tips; quite expeditiously, we had the root cause pinpointed. For some peculiar reason, the formatting of the URL that worked like a charm on the Mac💻 caused dyspepsia on Ubuntu on Azure. It was certainly a mystery, but one that could only be solved by simply rewriting the code.

Once again, my comrade in arms had helped me through a quagmire. So, without further ado, may I introduce to you the one and only…

http://stockstreamer.eastus.cloudapp.azure.com:5006/stockstreamer

We’ll hit the stops along the way We only stop for the best

Feeling victorious after my own personal Battle of Carthage, and with our little streaming-market-data saga out of our periphery, it was time to hit the rails… 🚂

Our first stop was messaging services, which are all the rage nowadays. There are so many data messaging services out there, so where to start? We went with Google’s Pub/Sub, which turned out to be a marvelous choice! To get enlightened with this solution, we went to Pluralsight, where we found an excellent course, Architecting Stream Processing Solutions Using Google Cloud Pub/Sub by Vitthal Srinivasan.

Vitthal was a great conductor who navigated us through an excellent overview of Google’s impressive solution, its use cases, and even its rather complex pricing structure in the first lesson. He then takes us deep into the weeds, showing us how to create topics, publishers, and subscribers. He goes further by showing us how to leverage some other tremendous offerings in GCP like Cloud Functions, APIs & Services, and Storage.

Before this amazing course my only exposure was limited to GCP’s Compute Engine, so it was an eye-opening experience to see the great power GCP has to offer! To round out the course, he showed us how to use GCP Pub/Sub with the client libraries, which was an excellent tutorial on how to use Python with this awesome product. There were even two modules on how to integrate a Google Hangouts chatbot with Pub/Sub, but that required you to be a G Suite user. (There was a free trial, but I skipped the setup and just watched the videos.) Details on the work I did on Pub/Sub can be found at
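
For flavor, here is a minimal publish-and-pull sketch of my own with the google-cloud-pubsub library; the project, topic, and subscription IDs are hypothetical, and both the topic and subscription are assumed to already exist:

```python
from google.cloud import pubsub_v1

PROJECT = "my-project"             # hypothetical
TOPIC = "market-ticks"             # hypothetical
SUBSCRIPTION = "market-ticks-sub"  # hypothetical

# Publish a message (payloads are always bytes).
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT, TOPIC)
future = publisher.publish(topic_path, b"AAPL,349.72")
print("Published message ID:", future.result())

# Pull it back asynchronously with a callback.
subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path(PROJECT, SUBSCRIPTION)

def callback(message):
    print("Received:", message.data)
    message.ack()  # acknowledge so Pub/Sub stops redelivering

streaming_pull = subscriber.subscribe(sub_path, callback=callback)
try:
    streaming_pull.result(timeout=10)  # listen for ~10 seconds
except Exception:
    streaming_pull.cancel()
```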

“I think of all the education that I missed… But then my homework was never quite like this”

For a bonus this week, I spent an enormous amount of time brushing up on my 8th grade math and science curriculum (a toy example for the first item follows the list):

  1. Linear Regression
  2. Epigenetics
  3. Protein Synthesis
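
For the linear regression item, a toy least-squares fit (data points made up) is only a few lines of numpy:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 4.0, 6.2, 8.1, 9.9])  # roughly y = 2x

# polyfit with degree 1 returns the best-fit slope and intercept.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"y ≈ {slope:.2f}x + {intercept:.2f}")
```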

Below are some topics I am considering for my Journey next week:

  • Vagrant with Docker
  • Continuing with Data Pipelines
  • Google Cloud Data Fusion (ETL/ELT)
  • More on Machine Learning
  • ONTAP Cluster Fundamentals
  • Google Big Query
  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Processing and transforming data/exploring data through ML (i.e. Databricks)
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well –

–MCS 

Week of April 17th

“Seems so good… Hard to believe an end to it.. Warehouse is bare”

Greetings and Salutations..

My odyssey this week didn’t quite have the twists and turns of last week’s pilgrimage. Perhaps it was a bit of a hangover from “Holy Week,” or just sheer lassitude from an overabundance of learning new skills during this time of quarantine?

Last weekend, I finally got a chance to tinker around with the Raspberry Pi. I managed to get it set up on my Wi-Fi network, thanks to finding an HDMI cable to hook it up to my 50″ Samsung TV. I also learned how to set up VS Code on Amanda’s Mac and connect via SSH to my Pi without using a password (creating a key instead), which is a convenient little trick. In addition, I got to play around with the Sense HAT Python module and made the Raspberry Pi display some trippy flashing lights on the LED board add-on.
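
In the spirit of those trippy lights, here is a tiny sketch (my own, assuming the sense-hat package and the LED matrix are present) that flashes random colors:

```python
import random
import time

from sense_hat import SenseHat

sense = SenseHat()
for _ in range(50):
    # Paint all 64 pixels of the 8x8 LED matrix with random RGB colors.
    pixels = [[random.randint(0, 255) for _ in range(3)] for _ in range(64)]
    sense.set_pixels(pixels)
    time.sleep(0.1)
sense.clear()  # turn the matrix off when done
```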

So after being on cloud nine most of last week, I decided I should stay a little longer in the stratosphere and bang on the Google Cloud Platform, more endearingly known as GCP, and its Compute Engine. Again, I would rebuild my SQL Server 2016 Always On environment (previously created several weeks back on AWS, and last week on Azure). Once again, the goal would be to compare and contrast all 3 major cloud providers. In an effort to avoid redundancy here, I won’t reiterate the prerequisites for building out my SQL farm. However, I will share my observations on GCP.

GCP Compute Engine – Issues and Observations:

  • More options than Azure but fewer than AWS – the interface was a bit confusing to navigate
  • The 1st instance I built (the Domain Controller) got corrupted when I upgraded my account from Free to Paid; the instance needed to be rebuilt
  • Windows domain creation failed because the Administrator account was set to blank (it is disabled on AWS EC2 and Azure VMs)
  • Disks can only be detached from an instance that is in a “Stopped” state

Here are my final rankings for the 3 cloud providers I evaluated:

Pros

1. Best UI: AWS

2. Most Features: AWS

3. Easiest to Navigate UI: Azure

4. Most suited for Microsoft workloads (i.e. SQL Server): Azure

5. Most enticing to use: GCP (free offer)

6. Most potential: GCP

Cons

1. Hardest to Navigate UI: GCP

2. Hardest to configure (security): AWS

3. Least amount of features: Azure

So after burning out my fuse up there alone… it was time to touch down and bring me round again… And that led me back to SQL Server and columnstore indexes. Here is what I did (a pyodbc sketch follows the list):

  • Downloaded/restored a modified AdventureWorksDW2012 database
    • Created 60 million rows for the Internet Sales fact table
    • Created a columnstore index and benchmarked its performance vs. the old row-based index
    • Created a table partition for the Internet Sales fact table and designed a partition-switching scenario (the workaround used with SQL 2012 and non-clustered columnstore indexes), per the old Lava Archive database design
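
As a rough sketch of the columnstore step (the connection string, index name, and column list are my own assumptions, not the exact lab syntax), creating the non-clustered columnstore index from Python via pyodbc looks like this:

```python
import pyodbc

# Hypothetical local connection; adjust driver/server/auth as needed.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=AdventureWorksDW2012;Trusted_Connection=yes;",
    autocommit=True,
)
cursor = conn.cursor()

# Columnstore indexes store data column-wise and compress heavily,
# which is what speeds up scans over the 60-million-row fact table.
# On SQL Server 2012 a non-clustered columnstore index makes the
# table read-only -- hence the partition-switching workaround above.
cursor.execute("""
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_FactInternetSales_CS
ON dbo.FactInternetSales (OrderDateKey, ProductKey, OrderQuantity, SalesAmount)
""")
```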

Next Steps…
Below are some topics I am considering for my voyage next week:

  •  SQL Server Advanced Features:
    • Best practices around SQL Server AlwaysOn (Snapshot Isolation/sizing of Tempdb, etc)
  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Processing and transforming data/exploring data through ML (i.e. Databricks)
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well

—MCS