Week of June 11 Detailed Log

SELECT * FROM `bigquery-public-data.samples.shakespeare` LIMIT 10

Enable GCP Shell

gcloud auth login

$ bq ls

$ bq ls publicdata:samples

$ bq mk spikeysales_dataset

$ bq ls

$ bq rm spikeysales_dataset

$ bq show publicdata:samples

$ bq show publicdata:samples.shakespeare

$ bq query "SELECT * FROM publicdata:samples.shakespeare LIMIT 10"

Creating and Editing Access to a Dataset

bq show --format=prettyjson spikeysales_dataset > dataset_spikeysales.json

bq update --source dataset_spikeysales.json spikeysales_dataset

bq show --format=prettyjson spikeysales_dataset

Creating and Querying Ingestion Time Partitioned Tables

bq show --schema --format=prettyjson spikeysales_dataset.e_commerce_data

[
  {"mode": "NULLABLE", "name": "RowID", "type": "INTEGER"},
  {"mode": "NULLABLE", "name": "OrderID", "type": "STRING"},
  {"mode": "NULLABLE", "name": "OrderDate", "type": "DATE"},
  {"mode": "NULLABLE", "name": "ShipDate", "type": "DATE"},
  {"mode": "NULLABLE", "name": "ShipMode", "type": "STRING"},
  {"mode": "NULLABLE", "name": "CustomerID", "type": "STRING"},
  {"mode": "NULLABLE", "name": "CustomerName", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Segment", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Address_Country", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Address_City", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Address_State", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Address_Postal_Code", "type": "INTEGER"},
  {"mode": "NULLABLE", "name": "Address_Region", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Product_ID", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Category", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Sub_Category", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Product_Name", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Sales", "type": "FLOAT"},
  {"mode": "NULLABLE", "name": "Quantity", "type": "INTEGER"},
  {"mode": "NULLABLE", "name": "Discount", "type": "FLOAT"},
  {"mode": "NULLABLE", "name": "Profit", "type": "FLOAT"},
  {"mode": "NULLABLE", "name": "row_number", "type": "INTEGER"}
]
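For reference, a minimal, hedged sketch of how an ingestion-time partitioned table like this could be created and queried (the schema file name and the partition date are assumptions, not taken from the actual run):

bq mk --table --time_partitioning_type=DAY spikeysales_dataset.e_commerce_data ./e_commerce_schema.json

bq query --nouse_legacy_sql 'SELECT COUNT(*) FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data` WHERE _PARTITIONTIME = TIMESTAMP("2020-06-01")'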

Creating Column Based Partitioned Tables Using the Command Line

bq mk --table --schema OrderID:STRING,OrderDate:DATE,Quantity:INTEGER --time_partitioning_field OrderDate spikey-bq-279714:spikeysales_dataset.order_data

bq --location=US query --nouse_legacy_sql --destination_table=spikey-bq-279714:spikeysales_dataset.order_data 'SELECT OrderID, OrderDate, Quantity FROM spikeysales_dataset.e_commerce_data WHERE OrderDate IS NOT NULL LIMIT 1000'

bq show spikeysales_dataset.order_data

bq --location=US query --nouse_legacy_sql 'SELECT * FROM `spikey-bq-279714.spikeysales_dataset.order_data` WHERE OrderDate = DATE("2014-11-01")'

Populating Data into a Table with Nested Fields Using STRUCT

bq show --schema --format=prettyjson spikeysales_dataset.e_commerce_partitioned

[
  {"mode": "NULLABLE", "name": "RowID", "type": "INTEGER"},
  {"mode": "NULLABLE", "name": "OrderID", "type": "STRING"},
  {"mode": "NULLABLE", "name": "OrderDate", "type": "DATE"},
  {"mode": "NULLABLE", "name": "ShipDate", "type": "DATE"},
  {"mode": "NULLABLE", "name": "ShipMode", "type": "STRING"},
  {"mode": "NULLABLE", "name": "CustomerID", "type": "STRING"},
  {"mode": "NULLABLE", "name": "CustomerName", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Segment", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Address_Country", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Address_City", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Address_State", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Address_Postal_Code", "type": "INTEGER"},
  {"mode": "NULLABLE", "name": "Address_Region", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Product_ID", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Category", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Sub_Category", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Product_Name", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Sales", "type": "FLOAT"},
  {"mode": "NULLABLE", "name": "Quantity", "type": "INTEGER"},
  {"mode": "NULLABLE", "name": "Discount", "type": "FLOAT"},
  {"mode": "NULLABLE", "name": "Profit", "type": "FLOAT"},
  {"mode": "NULLABLE", "name": "row_number", "type": "INTEGER"}
]

Design Schema with STRUCT

[
  {"mode": "NULLABLE", "name": "RowID", "type": "INTEGER"},
  {"mode": "NULLABLE", "name": "OrderInfo", "type": "RECORD", "fields": [
    {"mode": "NULLABLE", "name": "OrderID", "type": "STRING"},
    {"mode": "NULLABLE", "name": "OrderDate", "type": "DATE"},
    {"mode": "NULLABLE", "name": "Sales", "type": "FLOAT"},
    {"mode": "NULLABLE", "name": "Quantity", "type": "INTEGER"}
  ]},
  {"mode": "NULLABLE", "name": "ShipDate", "type": "DATE"},
  {"mode": "NULLABLE", "name": "ShipMode", "type": "STRING"},
  {"mode": "NULLABLE", "name": "CustomerID", "type": "STRING"},
  {"mode": "NULLABLE", "name": "CustomerName", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Segment", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Address", "type": "RECORD", "fields": [
    {"mode": "NULLABLE", "name": "Country", "type": "STRING"},
    {"mode": "NULLABLE", "name": "City", "type": "STRING"},
    {"mode": "NULLABLE", "name": "State", "type": "STRING"},
    {"mode": "NULLABLE", "name": "Postal_Code", "type": "INTEGER"},
    {"mode": "NULLABLE", "name": "Region", "type": "STRING"}
  ]},
  {"mode": "NULLABLE", "name": "Product_ID", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Category", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Sub_Category", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Product_Name", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Discount", "type": "FLOAT"},
  {"mode": "NULLABLE", "name": "Profit", "type": "FLOAT"},
  {"mode": "NULLABLE", "name": "row_number", "type": "INTEGER"}
]
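A hedged sketch of the next step (the schema file name is an assumption): if the STRUCT schema above were saved locally, the nested table used below could be created from it with:

bq mk --table spikey-bq-279714:spikeysales_dataset.e_commerce_data_nested ./nested_schema.json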

Populate Data from existing Table

INSERT INTO

   `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested` 

   (ROWID,

    OrderInfo, 

    ShipDate, 

    ShipMode, 

    CustomerID, 

    CustomerName, 

    Segment, 

    Address, 

    Product_ID, 

    Category, 

    Sub_Category, 

    Product_Name, 

    Discount, 

    Profit )

SELECT 

RowID, 

STRUCT (OrderID, OrderDate, Sales, Quantity),

ShipDate, 

ShipMode, 

CustomerID, 

CustomerName, 

Segment, 

STRUCT (Address_Country, 

       Address_City, 

       Address_State, 

       Address_Postal_Code, 

       Address_Region), 

Product_ID, 

Category, 

Sub_Category, 

Product_Name, 

Discount, 

Profit

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_partitioned` 

Working with Repeated Fields

INSERT INTO 

`spikey-bq-279714.spikeysales_dataset.customer_order_details`

(customer_name,order_id, product_names)

VALUES

  ("John", 434, ['Trousers', 'Notebook', 'Football', 'Avocado'])
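A minimal sketch, with assumed column types (the actual table was created elsewhere in the project): the customer_order_details table with its repeated field could be defined with standard SQL DDL along these lines:

bq query --nouse_legacy_sql 'CREATE TABLE IF NOT EXISTS spikeysales_dataset.customer_order_details (customer_name STRING, order_id INT64, product_names ARRAY<STRING>)'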

Populating Tables with Repeated Fields Using ARRAY_AGG

INSERT INTO

`spikey-bq-279714.spikeysales_dataset.customer_product_price` (

CustomerID,

Products_Bought,

Price_Paid )

SELECT

CustomerID,

ARRAY_AGG(Product_Name) AS Products_Bought,

ARRAY_AGG(Sales) AS Price_Paid

FROM

`spikey-bq-279714.spikeysales_dataset.e_commerce_data`

GROUP BY

CustomerID 

Using Nested and Repeated Fields Together

SELECT 

Address_City,

ARRAY_AGG (STRUCT (CustomerID,

CustomerName)) AS CustomerInfo

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data`

GROUP BY

  Address_City

ORDER BY

  Address_City

SELECT 

Address_State,

ARRAY_AGG (STRUCT (OrderDate, 

OrderID,

Sales,

Quantity)) AS OrderInfoInfo

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data`

GROUP BY

  Address_State

ORDER BY

  Address_State
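The orders_by_state table queried in the next section appears to be the materialized result of the query above; a hedged sketch of how it could be saved, mirroring the earlier --destination_table example:

bq --location=US query --nouse_legacy_sql --destination_table=spikey-bq-279714:spikeysales_dataset.orders_by_state 'SELECT Address_State, ARRAY_AGG(STRUCT(OrderDate, OrderID, Sales, Quantity)) AS OrderInfoInfo FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data` GROUP BY Address_State'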

Using UNNEST to Query Repeated Fields

SELECT CustomerID, Products

FROM `spikey-bq-279714.spikeysales_dataset.customer_product_price`,UNNEST(Products_Bought) as Products

LIMIT 1000

SELECT CustomerID, Products, Price

FROM `spikey-bq-279714.spikeysales_dataset.customer_product_price`,UNNEST(Products_Bought) as Products, UNNEST(Price_paid) as

Price

LIMIT 1000
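Note that the double UNNEST above cross-joins the two arrays. If the intent is to pair each product with the price at the same array position, a sketch like this (using WITH OFFSET, and assuming the two arrays are the same length) keeps them aligned:

bq query --nouse_legacy_sql 'SELECT CustomerID, Product, Price_Paid[OFFSET(pos)] AS Price FROM `spikey-bq-279714.spikeysales_dataset.customer_product_price`, UNNEST(Products_Bought) AS Product WITH OFFSET pos LIMIT 1000'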

SELECT 

  Address_State,

  Orders

FROM `spikey-bq-279714.spikeysales_dataset.orders_by_state` ,

UNNEST( OrderInfoInfo ) as Orders

LIMIT 1000 

Aggregations

SELECT 

ANY_VALUE(OrderInfo) AS Any_Order

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested`

WHERE 

Address.State = "California"

LIMIT 1000

SELECT 

CustomerID, CustomerName, AVG( OrderInfo.Sales) AS Average_Price

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested`

GROUP BY

CustomerID, CustomerName

ORDER BY 

Average_Price DESC

SELECT 

CustomerID, CustomerName, COUNTIF( OrderInfo.Sales>100) AS Num_Over_100

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested`

GROUP BY

CustomerID, CustomerName

ORDER BY 

Num_Over_100 DESC

SELECT 

STRING_AGG(CustomerName)

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested`

SELECT 

FORMAT("%T", ARRAY_AGG(Address.City)) AS Cities

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested`

SELECT 

FORMAT("%T", ARRAY_AGG(DISTINCT Address.City)) AS Cities

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested`

SELECT 

FORMAT("%T", ARRAY_AGG(Address.City LIMIT 5)) AS Cities

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested`

Subqueries

SELECT 

r.OrderInfo.OrderID, 

r.Product_ID, 

r.CustomerName

FROM (

  SELECT * FROM

 `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested`) As r

 ORDER BY

  OrderInfo.OrderDate ASC

SELECT 

r.segment,

SUM(r.sales)

FROM (

  SELECT * FROM

 `spikey-bq-279714.spikeysales_dataset.e_commerce_data` 

  WHERE Address_State = "California") AS r

 GROUP BY

  Segment

SELECT 

Address_State,

SUM(sales) AS Total_Sales

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data` 

  WHERE Address_State IN (

  SELECT 

   Address_State

   FROM 

   `spikey-bq-279714.spikeysales_dataset.e_commerce_data` 

   WHERE

   ShipMode = 'Same Day')

  GROUP BY Address_State 

Performing Window Operations Using "Partition By" and "Order By"

SELECT 

Segment, SUM(OrderInfo.Sales) OVER (PARTITION BY Segment)

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested` 

SELECT DISTINCT

Segment, SUM(OrderInfo.Sales) OVER (PARTITION BY Segment)

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested` 

SELECT CustomerID, CustomerName, OrderInfo.Sales,

Address.State,

RANK() OVER (PARTITION BY Address.State ORDER BY OrderInfo.Sales DESC)

FROM `spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested` 

SELECT 

state,

city,

Total_sales,

PERCENT_RANK() OVER (PARTITION BY State ORDER BY Total_Sales ASC) Percentile_Rank

FROM (

SELECT

Address.State AS State, 

Address.City AS City, 

SUM(OrderInfo.Sales) AS Total_Sales

FROM

`spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested` 

GROUP BY

Address.State,

Address.City )

Windowing Operations Using a Window Range

SELECT 

state,

city,

Total_sales,

SUM(Total_Sales) OVER (PARTITION BY State ORDER BY Total_Sales ASC RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 

FROM (

SELECT

Address.State AS State, 

Address.City AS City, 

SUM(OrderInfo.Sales) AS Total_Sales

FROM

`spikey-bq-279714.spikeysales_dataset.e_commerce_data_nested` 

GROUP BY

Address.State,

Address.City )

datalab create bq-datalab-vm
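Presumably the follow-up step (a hedged sketch; only the instance name above is from the log) is to reconnect to the Datalab VM and open the notebook UI through its SSH tunnel:

datalab connect bq-datalab-vm

# the notebook UI should then be reachable at http://localhost:8081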

Week of June 5th

"Another dimension, new galaxy… Intergalactic, planetary"

Happy National Donut Emoji Day!

"There is an inexorable force in the cosmos, a place where time and space converge. A place beyond man's vision… but not beyond his reach. Man has reached the most mysterious and awesome corner of the universe… a point where the here and now become forever… A journey that takes you where no man has been before. Experience the power ⚡️! A journey that begins where everything ends! You can't escape the most powerful force in the 'DevOps' universe."

Mission #7419I 

So once again, we boarded the USS Palomino 🚀 and continued our exploration to the far depths of the DevOps Universe. Just to pick up where we last left off, 👨‍✈️ Captain Bret Fisher had taken us through the Microservices galaxy 🌌 and straight to Docker 🐳 and Containers. But… "with so many light years to go… and things to be found" we continued through the courseware Docker Mastery: with Kubernetes +Swarm from a Docker Captain and reconnoitered Docker 🐳 Compose, Docker 🐳 Swarm, Docker 🐳 Registries, and the infamous Kubernetes ☸️.

Again, we leveraged the portability of HashiCorp's Vagrant for Docker 🐳 with Docker 🐳 Compose, our 3-node Docker 🐳 Swarm, and the K8s ☸️ environments. We were grateful that we had our previous experiences with Vagrant from earlier learnings, as it made standing up these environments quite seamless.

We started off with Docker 🐳 Compose, which can be quite a useful tool in development for defining and running multi-container Docker 🐳 applications. Next, we headed right over to Docker 🐳 Swarm to get our initiation into Container Orchestration. You might ask why not just go straight to Kubernetes ☸️, as it is the clear winner 🏆 of the famous Container Orchestration wars? Well, Orchestration is great for solving complex problems, but Orchestrators themselves can be complex solutions to try to learn. From what we witnessed this week, we were glad we started there. We also learned that the "combination of Docker 🐳 Swarm, Stacks, and Secrets are kind of like a trilogy of awesome features" that can really make things easier if we went this route in production.

“Resistance is Futile“

If you're not familiar with the story of Kubernetes ☸️, affectionately known as K8s ☸️… it came out of Google from the original developers who worked on the infamous Google "Borg" project. In fact, here is a little bit of trivia: the code name for the K8s ☸️ project was Project Seven of Nine, a reference to the Star Trek 🖖 character of the same name who was a "friendlier" Borg. K8s ☸️ was certainly uncharted territory for us and a bit out of my purview, but it was a good learning experience to get a high-level overview of another important component of the infrastructure ecosystem.

Captain's log, Stardate 73894.9. These are the missions covered in earnest this week:

  • Created a 3-node Swarm cluster in the cloud
  • Installed Kubernetes and learned about the leading server cluster tools
  • Used Virtual IPs for built-in load balancing in the cluster
  • Optimized Dockerfiles for faster building and tiny deploys
  • Built/Published custom application images
  • Learned the differences between Kubernetes and Swarm
  • Created an image registry
  • Used Swarm Secrets to encrypt your environment configs
  • Created the config utopia of a single set of YAML files for local dev, CI testing, and prod cluster deploys
  • Deployed apps to Kubernetes
  • Made Dockerfiles and Compose files
  • Built multi-node Swarm clusters and deployed H/A containers
  • Made Kubernetes YAML manifests and deploy using infrastructure-as-code methods
  • Built a workflow of using Docker in dev, then test/CI, then production with YAML

For more details see the complete Log

This turned out to be quite the intensive undertaking this week, but we accomplished our mission and here is the certificate to prove it.

Below are some topics I am considering for my exploration next week:

  • Google Big Query
  • More with Data Pipelines
  • Google Cloud Data Fusion (ETL/ELT)
  • NoSQL – MongoDB, Cosmos DB
  • Working with JSON Files
  • Working with Parquet files 
  • JDBC Drivers
  • More on Machine Learning
  • ONTAP Cluster Fundamentals
  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Processing and transforming data / exploring data through ML (i.e. Databricks)

Stay safe and Be well –

–MCS 

Week of June 5 Detailed Log

Install Docker Compose

sudo curl -L "https://github.com/docker/compose/releases/download/1.25.5/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose

sudo chmod +x /usr/local/bin/docker-compose

sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose


#Clean up

docker ps -aq

docker stop $(docker ps -aq)

docker rm $(docker ps -aq)

docker rmi $(docker images -q)

docker system prune

docker container prune

docker image prune

docker volume prune

docker network prune

cd ..

cd compose-sample-2

docker-compose up

Control-C

docker-compose up -d

docker-compose logs

docker-compose ps

docker-compose top

docker-compose down

nano docker-compose.yml

# create your drupal and postgres config here, based off the last assignment

version: '2'

services:
  drupal:
    image: drupal
    ports:
      - "8080:80"
    volumes:
      - drupal-modules:/var/www/html/modules
      - drupal-profiles:/var/www/html/profiles
      - drupal-sites:/var/www/html/sites
      - drupal-themes:/var/www/html/themes
  postgres:
    image: postgres
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=mypasswd

volumes:
  drupal-modules:
  drupal-profiles:
  drupal-sites:
  drupal-themes:

#cleanup

docker-compose down -v

#cleanup local images

docker-compose down --rmi local

#DockerFile

#create your custom drupal image here, based of official drupal

FROM drupal:8.2

RUN apt-get update && apt-get install -y git \

    && rm -rf /var/lib/apt/lists/*

WORKDIR /var/www/html/themes

RUN git clone --branch 8.x-3.x --single-branch --depth 1 https://git.drupal.org/project/bo$ \

    && chown -R www-data:www-data bootstrap

WORKDIR /var/www/html

#Docker Swarm

git clone https://github.com/BretFisher/udemy-docker-mastery

docker info

docker swarm init --advertise-addr 127.0.0.1

docker node ls

docker service create alpine ping 8.8.8.8

docker service ls

docker service ps reverent_goldberg

docker service update j9lrdop3by0d --replicas 3

docker service ps reverent_goldberg

docker container ls

docker container rm -f reverent_goldberg.3.p1ks7ok0w1sm1y7ikvh3gndow

docker service ls

docker service ps reverent_goldberg

docker service rm reverent_goldberg

docker service ls

docker container ls

docker swarm init

#Build Docker Swarm

#Manager Server

docker swarm init --advertise-addr 192.168.10.2

docker swarm join --token SWMTKN-1-6aynyb5mmgyb1qy6ofikeny5axuhzk303bau9vdncnpigkclcf-500wa21q29k5rl9pw3dbllimp 192.168.10.2:2377

#Node1

docker swarm join --token SWMTKN-1-6aynyb5mmgyb1qy6ofikeny5axuhzk303bau9vdncnpigkclcf-500wa21q29k5rl9pw3dbllimp 192.168.10.2:2377

#Node2

docker swarm join --token SWMTKN-1-6aynyb5mmgyb1qy6ofikeny5axuhzk303bau9vdncnpigkclcf-500wa21q29k5rl9pw3dbllimp 192.168.10.2:2377

#Manager Server # promote Node 1

docker node ls

docker node update --role manager worker1

docker node ls

docker swarm join-token manager

#Node3 # Join as Manager by default


docker swarm join --token SWMTKN-1-6aynyb5mmgyb1qy6ofikeny5axuhzk303bau9vdncnpigkclcf-d3j1ol33vs16ythghosga20n9 192.168.10.2:2377

#Manager Server 

docker node ls

docker service create --replicas 3 alpine ping 8.8.8.8

docker service ls

docker node ps

docker node ps worker1

#Scaling Out with Overlay Networking

docker network create --driver overlay mydrupal

docker network ls

#Create Postgres Service on Swarm

#docker service create --name psql --network mydrupal -e POSTGRES_PASSWORD=mypass postgres

docker service create --name psql --network mydrupal --publish 5432:5432 -e POSTGRES_PASSWORD=mypass postgres

#Verify Image running

docker service ls

#Verify Node which is running Service

docker service ps psql

#Run to see if service is running on server

docker container logs psql.1.oey4ud9gbd87vvp7j8i11zv8b

#Create Drupal Service on Swarm

docker service create --name drupal --network mydrupal -p 80:80 drupal

docker service ls

#repeats command every 2 seconds

watch docker service ls

docker service ps drupal

# Scaling Out with Routing Mesh

docker service create --name search --replicas 3 -p 9200:9200 elasticsearch:2

docker service ps search

curl localhost:9200

#Create A Multi-Service Multi-Node Web App

docker node ls

docker ps -a

docker service ls

#Create Network backend

docker network create -d overlay backend

#Create Network frontend

docker network create -d overlay frontend

#Vote 

docker service create --name vote -p 80:80 --network frontend --replicas 2 dockersamples/examplevotingapp_vote:before

#redis

docker service create --name redis --network frontend --replicas 1 redis:3.2

#worker

docker service create --name worker --network frontend --network backend dockersamples/examplevotingapp_worker

#db

docker service create --name db --network backend --mount type=volume,source=db-data,target=/var/lib/postgresql/data postgres:9.4

#result

docker service create --name result --network backend -p 5001:80 dockersamples/examplevotingapp_result:before

docker service logs

#Swarm Stacks and Production Grade Compose

docker stack deploy -c example-voting-app-stack.yml voteapp

docker stack ls

docker stack ps voteapp

docker container ls

docker stack services voteapp

docker stack ps voteapp

docker network ls

nano example-voting-app-stack.yml 

replicas: 5

docker stack deploy -c example-voting-app-stack.yml voteapp

#Secrets Storage for Swarm: Protecting Your Environment Variables 
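# A hedged sketch (not from the original log; names/values are placeholders): the secrets
# referenced by the service below would be created first, e.g.:

echo "mydbuser" | docker secret create psql_user -

echo "myDBpassWORD" | docker secret create psql_pass -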

docker service create --name psql --secret psql_user --secret psql_pass -e POSTGRES_USER_FILE=/run/secrets/psql_user -e POSTGRES_PASSWORD_FILE=/run/secrets/psql_pass postgres

docker secret ls

docker secret inspect psql_user

docker service ps psql

docker exec -it voteapp_db.1.e9oz0nv6nl41pk42lcridqv7p bash

ls /run/secrets/

cat /run/secrets/psql_user

docker container logs voteapp_db.1.e9oz0nv6nl41pk42lcridqv7p 

#Using Secrets with Swarm Stacks

docker service create --name search --replicas 2 -p 9200:9200 elasticsearch:2

docker stack deploy -c docker-compose.yml mydb

docker stack rm mydb

# create your drupal and postgres config:

nano docker-compose.yml

version: '3.1'

# NOTE: move this answer file up a directory so it'll work

services:
  drupal:
    image: drupal:8.2
    ports:
      - "8080:80"
    volumes:
      - drupal-modules:/var/www/html/modules
      - drupal-profiles:/var/www/html/profiles
      - drupal-sites:/var/www/html/sites
      - drupal-themes:/var/www/html/themes
  postgres:
    image: postgres:12.1
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD_FILE=/run/secrets/psql-pw
    secrets:
      - psql-pw
    volumes:
      - drupal-data:/var/lib/postgresql/data

volumes:
  drupal-data:
  drupal-modules:
  drupal-profiles:
  drupal-sites:
  drupal-themes:

secrets:
  psql-pw:
    external: true

echo "oagy938hs" | docker secret create psql-pw -

docker stack deploy -c docker-compose.yml drupal

#Swarm App Lifecycle

cd secrets-sample-2

docker node ls

docker-compose up -d

docker-compose exec psql cat /run/secrets/psql_user

#Full App Lifecycle: Dev, Build and Deploy With a Single Compose

docker-compose up -d

docker-compose -f docker-compose.yml -f docker-compose.test.yml up -d

docker-compose -f docker-compose.yml -f docker-compose.prod.yml config

docker-compose -f docker-compose.yml -f docker-compose.prod.yml config > output.yml

#Service Updates: Changing Things In Flight

docker service create -p 8088:80 --name web nginx:1.13.7

docker service scale web=5

docker service update --image nginx:1.13.6 web

docker service update --publish-rm 8088 --publish-add 9090:80 web

#Health checks in Dockerfiles

docker container run --name p1 -d postgres

docker container ls

docker container run --name p2 -d --health-cmd="pg_isready -U postgres || exit 1" postgres

docker container ls

docker container inspect p2

docker service create --name psql --network mydrupal --publish 5432:5432 -e POSTGRES_PASSWORD=mypass postgres

docker service create --name psql --network mydrupal --publish 5432:5432 -e POSTGRES_PASSWORD=mypass --health-cmd="pg_isready -U postgres || exit 1" postgres

#Run a Private Docker Registry

docker container run -d -p 5000:5000 --name registry registry

docker container ls

docker image ls

docker pull hello-world

docker run hello-world

docker tag hello-world 127.0.0.1:5000/hello-world

docker image ls

docker push 127.0.0.1:5000/hello-world

docker image remove hello-world

docker image remove 127.0.0.1:5000/hello-world

docker pull 127.0.0.1:5000/hello-world

docker container rm registry -f

docker container run -d -p 5000:5000 --name registry -v $(pwd)/registry-data:/var/lib/registry registry

docker image ls

docker push 127.0.0.1:5000/hello-world

ll registry-data

#Docker Registry With Swarm

docker node ls

docker service create --name registry --publish 5000:5000 registry

docker service ps registry

docker pull hello-world

docker tag hello-world 127.0.0.1:5000/hello-world

docker push 127.0.0.1:5000/hello-world

docker pull nginx

docker tag nginx 127.0.0.1:5000/nginx

docker push 127.0.0.1:5000/nginx

docker service create --name nginx -p 80:80 127.0.0.1:5000/nginx

#Install Kubernetes

#Install & Configure MicroK8s Ubuntu

sudo snap install microk8s --classic --channel=1.17/stable

sudo usermod -a -G microk8s vagrant

nano .bashrc

alias kubectl='microk8s.kubectl'

microk8s.enable dns

microk8s.status

#Our First Pod With kubectl run

kubectl version

kubectl run my-nginx --image nginx

# Pre Version 1.18

kubectl get pods

#Clean up

kubectl delete deployment my-nginx

kubectl get all

# Scaling ReplicaSets

kubectl run my-apache --image httpd

kubectl get all

kubectl scale deploy/my-apache --replicas 2 #kubectl scale deployment my-apache --replicas 2

kubectl get all

kubectl logs deployment/my-apache

kubectl logs deployment/my-apache --follow --tail 1

kubectl logs -l run=my-apache

kubectl get pods

kubectl describe pod/my-apache-5d589d69c7-jmbmh

#Watch Commmand

#Terminal Window 1

kubectl get pods -w

#Terminal Window 2

kubectl delete pod/my-apache-5d589d69c7-jmbmh

kubectl get pods

#Clean up

kubectl delete deployment my-apache

# K8s Services

#Creating a ClusterIP Service

#Terminal Window 1

kubectl get pods -w

#Terminal Window 2

kubectl create deployment httpenv --image=bretfisher/httpenv

kubectl scale deployment/httpenv --replicas=5

kubectl expose deployment/httpenv --port 8888

kubectl get service

kubectl run --generator=run-pod/v1 tmp-shell --rm -it --image bretfisher/netshoot -- bash

curl httpenv:8888

#Alternative method from Ubuntu

kubectl get service

curl 10.152.183.202:8888

#Creating a NodePort and LoadBalancer Service

kubectl get all

kubectl expose deployment/httpenv --port 8888 --name httpenv-np --type NodePort

kubectl get services

NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE

httpenv      ClusterIP   10.152.183.202   <none>        8888/TCP         13m

httpenv-np   NodePort    10.152.183.136   <none>        8888:31585/TCP   70s

kubernetes   ClusterIP   10.152.183.1     <none>        443/TCP          3h26m

kubectl expose deployment/httpenv --port 8888 --name httpenv-lb --type LoadBalancer

kubectl get services

curl 10.152.183.147:8888

#Clean up

kubectl delete service/httpenv service/httpenv-np

kubectl delete service/httpenv-lb deployment/httpenv

#Kubernetes Services DNS

kubectl get namespaces
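# A hedged sketch (service name and port from the earlier httpenv example, assuming the default
# cluster domain): cluster DNS lets a pod reach a Service by its fully qualified name,
# e.g. from the netshoot shell used above:

curl httpenv.default.svc.cluster.local:8888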

#Run, Expose, and Create Generators

kubectl create deployment test --image nginx --dry-run

kubectl create deployment test --image nginx --dry-run -o yaml

kubectl create job test --image nginx --dry-run -o yaml

kubectl create deployment test --image nginx

kubectl expose deployment/test --port 80 --dry-run -o yaml

#Clean up

kubectl delete deployment test 

#Moving to Declarative Kubernetes YAML

#Building Your YAML Files

kubectl api-resources

kubectl api-versions

#Building Your YAML Spec

kubectl explain services –recursive

kubectl explain services.spec

kubectl explain services.spec.type

kubectl explain deployment.spec.template.spec.volumes.nfs.server

# Dry Runs and Diff’s

kubectl apply -f app.yml --dry-run

kubectl apply -f app.yml

kubectl apply -f app.yml --server-dry-run

kubectl diff -f app.yml

nano app.yml

metadata:

  name: app-nginx-deployment

  servers: dmz

spec:

  replicas: 2

kubectl diff -f app.yml

# Clean up

kubectl get all

kubectl delete deployment.apps/app-nginx-deployment

kubectl delete service/app-nginx-service

#Namespaces and Context

kubectl get namespaces

kubectl get all –all-namespaces

kubectl config get-contexts

What’s in a name?

“What’s in a name? that which we call a rose…By any other name would smell as sweet.”

If you've been following us for the last few weeks on Facebook, you might have noticed that our bushy-tailed friend has been scampering about… in search of technical kernels all around NYC.

However, our little buddy has been like "a horse through a desert with no name". So we would like to give our furry pal a name…

Please leave a comment below with your name ideas. Be creative!

Week of May 29th

“This is the return of the space cowboy…Interplanetary, good vibe zone”

Happy Friday!


"A short time ago in a galaxy not that far, far away…"

Actually it was only just a few weeks ago that we traveled 🚀 through time 🕰 and space 💫 and into the DevOps Universe. For those who recall, we merely scratched the surface as we dipped our toe 🦶 in the Lunar 🌚 mare 🌊 with HashiCorp's Vagrant. At the time, we played around with a combination of Vagrant and Oracle's VirtualBox. This time we wanted to go full tilt into the Microservices galaxy 🌌 with the infamous "batteries 🔋 included but removable approach" a.k.a. Docker.

We had explored the rings 🪐 of Docker 🐳 once before, or I should formally say images and containers, but it had been about 4 years since we last journeyed there. At the time, we were led by the charismatic and often enthusiastic Docker 🐳 Captain 👨‍✈️ Nigel Poulton.

Since so much time had passed, we were overdue for a refresher, and who better to lead us through this exploration than the equally ebullient 👨‍✈️ Captain Bret Fisher, as he would take us through his brilliantly put together courseware Docker Mastery: with Kubernetes +Swarm from a Docker Captain.

Bret's courseware is chock-full of DevOps awesomeness 😎, including full lessons on Docker 🐳, Docker 🐳 Compose, Docker 🐳 Swarm, Docker 🐳 Registries, and Kubernetes. There were even quizzes after most lessons to see how much you were able to absorb. Unfortunately, time would only allow us to explore just the Docker 🐳 portion, leaving Docker 🐳 Compose, Docker 🐳 Swarm, Docker 🐳 Registries, and Kubernetes for future missions. Our initial goal was to utilize Vagrant with Docker 🐳 (similarly to how we used Vagrant with VBox a few weeks back). However, to get into the spirit of how ubiquitous Docker 🐳 is, we decided we would try to install Docker 🐳 on multiple platforms. All was well, until we decided to install Docker CE on Windows 10.

Since we are primarily running on the Mac book 💻, we would need to create a Windows 10 Pro image on VirtualBox… Sure, no problem… After several minutes we had our Windows desktop up and running. For the finishing touches we just needed to install Docker 🐳 before we would head over to the next lesson. After Docker 🐳 completed its installation on Windows, we tried to fire 🔥 up Docker 🐳… Bam 💥, Slam, Crash ☄️! Uh oh, what now?? Well, actually this was a familiar scenario we had seen a few weeks back when we tried to install Vagrant & VirtualBox under an Ubuntu VM that was already running on VirtualBox for the Mac book 💻.

At the time we just chalked this up to an overly convoluted solution because it was virtualization inside of virtualization. But apparently, there is an actual technical term for this, which is called "nested virtualization". This configuration is not generally recommended, but it is supported in certain scenarios. The way to make this solution fly under VBox is to enable "Enable Nested VT-x/AMD-V" under Settings -> Processor in VirtualBox.
Of course on the Mac book 💻 this is inconveniently greyed out. Not to fret: just like the hidden trick to increase the VDI disk size, there is a hidden trick to enable Nested Virtualization on VBox on the Mac book 💻.

Ok, so now we enabled the Nested Virtualization setting and made some additional recommended local setting changes on the Windows VM. After rebooting the VM, we would try to install Docker 🐳, but again no luck. It turns out that Docker 🐳 will only work with nested virtualization under VMware Fusion, and we weren't going to go there. So after spending enough time on this we decided to cut our losses and just work with Vagrant with Docker 🐳 and Ubuntu, which in reality is the best way to run Docker (under a native Linux OS).

Now with our Docker 🐳 environment ready to go and fired 🔥 up, it was time to have 👨‍✈️ Captain Fisher take us to warp drive 💨. As we charted our course, we hit a minor bug when running the popular nginx container image in our environment. We would execute the below syntax, which worked fine without any issues:

docker container run --publish 8080:80 nginx:latest

However, after we killed the container with control-c and ran docker info we noticed the following error message

WARNING: No swap limit support

Although the nginx container still ran fine, we are not big fans of those pesky little error messages. So we resolved the issue by running the below syntax on the Docker 🐳 host VM:

vagrant@node-1:~$ sudo nano /etc/default/grub

GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"

$ sudo update-grub

$ reboot

Then, from the Mac book 💻:

vagrant halt

vagrant up

For good measure we filed a bug with the developer on GitHub. Fortunately we know the developer, so this minor bug was fixed almost immediately and the fix can be used with future builds. Next, we moved on as our journey touched on all the essential topics for Docker 🐳:

  • Manage Multiple Containers
  • Management of Docker Networking
    • DNS and How Containers Find Each Other
    • Using Containers for CLI Testing
    • DNS Round Robin Test
  • Images and Their Layers
  • Image Tagging and Pushing to Docker Hub
  • Dockerfiles and Run Containers From it
  • Persistent Data: Data Volumes, Bind Mounting
  • Database Upgrades with Named Volumes
  • Running In Containers with Bind Mounts

For more details on syntax see log

“DevOps is the union of people, process, and products to enable continuous delivery of value to our end users.”

Below are some topics I am considering for my Journey next week:

  • More with Captain Bret on Compose, Swarm, Registries, and Kubernetes
  • More with Data Pipelines
  • Google Cloud Data Fusion (ETL/ELT)
  • More on Machine Learning
  • ONTAP Cluster Fundamentals
  • Google Big Query
  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Process and Transforming data/Explore data through ML (i.e. Databricks) .
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well –

–MCS 

Week of May 29 Detailed Log

docker image rm nginx –force

docker container run --publish 8000:80 --detach nginx:latest

docker container ls -a

docker ps -aq

docker stop $(docker ps -aq)

docker rm $(docker ps -aq) --force

docker rmi $(docker images -q) --force

docker system prune

docker container logs webhost

docker container top webhost

docker container rm -f 41b 

docker run –name mongo -d mongo

docker top mongo

ps aux

docker stop mongo

docker ps

docker top mongo

docker start mongo

docker container run -d -p 3306:3306 --name db -e MYSQL_RANDOM_ROOT_PASSWORD=yes mysql

docker container logs db

GENERATED ROOT PASSWORD: yi0xe8fohPhaboiF9cicaidoow4yaCie

docker container run -d --name webserver -p 8080:80 httpd

docker ps

docker container run -d --name proxy -p 8000:80 nginx

docker container ls -a

docker stop dacbe46dd93d 623768eb2e8b 476615d3fe3e 72259e479419

docker ps -a

docker container rm dacbe46dd93d 623768eb2e8b 476615d3fe3e 72259e47941

docker image ls

docker container run -d --name nginx -p 8000:80 nginx

docker container run -d --name mysql -e MYSQL_RANDOM_ROOT_PASSWORD=true mysql

docker container ls

# List processes running in container

docker container top mysql

docker container top nginx

#details of one container config

docker container inspect mysql

# performance stats for all containers

docker container stats

docker container run -it --name proxy nginx bash

ls -al

Exit

docker container ls -a

# Start new container Interactively 

docker container run -it --name ubuntu ubuntu

apt-get update

apt-get install -y curl

curl google.com

exit

docker container start -ai ubuntu

# Run additional command in existing container

docker container exec -it mysql bash

exit

#Image is still running

docker pull alpine

docker image ls

docker container run -p 80:80 --name webhost -d nginx

docker container port webhost

80/tcp -> 0.0.0.0:80

docker container inspect --format '{{ .NetworkSettings.IPAddress }}' webhost

172.17.0.4

# Show Network 

docker network ls

NETWORK ID          NAME                DRIVER              SCOPE

b5fc5fd2d6d5        bridge              bridge              local

613749eac668        host                host                local

2f8551c3a295        none                null                local

docker network inspect bridge

docker network create my_app_net

docker network ls

docker container run -d --name new_nginx --network my_app_net nginx

docker network inspect my_app_net

# Add another container on custom Network

docker network connect deb100b57c4d 900ecc0f5b4c

docker network inspect my_app_net

docker network disconnect deb100b57c4d 900ecc0f5b4c

docker container run -d --name my_nginx --network my_app_net nginx

docker inspect deb100b57c4d 


#Testing DNS Resolution

#Preq

#docker container exec -it my_nginx bash

#apt update

#apt install iputils-ping

#exit

docker container exec -it my_nginx ping new_nginx

# Reverse Scenario

#Preq

#docker container exec -it my_nginx bash

#apt update

#apt install iputils-ping

#exit

docker container exec -it my_nginx ping new_nginx

# CLI Testing

Terminal Window #1

$ docker container run --rm -it centos:7 bash

yum update curl

curl --version

Terminal Window #2

docker ps -a

docker container run --rm -it ubuntu:14.04 bash

apt-get update && apt-get install -y curl

curl --version

# DNS Round Robin Testing

# Create a Network

$ docker network create dude

docker container run -d --net dude --net-alias search elasticsearch:2

docker container ls

docker container run --rm --net dude alpine nslookup search

# Elastic Search Container 1

#PRE

docker container exec -it d7f343e6e0e2 bash

apt update

apt install dnsutils

Exit

# Elastic Search Container 2

docker container exec -it 37d385dd9bb5 bash

apt update

apt install dnsutils

exit

docker container run --rm --net dude centos curl -s search:9200

docker container run --rm --net dude centos curl -s search:9200

{
  "name" : "Sangre",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "1kbpTYo-TZSLS35CPkR25Q",
  "version" : {
    "number" : "2.4.6",
    "build_hash" : "5376dca9f70f3abef96a77f4bb22720ace8240fd",
    "build_timestamp" : "2017-07-18T12:17:44Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.4"
  },
  "tagline" : "You Know, for Search"
}

docker container run --rm --net dude centos curl -s search:9200

{
  "name" : "Firearm",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "ES9_lTNoQ2Kp9ykTkr2_-g",
  "version" : {
    "number" : "2.4.6",
    "build_hash" : "5376dca9f70f3abef96a77f4bb22720ace8240fd",
    "build_timestamp" : "2017-07-18T12:17:44Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.4"
  },
  "tagline" : "You Know, for Search"
}

docker container ls

#Cleanup

docker container rm -f boring_allen youthful_jackson

boring_allen

youthful_jackson

docker image ls

# Images and Their Layers:Discover the Image Cache

docker pull nginx:1.18.0-alpine

docker image ls

docker history nginx:latest

#Image Tagging and Pushing to Docker Hub

docker pull mysql/mysql-server

docker pull nginx:mainline

docker image tag nginx ms4446/nginx

docker image push ms4446/nginx

docker login

cat .docker/config.json

# Push repository

docker image push ms4446/nginx:testing

docker image tag ms4446/nginx ms4446/nginx:testing

docker image push ms4446/nginx:testing

# Building Images: Running Docker Builds

docker image build -t customnginx .

nano Dockerfile

FROM node:6-alpine

EXPOSE 3000

RUN apk add --update tini

RUN mkdir -p /usr/src/app

WORKDIR /usr/src/app

COPY package.json package.json

RUN npm install && npm cache clean

COPY . .

CMD [ "tini", "--", "node", "./bin/www"]

docker build -t testnode .

# Run New Docker Build

docker container run --rm -p 88:3000 testnode

#Create Tag and Push to Dockerhub

docker images

docker tag testnode ms4446/testing-node

docker push ms4446/testing-node

#Delete local Custom Image

docker image rm ms4446/testing-node

#Download Custom Image and run

docker container run --rm -p 88:3000 ms4446/testing-node

# Persistent Data: Data Volumes

#Clean up

docker system prune -a

docker ps -aq

docker stop $(docker ps -aq)

docker images -a

docker rmi 30f937e841c8 9beeba249f3e -f

docker volume prune

docker rm 900ecc0f5b4c8e86f67f13fe6062e1730da168c6feb3028f8c589a7b93d63c95 -f

# Pull mysql from repo

docker pull mysql

docker image inspect mysql

docker container run -d --name mysql -e MYSQL_ALLOW_EMPTY_PASSWORD=True mysql

docker container ls

docker container inspect mysql

docker volume ls

docker volume inspect fb395bf87187c8e3eefe7e7e16faf5ec5b7c10b479d679232d2fc02619ddcea2

docker container run -d --name mysql2 -e MYSQL_ALLOW_EMPTY_PASSWORD=True mysql

docker volume ls

docker container stop mysql

docker container stop mysql2

docker container ls

docker container ls -a

docker volume ls

docker container rm mysql mysql2

docker volume ls

#named volumes

docker container run -d --name mysql -e MYSQL_ALLOW_EMPTY_PASSWORD=True -v mysql-db:/var/lib/mysql mysql

docker volume ls

docker volume inspect mysql-db

docker container rm -f mysql

docker container run -d --name mysql3 -e MYSQL_ALLOW_EMPTY_PASSWORD=True -v mysql-db:/var/lib/mysql mysql

docker volume ls

#Bind Mappings

docker container run -d --name nginx -p 81:80 -v $(pwd):/usr/share/nginx/html nginx

docker container run -d --name nginx2 -p 82:80 -v $(pwd):/usr/share/nginx/html nginx

ll

touch testme.txt

ls -al

echo "is it me you're looking for?" > testme.txt

#From Second Command line Window

docker container exec -it nginx bash

cd /usr/share/nginx/html

ls -al

#Database Upgrades with Named Volumes

docker container run -d --name psql -v psql:/var/lib/postgresql/data postgres:9.6.1

docker container logs -f psql

docker container ls

docker container stop 9a5a40be80ff

docker container run -d --name psql2 -v psql:/var/lib/postgresql/data postgres:9.6.2

docker container ps -a

docker volume ls

docker container logs -f fac039e1cb84

#Edit code Running In Containers With Bind Mounts

docker pull bretfisher/jekyll-serve

docker run -p 80:4000 -v $(pwd):/site bretfisher/jekyll-serve

Week of May 22nd

And you know that notion just cross my mind…

Happy Bitcoin Pizza Day!

All aboard! This week our travels would take us on the railways far and high, but before we could hop on the knowledge express we had some unfinished business to attend to.

“Oh, I get by with a little help from my friends”

If you have been following my weekly submissions for the last few weeks, I listed as a future action item "create/configure a solution that leverages Python to stream market data and insert it into a relational database."

Well, last week I found just the perfect solution. A true masterpiece by Data Scientist/Physicist extraordinaire AJ Pryor, Ph.D. AJ had created a brilliant multithreaded work of art that continuously queries market data from IEX and then writes it to a PostgreSQL database. In addition, he built a data visualization front-end that leverages Pandas and Bokeh so the application can run interactively through a standard web browser. It was like a dream come true! Except that the code was written about 3 years ago and referenced a deprecated API from IEX.

Ok, no problem. We will just simply modify AJ's "Mona Lisa" to reference the new IEX API and off we will go. Well, what seemed like a dream turned into a virtual nightmare. I spent most of last week spinning my wheels trying to get the code to work, but to no avail. I even reached out to the community on Stack Overflow, but all I received was crickets…

Just as I was ready to cut my losses, I reached out to a longtime good friend who happens to be an all-star programmer and a fellow NY Yankees baseball enthusiast. Python wasn't his specialty (he is really an amazing Java programmer) but he offered to take a look at the code when he had some time… So we set up a Zoom call this past Sunday and I let his wizardry take over… After about an hour or so he was in a state of flow and had a good pulse of what our maestro AJ's work was all about. After a few modifications my good chum had the code working and humming along. I ran into a few hiccups along the way with the Bokeh code, but my confidant just referred me to run some simpler syntax and then abracadabra… this masterpiece was now working on the Mac! As the new week started, I was still basking in the radiance of this great coding victory. So, I decided to be a bit ambitious and move this gem to the cloud, which would be like the crème de la crème of our learnings thus far. Cloud, Python/Pandas, streaming market data, and Postgres all wrapped up in one! Complete and utter awesomeness!

Now the question was for which cloud platform to go with? We were well versed in the compute area in all 3 of the major providers as a result of our learnings.

So with a flip of the coin, we decided to go with Microsoft Azure. That, and we had some free credits still available.

With sugar plum fairies dancing in our heads, we spun up our Ubuntu image and followed along the well-documented steps on AJ's GitHub project.

Now we were cooking with gasoline! We cloned AJ's GitHub repo, modified the code with our new changes, executed the syntax, and just as we were ready to declare victory… Stack overflow error! Oh, the pain.

Fortunately I didn't waste any time; I went right back to my ace in the hole, but with some trepidation that I wasn't being too much of an irritant.

I explained my perplexing predicament and without hesitation my Fidus Achates offered some great troubleshooting tips, and quite expeditiously we had the root cause pinpointed. For some peculiar reason, the formatting of the URL that worked like a charm on the Mac was pure dyspepsia on Ubuntu on Azure. It was certainly a mystery, but one that could only be solved by simply rewriting the code.

So once again, my comrade in arms helped me through another quagmire. So, without further ado, may I introduce to you the one and only…

http://stockstreamer.eastus.cloudapp.azure.com:5006/stockstreamer

We'll hit the stops along the way… We only stop for the best

Feeling victorious after my own personal Battle of Carthage, and with our little streaming market data saga out of our periphery, it was time to hit the rails…

Our first stop was messaging services, which are all the rage nowadays. There are so many choices with data messaging services out there… So where to start? We went with Google's Pub/Sub, which turned out to be a marvelous choice! To get enlightened with this solution, we went to Pluralsight, where we found an excellent course on Architecting Stream Processing Solutions Using Google Cloud Pub/Sub by Vitthal Srinivasan.

Vitthal was a great conductor who navigated us through an excellent overview of Google's impressive solution, use cases, and even touched on a rather complex pricing structure in our first lesson. He then takes us deep into the weeds, showing us how to create Topics, Publishers, and Subscribers. He goes on further by showing us how to leverage some other tremendous offerings in GCP like Cloud Functions, APIs & Services, and Storage.

Before this amazing course my only exposure was limited to GCP's Compute Engine, so this was an eye-opening experience to see the great power that GCP has to offer! To round out the course, he showed us how to use GCP Pub/Sub with some client libraries, which was an excellent tutorial on how to use Python with this awesome product. There were even two modules on how to integrate a Google Hangouts Chatbot with Pub/Sub, but that required you to be a G Suite user. (There was a free trial, but we skipped the setup and just watched the videos.) Details on the work I did on Pub/Sub can be found at

“I think of all the education that I missed… But then my homework was never quite like this”

For bonus this week, I spent an enormous amount of time brushing up on my 8th grade Math and Science curriculum:

  1. Linear Regression
  2. Epigenetics
  3. Protein Synthesis

Below are some topics I am considering for my Journey next week:

  • Vagrant with Docker
  • Continuing with Data Pipelines
  • Google Cloud Data Fusion (ETL/ELT)
  • More on Machine Learning
  • ONTAP Cluster Fundamentals
  • Google Big Query
  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Process and Transforming data/Explore data through ML (i.e. Databricks) .
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well –

–MCS 

Week of May 15th

“Slow down, you move too fast…You got to make the morning last.”

Happy International Day of Families, and for those celebrating in the US, Happy Chocolate Chip Day!

This week's journey was a bit of a laggard in comparison to previous weeks' journeys, but still productive nonetheless. This week we took a break from Machine Learning while sticking to our repertoire and with our reptilian programming friend. Our first stop was to head over to installing Python on Windows Server, which we haven't touched on so far. As we tend to make things more challenging than they need to be, we targeted an oldie but a goodie: Windows Server 2012 R2 running SQL Server 2016. Our goal was to configure a SQL Server scheduled job that runs a simple Python script, which seemed like a pretty simple task. We found a nice example of this exact scenario on MSSQL Tips – Run Python Scripts in SQL Server Agent.

First, we installed Python and followed the steps, and lo and behold it didn't work right away. To quote the great Gomer Pyle, "Surprise, surprise, surprise". No worries, we had this… After a little bit of troubleshooting and trying to interpret the vague error messages in the SQL Server Agent error log, we got it working… It turns out we had a multitude of issues, ranging from the FID that was running the SQL Agent service not having the proper entitlements to the directory where the py script lived, to the more important prerequisite of Python not being in the environment variables for the service account so it would know where to launch the executable. Once resolved, we were off to the races, or at least we got the job working.

At this point we were feeling pretty ambitious, so we decided rather than using the lame MS-DOS style batch file we would use a cool PowerShell script as a wrapper for our Python code for the job… Cool, but not so cool on Windows Server 2012 R2. First, we started out with the set-executionpolicy remotesigned command, which needs to be specified in order to execute PowerShell, but because we were using an old jalopy OS we had to upgrade the version of the .NET runtime as well as the version of PowerShell. Once upgraded, we executed a few additional commands and then we were good to go…

[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

Install-PackageProvider -Name NuGet -RequiredVersion 2.8.5.201 -Force

Install-Module -Name SqlServer -AllowClobber

After spending a few days here, we decided to loiter a little bit longer and crank out some SQL maintenance tasks in Python, like a simple backup job. This was pretty straightforward once we executed a few prerequisites.

python -m pip install --upgrade pip

pip install pip

pip install pyodbc

pip install pymssql-2.1.4-cp38-cp38-win_amd64.whl

pip install --upgrade pymssql

Our final destination for the week was to head back over to a previous jaunt and play with streaming market data and Python. This time we decided to stop being cheap and pay for an IEX account 

Fortunately, they offer a pay-by-the-month option with opt-out at any time, so it shouldn't get too expensive. To get re-acclimated we leveraged Jupyter notebooks and banged out a nifty Python/Pandas/matplotlib script that generates the top 5 US banks and their 5-year performance. See attachment.

“I have only come here seeking knowledge… Things they would not teach me of in college”

Below are some topics I am considering for my adventures next week:

  • Vagrant with Docker
  • Data Pipelines
    • Google Cloud Pub/Sub (Streaming Data Pipelines)
    • Google Cloud Data Fusion ( ETL/ELT)
  • Back to Machine Learning
  • ONTAP Cluster Fundamentals
  • Google Big Query
  • Python -> Stream Data from IEX -> Postgres
  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Process and Transforming data/Explore data through ML (i.e. Databricks) .
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay Safe and Be well –

–MCS 

Week of May 8th

“Now it’s time to leave the capsule if you dare..”

Happy Friday!

Before we could end our voyage and return our first mate Slonik to the zookeeper, we would first need to put a bow on our Postgres journey (for now) by covering a few loose ends on advanced features. Saturday, we kicked it off with a little review of isolation levels in Postgres (including a deep dive on Serializable Snapshot Isolation (SSI)), then on to third-party monitoring for database health and streaming replication, and for la cerise sur le gâteau (the cherry on the cake)… Declarative Partitioning and Sharding!

Third-Party Monitoring

We evaluated two solutions, OpsDash and pgDash. Both were easy to set up and both gave valuable information in regards to Postgres. OpsDash provides more counters and can monitor system information as well as other services running on Linux, whereas pgDash is Postgres-specific and will give you a deeper look into Postgres and streaming replication than just querying the native system views.

Declarative Partitioning

It was fairly straightforward to implement declarative partitioning. We reinforced these concepts by turning to Creston's plethora of videos on the topic as well as several blog posts. See below for the detailed log.
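For reference, a minimal sketch of the pattern (table and column names here are illustrative, not the exact ones we used):

psql -c "CREATE TABLE measurement (logdate date NOT NULL, reading int) PARTITION BY RANGE (logdate);"

psql -c "CREATE TABLE measurement_y2020 PARTITION OF measurement FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');"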

Sharding Your Data with PostgreSQL

There are third-party solutions like Citus Data that seem to offer a more scalable approach, but out of the box you can implement sharding by using declarative partitioning set up on a primary server together with a Foreign Data Wrapper configured against a remote server. You then combine partitioning and FDW to create sharding. This was quite an interesting solution, although I have strong doubts about how scalable it would be in production.
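Roughly, the combination looks like this on the primary (a hedged sketch; the foreign server, credentials, and table names are illustrative): declare a foreign server with postgres_fdw, then attach a foreign table as one of the partitions.

psql -c "CREATE EXTENSION IF NOT EXISTS postgres_fdw;"

psql -c "CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'remote-host', dbname 'appdb');"

psql -c "CREATE USER MAPPING FOR CURRENT_USER SERVER shard1 OPTIONS (user 'app', password 'secret');"

psql -c "CREATE FOREIGN TABLE measurement_y2021 PARTITION OF measurement FOR VALUES FROM ('2021-01-01') TO ('2022-01-01') SERVER shard1;"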

On Sunday, we took a much-needed respite as the weather was very agreeable in NYC to escape the quarantine… 

On Monday, with our rig now dry-docked, we would travel by different means to another dimension, a dimension not only of sight and sound but of mind. A journey into a wondrous land of imagination. Next stop, the DevOps Zone!

To begin our initiation into this realm we would start off with HashiCorp’s Vagrant.

For those who are not familiar with Vagrant, it is not the transient mendicant the name would otherwise imply but a nifty open-source solution for building and maintaining lightweight and portable DEV environments.

It's kind of similar to Docker, for those more familiar with it, but it generally works with virtual machines (although it can be used with containers).

At the most basic level, Vagrant works with slimmed-down VMs, whereas Docker is the most minimalistic approach to process and OS isolation, leveraging containers.

The reason to go this route as opposed to the more popular Docker was that it is generally easier to stand up a DEV environment.

With that being said, we wound up spending a considerable amount of time on Monday and Tuesday working on this, as I ran into some issues with SSH and the "vagrant up" process. The crux of the issue was running Vagrant/VirtualBox inside an Ubuntu VM that was itself running under VirtualBox on a Mac. This convoluted setup didn't seem to play nice. Go figure?

Once we installed Vagrant with VirtualBox natively on the Mac, we were up and running and easily able to spin up and deploy VMs seamlessly.

Next, we played a little bit with Git, getting some practice with the workflow of editing configuration files and pushing the changes straight to the repo.

On Wednesday, we decided to begin our exploration of strange new worlds, to seek out new life and new civilizations, and of course to boldly go where maybe some have dared to go before? That would of course be Machine Learning, where the data is the oil and the algorithm is the engine. We would start off slow by just trying to grasp the jargon: training data, training a model, testing a model, supervised learning, and unsupervised learning.

The best way for us to absorb this unfamiliar lingo would be to head over to Pluralsight, where David Chappell offers a great introductory course, Understanding Machine Learning.

“Now that she’s back in the atmosphere… With drops of Jupiter in her hair, hey, hey”

On Thursday we would go further down the rabbit hole of Machine Learning with Jerry Kurata’s Understanding Machine Learning with Python  

There we would be indoctrinated into the powerful tool that is the Jupyter Notebook. Now armed with this great "Bat gadget" we would reunite with some of our old heroes from the "Guardians of the Python" like "Projectile" Pandas, matplotlib "the masher", and of course numpy, "the redheaded stepchild of Thanos". In addition, we would be introduced to a new and special superhero: scikit-learn.

For those not familiar with this powerful library, scikit-learn has unique and empathic powers that build on our friends NumPy, pandas and SciPy. This handy Python library ultimately unlocks the key to the Machine Learning universe through Python.
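To make the course jargon concrete, here is a tiny, self-contained example of the train/test workflow using scikit-learn's bundled iris dataset; it is only a sketch of the pattern, not one of the course's exercises:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a toy dataset: features (X) and labels (y).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows to test the model on data it has never seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# "Training a model" = fitting the estimator to the training split.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# "Testing a model" = scoring it on the held-out split.
print("test accuracy:", model.score(X_test, y_test))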

Despite all this roistering with our exemplars of Python, our voyage wasn’t  all rainbows and Unicorns.

We got introduced to all sorts of new space creatures like Bayesian and Gaussian algos, each conjuring up its own bêtes noires. The mere thought of Bayes' theorem dredged up old memories buried deep in the bowels of college, when I was studying probability, and the mere mention of Gaussian functions jarred memories of The Black Swan (not the ballet movie with the fine actresses Natalie Portman and Mila Kunis, but the well-written and often irritating NYT bestseller by Nassim Nicholas Taleb).

Unfortunately, it didn't get any cozier when we set our course for the powerful and complex ensemble of the Random Forest algo. There we got bombarded by meteorites such as "over-fitting", "regularization hyper-parameters", and "cross-validation", not to mention the dreaded "bias-variance tradeoff". Ouch! My head hurts…
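For my own sanity, here is a small sketch of how those meteorites fit together, again on the iris toy data rather than anything from the course: the random forest's hyper-parameters (n_estimators, max_depth) are the knobs that rein in over-fitting, and 5-fold cross-validation is how you check how well that is working:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# n_estimators and max_depth act as the "regularization hyper-parameters":
# a shallower forest is less likely to over-fit the training data.
forest = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=0)

# 5-fold cross-validation: train on 4/5 of the data, validate on the rest,
# rotate five times, and look at the spread of the scores.
scores = cross_val_score(forest, X, y, cv=5)
print("fold scores:", scores.round(3))
print("mean accuracy:", scores.mean().round(3))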

Here is the detailed log of my week’s journey

“With so many light years to go…. And things to be found (to be found)”

Below are some topics I am considering for next week’s odyssey :

  • Run Python Scripts in SQL Server Agent
  • More with Machine Learning
  • ONTAP Cluster Fundamentals
  • Google Big Query
  • Python -> Stream Data from IEX -> MSSQL
  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Process and Transforming data/Explore data through ML (i.e. Databricks) 
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well

—MCS 

Week of May 8 Detailed Log

Videos Watched

Reference

Blog Post Read

Declarative Partitioning

CREATE TABLE test04 (data jsonb) partition by range ((data->’id’));

CREATE TABLE test04_01 PARTITION OF test04
 FOR VALUES FROM (‘1’) TO (‘2500000’);

CREATE TABLE test04_02 PARTITION OF test04
 FOR VALUES FROM (‘2500001’) TO (‘5000000’);

CREATE TABLE test04_03 PARTITION OF test04
 FOR VALUES FROM (‘5000001’) TO (‘7500000’);

CREATE TABLE test04_04 PARTITION OF test04
 FOR VALUES FROM (‘7500001’) TO (‘11000000’);

CREATE INDEX ON test04_01 USING BTREE ((data->’id’));
CREATE INDEX ON test04_02 USING BTREE ((data->’id’));
CREATE INDEX ON test04_03 USING BTREE ((data->’id’));
CREATE INDEX ON test04_04 USING BTREE ((data->’id’));

Vagrant Installation

1.  git clone https://github.com/stuartshay/Vagrant
2.  Create multiple directories under the home user dir (the GitHub clone only created /home/ms4446/vagrant-test):
  • /home/ms4446/vagrant-test/conf
  • /home/ms4446/vagrant-test/postgres
  • /home/ms4446/vagrant-test/scripts
  • /home/ms4446/vagrant-test/provision
3.  Edit Vagrantfile
4.  Ensure pg_hba.conf and postgresql.conf exist in /home/ms4446/vagrant-test/conf
5.  Ensure database_provision.sh and vagrant_provision.sh exist in /home/ms4446/vagrant-test/provision
6.  Run `vagrant init ubuntu/trusty64`
7.  cd postgres
8.  vagrant up

1) Get Docker to run in the Vagrant VM

Git Editing

https://github.com/stuartshay/Vagrant/blob/master/ubuntu/provision/vagrant_provision.sh

Uncomment the Docker Part

2) Save the changes you made to GitHub

git add .

git commit -m "Updated File"

git push

Machine Learning

Week of May 1st

“And once again, I will be… In a march to the sea.”

Happy May Day for one and for all!

This week's expedition continued with our first mate Slonik through the relational database version of the Galápagos Islands. Last week, we rendezvoused with our old friend the "SQL Authority", who gave us a mere beginner's guide to the world of PostgreSQL.

Now, no longer a neophyte to Postgres, we needed to kick it up a notch and touch on some more advanced topics like database architecture, logical replication, streaming replication, monitoring of replication scenarios, and a migration path from other relational databases (i.e. MS SQL Server) to Postgres.

So once again we turned back to Pluralsight for some insights on these sophisticated areas, only to find zilch in this realm! So now where to turn? Who could help us navigate these uncharted territories?

Well… Google of course… or I should say Google's streaming video service, a.k.a. youtube.com. There, much to our delight, we found a treasure trove of riches on the higher-level topics related to Postgres.

However, we didn't have our true Eureka moment until we encountered Creston, not to be confused with "The Amazing Kreskin". Although possibly equally as amazing, or possibly even better? This RDBMS enthusiast has put together a bunch of spectacular videos covering a wide variety of nuggets on Postgres. So now, armed with these knowledge bombs, it was full speed ahead!

On Saturday, we kicked it off with quick jolt of architectural review and then we dove right into a high level overview on High Availability in Postgres. To finish it off, we successfully implemented logical replication and we were feeling pretty good about our first day in deep blue sea…

“Like a red morn that ever yet betokened, Wreck to the seaman, tempest to the field, Sorrow to the shepherds, woe unto the birds, Gusts and foul flaws to herdmen and to herds.” – William Shakespeare

..Or more eloquently phrased 

"Red sky at night, sailor's delight. Red sky in morning, sailor's warning" – Some unknown Sailor Dude

Well, that warning came quite quickly on Sunday when we took the plunge and went right to streaming replication. We immediately queued up the Creston video on the subject and began to follow along. …And follow along… and follow along… and follow along… and so on… After a bleary 15 hours of trying, we were left with what turned out to be a red herring of erroneous error messages in the Postgres DB log, and, more importantly, no replication in place…

We finished Sunday and continued into the wee hours on Monday with no progress, a disappointing end to an eighty-consecutive-day step goal, and no exercise done. In fact, I did fewer than 500 steps for the whole day as my gluteus maximus stayed glued to the chair and my head stayed locked on the MacBook. Feeling dejected, I decided to call it quits and start over after some shut-eye.

On Monday, I spent the entire day trying to troubleshoot the issue with more re-watching of the videos and countless google searches but to no avail.

The sun’ll come out Tomorrow…Bet your bottom dollar.. That tomorrow There’ll be sun – Orphan Annie

…And then Tuesday arrived, but the sun actually didn't show up until later in the day (~3:30 PM). My plan was to tear it all down, start from scratch, rebuild a shiny new pristine environment, and follow the video methodically, step by step. When I finished, I was back where I was before: no replication. But this time I had written off the meaningless error messages in the log and just focused on why replication was not working. So I went back to Google and even gave a cry out for help to the community on the DB Stack Exchange.

And then finally it hit me, "like the time I was standing on the edge of my toilet hanging a clock, the porcelain was wet, I slipped, hit my head on the sink, and when I came to… I came up with the flux capacitor".

Actually, that’s a different story but I realized that I had inadvertently put the recovery.conf file in the wrong directory. Doh! Once I placed the file in its proper place and restarted my Postgres servers and the magic began.

Overwhelmed with utter jubilation, I decided it was time to celebrate with a victory lap or 15. To my amazement, I actually set my PR for the mile and ran a solid 7:02 per mile over 3.6 miles… But that's something I can write about somewhere else… Not here, not now…

After Tuesday afternoon's catharsis, it was time to further the mission. On Wednesday we spent some more time on replication slots and replication monitoring, then moved on to migrating the SQL Server sample database AdventureWorksDW2012 to Postgres. The first step was to generate the full table schema from MS SQL Server and then modify the script so Postgres could interpret it. See the log below for more details.

On Thursday, before we could pick up where we last left off, we needed to recover our Ubuntu server, which was hosting our primary and local replica, from a dreaded disk crash.

In an effort to simulate latency during our replication monitoring test, we wrote an INSERT statement using generate_series, which worked great: the replicas started to fall behind as we continuously pumped in data, until we ran out of space and our Ubuntu server sh*t the bed. Now we had to flex our VirtualBox and Linux skills to get the system back online.
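The load generator was nothing fancy; something along these lines (a psycopg2-flavored sketch against the posts table from the detailed log below, with made-up row counts and connection details), repeated until the disk filled up:

import psycopg2

# Pump a few million synthetic rows into the primary so the replicas fall
# behind; table layout matches the posts table used earlier in the log.
insert_sql = """
INSERT INTO posts (id, title, content, published_at, type)
SELECT g,
       'title ' || g,
       repeat('filler ', 50),
       now(),
       'SQL'
FROM generate_series(1, 5000000) AS g;
"""

with psycopg2.connect("dbname=test user=postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(insert_sql)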

First, we had to increase the size of the disk in our VDI, which of course is unsupported in the UI and needs to be done at the command line. Now, with an increased disk, we needed to boot VirtualBox straight into the Ubuntu installer ISO and run our trusted gpart command to extend the volume so Ubuntu could see the newly added free space.

After a quick reboot our system was back online. We re-enabled our streaming replication and, now ready, picked up where we left off, this time migrating the data from MS SQL Server to Postgres. Of course, just like data type conversion, it's not so intuitive. Our conclusion is that if you are going to migrate from MS SQL Server to Postgres it's best to invest in a third-party tool.

However, as a POC we were able to migrate a small table. We accomplished this by using BCP to export all the table data out to individual text files and then importing them into Postgres.
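Sketching that POC flow from memory: bcp dumps a table to a flat file, then psycopg2's copy_expert feeds the file to Postgres COPY. The server name, table, and file names are placeholders, and depending on the platform the row terminators bcp writes may need tweaking before COPY accepts the file:

import subprocess

import psycopg2

# 1) Export one table from SQL Server to a tab-delimited text file with bcp
#    (-c = character mode, -T = trusted connection; fields default to tabs).
subprocess.run(
    ["bcp", "AdventureWorksDW2012.dbo.DimCurrency", "out", "dimcurrency.txt",
     "-S", "sqlserver01", "-T", "-c"],
    check=True,
)

# 2) Stream the flat file into an already-created Postgres table via COPY.
with psycopg2.connect("dbname=adventureworksdw user=postgres") as conn:
    with conn.cursor() as cur, open("dimcurrency.txt", encoding="utf-8") as f:
        cur.copy_expert("COPY dimcurrency FROM STDIN", f)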

On Friday, we got ambushed with some 8th grade Algebra and Science homework, but we managed to find some time to test drive one of the third-party tools used for MS SQL Server to Postgres migrations, with some success. We used the Ispirer Migration and Modernization Toolkit to migrate all the tables from AdventureWorksDW2012 and transfer all the data from Microsoft SQL Server to PostgreSQL. Unfortunately, we weren't so successful with the views and functions, as they require further code re-writes, but that was to be expected. Here is the detailed log of my week's journey.

Below are some topics I am considering for my exploration next week:

Stay safe and Be well

—MCS 

Week of May 1 Detailed Log

Saturday

Logical Replication Steps

Source Server

  1. User and Authentication setup
  2. wal_level configuration
  3. Create Publication
  4. Grant rights to user

Target Server

  1. Create target table
  2. CREATE Subscription
  3. Test Replication

Source Server

sudo su - postgres

psql

CREATE ROLE Replicator REPLICATION LOGIN PASSWORD '*****';

Edit Configuration Files

sudo vim /etc/postgresql/10/main/pg_hba.conf

host    all             replicator      192.168.56.103/32       md5

sudo vim /etc/postgresql/10/main/postgresql.conf

wal_level = logical  

Restart postgres

sudo service postgresql restart

psql

\l

\c dvdrental

\d

Film

CREATE PUBLICATION bpub FOR TABLE film;

GRANT All ON film to Replicator;

Target Server:

create database dvdrental;

sudo vim /etc/postgresql/10/main/pg_hba.conf
Edit configuration Files

host    all             replicator      192.168.56.102/32       md5

Source Server:

Schema only dump

pg_dump dvdrental -t film -s

pg_dump dvdrental -t film -s | psql dvdrental -h 192.168.56.103

Target Server

\c dvdrental

CREATE SUBSCRIPTION bsub CONNECTION 'dbname=dvdrental host=192.168.56.102 user=replicator password=Welcome1' PUBLICATION bpub;

Verify data exists Node 2

select * from film;

Test Replication with adding a new record

Source Server

INSERT INTO public.film(

film_id, title, description, release_year, language_id, rental_duration, rental_rate,

length, replacement_cost, rating, last_update, special_features, fulltext)

VALUES (1001, 'Mark Shay', 'Crazy Yankee Fan', 2020, 1, 5, 6.12, 120, 1.99, 'R', '2020-04-25 14:50:58.951', '{"Deleted Scenes"}', 'ace');

Target Server

select * from film where title = 'Mark Shay';

Sunday, Monday, Tuesday

# Install initial PostgreSQL 10 cluster and verify it exists

sudo pg_lsclusters 

# create a second postgres cluster 

sudo pg_createcluster 10 replica1 

sudo pg_ctlcluster 10 replica1 status 

sudo systemctl status postgresql@10-main 

# create archive directories for both clusters 

sudo -H -u postgres mkdir /var/lib/postgresql/pg_log_archive/main 

sudo -H -u postgres mkdir /var/lib/postgresql/pg_log_archive/replica1 

## Configure Main Cluster (Primary / Master) ################################################### 

# edit configuration file 

sudo nano /etc/postgresql/10/main/postgresql.conf 

  wal_level = replica 

  wal_log_hints = on 

  archive_mode = on # (change requires restart) 

  archive_command = 'test ! -f /var/lib/postgresql/pg_log_archive/main/%f && cp %p /var/lib/postgresql/pg_log_archive/main/%f'

  max_wal_senders = 10 

  wal_keep_segments = 64 

  hot_standby = on 

# edit host based access file 

sudo nano /etc/postgresql/10/main/pg_hba.conf 

  local replication rep_user trust # DO NOT USE. Configure your own connection and authentication information 

# create replication user 

sudo -H -u postgres psql -c "CREATE USER rep_user WITH replication;"

# restart the main cluster 

sudo systemctl restart postgresql@10-main 

## Configure Replica1 Cluster ################################################################### 

# stop replica1 cluster 

sudo systemctl stop postgresql@10-replica1 

# edit configuration file 

sudo nano /etc/postgresql/10/replica1/postgresql.conf 

  wal_level = replica 

  wal_log_hints = on 

  archive_mode = on # (change requires restart) 

  archive_command = 'test ! -f /var/lib/postgresql/pg_log_archive/replica1/%f && cp %p /var/lib/postgresql/pg_log_archive/replica1/%f'

  max_wal_senders = 10 

  wal_keep_segments = 64 

  hot_standby = on 

# edit host based access file 

sudo nano /etc/postgresql/10/replica1/pg_hba.conf 

  local replication rep_user trust 

## Setup Replica1 Cluster Replication ###########################################################

# remove replica1 existing database files 

sudo su - postgres

rm -rf /var/lib/postgresql/10/replica1 

# sync replica1 with main cluster 

pg_basebackup -D /var/lib/postgresql/10/replica1 -U rep_user -w -P -R # -X stream 

#configure recovery.conf 

# https://www.postgresql.org/message-id/attachment/8070/recovery.conf

nano /var/lib/postgresql/10/replica1/recovery.conf 

  restore_command = 'cp /var/lib/postgresql/pg_log_archive/replica1/%f %p'

  recovery_target_timeline = 'latest'

# start replica cluster and verify in sync 

sudo pg_ctlcluster 10 replica1 start 

tail -n 100 /var/log/postgresql/postgresql-10-replica1.log 

## Verify Replica1 Cluster In Sync ############################################################# 

# create database with some data 

sudo su - postgres

psql -c "create database test;" -p 5432

psql test -c "

create table posts ( 

  id integer, 

  title character varying(100), 

  content text, 

  published_at timestamp without time zone, 

  type character varying(100) 

); 

insert into posts (id, title, content, published_at, type) values 

(100, 'Intro to SQL', 'Epic SQL Content', '2018-01-01', 'SQL'),

(101, 'Intro to PostgreSQL', 'PostgreSQL is awesome!', now(), 'PostgreSQL');

"

# verify data has been replicated on replica1

psql test -c "select * from posts;" -p 5433

# stop main cluster (simulate failure condition)

sudo systemctl stop postgresql@10-main

# promote replica1 

sudo pg_ctlcluster 10 replica1 promote 

# verify replica1 is now a master / primary cluster 

tail -n 100 /var/log/postgresql/postgresql-10-replica1.log 

psql test -c "insert into posts (id, title, content, type) values

(102, 'Intro to SQL Where Clause', 'Easy as pie!', 'SQL'),

(103, 'Intro to SQL Order Clause', 'What comes first?', 'SQL');" -p 5433

psql test -c "select * from posts;" -p 5433

Ignore this error in the log:

2020-04-27 00:12:26.278 EDT [3084] LOG:  archive command failed with exit code 1

2020-04-27 00:12:26.278 EDT [3084] DETAIL:  The failed archive command was: test ! -f /var/lib/postgresql/pg_log_archive/main/00000001000000000000000C && cp pg_wal/00000001000000000000000C /var/lib/postgresql/pg_log_archive/replica/00000001000000000000000C

Wednesday:

PostgreSQL Replication Slots

#  Create replication slot on main

sudo su - postgres

psql 

select * from pg_create_physical_replication_slot('replica');

psql

"select * from pg_replication_slots;"

sudo nano /var/lib/postgresql/10/replica/recovery.conf

## Add to existing file

primary_slot_name = 'replica'

# Restart replica to start using the new slot

sudo systemctl restart postgresql@10-replica

tail -n 1000 /var/log/postgresql/postgresql-10-replica.log

# Check Status on Slot

select * from pg_replication_slots;

# Verify Replication is still working

Test Replica as Primary

# Run on Primary

INSERT INTO posts (id, title, content, type) VALUES

(102, 'Intro to SQL Where Clause', 'Easy as pie!', 'SQL'),

(103, 'Intro to SQL Order Clause', 'What comes first?', 'SQL');

# Run on Replica

Select * from posts;

# failover to replica

# Stop Primary Cluster

sudo systemctl stop postgresql@10-main

#verify Secondary log

tail -n 1000 /var/log/postgresql/postgresql-10-replica.log

sudo pg_ctlcluster 10 replica promote

PostgreSQL Replication Monitoring

— on main

select * from pg_stat_replication;

select * from pg_replication_slots;

— on replica check if in standby_mode

select pg_is_in_recovery();

— stop replica and check change in status

sudo systemctl stop postgresql@10-replica

— on main

select * from pg_stat_replication;

select * from pg_replication_slots;

— start replica again

sudo systemctl start postgresql@10-replica

— on main compare WAL status using lsn diff

select pg_wal_lsn_diff('0/23000738','0/230001B0');

— on main query to track lag in bytes

— sending_lag could indicate heavy load on primary

— receiving_lag could indicate network issues or replica under heavy load

— replaying_lag could indicate replica under heavy load

select

  pid,

  application_name,

  pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) sending_lag,

  pg_wal_lsn_diff(sent_lsn, flush_lsn) receiving_lag,

  pg_wal_lsn_diff(flush_lsn, replay_lsn) replaying_lag,

  pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) total_lag

from pg_stat_replication;

— on replica can check replica locations or timing

—   pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp()

SELECT

  CASE

    WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn()

    THEN 0

    ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())

    END AS log_delay;

— Want to know what file a lsn refers to?

select pg_walfile_name(pg_current_wal_lsn());

Thursday:

Manual Migration Steps (a rough rewrite sketch follows the lists below)

  • Prepare the schema
  • Script out the MS SQL Server schema for each table
  • Update all non-supported data types:
    • DATETIME -> TIMESTAMP
    • MONEY -> NUMERIC(19,4)
    • NVARCHAR -> character varying
    • sysname -> character varying
    • tinyint -> int
    • INT IDENTITY -> INT GENERATED ALWAYS AS IDENTITY
    • varbinary(max) -> bytea
    • MONTH(...) -> EXTRACT(MONTH FROM ...)
    • DAY(...) -> EXTRACT(DAY FROM ...)
    • DATEDIFF -> DATEPART

Remove all references to:

  • NONCLUSTERED
  • CLUSTERED
  • WITH CHECK
  • Remove All ALTER Statements with CHECK Constraint statements
  • ASC
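The sketch promised above: the mappings in the two lists lend themselves to a quick-and-dirty rewrite script. This is only a toy illustration of how we massaged the scripted schema, nowhere near a real migration tool, and the sample DDL is made up:

import re

# Ordered (pattern, replacement) rules distilled from the lists above;
# purely illustrative and far from exhaustive.
RULES = [
    (r"\bINT\s+IDENTITY(\s*\(\s*\d+\s*,\s*\d+\s*\))?", "INT GENERATED ALWAYS AS IDENTITY"),
    (r"\bDATETIME\b", "TIMESTAMP"),
    (r"\bMONEY\b", "NUMERIC(19,4)"),
    (r"\bNVARCHAR\b", "character varying"),
    (r"\bsysname\b", "character varying"),
    (r"\btinyint\b", "int"),
    (r"\bVARBINARY\s*\(\s*MAX\s*\)", "bytea"),
    (r"\bNONCLUSTERED\b|\bCLUSTERED\b|\bWITH CHECK\b|\bASC\b", ""),
]

def translate_schema(tsql: str) -> str:
    """Apply the crude T-SQL to Postgres rewrites to a scripted schema."""
    out = tsql
    for pattern, replacement in RULES:
        out = re.sub(pattern, replacement, out, flags=re.IGNORECASE)
    return out

# Made-up sample DDL just to show the rewrites in action.
print(translate_schema(
    "CREATE TABLE dbo.DimSample (SampleKey INT IDENTITY(1,1) NOT NULL, "
    "LoadDate DATETIME, Amount MONEY, Name NVARCHAR(50), Blob VARBINARY(MAX))"
))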

Friday:

Week of April 24th

"We'll search for tomorrow on every shore…"

Last week, after our leisurely cruise had docked, it wouldn’t be too long before we would set an “open course for this week’s virgin sea“. As I was preparing my coordinates,  “I look to the sea, reflections in the waves spark my memory” which led me down a familiar path. As some of you might know, I have spent a majority of my career working with Relational Databases (in particular Microsoft SQL Server). 

Over the years MS SQL Server has become one of the most popular RDBMSs, but with each new release and the awesome new features added to SQL Server, it has in some cases become highly restrictive from a licensing standpoint, with a very high TCO, especially if the database is large in size or accessed by many clients. With enterprise licensing skyrocketing, this of course opened the door for the open-source RDBMS movement.

The leader in this category has been MySQL. However, after it was acquired by Oracle, many have been dissuaded from using this database for new projects. Not to mention, the original creator of MySQL left after the acquisition and subsequently forked the code to develop MariaDB, which has received a lukewarm response in the industry. But the real little blue elephant, or "слоник", in the room was clearly PostgreSQL.

Both PostgreSQL and MySQL launched around the same time, but not until recent years has PostgreSQL really taken off; it was always sort of lurking in the grasslands. Today, PostgreSQL has emerged as one of the leaders not only in the open-source world but among all relational databases. So then "a gathering of angels appeared above my head… They sang to me this song of hope, and this is what they said…"

Ok, Where to start? 

Well, the basics..

First, I needed a Postgres environment to work with. For this exercise, I wanted to avoid any additional charges in the cloud, so I needed to develop my own on-prem solution.


Here are the steps I took:

  • Install Oracle VirtualBox on the Mac book
  • Download Ubuntu 18.x.x
  • Mount ISO and Install Ubuntu
  • Change System Memory to higher value
  • Change display settings Memory to higher value
  • After Ubuntu install -> Power down
  • In Virtual Box Click Tools -> Network -> Create NIC
  • Under Ubuntu Image ->
  • Create a 2nd virtual NIC
  • Host-only Adapter

Get SSH working:

sudo apt update

sudo apt install openssh-server

sudo systemctl status ssh

sudo ufw allow ssh

Install PostgreSQL (Server) on Ubuntu:

sudo su -

apt-get install postgresql postgresql-contrib

update-rc.d postgresql enable

service postgresql start

Verify local connection (on the server):

sudo -u postgres psql -c "SELECT version();"

sudo su - postgres

Change Postgres Password from Blank to something meaningful

psql 

ALTER USER postgres PASSWORD 'newPassword';

\q

exit

Open up the FW port to allow Postgres traffic

ufw allow 5432/tcp

Enable remote access to PostgreSQL server

  • Edit postgresql.conf

sudo vim /etc/postgresql/10/main/postgresql.conf

Change listen_addresses = 'localhost' to listen_addresses = '*'

—Restart postgres

sudo service postgresql restart

—Verify Postgres listening on 5432

ss -nlt | grep 5432

  • Editing the pg_hba.conf file.

sudo vim /etc/postgresql/10/main/pg_hba.conf

host    all             all              0.0.0.0/0                   md5

host    all             all              ::/0                            md5

—Restart postgres

sudo service postgresql restart

(On Mac Client)

Download pgAdmin 4 for MAC (Client)

  • Launched Browser
  • Created Postgres connection

Next, I downloaded and restored the sample database (http://bit.ly/pagilia-dl) and I was ready to take on some learning.

For my education on Postgres I turned to a reliable source, Pluralsight, and to none other than the SQLAuthority himself, who has produced a series of great courses! Below was my syllabus for the week:

Monday:
        1. PostgreSQL: Getting Started by Pinal Dave

Tuesday:
        2. PostgreSQL: Introduction to SQL Queries by Pinal Dave

Wednesday:
        3. PostgreSQL: Advanced SQL Queries by Pinal Dave

Thursday:
        4. PostgreSQL: Advanced Server Programming by Pinal Dave

Friday:
        5. PostgreSQL: Index Tuning and Performance Optimization by Pinal Dave

Below are some topics I am considering for my exploration next week:

Stay safe and Be well

—MCS 

Week of April 17th

“Seems so good… Hard to believe an end to it.. Warehouse is bare”

Greetings and Salutations..

My odyssey this week didn't quite have the twists and turns of last week's pilgrimage. Perhaps it was a bit of a hangover from "Holy Week", or just sheer lassitude from an overabundance of learning new skills during this time of quarantine?

Last weekend, I finally got a chance to tinker around with the Raspberry Pi. I managed to get it set up on my WiFi network, thanks to finding an HDMI cable to hook it up to my 50″ Samsung TV. I also learned how to set up VSCode on Amanda's Mac and connect via SSH to my Pi without using a password (creating a key instead), which is a convenient little trick. In addition, I got to play around with the Sense HAT Python module and made the Raspberry Pi display some trippy flashing lights on the LED board add-on.

So after being on cloud nine most of last week, I decided I should stay a little longer in the stratosphere and bang on the Google Cloud Platform, more endearingly known as GCP, and its Compute Engine. Again, I would rebuild my SQL Server 2016 Always On environment (previously created several weeks back on AWS and last week on Azure). Once again, the goal would be to compare and contrast all 3 major cloud providers. In an effort to avoid redundancy here, I won't reiterate the prerequisites taken to build out my SQL farm. However, I will share my observations on GCP.

GCP – Compute Engine Issues – Observations:

  • More options than Azure but fewer than AWS; the interface was a bit confusing to navigate
  • The 1st instance built (Domain Controller) got corrupted when I upgraded my account from Free to Paid; the instance needed to be rebuilt
  • The Windows domain failed to be created as the Administrator account was set to blank (it is disabled on AWS EC2 and Azure VMs)
  • Disks can only be detached from an instance that is in a "Stopped" state

Here are my final rankings based on the 3 cloud providers I evaluated:

Pros

1. Best UI: AWS

2. Most Features: AWS

3. Easiest to Navigate UI: Azure

4. Most suited for Microsoft workloads (i.e. SQL Server): Azure

5. Most enticing to use it: GCP (free offer)

6. Most potential: GCP

Cons

1. Hardest to Navigate UI: GCP

2. Hardest to configure (security): AWS

3. Least amount of features: Azure

So after burning out my fuse up there alone… it was time to touch down and bring me round again.. And that led me back to SQL Server and Columnstore Indexes. Here is what I did:

  • Downloaded/Restored modified AdventureWorksDW2012
    • Created 60 million rows for the Internet Sales Fact table
    • Created Columnstore Index and benchmarked performance vs old Row based Index
    • Created Table partition for Internet Sales Fact table and designed Partition Switching scenario (workaround used with SQL 2012 and non-clustered CIs) Old Lava Archive database design.

Next Steps.. 
Below are some topics I am considering for my voyage next week:

  •  SQL Server Advanced Features:
    • Best practices around SQL Server AlwaysOn (Snapshot Isolation/sizing of Tempdb, etc)
  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Process and Transforming data/Explore data through ML (i.e. Databricks) .
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well

—MCS 

Week of April 10th

“…When the Promise of a brave new world unfurled beneath a clear blue Sky”

“Forests, lakes, and rivers, clouds and winds, stars and flowers, stupendous glaciers and crystal snowflakes – every form of animate or inanimate existence, leaves its impress upon the soul of man.” — Orison Swett Marden

My journey for this week turned out to be a sort of potpourri of various technologies and solutions thanks to the wonderful folks at MSFT.  After some heavy soul searching over the previous weekend, I decided that my time would be best spent this week on recreating the SQL Server 2016 with Always On environment (previously created several weeks back on AWS EC2) but in the MS Azure Cloud platform.  The goal would be to better understand Azure and how it works. In addition, I would be able to compare and contrast both AWS EC2 vs. Azure VMs and be able to list both the pros and cons of these cloud providers. 

But before I could get my head into the clouds, I was still lingering around in the bamboo forests. This past weekend, I was presented with an interesting scenario: stream market data to pandas from the Investors Exchange (thanks to my friend). So after consulting with Mr. Google, I was pleasantly surprised to find that IEX offers an API that allows you to connect to their service and stream messages directly to Python and use pandas for data visualization and analysis. Of course, being the cheapskate that I am, I signed up for a free account and off I went.

So I started tickling the keys and produced a newly minted IEX Py script. After some brief testing, I started receiving an obscure error. Of course there was no documented solution on how to address such an error…

So after some fruitless nonstop piping of several modules, I was still getting the same error. 🙁 After a moment of clarity I deduced there was probably a limitation on the number of messages you can stream from the free IEX account.

So I took a shot in the dark and decided to register for another account (under a different email address); this way I would receive a new token and could give that a try.

… And Oh là là!  My script started working again! 🙂 Of course as I continued to add more functionality and test my script I ran back into the same error but this time I knew exactly how to resolve it. 

So I registered for a third account (to yet again generate a new token). Fortunately, I completed my weekend project. See attachments Plot2.png and Plot3.png for pretty graphs.

Now that I could see the forest through the trees, it was off to the cloud! I anticipated that it would take me a full week to explore Azure VMs, but it actually only took a few days to wrap my head around them.

So this left me a chance to pivot again, this time to a Data Warehouse/Data Lake solution built for the cloud, turning the forecast for the rest of the week to Snow.

Here is a summary of what I did this week:

Sunday:

  • Developed Pandas/Python Script in conjunction with iexfinance & matplotlib modules to build graphs to show historical price for MSFT for 2020 and comparison of MSFT vs INTC for Jan 2nd – April 3rd 2020

Monday: (Brief summary)

  • Followed previous steps to build the plumbing on Azure for my small SQL Server farm (See Previous status on AWS EC2  for more details) 
  1. Created Resource Group
  2. Create Application Security Group   
  3. Created 6 small Windows VMs in the same Region and an Availability Zone
  4. Joined them to Windows domain

Tuesday: (Brief summary)

  1. Created Windows Failover Cluster
  2. Installed SQL Server 2016
  3. Setup and configured AlwaysOn AGs and Listeners    

 Observations with Azure VMs:

Cons

  • Azure VMs are very slow the first time they are brought up after a build
  • Azure VMs have a longer provisioning time than EC2 instances
  • No UI option to perform bulk tasks (like the AWS UI); the only option is templating through scripting
  • Cannot move a Resource Group from one geographical location to another, unlike VMs and other objects within Azure
  • When deleting a VM, all child dependencies are not dropped (Security Groups, NICs, Disks); perhaps this is by design?

    Objects need to be dissociated from their groups and then deleted to clean up orphaned objects

    Neutral

  • Easy to migrate VMs to higher T-Shirt Sizes
  • Easy to provision Storage Volumes per VM
  • Application Security Groups can be used to manage TCP/UDP traffic for entire resource group

  Pros

  • You can migrate existing storage volumes to premium or cheaper storage seamlessly
  • Less network administration 
    • less TCP/UDP ports need to be opened especially ports native to Windows domains
  • Very Easy to build Windows Failover clustering services 
    • Natively works in the same subnet
    • Less configuration to get connectivity working than AWS EC2
  • Very Easy to configure SQL Server 2016 Always On
    • No need to create 5 Listeners (different per subnet) for a given specific AG 
    • 1 Listener per AG
  • Free Cost, Performance, Operation Excellence Recommendations Pop up after Login

Wednesday:

  • Registered for an Eval account for Snowflake instance
  • Attended Zero to Snowflake in 90 Minutes virtual Lab
    • Created Databases, Data Warehouses, User accounts, and Roles
    • Created Stages to be used for Data Import
    • Imported Data Sources (Data in S3 Buckets, CSV, JSON formats) via Web UI and SnowSQL cmd line tool
    • Ran various ANSI-92 T-SQL Queries to generate reports from SnowFlake

Thursday:

Friday:

**Bonus Points **

  • More Algebra – Regents questions. 
  • More with conjugating verbs in Español (AR Verbs)

Next Steps.. 
Below are some topics I am considering for my voyage next week:

  •  SQL Server Advanced Features:

           – Columnstore Indexes
           – Best practices around SQL Server AlwaysOn (Snapshot Isolation/sizing of Tempdb, etc)

  • Data Visualization Tools (i.e. Looker)
  • ETL Solutions (Stitch, FiveTran) 
  • Process and Transforming data/Explore data through ML (i.e. Databricks) .
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well

—MCS 

Week of April 3rd

“The other day, I met a bear. A great big bear, a-way out there.”

As reported last week, I began to dip my toe into the wonderful world of Python. Last week, I wasn't able to complete the Core Python: Getting Started Pluralsight course by Robert Smallshire and Austin Bingham, so I had to do some extended learning over the weekend. I was able to finish the "Iteration and Iterables" module, which I started last Friday, and then spent the rest of the weekend on the module on "Classes", which was nothing short of a nightmare. I spent numerous hours on this module trying to debug my horrific code and rewatching the lessons over and over again. This left me with the conclusion that I just simply don't get object-oriented programming and probably never will.


Ironically, a conclusion I derived almost 25 years ago when I attended my last class at the University at Albany, which was in C++ object-oriented programming. Fortunately, I escaped that one with a solid D- and was able to pass Go, collect $200, and move on to the working world. So after languishing with Classes in Python, on Monday I was able to proceed with the final module on File IO and Resource Management, which seemed more straightforward and practical.


On Tuesday, life got a whole lot easier when I installed Anaconda Navigator. Up until this point I had been writing my Python scripts in the TextWrangler editor on the Mac, which was not ideal.


Through Anaconda, I discovered the Spyder IDE, which was like a breath of fresh air. No longer did I have to worry about aligning spaces, open and closed parentheses, or curly and square brackets. Now, with a proper IDE environment, I was able to begin my journey into the Pandas Jungle…


Here is what I did:

  1. Completed the Pandas Fundamentals course
  2. Installed Anaconda, the pandas Python module, and SQLite
  3. Created pandas/Python scripts that:
    1. Read in a CSV file (Tate Museum Collection) and output it to a pickle file
    2. Read in a JSON file and write the output to the screen
    3. Traverse directories with multiple JSON files and write the output to a file
    4. Perform iteration, aggregation, and filtering (transformation)
    5. Create indexes on data from the CSV file for faster retrieval of data
    6. Read the data source (Tate Museum Collection) and output data to Excel spreadsheets, with multiple columns, multiple sheets, and colored-column options
    7. Connect to an RDBMS using the SQLAlchemy module (used a SQLite database as a POC), create a table, and write data to the table from a data source (pickle file) (see the sketch after this list)
    8. Create JSON file output from a data source (pickle file)
    9. Create a graph using the matplotlib.pyplot and matplotlib modules. See attachment.
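A bite-sized sketch of the SQLAlchemy piece (item 7 above), using an in-memory SQLite database and a made-up DataFrame instead of the Tate pickle file:

import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite keeps the proof-of-concept self-contained.
engine = create_engine("sqlite:///:memory:")

# Stand-in for the pickle-file data source used in the real script.
artworks = pd.DataFrame({
    "id": [1, 2, 3],
    "artist": ["Blake", "Turner", "Constable"],
    "year": [1805, 1840, 1821],
})

# to_sql creates the table and writes the rows in one call.
artworks.to_sql("artworks", engine, if_exists="replace", index=False)

# Read it back to confirm the round trip.
print(pd.read_sql("SELECT artist, year FROM artworks", engine))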

**Bonus Points ** Continued to dredge up old nightmares from freshman year of high school as I took a stroll down memory lane with distributing binomials, perfect square binomials, difference-of-squares binomials, factoring perfect square trinomials, factoring the difference of squares, F.O.I.L., and other algebraic muses.

In addition, I revisited conjugating verbs in Español and writing descriptions (en Español) for 9 family members.

Next Steps…
There are many places I still need to explore..

Below are some topics I am considering:

  • A Return to SQL Server Advanced Features:

            – Columnstore Indexes
            – Best practices around SQL Server AlwaysOn (Snapshot Isolation/sizing of Tempdb, etc)

  • Getting Started with Kubernetes with an old buddy (Nigel)
  • Getting Started with Apache Kafka 
  • Understanding Apache ZooKeeper and its use cases

I will give it some thought over the weekend and start fresh on Monday.
Stay safe and Be well

—MCS 

Week of March 27th

I am an enchanter. There are some who call me…Tim.

This week I decided to take a break from SQL Server with AlwaysOn on AWS EC2 and focus my attention on the Python programming language. Despite gracefully shutting down all my instances, I am still racking up over ~$120 of charges and growing… This is the stuff they don't tell you about in the brochure. In either case, I will hold on to my environment for now and carry on with the re-tooling of my skills…

So far, I have learned quite a lot about Python. Firstly, it's quite an extensive programming language with many use cases, and of course no matter how hard I try I still suck at programming; it's just not in my DNA. However, the goal was not to reinvent myself with a skill that's clearly one of my weaknesses but to have a good general understanding of the syntax and of the value proposition. And that, I am happy to report, is going quite well so far.

Here is what I did with Python: 

1. Completed the Pluralsight course Python: The Big Picture by Jason Olson

2. Installed Homebrew on Amanda's Mac with the latest version of Python 3.8.x and some nice extras

3. Created several Python scripts to be used with the training modules and took notes

4. Completed the following modules of Core Python: Getting Started by Robert Smallshire and Austin Bingham

5. To reinforce my learnings I recite "The Zen of Python" each and every night before I go to bed

** Bonus Points ** I spent over 4 hours this week working on Parabolas and other fun Algebraic equations.

Not to mention a little español and some environmental sciences..
Since I didn't complete the full course this week as I had hoped to 🙁, I plan to finish the remaining sections of Iteration and Iterables and the two remaining modules on Classes and File IO and Resource Management over the weekend. Doing work on the weekend is not a habit I am looking to make a regular practice.

Next Steps.. 

Pandas Fundamentals

Have a nice weekend!

–MCS

Week of March 20th

"Hey, you, get off of my cloud!" (that includes you, Corona)

So after taking a brief victory lap for getting my virtual Windows domain sandbox up and running in AWS EC2 last week, this week I attempted to get SQL Server AlwaysOn running with multiple Availability Groups and multiple Replicas (Async w/Read, Async w/No Read, Sync with Automatic Failover, and Sync with Read + Automatic Failover options).

After spending multiple days and countless attempts, I was left with numerous disappointments for most of the week as I fruitlessly chased numerous red herrings with non-useful Windows Cluster Log errors.

However, I refused to accept defeat! So I am happy to report that I finally persevered for this week’s mission

Here is what I did:

  • Removed my flaky implementation of AlwaysOn AGs and Listener, along with the lame hacks that had my AlwaysOn implementation pseudo-running earlier in the week
  • Destroyed Windows Failover Cluster

Began the recovery mission:

  • Allocated 5 new class C subsets to existing VPC
  • Added additional Policies to Security Group in VPC
    • Added UDP port for SQL Browser Service
    • Added TCP ports used with AlwaysOn
  • Added a second virtual NIC to designated 5 SQL Severs EC2 instances
  • Configured 2nd IP for 2nd NIC using unique IP from designated Class C subnet
  • Designated DC/DNS Server as my Jump Server as all SQL EC2 instances are no longer routable from internet thru RDP
  •  Recreated Windows Failover Cluster with 5 SQL Servers 
  •  Added 5 virtual Network IPs (1 for each Class C subnet) to be used with Windows Cluster.
  • Recreated Always On with 2 AGs with ASync w/Read, ASync w/ No read, Sync with AF and Sync with Read AF
  • Configured SQL Listener to use 5 IPs in each subnet that hosts SQL Servers.
  • Added 5 Network IP under AG resources in Windows Cluster to be used with SQL Listener
  •  Enabled the SQL Browser Service on all 5 SQL EC2 instances (as I tried to be cute and made all SQL instances Named Instances instead of using the default)
  •  Removed all SQL Aliases previously created as they are no longer needed
  •  Resolved “nslookup” issues that were plaguing name resolution on my Network
  •  Added 5 new Class C subnets to Reverse Lookup Zones in DNS
  •  Added Windows Cluster and SQL Listener Names for each subnet to DNS Forward Lookup Zones
  •  Successfully Tested failover scenarios w/ SQL AlwaysOn
  •  Installed and configured Redgate SQL Data Generator software on Jump Server
  •   Successfully Tested various scenarios with pumping millions of records to the SQL Listener while initiating AlwaysOn failover

Next Steps.. 

  • Play around a little bit more with SQL AlwaysOn
  • Maybe take on Snakes and Bears (Start by watching some videos)…
  • Stayed Tuned.

Peace Out –
–MCS

Week of March 13

Forget Corona! Take me to the cloud

Creating and Configuring a SQL Server AlwaysOn Configuration in AWS EC2

  • Created VPC in AWS
    • Designated Private Subnet address
    • Specified proper routes
  • Created Security Groups
    • Specified specific Protocols, Ports, Source/destination 
  • Created 6 micro AWS Instances of Windows 2016 DC Edition
    • 1 Domain Controller
    • 1 Primary Database Server
    • 4 Replica Database Servers
  • Configured Windows Active Directory, DNS
    • Joined Windows Domain
    • Created FID to be used with SQL Server Services
  • Established connectivity between “MCSSYSTEMS” AWS Instances
    • Configured virtual NICs to point internal DNS and use static IP
  • Configured additional storage volumes
    • 15 (300 GB) Volumes  – 3 volumes (Data, Log, TempDB ) to be assigned to 5 designated SQL Servers)
    • Attached storage to AWS Instances to be used with SQL Server
    • Brought Disks online as Dynamic disks
    • Formatted Disks (64K)
  • Migrated 5 AWS micro instances to medium t-shirt size instances to be used with SQL Server
    •  Re-configured virtual NICs to point to internal DNS and use static IPs (as all settings were lost during migration)
  • Installed SQL Server 2016 w/ SP2 and SSMS 2018 on 5 Instances
    • Disabled Windows FW
    • enforced best practices (separation of Data, Log, Temp)
    • Created multiple Temp DB files to match v CPUs
    • Put SQL FID in Lock Pages in Memory Policy, instant file initialization
    • Configured static port # to be used with SQL Server instances
    • Configured a SQL Server alias so instances can connect to a friendly name (DNS is flaky)
  • Began SQL Server AlwaysOn Prep work
    • Install/Configured Windows Fail Over Cluster Services on 5 SQL Server AWS Instances
    • Enable each SQL Server for Always on availability groups
    • Verified all SQL Servers are running properly and prepped for AlwaysOn

Next Steps.. 

  • Download and attach AdventureWorks database (Sample)
  • Configure AlwaysOn, Availability Group(s), Listener
  • Create Primary, Secondary, and Read Only Replicas, etc
  • Consult with the Big Guns