devops | SQL Squirrels

Week of June 5th

Posted on June 5, 2020 by Mark Shay

“Another dimension, new galaxy Intergalactic, planetary”

Happy National Donut Day!

“There is an inexorable force in the cosmos, a place where time and space converge. A place beyond man’s vision…but not beyond his reach. Man has reached the most mysterious and awesome corner of the universe…a point where the here and now become forever…. A journey that takes you where no man has been before. Experience the power⚡️! A journey that begins where ~~everything~~ nothing ends! You can’t escape the most powerful force in the ‘DevOps’ universe.”

Mission #7419I

So once again, we boarded the USS Palomino 🚀 and continued our exploration of the far depths of the DevOps Universe. Just to pick up where we last left off, 👨‍✈️ Captain Bret Fisher had taken us through the Microservices galaxy 🌌 and straight to Docker🐳 and Containers. But “with so many light years to go… and things to be found,” we continued through the courseware Docker Mastery: with Kubernetes +Swarm from a Docker Captain and reconnoitered Docker🐳 Compose, Docker🐳 Swarm, Docker🐳 Registries, and the infamous Kubernetes☸️.

Again, we leveraged the portability of HashiCorp’s Vagrant for all three environments: Docker🐳 with Docker🐳 Compose, our 3-node Docker🐳 Swarm, and K8s☸️. We were grateful for our previous experience with Vagrant in earlier learnings, as it made standing up these environments quite seamless.

We started off with Docker🐳 Compose, which can be quite a useful tool in development for defining and running multi-container Docker🐳 applications. Next, we headed right over to Docker🐳 Swarm to get our initiation into Container Orchestration. You might ask, why not just go straight to Kubernetes☸️, the clear winner🏆 of the famous Container Orchestration wars? Well, Orchestration is great for solving complex problems, but Orchestrators themselves can be complex solutions to try to learn. From what we witnessed this week, we were glad we started with Swarm. We also learned that the “combination of Docker 🐳 Swarm, Stacks, and Secrets are kind of like a trilogy of awesome features” that could really make things easier if we went this route in production.

“Resistance is Futile”

If you are not familiar with the story of Kubernetes☸️, affectionately known as k8s☸️: it came out of Google, built by developers who had worked on the infamous Google “Borg” project. In fact, here is a little bit of trivia: the code name for K8s☸️ was Project Seven, a reference to the Star Trek🖖 character Seven of Nine, who was a “friendlier” Borg. K8s☸️ was certainly uncharted territory for us and a bit outside my purview, but it was a good learning experience to get a high-level overview of another important component of the infrastructure ecosystem.

Captain’s log, Stardate 73894.9. These are the missions covered in earnest this week:

Created a 3-node Swarm cluster in the cloud
Installed Kubernetes and learned the leading server cluster tools
Used Virtual IPs for built-in load balancing in our cluster
Optimized Dockerfiles for faster building and tiny deploys
Built/Published custom application images
Learned the differences between Kubernetes and Swarm
Created an image registry
Used Swarm Secrets to encrypt our environment configs
Created the config utopia of a single set of YAML files for local dev, CI testing, and prod cluster deploys
Deployed apps to Kubernetes
Made Dockerfiles and Compose files
Built multi-node Swarm clusters and deployed H/A containers
Made Kubernetes YAML manifests and deployed them using infrastructure-as-code methods
Built a workflow of using Docker in dev, then test/CI, then production with YAML

For more details see the complete Log

This turned out to be quite the intensive undertaking this week, but we accomplished our mission, and here is the certificate to prove it.

Below are some topics I am considering for my exploration next week:

Google BigQuery
More with Data Pipelines
Google Cloud Data Fusion (ETL/ELT)
NoSQL – MongoDB, Cosmos DB
Working with JSON files
Working with Parquet files
JDBC Drivers
More on Machine Learning
ONTAP Cluster Fundamentals
Data Visualization Tools (e.g. Looker)
ETL Solutions (Stitch, Fivetran)
Processing and transforming data / exploring data through ML (e.g. Databricks)

Stay safe and Be well –

–MCS

Week of May 8th

Posted on May 8, 2020 by Mark Shay

“Now it’s time to leave the capsule if you dare..”

Happy Friday!

Before we could end our voyage and return our first mate Slonik to the zookeeper, we would first need to put a bow on our Postgres journey (for now) by covering a few loose ends on advanced features. On Saturday, we kicked it off with a little review of isolation levels in Postgres, including a deep dive on Serializable Snapshot Isolation (SSI). Then it was on to third-party monitoring for database health and streaming replication, and for la cerise sur le gâteau… Declarative Partitioning and Sharding!

Third-Party Monitoring

We evaluated two solutions, OpsDash and pgDash. Both were easy to set up, and both gave valuable information about Postgres. OpsDash provides more counters and can monitor system information as well as other services running on Linux, whereas pgDash is Postgres-specific and gives a deeper look into Postgres and streaming replication than you get from just querying the native system views.

Declarative Partitioning

It was fairly straightforward to implement Declarative Partitioning. We reinforced these concepts by turning to Creston’s plethora of videos on the topic as well as several blog posts. See below for the detailed log.

Sharding Your Data with PostgreSQL

There are third-party solutions like Citus Data that seem to offer a more scalable approach, but out of the box you can implement sharding by setting up Declarative Partitioning on a primary server and configuring a Foreign Data Wrapper (FDW) that points at a remote server. The partitions on the primary are then foreign tables, so each partition’s rows actually live on a remote server, and the combination of partitioning and FDW gives you sharding. This was quite an interesting solution, although I have strong doubts about how scalable it would be in production.
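To make the partitioning-plus-FDW idea a bit more concrete, here is a minimal sketch in Python that only builds the SQL statements involved. It is not the exact setup from our lab: the table, column, server, host, and database names are all made up for illustration, and a real deployment would also need a `CREATE USER MAPPING` per server and a matching table already created on each remote server.

```python
# Sketch only: every name below (measurements, logdate, shard1, hosts,
# shard_db) is hypothetical and chosen purely for illustration.

def shard_ddl(parent, key, shards):
    """Build DDL for a range-partitioned parent table whose partitions
    are foreign tables served by remote Postgres servers (postgres_fdw).

    shards is a list of (server_name, host, range_start, range_end).
    """
    stmts = [
        "CREATE EXTENSION IF NOT EXISTS postgres_fdw;",
        # The parent holds no rows itself; it only routes by the key.
        f"CREATE TABLE {parent} (id bigint, {key} date NOT NULL, reading numeric) "
        f"PARTITION BY RANGE ({key});",
    ]
    for name, host, start, end in shards:
        # One foreign server per shard.
        stmts.append(
            f"CREATE SERVER {name} FOREIGN DATA WRAPPER postgres_fdw "
            f"OPTIONS (host '{host}', dbname 'shard_db');"
        )
        # The partition is a foreign table, so its rows live on the shard.
        stmts.append(
            f"CREATE FOREIGN TABLE {parent}_{name} PARTITION OF {parent} "
            f"FOR VALUES FROM ('{start}') TO ('{end}') SERVER {name};"
        )
    return stmts


if __name__ == "__main__":
    plan = shard_ddl(
        "measurements",
        "logdate",
        [
            ("shard1", "10.0.0.11", "2020-01-01", "2020-07-01"),
            ("shard2", "10.0.0.12", "2020-07-01", "2021-01-01"),
        ],
    )
    print("\n".join(plan))
```

Generating the statements rather than running them keeps the sketch free of any database driver; the output can be reviewed and then pasted into `psql` on the primary.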

On Sunday, we took a much-needed respite as the weather was very agreeable in NYC to escape the quarantine…

On Monday, with our rig now dry-docked, we would travel by different means to another dimension, a dimension not only of sight and sound but of mind. A journey into a wondrous land of imagination. Next stop, the DevOps Zone!

To begin our initiation into this realm we would start off with HashiCorp’s Vagrant.

For those who are not familiar with Vagrant, it is not a transient mendicant, as the name would otherwise imply, but a nifty open-source solution for building and maintaining lightweight and portable DEV environments.

It’s kind of similar to Docker, for those more familiar with that, but it generally works with virtual machines (although it can be used with containers).

At the most basic level, Vagrant provisions slimmed-down but complete VMs from pre-built base boxes, whereas Docker is kind of the “most minimalistic version for process and OS bifurcation by leveraging containers”.

The reason to go this route, as opposed to the more popular Docker, was that it is generally easier to stand up a DEV environment.

With that being said, we wound up spending a considerable amount of time on Monday and Tuesday this week working on this, as I ran into some issues with SSH and the “vagrant up” process. The crux of the issue was running Vagrant/VirtualBox inside an Ubuntu VM that was itself already running under VirtualBox on a Mac. This convoluted setup didn’t seem to play nice. Go figure?

Once we decided to install Vagrant with VirtualBox natively on the Mac, we were up and running and could easily spin up and deploy VMs.

Next, we played a little bit with Git, getting some practice with the workflow of editing configuration files and pushing the changes straight to the repo.

On Wednesday, we decided to begin our exploration of strange new worlds, to seek out new life and new civilizations, and of course to boldly go where maybe some have dared to go before? That would be, of course, Machine Learning, where the data is the oil and the algorithm is the engine. We would start off slow by just trying to grasp the jargon, like training data, training a model, testing a model, Supervised Learning, and Unsupervised Learning.
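For anyone else wrestling with the same jargon, here is a tiny, dependency-free sketch of what “training data”, “training a model”, and “testing a model” mean in a supervised setting. It is not taken from the course: the data set (hours studied, passed or not) is invented, and the “model” is just a single learned threshold, chosen to keep the example short.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold back a fraction for testing."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def fit(rows):
    """'Training a model': pick the threshold that best separates the labels."""
    candidates = sorted({x for x, _ in rows})
    return max(candidates, key=lambda t: sum((x >= t) == y for x, y in rows))

def evaluate(threshold, rows):
    """'Testing a model': fraction of rows the threshold labels correctly."""
    return sum((x >= threshold) == y for x, y in rows) / len(rows)

# Invented labelled data (supervised learning needs labels):
# (hours_studied, passed_exam)
data = [(hours, hours >= 10) for hours in range(20)]

if __name__ == "__main__":
    training_data, testing_data = train_test_split(data)
    model = fit(training_data)
    print("learned threshold:", model)
    print("accuracy on training data:", evaluate(model, training_data))
    print("accuracy on unseen test data:", evaluate(model, testing_data))
```

The point of holding back the test rows is that the model never sees them during training, so the second accuracy figure is the more honest one.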

The best way for us to absorb this unfamiliar lingo would be to head over to Pluralsight where David Chappell offered a great introductory course on Understanding Machine Learning

“Now that she’s back in the atmosphere… With drops of Jupiter in her hair, hey, hey”

On Thursday we would go further down the rabbit hole of Machine Learning with Jerry Kurata’s Understanding Machine Learning with Python

There we would be indoctrinated in the powerful tool that is Jupyter Notebook. Now armed with this great “Bat gadget”, we would reunite with some of our old heroes from the “Guardians of the Python” like “Projectile” pandas, matplotlib “the masher”, and of course NumPy, “the redheaded stepchild of Thanos”. In addition, we would also be introduced to a new and special superhero, scikit-learn.

For those not familiar with this powerful library, “scikit-learn” has unique and empathic powers that work hand in hand with our friends NumPy, pandas, and SciPy. This handy Python lib is ultimately the key that unlocks the Machine Learning Universe through Python.

Despite all this roistering with our exemplars of Python, our voyage wasn’t all rainbows and Unicorns.

We got introduced to all sorts of new space creatures, like Bayesian and Gaussian algos, each conjuring up bêtes noires. The mere thought of Bayes’ theorem dredged up old memories buried deep in the bowels of college, back when I was studying probability. And the mere mention of Gaussian functions jarred memories of The Black Swan: not the ballet movie with fine actresses Natalie Portman and Mila Kunis, but the well-written and often irritating NYT best seller by Nassim Nicholas Taleb.

Unfortunately, it didn’t get any cozier when we set our course for the powerful and complex ensemble of the Random Forest algo. There we got bombarded by meteorites such as “Overfitting”, “Regularization Hyper-parameters”, and “Cross Validation”, not to mention the dreaded “Bias-variance tradeoff”. Ouch! My head hurts…
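Of those meteorites, “Cross Validation” was the one that made the most sense once written out by hand. Below is a minimal plain-Python sketch of k-fold cross validation, assuming a deliberately trivial “model” that just predicts the mean of its training labels. The helper names and the toy data are our own inventions; in practice scikit-learn’s `KFold` and `cross_val_score` do this for you.

```python
def k_fold_splits(n_rows, k):
    """Yield (train_idx, test_idx) pairs; every row is held out exactly once."""
    indices = list(range(n_rows))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def cross_validate(xs, ys, fit, score, k=5):
    """Train on k-1 folds, score on the held-out fold, and average the scores."""
    scores = []
    for train_idx, test_idx in k_fold_splits(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(
            score(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx])
        )
    return sum(scores) / len(scores)

# Hypothetical stand-in model: always predict the mean training label.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

# Mean absolute error of that constant prediction on the held-out fold.
def mean_abs_error(model, xs, ys):
    return sum(abs(y - model) for y in ys) / len(ys)

if __name__ == "__main__":
    xs = list(range(10))
    ys = [1.0, 2.0, 1.5, 2.5, 2.0, 3.0, 2.5, 3.5, 3.0, 4.0]  # made-up values
    print("average held-out error:", cross_validate(xs, ys, fit_mean, mean_abs_error))
```

Because every score comes from rows the model did not train on, a model that merely memorizes its training data (overfitting) gets exposed by a poor cross-validated average.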

Here is the detailed log of my week’s journey

“With so many light years to go…. And things to be found (to be found)”

Below are some topics I am considering for next week’s odyssey:

Run Python Scripts in SQL Server Agent
More with Machine Learning
ONTAP Cluster Fundamentals
Google BigQuery
Python -> Stream Data from IEX -> MSSQL
Data Visualization Tools (e.g. Looker)
ETL Solutions (Stitch, Fivetran)
Processing and transforming data / exploring data through ML (e.g. Databricks)
Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well

—MCS