Week of May 22nd

“And you know that notion just crossed my mind…”

Happy Bitcoin Pizza Day!

All aboard! This week our travels would take us on the railways far and high, but before we could hop on the knowledge express we had some unfinished business to attend to.

“Oh, I get by with a little help from my friends”

If you have been following my weekly submissions for the last few weeks, you will know I listed this as a future action item: “create/configure a solution that leverages Python to stream market data and insert it into a relational database.”

Well, last week I found just the perfect solution: a true masterpiece by Data Scientist/Physicist extraordinaire AJ Pryor, Ph.D. AJ had created a brilliant multithreaded work of art that continuously queries market data from IEX and then writes it to a PostgreSQL database. In addition, he built a data visualization front-end that leverages Pandas and Bokeh so the application can run interactively through a standard web browser. It was like a dream come true! Except that the code was written about 3 years ago and referenced a deprecated API from IEX.
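For anyone wondering what that kind of pipeline looks like in miniature, here is a rough sketch of the multithreaded fetch-and-store pattern. To be clear, this is not AJ's code. The fetch_quote function is a made-up stand-in for a real IEX API call, the symbol list is hypothetical, and Python's built-in sqlite3 stands in for PostgreSQL so the example runs anywhere.

```python
import queue
import random
import sqlite3
import threading
import time

SYMBOLS = ["AAPL", "MSFT", "JPM"]  # hypothetical watch list


def fetch_quote(symbol):
    """Stand-in for a real market data API call; returns a random fake price."""
    return {"symbol": symbol, "price": round(random.uniform(10, 500), 2), "ts": time.time()}


def producer(symbol, out_queue, n_ticks):
    """Poll one symbol n_ticks times and hand each quote to the writer thread."""
    for _ in range(n_ticks):
        out_queue.put(fetch_quote(symbol))


def writer(in_queue, conn):
    """Single writer: drain the queue into the database until a None sentinel arrives."""
    while True:
        quote = in_queue.get()
        if quote is None:
            break
        conn.execute(
            "INSERT INTO quotes (symbol, price, ts) VALUES (?, ?, ?)",
            (quote["symbol"], quote["price"], quote["ts"]),
        )
    conn.commit()


def run_pipeline(n_ticks=5):
    """Start one producer thread per symbol plus one writer; return the open connection."""
    # sqlite3 is an assumption for portability; the real project wrote to PostgreSQL.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE quotes (symbol TEXT, price REAL, ts REAL)")
    q = queue.Queue()
    writer_thread = threading.Thread(target=writer, args=(q, conn))
    writer_thread.start()
    producers = [threading.Thread(target=producer, args=(s, q, n_ticks)) for s in SYMBOLS]
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    q.put(None)  # tell the writer there is nothing more to come
    writer_thread.join()
    return conn


if __name__ == "__main__":
    db = run_pipeline()
    print(db.execute("SELECT symbol, COUNT(*) FROM quotes GROUP BY symbol").fetchall())
```

Funnelling every insert through a single writer thread keeps the database access simple: the producers only ever touch the queue.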

Ok, no problem. We would simply modify AJ’s “Mona Lisa” to reference the new IEX API and off we would go. Well, what seemed like a dream turned into a virtual nightmare. I spent most of last week spinning my wheels trying to get the code to work, but to no avail. I even reached out to the community on Stack Overflow, but all I received was crickets.

Just as I was ready to cut my losses, I reached out to a longtime good friend who happens to be an all-star programmer and a fellow NY Yankees baseball enthusiast. Python wasn’t his specialty (he is really an amazing Java programmer) but he offered to take a look at the code when he had some time… So we set up a Zoom call this past Sunday and I let his wizardry take over… After about an hour or so he was in a state of flow and had a good pulse on what our maestro AJ’s work was all about. After a few modifications my good chum had the code working and humming along. I ran into a few hiccups along the way with the Bokeh code, but my confidant just referred me to some simpler syntax and then abracadabra… this masterpiece was now working on the Mac!

As the new week started, I was still basking in the radiance of this great coding victory. So, I decided to be a bit ambitious and move this gem to the cloud, which would be like the crème de la crème of our learnings thus far. Cloud, Python/Pandas, streaming market data, and Postgres all wrapped up in one! Complete and utter awesomeness!

Now the question was which cloud platform to go with. We were well versed in the compute area of all 3 of the major providers as a result of our learnings.

So with a flip of the coin, we decided to go with Microsoft Azure. That, and we still had some free credits available.

With sugar plum fairies dancing in our heads, we spun up our Ubuntu image and followed along with the well documented steps on AJ’s GitHub project.

Now we were cooking with gasoline! We cloned AJ’s GitHub repo, modified the code with our new changes, and executed it, and just as we were ready to declare victory… Stack overflow error! Oh, the pain.

Fortunately, I didn’t waste any time. I went right back to my ace in the hole, though with some trepidation that I was becoming too much of an irritant.

I explained my perplexing predicament and without hesitation my Fidus Achates offered some great troubleshooting tips, and quite expeditiously we had the root cause pinpointed. For some peculiar reason, the formatting of the URL that worked like a charm on the Mac gave Ubuntu on Azure dyspepsia. It was certainly a mystery, but one that could only be solved by rewriting the code.

So once again, my comrade in arms helped me through another quagmire. So, without further ado, may I introduce to you the one and only…

http://stockstreamer.eastus.cloudapp.azure.com:5006/stockstreamer

“We’ll hit the stops along the way… We only stop for the best”

Feeling victorious after my own personal Battle of Carthage, and with our little streaming market data saga out of our periphery, it was time to hit the rails…

Our first stop was messaging services, which are all the rage nowadays. There are so many choices of data messaging services out there. So where to start? We went with Google’s Pub/Sub, which turned out to be a marvelous choice! To get enlightened on this solution, we went to Pluralsight, where we found an excellent course, Architecting Stream Processing Solutions Using Google Cloud Pub/Sub by Vitthal Srinivasan.

Vitthal was a great conductor who, in our first lesson, navigated us through an excellent overview of Google’s impressive solution and its use cases, and even touched on a rather complex pricing structure. He then took us deep into the weeds, showing us how to create Topics, Publishers, and Subscribers. He went further by showing us how to leverage some other tremendous offerings in GCP like Cloud Functions, APIs & Services, and Storage.
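To make the vocabulary concrete, here is a tiny in-memory toy of how topics, publishers, and subscriptions relate to each other. It does not use the real Google Cloud client library and it ignores acknowledgements, retries, and ordering; the class, its method names, and the sample messages are all invented for illustration.

```python
from collections import deque


class ToyPubSub:
    """In-memory toy: every subscription on a topic receives its own copy of each message."""

    def __init__(self):
        self._topics = {}  # topic name -> {subscription name -> backlog of messages}

    def create_topic(self, topic):
        self._topics.setdefault(topic, {})

    def create_subscription(self, topic, name):
        if topic not in self._topics:
            raise KeyError(f"unknown topic: {topic}")
        self._topics[topic][name] = deque()

    def publish(self, topic, message):
        """Fan the message out to every subscription; return how many received it."""
        if topic not in self._topics:
            raise KeyError(f"unknown topic: {topic}")
        for backlog in self._topics[topic].values():
            backlog.append(message)
        return len(self._topics[topic])

    def pull(self, topic, name, max_messages=10):
        """Remove and return up to max_messages from one subscription's backlog."""
        backlog = self._topics[topic][name]
        pulled = []
        while backlog and len(pulled) < max_messages:
            pulled.append(backlog.popleft())
        return pulled


if __name__ == "__main__":
    bus = ToyPubSub()
    bus.create_topic("quotes")
    bus.create_subscription("quotes", "to-postgres")
    bus.create_subscription("quotes", "to-dashboard")
    bus.publish("quotes", "made-up message 1")
    bus.publish("quotes", "made-up message 2")
    print(bus.pull("quotes", "to-postgres"))
    print(bus.pull("quotes", "to-dashboard", max_messages=1))
```

The point of the toy is the decoupling: the publisher only knows the topic name, and each subscriber works through its own backlog at its own pace.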

Before this amazing course my only exposure to GCP was limited to Compute Engine, so it was an eye-opening experience to see the great power that GCP has to offer! To round out the course, he showed us how to use GCP Pub/Sub with some client libraries, which was an excellent tutorial on how to use Python with this awesome product. There were even two modules on how to integrate a Google Hangouts chatbot with Pub/Sub, but that required you to be a G Suite user. (There was a free trial, but I skipped the setup and just watched the videos.) Details on the work I did on Pub/Sub can be found at

“I think of all the education that I missed… But then my homework was never quite like this”

For bonus points this week, I spent an enormous amount of time brushing up on my 8th grade math and science curriculum:

  1. Linear Regression
  2. Epigenetics
  3. Protein Synthesis

Below are some topics I am considering for my Journey next week:

  • Vagrant with Docker
  • Continuing with Data Pipelines
  • Google Cloud Data Fusion (ETL/ELT)
  • More on Machine Learning
  • ONTAP Cluster Fundamentals
  • Google BigQuery
  • Data Visualization Tools (e.g. Looker)
  • ETL Solutions (Stitch, Fivetran)
  • Processing and transforming data/exploring data through ML (e.g. Databricks)
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well –

–MCS 

Week of May 15th

“Slow down, you move too fast…You got to make the morning last.”

Happy International Day of Families and, for those celebrating in the US, Happy Chocolate Chip Day!

This week’s journey was a bit of a laggard in comparison to previous weeks’ journeys but still productive, nonetheless. This week we took a break from Machine Learning while sticking to our repertoire and our reptilian programming friend. Our first stop was installing Python on Windows Server, which we hadn’t touched on so far. As we tend to make things more challenging than they need to be, we targeted an oldie but a goodie: Windows Server 2012 R2 running SQL Server 2016. Our goal was to configure a SQL Server scheduled job that runs a simple Python script, which seemed like a pretty simple task. We found a nice example of this exact scenario on MSSQLTips: Run Python Scripts in SQL Server Agent.

First, we installed Python and followed the steps and, lo and behold, it didn’t work right away. To quote the great Gomer Pyle: “Surprise, surprise, surprise.” No worries, we had this… After a little bit of troubleshooting and trying to interpret the vague error messages in the SQL Server Agent error log, we got it working… It turns out we had a multitude of issues, ranging from the FID running the SQL Agent service not having the proper entitlements to the directory where the .py script lived, to the more important prerequisite of Python not being in the user environment variables for the service account, so the job didn’t know where to launch the executable from. Once those were resolved, we were off to the races, or at least we got the job working.
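In hindsight, both of those culprits can be checked from Python itself before ever touching the Agent job. The sketch below is a hypothetical helper (the function name and messages are my own), and it only tells you something useful if it is run under the same account the SQL Agent service uses, launched with the full path to python.exe.

```python
import os
import shutil
import sys


def check_job_prereqs(script_path, interpreter="python"):
    """Return a list of human-readable problems; an empty list means the basics look OK."""
    problems = []
    # Is the interpreter discoverable through PATH for the current account?
    if shutil.which(interpreter) is None:
        problems.append(f"'{interpreter}' was not found on PATH for this account")
    # Does the script exist, and can the current account read it?
    if not os.path.isfile(script_path):
        problems.append(f"script not found: {script_path}")
    elif not os.access(script_path, os.R_OK):
        problems.append(f"script is not readable by this account: {script_path}")
    return problems


if __name__ == "__main__":
    # Usage: <full path to python.exe> check_prereqs.py <path to the job's script>
    target = sys.argv[1] if len(sys.argv) > 1 else __file__
    for line in check_job_prereqs(target) or ["no obvious problems found"]:
        print(line)
```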

At this point we were feeling pretty ambitious, so we decided that rather than using the lame MS-DOS style batch file we would use a cool PowerShell script as a wrapper for our Python code for the job… Cool, but not so cool on Windows Server 2012 R2. First, we started out with the Set-ExecutionPolicy RemoteSigned command, which needs to be run in order to execute PowerShell scripts, but because we were using an old jalopy of an OS we also had to upgrade the version of the .NET runtime as well as the version of PowerShell. Once upgraded, we executed a few additional commands and then we were good to go…

[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

Install-PackageProvider -Name NuGet -RequiredVersion 2.8.5.201 -Force

Install-Module -Name SqlServer -AllowClobber

After spending a few days here, we decided to loiter a little bit longer and crank out some SQL maintenance tasks in Python, like a simple backup job. This was pretty straightforward once we executed a few prerequisites.

python -m pip install --upgrade pip

pip install pip

pip install pyodbc

pip install pymssql-2.1.4-cp38-cp38-win_amd64.whl

pip install --upgrade pymssql
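With those prerequisites in place, a simple backup job can be sketched roughly as follows. This is a minimal illustration rather than the exact script from the week: the function names, the ODBC driver name, and the use of Windows authentication are all assumptions that would need adjusting for a real server.

```python
from datetime import datetime


def build_backup_sql(database, backup_dir, now=None):
    """Build a T-SQL full backup statement that writes to a timestamped .bak file."""
    now = now or datetime.now()
    # Keep the sketch safe: only plain names and paths without quotes are accepted.
    if not database.replace("_", "").isalnum():
        raise ValueError(f"unexpected database name: {database!r}")
    if "'" in backup_dir:
        raise ValueError("backup_dir must not contain a single quote")
    stamp = now.strftime("%Y%m%d_%H%M%S")
    folder = backup_dir.rstrip("\\")
    path = folder + "\\" + f"{database}_{stamp}.bak"
    return f"BACKUP DATABASE [{database}] TO DISK = N'{path}' WITH INIT"


def run_backup(server, database, backup_dir):
    """Run the backup over pyodbc (hypothetical connection settings)."""
    import pyodbc  # imported here so build_backup_sql works without the driver installed

    conn = pyodbc.connect(
        # Driver name and Trusted_Connection are assumptions; match them to your setup.
        f"DRIVER={{ODBC Driver 17 for SQL Server}};SERVER={server};Trusted_Connection=yes;",
        autocommit=True,  # BACKUP DATABASE cannot run inside a user transaction
    )
    try:
        cursor = conn.cursor()
        cursor.execute(build_backup_sql(database, backup_dir))
        # Drain the informational result sets so the backup runs to completion.
        while cursor.nextset():
            pass
    finally:
        conn.close()


if __name__ == "__main__":
    # Prints the statement only; call run_backup(...) against a real server to execute it.
    print(build_backup_sql("master", "C:\\Backups"))
```

Keeping the statement builder separate from the connection code makes it easy to eyeball the T-SQL before letting it loose on a server.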

Our final destination for the week was to head back over to a previous jaunt and play with streaming market data and Python. This time we decided to stop being cheap and pay for an IEX account.

Fortunately, they offer a pay-by-the-month option that you can opt out of at any time, so it shouldn’t get too expensive. To get re-acclimated we leveraged Jupyter notebooks and banged out a nifty Python/pandas/matplotlib script that plots the top 5 US banks and their 5-year performance. See attachment.
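One common way to compare stocks with very different share prices on a single chart is to plot each close as a percent change from the first close in the window. The notebook itself used pandas and matplotlib; the snippet below is a plain-Python stand-in for just that calculation, and the bank names and prices are made up purely for illustration.

```python
def cumulative_return_pct(closes):
    """Percent change of each close relative to the first close in the series."""
    if not closes:
        return []
    base = closes[0]
    return [round((close / base - 1) * 100, 2) for close in closes]


# Made-up closing prices, not real market data.
sample_closes = {
    "BANK_A": [100.0, 110.0, 125.0],
    "BANK_B": [50.0, 45.0, 60.0],
}

performance = {name: cumulative_return_pct(prices) for name, prices in sample_closes.items()}

if __name__ == "__main__":
    for name, series in performance.items():
        print(name, series)
```

Because every series starts at 0%, the lines on the chart are directly comparable no matter what each share costs.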

“I have only come here seeking knowledge… Things they would not teach me of in college”

Below are some topics I am considering for my adventures next week:

  • Vagrant with Docker
  • Data Pipelines
    • Google Cloud Pub/Sub (Streaming Data Pipelines)
    • Google Cloud Data Fusion (ETL/ELT)
  • Back to Machine Learning
  • ONTAP Cluster Fundamentals
  • Google BigQuery
  • Python -> Stream Data from IEX -> Postgres
  • Data Visualization Tools (e.g. Looker)
  • ETL Solutions (Stitch, Fivetran)
  • Processing and transforming data/exploring data through ML (e.g. Databricks)
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay Safe and Be well –

–MCS 

Week of April 10th

“…When the Promise of a brave new world unfurled beneath a clear blue Sky”

“Forests, lakes, and rivers, clouds and winds, stars and flowers, stupendous glaciers and crystal snowflakes – every form of animate or inanimate existence, leaves its impress upon the soul of man.” — Orison Swett Marden

My journey for this week turned out to be a sort of potpourri of various technologies and solutions thanks to the wonderful folks at MSFT. After some heavy soul searching over the previous weekend, I decided that my time would be best spent this week on recreating the SQL Server 2016 with Always On environment (previously created several weeks back on AWS EC2), but on the MS Azure cloud platform. The goal would be to better understand Azure and how it works. In addition, I would be able to compare and contrast AWS EC2 and Azure VMs and list the pros and cons of both cloud providers.

But before I could get my head into the clouds, I was still lingering around in the bamboo forests. This past weekend, I was presented with an interesting scenario: streaming market data from the Investors Exchange (IEX) into pandas (thanks to my friend). After consulting with Mr. Google, I was pleasantly surprised to find that IEX offers an API that allows you to connect to their service and stream messages directly to Python, and use pandas for data visualization and analysis. Of course, being the cheapskate that I am, I signed up for a free account and off I went.

So I started tickling the keys and produced a newly minted IEX Python script. After some brief testing, I started receiving an obscure error. Of course, there was no documented solution on how to address such an error.

So after some fruitless nonstop pip installing of several modules, I was still getting the same error. 🙁 After a moment of clarity, I deduced there was probably a limit on the number of messages you can stream from a free IEX account.

So I took a shot in the dark and decided to register for another account (under a different email address). This way I would receive a new token and could give that a try.

… And oh là là! My script started working again! 🙂 Of course, as I continued to add more functionality and test my script, I ran back into the same error, but this time I knew exactly how to resolve it.

So I registered for a third account (to yet again generate a new token). Fortunately, I completed my weekend project. See attachments Plot2.png and Plot3.png for pretty graphs.

Now that I could see the forest for the trees, it was off to the cloud! I anticipated that it would take me a full week to explore Azure VMs, but it actually only took a few days to wrap my head around it.

This left me a chance to pivot again, this time to a Data Warehouse/Data Lake solution built for the cloud, turning the forecast for the rest of the week to Snow.

Here is a summary of what I did this week:

Sunday:

  • Developed a pandas/Python script using the iexfinance and matplotlib modules to graph the historical price of MSFT for 2020 and a comparison of MSFT vs. INTC for Jan 2nd – April 3rd, 2020

Monday: (Brief summary)

  • Followed previous steps to build the plumbing on Azure for my small SQL Server farm (see previous status on AWS EC2 for more details)
  1. Created Resource Group
  2. Created Application Security Group
  3. Created 6 small Windows VMs in the same Region and Availability Zone
  4. Joined them to Windows domain

Tuesday: (Brief summary)

  1. Created Windows Failover Cluster
  2. Installed SQL Server 2016
  3. Set up and configured AlwaysOn AGs and Listeners

Observations with Azure VMs:

Cons

  • Azure VMs are very slow the first time they are brought up after a build
  • Azure VMs have a longer provisioning time than EC2 instances
  • No UI option to perform bulk tasks (like the AWS UI). The only option is templating through scripting
  • Cannot move a Resource Group from one geographical location to another, unlike VMs and other objects within Azure
  • When deleting a VM, its child dependencies (Security Groups, NICs, Disks) are not dropped. Perhaps this is by design?
    • Objects need to be dissociated from groups and then deleted to clean up orphaned objects

Neutral

  • Easy to migrate VMs to higher T-Shirt Sizes
  • Easy to provision Storage Volumes per VM
  • Application Security Groups can be used to manage TCP/UDP traffic for entire resource group

Pros

  • You can migrate existing storage volumes to premium or cheaper storage seamlessly
  • Less network administration 
    • Fewer TCP/UDP ports need to be opened, especially ports native to Windows domains
  • Very easy to build Windows Failover Clustering services
    • Natively works in the same subnet
    • Less configuration to get connectivity working than on AWS EC2
  • Very easy to configure SQL Server 2016 Always On
    • No need to create 5 Listeners (a different one per subnet) for a given AG
    • 1 Listener per AG
  • Free cost, performance, and operational excellence recommendations pop up after login

Wednesday:

  • Registered for an eval account for a Snowflake instance
  • Attended Zero to Snowflake in 90 Minutes virtual Lab
    • Created Databases, Data Warehouses, User accounts, and Roles
    • Created Stages to be used for Data Import
    • Imported Data Sources (Data in S3 Buckets, CSV, JSON formats) via Web UI and SnowSQL cmd line tool
    • Ran various ANSI-92 SQL queries to generate reports from Snowflake

Thursday:

Friday:

**Bonus Points**

  • More Algebra – Regents questions. 
  • More with conjugating verbs in Español (AR Verbs)

Next Steps:
Below are some topics I am considering for my voyage next week:

  • SQL Server Advanced Features:
    • Columnstore Indexes
    • Best practices around SQL Server AlwaysOn (Snapshot Isolation/sizing of Tempdb, etc.)

  • Data Visualization Tools (e.g. Looker)
  • ETL Solutions (Stitch, Fivetran)
  • Processing and transforming data/exploring data through ML (e.g. Databricks)
  • Getting Started with Kubernetes with an old buddy (Nigel)

Stay safe and Be well

—MCS