Friday, December 30, 2016

Making Technology Choices for Personal Growth.

When I started doing IT as a profession, the number of available, commercially used technologies was considerably smaller. Back then, most of IT ran on IBM mainframes, and the number of tools you needed to learn to be marketable could be counted on one or two hands. Today, there are far too many technologies, products, and frameworks and not nearly enough time to learn them all. Should I invest in AngularJS or React? Java 8 or Scala? Amazon or Azure? Python or Node.js? The list goes on and on.

If you're like me, your time to learn new products or frameworks is spare time, and it's limited. There's just not enough spare time to learn everything, or even close. People often ask me whether they should learn this product or that framework; very few ask how they should be making their time investment choices. Let's face it: with so much educational material available on the internet for free or for a very low cash outlay, it's really a time investment we're talking about.

I look at learning technologies the same way I look at investments. Most software vendors, be they open source or commercial, publish documentation for free and let you download and install most products you might consider at no cost. Thanks to Udemy and Amazon, who have pushed prices for online courses and eBooks to unbelievably low levels, cheap resources are available for the more popular choices. There's very little cash outlay for materials. It's an investment all the same, but the investment is mostly time rather than cash.

I believe time is money. Time invested is time I can't spend consulting, writing, or doing anything else that makes money. Consequently, I try to make good decisions about which technologies I invest time in. That raises the important question: how do you choose? There's also another dimension to the choice: how much time do you invest in each of those choices?

The amount of time you spend learning a new product, technology, or framework has diminishing returns. That is, the more time you invest, the less incremental payback you'll get for the investment. As with stocks, you don't necessarily buy a large position to start out. Often you invest a little and, as the market develops, buy more. Time investments in new technologies work the same way.

Like financial investments, investing your time has risk. Any time spent learning a new technology, product, or framework that never takes off is time you'll never get back. Your objective is to consciously manage that risk. 

Research Tactics

Here are some of the tactics I use to help me decide which technologies to invest in.

Ask a mentor or person whose opinion you value about the technology choices you're considering. It's quick and easy. That person may have considered issues or concerns that haven't occurred to you. Often, verbalizing your thoughts to another person helps you crystallize them and think them through more thoroughly.

Look at what the market values. 
The best way to do this is to use the Job Trends site from indeed.com. Indeed is a job posting site where job seekers post resumes and firms or recruiters post jobs. Indeed keeps history and lets you graph job postings and the skills seekers list on their resumes over time. As an example, I'm using this comparison of top cloud vendors, perhaps useful for people considering cloud technologies.


You want technologies with an upward job posting trend: technologies that firms value and recruit for. Given the time you'll need to invest, you want a realistic chance of a payback on that investment. The market can change at any time and the landscape might look different in six months, but hard numbers are more attractive than gut feelings. I've posted the job posting comparison of cloud vendors (taken on Dec. 30, 2016), using Amazon Web Services, Azure, and Google Cloud.

You can see that Amazon AWS job postings are on a sharp uptrend, followed by a somewhat shallower uptrend for Azure postings. It's safe to infer that AWS skills have more of a market right now than either Azure or Google Cloud. If I've left off vendors you care about, surf to the site and change the comparison to your liking.

You want technologies with a postings-to-seeker-interest ratio of at least 1.00. Anything less means there might be oversupply in the market, and that can drive salaries and consulting rates down. I say "might" because not all job seekers are honest; some look for jobs requiring skills they don't have. Given the demand for AWS skills I hear about from recruiters, I believe the 1.02 seekers-per-posting ratio is overstated and that there are actually more postings per seeker. Note: pay attention to the description: "seekers per posting" vs. "postings per seeker".

Test your search criteria. Toward the bottom of the comparison page, you're offered links that list the job postings behind that particular search. Look at a sample to make sure your search term isn't picking up something you're not interested in. For example, if you enter a term that's too general, you could be including irrelevant postings and seekers in your comparison.

A few words of warning: Indeed data has limits. Here are a few:
  • This data doesn't account for current labor salaries/rates. For example, "HTML" looks hot until you figure out that market rates for that skill are really low.
  • The data presented lags by about three months, so the current graph might look slightly different.
  • This data works best when you compare competing products or frameworks. Comparing AWS to Java isn't useful; the market drivers for those two are completely different, and you'll get more double counting of postings (a posting might contain both terms).

Look at what people are interested in

Before technologies are listed in job postings or on resumes, they are searched for on the web. Google maintains search term history that it graphs and allows you to download. As with the Indeed Job Trends site, you can compare the search data for multiple search terms. As an example, I've graphed the same search criteria we used above on the Indeed site.

Internet search trends appear to be consistent with what we see on the Indeed Job Trends site, although the difference between AWS and Azure appears less stark than it does on Indeed. Here are some things to keep in mind:
  • It's important to test your search results, just as with Indeed.
  • Trend data for technologies with common names will be meaningless.
  • Poorly paying technologies might show rising interest too, just as on the Indeed site.

Deciding How Much Time to Invest

Deciding how much time to invest is more difficult. Everyone has different experience levels, talents, and abilities. Some technologies take more investment than others, and the amount of time I need might be different from the amount you need. Here are some tactics I use.

You must invest in something all the time. Pick a commitment level: two hours a week, four hours a week, or even more. This should become a habit you don't even think about. If you don't, you'll become stagnant: your skills will get dated over time and your marketability will gradually decrease. Furthermore, you won't develop tactics for learning new things quickly. When you wake up and realize that you're badly out of date, it'll take a long time to catch up. Unlike financial investing, where you can adopt a cash position, in the technology world it's too dangerous to sit on the sidelines.

Time-box your investment. That is, set a rough amount of time you'll spend on a technology choice up front and then re-assess what you're willing to spend after that learning period. I typically use two hours or four hours as an initial time frame limit. That said, over the past couple of decades, I've honed tactics that allow me to spend less time than many others. Two hours might not be enough for you or it might be too much. You'll need to decide the amount. It's important that you roughly track the time you spend. Also, feel free to quit early if you learn enough to make the decision that you're not going to invest additional time.

Time-boxing mitigates your investment risk. What you want to avoid is going down a rabbit hole and spending boatloads of time unproductively. It also keeps the time you're spending at the forefront of your mind. You'll never get this time back.

Distinguish between "exploratory" learning and "objective" learning. Objective learning has a defined purpose, such as upcoming work at your current employer or gearing up for a new job search or interview. Exploratory learning builds general knowledge you can draw on with colleagues, and it doesn't require as much depth.

There's value in learning the basics and the general advantages and disadvantages of a product. With that general knowledge, you know whether or not you want to pursue work using the technology, you know enough to participate intelligently in conversations with other developers or recruiters, and you know enough to come back later and dig more deeply if the need arises (say, you decide to apply for a job that needs it). Some developers will resist this idea, but you don't need advanced, in-depth knowledge of every product you become acquainted with.

Don't learn technologies in depth until there's a high probability you'll use them at work or need them to support a job search. Technology moves too quickly for that; newly learned in-depth knowledge becomes dated fast. In investment terms, don't invest until there's a reasonable probability of a payback. Another way to think about this is that classic YAGNI ("You Aren't Gonna Need It") applies.

Thanks for taking time to read this post. I would like to hear your thoughts on this topic.

Sunday, December 11, 2016

Automated Integration Testing in a Microservices World.

Everyone's dabbling with microservices these days. It turns out that writing distributed applications is difficult. They offer advantages, to be sure, but there is no free lunch. One of the difficulties is automated integration testing: testing a microservice together with all the external resources it needs, including any other services it calls, the databases it uses, and any queues it uses. Just setting up and maintaining these resources can be a daunting task that often takes specialized labor. All too often, integration testing is difficult enough that the task is sloughed off. Fortunately, Docker and its companion product Docker Compose can make integration testing much easier.

Standardizing on Docker images as deployment artifacts makes integration testing easier. Most organizations writing microservices seem to adopt Docker as a deployment artifact anyway, as it greatly speeds up interaction between application developers and operations. It facilitates integration testing as well: you can temporarily deploy the services you consume (or mocks of them) and run integration tests against them. Additionally, consumers of your service can temporarily deploy your Docker image to perform their own integration tests. However, as most services also need databases, message queues, and possibly other resources to function properly, that isn't the end of the story. I've previously written about how to dockerize (is that a word?) your own services and applications here.

Docker images already exist for most database and messaging software. It's possible to leverage these images for your own integration testing; in other words, the community has done part of your setup work for you. For example, if my service needs a PostgreSQL database to function, I leverage the official Docker image for my integration tests. As it turns out, the Postgres Docker deployment makes the image very easy to consume for integration testing: all I need to do is mount the directory '/docker-entrypoint-initdb.d' and make sure that directory contains any SQL files and/or shell scripts I need run to set the database up for use by my application. The MySQL image does something similar. For messaging, similar Docker distributions exist for RabbitMQ, ActiveMQ, and Kafka. Note that the ActiveMQ and Kafka images aren't yet "official" Docker deployments.
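For illustration, here's a minimal sketch of what that might look like in a docker-compose file. The image tag, local directory name, and password below are assumptions made for the example, not taken from any particular project:

    version: '2'
    services:
      postgres:
        image: postgres:9.6
        ports:
          - "5432:5432"                 # expose the database to the test runner
        environment:
          POSTGRES_PASSWORD: itest      # illustrative credentials only
        volumes:
          # any *.sql or *.sh files in this directory run automatically on first startup
          - ./integration-test-init:/docker-entrypoint-initdb.d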

Docker Compose makes it very easy to assemble multiple images into a consistent and easily deployable environment. Docker Compose configurations are YAML files; detailed documentation can be found here. A complete overview of Docker Compose is out of scope for this blog entry, but I'll point you to an open source example and discuss a couple of snippets from it as an illustration.

Below is a condensed snippet of a docker-compose configuration; the full source is here. Note that each section under services describes a Docker image that's to be deployed and possibly built. In this snippet, the images vote, redis, worker, and db are to be deployed. Note that vote and worker will be built (i.e., turned into a Docker image) before they are deployed. For images that are already built, it's only necessary to list the image name.
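(The sketch below is abbreviated and paraphrased from that configuration; image tags and paths may differ slightly from the linked source.)

    version: "2"
    services:
      vote:
        build: ./vote                 # built into an image before deployment
        command: python app.py
        ports:
          - "5000:80"                 # host port 5000 -> container port 80
      redis:
        image: redis:alpine           # already built; only the image name is listed
      worker:
        build: ./worker               # also built before deployment
      db:
        image: postgres:9.4           # already built; only the image name is listed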

Other common compose directives are as follows:
  • volumes -- links a directory on the host to a directory inside the container
  • ports -- links a port on the host to a port inside the container. For example, vote links port 5000 on the outside to port 80 on the inside.
  • command -- specifies the command that will be run inside the Docker container at startup.
  • environment -- (not illustrated here) allows you to set environment variables within the Docker container


Assemble and maintain a Docker Compose configuration for your services. This is for your own use in integration tests, and so that your consumers can easily see what resources you require should they want to run integration tests of their own. They can even use that compose configuration directly, including it when they set up their own integration tests.

The Docker environment for your integration tests should be started and shut down as part of the execution of the test. This has many advantages over maintaining the environment separately in an "always on" state. When integration tests aren't running, they don't consume resources. The integration tests, along with their environment, can easily be run locally by developers who need to debug issues; debugging against separately maintained environments is always more problematic. Furthermore, the integration tests can easily and painlessly be hosted anywhere (e.g., on premises or in the cloud) and are host agnostic.

An Integration Test Example

I would be remiss if I didn't pull these concepts together into an integration test example for you. For my example, I'm leveraging the integration test for a generic health check written to make sure that a RabbitMQ environment is up and functioning. The source for the check is here, but we're more interested in its integration test today.

This test utilizes the DockerProcessAPI toolset, as I don't currently work in environments that require docker-machine and the Docker Remote API (I'm on Linux or Windows 10 Pro/Enterprise). If your environment requires docker-machine (e.g., a Mac or an earlier version of Windows), then I recommend the Spotify docker-client instead.

The integration test for the health check uses Docker to establish a RabbitMQ environment before the test and shut it down after the test. This part is written as a JUnit test using the @BeforeClass and @AfterClass annotations so that the environment is brought up once for the entire test class rather than for each test individually.

In this example, I first pull the latest RabbitMQ image (the official distribution). I then map a port for RabbitMQ to use and start the container. I wait five seconds for the environment to initialize, then log the Docker containers currently running.

My log of which Docker containers are running isn't technically required, but it sometimes helps if there are port conflicts where the test is running or other problems with a failed test that need to be investigated. As this test runs on a schedule, I don't always know the execution context.

After the tests complete, the @AfterClass method shuts down the RabbitMQ instance I started and logs the running containers once more, just in case something needs to be investigated.
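To make the pattern concrete, here is a stripped-down sketch of that kind of test scaffolding. Rather than guess at DockerProcessAPI's actual method names, it shells out to the docker command line via ProcessBuilder; the class name, container name, and port mapping are illustrative:

    import org.junit.AfterClass;
    import org.junit.BeforeClass;

    public class RabbitMqHealthCheckIntegrationTest {

        private static final String CONTAINER_NAME = "rabbitmq-itest";

        @BeforeClass
        public static void startRabbitMq() throws Exception {
            // Pull the latest official RabbitMQ image
            docker("pull", "rabbitmq:latest");
            // Start the container and map the AMQP port to the host
            docker("run", "-d", "--name", CONTAINER_NAME, "-p", "5672:5672", "rabbitmq:latest");
            // Give RabbitMQ a few seconds to initialize
            Thread.sleep(5000);
            // Log the running containers in case a failure needs investigation later
            docker("ps");
        }

        @AfterClass
        public static void stopRabbitMq() throws Exception {
            // Shut down and remove the RabbitMQ container, then list containers again
            docker("rm", "-f", CONTAINER_NAME);
            docker("ps");
        }

        // Helper that runs a docker CLI command and fails fast on a non-zero exit code
        private static void docker(String... args) throws Exception {
            String[] command = new String[args.length + 1];
            command[0] = "docker";
            System.arraycopy(args, 0, command, 1, args.length);
            Process process = new ProcessBuilder(command).inheritIO().start();
            if (process.waitFor() != 0) {
                throw new IllegalStateException("docker command failed: " + String.join(" ", args));
            }
        }

        // ... test methods that exercise the health check against localhost:5672 ...
    }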

That's a very short example. Had the integration test environment been more complicated and required Docker Compose, that would have been relatively simple with the DockerProcessAPI as well. Here's an example of bringing up a Docker Compose environment given a compose configuration YAML:
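(As before, this sketch invokes the docker-compose command line directly rather than quoting DockerProcessAPI's own convenience methods; the compose file name is illustrative.)

    // Typically called from an @BeforeClass method: bring up everything
    // defined in the compose configuration, in detached mode.
    public static void composeUp() throws Exception {
        new ProcessBuilder("docker-compose", "-f", "docker-compose.yml", "up", "-d")
                .inheritIO()
                .start()
                .waitFor();
    }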

Here's an example after the test of bringing that same environment back down:
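(Again, a sketch against the docker-compose command line rather than DockerProcessAPI's own methods.)

    // Typically called from an @AfterClass method: stop and remove the
    // containers (and default network) created by composeUp().
    public static void composeDown() throws Exception {
        new ProcessBuilder("docker-compose", "-f", "docker-compose.yml", "down")
                .inheritIO()
                .start()
                .waitFor();
    }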

In addition, the DockerProcessAPI has convenience methods that can log running compose environments for investigative purposes later.

Thanks for taking time to read this entry. Feel free to comment or contact me if you have questions.  

Resources