How to Choose an Automation Solution (Part 1)


  • Context
  • Assumptions
  • Attributes of a Solution
  • Terms/Jargon
  • A High Level Decision Model
  • Java Decision Model
  • Ruby Decision Model
  • JavaScript Decision Model


Context

You might have skipped ahead to the Decision Models sections; in case you have not, here is some context. When we plan to introduce Automation into an organization, project, or team, we are often left with so many choices and options that they confuse us rather than help us move forward. In this post, I attempt to share information that can help alleviate some of that confusion. Having worked for a number of companies, which expanded both the breadth and depth of my knowledge, I want to share what I learned, since many of us go through this. I will do my best to keep the information simple.


Assumptions

  • That you believe Automation accelerates software development.
  • That investing in Automation brings a long-term benefit.
  • That Automation aligns with Agile principles.
  • That Automation will help you stay competitive in the market.
  • That Automation is not merely an enhancement, but essential.

Let’s first talk about some of the benefits/attributes that most companies look for in a solution.

Benefits/Attributes of a Solution

Maintainability: How do we maintain, manage, and update the scripts and keep them relevant, so that they add value and are not a burden? In other words, as code, artifacts, and configuration items keep changing, how do we keep operations smooth, efficient, and inexpensive?

Scalability: How do we scale the solution to add capabilities as we need them and accelerate the feedback loop, so that the software is verified and validated in time to release to market? For example, can we write a test script once and execute it on all browsers? Can we scale the solution to on-board new projects and teams, and what is the model for doing so?
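The "write once, execute on all browsers" idea can be sketched roughly as below. This is only an illustration, not a real Selenium setup: `FakeBrowser`, `login_test`, `run_everywhere`, and the URL are hypothetical names, and the stand-in browser only records actions so the example runs without any browser installed.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Stand-in for a real browser driver (assumption: a real suite would use Selenium here)."""
    name: str
    actions: list = field(default_factory=list)

    def open(self, url: str) -> None:
        # Record the navigation instead of driving a real browser.
        self.actions.append(f"open {url}")

    def title(self) -> str:
        # A real driver would return the page title; the fake returns a fixed one.
        return "Login"


def login_test(browser: FakeBrowser) -> bool:
    """The test script: written once, with no browser-specific code."""
    browser.open("https://example.test/login")  # hypothetical URL
    return browser.title() == "Login"


def run_everywhere(test, browser_names):
    """Run the same test against each configured browser and collect the results."""
    return {name: test(FakeBrowser(name)) for name in browser_names}


results = run_everywhere(login_test, ["chrome", "firefox", "edge"])
print(results)
```

The point of the design is that adding a browser means changing the list passed to `run_everywhere`, not the test itself.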

Configurability: What is the right balance between configuring a solution heavily and standardizing on a convention (that is, a rule)? Where is the sweet spot that fits my company’s environment, as opposed to generic advice that is too abstract?

Auditability: How do we keep track of the work done and roll up metrics to feed dashboards or any other reports that management requires? Think about how we would archive reports, or what we would need if we had to meet audit compliance requirements.
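As a rough illustration of "rolling up metrics", the sketch below summarizes raw test results per suite and serializes the summary so it could be archived or fed to a dashboard. The record layout, the suite names, and the `roll_up` function are assumptions made for this example only.

```python
import json
from collections import Counter

# Hypothetical raw results, one record per executed scenario.
runs = [
    {"suite": "checkout", "status": "passed"},
    {"suite": "checkout", "status": "failed"},
    {"suite": "login", "status": "passed"},
    {"suite": "login", "status": "passed"},
]


def roll_up(records):
    """Summarize pass/fail counts and the pass rate for each suite."""
    counts = {}
    for record in records:
        counts.setdefault(record["suite"], Counter())[record["status"]] += 1
    summary = {}
    for suite, c in counts.items():
        total = sum(c.values())
        summary[suite] = {
            "total": total,
            "passed": c["passed"],
            "failed": c["failed"],  # Counter returns 0 for a status never seen
            "pass_rate": c["passed"] / total,
        }
    return summary


summary = roll_up(runs)
# A serialized summary is something that can be archived for audits.
archived = json.dumps(summary, sort_keys=True)
print(archived)
```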

Re-usability: Which specific artifacts of my solution can be re-used? One example is integration testing, when components/functions have to talk to each other. Another is sharing what I built as best practices, so that other projects can leverage my epiphanies and the solution overall.

If the above sounds like something you can relate to, then let’s continue. Below is some jargon that we keep hearing from the industry, fellow team members, consultants, and management when we talk about Automation. It is important to know at least the bare definitions of these terms: we will be in meetings and conversations with many people, and we don’t want to appear unprepared.

Common Terms / Jargon

The list below is a living list, since technology keeps changing and new terms/jargon keep coming. It is currently a little biased towards the Ruby stack; however, we will make our best effort to relate terms across other stacks too. Type any of these terms into a search engine and you will find plenty of information to keep you engaged.

CI/CD/CT: Continuous Integration, Continuous Deployment/Delivery, and Continuous Test Automation. Each of these terms covers a lot of ground, but in short they are practices through which we can accelerate the feedback loop. Please click on the links and read at least a few lines. Continuous Test Automation is becoming mandatory for many organizations these days; it is about how quickly we can discover a defect at any stage of the development process and so ensure high-quality delivery.

BDD: Behavior-driven development is a software development practice in which we describe the behavior of a system in a language easily understood by all stakeholders, and that description eventually drives the way code is written. Cucumber is one of the most popular BDD frameworks out there.

Cucumber/Gherkin: Cucumber is a BDD tool, and it supports BDD/TDD/ATDD workflows. Gherkin is the name given to the language we use in Cucumber (Given, When, Then, and so on).

Features/Scenarios: In Cucumber, a Feature is a file with the .feature extension in which we describe a product feature. Scenarios belong to a Feature (they are much like test cases) and go into the details of what the feature is supposed to do.

Step Definitions: Step Definitions are the automation code behind the Scenarios; they wire up with Gherkin and make the specifications executable. Step definitions can be written in multiple languages, such as Java, Ruby, C#, and JavaScript. The choice of language for step definitions is a decision point for us.
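To make the relationship between Features, Scenarios, and Step Definitions concrete, here is a toy runner in plain Python. It is a sketch of the idea only and is not Cucumber's API: the `step` decorator, `run_feature`, and the shopping-cart feature are all made up for illustration, and the runner shares one context across the whole feature for simplicity.

```python
import re

# A hypothetical feature, written in Gherkin.
FEATURE = """\
Feature: Shopping cart
  Scenario: Add an item
    Given an empty cart
    When I add 2 items
    Then the cart holds 2 items
"""

STEPS = []  # (compiled pattern, function) pairs


def step(pattern):
    """Register a function as the step definition for lines matching pattern."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator


@step(r"an empty cart")
def empty_cart(ctx):
    ctx["cart"] = 0


@step(r"I add (\d+) items")
def add_items(ctx, count):
    ctx["cart"] += int(count)


@step(r"the cart holds (\d+) items")
def check_cart(ctx, expected):
    assert ctx["cart"] == int(expected)


def run_feature(text):
    """Match each Given/When/Then line to a step definition and execute it."""
    ctx, executed = {}, []
    for line in text.splitlines():
        match = re.match(r"\s*(Given|When|Then|And|But)\s+(.*)", line)
        if not match:
            continue  # Feature:/Scenario: headers carry no executable step
        step_text = match.group(2)
        for pattern, func in STEPS:
            found = pattern.fullmatch(step_text)
            if found:
                func(ctx, *found.groups())  # captured groups become arguments
                executed.append(func.__name__)
                break
        else:
            raise LookupError(f"No step definition for: {step_text}")
    return ctx, executed


ctx, executed = run_feature(FEATURE)
print(executed)
```

The feature text stays readable for every stakeholder, while the decorated functions carry the automation code; a real BDD tool does the same matching with far more care.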

Page Object: Page Object is a design pattern, and it is explained very nicely by Martin Fowler here. A number of libraries/frameworks implement it, and the Selenium Java distribution (.jar) has built-in functionality for the pattern. On the Ruby side, a number of gems implement it, among them Capybara-page-object, page-object, and page-object-pal. The pattern can be implemented in multiple ways, and an entire section of this website is dedicated to it.
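A minimal sketch of the Page Object idea follows. It does not use Selenium or any of the gems named above: `FakeDriver`, `LoginPage`, and the locators are hypothetical, and the fake driver only records interactions so the example runs anywhere.

```python
class FakeDriver:
    """Stand-in for a browser driver; records interactions instead of driving a browser."""

    def __init__(self):
        self.log = []

    def type_into(self, locator, text):
        self.log.append(("type", locator, text))

    def click(self, locator):
        self.log.append(("click", locator))


class LoginPage:
    """Page object: the only place that knows the login page's locators."""

    USERNAME = "#username"            # hypothetical CSS locators
    PASSWORD = "#password"
    SUBMIT = "button[type=submit]"

    def __init__(self, driver):
        self.driver = driver

    def login_as(self, user, password):
        """Expose the page's behavior, hiding how the fields are located."""
        self.driver.type_into(self.USERNAME, user)
        self.driver.type_into(self.PASSWORD, password)
        self.driver.click(self.SUBMIT)


driver = FakeDriver()
LoginPage(driver).login_as("alice", "s3cret")
print(driver.log)
```

Tests call `login_as` and never touch a locator, so when the page's markup changes only `LoginPage` needs updating.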

Jar/Gem: Packaged code goes by different names in different languages. In Java, a package is called a jar (Java archive) and has the file extension .jar. In Ruby, packages are called gems and end with .gem. This is another important decision point for us: do we use a gem (a black box, in a way) to get our work done, or do we invest time in writing our own code? If we write it ourselves, we can customize it to our environment, and debugging gets easier when things have to be fixed. On the other hand, why re-invent the wheel? There is a sweet spot between the two, and that is what I am trying to convey on this website.

This is also an entry point for talking about the programming worlds, because we have to leverage a number of packaged libraries to get to our solution. One package is NOT sufficient. For example, Selenium does browser automation; however, handling test data, connecting to a database, reporting, and thick-client automation are all capabilities that other libraries provide, and they have to be integrated with the Selenium framework we build, hence the name of this website 🙂

IDE: Integrated Development Environment. IDEs came into existence to make coding easier. Think of the times when we had to write code in Notepad or a text editor, compile it, link it, and then execute it on a runtime. We are past those times. On the Java side, popular IDEs include Eclipse, IntelliJ, NetBeans, and so on. On the Ruby side, there are RubyMine, Sublime Text (strictly speaking a text editor), and so on.

Selenium Grid/Cluster: As we mature in using the Selenium API for browser testing (the starting point would be running scripts on a local box), we will want to scale execution for cross-browser test automation. Selenium Grid is the name of the Selenium capability that provides this. It is an HTTP server that can be started on a network of machines in a hub/node model, with one controller (the hub) and a bunch of nodes. A node typically represents a certain browser (and version) on a machine, and the hub routes script execution to the node we ask for. There is a much larger discussion to be had if we go into the details.
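The hub/node model can be pictured with the toy sketch below. It is not Selenium Grid's implementation or API: `Node`, `Hub`, `request_session`, and the host addresses are invented for illustration, and this hub simply returns `None` when no matching node is free.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One machine offering one browser (and version) to the hub."""
    host: str
    browser: str
    version: str
    busy: bool = False


class Hub:
    """Toy controller that routes a session request to a matching free node."""

    def __init__(self):
        self.nodes = []

    def register(self, node):
        self.nodes.append(node)

    def request_session(self, browser, version=None):
        """Return the first free node matching the requested browser/version, or None."""
        for node in self.nodes:
            version_ok = version is None or node.version == version
            if node.browser == browser and version_ok and not node.busy:
                node.busy = True  # the node is now occupied by this session
                return node
        return None


hub = Hub()
hub.register(Node("10.0.0.11", "chrome", "120"))   # hypothetical machines
hub.register(Node("10.0.0.12", "firefox", "121"))

first = hub.request_session("firefox")
second = hub.request_session("firefox")  # the only firefox node is already busy
print(first.host, second)
```

The test script only states which browser it wants; which machine actually runs it is the hub's decision.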

Source Control Repo: Code resides in a source control system. Source control systems came into existence to manage code (version control, branching, merging). A number of tools do source control for us. Some of the popular ones are Git (with hosting services such as GitHub), Subversion (svn), TFS (Team Foundation Server), etc.

CI Servers: A CI server is a tool that helps us get to our ultimate goal of Continuous Delivery. There are many out there; some of them are Jenkins, Hudson, Bamboo, TeamCity, and Travis (headless CI server).

Note: I tried to keep the jargon as independent of programming languages and brands as possible; however, some bias may have crept in, so please bear with it. I will focus on language-specific jargon in other posts.