How to Choose an Automation Solution, Part 2


In Part 1, we talked mostly about the non-technical attributes we need to think about before embarking on the road to automation solutions. Continuing that discussion, we will get a little more technical here.

High Level Decision Model:

Solutions and frameworks can be viewed from many angles, but I believe the following are the parameters we need to consider. Their values depend on your environment, your situation and how clearly you know your requirements. Broadly, the decision points are the software development model, programming language, libraries/frameworks and design patterns. Let’s talk about each in detail.

Stages in Automation Testing:

The testing pyramid is a very practical infographic that depicts the different layers of a web app where automation applies and how they are interrelated. It conveys a lot about the “dependencies” and “impacts” that need to be planned for, especially in a large integrated environment where multiple teams are responsible and accountable for different parts (modules) of the application. That means a lot of coordination and collaboration is required, not just from a technical view but from the people and process views too. A valuable inference from the pyramid: if the tests at the bottom layers are extremely flaky and we have very low confidence in the quality at those layers, it is NOT a good idea to expect flashy GUI automation tests (read that as Selenium tests) to pass and to show continuously improving metrics.


Software Development Model:

Skipping the waterfall discussion, Agile is the de facto model in almost every organization, or is getting there. Agile, as we know, is more of a guideline and NOT a mandate. However, “being agile” is the reality, so adopting it is a no-brainer. Since Agile is customized to each environment, several development models built on it are popular these days, viz. Behavior Driven Development (BDD), Test Driven Development (TDD) and Acceptance Test Driven Development (ATDD). You can read the details here.

Programming Language:

I would like to emphasize this one a lot, because this decision probably changes all downstream decisions and the direction of your entire solution. Language wars have been going on for ages, so there is a ton of information out there. Please spend enough time understanding the pros and cons and make an informed decision. The choice of programming language (and runtime environment) opens up a world of libraries, packaged code, community support and so on, which is where we get most of our functionality and value from. For example, if you choose Java, then everything an enterprise gets from it, viz. browser automation, thick client automation (JACOB, JNI), JDBC, security and pretty much anything else in the Java world, can be plugged into the automation framework.

Selenium is a packaged library that is available for Java (jar), Ruby (gem), JavaScript (js), C# (dll) and so on. So please do not think of Selenium as a tool like QTP, Rational or Silk, where you get an IDE and can add plugins. Yes, we can get all that, but that is NOT Selenium’s selling point. That comes from, for example, Eclipse and the Java world.

On the Ruby side, RubyMine and the gem world, for example, open up possibilities for parsing data, database connections, API calls and so on. The same applies to C# and JavaScript.

In summary, Selenium on its own is NOT enough to build an automation solution that covers both browser and non-browser work. It has to be complemented with libraries, tools and connections to achieve a whole lot more through programming, and the programming language is the choice we need to make.

In fact, what Selenium provides us is the ability to automate browsers, and that’s it. I am not trying to say Selenium is insufficient, but rather that “it depends” on what we want. If we want browser automation and, at the same time, want to validate data against the API layer and the database layer, then thinking Selenium alone is the answer is probably immature. We need to think of the whole ecosystem of a programming language and the world of libraries it brings along with it.
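
To make the point about layers concrete, here is a minimal sketch in plain Java of a check that compares the same value across the browser, API and database layers. Everything in it is an assumption for illustration: the UiLayer, ApiLayer and DbLayer interfaces, the price example and the SKU are hypothetical, and in a real suite each interface would wrap a real library (a browser automation library, an HTTP client, JDBC).

```java
import java.util.Map;

public class CrossLayerCheckSketch {

    // Hypothetical seams, one per layer. None of these come from Selenium;
    // they only mark where a real library would be plugged in.
    interface UiLayer  { String displayedPrice(String sku); }
    interface ApiLayer { String priceFromApi(String sku); }
    interface DbLayer  { String priceFromDb(String sku); }

    // Returns true only when all three layers report the same price.
    static boolean priceIsConsistent(String sku, UiLayer ui, ApiLayer api, DbLayer db) {
        String shown = ui.displayedPrice(sku);
        return shown.equals(api.priceFromApi(sku))
            && shown.equals(db.priceFromDb(sku));
    }

    public static void main(String[] args) {
        // In-memory stand-ins so the sketch runs without a browser, server or database.
        Map<String, String> prices = Map.of("SKU-1", "19.99");
        UiLayer ui = prices::get;
        ApiLayer api = prices::get;
        DbLayer freshDb = prices::get;
        DbLayer staleDb = sku -> "18.99"; // simulates a row that was never updated

        System.out.println(priceIsConsistent("SKU-1", ui, api, freshDb)); // prints true
        System.out.println(priceIsConsistent("SKU-1", ui, api, staleDb)); // prints false
    }
}
```

Only the UiLayer implementation would need a browser library. The other two come from the rest of the language ecosystem, which is why the language choice matters so much.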

That said, here is what I would suggest:

  1. If you are already a mature Java shop, please stick to the Java stack, as finding Java programmers is easier than finding Ruby or Node.js programmers.
  2. If you are on the .NET stack and have awesome programmers, you can stick to .NET too.
  3. If you want to convert manual testers into automation engineers and teach them some programming, Ruby might be a good choice to start with. However, if you want developers to help testers, then weigh in what the developers’ choice is.
  4. If you don’t differentiate between testers and developers, then you probably won’t be reading this, as you can pick pretty much any of the Java, Ruby or Node.js stacks.
  5. If you wish to bring organizational change with ATDD (we will talk about it in a while), then again talk to the in-house developers and staff and pick the programming language that is most suitable and comfortable for them, as long as it aligns with what you want to achieve with automation.


It is very easy to get caught up in programming language wars and lose sight of what you want to achieve. By that I mean, if you end up choosing, for example, Ruby in a complete Java shop, you might not get alignment and support from a staffing perspective, and that can cause a lot of frustration. The reverse might also be true.

My experience has been with Java, Ruby and C# (plus a little JavaScript), and I believe Java and Ruby are completely mature stacks for implementing automation capabilities. C# is okay but lacks a few integration points, especially around Continuous Integration, where newer tools tend to be built to be friendlier to Java and Ruby.


Selenium is probably the most sought-after browser automation library out there, and I have no doubts about it either. However, here is what I see most people miss when they think about Selenium: it is NOT a tool or an IDE like QTP or Rational Robot. Selenium is a packaged library, and it comes in multiple programming languages, viz.

  • Java
  • Ruby
  • C#
  • JavaScript
  • PHP

So once again, I would like to mention that your programming language choice is probably more important to settle before talking about Selenium.

If we take the Java route, we have the selenium-webdriver and selenium-server jars (and other Selenium components). Read more here.

If we choose Ruby, we have the selenium-webdriver and watir-webdriver gems to handle the Selenium side of things.

Talking about frameworks……

If we select Java, we have mature frameworks like TestNG, JUnit and cucumber-jvm, and they are good enough to handle most requirements around reports, parallel testing, parameterization, grouping of tests and suites, and so on.

If we select Ruby, we have frameworks like Cucumber and RSpec that offer the same capabilities.
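
As a rough idea of what “parameterization” means in practice, here is a minimal sketch in plain Java, written without any test framework so that it runs on its own. The Case class, the isValidUsername rule and the sample rows are all hypothetical. Frameworks such as TestNG (data providers) and JUnit (parameterized tests) offer this capability out of the box, along with reporting and grouping, so this is only meant to show the idea and not to replace them.

```java
import java.util.ArrayList;
import java.util.List;

public class DataDrivenSketch {

    // One row of test data: an input and the outcome we expect for it.
    static final class Case {
        final String username;
        final boolean expectedValid;

        Case(String username, boolean expectedValid) {
            this.username = username;
            this.expectedValid = expectedValid;
        }
    }

    // Hypothetical rule under test: 3 to 12 letters or digits.
    static boolean isValidUsername(String s) {
        return s != null && s.matches("[A-Za-z0-9]{3,12}");
    }

    // Runs the same check once per data row and collects a message per failure.
    static List<String> run(List<Case> cases) {
        List<String> failures = new ArrayList<>();
        for (Case c : cases) {
            boolean actual = isValidUsername(c.username);
            if (actual != c.expectedValid) {
                failures.add("'" + c.username + "': expected " + c.expectedValid
                        + " but got " + actual);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<Case> cases = List.of(
                new Case("alice", true),
                new Case("ab", false),          // too short
                new Case("bad name!", false));  // space and punctuation
        System.out.println(run(cases).size() + " failure(s)"); // prints 0 failure(s)
    }
}
```

Adding a new scenario means adding one more row of data. The test logic itself does not change.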

Design Patterns:

Let’s briefly talk about design patterns here. For most test automation solutions, we have the data-driven, keyword-driven, modular and PageObject patterns. We will cover each of them on this website with code examples. The PageObject pattern is the most popular one these days, and you can read more here.
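
For readers who have not seen the PageObject pattern before, here is a minimal sketch of the idea in plain Java. The Browser interface, the FakeBrowser and the locators are hypothetical stand-ins so that the example runs without a real browser. In a real suite the page class would wrap an actual driver instead.

```java
import java.util.HashMap;
import java.util.Map;

public class PageObjectSketch {

    // Hypothetical, minimal browser abstraction; stands in for a real driver.
    interface Browser {
        void type(String locator, String text);
        void click(String locator);
        String textOf(String locator);
    }

    // The page object: it owns the locators and exposes user-level actions,
    // so tests never touch locators directly.
    static final class LoginPage {
        private static final String USERNAME = "#username";
        private static final String PASSWORD = "#password";
        private static final String SUBMIT = "#login";
        private static final String BANNER = "#welcome";

        private final Browser browser;

        LoginPage(Browser browser) {
            this.browser = browser;
        }

        LoginPage loginAs(String user, String password) {
            browser.type(USERNAME, user);
            browser.type(PASSWORD, password);
            browser.click(SUBMIT);
            return this;
        }

        String welcomeMessage() {
            return browser.textOf(BANNER);
        }
    }

    // In-memory fake that mimics a login form well enough for the sketch.
    static final class FakeBrowser implements Browser {
        private final Map<String, String> fields = new HashMap<>();

        public void type(String locator, String text) {
            fields.put(locator, text);
        }

        public void click(String locator) {
            if ("#login".equals(locator)) {
                fields.put("#welcome", "Welcome, " + fields.getOrDefault("#username", ""));
            }
        }

        public String textOf(String locator) {
            return fields.getOrDefault(locator, "");
        }
    }

    public static void main(String[] args) {
        LoginPage page = new LoginPage(new FakeBrowser());
        System.out.println(page.loginAs("alice", "secret").welcomeMessage()); // prints Welcome, alice
    }
}
```

If a locator changes, only LoginPage changes. Every test that logs in keeps calling loginAs untouched, which is the main reason the pattern scales.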

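If we take the Java route, support for the PageObject pattern is built into the jar that we get with Selenium, through the PageFactory class and the @FindBy annotation in its support package. So we do NOT have to build that plumbing ourselves.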

If we take the Ruby route, there are a couple of options. Based on my experience doing the cost-benefit analysis, they are listed below from most preferred to least preferred. Design patterns are something to be learnt in the programming language itself. You can use packaged libraries that offer the pattern, but use them only after you know how the pattern is implemented. Otherwise, when things fail, we get stuck badly, and while debugging we realize that we created unnecessary dependencies.

  1. Page Object and Page Factory patterns have been implemented on this website with an e-commerce website. The project code base is available for free to download (or github checkout) and you can build on top of it. It is scalable and your custom code can be easily plugged in.
  2. Custom Page Object (write classes in plain Ruby and add your own code on top of them. Thanks to Alister Scott)
  3. Learn Ruby programming and get together with your developers to learn how to do it. Implementing the page object pattern is something you can be proud of once you appreciate its details (especially around PageFactory)
  4. Use the page-object-pal/page-object gem [I did not have a great experience with this gem after seeing it implemented in multiple environments. It works fine for basic learning, but once we start integrating more things into the solution, fixing and debugging are not convenient. I end up in a YET ANOTHER GEM situation: the gem is a black box, and until I learn what that black box does, I cannot move forward]
  5. Use Capybara-page-object [Again, the same problems and issues as mentioned in 4]

The next part in this series will present quick visual diagrams of the decision model that follows the choice of programming language.