Why Selenium has an edge over other automation tools

Selenium is an open-source set of libraries initially developed at ThoughtWorks. These libraries can be used with Java, Ruby, C#, Perl, and other programming languages to interact with a web browser. Over time, Selenium has evolved into a widely used automation tool, thanks to a large community that contributes to its enhancement, maintenance, and scalability. This has given Selenium an edge over the commercial tools available in the market. Getting started is easy, and Selenium is free to use, so no licensing cost is involved. Unlike commercial tools, however, it does not provide a UI to ease automation: the user has to write code for even the simplest web operation.

The components of Selenium are Selenium IDE, Selenium RC, Selenium WebDriver, and Selenium Grid.


Grid and WebDriver are the most commonly used components of Selenium these days. Selenium RC became obsolete and was replaced by WebDriver. In Selenium RC, a separate server acted as an interface between the browser and the Selenium commands. In Selenium WebDriver, on the other hand, the WebDriver object initializes and invokes a browser-specific driver, which executes Selenium commands without a separate server acting in between.

Selenium Grid gives us the flexibility to execute our automation tests on various browsers, platforms, and operating systems. See our article on Selenium Grid for how to create a hub and nodes, define desired capabilities, and invoke a remote web driver to execute automation tests on remote machines. Selenium IDE was used for record and playback.

Selenium supports a wide range of platforms and browsers. Below is the list of operating systems supported by Selenium:


Below is the list of browsers supported by Selenium:


You can download the Selenium libraries from the Selenium download page.

You might be wondering where Selenium automation fits into the high-level software development life cycle. It does not always come at the end of the SDLC or only during the testing phase. We'll come to that in later posts.

In a nutshell, Selenium WebDriver is an interface used to interact with the browser. It ships as a collection of classes packaged in a JAR, and bindings exist for languages such as Java, Ruby, Perl, and C#. Any web component displayed in the browser can be operated on through a WebDriver object. Basic operations include typing into text fields, clicking buttons, selecting radio buttons, right-clicking, double-clicking, hovering the mouse, dragging and dropping, selecting from a dropdown, switching between windows and frames, and handling alerts. Selenium WebDriver provides APIs and libraries to handle these different aspects of web components.

Getting started with Selenium


To start with Selenium WebDriver, you need the following items installed on your machine:

JDK (Java 1.8 preferable)

Selenium libraries (nowadays the single JAR ‘selenium-server-standalone.jar’ is enough, unlike earlier, when multiple JARs had to be installed)

Browser drivers (the .exe for the Chrome driver, IE driver, or Gecko driver)

Eclipse IDE

Now that you have downloaded the minimum items required to start writing Selenium code, let's set up the environment:

Now select a Java project:

Give the project a name and click Finish.
Now configure the build path, where you add your downloaded libraries (the Selenium JAR files). You also need to set the JDK and JRE path in the Java compiler section. Right-click your project, then choose Build Path and Configure Build Path.

Now select the Libraries tab and click Add External JARs. Add your downloaded Selenium JARs here:

Also, don't forget to set the compliance level to match your JDK and JRE version.
For recent versions of Selenium, make sure the compliance level under the Java compiler section is set to Java 1.8 or above. Ignoring this might lead to a “class not found” runtime exception when you run your Selenium test script.

Now that you have set the libraries and the compliance level, it's time to write your first Selenium code in Java.
Right-click your project and create a new class.

Give it a name, check “public static void main..”, and click Finish. This is a sample test that does not use TestNG, Cucumber, or Maven, so the first class needs a public static void main method: Java execution begins in the class where the main method is defined. In subsequent lectures, when we use TestNG, Cucumber, or Maven, we will not include a main method, because TestNG or Cucumber will take care of execution.

Now it's time to write our first test script, which simply opens the browser and navigates to the site specified in the script. We'll write the code and go through the meaning of each line.
In our class, first import the required packages and classes that come with the Selenium libraries (for example, org.openqa.selenium.WebDriver and org.openqa.selenium.chrome.ChromeDriver).

Now at class level, that is, inside the class but before the main method, declare a WebDriver variable (mark it static if you use it directly from the static main method):

WebDriver driver;

Now, inside the main method, write the code to set the system property:

Here the setProperty method accepts two parameters, a key and a value. The first is a string naming the driver we are going to use, in this case the Chrome driver, and the second is a string holding the full path to the driver's .exe file.
In the next line, we initialize the driver with ChromeDriver. This line opens the Chrome browser on execution. With the browser open, we want to navigate to a site. This is done by the driver.get("") method, which takes us to the specified URL.
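
Put together, the steps above might look like the following minimal sketch. It assumes the Selenium Java libraries are on the build path. The class name, the driver path, and the URL are placeholders chosen for illustration, not values from this article.

```java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class FirstSeleniumTest {

    // Declared at class level; static so the static main method can use it.
    static WebDriver driver;

    public static void main(String[] args) {
        // Key: the system property ChromeDriver reads to locate its executable.
        // Value: a placeholder path; replace it with the location of your chromedriver.exe.
        System.setProperty("webdriver.chrome.driver", "C:\\path\\to\\chromedriver.exe");

        // Creating the ChromeDriver object opens a Chrome window.
        driver = new ChromeDriver();
        try {
            // Placeholder URL; navigates the open browser to this address.
            driver.get("https://www.example.com");
            System.out.println("Page title: " + driver.getTitle());
        } finally {
            // Close the browser and end the driver session.
            driver.quit();
        }
    }
}
```

Calling quit() in a finally block is a design choice of this sketch: it closes the browser even if navigation fails.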

INTERVIEW TIP: Why can't we write WebDriver driver = new WebDriver();

WebDriver is an interface, so we can declare a variable of its type but cannot instantiate it. That is, WebDriver driver = new WebDriver(); produces a compile-time error. We can instead write WebDriver driver = new FirefoxDriver(); or WebDriver driver = new ChromeDriver(); or WebDriver driver = new InternetExplorerDriver(); FirefoxDriver, ChromeDriver, and InternetExplorerDriver are the concrete classes that implement the WebDriver interface and provide implementations of its methods. So we are assigning an object of an implementing class to a variable of the WebDriver type.
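
The same rule can be seen in plain Java, without Selenium. In this sketch, Browser, ChromeLike, and FirefoxLike are hypothetical stand-ins for WebDriver and its implementing classes; they are not Selenium types.

```java
public class InterfaceDemo {

    // Hypothetical stand-in for an interface such as WebDriver.
    interface Browser {
        String get(String url);
    }

    // Hypothetical implementing classes, playing the role of ChromeDriver and FirefoxDriver.
    static class ChromeLike implements Browser {
        @Override
        public String get(String url) {
            return "chrome opened " + url;
        }
    }

    static class FirefoxLike implements Browser {
        @Override
        public String get(String url) {
            return "firefox opened " + url;
        }
    }

    // Code written against the interface works with any implementation.
    static String open(Browser browser, String url) {
        return browser.get(url);
    }

    public static void main(String[] args) {
        // Browser b = new Browser();  // does not compile: an interface cannot be instantiated
        Browser b = new ChromeLike();   // variable of the interface type, object of a concrete class
        System.out.println(open(b, "https://www.example.com"));
        b = new FirefoxLike();          // the same variable can hold another implementation
        System.out.println(open(b, "https://www.example.com"));
    }
}
```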


In this lecture, we installed the minimum items required to run our first Selenium script: the JDK, Eclipse, the Selenium libraries, and the browser driver executables. We created a Java project, added the libraries to the project's build path, created a class, and imported the packages and classes needed by our test script. We wrote the code to set the system property, initialized the driver object, opened the browser, and finally landed on the site specified in driver.get("").

Finding locators in selenium

Before performing any operation on a web element or component, we need to tell WebDriver exactly where that element resides on the web page. We can see the element with our eyes, but WebDriver can't; we need to tell it to go to a particular location and perform a particular operation. To do so, we make use of XPath. XPath (XML Path Language) expresses the path to a web element in the DOM. When you visit a website in Google Chrome, right-click on the page, and select “view source”, you'll see a page made up of large chunks of HTML, CSS, and JavaScript code, most of it HTML tags.

Some common HTML tags are:

a=anchor tag
tr=table row
td=table data
ul=unordered list
li=list item
ol=ordered list
and so on.

These elements also have attributes and corresponding attribute values.

For example, a div tag may have a ‘class’ attribute, an ‘id’ attribute, and so on. An anchor tag ‘a’ usually has an ‘href’ attribute. These attributes carry values such as a class name or an id.

So we can make use of these tags and attributes to build the XPath of our web component. You can find an XPath without any additional add-on such as Firebug or FirePath. Open the website in Google Chrome, right-click the element whose XPath you want, and click Inspect.
Then, in the inspector window, press Ctrl+F to open the search box, where you can type an XPath and see what it matches.

The basic format of an XPath is

//tag[@attributeName="attribute value"]

Suppose the HTML contains a div whose class attribute value is “abcd”. We write: //div[@class="abcd"]
It means: find a div tag in the HTML whose class attribute value is “abcd”.

If we put an asterisk in place of the tag, as in //*[@class="abcd"], it means: find any tag in the HTML whose class attribute value is “abcd”.
If we add a leading dot, as in .//*[@class="abcd"], the dot represents the current node, so the search starts from the current node: find any tag beneath the current node whose class attribute value is “abcd”. Without the dot, //*[@class="abcd"] searches the entire document for the class value “abcd”.
To go further inside a parent tag, use a single slash in the middle of the XPath: //div[@class="abcd"]/ul/li/a means, under the parent div whose class value is “abcd”, find the anchor tags nested under ul and then li. Each of these XPaths represents either a single element or a list of elements on the web page.
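
These expressions can be tried without a browser. The sketch below is an illustration only: it uses the XPath engine built into the JDK rather than Selenium, and it runs against a made-up page fragment written as well-formed XML. The class name, helper names, and page content are all assumptions of this example.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathBasics {

    // Made-up page fragment, written as well-formed XML so the JDK parser accepts it.
    static final String PAGE =
          "<html><body>"
        + "<div class='abcd'><ul>"
        + "<li><a href='#one'>One</a></li>"
        + "<li><a href='#two'>Two</a></li>"
        + "</ul></div>"
        + "<span class='abcd'>note</span>"
        + "<div class='other'><a href='#three'>Three</a></div>"
        + "</body></html>";

    // Parses the sample page into a DOM document.
    static Node parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample page", e);
        }
    }

    // Evaluates an XPath with the given node as the current node.
    static Object eval(String xpath, Object current, QName type) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(xpath, current, type);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate: " + xpath, e);
        }
    }

    // Number of matches when the current node is the document itself.
    static int count(String xpath) {
        return ((NodeList) eval(xpath, parse(), XPathConstants.NODESET)).getLength();
    }

    // Number of matches when the current node is the first match of contextXPath.
    static int countWithin(String contextXPath, String xpath) {
        Node current = (Node) eval(contextXPath, parse(), XPathConstants.NODE);
        return ((NodeList) eval(xpath, current, XPathConstants.NODESET)).getLength();
    }

    public static void main(String[] args) {
        System.out.println(count("//div[@class=\"abcd\"]"));             // 1: only the div
        System.out.println(count("//*[@class=\"abcd\"]"));               // 2: the div and the span
        System.out.println(count("//div[@class=\"abcd\"]/ul/li/a"));     // 2: anchors under ul/li
        System.out.println(countWithin("//div[@class='abcd']", ".//a")); // 2: starts at the current node
        System.out.println(countWithin("//div[@class='abcd']", "//a"));  // 3: searches the whole document
    }
}
```

The last two lines show the effect of the leading dot: with the same current node, .//a stays inside that node while //a covers the entire document.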


XPath functions and axes help locate dynamic elements. Sometimes an attribute value of an HTML tag changes, and a previously written XPath that depends on that value stops working. To overcome this, XPath provides functions and axes that can be used inside an expression to reach the correct location even when an attribute is dynamic. The first such function is

1. contains().

Suppose an attribute value is “btn123” and the numeric part keeps changing. You can write .//*[contains(@name, 'btn')], which means: find any tag, starting from the current node, whose name contains “btn”.

OR and AND: You can use ‘or’ and ‘and’ inside your XPath. For example, //*[@type='submit' or @name='abcd'] selects any tag whose type is submit or whose name is “abcd”. //*[@type='submit' and @name='abcd'] selects any tag whose type is submit and whose name is “abcd”; both conditions must be satisfied.
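
As a rough check of contains(), ‘or’, and ‘and’, the following sketch evaluates the three expressions with the JDK's built-in XPath engine (not Selenium) against a made-up form. The class name, the helper, and the form fields are assumptions of this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathConditions {

    // Made-up form: two buttons with changing numeric suffixes, plus two fields named "abcd".
    static final String PAGE =
          "<form>"
        + "<input name='btn123' type='button'/>"
        + "<input name='btn456' type='submit'/>"
        + "<input name='abcd' type='text'/>"
        + "<input name='abcd' type='submit'/>"
        + "</form>";

    // Parses the sample form and returns how many nodes the XPath matches.
    static int count(String xpath) {
        try {
            Object doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Matches btn123 and btn456, whatever the numeric part is.
        System.out.println(count(".//*[contains(@name, 'btn')]"));            // 2
        // Either condition is enough: btn456, and both fields named abcd.
        System.out.println(count("//*[@type='submit' or @name='abcd']"));     // 3
        // Both conditions must hold: only the submit field named abcd.
        System.out.println(count("//*[@type='submit' and @name='abcd']"));    // 1
    }
}
```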

2. starts-with()

For example, //label[starts-with(@id,'abcd')] means: find a label whose id starts with “abcd”.
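
A small sketch of starts-with(), again using the JDK's built-in XPath engine on a made-up fragment rather than a live page. The class name, the helper, and the ids are assumptions of this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathStartsWith {

    // Made-up fragment: two labels whose ids start with "abcd", one that only contains it.
    static final String PAGE =
          "<form>"
        + "<label id='abcd1'>User</label>"
        + "<label id='abcd2'>Password</label>"
        + "<label id='xyz-abcd'>Other</label>"
        + "<input id='abcd3'/>"
        + "</form>";

    // Parses the sample fragment and returns how many nodes the XPath matches.
    static int count(String xpath) {
        try {
            Object doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Only labels whose id begins with "abcd"; 'xyz-abcd' does not qualify.
        System.out.println(count("//label[starts-with(@id,'abcd')]")); // 2
        // With * in place of the tag, the input with id 'abcd3' matches too.
        System.out.println(count("//*[starts-with(@id,'abcd')]"));     // 3
    }
}
```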

3. text()
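
In XPath, text() selects the text inside an element, so it can be used to match an element by its visible text. The sketch below illustrates this with the JDK's built-in XPath engine on a made-up fragment; the class name, the helper, and the link texts are assumptions of this example, not taken from the article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathText {

    // Made-up fragment with three links and a span that repeats one link's text.
    static final String PAGE =
          "<div>"
        + "<a href='#h'>Home</a>"
        + "<a href='#i'>Sign in</a>"
        + "<a href='#u'>Sign up</a>"
        + "<span>Home</span>"
        + "</div>";

    // Parses the sample fragment and returns how many nodes the XPath matches.
    static int count(String xpath) {
        try {
            Object doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Exact text match on anchor tags only.
        System.out.println(count("//a[text()='Home']"));            // 1
        // Exact text match on any tag: the anchor and the span.
        System.out.println(count("//*[text()='Home']"));            // 2
        // text() combined with contains() for a partial match.
        System.out.println(count("//a[contains(text(),'Sign')]"));  // 2
    }
}
```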


4. following:

Selects all elements in the DOM that come after a particular node. For example,
//*[@class='post-content']//following::a means: find all the anchor tags after the element with class ‘post-content’.

In the example, it gives 18 anchor tags after the ‘post-content’ class.

But what if you want one particular tag? For this, you can specify an index.
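
As an illustration of the following axis and of indexing, the sketch below runs the expression above against a made-up page with three anchors after the ‘post-content’ element, using the JDK's built-in XPath engine instead of a browser. The class name, helpers, and page content are assumptions of this example; wrapping the expression in parentheses before adding [2] is one common way to pick the second match overall.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathFollowing {

    // Made-up page: a post-content block followed by three anchors.
    static final String PAGE =
          "<body>"
        + "<div class='post-content'>Post body</div>"
        + "<a href='#1'>First</a>"
        + "<div><a href='#2'>Second</a></div>"
        + "<a href='#3'>Third</a>"
        + "</body>";

    // Parses the sample page and evaluates the XPath to the requested result type.
    static Object eval(String xpath, QName type) {
        try {
            Object doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc, type);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate: " + xpath, e);
        }
    }

    // Number of nodes matched by the XPath.
    static int count(String xpath) {
        return ((NodeList) eval(xpath, XPathConstants.NODESET)).getLength();
    }

    // Text of the first node matched by the XPath.
    static String text(String xpath) {
        return (String) eval(xpath, XPathConstants.STRING);
    }

    public static void main(String[] args) {
        String all = "//*[@class='post-content']//following::a";
        System.out.println(count(all));              // 3 anchors after post-content
        System.out.println(text("(" + all + ")[2]")); // Second
    }
}
```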

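5. ancestor:

Selects the elements that enclose a particular node: its parent, its grandparent, and so on up to the root.
For example, you can find all the div ancestors of the element with class ‘logoCotainer’.

6. descendant

Selects all elements nested inside the current node: its children, their children, and so on.
For example, you can find all the div elements inside the element with class ‘uppermenubar’.

7. preceding

Selects all elements that appear before the current node in the document, excluding its ancestors. For example, you can find all the div elements that come before the element with class “navigation”.

8. child

Selects the direct children of the current node. For example, you can find all the child div elements of the element with class ‘uppermenubar’.

9. parent

Selects the direct parent of the current node. For example, you can find the parent div of the element with class ‘navigation’.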
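
The five axes above can be compared side by side on one small structure. This is a sketch under stated assumptions: the page fragment is made up (it only reuses the class names mentioned above), the expressions are written for this example with a single slash before each axis, and they are evaluated with the JDK's built-in XPath engine rather than Selenium.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathAxes {

    // Made-up layout: a header with a logo, and a menu bar with a menu and a navigation block.
    static final String PAGE =
          "<div id='page'>"
        + "<div class='header'><div class='logoCotainer'>Logo</div></div>"
        + "<div class='uppermenubar'>"
        + "<div class='menu'><div class='item'>Home</div></div>"
        + "<div class='navigation'>Nav</div>"
        + "</div>"
        + "</div>";

    // Parses the sample layout and returns how many nodes the XPath matches.
    static int count(String xpath) {
        try {
            Object doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // ancestor: header and page enclose the logo container.
        System.out.println(count("//*[@class='logoCotainer']/ancestor::div"));   // 2
        // descendant: menu, item, and navigation sit inside the menu bar.
        System.out.println(count("//*[@class='uppermenubar']/descendant::div")); // 3
        // preceding: header, logoCotainer, menu, item; ancestors are excluded.
        System.out.println(count("//*[@class='navigation']/preceding::div"));    // 4
        // child: only the direct children, menu and navigation.
        System.out.println(count("//*[@class='uppermenubar']/child::div"));      // 2
        // parent: the menu bar is the direct parent of navigation.
        System.out.println(count("//*[@class='navigation']/parent::div"));       // 1
    }
}
```

Comparing descendant (3) with child (2) on the same element shows the difference between "anything nested inside" and "directly inside".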

Once you have your XPaths, you can use them in your Selenium script.
By.xpath("//*[@class='navigation']//parent::div") returns a By object, so ultimately we are passing a By object to the findElement method.
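
A minimal sketch of that call in a script might look as follows. It assumes the Selenium Java libraries are on the build path and that the Chrome driver executable can be located (for example through the webdriver.chrome.driver system property). The class name and the URL are placeholders, and the XPath only matches on a page that actually contains such an element.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class LocatorExample {

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            // Placeholder URL; use a page that contains an element with class 'navigation'.
            driver.get("https://www.example.com");

            // By.xpath builds a locator object; nothing is searched yet.
            By locator = By.xpath("//*[@class='navigation']//parent::div");

            // findElement returns the first match, or throws NoSuchElementException if there is none.
            WebElement element = driver.findElement(locator);
            System.out.println("Found tag: " + element.getTagName());
        } finally {
            // Close the browser and end the driver session.
            driver.quit();
        }
    }
}
```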


In this lecture, we learned that before performing any operation on a web element, we first need to find the exact location, or path, of that element and instruct WebDriver to go to this path and perform the operation. Without the location of an element, we cannot tell WebDriver to act on it. For example, to click a button on the web page or type text into a text field, you first need to find where exactly that element is on the page. To do so, we can read the tags, attributes, and values of those elements from the page source and write our XPath.