
Why have versioned documentation

Heya,

Very often we keep our project's code under a version control system. This has proven to be of real help: it lets the developer observe the changes, remember the reasons behind them, and code just for the differences.

The documentation should obey the same rules.

"My taller friend" pointed out that documentation splits, by functionality, into at least two categories: one describes the characteristics of an entity or process; the second lists the checks to be made in order to validate an entity.

By combining those two, the ideal documentation looks as follows: a dynamic part, with unchecked checklist items, that is included automatically after each clone of the previous version; and a static part that is altered from one version to another.


A simplified real-life example:

We have a folder that can contain an arbitrary depth of subfolders and an arbitrary number of files.

We work hard to create a good-enough script that validates that each folder has the required image and name, based on the documentation provided by the stakeholder.

We copy our folder to a new location and add a few folders and files.
Now we have two ways of writing documentation that helps the automation:

We work hard, again, to parametrize the initial script based on the re-written documentation.
OR
We copy the first script and alter just the parts that changed.
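For illustration, here is a minimal sketch of what such a validation script could look like. Everything in it is hypothetical: the required image name ("logo.png"), the class name, and the idea that the requirement comes straight from the stakeholder's documentation.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Walks the folder tree and flags every folder missing the required image.
// The required file name would come from the (versioned) documentation.
public class FolderValidator {

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        String requiredImage = "logo.png"; // hypothetical requirement

        try (Stream<Path> tree = Files.walk(root)) {
            tree.filter(Files::isDirectory)
                .filter(dir -> !Files.exists(dir.resolve(requiredImage)))
                .forEach(dir -> System.out.println("FAIL: " + dir + " is missing " + requiredImage));
        }
    }
}

Version 2 of the script would then be a copy of this one with only the changed requirements edited in, mirroring the copied documentation.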


Versioned documentation allows the team to adapt faster without over-thinking the technical solution.

I would like to read your opinions,
Gabi

To Scrum or not to SCRUM – this is the question

Hello,

One of the most used words today is "scrum". All caps or not, it does not really matter. The bottom line is that it is not an acronym.

Now, after working in this framework for a while, with various better or worse implementations, I have come to the conclusion that, up to a point, it is better to have QA in charge of the "Scrum master" role. Here's why:

  • In the early stages of the process, QA usually writes the acceptance criteria, which get signed off by the client
  • During development, QA must ease the communication between the developers and the product owner

Why the first?

Because they are the ones who, later in the process, must verify that the deliverable covers all the expectations. Otherwise the client could be handed an image instead of a website, while the team strongly states that it looks exactly the same. Looking the same doesn't mean that it functions, or, if it does, that it functions well.

Why the second? Why can’t a developer handle the task?

The QA person is the one handling the status of the tickets once they are considered complete and in testing. The developer must dig for this information, while QA can't avoid it. So a developer could do it, but it is extra work.

Why not have the Product Owner in charge of it?

1) If the client is "many" and "decision-challenged": the product owner brings far more to the project by just listening to the client and filtering the information for the team, which removes most of the noise. To this end, QA supports the product owner by delivering the required client-facing information.

2) If the client is "one" and "decision-challenged": the product owner should filter the incoming information and extract the client-facing information on their own. In this case I do not consider that QA should handle the "Scrum master" role. However, in a team of 2 complementary senior QA resources, the more soft-skills-oriented one can cover this role as well; he/she already has visibility into the team's workload, and there shouldn't be 6 hours of decisions every day.

3) If the client is "one" and the decisions don't change: this is quite ideal. Once the acceptance criteria are agreed upon, there is no need for a product owner altogether. The team should be productivity-driven, with the final goal always visible.

Personal opinion:

Since Scrum is founded on the premise of a holistic approach (treat the team as a whole, not as individuals) and the client is both the input and the output, QA should be the filling and the Product Owner should be the shield around it.

Please disagree and let’s have a chat.

Gabi

Selenium DDT with TestNG and DataProvider

Hi there,

Of course, there are many ways to do Data-Driven Testing, but if you're familiar with Java or OOP in general, you're going to like this approach with TestNG and DataProvider.

First, let's give a short intro to @DataProvider: this annotation requires a name and marks a method that returns either an Object[][] or an Iterator<Object[]>.

For the scope of this article, we will try to log in multiple times to a particular site, such as gmail.com, with the username and password combinations provided in an "input.txt" file:


user1, pass1;
user2, pass2;
user3, pass3;

Now let's see the code for our custom readFile() method and review it based on the particularities of the above input.txt file:


    public static Object[][] readFile(File file) throws IOException {
        // Apache Commons IO reads the whole file into one String.
        String text = FileUtils.readFileToString(file);
        // trim() drops a trailing newline so a final ";" does not create a bogus extra row.
        String[] row = text.trim().split(";");

        int rowNum = row.length;
        int colNum = row[0].split(",").length;

        Object[][] data = new String[rowNum][colNum];

        for (int i = 0; i < rowNum; i++) {
            String[] cols = row[i].split(",");
            for (int j = 0; j < colNum; j++) {
                data[i][j] = cols[j].trim();
                System.out.println("The value is " + cols[j].trim());
            }
        }
        return data;
    }

So as you can see, we'll be passing one argument to our readFile() method: the file that needs to be read. We're then using the Apache Commons IO FileUtils library to read our .txt file, so make sure you have that imported (seems our "custom" readFile() method is not so custom after all :) ).

We're storing all the read content inside a string and then doing a split (a fabulous one) by the semicolon (";") character, since that separates each individual row.

Now, to find out the number of "columns", we do another split by the comma character (","), since it separates each username from its password. Of course, you can use any other character for separation and then split by it (e.g. a pipe | or a tilde ~ or anything else).

The next step creates an array of arrays of Objects called "data", with the size data[rowNum][colNum] that we computed earlier.

What's left is two for loops that iterate through the rows and then the columns, assigning each read value to the data[][] array with the surrounding whitespace trimmed off.

In the end our method will return the “data” Object[][].

So we’re now going to use this method for the @DataProvider annotation:


@DataProvider(name = "text")
public static Object[][] readFile() throws IOException {
    File file = new File("input.txt");
    // readFile(File) is static, so we call it on the class instead of an instance.
    Object[][] returnObjArray = ReadText.readFile(file);
    return returnObjArray;
}

What's left is to pass this @DataProvider to a @Test method that makes use of its two parameters, and since we said we're going to log in to Gmail, we're gonna do just that:


@Test(dataProvider = "text")
public void loginGmail(String user, String pass) {
    driver = new FirefoxDriver();

    driver.manage().timeouts().implicitlyWait(7, TimeUnit.SECONDS);
    driver.get("http://www.gmail.com");
    WebElement username = driver.findElement(By.id("Email"));
    username.sendKeys(user);

    WebElement password = driver.findElement(By.id("Passwd"));
    password.sendKeys(pass);

    WebElement loginButton = driver.findElement(By.id("signIn"));
    loginButton.click();

    Assert.assertTrue(driver.findElement(By.cssSelector("a[title*='Inbox']")).isDisplayed());
}

Notice how the dataProvider name in the @Test annotation corresponds to the name of the @DataProvider, and how the test method loginGmail accepts two String arguments, which correspond to a row of parameters from our input.txt file.

In the end our test will be executed 3 times, because there are 3 rows of username+password combinations passed from the file.
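As a bonus, remember from the intro that a @DataProvider can also return an Iterator<Object[]>. Here is a minimal sketch of that variant under the same input.txt assumptions (imports needed: java.util.ArrayList, java.util.Iterator, java.util.List); TestNG then pulls one row at a time from the iterator:

@DataProvider(name = "textIterator")
public static Iterator<Object[]> readFileLazily() throws IOException {
    String text = FileUtils.readFileToString(new File("input.txt"));
    List<Object[]> rows = new ArrayList<Object[]>();
    // Same parsing as readFile(), collected into a list we can iterate over.
    for (String row : text.trim().split(";")) {
        String[] cols = row.split(",");
        rows.add(new Object[]{cols[0].trim(), cols[1].trim()});
    }
    return rows.iterator();
}

For truly huge files you would wrap a BufferedReader in a custom Iterator so rows are read on demand, but the idea stays the same.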

Enjoy DDT,

-M.

Page Objects – My friend is not insane

Hello reader,

I am writing this while thinking of a work colleague who keeps trying to propose the notion of page objects. To get a sense of his effort versus the general acceptance of it, imagine Don Quijote versus the windmills.
As in any other post, we shall start by understanding the problem at hand.

Baseline:

When we write automated tests, in almost any keyword-driven framework, we must implement an action for a selector/locator. This key:value pair will live under a name. The name is specific to an area of a page of the application. If it is a website, it will be a webpage; if it is a desktop program, it will be the state of a window. From now on, this will be called a "screen".

A screen contains multiple key:value pairs, one for each specific action. Please note the word "specific" and think about it. So far we have screens with multiple specific key:value pairs.

On the web, it is very likely that a screen will feature several shared areas at once. Let's look at this image:

[Image: websiteLayout, a page layout split into "Header", "Categories", "Dynamic content", and "Footer" areas]

From this image it is obvious that most of the screens will have the "Header", "Categories", and "Footer" more or less present. Some slight changes occur between the logged-in and anonymous states, but otherwise they are always the same. The "Dynamic content" is the actual driver of the screen; this area is the reason the users receive the information.

Another problem is related to duplicated code. Every time we write something twice, there is a 50% chance that on an update we forget about the other copy, and a 100% chance of working twice as hard to maintain it.

The last problem refers to test versioning. We like to complain that xxx and yyy changed the locators; however, we do little to avoid it. If the project is heading for a difficult release, it is very likely that an older backup of the codebase will be kept. This is why our tests should be aware of the version they are required to run against.

Problem:

How do we keep the functionalities grouped in such a way that we do not have duplicated code and we maintain a versioning system?

My answer:

1) We look around the project and draw a map similar to the one I made earlier. Don't fall into the trap of going too deep with the granularity; I bet it would take an amount of time unjustifiable to the product manager. This gives us an idea of what is fixed.

2) We create a sketch of the screens and write in it the names of the areas discovered earlier, together with their particularities for this step. Those will be our page objects.

3) We add to the sketch the specific actions and validations.

4) In the runner class we receive as many variables as there are page objects and areas. Those variables should start from 1; they represent the version of the scripts.

5) We code each piece of the areas and take into consideration the flags that will cover the states and the version.

6) We code each of the page objects taking into consideration the input data and the version.

7) We go out and drink

This method of automation can be applied under any type of framework and has a very good return on investment. It allows the team to maintain the tests as fast as possible.
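To make steps 4 to 6 concrete, here is a minimal sketch of one versioned area in Java/Selenium. The class, the locators, and the "version 2" change are all invented for illustration:

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

// One "area" page object. The runner passes in the version under test, and
// the locators are resolved per version, so older releases stay runnable.
public class HeaderArea {

    private final WebDriver driver;
    private final int version;

    public HeaderArea(WebDriver driver, int version) {
        this.driver = driver;
        this.version = version;
    }

    // Hypothetical change: the search box locator was reworked in version 2.
    private By searchBox() {
        return (version >= 2) ? By.id("search-field")
                              : By.cssSelector("#header input.search");
    }

    public void searchFor(String term) {
        driver.findElement(searchBox()).sendKeys(term);
        driver.findElement(searchBox()).submit();
    }
}

Each screen then composes the areas it contains (Header, Categories, Dynamic content, Footer), so a locator lives in exactly one place.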

Bonus:

It allows the tester, or the PM in the case of BDD, to ask questions if the requirements skip areas that existed before.

For instance, in the mockups a list can hold 3 products without a scroll bar. After looking at the tests, we can point out that the scroll bar is no longer present for more than 3 products. This raises a question that gets clarified into, say, a red scroll bar.

Have fun,
Gabi


Personal opinion – how to interview

Hey,

A kick in the butt, a step forward, a new opportunity, good luck, bad luck; call it what you like. In the end it is just an interview. It is only a defined period of time in which you have the opportunity to summarize what you have learned. "No pressure ™".

Now, what if you were on the other side, one of the ones who listen and choose?

Exactly like the person in the first paragraph, the person in the second one has to use their skill set. On deeper thought it may look like the second is in the more powerful position. My opinion is that they are both equal. I also think this is the correct mindset.

How to make that step? From person one to person two. What qualifies someone to be person two?

At this moment, with my QA profile, I only know three things: listen, ask "how", and at the end ask "why". Those three actions have served me well lately.

Having the master plan, let's proceed to picking a methodology.

First we need to understand the acceptance criteria for this task: what is required of us, what the input parameters are, and what the desired outcome is. This should allow us to create a feature-related scenario. So far, so good. Next step.

Second, we should identify our input. In a more programmatic approach we would call these optional parameters: if they are provided, use them for a relevant outcome; if not, they are supported by the feature anyway. Here the scope changes. The recruiter is the feature within a feature. This calls for abstraction.

Third comes prioritization. Those inputs should be assigned values in order to determine the best mathematical outcome.

The end.

In theory QA should be able to use their expertise in order to hire people.

Have fun,

Gabi

Behat – External selectors file – Definition of useless or genius

Hello,

Before we start, thank you Mario for listening to my idea and coming up with a better one :).

Last night I finished implementing a feature in Behat that I have mixed feelings about. It allows the team to store the selectors in an external file. Now, this sounded great at first and I did not argue against it; mostly I was curious how it could be done.

I am saying it is a bad idea because sending parameters to the FeatureContext constructor can already be done in several ways:

  • an array of parameters via the behat.yml that can be extended through an import to include a different file
  • a multi-dimensional array via the Scenario Outline and the Examples table
  • a normal file include in FeatureContext

However, none of those actually inserts the values into the scenarios at runtime, replacing the keywords. This is when I get to say that it is genius.

But again, this should not be used in the first place. The whole purpose of BDD (in this context) is to be a tool that provides documentation for the stakeholders, replacing a test management tool; otherwise we should not have used Gherkin to begin with. But what if the target is the QA person? If so, it makes sense. However, we are testing a framework built on top of Magento that we implement for clients, so it goes back to being a bad idea: the clients will not understand jack from our tests. On the other hand, since we share some of the code base but implement custom functionalities on top of it, we want to maintain our selectors and values in a decoupled spot and not touch the code all the time. But the .feature files are quite decoupled as they are. Uhmm... reasons, reasons.

I will let you, reader, meditate upon using it or not, and if you do use the code, please drop a comment saying why. Thank you in advance, unknown friend.

We will start with a top-down overview of the implementation. The xxxx.feature file looks like this:


  Scenario Outline: Invalid user login
    Given I am on homepage
    And I follow "Log In"
    And I fill in "email" with "<__email__>"
    And I fill in "pass" with "<__password__>"
    And I press "<__button__>"
    Then I should see "<__messageBody__>"

  Examples:
    |  |
    |  |

The current implementation needs the empty table at the end in order for Behat to generate an array at runtime; probably this can be fixed in the code. The <> are regular placeholders: they are substituted with the values from the Examples table at runtime. The "__" (double underscore) is used to differentiate our keys from columns already existing in the table.

In FeatureContext.php I have created this method that will be loaded on the @BeforeFeature hook:


    /**
     * @BeforeFeature
     */
    public static function prepare(\Behat\Behat\Event\FeatureEvent $event)
    {
        $feature = $event->getFeature();
        $exampleLoader = new ExamplesLoader();
        $exampleLoader->replaceExamples($feature);
    }

The class for this lives in a file called ExamplesLoader.php, located in the Bootstrap folder. It does not have to be loaded explicitly because Behat automatically loads all the classes from that folder. Since the logic is in here, I will post it piece by piece, by functionality.

This iterates through our scenarios and, for each scenario, gets the examples. It will return an array with as many elements as there are rows in the Examples table; in the current implementation it works with two.


foreach ($feature->getScenarios() as $scenario) {
    $examples = $scenario->getExamples();
    // all the other pieces of code will go in here; leave it blank for now
}

This piece glues together the current working directory (getcwd()) and the name of the file where the selectors/locators live. They are joined by DIRECTORY_SEPARATOR so it works on every operating system. Please note that the working directory is where behat.yml is located, not where the current file lives. The resulting string is stored in the $filePath variable.
$holder will contain a two-dimensional array read from the .tsv file. If you want to read a file with a different separator, check the PHP documentation for fgetcsv(); the third argument is the separator. Also, if the Examples table is longer, an iteration inside $holder[$row] is required, because we want data for all the rows, not just two.


$filePath = join(DIRECTORY_SEPARATOR, array(getcwd(), 'locatorsFile.tsv'));
$holder = array();
$row = 0;
if (($handle = fopen($filePath, "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, "\t")) !== FALSE) {
        // if you are thinking it would be better to iterate over more elements,
        // don't: later we only use key:value pairs
        $holder[$row] = array($data[0], $data[1]);
        $row++;
    }
    fclose($handle);
}

$rows is a variable created by Behat which stores all the values of the Examples table. Each element of this array is an array of what sits between the pipe (|) characters on that specific row of the Examples. This is where we want to add our keys and values, because once we insert them the framework handles all the logic that follows. setRows($rows) is the method that locks this table in place for test creation.


// Add our global examples
foreach ($holder as $value) {
    $rows[0][] = $value[0];
    $rows[1][] = $value[1];
}
// and we send the data to the examples table
$examples->setRows($rows);

Now our table will include all the data from locatorsFile.tsv. Here's how that file looks on the inside:


__button__  Send
__messageBody__ Invalid login or password.
__email__   asdkjasdj@askdjaskjd.com
__password__    asdadasd

Have a nice day,
Bye bye!

BDD to the next level

Hello reader,

In this fine post I will tell a story about the top-level view of BDD + Gherkin usage in a project.

Once upon a time, in a land far far away, Qualitus Assuranceus, Developerus and Productus Ownerus were walking side by side. Little did they know about the Firespitter Communicator-Breaking Dragon. In their childhood all went fine. Developerus was coming up with all those crazy ideas for gadgets, Productus Ownerus was shooting requests all over the place, and Qualitus Assuranceus was listening to them. Because their yard was small and their parents were still in power, every evening they sat at the same table sharing thoughts. Oh, such a beautiful world.

Soon the teenage period came. All of them grew more beautiful. They went to schools and specialized in this and that. The ideas ran even wilder. Each of them teamed up with a pair of Bugs. Those Bugs brought their friends. The family was growing, but the disaster was not so obvious yet, mostly thanks to the drinking when the thoughts came together again.

But one day, it happened. The clouds gathered and the sun was covered. The Bugs grew wings and rose higher and higher. Some became critically damaging to the family, even to their relationship. And this was only the beginning!

The Firespitter Communicator-Breaking Dragon joined the game, spitting revenue loss and extra hours. Chaos itself landed on the project.

Gladly, Cpt. BDD-erica came with a blue shield and mantle and restored peace.

The end.

This being the story, where do we go from here? Gherkin and BDD cover all the gaps in the story.

My thought is that the next step is to make the testing framework create a flow pattern of the application. This flow needs to become smart enough to ask questions wherever it thinks something will be decoupled. It would all conclude in a graph of the project, and testing a flow would become a query returning all the deviations of the possible ways to get from A to B.

The job of QA should move upwards, towards the client, and become more business oriented. The goal would be to provide better requests to the development team, while the Product Owner focuses more on listening to what the users want.


This is tonight’s QA forecast.

Gabi

Behat Contexts – A stretch of imagination – Episode 3 – Our Poison

Hello,

This will be a short post. Out of all the poisons, we picked traits (http://www.php.net/traits) for our custom context "classes". This keeps the OOP idea for later Behat versions and gives us auto-complete in FeatureContext.php. There is no auto-complete in the included contexts, but since they are helpers, that is OK.

Table doubling + hash tables = love

Hello reader,

During my quest for learning I encountered this thing called "table doubling". Boy, I was so happy about it that I wrote this article :).

The goal is to structure some data, any data, in such a way that we can delete, insert and retrieve really, really fast while keeping a decent memory usage. A real-life example would be the HTML code of a webpage: if we consider each HTML element (node) with its properties a piece of data, we want to get the one we care about super fast, not in a second or so. So, in order to get a sense of why this is useful, I will lay down some premises.

An array = a linear piece of memory of the length of the array, which is always occupied. The keys/indexes of an array must be integers; we can't have a key/index of 2.3 or 0.1. Due to these two properties we can get to any element almost instantly, because it exists and we know exactly where it is.

A linked list = a piece of data that, besides the "internal" information we care about (for our example, let's say the inner text), holds a link to the location in the device's memory of the next item after it. So 1 knows where 2 is, 2 knows where 3 is, and so on and so forth. This is rather sweet, but it only goes forward; with a bit of extra work we get to the doubly linked list.

A doubly linked list = a linked list where each individual piece of information knows about the previous piece as well as the next. Although it has a bigger footprint, it solves our previous request in a neat, fast way.

Now, in order to nail down the difference, let's look at what happens when we operate on an element close to the beginning of an array and of a list (doubly linked or not, it doesn't matter).

Operation by operation, array versus list:

Fetch an element by key (aka exact search)
Array: go to the key's position.
List: go to the first or last element; check whether it is the one we need; if not, ask where the next or previous one is located; repeat the previous two steps until it is found.*

Insert
Array: increase the length of the array by the number of elements we will insert; move all the other elements to the right, in descending order; add our data where we wanted it.**
List: it goes at the end of the list, and the last element is told to store its memory address.

Delete
Array: remove the element; move all the other elements to the left; decrease the size of the array.
List: go to the desired element, forwards or backwards; tell the element on the left to point to the element on the right and, if needed, tell the element on the right to point back to the left one.

* In case it is not obvious how slow this is: take a piece of paper, write out the steps for finding the 5th element, and count the rows.
** It will not put it at the end of the array, because the order matters. The premise was that we are close to the beginning of the array, so don't think of adding at the end.

In conclusion, as we can see, each of these data types comes with quite some advantages and disadvantages.

Now, in order to find the holy middle ground between those two data types, we will discover a new one, called the hash table. The not-so-official definition would be: an array that stores, as the values of its keys, pointers to (in the worst-case scenario) lists of data with all their properties.
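To pin that definition down, here is a minimal sketch of such a hash table in Java (fixed bucket count, String keys only, no resizing yet; all the names are mine, not from any particular library):

import java.util.LinkedList;

// An array whose slots point to linked lists ("chains") of entries.
public class ChainedHashTable {

    private static class Entry {
        final String key;
        Object value;
        Entry(String key, Object value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    public ChainedHashTable(int bucketCount) {
        buckets = (LinkedList<Entry>[]) new LinkedList[bucketCount];
        for (int i = 0; i < bucketCount; i++) {
            buckets[i] = new LinkedList<Entry>();
        }
    }

    // The hash maps any key onto a fixed array slot: this is the part that
    // gives us the "almost instant" array access from the premises above.
    private int indexFor(String key) {
        return Math.abs(key.hashCode() % buckets.length);
    }

    public void put(String key, Object value) {
        for (Entry e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) { e.value = value; return; }
        }
        buckets[indexFor(key)].add(new Entry(key, value));
    }

    public Object get(String key) {
        for (Entry e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) { return e.value; }
        }
        return null; // worst case we walked one short chain, not the whole table
    }
}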

The other concept of this post is table doubling. In fewer words, it says that instead of resizing an array one element at a time, when it gets filled we make it twice as long as it currently is. This lets us hold twice as much data for future inserts after a single operation, and it eliminates the need to move all the elements all the time. The second part is that on delete we shrink it by at least 1/3. This means that inserting and deleting a few elements will not trigger the resizing.
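And here is a minimal sketch of the doubling part in isolation, growth plus the shrink-on-delete rule (the hashing from above would sit on top of this; the 1/3 threshold simply mirrors the value mentioned in the paragraph):

// A dynamic array that doubles on growth and halves when mostly empty.
public class DoublingArray {

    private Object[] slots = new Object[1];
    private int size = 0;

    public void add(Object value) {
        if (size == slots.length) {
            // Double the capacity: one O(n) copy buys n more O(1) inserts.
            Object[] bigger = new Object[slots.length * 2];
            System.arraycopy(slots, 0, bigger, 0, size);
            slots = bigger;
        }
        slots[size++] = value;
    }

    public Object removeLast() {
        Object value = slots[--size];
        slots[size] = null;
        // Shrink only when at most a third of the table is used, so a few
        // inserts and deletes around a boundary never trigger a resize.
        if (size > 0 && size <= slots.length / 3) {
            Object[] smaller = new Object[slots.length / 2];
            System.arraycopy(slots, 0, smaller, 0, size);
            slots = smaller;
        }
        return value;
    }

    public Object get(int index) {
        return slots[index];
    }
}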

Let’s look at every operation and see the advantage.

Exact search:
Being an array, we get very fast navigation and finding.
Insert:
From time to time it triggers the array re-hashing, but not often. As long as we play with the same amount of data, or fiddle a bit with the "1/3" value mentioned earlier, everything works at the speed of the list.
Delete:
The resizing takes place only when there are a lot of free elements, and even after the deletion we have room for more than we had in the beginning. Almost as fast as the doubly linked list.


Conclusion:
I can see this saving seconds on every test case run against a page with a few hundred elements.