ABCDF of DevOps

Usually when you look at diagrams that explain DevOps, they show a few wheels that keep moving. This is what we can call the “code, build, deploy” cycle, sometimes called a CI/CD pipeline (CI: Continuous Integration, CD: Continuous Deployment or Delivery).

It looks simple, as below: you keep making changes to the code, which trigger automatic builds (the CI part), which in turn trigger deployments (the CD part):

If you have seen the infinity-shaped DevOps diagrams, you may be asking: “Hey, where is Testing?” My answer is that it is no longer a separate phase but part of everything. Code commits ‘must’ include tests for the changes; the build ‘must’ run tests and only passes if they pass; the deployment is followed by tests conducted in the Production environment.

In addition to the above, here are some points for each of the wheels that I have seen/heard so far. Each of them calls for a separate post, but I wanted to sum up the whole picture here for now:

Code:

Build:

  • Instant builds like CI
  • Follow up builds that run additional tests e.g. Performance testing
  • Builds only pass when all tests pass
  • Developers own the build process (“buildology”) rather than it being owned by a separate team

Deploy:

  • Manual vs. automatic deployments
  • Testing in production
  • Approval process
  • Usage logs and monitoring

Many people, when they embark on this journey of the CBD cycle, assume that everything will start working correctly as soon as they define a Build and Release pipeline. They soon discover that it is a bumpy ride, and at times they become frustrated with the journey.

What I have experienced is that if you add two more letters to the story, you increase your chances of success significantly. These are A and F, which complete the picture:

Architecture:

Feedback:

That completes the plan: the “ABCDF” recipe for delivering quality with speed. If any of the pieces is not ready, it may not work the way you expect it to.

Am I missing some other letter? What does your recipe look like?

Assembly Line vs. Release Pipeline

The manufacturing world has gone through many revolutions to reduce ‘lead time’, and the Release pipeline is an attempt to do the same in the software world.

I should not make things complex for those of you who are not very familiar with the intricacies of the manufacturing world, so let me keep it simple.

‘Lead time’ is the time from when you ask for something to when you get it. Let’s take the example of a restaurant, which belongs to the services world, but all of us have been through that experience. You give your order to the waiter serving your table, and the lead time begins. The waiter serves the order, and the lead time ends there. Usually either you ask, or the waiter gives you an estimated lead time depending upon what you ordered. This time includes the waiter conveying the order to the kitchen, the chef picking up pieces which are already prepared or not, assembling the order, doing the presentation, and the waiter collecting it and reaching your table. You must have also noticed that some restaurants maintain the same lead time regardless of how many customers they are serving, while some cannot manage it.

Reducing lead time is one of the main functions of management.

Okay, now back to the manufacturing world, where this term originated. It is the time from when you place an order to when it is delivered. This typically happens in a supply chain system, and you, as a customer, usually don’t experience it. But imagine that you visit a furniture shop, look at different sets on display, select one and order it. The lead time begins, and it ends when your furniture is delivered to your home after manufacturing/assembly.

The manufacturing world has managed to reduce lead time through many inventions, and the ‘assembly line’ is one of them. An assembly line starts with nothing, and pieces keep joining it until a finished product emerges. You might have seen fancy documentaries on how car manufacturing plants work, and I happened to visit air conditioning assembly lines myself to see how this works. Here is one such video on YouTube. If you see more machines than humans, beware: the same will happen in the software world.

Sorry for the long background, but I hope I didn’t lose you and you are still reading.

Now the same thing is happening in the software world. Your customer reports a bug or requests a new feature, and the ‘lead time’ begins. When the fixed bug or enhancement reaches production, the ‘lead time’ ends.

Remember, reducing lead time is one of the main functions of management.

One of the steps has been to introduce a Release pipeline, which consists of changing the code, building the code, executing tests, deploying to production in stages, and testing in production. The main thing in keeping an assembly line moving is to have the different pieces ready, the steps to integrate each piece into the whole product reproducible, and automated/skilled labor to do it and move it to the next station. In an assembly line, if a particular station is running slow, it becomes a bottleneck. Similar effort is required here: first set up the Release pipeline, then automate any manual steps that are slowing it down.

The picture is from this article.

For example, Microsoft Visual Studio Team Services (VSTS) defines a Release pipeline where artifacts are produced from different ‘Build’ definitions and then a ‘Release’ definition takes them to production. If you like, you can call them pieces being assembled together to build a product. Similarly, Jenkins has a ‘Pipeline’ concept where you can chain together any number of ‘Jobs’ in any combination to achieve the same. There are many tools selling the same concept.

As you can imagine, the manufacturing world has gained a lot of benefits by introducing assembly lines, and the software world is trying to do the same with release pipelines. Reducing lead time is obviously the biggest gain. Any step that stops the automated delivery in this pipeline raises alarms, and that has been a prime reason to do more and more automated testing, such that when one step is complete, self-tests can assure that the integration was complete and without bugs. This looks easy but is hard to implement. I plan to write in some future posts about why this is so tough, with some tips on achieving it as we go through this transformation.

Are you aware if your team has a release pipeline? Do you know what its steps are and who manages it? Do you think your team should move towards one?

 

Inadequate Software Testing Jeopardized Elections in Pakistan, Why?

(On the eve of the Pakistan General Elections 2018, everyone was anxiously waiting for the results when authorities announced that the newly deployed system to transmit and manage results had failed. Yes, the technology failed when it was most needed. Our regular contributor Sohail Sarwar takes a deeper look at what the problem was and, more importantly, how we can avoid such things happening again.)

Conducting elections in any country is a major political activity since it decides the future “reign-holders” of a nation. Election 2018 was conducted by the Election Commission of Pakistan (ECP) on a national scale (with 272 national and 676 provincial seats). A total of 12,570 candidates contested to win the hearts of 106 million Pakistani voters at 85,307 polling stations, to represent them till 2023.

In order to manage the election activity at such a massive scale, ECP for the first time deployed two systems, the Result Transmission System (RTS) and the Result Management System (RMS), for consolidating/compiling/tracking the election results promptly. RTS was made available as a mobile app, whereas RMS was installed on specific systems for the concerned election personnel.

However, RTS turned out to be a nightmare for field personnel on “D-Day”, i.e. 25 July 2018. It was revealed that mobile phones would hang when the RTS app was launched. The directed fix to this problem was removing the app, downloading it and re-installing it every time. Even when installing and launching worked fine, submitting the results (after attaching the images) made the application unresponsive.

Due to these issues, polling staff in the field could not submit the results as expected by ECP, which suffered a backlash from political entities due to the non-responsiveness of RTS. Consequently, the fairness and transparency of the elections were questioned, which could have ended up in chaos. The competence of eminent professionals managing world-class software products was also doubted. Why was this trouble caused? A number of reasons can be listed, but our answer is “Lack of Thorough Software Testing”.

Some of the apparent reasons that we can think of from the perspective of software quality are:

  • Deploying the systems to production without testing to ensure the end-to-end functional completeness of both components, i.e. RTS and RMS.
  • Pilot testing of RTS was done on a very limited test bed, i.e. only 2 constituencies. (https://www.ecp.gov.pk/frmGenericPage.aspx?PageID=3083). Obviously, an application verified only against the load generated by 300-400 polling stations would never endure the load generated by 80,000+ polling stations.
  • Testing of non-functional perspectives was ignored overall, such as configuration testing (for the variety of mobile devices and OS versions), scalability testing and performance testing (load testing and stress testing).
  • Load balancing to facilitate seamless downloads seemed to be missing, since anomalies were identified while registering the polling personnel (8 members were directed to register at one time from one constituency, which seemed impractical).
  • There was no dedicated test infrastructure where the applications could be hosted with the minimum bandwidth requirements available.
  • Testing from a cyber security perspective, so that situations like those reported in Western countries could be countered, seemed to be missing. We are not sure if this really applies to the system developed by ECP.
  • A relatively short span of time was kept for testing the products; such issues tend to prevail with half-baked applications.

Addressing these few points may lessen the chances of getting into “fire-fighting” situations like the one witnessed a few days ago.

Do you know more about the episode? Or would you like to share suggestions on avoiding such failures?

Making Testing Public

I shared a case study on building a trustworthy testing team by making all of its work public on the EuroSTAR Blog. The punch line is:

Everyone should be able to see Testing. Your in-laws included

Read the full article at EuroSTAR Huddle Blog

Triggered Testing

The third and last type of DevOps testing is Triggered Testing. In previous posts, I covered how we need testing jobs to be Scheduled and why some of them should be executable On Demand. Now let’s see how triggered jobs help in fast-paced delivery with Quality.

One view of DevOps is that it is an extension of Agile, taking it beyond just the Development phase and applying it to all phases. That means feedback on the usage of features keeps flowing continuously, requiring the Development team to make changes continuously.

As changes can come in at any time, a Scheduled job can only run tests for the changes since its last run. Depending upon the schedule, there can be a delay of 1 to 24 hours. To plug this gap, we can trigger testing for every change.

You guessed it right: these are the Continuous Integration builds that are triggered with every push a Developer makes to the ‘master’ branch. A typical CI build is: build the code, build the tests, run the tests, and hence it serves as Triggered Testing for the given change. All modern CI tools have this capability, e.g. in Microsoft TFS you can set the ‘Continuous Integration’ trigger and the build will run automatically for any push.

(picture taken from here)

Any commercial solution is complex enough that running all tests for every change can slow down the release process. Testing is meant to slow down the process to gauge Quality, but it should at least keep in sync with the release frequency, such as daily deployments or even more frequent ones. To curb this, you can either define a subset of tests that runs with the CI job, or, as I covered earlier, you can define “Test Impact Analysis” maps that store a source-code-to-test-code mapping. Thus you have the ability to run only the tests that correspond to a given change, to save time.

Another area where Triggered Testing comes in handy is that some types of tests are sensitive to certain areas and should only be executed if that piece of code changes. For example, we have a module which converts data from one format to another, and if it changes, we need to run all our Data Conversion tests. You are now thinking the DevOps way if you thought, “Oh, that’s right, but we should be able to run them automatically.” That “automatically” means triggering these tests if a condition is true. There are many ways to set up such conditions, most of which involve some scripting.
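To make this concrete, here is a minimal sketch of such a trigger, assuming a Mercurial repository and a generic test runner; the area-to-suite mapping, the paths and the run_tests.py command are made up for illustration and are not our actual setup:

```python
# Minimal sketch (hypothetical names): map source areas to test suites and
# trigger the matching suites when files under those areas change.
import subprocess

AREA_TO_SUITE = {
    "src/conversion/": "DataConversionTests",
    "src/reporting/": "ReportingTests",
}

def changed_files():
    # Files touched by the latest commit (Mercurial; a git equivalent works the same way).
    out = subprocess.run(["hg", "status", "--change", ".", "-n"],
                         capture_output=True, text=True, check=True)
    return out.stdout.splitlines()

def suites_to_trigger(files):
    return {suite for prefix, suite in AREA_TO_SUITE.items()
            if any(f.startswith(prefix) for f in files)}

if __name__ == "__main__":
    for suite in sorted(suites_to_trigger(changed_files())):
        # Hand over to whatever actually runs the tests for that suite.
        subprocess.run(["python", "run_tests.py", "--suite", suite], check=True)
```

The same idea can be wired into a CI job step or a repository hook; the scripting tool matters less than having the area-to-suite mapping written down somewhere.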

The other part of Triggered Testing comes with deployments. Testing happens at each stage of DevOps, as shown in this diagram, and tests execute after we deploy. We have the typical Dev, QA and Prod environments for our deployments, where changes first go to Dev, then tests are executed, and if they pass, the change is moved to QA. Currently we are not very confident about all our processes, so moving from QA to Prod requires manual intervention, but we are on the way to taking this away as well.

The Triggered tests that run on the deployed version are enabled through CD builds, or as TFS calls them, “Release Definitions”, where you can define which tests run after deployment.

Before I sum up, it is important to note that these three types of testing, or rather attributes of testing, should be played in harmony. Your strategy defines which tests are Triggered, Scheduled and On-Demand to come up with the best plan for your needs. And if you can give each type of test all of these attributes, e.g. Performance Testing is Scheduled but can also be Triggered or executed On-Demand, you are on your way to fast-paced delivery with much more confidence.

In what circumstances do you use Triggered Testing? What tools/techniques do you use to enable it?

3 Testers, 3 Stories

One of the benefits of getting older is that you can tell stories. No, not the stories you heard, but the stories you observed. So here are my three stories, based upon the lives of three real testers, though I am hiding their identities and making the stories more generic.

If you are wondering why you should read these stories, let me entice you: they’ll help you plan your career better.

And by the way, the title is borrowed from the famous Urdu magazine section called “teen auratein, teen kahaniaan” (3 women, 3 stories).

Meet Tester Alif. She started her career in software testing by accident, but in the first few years she really liked testing as a profession. She took pride in breaking software and stopping releases by reporting obnoxious bugs. As time passed, her excitement went away. She started to find the repetitive nature of testing kind of boring. She tried to reinvent herself by joining a new team or a new organization, but she never enjoyed testing as she used to in the early days.

Alif made a decision. She left testing and became a Programmer. At first she felt uncomfortable in this new role, and her old friends mocked her a lot. But after some time, she became comfortable. She had doubts that she would get bored with Programming as she had with Testing, but she didn’t. Many years have passed, and Alif is now an accomplished Programmer. Many people actually don’t know that Alif was once a Tester.

Now let’s review the life of Tester Bay. He chose software testing as a career because he had a knack for finding issues in even apparently unblemished work. He became an expert Black Box tester very quickly and got a reputation as someone who could find big bugs at will. He progressed nicely into a test-lead-like role and taught the skills of testing to his junior team members. As time passed, Bay started feeling relaxed, as if he knew every trick of the trade. He became more and more a person who managed technical stuff but did not do much technical stuff himself. He became dull and kind of useless, though he never realized it.

Bay has had some miserable years lately. He was laid off from one job, though he quickly got another. But within six months or so, the lack of depth in his skills was evident, and he was put on a project that is not very important. Bay thinks he is doing well and his job is safe, but anyone with a little understanding can predict a bleak future for Bay.

Let me introduce you to the third and last tester in this series. Tester Jeem became a tester by chance, as he applied for a Design job but was offered a testing job. He started it reluctantly, thinking he would soon quit. But he started to like testing. The fun of exploring new stuff, the spotlight he got for helping his team achieve excellence, and the confidence he got from understanding the internals of the software he tested made his job fun. He climbed the ladder and became a technical tester with a team working with him on key projects. He occasionally thought of switching careers to pursue his ambition of becoming a Designer, but he felt there is so much new stuff coming into the Testing profession that it can keep him moving for as many years as he can foresee.

Jeem also became an advocate for the testing profession. He started to tweet about it, joined Meetups and trainings, read lots of books/blogs and became a source of information for many testers around him. Jeem has decided to remain a Tester for the rest of his life.

That ends the three stories, folks. I know you were expecting something more dramatic, but I told you these are real stories.

Usually I like the notion that “the story is more important than the moral of the story”, but if you want one from the above, here you go:

Never Be like “Bay”. Always be like “Alif” or “Jeem”

Or to make it generic:

Do what you love. And if you don’t love it, quit it

Do these stories look familiar to you? Have you spent your life as Alif, Bay or Jeem?

On Demand Testing

In an earlier post, I explained how DevOps testing consists of three main types: Scheduled Testing, On-Demand Testing and Triggered Testing. I covered how we scheduled different types of testing in that same article, and now let me dig into the details of On-Demand Testing.

Keeping with the notion that “testing is an activity” performed continuously rather than a “phase” towards the end, it is necessary to configure Continuous Testing. That requires setting up automated testing jobs in a way that they can be called whenever needed. This is what we call On-Demand Testing. This is contrary to the common notion of having testers ready to do testing “on demand” as features become ready to test.

(the original photo is here)

For example, we have a set of tests that we call “Data Conversion tests”, as they convert data from our old file format to the new file format. As you can imagine, a lot of defects uncovered by such tests are data dependent, and a typical bug looks like “Data conversion fails for this particular file due to this type of data”. Now, as a Developer takes on that defect and fixes this particular case, she’d like to be sure that the change hasn’t affected the other datasets in the conversion suite of tests. The conversion testing job is set up in a way that a Developer can run it at any given time.

I shared earlier that we are using Jenkins for scheduled testing. So one way for an individual team member to run any of the jobs on demand is to push a set of changes, log into the Jenkins system and start a build of an automated testing job, say Conversion Testing for the above example. This is good, but it might be too late since the changes have already been pushed. Secondly, the Jenkins VMs are set up in a location separate from where the Developer is sitting, and the feedback time can be anywhere between 2-3 hours.

Remember, tightening the feedback loop is one of the main goals of DevOps. The quicker we can let a Developer know that the changes work (or don’t work), the better we are positioned to release more often with confidence.

So in this case, we exploited our existing build scripts, which are written in Python and driven by a set of XML files that define parts which can be executed by any team member. We added a shortened dataset that has enough diversity yet is small enough to run within 15-20 minutes. Then we added a part to the XML file that can do this conversion job on any box, at any time, by anyone in the team.

Coming back to the same Developer fixing a Conversion defect: after fixing the bug, she can now run the above part on her own system. Within half an hour, she’ll have results, and if they look good, she can push the changes with confidence that the next round of Scheduled Testing with the larger dataset will also pass.
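To give a rough feel for what such an on-demand job looks like, here is a sketch; the dataset location, the file extensions and the convert() placeholder are invented for illustration, and our real job runs through the XML parts mentioned above rather than a standalone script:

```python
# Sketch of an on-demand conversion test runner over a small, diverse dataset.
# Everything here (paths, extensions, the convert step) is illustrative.
import argparse
import pathlib
import sys

def convert(src: pathlib.Path, dst: pathlib.Path) -> None:
    # Placeholder for the real old-format -> new-format conversion.
    dst.write_bytes(src.read_bytes())

def run(dataset: pathlib.Path, out_dir: pathlib.Path) -> int:
    out_dir.mkdir(parents=True, exist_ok=True)
    failures = 0
    for src in sorted(dataset.glob("*.old")):
        try:
            convert(src, out_dir / (src.stem + ".new"))
            print(f"PASS {src.name}")
        except Exception as exc:   # any conversion error counts as a failure
            print(f"FAIL {src.name}: {exc}")
            failures += 1
    return failures

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run Data Conversion tests on demand")
    parser.add_argument("--dataset", type=pathlib.Path, default=pathlib.Path("testdata/short"))
    parser.add_argument("--out", type=pathlib.Path, default=pathlib.Path("converted"))
    args = parser.parse_args()
    sys.exit(1 if run(args.dataset, args.out) else 0)
```

The point is simply that anyone on the team can invoke it locally without waiting for Jenkins.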

Please note that we have made most of our testing jobs available On-Demand, but we are having a hard time with a few. One of them is Performance Testing, because it is done on a dedicated machine in a controlled environment for consistency of results. Let’s see what we can do about it.

Do you have automated testing that can be executed On-Demand? How did you implement it?

Pakistan Software Quality Conference 2018

Doing something for the first time is very hard. But doing it again and again with the same energy and passion is even harder. That’s why, as we witnessed a very successful Pakistan Software Quality Conference (PSQC’18) on April 7th, we felt even more accomplished than after the first edition, PSQC’17.

Last year it was held in Islamabad, and this year the biggest IT event in terms of Quality Professionals’ attendance moved to the cultural capital of Pakistan: Lahore. Beautifully decorated in a national color theme, the main auditorium of the FAST NU Campus saw 200+ amazing people join us from various cities across Pakistan.

After recitation of the Holy Quran, event host Sumara Farooq welcomed the audience and invited PSTB President Dr. Muhammad Zohaib Iqbal for the opening note. Dr. Zohaib recapped the journey of this community-building event and shared demographics of the audience. He emphasized the need for everyone to act upon the Conference theme, “Adapting to Change”.

We had two quick keynote sessions in the first half. The first was on “Software Security Transformations” by Nahil Mahmood (CEO, Delta Tech), who spoke about the grave realities of the current software security situation in Pakistan. He then urged everyone to take part in making every piece of software secure enough in accordance with various industry standards. Read more about those in the PDF: PSQC18_NahilMahmood_SoftwareSecurityTransformation.

The second talk was on “Quality Engineering in Industry 4.0” by Dr. Muhammad Uzair Khan (Managing Director, Quest Lab), who first explained the notion of Industry 4.0. He envisioned a future where systems are tested in a more and more automated way, with exploratory manual testing moving into the background. He also rightly cautioned that any prediction of the future is a tricky business.

A few honorable guests then spoke at the event, including Professor Italo, Honorary Consul of Portugal Feroz Iftikhar and the HOD of the FAST NU CS Department. Shields were then presented to sponsors of the event by Dr. Zohaib and myself. Contour Software was represented by Moin Uddin Sohail and Stewart Pakistan (formerly CTO24/7) by their HR Head, Afsheen Iftikhar.

A tea break was now needed to refresh participants for the more technical stuff which was coming their way. This time was well utilized by all to meet strangers who quickly became friends.

(More photos covering the event are coming soon on our Facebook page)

The second session had five back-to-back talks:

  • “Performance Testing… sailing on unknown waters” by Qaiser Munir, Performance Test Manager from Stewart, in which, after giving some definitions, he shared a case study on how a specific client was happy with the insights it got from Performance Testing. Full slides here: PSQC18_QaiserMuneer_PerformanceTesting
  • “Agile Test Manager – A shift in perspective” by Ahmad Bilal Khalid, Test Manager from 10Pearls, who travelled from Karachi for the event. ABK, as he likes to be called, recalled his own transformation from a traditional Test Manager to a Test Coach, who is more of an enabler. His theme of experienced Testers becoming dinosaurs and not helping new ones learn new stuff hit home and resulted in quite a fruitful discussion. Read more here: PSQC18_AhmadBilalKhalid_TestManager-ChangingTimes
  • “Agile Regression Testing” by Saba Tauqir, Regression Team Lead from Vroozi, who shared her current work experience where they have a dedicated team for Regression Testing. This also sparked a debate within the audience as to how much Regression Testing can be sustained in Agile environments. See her talk here: PSQC18_SabaTquqir_RegressionTesting
  • “To be Technical or not, that is the question” by Ali Khalid, Automation Lead at Contour Software, which was perhaps the star talk of the day. He took up the story of a hypothetical tester, “Asim”, and how he became a Technical Tester through four lessons. Easing up the learning with some funny clips and GIFs, Ali gripped the audience and conveyed his message strongly, which included building an attitude towards designing algorithms and enjoying solving problems. Full slides here: PSQC18_AliKhalid_ToBeTechnical
  • “Power of Cucumber” by Salman Saeed, Automation Lead from Digitify Pakistan, who talked about his journey towards automation through that very tool. He explained its different features, the Gherkin language, and the tools needed to run it, and shared a piece of code showing a sample Google search test case. He urged everyone to use powerful tools like Cucumber to begin their automation journeys. He also promised to share code with whoever contacts him, so feel free to bug him. His slides are here: PSQC18_SalmanSaeed_PowerofCucumber

A delicious lunch was waiting in the cafeteria, which was basically an excuse to learn from each other while enjoying the food. I could see many people catching up with speakers to ask their follow-up questions, and some healthy conversations around them.

The audience was welcomed back with three more talks in the afternoon session:

  • “Distributed Teams” by Farah Gul, SQA Analyst at Contour Software, another speaker from Karachi. She first explained how different locations and time zones create the challenge of working together as a team. She shared some real examples of how marketing campaigns failed in foreign countries due to language barriers. At the end she suggested some ways to curb these challenges, which included understanding the culture, spending more time in face-to-face communication and asking for clarity. Slides are here: PSQC18_FarahGul_DistributedTeams
  • “Backend Testing using Mocha” by Amir Shahzad, Software QA Manager at Stella Technology, who started off his talk with the ever-rising need for testing backends. He explained how RESTful APIs can be tested using Mocha, with some sample code. He also mentioned other libraries that can be used for better assertions and for publishing HTML reports. His talk is here: PSQC18_AmirShahzad_Mocha
  • “ETL Testing” by Arsala Dilshad, Senior SQA Engineer at Solitan Technologies, who shared her first-hand experience of testing ETL solutions. After providing an overview of her company’s processes, she explained how data quality, system integration and other testing are needed to provide a quality solution. Read more details here: PSQC18_ArsalaDilshad_ETL Testing

Then came the best part of the day. We experimented with a new segment called “My 2 cents in 2 minutes”, which invited participants to come on stage and share any challenge they are facing in their profession. Inspired by the 99-second talks at TestBash, this proved to be a marvelous way to engage the audience. Around 20 awesome thoughts were presented by Quality Professionals on a variety of topics. I plan to write some follow-up posts on some of the stuff that was brought up there, as it would be unjust to sum it up here in a few lines.

Another tea break was needed to defeat the afternoon nap, and seeing samosas (and other eatables) being served with tea resulted in many happy faces.

We were then back for the final and perhaps the best talk of the day: “Melting Pot of Emotional and Behavioral Intelligence” by Muhammad Bilal Anjum, Practice Head QA & Testing at Teradata, who has more than a decade of experience in Analytics. Bilal gave some examples of how the current situation of a potential customer can be predicted from the available data. For example, telco data combined with healthcare and other sources can be used to predict how likely a person is to buy some health solution. He then explained how culture plays a key role in human behavior and why industry consultants are in demand for jobs like these. At the end he threw out some ideas on how such solutions can be tested.

With all talks finished, I was asked to close the day as the General Chair of PSQC’18. I took the opportunity to thank sponsors, partners, organizers (Qamber Rizvi, Salman Saeed, Adeel Shoukat, Ali Khalid, Mubashir Rashid, Muhammad Ahsan, Amir Shahzad, Ayesha Waseem, Salman Sherin, Ovyce Ashraph), speakers and the audience, and made a point of mentioning how collaboration can produce results which can never be surpassed by individuals. To spark some motivation for participants to try out the wonderful ideas presented during the day, I used Munnu Bhai’s famous Punjabi poem with the punch line “Ho naheen jaanda, karna painda aye” (It never happens by itself, you have to do it).

Long live the Pakistan Software Quality community. We’ll be back with more events through the year and, yes, PSQC’19 in the spring of next year!

 

Scheduled Testing

To have a constant heartbeat of releases, testing has to take center stage. It can no longer be an activity performed at the end of the release cycle; rather, it has to happen during all phases of a release. And mind you, the release cycle has shrunk from months to days.

In the book “Effective DevOps”, the authors lay out many ways of making an effective plan to move towards a DevOps culture. On automation, they suggest that automation tools belong to one of the following three types:

  1. Scheduled Automation: This tool runs on a predefined schedule.
  2. Triggered Automation: This tool runs when a specific event happens…
  3. On-Demand Automation: This tool is run by the user, often on the command line…

(Page 185, under the “Test and Build Automation” section)

The way we took this advice to ramp up our efforts for Continuous Testing is that each type of testing we perform should be available in all three forms:

  1. Scheduled Testing: Daily or hourly tests to ensure that changes during that period were merged successfully and that there are no disruptions from the recent changes.
  2. Triggered Testing: Testing that gets triggered by some action. For example, a CI job that runs tests, executed due to a push by a Developer.
  3. On-Demand Testing: Testing that is executed on an as-needed basis. A quick run of tests to find out how things are on a certain front.

Take Performance Testing, for example. It should be scheduled to find issues on a daily or weekly basis, but it could also be triggered as part of the release criteria, or it could be run On-Demand by an individual Developer on her box.

In order to achieve this, we redefined our testing jobs to allow all three options at once. As the idea was to use tools this way, we picked Jenkins.

There are other options too, like GoCD, and Microsoft Team Foundation Server (TFS) is also catching up, but Jenkins has the largest set of available plugins for a variety of tasks. Also, our prime use case was to use Jenkins as an Automation Server, and we have yet to define delivery pipelines.

(the original icon is at: http://www.iconarchive.com/show/plump-icons-by-zerode/Document-scheduled-tasks-icon.html )

I’ll write separately on Triggered and On-Demand Testing soon; below are some details on how we accomplished Scheduled Testing.

Before:

We had a few physical and virtual machines, on which we were using Windows Task Scheduler to run tasks. A task would kick off on a given day and time and trigger a Python script. The Python scripts were in a local Mercurial repository hosted on one of these boxes.

The testing jobs were scheduled perfectly, but the schedule and outcome of these jobs were not known to the rest of the team. Only the testing team knew when these jobs ran and whether the last job was successful or not.
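For flavor, the kind of entry point such a scheduled task might invoke could look like the sketch below; the runner, suite name and log file are made up, and the point is that its outcome lands in a log on that one box, visible to nobody else:

```python
# Sketch of a scheduled test entry point that a Task Scheduler task (or, later,
# a Jenkins job) could invoke. Names of the runner and log file are illustrative.
import datetime
import pathlib
import subprocess
import sys

LOG = pathlib.Path("nightly_tests.log")

def main() -> int:
    started = datetime.datetime.now().isoformat(timespec="seconds")
    result = subprocess.run([sys.executable, "run_tests.py", "--suite", "nightly"])
    with LOG.open("a") as log:
        log.write(f"{started} {'PASS' if result.returncode == 0 else 'FAIL'}\n")
    return result.returncode

if __name__ == "__main__":
    sys.exit(main())
```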

After:

We made one of the boxes the Jenkins master and the others slaves. We configured Jenkins jobs and defined the schedules there. We also moved all our Python scripts to a shared Mercurial repository on a server that anyone could access. We also created custom parts in our home-grown build system that allow running pieces in series or in parallel.

Given that Jenkins gives you a web view which can be accessed by all, the testing schedule became visible to everyone. We did have a “Testing Dashboard” before, but it took effort to keep it up to date. Now anyone in the team can see how the last few jobs of, say, Performance Testing went and what the results were.

Moving to Jenkins and making our scripts public also helped us make the same set of tests Triggered and On-Demand. More details on how in coming posts.

I wish I could show “Before” and “After” pictures, like many marketing campaigns do, to show how beautiful it now looks.

Do you have Scheduled Testing in place? What tools do you use and what policies do you apply?

Only test changes

We are living in the ‘survival of the fastest’ era. We don’t have time for anything. We prefer reading blogs instead of books, and we look for tweets rather than lengthy press releases. So when it comes to testing a release that has only a few changes, we don’t have time to run all the tests.

The question, though, is: which subset of tests should we be running?

I have touched on this subject in Test small vs. all, but looking at build change logs and picking tests to run is a task that requires decision making. What if we could know the changes automatically and run tests based upon that?

That is possible through TIAMaps. No, this term is not mine, though part of it is. It originates from Microsoft’s concept of ‘Test Impact Analysis’, which I got to know from this blog post on Martin Fowler’s site. I’d recommend reading it first.

If you are lazier than me and couldn’t finish the whole post, below is a summary along with a picture copied from there:

First you determine which pieces of your source code are touched by each of your tests, and you store this information in some sort of map. Then, when your source code changes, you get the tests to run from the map and run just those tests.

Below is a summary of the TIAMap implementation in our project.

Why we needed it:

We didn’t do it for fun or out of “let’s do something shiny and new”. We are running out of time. Our unit test suite has around six thousand tests, and a complete run (yes, they run in parallel) takes about 20 minutes. Hmmm… a little change that needs to go out has to go through 20 minutes of unit test execution; that’s bad. Let’s see what others are doing. Oh yeah, Test Impact Analysis is the solution.

Generating TIA Maps

Code coverage comes to the rescue. If we already have a tool that finds out which lines of code are touched by all tests, can’t we get a list of source files that are touched by a single test?

So we configured a job that runs the tests and saves this simple map: test name -> source file names (a rough sketch of the idea follows the lessons below). There were two lessons that we learned:

  1. Initially, we had a job that would run for all 6,000 tests, and it was taking days. We became smarter: after generating the first TIA Map for all tests, we only update the maps for the tests that changed. We don’t have a way to find the test names that changed, so our job is based upon the timestamps of the files that contain test code.
  2. We were storing the map in a SQLite database. As the database had to be pushed to our repository again and again, it was difficult to see the deltas of change. We switched to a simple text file to store the map. Changes can be seen in our source control tools, and anyone can look at those text files for inspection.
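For illustration only, here is one way the generation step could be sketched using coverage.py and pytest; our actual implementation differs, and the hard-coded test list, source paths and one-line-per-test text format below are assumptions:

```python
# Illustration only: build a TIA map by running each test on its own under
# coverage.py and recording which source files it executed. Test discovery,
# paths and the one-line-per-test text format are assumptions, not our exact setup.
import json
import subprocess
from pathlib import Path

TESTS = [
    "tests/test_conversion.py::test_basic",    # in practice these are discovered, not hard-coded
    "tests/test_conversion.py::test_unicode",
]

def files_touched_by(test_id):
    subprocess.run(["coverage", "run", "--source=src", "-m", "pytest", "-q", test_id],
                   check=False)
    subprocess.run(["coverage", "json", "-o", "coverage.json"], check=True)
    report = json.loads(Path("coverage.json").read_text())
    return sorted(report["files"])             # file paths executed by this one test

if __name__ == "__main__":
    with open("tia_map.txt", "w") as tia_map:
        for test in TESTS:
            # One line per test: "<test id> -> <comma-separated source files>"
            tia_map.write(f"{test} -> {','.join(files_touched_by(test))}\n")
```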

Running Tests

As you can imagine, the hard part is getting those TIAMaps. Once we have them, we do the following:

  • When there is a need to run tests, we determine which source files have changed since the last run.
  • We have a Python script that does the magic of consulting the maps and producing a list of tests to be executed (a simplified sketch follows this list).
  • We feed that list of tests to our existing test execution program.
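Here is a simplified sketch of that lookup, matching the hypothetical map format above; Mercurial provides the changed files, and pytest stands in for our actual test execution program:

```python
# Illustration: given the source files changed since the last run, consult the
# text map and run only the matching tests. Map format matches the sketch above.
import subprocess
from pathlib import Path

def changed_files():
    # Modified files in the working copy (Mercurial); any "changed since last run"
    # mechanism can be substituted here.
    out = subprocess.run(["hg", "status", "-mn"], capture_output=True, text=True, check=True)
    return set(out.stdout.splitlines())

def tests_to_run(map_file="tia_map.txt"):
    changed = changed_files()
    selected = []
    for line in Path(map_file).read_text().splitlines():
        test_id, _, sources = line.partition(" -> ")
        if set(sources.split(",")) & changed:
            selected.append(test_id)
    return selected

if __name__ == "__main__":
    tests = tests_to_run()
    if tests:
        # Feed the selection to the existing test execution program (pytest here).
        subprocess.run(["pytest", "-q", *tests], check=False)
    else:
        print("No impacted tests for the current changes.")
```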

How is it going?

It is too early to say, as we have rolled this out as a pilot, and I may have more insights into the results in a few months. But the initial feedback indicates that we are on the right path. Time is being saved big time, and we are watching for any issues that may arise due to faulty maps or execution logic.

Have you ever tried anything similar? Or would you like to try it out?