This page gathers all the information necessary to reproduce the experiment conducted for an article submitted to ICSME 2014.
Functional testing requires executing particular sequences of user actions. Test automation tools enable scripting user actions such that they can be repeated more easily. Selenium, for instance, enables testing web applications through scripts that interact with a web browser and assert properties about its observable state. However, little is known about how common such tests are in practice. We therefore present a cross-sectional quantitative study of the prevalence of Selenium-based tests among open-source web applications, and of the extent to which such tests are used within individual applications. Automating functional tests also brings about the problem of maintaining test scripts. As the system under test evolves, its test scripts are bound to break. Even less is known about the way test scripts change over time. We therefore also present a longitudinal quantitative study of whether and for how long test scripts are maintained, as well as a longitudinal qualitative study of the kind of changes they undergo. To the former's end, we propose two new metrics based on whether a commit to the application's version repository touches a test file. To the latter's end, we propose to categorize the changes within each commit based on the elements of the test upon which they operate. As such, we are able to identify the elements of a test that are most prone to change.
We use a Ruby script based on Selenium to scrape GitHub. This first script stores the hits in an SQLite database [143927 hits, 70 MB], which contains basic information for each hit such as repository URL, file URL, keyword and targeted language. Our second script uses a wrapper around the GitHub search API to collect repository-wise information such as size, creation date, number of contributors, etc. Those results are stored in another SQLite database [7977 repositories, 200 MB].
Regardless of language, 580 of the best Git repositories from the coarse database were cloned.
A JS script and its dependencies count the number of Java files and lines of code (LOC), as well as the number of Java Selenium files and LOC, and store this information in mine.sqlite.
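The counting step can be sketched as follows. This is only an illustration, not the original script: the rule that a Java file is a Selenium file when it imports from the `org.openqa.selenium` package, the definition of LOC as non-blank lines, and the names `countLoc` and `summarize` are all assumptions made for the example.

```javascript
// Assumed heuristic: a Java file counts as a Selenium file when it
// imports something from the org.openqa.selenium package.
const SELENIUM_IMPORT = /^\s*import\s+(static\s+)?org\.openqa\.selenium\b/m;

// Assumption: LOC means non-blank lines; the original script may differ.
function countLoc(source) {
  return source.split(/\r?\n/).filter((line) => line.trim() !== "").length;
}

// files: array of { path, source }, already read from a cloned repository.
// Returns the four counters mentioned in the text for that repository.
function summarize(files) {
  const stats = { javaFiles: 0, javaLoc: 0, seleniumFiles: 0, seleniumLoc: 0 };
  for (const { path, source } of files) {
    if (!path.endsWith(".java")) continue; // ignore non-Java files
    const loc = countLoc(source);
    stats.javaFiles += 1;
    stats.javaLoc += loc;
    if (SELENIUM_IMPORT.test(source)) {
      stats.seleniumFiles += 1;
      stats.seleniumLoc += loc;
    }
  }
  return stats;
}
```

Keeping the function free of file-system and database access makes it easy to check on small inputs; the resulting counters would then be written as one row per repository.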
Of the initial 580 repositories, 287 contain at least one Java Selenium file.
Note that it would have been more efficient to discard repositories that do not contain any Java hit in scrap.sqlite.
According to their documentation, the 287 repositories were manually classified into five categories: (1) web services, (2) web frameworks, (3) code examples, (4) Selenium extensions, (5) miscellaneous.
This categorization is stored in a CSV file.
An R script was used to generate the scatter plots presented in the article.
Computationally expensive metrics are computed on the 8 best projects among the 287 from the previous section.
The first JS script detects commits that affect Selenium files; it uses the same dependencies as mine.js.
The first SQLite database contains information about Selenium commits, such as commit dates and blob OIDs.
The second JS script detects renames (merging an addition-deletion pair into a modification) and regenerations (splitting a modification into an addition-deletion pair).
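A minimal sketch of such a correction pass is given below. The page does not say how related and unrelated file versions are told apart, so the line-based Jaccard similarity, the 0.5 threshold, the shape of the change records and the names `similarity` and `correct` are assumptions chosen for illustration only.

```javascript
// Assumed measure: Jaccard index over the sets of lines of two versions.
function similarity(a, b) {
  const left = new Set(a.split("\n"));
  const right = new Set(b.split("\n"));
  let common = 0;
  for (const line of left) if (right.has(line)) common += 1;
  return common / (left.size + right.size - common);
}

// changes: the changes of ONE commit, as
// [{ type: "add" | "delete" | "modify", path, before, after }]
// where before/after hold file contents (null when the file is absent).
function correct(changes, threshold = 0.5) {
  const adds = changes.filter((c) => c.type === "add");
  const dels = changes.filter((c) => c.type === "delete");
  const result = [];
  // Regeneration: split a modification whose versions are too dissimilar.
  for (const c of changes.filter((c) => c.type === "modify")) {
    if (similarity(c.before, c.after) >= threshold) {
      result.push(c);
    } else {
      result.push({ type: "delete", path: c.path, before: c.before, after: null });
      result.push({ type: "add", path: c.path, before: null, after: c.after });
    }
  }
  // Rename: merge each deletion with the most similar unused addition.
  const used = new Set();
  for (const d of dels) {
    let best = null;
    for (const a of adds) {
      if (used.has(a)) continue;
      const score = similarity(d.before, a.after);
      if (score >= threshold && (best === null || score > best.score)) {
        best = { a, score };
      }
    }
    if (best === null) {
      result.push(d); // genuine deletion
      continue;
    }
    used.add(best.a);
    result.push({
      type: "modify",
      path: best.a.path,
      oldPath: d.path,
      before: d.before,
      after: best.a.after,
    });
  }
  // Additions that were not matched to any deletion stay additions.
  for (const a of adds) if (!used.has(a)) result.push(a);
  return result;
}
```

Additions and deletions produced by splitting a regeneration are deliberately not fed back into the rename matching, otherwise the pair would immediately be compared again and rejected for the same reason.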
The second SQLite database is the corrected version of metric.sqlite and is used to compute both the commit and the life metrics.
The first JS script of the commit branch simply summarizes the information contained in metric2.sqlite into a new SQLite database.
The second JS script of the commit branch merges successive Selenium and non-Selenium commits to create Selenium and non-Selenium spans.
These spans are stored in an SQLite database.
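The merging of successive commits into spans can be sketched as a single pass over the commit history. The record shape (`date`, `selenium`) and the function name `toSpans` are hypothetical; the original script reads its input from the database described above.

```javascript
// commits: chronologically ordered [{ date, selenium }] where `selenium`
// is true when the commit touches at least one Selenium file.
function toSpans(commits) {
  const spans = [];
  for (const commit of commits) {
    const last = spans[spans.length - 1];
    if (last && last.selenium === commit.selenium) {
      // Same kind as the previous commit: extend the current span.
      last.end = commit.date;
      last.commits += 1;
    } else {
      // Kind changed (or first commit): open a new span.
      spans.push({
        selenium: commit.selenium,
        start: commit.date,
        end: commit.date,
        commits: 1,
      });
    }
  }
  return spans;
}
```

For example, a history of two Selenium commits followed by three non-Selenium commits and one more Selenium commit yields three spans of sizes 2, 3 and 1.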
The second metric is centered around Selenium tests instead.
The JS script of the life branch computes chains of related blobs that count as the life of a Selenium test.
Each Selenium life can be found in an SQLite database.
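One way to picture the chaining is shown below, assuming the corrected changes are available as chronologically ordered records linking an old blob OID to a new one. The record shape, the use of `null` to mark additions and deletions, and the name `toLives` are assumptions for this sketch, not the original implementation.

```javascript
// changes: chronologically ordered [{ date, oldOid, newOid }].
// oldOid === null marks an addition, newOid === null a deletion,
// anything else a modification from one blob to the next.
// Limitation of this sketch: two files with identical content share a
// blob OID and would be confused with each other.
function toLives(changes) {
  const open = new Map(); // latest blob OID -> life still in progress
  const lives = [];
  for (const { date, oldOid, newOid } of changes) {
    let life = oldOid === null ? undefined : open.get(oldOid);
    if (life === undefined) {
      // Addition, or a blob with no recorded history: start a new life.
      life = { blobs: [], born: date, died: null };
      lives.push(life);
    } else {
      open.delete(oldOid);
    }
    if (newOid === null) {
      life.died = date; // a deletion closes the chain
    } else {
      life.blobs.push(newOid);
      open.set(newOid, life);
    }
  }
  return lives;
}
```

A life whose `died` field is still `null` at the end corresponds to a test that is present in the last analyzed commit.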
The author cannot be held responsible for any harm that the content of this page may cause to your computer.
Moreover, the author does not guarantee that the scripts on this web page will run as is.
Due to unforeseen events during the experiment, the scripts have been slightly modified to process only parts of the corpora or alternative databases not present on this web page.
The following environments need to be installed on your machine to use the content of this page: (i) node.js with sqlite3, (ii) ruby, (iii) R.
If you encounter difficulties setting things up, you can contact the author at lachrist@vub.ac.be.