WS05: ECOOP 2002 Workshop on
Benchmarks for Empirical Studies in Object-Oriented Software Evolution

University of Malaga, Spain
Monday June 10, 2002



ORGANIZERS

Tom Mens (Vrije Universiteit Brussel, Belgium), Serge Demeyer (University of Antwerp, Belgium), Michael Godfrey (University of Waterloo, Canada), Kim Mens (Université Catholique de Louvain-la-Neuve, Belgium)

INTRODUCTION

Software evolution [1, 2] is the collection of all software development activities intended to generate a new software release from an earlier operational version. It encompasses both planned changes and unplanned phenomena.

The study of software evolution investigates when, how and why software changes over time. This is a very active research area, as witnessed by the annual international workshops on principles of software evolution (IWPSE), the conferences on software maintenance (ICSM and CSMR), and Wiley's international journal devoted to the topic. The evolution of object-oriented software also attracts wide interest, with researchers investigating a variety of techniques to deal with evolution in the presence of object-oriented constructs such as inheritance, polymorphism, frameworks, design patterns and refactorings. Unfortunately, there is still a lack of scientific validation of these evolution techniques.

A case study is an empirical investigation method that provides scientific evidence concerning the applicability of a given tool or technique to a concrete software system [3, 4]. During a case study, researchers monitor the effect of applying a certain technique to a given system (the subject system) and try to assess this effect both quantitatively and qualitatively. Case studies not only illustrate the applicability of the technique to a concrete system, but also allow us to compare results from different experiments and subsequently derive more general conclusions. However, in order to compare the results of different experiments, the subject systems must be selected carefully.

The goal of this one-day workshop is to identify and agree upon a number of subject systems that can be used as a "benchmark" for the scientific investigation of software evolution. The subject systems should be representative, i.e., each of them should involve typical kinds of evolution steps (e.g., extension, correction, adaptation), each in a different context (characterised by life-cycle, scale and team issues). An initial proposal for such a benchmark exists [5] and various researchers have reacted enthusiastically. During this workshop, we hope to iterate on and refine this proposal to reach consensus on such an object-oriented evolution benchmark, i.e., a standard selection of candidate software systems around which the software evolution community will build (incrementally, over time) a body of knowledge about their evolution.

We expect such a benchmark to be used to validate three kinds of techniques:
(a) retrospective study: verify whether a technique can reconstruct how and why a software system has evolved in the past;
(b) curative activity: verify whether a technique supports a given software evolution process (e.g., refactoring);
(c) predictive analysis: verify whether a technique may predict certain kinds of evolution based on the current state of the system (e.g., quality metrics such as evolvability, maintainability, extendibility, ...).

WORKSHOP PARTICIPATION

Solicited submissions. To ensure an active collaboration between the workshop participants, the call for participation is structured in a Q&A style. Instead of submitting a position paper, participants should provide partial answers to a number of tentative open questions. Participants may also propose new subject systems to be included in the benchmark, or suggest interesting experiments that could be set up. Additionally, participants are invited to pose new relevant questions (preferably with a motivation and a partial answer) that seem important to address. The most relevant questions will be incorporated in a tentative list that will be distributed before the workshop, so that all participants can review them and form an opinion in advance.

Submission format. To facilitate processing, submissions should be written in plain ASCII text (no pictures or special formatting) and should be no more than 1000 words in length. Submissions should be sent by e-mail to the submission address ecoopws5@plg.uwaterloo.ca, with the ASCII text of the submission inlined directly in the e-mail body. The e-mail body should also include the authors' names, addresses, and affiliations. If a submission is incomplete or unclear, the workshop organisers may ask the authors to revise it before the workshop.

During the workshop. Based on the answers gathered before the workshop, participants will discuss alternative views, further work out some partially answered questions, attempt to reach a consensus on one or more object-oriented software evolution benchmarks, and discuss some concrete experiments that could be set up.

Workshop report. Following the tradition of past ECOOP workshops, Springer-Verlag will publish the ECOOP 2002 Workshop Reader as an LNCS volume. This volume will include the report of this workshop, containing a synopsis of the workshop's discussions as well as any convergences of view reached during the workshop. The report will be written by the workshop organisers, in collaboration with the workshop participants.

OPEN QUESTIONS

A. Does it make sense to define a benchmark? What are the advantages and shortcomings of using a benchmark for studying software evolution? According to the workshop organisers, the benefit of a representative set of accessible object-oriented subject systems for studying evolution is that it becomes easier to replicate results and to compare different research techniques on the same subject systems (to find out how these techniques may complement or overlap each other). We feel that this makes a benchmark an ideal vehicle for exchanging information and experience concerning evolving software.

B. If a benchmark were available, would you use it to validate your own work? Why (not)? How?

C. Which characteristic attributes should be used to determine whether a given subject system makes a suitable candidate to be included in the benchmark? To define a benchmark, we need a clear idea of what to measure and what information about the subject systems is required to perform these measurements. Therefore, we need to define a list of subject system characteristics to serve as an instrument in selecting appropriate representatives. With such a list, we can assess whether the subject systems in the benchmark may serve as representatives for a wide range of software applications. However, care should be taken to ensure that the instrument is accurate. In particular, we should address the question of whether the list of characteristics is complete, because if it is not, the selected subject systems may not be representative. Equally important is the question of whether the list is minimal, because if it is not, we may have to select too many systems to cover all possibilities.
A first proposal for a list of characteristics was suggested in [5]. Our initial experience with the selection of subject systems based on this list suggests that the list of characteristics is not minimal but reasonably complete. We explicitly ask the workshop participants to propose improvements to this list.

D. How can we guarantee that potential subject systems are representative and replicable? In order for a subject system to make a meaningful candidate for inclusion in the benchmark, it should be representative and replicable.
Representative means that the subject system provides as much coverage as possible of all characteristics, so that it can be used as a representative for a wide range of evolving object-oriented software systems. This gives rise to a first subquestion: "Is a single benchmark sufficient, or do we need more than one benchmark?"
Replicable means that as much information as possible concerning the subject system should be freely available and accessible, so that any experiments performed on the system can be replicated. This includes source code, documentation, analysis and design, for all releases of the system. This gives rise to an interesting subquestion: "What kind of information is needed to replicate an experiment?" The answer to this question may depend on the particular research technique one envisions, since different techniques may use different information regarding the subject system. Therefore, a related subquestion is "What are the kinds of experiments we wish to perform using the subject systems, and which information do we require from the subject systems in order to be able to carry out the experiment?"

E. Which concrete subject systems make likely candidates for inclusion in the benchmark? A selection of subject systems meant to represent a wide range of evolving object-oriented software systems was suggested in [5]. However, the selected systems fall somewhat short: they are weak in the early life-cycle phases (little analysis or design documentation is available); their implementations are limited to Java, C++ and Smalltalk (no Ada or Eiffel, ...); and they cover few application domains (only networking, graphics, and software development). Therefore, we explicitly ask the workshop participants to point out other systems that can help us provide better coverage of all the characteristics.

F. Which subject systems are beyond the scope of the benchmark that we aim to define? Several classes of systems are not discussed in this proposal, including embedded and real-time systems, games, scientific computation packages, and even websites (which are clearly software systems, but quite different in flavour). Including all these classes of systems in the benchmark would be too ambitious. However, it might be feasible to define a separate benchmark for each class (e.g., a benchmark for studying the evolution of real-time systems).

REFERENCES

[1] D. Perry. Dimensions of Software Evolution. Invited keynote paper, Proc. Int. Conf. Software Maintenance, Victoria, British Columbia, September 1994.
[2] M. M. Lehman and J. F. Ramil. Software Evolution. Invited keynote paper, Proc. Int. Workshop on Principles of Software Evolution, Vienna, Austria, September 2001. (Revised and extended version of an article to appear in J. Marciniak (ed.), Encyclopedia of Software Engineering, 2nd ed., Wiley, 2002.)
[3] N. Fenton and S. L. Pfleeger. Software Metrics: A Rigorous and Practical Approach, 2nd ed. International Thomson Computer Press, 1997.
[4] M. V. Zelkowitz and D. R. Wallace. Experimental Models for Validating Technology. IEEE Computer, pp. 23-31, May 1998.
[5] S. Demeyer, T. Mens and M. Wermelinger. Towards a Software Evolution Benchmark. Proc. Int. Workshop on Principles of Software Evolution, Vienna, Austria, September 2001. ACM Press, 2002.

ABOUT THE ORGANIZERS

Tom Mens has been a postdoctoral fellow of the Fund for Scientific Research - Flanders (Belgium) since October 2000. He is associated as a computer science researcher with the Programming Technology Lab of the Vrije Universiteit Brussel, where he finished his PhD on "A Formal Foundation for Object-Oriented Evolution" in September 1999. In 1998 he was part of the ECOOP Organizing Team, and he co-organised the ECOOP 2001 workshop on Object-Oriented Architectural Evolution. His main research interest lies in the use of formal techniques for improving support for software evolution, and he has published several papers on this topic. In the EMOOSE programme (European Masters in Object-Oriented Software Engineering), jointly organised by the Vrije Universiteit Brussel (Belgium) and the Ecole des Mines de Nantes (France), he gives an advanced course on object-oriented software evolution. Finally, he is co-founder and coordinator of the Scientific Research Network on Foundations of Software Evolution.

Serge Demeyer is a professor at the Department of Mathematics and Computer Science of the University of Antwerp. His main research interest concerns software engineering (more precisely, reengineering in an object-oriented context), but for historical reasons he maintains a strong interest in hypermedia systems as well. He is an active member of the corresponding international research communities, serving on various conference organization and program committees. He is currently writing a book entitled "Object-Oriented Reengineering" and was the main editor of the ECOOP'98 Workshop Reader. He was co-organiser of three ECOOP workshops on Object-Oriented Architectural Evolution. He has written a considerable number of peer-reviewed articles, some of them in highly respected scientific journals. He completed his M.Sc. in 1987 and his Ph.D. in 1996, both at the Vrije Universiteit Brussel. After his Ph.D., he worked for three years at the University of Bern in Switzerland, where he served as a technical coordinator of a European research project.

Michael Godfrey is an assistant professor in the Department of Computer Science at the University of Waterloo in Waterloo, Ontario, Canada. He holds an NSERC Industrial Research Chair in Telecommunications Software Engineering, sponsored by Nortel Networks, NSERC, and the University of Waterloo. Prior to joining the University of Waterloo, he earned his PhD at the University of Toronto, and was subsequently a faculty member at Cornell University. Currently, he is a member of the Software Architecture Group (SWAG) and is also the director of the software engineering laboratory. His research interests include software evolution, patterns of software change, software architecture, program comprehension, software visualization, and software engineering education.

Kim Mens obtained the degrees of Licentiate in Mathematics, Licentiate in Computer Science and Doctor in Computer Science at the Vrije Universiteit Brussel. In October 2000, he obtained his PhD on "architectural conformance checking" while being assigned to an industrial research project with Getronics, funded by the Belgian government. After his PhD he became a post-doctoral assistant at the VUB, before starting as a computer science professor at the Université Catholique de Louvain-la-Neuve in September 2001. In addition to his current interest in "declarative meta-programming", he is one of the founding fathers of the "reuse contract" technique for automatically detecting conflicts in evolving software. He also has a strong interest in "aspect-oriented programming" and actively participated in the organisation of several workshops and conferences on that subject.


This workshop is an official activity of the Scientific Research Network on "Foundations of Software Evolution", and is partially financed by the Fund for Scientific Research - Flanders (Belgium).