Capita Selecta of SE

Capita Selecta of Software Engineering

Machine Learning for Software Engineering

The goal of the course is to analyze large amounts of software engineering data using data mining techniques to uncover interesting and actionable information about software systems and projects. We use modern tools and techniques to mine this data, discuss the associated challenges, and outline future research directions.

Examples include bug prediction using classifiers, search-based software engineering, and pattern mining of Git repositories.
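
To make the first example concrete, the sketch below trains a small logistic regression classifier that flags files as bug-prone from three metrics. It is only an illustration and not part of the course material: the metric values, the labels, and all function names are hypothetical, and it uses only the Python standard library, whereas a real study would use an established machine learning framework and a proper train/test split.

```python
import math

# Hypothetical per-file metrics: (lines of code, cyclomatic complexity, past changes).
# Label 1 = a bug was later reported for the file, 0 = no bug reported.
# All values are made up for illustration.
FILES = [
    ((120, 3, 2), 0), ((200, 5, 1), 0), ((80, 2, 0), 0), ((150, 4, 3), 0),
    ((900, 25, 14), 1), ((1200, 30, 20), 1), ((700, 18, 9), 1), ((1000, 22, 16), 1),
]


def min_max_scaler(rows):
    """Return a function that rescales each metric using the ranges seen in rows."""
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]

    def scale(row):
        return [(v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(row, lows, highs)]

    return scale


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Linear score of one scaled metric vector."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def train_logistic_regression(xs, ys, learning_rate=0.5, epochs=2000):
    """Fit weights and bias with plain batch gradient descent on the log loss."""
    weights = [0.0] * len(xs[0])
    bias = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(score(weights, bias, x)) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(model, x):
    """Return 1 (bug-prone) if the predicted probability is at least 0.5, else 0."""
    weights, bias = model
    return 1 if sigmoid(score(weights, bias, x)) >= 0.5 else 0


scale = min_max_scaler([metrics for metrics, _ in FILES])
xs = [scale(metrics) for metrics, _ in FILES]
ys = [label for _, label in FILES]
model = train_logistic_regression(xs, ys)

# Classify two unseen (hypothetical) files: one large and complex, one small and simple.
large_file = predict(model, scale((1100, 28, 18)))
small_file = predict(model, scale((100, 2, 1)))
print("large file bug-prone:", large_file, "| small file bug-prone:", small_file)
```

The metrics are rescaled before training so that lines of code, which has the largest raw values, does not dominate the gradient updates.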

In the final lectures, we study recent research to understand how the mining of software repositories is evolving. For these lectures, the students will prepare a presentation on which they will be graded.

The goal of the practical sessions is to apply and extend state-of-the-art methodologies and tools to real software projects. Students will be graded on three assignments in which they will extend state-of-the-art frameworks.

Table of Contents

| Lecture | Reading List | Slides |
| --- | --- | --- |
| Introduction to the Course | How to Read an Engineering Research Paper; Future of Mining Software Archives: A Roundtable | |
| Bug Prediction: Product Metrics | Evaluating defect prediction approaches: a benchmark and an extensive comparison | product metrics.pdf |
| Bug Prediction: Process and Developer-based Metrics | A Developer Centered Bug Prediction Model | process metrics.pdf |
| Bug Prediction: Automatic Identification of Bug-Introducing Changes | When Do Changes Induce Fixes?; Automatic Identification of Bug-Introducing Changes | |
| Bug Prediction: The Choice of the Classifier | Revisiting the Impact of Classification Techniques on the Performance of Defect Prediction Models; Different classifiers find different defects although with different level of consistency | |
| Bug Prediction: Ensembles of Classifiers | Software defect prediction: do different classifiers find the same defects?; Ensemble-based classifiers; Dynamic selection of classifiers in bug prediction: An adaptive method | |
| Bug Prediction: Cross-Project and Just-In-Time Bug Prediction | Cross-project Defect Prediction: A Large Scale Experiment on Data vs. Domain vs. Process; A Large-Scale Empirical Study of Just-in-Time Quality Assurance | |
| Bug Prediction: Dealing with Data Quality | | |
| Search Based Software Engineering: Introduction | Search Based Software Engineering: Techniques, Taxonomy, Tutorial; Achievements, open problems and challenges for search based software testing | |
| Search Based Software Engineering: Using Genetic Algorithm to Configure Machine Learning Techniques | A Genetic Algorithm to Configure Support Vector Machines for Predicting Fault-Prone Components; Data-Driven Search-based Software Engineering | |
| Pattern Mining: Mining Code Idioms | Treefinder: a first step towards XML data mining; Mining Idioms from Source Code | |
| Pattern Mining: Toward Deep Learning Software Repositories | Toward Deep Learning Software Repositories | |
| Pattern Mining: Are Deep Neural Networks the Best Choice for Modeling Source Code? | Are Deep Neural Networks the Best Choice for Modeling Source Code?; Easy over Hard - A Case Study on Deep Learning | |


There is no traditional oral or written exam. Students will be graded as follows:

Note that failing to hand in an assignment or failing to present automatically results in an ABSENT mark.
