My research interests focus on the fields of Software Engineering and Data Mining (applied to software documents). In particular, I'm interested in software diagnostics (quality assurance), refactoring (software evolution), experience-based and pattern-based Software Engineering, and model-driven software development (MDSD). More specifically, I'm working on the diagnosis of quality defects (antipatterns, code smells, design flaws, etc.), quality assurance in MDSD, plug-in based software development, and knowledge discovery in code and defect repositories.

Quality Defects & Software Quality Assurance

This area of research is concerned with problems in software systems (esp. software products) that have a negative effect on software quality. Today, a vast number of these defects are known and documented in various communities under various names. Typically, they are collected and described by practitioners and consultants and represent condensed experiences from multiple projects they were involved in. A systematic literature review conducted in 2007 revealed 43 different names that were used in the literature to describe this kind of problem - 22 of them with larger collections of quality defects.
The term "quality defect" is used as an umbrella term for the concepts antipattern, smell, flaw, pitfall, bug pattern, defect pattern, negative pattern, (bad) heuristic, (bad) characteristic, antiidiom, (design) problem, (design) defect, refactoring candidate, puzzlers, traps, anomalies, and many more (typically with an additional focus on a quality aspect, development phase, or abstraction level – e.g., a performance antipattern, test smell, or architectural anomaly) that have a negative effect on a quality aspect (e.g., maintainability, efficiency, or reusability).
DoctorQ: An extensible plug-in system for the Eclipse IDE that analyzes software systems in order to diagnose quality defects. The system is designed to diagnose quality defects during programming in order to assist the software developer directly in their work. This helps to avoid the larger amount of work that might be necessary if the quality defects were diagnosed after programming (or after an iteration in agile development), when more components are based on the flawed system and need to be changed.
The VIDE Defect Detector: An extensible plugin-system for the Topcased IDE to enable the analysis of software models for the diagnosis of Quality Defects.
SQuaD: The Software Quality Defect Ontology represents a systematic categorization of quality defects and associated concepts such as techniques for the diagnosis of defects and the indication of treatments (e.g., refactorings).
Quality Defect Formalization: One main part of the research is the unambiguous description of quality defects, both to apply them in knowledge-based diagnosis systems and to support the sometimes ambiguous descriptions found in the literature.
Quality Defect Diagnosis & Prioritization: The focus of this part of the research is the diagnosis of quality defects based on static, dynamic, and historic (e.g., versions in CVS) information as well as the systematic reduction of the number of quality defects presented to the user (e.g., developer or maintainer).
Quality Defect Handling: After quality defects are discovered in a software system, they need to be properly handled. This includes either the treatment and consequently the removal of the quality defect, or the marking of the location together with a decision that the diagnosis was not applicable. Furthermore, this includes research on the context-specific contra-diagnosis based on specific locations (e.g., pattern roles and stereotypes). The persistent storage of these decisions can be supported by the annotation language RAL, which was developed to record information about quality defects and their history in source code.
Refactoring Configuration & Quality Defect Removal: If multiple quality defects are diagnosed at one location (e.g., a method), this has an effect on the indication of an optimal treatment plan (e.g., a sequence of refactorings).
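
As an illustration of the diagnosis and prioritization items above, the following minimal Python sketch flags two well-known code smells from simple per-method metrics and limits the number of findings shown to the user. The `MethodMetrics` and `Finding` types, the thresholds, and the severity formula are hypothetical and chosen for illustration only; they are not taken from DoctorQ or any other tool listed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodMetrics:
    """Static metrics for one method (hypothetical input format)."""
    name: str
    lines_of_code: int
    parameter_count: int


@dataclass(frozen=True)
class Finding:
    """One diagnosed quality defect at a location."""
    location: str
    defect: str
    severity: float  # ratio of the measured value to its threshold


# Illustrative thresholds (assumptions); real values would be project-specific.
MAX_LINES = 50
MAX_PARAMETERS = 5


def diagnose(methods):
    """Return one finding for every metric that exceeds its threshold."""
    findings = []
    for m in methods:
        if m.lines_of_code > MAX_LINES:
            findings.append(
                Finding(m.name, "Long Method", m.lines_of_code / MAX_LINES))
        if m.parameter_count > MAX_PARAMETERS:
            findings.append(
                Finding(m.name, "Long Parameter List",
                        m.parameter_count / MAX_PARAMETERS))
    return findings


def prioritize(findings, limit):
    """Sort by severity (highest first) and keep at most `limit` findings."""
    return sorted(findings, key=lambda f: f.severity, reverse=True)[:limit]


if __name__ == "__main__":
    methods = [
        MethodMetrics("Parser.parse", 200, 3),
        MethodMetrics("Report.render", 60, 10),
        MethodMetrics("Util.trim", 8, 1),
    ]
    for finding in prioritize(diagnose(methods), limit=2):
        print(f"{finding.location}: {finding.defect} "
              f"(severity {finding.severity:.1f})")
```

Ranking by a single severity ratio is only one possible way to reduce the list; a real system could also weigh historic information or the developer's current context.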

Software Patterns & Anti-Patterns

AKAEM (Arbeitskreis Architektur- und Entwurfsmuster in der Fachgruppe Software-Architektur der GI): The AKAEM working group serves as a forum for discussing and developing principles, foundations, methods, techniques, and tools for patterns and anti-patterns in the field of software architecture, as well as their applications. The working group is currently focusing on the creation of a pattern catalog.
Pattern Aggregation: An approach to systematically develop or "grow" software patterns based on experiences stored by developers in an Experience Management System (e.g., an Experience Factory) or a Defect Management System (e.g., Bugzilla).

Model Driven Software Development & Software Architecture

Model-driven software development (MDSD) focuses on the idea of constructing software systems not by programming in a specific programming language but by designing models that are translated into executable software systems by generators. In theory, this process makes it unnecessary to care for an executable system’s quality, as it is "optimized" by the generators. However, the designed models are also a work product that requires a minimal set of quality aspects (e.g., the maintainability of models over a longer life-time).
The goals of quality assurance for model-driven software development are diverse and include the improvement of quality aspects such as maintainability, reusability, security, or performance. Quality assurance for model-driven software development will play an important role in the future widespread usage of model-driven architectures in general, as well as in specific application domains.
SAE3D: Software Architecture Editor 3D (using PhysX and OpenGL)
Quality Defect Visualization in PIMs and PSMs

Intelligent Assistance

Intelligent assistance in software engineering is a relatively old research field that is nevertheless of high interest for software engineers today. Giving support to software engineers in programming, design, requirements, or other software-related environments is necessary, as the work product is typically very complex, large, and influenced by many persons. The core objective of intelligent assistance is to enable and improve the automation of, insight into, and interaction with a software system through an IDE. One main topic for intelligent assistance was the context-specific diagnosis of quality defects during development.
Intelligent Assistance for Software Engineering
Semantic Work Environments
SOP: The Software Organization Platform is a platform for software development organizations that is based on MediaWiki (http://www.mediawiki.org), a free software wiki package originally written for Wikipedia.

Information and Software Visualization

Codigator: Source Code Visualization and Navigation
Architecture Visualization in Model-driven Software Development
Quality Defect Visualization

Code Mining & Code Retrieval

Object-oriented source code occurs in diverse programming languages, with documentation following miscellaneous standards, comments in individual styles, and associated test cases, all of which are hard to exploit through information retrieval or knowledge discovery techniques. Typically, the information about object-oriented source code for a software system is distributed across several different sources, which makes processing complex.
This area of research is concerned with problems regarding the retrieval, mining, and interconnection of all information concerning a software system.
COWA (Code Warehouse): The code warehouse acts as the repository for source code from several software systems for further processing. Source code in several versions from source versioning systems (e.g., CVS) of OSS projects is extracted, transformed (i.e., parsed), and loaded (i.e., stored) in the code warehouse. Currently, only Java source code from CVS systems is parsed and stored in the code warehouse.
CORE (Code Retrieval Engine): The code retrieval engine is an application of the data from the code warehouse. It offers a search on the source code within the COWA using the Lucene search engine. The index to be searched is built upon information from the code itself (e.g., class and method names) as well as information from the documentation (i.e., Javadoc) and additional comments.
COAE (Code Analysis Engine): The code analysis engine is used to calculate metrics (and other countable characteristics) about the source code and its different versions for further analysis or mining purposes. Currently, only a few metrics are calculated.
COME (Code Mining Engine): The code mining engine will be used to discover previously unknown knowledge about the source code using techniques from data, text, and web mining. Ongoing work uses clustering techniques to discover library candidates from similar projects.
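
To make the idea behind the code retrieval engine more concrete, the sketch below builds a tiny inverted index over class names, method names, and comments, and answers keyword queries against it. It is a toy, in-memory stand-in written in Python; it does not use Lucene or the actual COWA data model, and the `CodeUnit` record and the tokenization rule (splitting camelCase identifiers) are assumptions made for illustration.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeUnit:
    """One indexed class (hypothetical record, not the COWA schema)."""
    class_name: str
    method_names: tuple
    comment: str


def tokenize(text):
    """Split text into lowercase words, breaking camelCase identifiers apart."""
    # Insert a space at each lower-to-upper boundary: "readFile" -> "read File".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", spaced)]


def build_index(units):
    """Map each token to the set of class names in which it occurs."""
    index = defaultdict(set)
    for unit in units:
        for text in (unit.class_name, *unit.method_names, unit.comment):
            for token in tokenize(text):
                index[token].add(unit.class_name)
    return index


def search(index, query):
    """Return the classes that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


if __name__ == "__main__":
    units = [
        CodeUnit("FileReader", ("readLine", "close"), "Reads text from a file."),
        CodeUnit("HttpClient", ("sendRequest", "close"), "Sends HTTP requests."),
    ]
    index = build_index(units)
    print(sorted(search(index, "close")))
    print(sorted(search(index, "read line")))
```

Splitting identifiers into their word parts lets a query such as "read line" match a method named `readLine`; a production index would additionally need stemming, ranking, and per-version handling, which this sketch leaves out.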