David Notkin on Why We Publish

Posted in Research, Teaching by Arie van Deursen

This week David Notkin (1955-2013) passed away, after a long battle against cancer. He was one of my heroes. He did great research on discovering invariants, reflexion models, software architecture, clone analysis, and more. His 1986 Gandalf paper was one of the first I studied when starting as a PhD student in 1990.

December 2011 David sent me an email in which he expressed interest to do a sabbatical in our TU Delft Software Engineering Research Group in 2013-2014. I was as proud as one can be. Unfortunately, half a year later he had to cancel his plans due to his health.

David was loved by many, as he had a genuine interest in people: developers, software users, researchers, you. And he was a great (friendly and persistent) organizer — 3 weeks ago he still answered my email on ICSE 2013 organizational matters.

In February 2013, he wrote a beautiful editorial for the ACM Transactions on Software Engineering and Methodology, entitled Looking Back. His opening reads: “It is bittersweet to pen my final editorial”. Then David continues to address the question why it is that we publish:

“… I’d like very much for each and every reader, contributor, reviewer, and editor to remember that the publications aren’t primarily for promotions, or for citation counts, or such.
Rather, the intent is to make the engineering of software more effective so that society can benefit even more from the amazing potential of software.
It is sometimes hard to see this on a day-to-day basis given the many external pressures that we all face. But if we never see this, what we do has little value to society. If we care about inﬂuence, as I hope we do, then adding value to society is the real measure we should pursue.
Of course, this isn’t easy to quantify (as are many important things in life, such as romance), and it’s rarely something a single individual can achieve even in a lifetime. But BHAGs (Big Hairy Audacious Goals) are themselves of value, and we should never let them fade far from our minds and actions.”

Dear David, we will miss you very much.

Speaking in Irvine on Metrics and Architecture

Posted in Research by Arie van Deursen

End of last year I was honored to receive an invitation to present in the Distinguished Speaker Series at the Insitute for Software Research at University of California at Irvine.

I quickly decided that the topic to discuss would be our research on software architecture, and in particular our work on metrics for maintainability.

Irvine is one of the world’s leading centers on research in software architecture. The Institute of Software Research is headed by Richard Taylor, who supervised Roy Fielding when he wrote his PhD thesis covering the REST architectural style, and Nenad Medvidovic during his work on architectural description laguages. Current topics investigated at Irvine include design and collaboration (André van der Hoek, and David Redmiles of ArgoUML fame), software analyis and testing (James Jones), and programming laguages (Cristina Lopes), to name a few. An overview of the group’s vision on software architecture can be found in their recently published textbook. In short, I figured that if there is one place to present our software architecture research it must be Irvine.

The talk (90 minutes) itself will be loosely based on my keynote at the Brazilian Software Engineering Symposium (SBES 2012), which in turn is based on joint research with Eric Bouwers and Joost Visser (both from SIG). ~~I’ll post the slides when I’m done.~~ The full slides are available on speakerdeck, but here’s the storyline along with some references.

A Software Risk Assessment (source: ICSM 2009)

The context of this research is a software risk assessment, in which a client using a particular system seeks independent advice (from a consultant) on the technical quality of the system as created by an external supplier.

How can the client be sure that the system made for him is of good quality? In particular, will it be sufficiently maintainable, if the business context of the system in question changes? Will it be easy to adapt the system to the ever changing world?

In situations like these, it is quintessential to be able to make objective, evidence-based statements about the maintainability of the system in question.

Is this possible? What role can metrics play? What are their inherent limitations? How can we know that a metric indeed captures certain aspects of maintainability? How should metric values be interpreted? How should proposals for new metrics be evaluated?

Simple answers to these questions do not exist. In this talk, I will summarize our current progress in answering these questions.

I will start out by summarizing four common pitfalls when using metrics in a software development project. Then, I will describe a metrics framework in which metrics are put into context by means of benchmarking and a quality model. Subsequently, I’ll zoom in on architectural metrics, focusing on metrics for encapsulation. I will discuss a proposal for a new metric, as well as its evaluation. The evaluation comprises both a quantitative assessment (using repository-mining) of its construct validity (doest it measure encapsulation?), as well as qualitative assessments of the usefulness in practice (by interviewing consultants who applied the metrics in their day to day work).

Based on this, I will reflect on the road ahead for empirical research in software metrics and architecture, emphasizing the need for shared datasets, as well as the use of qualitative research methods to evaluate practical impact.

The talk is scheduled for Friday March 15, in Irvine — I sincerely hope to see you there!

If you can’t make it, Eric Bouwers and I will present a 3.5-hour tutorial based on this same material at ICSE 2013, in May in San Francisco. The tutorial will be more interactive, taking your experience into account as well where possible, and it will have a stronger emphasis on metrics (based on SIG’s 10 year experience with using metrics in industry). Register now for our tutorial, and looking forward to seeing you there!

Desk Rejected

Posted in Research by Arie van Deursen

One of the first things we did after all NIER 2013 papers were in, was identifying papers that should be desk rejected. What is a desk reject? Why are papers desk rejected? How often does it happen? What can you do if your paper is desk rejected?

A desk reject means that the program chairs (or editors) reject a paper without consulting the reviewers. This is done for papers that fail to meet the submission requirements, and which hence cannot be accepted. Filtering out desk rejects in advance is common practice for both conferences and journals.

To identify such desk rejects for NIER 2013, program co-chair Sebastian Elbaum and I made a first pass through all 160+ submissions. In the end, we desk rejected around 10% of the submissions (a little more than I had anticipated).

Causes for reject included problems in:

Formatting: The paper does not meet the 4 page limit;
Scope: The paper is not about software engineering;
Presentation: The paper contains, e.g., too many grammatical problems;
Innovation: The paper does not explain how it builds upon and
extends the existing body of knowledge.

Of these, for NIER the formatting was responsible for half of the desk rejects.

Plagiarism
A potential cause that we did not encounter is plagiarism (fraud), or its special form self-plagiarism (submitting the same, or very similar, papers to multiple venues).

In my experience, plain plagiarism is not very common (I encountered one case in another conference, where we had to apply the IEEE Guidelines on Plagiarism).

Self-plagiarism is a bigger problem as it can range from copy-pasting a few paragraphs from an earlier paper to a straight double submission. While the former may be acceptable, the latter is considered a cardinal sin (your paper will be rejected at both venues, and reviewers don’t like reviewing a paper that cannot be accepted). And there are many shades of grey in between.

Notifications
We sent out notifications to authors of desk rejected papers within a few days after the submission deadline (it took a bit of searching to figure out that the best way to do this is to use the delete paper option from EasyChair). Thus, desk rejects not only serve to reduce the reviewing load of the program committee, but also to provide early feedback to authors whose papers just cannot make it.

Is there anything you can do to avoid being desk rejected?
The simple advice is to carefully read the submission guidelines. Besides that, it may be wise to submit a version adhering to all criteria early on when there is no immediate deadline stress yet. This may then serve as a fallback in case you mess up the final submission (uploading, e.g., the wrong pdf). Usually chairs have access to these earlier versions, and they can then decide to use the earlier version in case (only) the final version is a clear desk reject (for NIER this situation did not occur).

Is there anything you can do after being desk rejected?
Usually not. Most desk rejects are clear violations of submission requirements. If you think your desk reject is based on subjective grounds (presentation, innovation), and you strongly disagree, you could try to contact the chairs to get your paper into the reviewing phase anyway. The most likely outcome, however, will still be a reject, so it may not be in your self-interest to postpone this known outcome.

Submission times
And … are desk rejects are related to the paper submission time? Yes, there is a (mild) negative correlation: For NIER, there were more desk rejects in the earlier than in the later submissions. My guess is that this is quite common. There seem to be authors who simply try their same pdf at multiple conferences, hoping for an easy conference with little reviewing only.

Acceptance rates
This brings me to the final point. Conferences are commonly ranked based on their acceptance ratio. The lower the percentage of accepted papers, the more prestigious the conference is considered. The most interesting figure is obtained if acceptance rates are based on the serious competition only — i.e., the subset of papers that made it to the reviewing phase. Desk rejected papers do not qualify as such, and hence should not be taken into account when computing conference acceptance rates.

11Nov2012

Library Updating. Risk it Now, or Risk it Later?

Posted in Research by Arie van Deursen

Chances are your software depends on external libraries. What should you do, if a new version of such a library is released? Update immediately? But what if the library isn’t backward compatible? Should you swallow the pill immediately, and make the necessary changes to your system so that it can work with the new version? Or is it safe to wait for now, and avoid immediate cost and risk?

Together with Steven Raemaekers and Joost Visser (both from SIG), we embarked upon a reseach project in which we seek to answer questions like these. We are looking at library and API stability, as well as at the costs and consequences of library incompatibilities.

A first result, in which we try to measure library stability, has been presented at this year’s International Conference on Software Maintenance. The corresponding paper starts with a real life example illustrating the issues at hand.

The system in this example comprises around 200,000 lines of Java code, divided over around 4000 classes. The application depends on the Spring Framework, Apache Struts, and Hibernate. Its update history is shown below.

The system was built in 2004. Third-party library dependencies were managed using Maven. Version numbers of the latest versions which were available in 2004 were hard-coded in the conﬁguration ﬁles of the project. These libraries were not updated to more recent versions in the next seven years.

The system used version 1.0 (from 2003) of the Acegi authentication and security framework. In 2008, this library was included in Spring, and renamed into Spring Security, then at 2.0.0. As time passed, several critical safety-related bug fixes and improvements were added to Spring Security, as well as a number of changes breaking the existing API.

One might argue that keeping a security library up to date is always a good idea. But since the development team expected compatibility issues when upgrading the Acegi library, the update to Spring Security was deferred as long as possible.

In 2011, a new feature, single sign-on, was required for the system. To implement this, the team decided to adopt Atlassian Crowd.

Unfortunately, the old Acegi framework could not communicate with Atlassian Crowd. The natural replacement for Acegi was Spring Security, which was then in version 3.0.6.

However, the system already made use of an older version of the general Spring Framework. Therefore, in order to upgrade to Spring Security 3.0.6, an upgrade to the entire Spring Framework had to be conducted as well.

To make things worse, the system also made use of an older version (2.0.9) of Apache Struts. Since the new version of Spring could not work with this old version of Struts, Struts had to be updated as well.

Upgrading Struts not just affected the the system’s Java code, but also its Java Server Pages. Between Struts 2.0.9 and 2.2.3.1 the syntax of the Expression Language used in JSP changed. Consequently, all web pages in which dynamic content was presented uinsg JSP had to be updated.

Eventually, a week was spent to implement the changes and upgrades.

The good news was that there was an automated test suite available consisting of both JUnit and Selenium test cases. Without this test suite, the impact of this update would have been much harder to assess and control.

This case illustrates several issues with third-party library dependencies.

Third party libraries introduce backward incompatibilities.
Backward incompatibilities introduce costs and risks when upgrading libraries.
Backward incompatibilities are not just caused by direct dependencies
you control yourself but also by transitive ones you do not control.
There likely will come a moment in which upgrading must be done: To fix bugs, to improve security, or when the system’s functionality needs to be extended.
The longer you postpone updating, the bigger the eventual pain. As your system grows and evolves, the costs and risks of upgrading an old library increase. Such an accumulation of maintenance debt may lead to a much larger effort than in the case of smaller incremental updates.

In short, not upgrading your libraries immediately is taking the bet that it never needs to be done. Upgrading now is taking the bet it must be done anyway, in which case doing it as soon as possible is the cheapest route.

Paper "Measuring Software Library Stability through Historical Version Analysis

The full ICSM 2012 research paper.

In our research project, we seek to deepen our insight in these issues. We are looking at empirical data on how often incompatabilities occur, the impact of library popularity on library stability, the effort involved in resolving incompatibilities, and at ways in which to avoid them in the first place. Stay tuned!

Should you have similar stories from the updating trenches to share with us, please drop us a line!

3Nov2012

Paper Arrival Rates

Posted in Research by Arie van Deursen

As the deadline passed, I just closed the submission site for ICSE NIER 2013. How many hours in advance do authors typically submit their paper?

To answer that question, I collected some data on the time at which each paper was submitted. (I just looked at initial submissions, not at re-submissions). Here is a graph showing new paper arrivals, sorted by hours before the deadline, grouped in bins of 2 hours.

As you can see, we received a little less than 30 submissions more than 2 days in advance. But the vast majority likes to submit in the final 24 hours. The last paper was submitted just 5 minutes before the deadline.

Accumulating this graph and displaying the data as percentage yields the following chart:

This gives some insight in the percentage of papers submitted at different time slots before the deadline.

Let’s draw the following easy to remember conclusions from this graph:

1/6th of the papers are submitted more than 48 hours ahead of time.
1/3d of the papers are submitted 24 hours before the deadline
Half of the papers are submitted 14 hours before the dealine
2/3d of the papers are submitted 10 hours before the deadline
1/6th of the papers are submitted in the final 4 hours before the deadline

Is this relevant? If this is valid, as a conference organizer you can guestimate the number of submissions, say, 24 hours ahead of time, which is when you’d have 1/3d of the papers in.

But also if you’re an author this can be interesting. Conference systems like EasyChair give your paper an ID that equals the number of submissions so far. So if you submit at, say, 10 hours before the deadline, and get paper ID 200, the chart suggests that you may end up competing with 300 submissions in total.

The chart may very well be different for other conferences. NIER is part of ICSE which is held in California, with a deadline at 12pm Pago Pago time, on a Friday, soliciting 4-page papers, without requiring pre-submission of abstracts. These are all circumstances that will affect submission behavior. If you have pointers to similar data for other conferences let me know, and I’ll update the post.

Enjoy!

Arie van Deursen

Software engineering in theory and practice

Category Archives: Research

David Notkin on Why We Publish

Speaking in Irvine on Metrics and Architecture

See also:

Desk Rejected

Library Updating. Risk it Now, or Risk it Later?

Paper Arrival Rates