The Battle for Affordable Open Access

Last week, Elsevier cut off thousands of scientists in Germany and Sweden from reading its recent journal articles, when negotiations over the cost of a nationwide open-access agreement broke down.

In these negotiations, universities are trying to change academic publishing, while publishers are defending the status quo. If you are an academic, you need to decide how to respond to this conflict:

  1. If you don’t change your own behavior, you are choosing Elsevier’s side, helping them maintain the status quo.
  2. If you are willing to change, you can help the universities. The simplest thing to do is to rigorously self-archive all your publications.

The key reason academic publishing needs to change is that academic publishers, including Elsevier, realize profit margins of 30-40%.

Euro bills

To put this number in perspective, consider my university, TU Delft. Our library spends €4-5 million each year on (journal) subscriptions. 30-40% of this amount, €1-2 million each year, ends up directly in the pockets of the shareholders of commercial publishers.

This is unacceptable. My university needs this money to handle the immense workload that comes with ever-increasing student numbers, and to meet the research demands of society. A university cannot afford to waste money by just handing it over to publishers.

Universities across Europe have started to realize this. The Dutch, German, French, and Swedish universities have negotiated at the national level with publishers such as Springer Nature, Wiley, Taylor & Francis, Oxford University Press, and Elsevier (the largest publisher). In many cases deals have been made, with more and more options for open access publishing, at prices that were acceptable to the universities.

However, in several cases no deals have been made. The Dutch universities could not agree with the Royal Society of Chemistry Publishing, the French failed with Springer Nature, and now Germany and Sweden could not come to agreement with Elsevier. A common point of contention is that universities are only willing to pay for journal subscriptions if their employees can publish open access without additional article processing charges — a demand that directly challenges the current business model in academic publishing.

The negotiations are not over yet. Both in terms of open access availability and in terms of price, publishers are far from where the universities want them to be. And if the universities did not negotiate themselves, taxpayers and governments could simply force them to, by putting a cap on the amount of money universities are allowed to spend on journal subscriptions.

Universities are likely to join forces, also across nations. They will determine maximum prices, and will not be willing to make exceptions. The negotiations will be brutal, as the publishers have much to lose and much to fight for.

In all these negotiations it is crucial that universities take back ownership of what they produce. Every single researcher can contribute, simply by making all of their own papers available on their institutional (pure.tudelft.nl for my university) or subject repositories (e.g., arxiv.org). This helps in two ways:

  • It helps researchers who are cut off from publishers when negotiations fail (as the Germans and Swedes are as we speak).
  • It reduces the publishers’ power in future negotiations as the negative effects of cancellations have been reduced.

This seems like a simple thing to do, and it is: It should not take an average researcher more than 10 minutes to post a paper on a public repository.

Nevertheless, during my two years as department head I have seen many researchers who fail to see the need or take the time to upload their papers. I have begged, prayed, and pushed, written a green open access FAQ to address any legal concerns researchers might have, and written a step-by-step guide on how to upload a paper.

Open Access Adoption at TU Delft

On top of that, my university, like many others, has made it compulsory for its employees to upload their papers to the institutional repository (this is not surprising since TU Delft plays a leading role in the Dutch negotiations between universities and publishers). Furthermore, both national (NWO) and European (H2020, Horizon Europe) funding agencies insist on open access publications.

Despite all this, my department barely meets the university ambition of having 60% of its 2018 publications available as (green or gold) open access. To the credit of my departmental employees, however, they do better than many other departments. Also pre-print links uploaded to conference sites have typically been less than 60%, suggesting that the culture of self-archiving in computer science leaves much to be desired.

If anything, the recent cutoff by Elsevier in Sweden and Germany emphasizes the need for self-archiving.

If you’re too busy to self-archive, you are helping Elsevier get rich from public money.

If you do self-archive, you help your university explain to publishers that their services are only needed when they bring true value to the publishing process at an affordable price.


© Arie van Deursen, 2018. Licensed under CC BY-SA 4.0.

Euro image credit: pixabay, CC0 Creative Commons.

My Last Program Committee Meeting?

This month, I participated in what may very well have been my last physical program committee (PC) meeting, for ESEC/FSE 2018. In 2017, top software engineering conferences like ICSE, ESEC/FSE, ASE and ISSTA (still) had physical PC meetings. In 2019, these four will all switch to online PC meetings instead.

I participated in almost 20 such meetings, and chaired one in 2017. Here is what I learned and observed, starting with the positives:

  1. As an author, I learned the importance of helping reviewers to quickly see and concisely formulate the key contributions in a way that is understandable to the full PC.

  2. As a reviewer I learned to study papers so well that I could confidently discuss them in front of 40 (randomly critical) PC members.

  3. During the meetings, I witnessed how reviewers can passionately defend a paper as long as they clearly see its value and contributions, and how they will kill a paper if it has an irreparable flaw.

  4. I started to understand reviewing as a social process in which reviewers need to be encouraged to change their minds as more information unfolds, in order to arrive at consensus.

  5. I learned phrases reviewers use to permit them to change their minds, such as “on the fence”, “lukewarm”, “not embarrassing”, “my +1 can also be read as a -1”, “I am not an expert but”, etc. Essential idioms to reach consensus.

  6. I witnessed how paper discussions can go beyond the individual paper, and trigger broad and important debate about the nature of the arguments used to accept or reject a paper (e.g. on evaluation methods used, impact, data availability, etc.).

  7. I saw how overhearing discussions of papers reviewed by others can be useful, both to add insight (e.g. additional related work) and to challenge the (nature of the) arguments used.

  8. I felt, when I was PC co-chair, the pressure from 40 PC members challenging the consistency of any decision we made on paper acceptance. In terms of impact on the reviewing process, this may well be the most important benefit of a physical PC meeting.

  9. I experienced how PC meetings are a great way to build a trusted community and make friends for life. I deeply respected the rigor and well articulated concerns of many PC members. And nothing bonds like spending two full days in a small meeting room with many people and insufficient oxygen.

I also witnessed some of the problems:

  1. My biggest struggle was the incredible inefficiency of PC meetings. They take 1-2 days from 8am-6pm, you’re present at discussions of up to 100 papers discussed in 5-10 minutes each, yet participate in often less than 10 papers, in some cases just one or two.

  2. I had to travel long distances just for meetings. Co-located meetings (e.g. the FSE meeting is typically immediately after ICSE) reduce the footprint, but I have crossed the Atlantic multiple times just for a two day PC meeting.

  3. My family paid a price for my absence caused by almost 20 PC meetings. I have missed multiple family birthdays.

  4. The financial burden on the conference (meeting room + 40 x dinner and 80 lunches, €5000) and each PC member (travel and 2-3 hotel nights, adding up easily to €750 per person paid by the PC members) is substantial.

  5. I saw how vocal PC members can dominate discussions, yielding less opportunity for the more timid PC members who need more time to think before they dare to speak.

  6. I hardly attended a PC meeting in which not at least a few PC members eventually had to cancel their trip, and at best participated via Skype. This gives papers reviewed by these PC members a different treatment. As PC chair for ESEC/FSE we had five PC members who could not make it, all for valid (personal, painful) reasons. I myself had to cancel one PC meeting a week before the meeting, when one of my children had serious health problems.

  7. Insisting on a physical PC meeting limits the choice of PC members: When inviting 40 PC members for ESEC/FSE 2017, we had 20 candidates who declined our invitation as they could not commit a year in advance to attending a PC meeting (in Buenos Aires).

Taking the pros and cons together, I have come to believe that the benefits do not outweigh the high costs. It must be possible to organize an online PC meeting with special actions to keep the good parts (quality control, consistent decisions, overhearing/inspecting each other’s reviews, …).

I look forward to learning from ICSE, ESEC/FSE, ISSTA and ASE experiences in 2019 and beyond about best practices to apply for organizing a successful online PC meeting.

In principle, ICSE will have online PC meetings in 2019, 2020, and 2021, after which the steering committee will evaluate the pros and cons.

As ICSE 2021 program co-chairs, Tao Xie and I are very happy about this, and we will do our best to turn the ICSE 2021 online PC meeting into a great success, for the authors, the PC members, and the ICSE community. Any suggestions on how to achieve this are greatly appreciated.

T-Shirt saying "Last PC Meeting Ever?"

Christian Bird realized the ESEC/FSE 2018 PC meeting may be our last, and that this nostalgic moment deserved a T-shirt of its own. Thanks!!


(c) Arie van Deursen, June 2018.

Academic Hour Tracking: Why, When, How

One tool I find indispensable in managing my time is keeping track of how I spend my working hours. During the past years, I have tracked my time at the hour level, split across around 20 tasks, grouped into three high level categories corresponding to my main responsibilities (management, research, teaching).

Keeping track of hours has helped me as follows:

  1. Formulating a strategy: Thinking in terms of hours spent per week forces me to formulate a bigger strategy on how I wish to spend my time: I want to work around 40h per week, divided evenly over management, research, and teaching.

  2. Identifying time sinks: Activities that take more time than expected become visible. This need not be a problem per se, but making time sinks explicit helps me to adjust the planning of my other activities.

  3. Keeping commitments: Seeing my time spent helps me understand if I keep my commitments. For example, my hour sheets will tell me when I spent substantially less time on one PhD student compared to another, so that I can take action.

  4. Rewarding myself: When I’m in a week with, e.g., clearly too much management, I have a good reason to cut down organizational duties the next week, and engage in some research instead (without feeling “guilty” about this).

  5. Planning my work: My sheets of the previous year help me in planning the current year.

  6. Organizational change: Knowing how much time activities take gives me a great starting point for an informed debate about organizational change — within the department, university, with my boss, or in my research community.

  7. Slowing down: My sheets tell me when there is a busy period, and that I need to slow down.

  8. Regaining control: Stress is a factor in (academic) life, and I’m not immune to it. Regaining control over my time by seeing what I do is a stress management tool I can’t live without.

  9. Yearly appraisal: In my annual performance review with my superiors I can indicate with confidence how much effort my various responsibilities required.

My approach to keeping hours is simple and low tech: I just use a spreadsheet:

  • It has columns for my activity types (around 20) grouped together into three high level categories.

  • It has one row for each day.

  • A cell contains the number of hours spent on an activity on a given day. I always use round numbers (full hours), but I know others who track at the level of 15 minutes.

  • The top row aggregates the time and percentages of the activities: I can, e.g., see that I spent 10% of my time on course X, and 5% of my time reviewing for conference Y.

  • I have aggregating columns giving the time spent that day, the total time spent the last week, and the average time spent per week over the full measuring period.

I typically use a spreadsheet for half a year, and then start a fresh one with adjusted columns. Whenever activities take more than 10%, I try to split them into different smaller ones, to better see what is taking so much time.
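For readers who prefer code over spreadsheets, the layout described above can be mimicked in a few lines of Python. This is only a minimal sketch: the activity names, their grouping into categories, and all hours below are made up for illustration, and the 10% threshold is the only number taken from the text.

```python
# Hypothetical activities, grouped into the three high-level categories.
CATEGORIES = {
    "management": ["department meetings", "hiring"],
    "research": ["PhD supervision", "paper writing"],
    "teaching": ["course X", "thesis students"],
}

# One row per day: (date, {activity: full hours}). All numbers are invented.
DAYS = [
    ("2018-04-02", {"department meetings": 4, "course X": 3, "paper writing": 1}),
    ("2018-04-03", {"PhD supervision": 3, "course X": 4, "hiring": 1}),
    ("2018-04-04", {"department meetings": 2, "paper writing": 2, "thesis students": 4}),
    ("2018-04-05", {"PhD supervision": 4, "course X": 4}),
    ("2018-04-06", {"hiring": 2, "paper writing": 2, "thesis students": 4}),
]


def totals_per_activity(days):
    """Sum the hours per activity over all days (the 'top row' of the sheet)."""
    totals = {}
    for _date, hours in days:
        for activity, h in hours.items():
            totals[activity] = totals.get(activity, 0) + h
    return totals


def percentages(totals):
    """Share of the total time spent on each activity, in percent."""
    grand_total = sum(totals.values())
    return {activity: 100 * h / grand_total for activity, h in totals.items()}


def totals_per_category(totals, categories):
    """Aggregate activity totals into the high-level categories."""
    return {
        category: sum(totals.get(activity, 0) for activity in activities)
        for category, activities in categories.items()
    }


def activities_to_split(totals, threshold=10.0):
    """Activities taking more than `threshold` percent: candidates for splitting."""
    return sorted(a for a, p in percentages(totals).items() if p > threshold)


totals = totals_per_activity(DAYS)
print(sum(totals.values()), "hours this week")
print(totals_per_category(totals, CATEGORIES))
print("Consider splitting:", activities_to_split(totals))
```

With only six activities nearly everything exceeds the 10% threshold; with around 20 columns the check becomes more selective.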

The bigger categories can sometimes raise interesting questions: Is reviewing research? Is project acquisition and proposal writing research? Is supervising a master student research? But whatever the categories, these activities take time, and monitoring them makes explicit how much.

I usually fill in my spreadsheet at the end of the day or the end of the week. I use my memory, calendar, and sometimes my email archives to remember what I did. For my purposes, this is sufficiently precise. Filling in the sheet takes me less than 15 minutes — and while filling it in I am forced to reflect on how I spend my time.

If you’re interested in following a similar approach, I’ve created a template empty spreadsheet for download. An alternative is to use the BubbleTimer app and service (which Jonathan Aldrich uses).

If you’re struggling with your time, try tracking it: it has helped me, and hopefully it will help you too!


(c) Arie van Deursen, April 2018. CC-BY-SA-4.0.


Image: Astronomical clock, Strasbourg. Credit: Pascal Subtil, Flickr, CC-BY-2.0

TU Delft Computer Science is Hiring!

Studying Computer Science at Delft University of Technology has become immensely popular: our student numbers have shown double digit growth for seven years in a row, with record enrollments expected for 2018/2019.

To handle this demand in computer science education, we have a number of exciting teaching-related vacancies available:

Together with our current faculty, it will be your job to help educate future generations of computer scientists using the latest teaching methods.

The faculty’s main educational programs in computer science include a three-year bachelor programme in Computer Science and Engineering, a two-year master program in Computer Science (with main tracks in Software Technology and Data Science & Technology) and a two-year master program in Embedded Systems. The faculty offers a recently renewed minor in the third bachelor year for non-computer science students of Delft University of Technology in the area of Software Design & Data Science. Through its participation in EdX, the faculty offers a series of highly successful computer science MOOCs. All programmes are lectured in English.

Bachelor courses have enrollments of hundreds of students. Such courses are lectured by a teaching team, including professors, educators, and up to 30 teaching assistants. Master-level courses are typically lectured in smaller groups of up to 100 students, and are closely related to research carried out in the Computer Science Departments. Both the bachelor and the master are concluded with an individual research thesis (of 15 and 45 credit points, respectively).

The faculty’s research in computer science is internationally leading and conducted in the departments of Software Technology and Intelligent Systems. The two departments consist of in total 11 sections, which together are active in all core disciplines of computer science. Furthermore, the faculty conducts research in various themes that crosscut disciplines and other faculties, such as data science, cyber-security, blockchain, and Internet of Things.

To ensure continued high quality research and education, the faculty is in the process of strengthening its Computer Science Teaching Team. Responsibilities of the teaching team include supporting all bachelor-level education, co-teaching selected courses, managing a group of around 150 teaching assistants, supporting educational innovation, and blending online and on-campus education. Teaching team members with research responsibilities will also be attached to one of the research sections of the faculty. All teaching staff have the opportunity to follow teaching training, leading to a University Teaching Qualification (UTQ).

Screening of applications will begin April 3, 2018 and will continue until all required positions are filled.

The anticipated starting date for all positions is as soon as possible. Interested applicants are advised to apply as early as possible. A trial lecture (except for the educational software developers) will be part of the interview.

To apply, follow the procedure as described in the vacancies. For further information, feel free to contact me. We look forward to your application!!

Image credit: @Felienne.

Academic Leadership, Module 1

Last week I participated in the first (two day) module of a six month TU Delft course on “Academic Leadership” — a course so successful it has been taught every single year for the past 32 years.

Maybe the most impressive content comes from the participants themselves (16 in total this year), who serve in different leadership roles at TU Delft. Participants can bring in “cases” they are currently struggling with — my case relates to moving my department to a new building (with less space for the 150 people involved). The participants can ask questions about these cases, often reflecting their own experience in dealing with similar cases. The questions not only help to drill down to the essence of the case, but also to the (possibly deeply personal) reasons behind the struggle at hand.

The format for this participatory content is that of “intervision”, which in English translates to “peer supervision”. It is a technique common among (mental) health care professionals, to exchange their experiences, to analyze how they handle a given complex situation, and to reflect collectively on their professional conduct.

The intervision in this course takes place in smaller groups of four, under the guidance of a coach. In a series of sessions, each participant gets one afternoon to present his or her case, and to discuss it in depth in a trusted, fully confidential setting. A few years back I participated in such an intervision, and I look forward to doing this again.

The actual course content of the first module came from Mathieu Weggeman, a professor and consultant who specializes in management of knowledge-intensive organizations. His lecture carried the title of his book Managing Professionals? Don’t!, emphasizing that professionals usually work best when their managers take a step back. No more “planning and control”, but a focus on shared ambition and employee expertise. I’m sure this resonates with many academics.

Weggeman spent considerable time discussing the characteristics of leaders in excellent professional organizations. Such leaders:

  • develop, together with all employees, a shared ambition;
  • inspire people, and involve them in the organization’s strategy to materialize the ambition;
  • communicate fairly and timely: they are available, and they listen (think management by walking around);
  • are clear about the desired output, and offer clear feedback;
  • are assertive towards employees who are not good at their job anymore;
  • function as “heat shield” against “noise from above”;
  • have an authoritative yet serving and humble attitude.

Weggeman connected this to a quote from Laozi (老子, 6th century BC):

A leader is best when people barely know he exists,
not so good when people obey and acclaim him,
worse when they despise him.
But a good leader, who talks little, when the work is done, his aim fulfilled,
they will say:
We did it ourselves.

Weggeman also discussed tools to diagnose and design organizations. Such tools need to distinguish (1) setting goals, (2) designing the organization, and (3) executing a strategy to meet the stated objectives — in Dutch nicely summarized as richten, inrichten, verrichten. Weggeman explained how such activities can be influenced through organizational “design variables”, which he (loosely) based on McKinsey’s 7S Framework. This framework distinguishes seven elements, described as (wikipedia):

  • Strategy: Purpose of the business and the way the organization seeks to enhance its competitive advantage.
  • Structure: Division of activities; integration and coordination mechanisms.
  • Systems: Formal procedures for measurement, reward and resource allocation.
  • Shared Values: Included in culture by Weggeman, who also includes in culture the way of working derived from these values.
  • Skills: The organization’s core competencies and distinctive capabilities.
  • Staff: Organization’s human resources, demographic, educational and attitudinal characteristics.
  • Style: Typical behavior patterns of key groups, such as managers, and other professionals

McKinseys 7S Framework

The basic premise of this framework is that these seven internal aspects of an organization need to be aligned, and that they are interrelated: Changing one element will affect the others.

Like any good management consultant, Weggeman was full of quotes. To explain the need for a shared ambition, he quoted Nietzsche:

He who has a ‘why’ to live for can bear almost any ‘how’.
(“Hat man sein warum des Lebens, so verträgt man sich fast mit jedem wie”, translation by Frankl)

As an academic, it is easy to get lost in the fights of the “how” (getting tenure, submitting a paper, writing a review, applying for funding, managing the class room, handling Blackboard Brightspace, etc., etc.). And naturally, it is our collective duty to improve the ‘how’ wherever we can.

But our ‘why’ is clear: Driven by curiosity, we train young people to become the world’s leading computer scientists and software engineers, and we push the boundaries of what the world knows about computer science. And this we want, in the words of the late David Notkin, “so that society can benefit even more from the amazing potential of software.”

Current Dutch Ius Promovendi Considered Harmful

The Dutch regulations on granting PhD degrees are a disgrace. Unlike most other countries, The Netherlands only gives full professors the right to award a PhD (the full professor has the “ius promovendi”). Thus, in the current regulations, if you’re an assistant or associate professor supervising a PhD student, you will need to find a full professor (deemed “the promotor”) who will be fully responsible for the entire PhD process, even if you’re planning to do all supervision yourself.

I am relieved that this miserable rule is about to change: On February 23rd of this year, the Dutch parliament (Tweede Kamer) passed an Internationalization Law to bring PhD supervision more in line with the rest of the world (for an English summary, see this Science Guide article).

However, the vote for this law in the Dutch Senate (Eerste Kamer) still needs to take place, scheduled for June 6. Much to my surprise, the preliminary reports by the Senate committee raised all sorts of concerns. The senators advocate a hierarchical model in which the full professor is “responsible for the discipline”, a notion at odds with the modern principal investigator approach in which all researchers are peers. The senators are afraid of “undesirable friction”, thereby assuming that making the roles of co-advisors explicit increases rather than decreases friction. The senators gratuitously worry that extending the ius promovendi to non-full professors will reduce the quality of the supervision. The senators argue that granting the right to supervise at the end of a career is an attractive way of encouraging young professors to engage in such a career. Lastly, in line with the advice of the Raad van State, the senators wonder “whether there is a real problem requiring a new law”.

Answers to these concerns and questions will be provided by the Dutch Minister of Education, after which the senators will vote. Despite the senators’ concerns, the general expectation is that the law will pass. In preparing her answer, the minister can use a wide range of documents arguing the need for this change, e.g., by the Young Academy and the ILLC Research School.

The Dutch Young Academy lists nine key concerns with the current regulation. They argue that the current law is disconnected from reality in academic life, resulting in insufficient recognition (in reputation, financially, and when competing for grants) for the non-full professors involved. The Young Academy also points out that the present situation makes The Netherlands unattractive for talented researchers seeking to pursue an academic career in The Netherlands (as a point of reference, in my department half of the faculty is non-Dutch). Besides this, the ILLC in its letter (in English) to the executive board of the University of Amsterdam emphasizes problems related to quality control, arguing that in many areas it will be hard to find a full professor who is the actual expert in the field.

Yet besides being problematic for assistant and associate professors, restricting formal responsibility for PhD supervision to full professors directly hurts PhD students:

  • It can lead to unclear expectations at the very start of a PhD project, when students deciding whether to accept a PhD position need to understand whether they will be working with the full professor or the non-full professor doing the daily supervision.

  • It can lead to confusion during the PhD project, for example when the full professor and co-advisor express different ideas or opinions. Since the full professor is in a position of power, the student may be reluctant to disagree with the full professor.

  • It can lead to an unclear CV of the student, who needs to explain the role of extra advisors (and sometimes even extra co-authors) to new international employers unfamiliar with the Dutch laws.

  • It can lead to publication restrictions for the PhD student, even if the full professor abstains from active involvement in the research. A case in point are ACM rules that in a strict interpretation forbid research students to submit a paper if their advisor is program chair even if that advisor ultimately has a mere ceremonial role in the PhD project. Extra advisors here means fewer publication options.

In the past 10 years, I witnessed each of these problems in my many interactions with PhD students and co-advisors under the Dutch regulations (I have worked with seven different non-full professor co-supervisors, and with over 35 past and current PhD students). In my group we try to take the most liberal interpretation of the regulations, handing over supervision responsibilities to the co-advisors as much as possible. Despite this, the bizarre limitations for non-full professors percolate through the PhD process, and affect all involved negatively.

To remedy these problems, I trust the senate will vote in favor of the changes to the law.

After that, it is up to the universities to take advantage of this law, and change their own doctoral regulations. The ILLC has various useful suggestions. For example, they call for always appointing two supervisors (a good idea already in place at various universities). Given the new law, it then becomes possible to make the young assistant professor (who, e.g., has obtained the grant) the primary supervisor, and to involve a second supervisor in a more advisory role (i.e., not carrying executive responsibility for the success of the PhD). Drafting these new regulations in a way that is most beneficial to the PhD students, and getting them approved will take some time, but I am confident the improvement will be worth the effort.

I call on all Dutch senators to vote in favor of the new law. Once approved, I expect that Dutch universities will revise their doctoral regulations to make full use of the new legal possibilities, to the benefit of PhD students in The Netherlands and (young) faculty alike. Where needed, I pledge to use my influence, for example as head of department at TU Delft, to make this happen.


UPDATE 7 June 2017: In response to the questions from the Dutch senators, the VSNU (the Association of Dutch Universities) and the presidents of the Dutch Universities provided an addendum explaining how they would implement the law in their regulations. In particular, this addendum emphasizes that universities will grant the “ius promovendi” to associate professors (“UHDs”), but not to assistant professors. Given this addendum, the Senate passed the internationalization law on June 6, 2017.


Image credit: Joachim Schlosser, Flickr.

Managing Complex Spreadsheets — The Story of PerfectXL

This week we finished grading of the software architecture course I’m teaching.

Like many teachers, I use a spreadsheet for grading together with my co-teachers and teaching assistants. In this case, we concurrently worked with five people on a Google Spreadsheet. The resulting spreadsheet is quite interesting:

  • The spreadsheet currently has 22 sheets (tabs)

  • There are input sheets for basic information on the over one hundred students in the class, the groups they form, and the rubrics we use.

  • There are input sheets from various forms the students used to enter course-related information

  • There are input sheets for different sub-assignments, which the teachers and assistants use to enter subgrades for each rubric: Some grades are individual, others are per team. Such sheets also contain basic formulas to compute grades from rubrics.

  • There are overview sheets collecting the sub-grades from various sheets, combining them to overall grades. The corresponding formulas can become quite tricky, involving rounding, lookups, sumproducts, thresholds, conditional logic based on absence or presence of certain grades, etc.

  • There are various output sheets, to report grades to students, to export grades to the university’s administrative systems, and to offer diagrams showing grade distributions for educational assessments of the course.
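To give an impression of why such overview formulas get tricky, here is a minimal Python sketch of the kind of logic involved: a weighted sum, a threshold, a rule for missing grades, and rounding. The component names, weights, threshold, and capping rule are hypothetical and are not the actual grading policy of the course.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical components and weights (must add up to 1).
WEIGHTS = {
    "report": Decimal("0.5"),
    "presentation": Decimal("0.2"),
    "exam": Decimal("0.3"),
}
# Hypothetical rule: every component must reach this grade to pass.
COMPONENT_THRESHOLD = Decimal("5.0")
# Hypothetical rule: the final grade is capped at this value otherwise.
CAP_WHEN_BELOW_THRESHOLD = Decimal("5.0")


def final_grade(subgrades):
    """Combine subgrades into a final grade rounded to one decimal.

    Returns None when a component is missing, so that an absent grade
    is never silently treated as a zero.
    """
    if any(subgrades.get(component) is None for component in WEIGHTS):
        return None
    grades = {c: Decimal(str(subgrades[c])) for c in WEIGHTS}
    weighted = sum(WEIGHTS[c] * grades[c] for c in WEIGHTS)
    if any(g < COMPONENT_THRESHOLD for g in grades.values()):
        weighted = min(weighted, CAP_WHEN_BELOW_THRESHOLD)
    # Round half up, as people usually expect from grades.
    return weighted.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


print(final_grade({"report": 8, "presentation": 7, "exam": 6.5}))
print(final_grade({"report": 9, "presentation": 8, "exam": 4}))
print(final_grade({"report": 9, "presentation": 8}))
```

Even in this toy version, the order of the steps matters: rounding before applying the cap, or treating a missing grade as zero, would give different results.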

The spreadsheet used has a history of five years: Each year we take the existing one, copy it, and remove the student data. We then adjust it to the changes we made to the course (additional assignments, new grading policies, better rubrics, etc).

Visualization of sheet dependencies

All in all, this spreadsheet has grown quite complex, and it is easy to make a mistake. For example, I once released incorrect grades, a rather stressful event both for my students and myself. And all I did wrong was forget the infamous false argument needed in a vlookup, despite the fact that I was well aware of this “feature”. For this year’s spreadsheet we had duplicate student ids, in a column where each row had to be unique, leading to a missing grade, and again extra effort and stress to resolve this as soon as possible.
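Both mistakes are easy to reproduce outside a spreadsheet. The Python sketch below imitates a lookup without the false argument (an approximate match on a sorted key column, which returns the row with the largest key not exceeding the lookup value) next to an exact lookup, and adds a simple duplicate check. The student ids and grades are invented, and the functions only approximate spreadsheet behavior for illustration.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical (student id, grade) rows, sorted by id. Student 1002 has no row.
GRADES = [(1001, 7.5), (1003, 6.0), (1004, 9.0)]


def lookup_approximate(table, key):
    """Imitate a vlookup without FALSE on a table sorted by key:
    return the value of the largest key that is <= the lookup key."""
    keys = [k for k, _value in table]
    position = bisect_right(keys, key)
    if position == 0:
        return None  # a spreadsheet would show #N/A here
    return table[position - 1][1]


def lookup_exact(table, key):
    """Imitate a vlookup with FALSE: only an exact key match counts."""
    for k, value in table:
        if k == key:
            return value
    return None  # a spreadsheet would show #N/A here


def duplicate_ids(ids):
    """Return the ids that occur more than once, in sorted order."""
    return sorted(i for i, count in Counter(ids).items() if count > 1)


# The approximate lookup silently hands student 1002 the grade of student 1001.
print(lookup_approximate(GRADES, 1002))
# The exact lookup makes the missing row visible instead.
print(lookup_exact(GRADES, 1002))
# A uniqueness check on the id column, run before any lookups.
print(duplicate_ids([1001, 1003, 1003, 1004]))
```

The approximate variant fails silently, which is what makes it dangerous for grading: a wrong grade looks just as plausible as a right one.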

I suspect that if you use spreadsheets seriously, for example for grading, you recognize the characteristics of my spreadsheet — and maybe your sheets are even more complicated.

Now I have an interest in spreadsheets that goes beyond that of the casual user: As a software engineering researcher, I have looked extensively at spreadsheets. I did this together with Felienne Hermans, first when she was doing her PhD under my supervision in the context of the Perplex project (co-funded by Microsoft) and then in the context of the Prose project (funded by the Dutch STW agency). From a research perspective, these projects were certainly successful, leading to a series of publications in such venues as ECOOP 2010, ICSE 2011-2013, ICSM, EMSE, and SANER.

But we did our research not just to publish papers: We also had (and have) the ambition to actually help the working spreadsheet user, as well as large organizations that depend on spreadsheets for business-critical decision making.

To that end, we founded a company, Infotron, which offers tooling, services, consultancy, and training to help organizations and individuals become more effective with their spreadsheets.

After several years of operating somewhat under the radar, the Infotron team (headed by CEO Matéo Mol) has now launched an online service, PerfectXL, in which users can upload a spreadsheet and get it analyzed. The service then helps in the following ways:

  • PerfectXL can visualize the architectural dependencies between sheets, as shown above for my course sheet;
  • PerfectXL can identify risks (such as the vlookup I mentioned, interrupted ranges, or overly complex conditional logic);
  • PerfectXL can assess the structure and quality of the formulas in your sheet.

If this sounds interesting, you can try out the service for free at perfectxl.com. There are various pricing options that help Infotron run and further grow this service — pick the subscription that suits you and your organization best!

Even if you decide not to use the PerfectXL service, the site contains a lot of generally useful information, such as various hints and tips on how to create and maintain well-structured spreadsheets.

Enjoy!

The Collaborative Software Architecture Course

After four exciting years of Teaching Software Architecture Using GitHub, we decided to write a paper reflecting on the course and our experiences, and submit it to SIGCSE, the flagship conference of the ACM Special Interest Group on Computer Science Education, typically attended by more than a thousand educators from around the world.

We’re very happy that our paper was immediately accepted!

In the paper, we identify three challenges in teaching software architecture:

  • C1: The theory of software architecture (design principles, tradeoffs, architectural patterns, product lines, etc) is often very abstract and therefore hard for a student to master.
  • C2: The problems of software architecture are only visible at scale, and disappear once small example systems are used.
  • C3: A software architect needs a combination of technical and social skills: software architecture is about communication between stakeholders, and the architect needs to be able to achieve and explain consensus.

To address these challenges, the paper proposes a collaborative approach to teaching software architecture. In particular, we report how we organized our software architecture course according to the following principles:

  • Embrace open source: Students pick an open source system of choice and study its architecture. Students use it to learn how to apply architectural theories to realistic systems (C1, C2).
  • Embrace collaboration: Students work in teams of four to study one system in depth (C3).
  • Embrace open learning: Teams share all of their work with other students. Furthermore, students share their main result with the open source community: their architectural description is published as a chapter in an online book resulting from the course (C3).
  • Interact with the architects: Students are required to offer contributions (in the form of GitHub pull requests) to the open source projects, which will expose them to feedback from actual integrators and architects of the open source projects (C1, C2, C3).
  • Combine breadth and depth: Students dive deeply into the system they analyze themselves, and learn broadly from the analyses conducted and presented by other teams (C1, C3).

DESOSA 2016 book cover

In 2016 the resulting book (created in markdown and git using gitbook) described the architectures of 21 open source systems, including Ember.js, Karma, Neo4j, and SonicPi. The chapters are based both on existing architectural theories (such as architectural views, product lines, and technical debt) and on the students’ first-hand experiences in making actual contributions (merged pull requests) to the open source systems under study.

SIGCSE Abstract

Teaching software architecture is hard. The topic is abstract and is best understood by experiencing it, which requires proper scale to fully grasp its complexity. Furthermore, students need to practice both technical and social skills to become good software architects. To overcome these teaching challenges, we developed the Collaborative Software Architecture Course. In this course, participants work together to study and document a large, open source software system of their own choice. In the process, all communication is transparent in order to foster an open learning environment, and the end-result is published as an online book to benefit the larger open source community.

We have taught this course during the past four years to classes of 50-100 students each. Our experience suggests that: (1) open source systems can be successfully used to let students gain experience with key software architecture concepts, (2) students are capable of making code contributions to the open source projects, (3) integrators (architects) from open source systems are willing to interact with students about their contributions, (4) working together on a joint book helps teams to look beyond their own work, and study the architectural descriptions produced by the other teams.

Arie van Deursen, Maurício Aniche, Joop Aué, Rogier Slag, Michael de Jong, Alex Nederlof and Eric Bouwers. “A Collaborative Approach to Teaching Software Architecture.” Proceedings of the 48th ACM Technical Symposium on Computer Science Education (SIGCSE), March 2017, Seattle, USA.

You can download the paper from the TU Delft institutional repository, or have a look at the slides we used at our SIGCSE 2017 presentation.

Golden Open Access for the ACM: Who Should Pay?

In a move that I greatly support, the ACM Special Interest Group on Programming Languages (SIGPLAN) is exploring various ways to adopt a truly Golden Open Access model, by rolling out a survey asking your opinion, set up by Michael Hicks. Even though I myself am most active in ACM’s Special Interest Group on Software Engineering (SIGSOFT), I do publish at and attend SIGPLAN conferences such as OOPSLA. And I sincerely hope that SIGSOFT will follow SIGPLAN’s lead on this important issue.

ACM presently supports green open access (self-archiving) and a concept called “Open TOC” in which papers are accessible via a dedicated “Table of Contents” page for a particular conference. While better than nothing, I agree with OOPSLA 2017 program chair Jonathan Aldrich who explains in his blog post that Golden Open Access is much preferred.

This does, however, raise the question who should pay for making publications open access, which is part of the SIGPLAN survey:

  • Attendants Pay: Increase the conference fees: SIGPLAN estimates that this would amount to an increase by around $50,- per attendee.

  • Authors Pay: Introduce Article Processing Charges: SIGPLAN indicates that if a full conference goes open access this would presently amount to $400 per paper.

Note that the math here suggests that the number of registrants is around 8 times the number of papers in the main research track. Also note that it assumes that only papers in the main research track are made open access. A conference like ICSE, however, has many workshops with many papers: It is equally important that these become open access too, which would change the math considerably.
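The arithmetic behind this observation can be made explicit with a small Python sketch. Only the $50 and $400 figures come from the SIGPLAN estimates above; the paper and attendee counts are hypothetical numbers chosen to match the factor of 8, and the function name is invented for illustration.

```python
def fee_increase_per_attendee(apc, n_papers, n_attendees):
    """Spread the total article processing charges evenly over all attendees."""
    return apc * n_papers / n_attendees


# Ratio implied by the SIGPLAN estimates: $400 per paper vs. $50 per attendee.
implied_ratio = 400 / 50  # 8.0 attendees per paper

# Hypothetical conference: 100 main-track papers and 800 attendees.
main_track_only = fee_increase_per_attendee(400, 100, 800)  # 50.0

# Same conference if 150 (hypothetical) workshop papers also go open access.
with_workshops = fee_increase_per_attendee(400, 100 + 150, 800)  # 125.0
```

Under these assumed numbers, including the workshop papers more than doubles the increase per attendee, which is why the scope of what is made open access matters so much for the comparison.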

The article processing charges of $400,- are presented as a given: They may seem in line with what commercial publishers charge, but they are certainly very high compared to what, e.g. LIPIcs charges for ECOOP (which is less than $100). These costs of $400,- come from ACM’s desire (need) to continue to make a substantial profit from their publishing activities, and should go down.

In his blog post, Jonathan Aldrich argues for the “author pays” model. His reasoning is that this can be viewed as a “funder pays” model: Most authors are funded by research grants, and usually in those grants funds can be found to cater for the costs involved in publishing open access.

On this point (and this point alone) I disagree with Jonathan. To me it feels fundamentally wrong to punish authors by making them pay $400 more for their registration. If anything, they should get a reduction for delivering the content of the conference.

I see Jonathan’s point that some funding agencies are willing to cover open access costs (e.g. NSF, NWO, H2020), and that it is worthwhile to explore how to tap into that money. But this requires data on what percentage of papers could be labeled as “funded”. For my department, I foresee several cases where it would be the department who’d have to pay for this instead of an external agency.

I do sympathize with Jonathan’s appeal to reduce conference registration costs, which can be very high. But the cost of making publications open access should be borne by the full community (all attendants), not just by those who happen to publish a paper.

Shining examples of open access computer science conferences are the Usenix, AAAI, and NIPS events. Full golden open access of all content, and no extra charges for authors — these conferences are years ahead of the ACM.

Do you have an opinion on “author pays” versus “participant pays”? Fill in the survey!

Thank you SIGPLAN for initiating this discussion!

Self-Archiving Publications in Elsevier Pure

In 2016, TU Delft adopted Elsevier Pure as its database to keep track of all publications from its employees.

At the same time, TU Delft has adopted a mandated green open access policy. This means that for papers published after May 2016, an author-prepared version (pdf) must be uploaded into Pure.

I am very happy with this commitment to green open access (and TU Delft is not alone). This decision also means, however, that we as researchers need to do some extra work, to make our author-prepared versions available.

To make it easier for you to upload your papers and comply with the green open access policy, here are some suggestions based on my experience so far working with Pure.

I can’t say I’m a big fan of Elsevier Pure. In the interest of open access, however, I’m doing my best to tolerate the quirks of Pure, in order to help the TU Delft to share all its research papers freely and persistently with everyone in the world.

Elsevier Pure is used at hundreds of different universities. If you work at one of them, this post may help you in using Pure to make your research available as open access.

The Outcome

Pure Paper Data

Anyone can browse publications in Pure, available at https://pure.tudelft.nl.

All pages have persistent URLs, making it easy to refer to a list of all your publications (such as my list), or individual papers (such as my recent one on crash reproduction). For all recent papers I have added a pdf of the version that we as authors prepared ourselves (aka the postprint), as well as a DOI link to the publisher version (often behind a paywall).

Thus, you can use Pure to offer, for each publication, your self-archived (green open access) version as well as the final publisher version.

Moreover, these publications can be aggregated to the section, department, and faculty level, for management reporting purposes.

In this way, Pure data shows taxpayers how their money is spent on academic research, and gives them free access to the outcomes. Taxpayers deserve that we invest some time in populating Pure with accurate data.

Accessing Pure

To enter publications into Pure, you’ll need to log in. On https://pure.tudelft.nl, in the footer at the right, you’ll find “Log into Pure”. Use your TU Delft netid.

If you’re interested in web applications, you will quickly recognize that Pure is a fairly old system, with user interface choices that would not be made these days.

Entering Meta-Data

You can start entering a publication by hitting the big green button “Add new” at the top right of the page. It will open a brand new browser window for you.

In the new window, click “Research Output”, which will turn blue and expand into three items.

Then there are several ways to enter a publication, including:

  1. Import via Elsevier Scopus, found via “Import from Online Source”. This is by far the easiest, if (1) your publication venue is indexed by Scopus, (2) it is already visible at Scopus (which typically takes a few months), and if (3) you can find it on Scopus. To help Scopus, I have set up an ORCID author identifier and connected it to my Scopus author profile.

  2. Import via Bibtex, found via “Import from file”. If you click it, importing from bibtex is one of the options. You can obtain bibtex entries from DBLP, Google Scholar, ACM, your departmental publications server, or write them by hand in your favorite editor, and then copy paste them into Pure.

  3. Entering details via a series of buttons and forms (“Create from template”). I recommend not to use this option. If you go against this advice, make sure that if you want to enter a conference paper, you do not pick the template “Paper/contribution to conference”, as you should pick “Conference Contribution/Chapter in Conference Proceedings” instead. Don’t ask me why.

In all cases, yet another browser window is opened, in which you can inspect, correct, and save the bibliographic data. After saving, you’ll have a new entry with a unique URL that you can use for sharing your publication. The URL will stay the same after you make additional updates.

Entering your Author-Prepared version

With each publication, you can add various “electronic versions”.

Each can be a file (pdf), a link to a version, or a DOI. For pdfs you want to upload, make sure you check that they meet the conditions under which your publisher allows self-archiving.

Pure distinguishes various version types, which you can enter via the “Document version” pull down menu. Here you need to include at least the following two versions:

  • The “accepted author manuscript”. This is also called a postprint, and is the version that (1) is fully prepared by you as authors; and that (2) includes all improvements you made after receiving the reviews. Here you can typically upload the pdf as you prepared it yourself.

  • The “final published version”. This is the Publisher’s version. It is likely that the final version is copyrighted by the publisher. Therefore, you typically include a link (DOI) to the final version, and do not upload a pdf to Pure. If you import from Scopus, this field is automatically set.

Furthermore, Pure permits setting the “access to electronic version”, and defining the “public access”. Relevant items include:

  • Open, meaning (green) open access. This is what I typically select for the “accepted author manuscript”.

  • Restricted, meaning behind a paywall. This is what I typically select for the final published version.

  • Embargoed, meaning that the pdf cannot be made public until a set date. This can be used for commercial publishers who insist on restricting access to post-prints from institutional repositories in the first 1-2 years.
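For the embargoed option, the end date has to be entered by hand. A minimal Python sketch of that date calculation follows. It assumes the embargo is counted in whole years from the publication date, which is an assumption for illustration only: the actual starting point and duration depend on your publisher's conditions. The function name is hypothetical.

```python
from datetime import date


def embargo_end(published, years=2):
    """Return the date on which an embargo of the given length ends.

    Assumes the embargo runs from the publication date; check your
    publisher's conditions for the actual rule.
    """
    try:
        return published.replace(year=published.year + years)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return published.replace(year=published.year + years, day=28)


end = embargo_end(date(2019, 3, 8))            # date(2021, 3, 8)
short = embargo_end(date(2019, 3, 8), years=1)  # date(2020, 3, 8)
```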

Example Pure cover page.

The vast majority (80%) of academic publishers permit authors to archive their accepted manuscripts in institutional repositories such as Pure. However, publishers typically permit this under specific conditions, which may differ per publisher. You can check out my Green Open Access FAQ if you want to learn more about these conditions, and how to find them for your (computer science) publisher.

Once uploaded, your pdf is available for download for everyone. Pure adds a cover page with meta-data such as the citation (how it is published) and the DOI to the final version. This cover page is useful, as it helps to meet the intent of the conditions most publishers require on green open access publishing.

Google Scholar indexes Pure, so after a while your paper should also appear on your Scholar page.

A Paper’s Life Cycle

Making papers available early is one of the benefits of self-archiving. This can be done in Pure by setting the paper’s “Publication Status”. This field can have the following values:

  1. “In preparation”: Literally a pre-print. Your paper can be considered a draft and may still change.
  2. “Submitted”: You submitted your paper to a journal or conference where it is now under review.
  3. “Accepted/In press”: Yes, paper accepted! This also means that you as an author can share your “accepted author manuscript”.
  4. “E-Pub ahead of print”: I don’t see how this differs from the Accepted state.
  5. “Published”: The paper is final and has been officially published.

In my Green Open Access FAQ I provide an answer to the question Which Version Should I Self-Archive.

I typically enter publications once accepted, and share the Pure link with the accepted author manuscript as a pre-print link on Twitter or on conference sites (e.g. ICSE 2018).

In particular, I do the following once my paper is accepted:

  1. I create a bibtex entry for an @inproceedings (conference, workshop) or @article (journal) publication.
  2. I upload the bibtex entry into Pure.
  3. I add my own pdf with the author-prepared version to the resulting Pure entry.
  4. I set the Publication Status to “Accepted”.
  5. I set the Entry Status (bottom of the page) to “in progress”.
  6. I save the entry (bottom of the page)
  7. I share the resulting Pure link on Twitter with the rest of the world so that they can read my paper.
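For the first step, writing the entry by hand works fine, but it can also be generated. Below is a minimal Python sketch that formats a BibTeX entry from a dictionary of fields. The helper name, the citation key, and all field values are placeholders invented for illustration, and the sketch does no escaping of special characters.

```python
def bibtex_entry(entry_type, key, fields):
    """Format a BibTeX entry such as @inproceedings or @article.

    fields: mapping from field name to value; values are wrapped in braces
    as-is, without any escaping.
    """
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder data, not a real publication.
entry = bibtex_entry(
    "inproceedings",
    "author2018example",
    {
        "author": "A. Author and B. Coauthor",
        "title": "An Example Paper Title",
        "booktitle": "Proceedings of an Example Conference",
        "year": "2018",
    },
)
print(entry)
```

The resulting text can be pasted into the bibtex import described earlier, after which the bibliographic data can still be inspected and corrected in Pure.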

Once the publisher actually manages to publish this paper as well (this may be several months later!), I update my Pure entry:

  1. I add the DOI link to the final published version.
  2. I provide the missing bibliographic meta-data (page numbers, volume, number, …).
  3. I set the Publication Status to “Published”.
  4. I set the Entry Status to “for approval” (by the library who can then change it into an immutable “approved” if they think this is a valid entry).

The preprint links I shared still contain a pointer to the self-archived pdf, but now also to the official version at the publisher for those who have access through the paywall.

Permalinks

The Pure page for your paper, including all meta-information and all versions of that paper (example), is in principle stable, and its URL provides a permanent link (unless you delete it).

You can also directly link to the individual pdfs you upload (example). However, these are more volatile: If you upload a newer version, the old link will be dead. Moreover, in some cases the (TU Delft) library has moved pdfs around, thereby destroying old pdf links.

Therefore, I recommend using links to the full record rather than individual pdfs when sharing Pure links.

Self-Archiving Elsevier Papers

Elsevier does not like it if you self-archive papers published in Elsevier journals into Elsevier Pure. The official rules are that Elsevier journal papers are subject to an embargo, yet at the same time can be published with a CC-BY-NC-ND license on arxiv.

Combining these two leads to the following steps, assuming you have a pre-print (never reviewed), and a post-print (the author-prepared accepted version after review).

  1. Upload your pre-print onto Arxiv.
  2. Add a footnote to your post-print stating: This manuscript version is made available under the CC-BY-NC-ND 4.0 license.
  3. Update your arxiv pre-print with your CC-BY-NC-ND licensed post-print, and add publication details (journal name, volume, issue) to your arxiv entry.
  4. Create a Pure entry for your journal paper
  5. Upload the post-print as author-accepted version to your Pure entry, make it available immediately, and set the license to CC-BY-NC-ND.

Note that the Elsevier rules explicitly allow steps 1-3, and in fact insist on the CC-BY-NC-ND license. Elsevier does not suggest you take step 5, but as a consequence of the CC-BY-NC-ND license you are permitted to do so.

What Elsevier would want you to do instead of step 5 is add the postprint to Pure under a (2 year) embargo, thus delaying (green) open access availability by 2 years. Elsevier Pure even supports this embargo option as one of the “access” options, in which you could enter the end-date of such an embargo.

Note: Yes, these steps are annoying. But: at the time of writing (2019), universities in Germany, Sweden, and California have no access to recent papers published by Elsevier. If you want your paper to be read in any of these places, make sure to upload it into your university repository. If you don’t want to go through these steps and you want your paper to be read, I recommend you pick a different publisher.

Complicated Author Names

Pure contains official employee names as registered by TU Delft.

Some authors publish under different (variants of their) names. For example, Dutch universities have trouble handling the complex naming habits of Portuguese and Brazilian employees.

If Pure is not able to map an author name to the corresponding employee, find the author name in the publication, click edit, and then click “Replace”. This allows searching the TU Delft employee database for the correct person.

If Pure has found the correct employee, but the name displayed is very different from what is listed on the publication itself, you can edit the author for that publication, and enter a different first and last name for this publication.

Exporting Linked Bibtex (to Orcid)

If you’re logged in, you can download your publication list in various formats, including BibTex (you’ll find the button for this at the bottom of the page).

I prefer bibtex entries that have a url back to the place where all info is. Therefore, I wrote a little Python script that scrapes a Pure web page (mine, yours, or anyone’s) and adds such information.

I use the bibtex entries produced by this script to populate my Orcid profile as well as our Departmental Publication Server with publications from Pure that link back to their corresponding Pure page.
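The script itself is not reproduced here. As an illustration of the core idea only, the following Python sketch adds a url field to an existing BibTeX entry. It is not the script mentioned above: it does no scraping, the function name is hypothetical, and the URL is a placeholder standing in for the address of a Pure record.

```python
def add_url(entry, url):
    """Return the BibTeX entry with a url field appended as its last field."""
    body = entry.rstrip()
    if not body.endswith("}"):
        raise ValueError("expected a BibTeX entry ending in '}'")
    # Drop the closing brace of the entry, then make sure the last
    # existing field is terminated by a comma before adding the new one.
    body = body[:-1].rstrip()
    if not body.endswith(","):
        body += ","
    return body + f"\n  url = {{{url}}}\n}}"


# Placeholder entry and placeholder URL.
original = "@article{author2018example,\n  title = {An Example Paper Title}\n}"
linked = add_url(original, "https://example.org/record")
print(linked)
```

Keeping the link to the full record, rather than to an individual pdf, matches the permalink advice given earlier.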


Version history

  • 20 November 2016: Version 0.1, for internal purposes.
  • 07 December 2016: Version 0.2, first public version.
  • 14 December 2016: Version 0.3, minor improvements.
  • 13 January 2017: Version 0.4, updated Google Scholar information.
  • 16 March 2017: Version 0.5, updated approval states based on correction from Hans Meijerrathken.
  • 17 March 2017: Version 0.6, life cycle and exporting added.
  • 24 November 2017: Version 0.7, simplified life cycle and approval states.
  • 03 March 2018: Version 0.8, added info on populating Orcid from Pure.
  • 27 July 2018: Version 0.9, added info on permalinks, licensed as CC BY-SA 4.0
  • 08 March 2019: Version 1.0, added info on publishing Elsevier papers.

Acknowledgments: Thanks to Moritz Beller for providing feedback and trying out Pure.

© Arie van Deursen, 2018. Licensed under CC BY-SA 4.0.