Exit Twitter. Enter Mastodon?

Posted in Research by Arie van Deursen

I used to love Twitter. It was in 2009 that I joined it as @avandeursen. I mostly used it for work, to share and discuss research and education in software engineering. Looking back, this is what I liked best about Twitter:

It allowed me to connect to people who shared similar interests, even if I had never met them before;
It gave me a platform to share my views on current developments in my research area with thousands of people;
It helped me stay current.

Unfortunately, I find it harder and harder to stay excited about Twitter:

Human dignity is increasingly under attack at Twitter, making it a platform of harassment and misinformation. Mechanisms to handle this, such as moderation or the ability to block, have been weakened instead of strengthened.
The owner of Twitter exhibits erratic and irrational behavior, thereby normalizing it.
Twitter randomly blocks or delays sources of useful information (e.g., Mastodon, New York Times), thereby making Twitter itself less useful.
Access to tweets has been limited: they are now only visible when logged in to Twitter, and older tweets (before 2014) are lost. To me, tweeting is a form of (micro-)publishing, so hiding tweets defeats the purpose.

Thus, it is unavoidable that I take a step back in my Twitter presence. Platform X doesn’t deserve my carefully worded tweets.

As an alternative, I have been exploring Mastodon during the last 12 months. It’s not perfect, but there are a few things to like about it.

For one, it is entirely open, being based on the open ActivityPub protocol, and running on open source software. Naturally, my posts are open too, openly visible to anyone, even those without a Mastodon account.

Furthermore, it is federated, meaning it is a collection of micro-blogging servers that exchange messages with each other. This has interesting consequences:

There is no single owner of Mastodon (or the broader Fediverse of all ActivityPub-compliant servers). Billionaires can’t buy it.
As a user, you can select a server to join, for example a server on your favorite hobby. While this sounds simple, in practice finding a meaningful server to join isn’t always easy (and is a potential roadblock to joining; see also below).
The server owners are responsible for formulating and enforcing an anti-harassment policy, which is another factor in selecting a suitable server. Server admins can also block entire other servers if they consider the posts on such a server incompatible with their own code of conduct.

My own journey started a year ago on mastodon.social. I picked this server as it was the first and largest Mastodon server (300,000 users at the time of writing). I soon found a few friends who were active on the smaller fediscience.org with around 2,000 users to date. I therefore moved to this server, and have been there for the last 10 months.

In that period I shared around 250 posts, and built up a network of 300-400 people. While these numbers are small compared to Twitter (where I had over 4,000 followers), I find the engagement more meaningful and more pleasant. Mastodon very much reminds me of Twitter in its early days, when everyone did their best to get the most out of this exciting new technology.

Recently I have moved my Mastodon account to the mastodon.acm.org instance – which wasn’t around yet when I originally joined Mastodon. This server is hosted by the Association for Computing Machinery (ACM), and has around 250 users at the moment. The server is intended to provide a trustworthy space for computing professionals from around the world to connect and engage with each other in a meaningful way – which sounds spot on for me.

As a computing professional, I am a (paying) member of the ACM. I am very happy to see that ACM offers this service, including moderating content to enforce the server rules to be “professional, respectful, inclusive, accessible, honest, and friendly”. Anyone interested in computing can join: I sure hope to see you there!

4 Jul 2022

TU Delft Computer Science Research Assessment 2015-2020

Posted in Research by Arie van Deursen

Last year, the TU Delft computer science research programs were evaluated, comprising a reflection on the preceding six years (2015-2020), and an outlook on the next six years.

The assessment follows the Strategy Evaluation Protocol (SEP), used by all Dutch universities, which focuses on research quality, societal impact, and viability. The assessment is conducted by an international external committee. It is based on a self-assessment written by us, as well as a two-day site visit by the committee.

Arie van Deursen and Alan Hanjalic in the Computer Science building

At TU Delft, computer science is organized into two departments: Intelligent Systems, chaired by Alan Hanjalic, and Software Technology, chaired by me.

In 2021, Alan and I worked hard together to compile our self-assessment report. It is based on extensive discussions with and contributions by many people, both inside and outside the two departments. It contains our thoughts on what we want computer science in Delft to achieve (our mission), what we did to achieve this (our strategy), an assessment of the success of our approach (our SWOT), and a plan of action for the years ahead (our future strategy).

We proudly make the full self-assessment report available via this link. The only modification is that, for reasons of privacy, we omitted some of the appendices that could be traced back too easily to individual faculty members.

As part of the protocol, the committee’s findings, as well as the reaction of the executive board to these findings, have been made available as well, at the central TU Delft site. The committee is “positive about the very high and often excellent research quality, the high quality of the staff as well as the energy, drive and potential of the primarily junior research staff of both departments,” and “recognizes the relevance and societal impact of the research carried out [by] the INSY and ST departments.”

We are grateful to the external committee, and in particular for the 17 recommendations that will help us further strengthen TU Delft computer science. We have integrated these recommendations into the action plan already laid out in our self-assessment, and look forward to working with everyone in our departments and our faculty to execute this action plan in the next few years.

Below, we provide the executive summary of our self-assessment, and we invite you to have a look at our full report.

Self-Assessment Summary

The phenomena of datafication and AI-zation reflect the increasing tendency to quantify everything through data and to automate the decision-making processes that are also largely based on data. Since these phenomena have entered all segments of our lives and since research in computer science (CS) is at the heart of the technological developments underlying these phenomena, CS as a research field has gained strategic importance. TU Delft Computer Science operates at the forefront of these developments with the aim to help society at large, by enabling it to maximally benefit from these phenomena, while protecting it from potential risks. To that end, inspired and driven by the TU Delft core values of Diversity, Inclusion, Respect, Engagement, Courage and Trust (DIRECT), our mission includes (1) conducting world class research in selected computer science core areas; (2) maximizing opportunities for societal impact of our research; (3) providing rigorous, research-inspired engineering education in computer science; and (4) contributing to an international academic culture that is open, diverse and inclusive, and that offers openly available knowledge.

We are organized in two departments, Intelligent Systems and Software Technology, consisting of 5 and 6 sections respectively. Sections are small-scale organizational units, typically headed by a full or associate professor and marking a particular CS disciplinary scope. While the departments are separate units, they work closely together in research and education, and collaborate for societal impact. The convergence between the departments in terms of alignment and joint pursuit of strategic and operational goals has even become so strong over recent years that we can speak of an increasingly recognizable CS entity in Delft organizing its research into five main themes transcending the departmental and section boundaries: (1) decision support; (2) data management and analytics; (3) software systems engineering; (4) networked and distributed systems; and (5) security and privacy. The themes offer critical mass in order to achieve substantial impact, and each theme involves many researchers with various CS backgrounds and expertise.

Award-winning research in these themes achieved during 2015-2020 includes a novel cross-modal (e.g., combining text and images) retrieval method based on adversarial learning; genetic algorithms for the automatic reproduction of software crashes to facilitate automated debugging; and Trustchain, a permission-less tamper-proof data structure for storing transaction records of agents with applications in digital identity. International recognition of our expertise is reflected by numerous leadership roles, e.g., as general or program chairs of flagship conferences such as AAAI, EuroGraphics, ACM/IEEE ICSE, ACM OOPSLA, ACM RecSys and ACM Multimedia. In the same time period, several staff members also received the highest (inter)national recognition in their fields, such as IEEE Fellow, membership of the Young Academy of the Royal Dutch Academy of Sciences, or the Netherlands Prize for ICT Research. Our scientific reputation also brought us into the consortia of two prestigious NWO Gravitation Projects (“NWO Zwaartekracht”) of the Dutch Research Council, Hybrid Intelligence and BRAINSCAPES, consortia that “belong to the world top in their field of research or have the potential to do so”.

To maximize societal impact, we embrace eight key sectors: transport and logistics, energy, health and well-being, safety and security, finance, online education, creative industry, and smart cities. To enable and support us in making substantial interdisciplinary impact in these sectors, we have built up expertise, a network of collaborators and societal partners, and established the necessary organizational structures. Prominent examples of our impact in these sectors include the NGS sequencing analysis pipeline we designed and implemented as part of the NIPT test, which is used routinely by hospitals in several countries; Cytosplore, a software system for interactive visual single-cell profiling of the immune system; and SocialGlass, a tool suite for integration, enrichment, and sense-making of urban data. Our close ties with society are also reflected in our strategic collaborations with socio-economic partners, such as ING, DSM, Booking.com, Adyen, Ripple, Erasmus Medical Center and Leiden University Medical Center, leading amongst other things to strategic investments in the form of three large industry-funded labs (with ING, DSM and Booking.com) set up in the assessed time period for a duration of five years. Furthermore, we have invested extensive effort in public outreach, explaining and discussing science with a broad audience, and in particular in the context of complex societal debates in the domain of AI and blockchain. Finally, we play a leading role in regional, national and European initiatives, most notably in the Dutch AI Coalition (NLAIC).

In addition to scientific excellence and strong impact in the selected societal sectors, we are committed to (a) meeting the increasing societal need for highly skilled CS experts, (b) development of human capital in our organization, leading to a new generation of international academic leaders, and (c) advancing the organization and academic culture, with the key pillars of open science, diversity and inclusion.

Regarding (a), we embraced a more than 100% increase in our student population, while still aiming to secure the highest possible level of knowledge, skills and academic development despite scaling up. Therefore, we value a close connection between research and education, and let both MSc and BSc students participate actively in our research. We also formulated an ambitious strategy, the realization of which would enable us to manage this education scale-up efficiently and effectively, leaving sufficient room to our staff for further developing scientific excellence and deploying it for societal impact. Part of this strategy is the growth of our academic staff towards 100 FTE by 2024 to meet the stabilization of the student numbers (due to numerus fixus). Between 2015 and 2020, we already achieved a net growth from 54 to 72 faculty members (+33%), with more to come in the upcoming years.

Next to BSc and MSc students, we are committed to delivering highly skilled CS experts at the PhD level. The number of PhD students grew from 105 to 165 (+57%) in the assessed time period, reflecting our ability to successfully acquire research funding in the present landscape. For our PhD students, the Graduate School defines a framework in which they can develop their skills next to conducting their thesis research. We strive towards completion of PhD theses within four years and organize our supervision, official moments of assessment, requirements on the volume and quality of the conducted research, as well as evidence of scientific impact through publications, accordingly.

Regarding (b), development of human capital: as computer science expertise is in high demand across the globe, finding strong new people as well as retaining our current staff proved highly challenging, especially given the high teaching load due to our record student intake. Therefore, acquiring, developing and retaining academic talent has been one of our most important goals. Dedicated actions, such as devising a Development Track Plan, serve to empower each staff member to contribute to the organization in their own way, based on individual interests, talents and ambitions, and in view of our joint ambition as an organization.

In view of (c), our organization, we embrace open science, with a substantial percentage (80% in 2020) of our articles available as open access, and by making numerous software tools and data sets openly available. We are a highly international organization with employees and students from all over the world. We strive to be an inclusive organization, where staff and students feel at home and valued, regardless of their background, age, gender, sexual orientation or functional disability. In terms of female faculty, we realized a net growth from 11 to 14 faculty members. As the number of men employed also increased, the percentage of female faculty stayed stable at around 20%. We consider this too low. We are committed to addressing this, for which we will take a long-term approach with, amongst other means, dedicated budget reserved for continued openings for female faculty in the upcoming years.

We are proud of our scientific successes and societal impact in the core computer science disciplines as well as in interdisciplinary research in our target societal sectors. This is especially so as those were achieved in a period that was transformational for TU Delft Computer Science, characterized by substantial growth and development across our organization and activities. We anticipate an even stronger societal demand for our research and expertise in the future. We will therefore continue to initiate, participate and take on a key role in effective and interdisciplinary partnerships at the university (TU Delft AI), regional (LDE), national (ICAI, IPN), and European (ELLIS, CLAIRE) levels. Furthermore, we will continue the growth path for our staff, in order to build up capacity enabling us to further develop our scientific excellence and offer our strongly increased student population the world-class research-intensive education they deserve. To achieve this, we center the next steps in our ongoing transformation around people, organization, and profiling and identify seven key actions for the upcoming years that aim at (1) improving our attractiveness as an employer; (2) improving diversity and inclusion; (3) improving the execution of the PhD program; (4) expanding our staff capacity; (5) aligning our office space with the optimal way of working; (6) articulating the scientific profile; and (7) boosting our scientific and societal impact.

15 Apr 2022

Eelco Visser (1966-2022)

Posted in Research by Arie van Deursen

Text of the eulogy for Eelco Visser (12 October 1966 – 5 April 2022) at his farewell ceremony, held in Leusden on 12 April 2022. Original text in Dutch.

I stand in front of you, in total disbelief, as head of the department for which Eelco Visser has worked the last 15 years.

I would like to offer you my perspective on Eelco’s significance, as a scientist, as a teacher, and as a person.

Eelco and I got to know each other in 1992, thirty years ago.

At the time, I was halfway through my PhD in Amsterdam, working in the group of Paul Klint. Eelco was studying in Amsterdam, following Paul’s courses. These were so inspiring to Eelco that he decided to join Paul’s group, first to write his master’s thesis, and then to work on his PhD.

It didn’t take long before Eelco and I had a connection. We had extensive discussions about research. The details don’t matter, and they didn’t really lead to concrete results. But thirty years later I still remember the substantive drive, the deep desire to profoundly understand a problem, the feeling of working on something very important, and, of course, Eelco’s tenacity.

Eelco has been able to maintain that same drive for thirty years. Just like he knew how to inspire me, year after year he has inspired his students and his (international) peers — always driven by content, always persistent.

This contributed to Eelco’s research being of the highest international level. Let me illustrate this through an award he received in 2020, a so-called Most Influential Paper award. This is an award you get ten years after publication, once it has been established that your paper has actually had the biggest impact.

Eelco received this award for his article from 2010 on Spoofax, written with (then) PhD student Lennart Kats. Eelco was very proud of this award, and rightly so. In fact, he was so proud that he wrote a (long) blog post about it, entitled “A short history of the Spoofax language workbench.”

This “short history” starts in 1993 with Eelco’s PhD research in Amsterdam. Next, Eelco explains his journey, from Portland as a postdoc, via Utrecht as assistant professor, to Delft as associate and full professor. Each of these stops provides building blocks for the award-winning paper from 2010. And then, Eelco’s “short history” continues: He describes what his group in those ten years after the paper’s publication has done, and what good things he still has in store for the time to come.

To me, this “short history” is signature Eelco:

Visionary, working year after year on building blocks that all belong together
System-driven, with working software that he preferably contributes to himself
In a team, together with his PhD students, postdocs, engineers, students, and numerous international co-workers.

This short history also serves to illustrate the international side of Eelco’s work. He was very active, and loved, within international organizations like IFIP and ACM SIGPLAN. He succeeded in bringing the flagship SPLASH conference to Europe for the first time.

And, naturally, Eelco had a vision of how to improve things: all those conferences putting effort into ad hoc web sites, there had to be a better way. And so, in line with his systems philosophy, he designed the CONF system, which has been up and running for ten years now. And he managed to convince hundreds of conferences to use his system, for a fee.

Likewise, Eelco had a vision on education, and he knew how to realize it. In his opinion, programming education just had to be better. Thus, he designed a system, WebLab, which has also been in operation for almost 10 years now. And here too he managed to convince countless teachers to use his system.

In addition, Eelco had a well-thought-out opinion about the courses that belong in a computer science program. So when we needed to revise our curriculum, Eelco was the perfect candidate to chair the associated committee. Eelco did this graciously, in a calm and persistent manner, reasoning from educational principles to settle disputes. The result is rock solid.

Eelco’s teaching is well characterized by what Willem-Paul Brinkman wrote in the online Farewell Community: “Without maybe realizing it, many generations of Delft students will benefit from his teaching innovations.”

Eelco was proud of his Programming Languages Group. He built it up from scratch into an international top group. He took good care of his people, fighting for the best equipment and offices. As a member of the departmental management team, he fought for computer science in full, at faculty and university level. Nationally he was active in, for example, the Dutch National Association for Software Engineering VERSEN.

And how was Eelco able to realize all this? What was his secret?

Perhaps Eelco actually liked (a little) resistance. He was not afraid to disagree: after all, he had thought deeply about his opinion. And he was fine with being challenged: it was a sign that he was well on his way to breaking the status quo.

Maybe not everyone always found this easy. But Eelco was also very friendly, and certainly willing to change his mind.

And, Eelco was also patient: Big changes take time. If he saw that he had insufficient supporters, he could wait. Or, under the radar, start small in order to set his own plans in motion.

How much we will miss Eelco in Delft! The visionary, the obstinate, the focus on the content, the love for computer science, the tenacity, and the attention for students and colleagues: exactly what we need so much in Delft.

Let me conclude with a few words related to Corona and the lockdown. The past few years, Eelco and I were in touch weekly, mostly within the departmental management team, but also often one-on-one. All online, from home. We discussed departmental matters, small and large, as well as the impact of Corona. On one thing we agreed: being at home more, seeing more of your children, doing more with the family: we both experienced this as a gift.

Due to the lockdown, I don’t know when it was that I last saw Eelco in person. I think it was on October 14, at the PhD defense of Arjen Rouvoet. This was a beautiful day, and this is how I like to remember Eelco: science at the highest level, international peers in substantive debate, a cum laude PhD defense, and Eelco happy, radiant in the midst of his PL group.

Dear family, friends, colleagues, everyone: We will miss Eelco very much. Each of us has his or her own, wonderful memories of Eelco. Today is a day to hold on to that, and to share those memories with each other.

I wish you all, and especially the family, all the best.

15 Apr 2022

Eelco Visser (1966-2022)

Posted in Research by Arie van Deursen

Speech delivered at the funeral service of Eelco Visser (12 October 1966 – 5 April 2022) on 12 April 2022 in Leusden. An English translation is available.

I stand here, in total bewilderment, as head of the department where Eelco Visser has worked for the past 15 years.

I would like to tell you something about what Eelco meant, as a scientist, as a teacher, and as a person.

Eelco and I got to know each other in 1992, thirty years ago.

At the time I was halfway through my PhD in Amsterdam, working in Paul Klint’s group. Eelco was then a student in Amsterdam and took Paul’s courses. They inspired him so much that he joined Paul’s group, first to complete his master’s thesis and later to pursue his PhD.

Eelco and I soon clicked. We had extensive discussions about research. The details do not matter, and they never led to a real result. But thirty years later I still remember the drive for substance, the wish to truly understand a problem, the feeling of working together on something very important, and, of course, Eelco’s tenacity.

Eelco managed to keep up that same drive for thirty years. Just as he was able to inspire me, year after year he drew in his students, his PhD students, and his international colleagues, always starting from the content, and always tenacious.

Partly because of this, Eelco’s research was of the highest international level. Let me illustrate this with an award he received in 2020: a so-called “Most Influential Paper” award. You only receive such an award if, 10 years after publication, your article turns out to have had the most influence.

Eelco received it for his 2010 article on Spoofax, written with PhD student Lennart Kats. Eelco was, rightly, very proud of this. So proud that he wrote a (long) blog post about it, entitled “A short history of the Spoofax language workbench.”

This “short history” begins in 1993 with Eelco’s PhD research in Amsterdam. Eelco then describes his journey, from Portland as a postdoc, via Utrecht as assistant professor, to Delft as associate professor and full professor. Each of these stops yields building blocks for the winning paper from 2010. And then Eelco’s “short history” continues: he describes what his group did in the ten years after the paper, and what good things he still has in store for the time to come.

To me, this “short history” is Eelco through and through:

visionary, working year after year on building blocks that all belong together
system-driven, with working software systems that he preferably helps program himself
in a team, together with his PhD students, postdocs, engineers, students, and countless international colleagues.

This short history also shows something about the international dimension of Eelco’s work. He was very active, and well loved, within international organizations such as IFIP and ACM SIGPLAN. He succeeded in bringing the top conference SPLASH to Europe for the first time.

And of course Eelco had a vision of how things could be better: all those conferences cobbling together their own web sites, surely that could be done more efficiently. And so, in line with his systems philosophy, he designed the CONF system, which has now been up and running for ten years. And he managed to convince hundreds of conferences to use his system, for a fee.

In education, too, Eelco had a vision, and he knew how to realize it. Programming education had to be better, he felt, and so he designed a system, WebLab, which has now also been up and running for almost 10 years. Here, too, he managed to convince countless teachers to use his system.

In addition, Eelco had a well-considered opinion about which courses belong in a computer science program. When we had to revise our curriculum, Eelco was the perfect candidate to chair the committee in charge. Eelco did this with verve, calmly and tenaciously, reasoning from educational principles to settle disputes. The result is rock solid.

What Willem-Paul Brinkman wrote in the online book of condolences holds for Eelco’s teaching: “Without perhaps realizing it, many generations of Delft students will benefit from his educational innovations.”

Eelco was proud of his “Programming Languages Group”. He built it up from nothing into an international top group. He took good care of his people, and fought for the best equipment and workplaces. As a member of the department’s management team, he committed himself to computer science across its full breadth, at faculty and university level. He was also active nationally, among other things in VERSEN, the Dutch National Association for Software Engineering.

And how did Eelco get all this done? What was his secret?

Perhaps Eelco actually rather liked (a little) resistance. He was not afraid to push back: after all, he had thought carefully about his opinion. And he was fine with meeting opposition: that was a sign he was well on his way to breaking the status quo.

Perhaps not everyone always found this easy. But Eelco was also very kind, and certainly willing to change his mind.

And Eelco was patient, too: big changes take time. If he saw that he did not have enough supporters, he could wait. Or, under the radar, start small anyway to set his own plans in motion.

How we will miss Eelco in Delft. The visionary side, the contrariness, the focus on content, the love for the field, the tenacity, and the attention for students and colleagues: that is exactly what we need in Delft.

I would like to close with a few words prompted by Corona and the lockdown. Over the past two years, Eelco and I were in touch weekly, mostly within the department’s management team, but also often one-on-one, all online, from home. We discussed the ins and outs of the department, and the impact of Corona. On one thing we agreed: being at home more, seeing more of your children, doing more with the family: we both experienced this as a gift.

Because of the lockdown, I do not know when I last really saw Eelco in person. I think it was on 14 October, at the PhD defense of Arjen Rouvoet. That was a beautiful day, and this is how I like to remember Eelco: science of the highest level, international colleagues in substantive debate, a cum laude PhD defense, and Eelco happy, beaming in the midst of his PL group.

Dear family, friends, colleagues, everyone: we will miss Eelco very much. Each of us has his or her own wonderful memories of Eelco. Today is a day to hold on to them, and to share those memories with each other.

I wish you all, and the family in particular, much strength.

29 Jul 2018

Writers and Collaborators Workshops

Posted in Research by Arie van Deursen

In September this year we will organize a Lorentz workshop in the area of software analytics and big software. Lorentz workshops take place in the Lorentz Center in Leiden, and are somewhat similar to the Dagstuhl seminars common in computer science: a small group, a week-long retreat, and a focus on interaction and collaboration.

Poster of the workshop “In Vivo Analytics for Big Software Quality”

To make this interaction happen, we will experiment with “writer’s and collaborator’s workshops”, inspired by Writer’s Workshops for design patterns.

The workshops we have in mind are short (1-2 hour) sessions, in which a small group of participants (the “discussants”) study one or more papers (proposed by the “author”) in depth.

The primary objective of the session is to provide feedback on the paper to the author. This feedback can relate to any aspect of the paper, such as the experimental setup, the related work, the precise objective, future work that can be carried out to build upon these results, etc.

Besides that, the discussion of each paper serves to explore possible (future) collaborations between any of the participants. Thus, discussants can bring in their own related work, and explore how joining forces can help to further advance the paper’s key results.

The setup of the workshops draws inspiration from Writer’s Workshops commonly used in the design patterns community, which in turn were inspired by workshops in the creative literature community. Pattern workshops have been used to review, evaluate, and improve pattern descriptions. At the same time, the process is akin to a peer review process, except that the objective is not paper selection, but in-depth discussion between authors and participants about the key message of a paper.

The specific format we propose is as follows.

The preparation phase aims to match authors and discussants. Using a conference paper management system like EasyChair, the steps include:

Authors submit the paper they would like to have discussed. This can be a paper currently under review (e.g., their most recent ICSE submissions), a draft version of a paper they would like to submit, or an already published paper they would like to expand (for example for a journal submission).
All workshop participants can see all papers, and place “bids” on papers they would be particularly interested in studying in advance.
Papers and participants are grouped into coherent units of 3-4 papers and around 10 participants each.
Each paper to be discussed gets assigned at least three discussants, based on the groups and bids.
Discussants study their assigned papers in advance, and compile a short summary of each paper and its main strengths and points for improvement.
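
The matching of discussants to papers could be sketched as follows. This is only a minimal greedy illustration, not what EasyChair does: the function name `assign_discussants`, the participant names, and the tie-breaking rule (bidders first, then lightest load) are all assumptions made for the example.

```python
from collections import defaultdict

def assign_discussants(bids, participants, authors, per_paper=3):
    """Greedy sketch: give each paper `per_paper` discussants, preferring bidders.

    bids:    dict mapping paper -> set of participants who bid on it
    authors: dict mapping paper -> its author (never assigned to their own paper)
    """
    load = defaultdict(int)  # number of papers already assigned to each participant
    assignment = {}
    for paper in sorted(bids):
        candidates = [p for p in participants if p != authors[paper]]
        # Bidders come first, then whoever has the lightest load; names break ties.
        candidates.sort(key=lambda p: (p not in bids[paper], load[p], p))
        chosen = candidates[:per_paper]
        for p in chosen:
            load[p] += 1
        assignment[paper] = chosen
    return assignment

# Hypothetical participants, authors, and bids.
participants = ["ana", "bo", "chen", "dee", "eli"]
authors = {"paper-A": "ana", "paper-B": "bo"}
bids = {"paper-A": {"chen", "dee"}, "paper-B": {"ana", "chen"}}

result = assign_discussants(bids, participants, authors)
for paper, discussants in result.items():
    print(paper, discussants)
```

A real assignment would also have to respect the groups of 3-4 papers and conflicts of interest, which this sketch ignores.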

Each actual workshop will take 1-2 hours, have up to 10 participants, and include the discussion of 2-3 papers, using 30-45 minutes per paper. We propose the following format:

For each workshop, we assign one moderator to steer the process.
One of the discussants is assigned to summarize the paper in around 5 minutes, and explain it to the participants.
Each discussant explains what he or she particularly liked about the paper
Each discussant identifies opportunities for possible improvements to the paper.
Workshop participants who did not review the paper themselves offer their perspectives on the discussion, including areas of further work.
After this, the author him or herself can step in, and respond to the various points raised.
As the discussion concludes, the moderator provides a summary of the main findings of the discussion of this paper.
The process is repeated for the next paper, rotating the author, moderator, and discussant roles.

If you have ever attended a physical PC meeting, you will recognize our attempt to keep some of the good parts of a PC meeting, without the need to make any form of “acceptance” decision.

Since several of the lessons learned during such a session will transcend the individual papers discussed, we will also use plenary sessions in which each of the moderators can summarize the main findings of their workshops, and share them with everyone.

As also emphasized by the patterns community, this format requires a safe setting with participants who trust each other. In particular:

Papers discussed are confidential: Authors need not be scared that participants “steal” their ideas;
Feedback is directed at the work rather than the author, preserving the dignity of the author.

Clearly, such a “writers and collaborators workshop” does require work from the participants, both in terms of following the protocol and in preparing the discussions. So we will have to see if it really works or whether some adjustments are necessary.

Yet this format does provide an excellent way to truly engage with each other’s research, and we look forward to the improved research results and future collaborations that will emerge from this.

If you have any experience with formats like this, please let me know!

P.S. We still have some places available in this Lorentz workshop, so contact me if you are interested in participating.

23Jul2018

The Battle for Affordable Open Access

Posted in Research by Arie van Deursen

Last week, Elsevier cut off thousands of scientists in Germany and Sweden from reading its recent journal articles, when negotiations over the cost of a nationwide open-access agreement broke down.

In these negotiations, universities are trying to change academic publishing, while publishers are defending the status quo. If you are an academic, you need to decide how to respond to this conflict:

If you don’t change your own behavior, you are choosing Elsevier’s side, helping them maintain the status quo.
If you are willing to change, you can help the universities. The simplest thing to do is to rigorously self-archive all your publications.

The key reason academic publishing needs to change is that academic publishers, including Elsevier, realize profit margins of 30-40%.

Euro bills

To put this number in perspective, consider my university, TU Delft. Our library spends €4-5 million each year on (journal) subscriptions. 30-40% of this amount, €1-2 million each year, ends up directly in the pockets of the shareholders of commercial publishers.

This is unacceptable. My university needs this money: To handle the immense work load coming with ever increasing student numbers, and to meet the research demands of society. A university cannot afford to waste money by just handing it over to publishers.

Universities across Europe have started to realize this. The Dutch, German, French, and Swedish universities have negotiated at the national level with publishers such as Springer Nature, Wiley, Taylor & Francis, Oxford University Press, and Elsevier (the largest publisher). In many cases deals have been made, with more and more options for open access publishing, at prices that were acceptable to the universities.

However, in several cases no deals have been made. The Dutch universities could not agree with the Royal Society of Chemistry Publishing, the French failed with Springer Nature, and now Germany and Sweden could not come to agreement with Elsevier. A common point of contention is that universities are only willing to pay for journal subscriptions if their employees can publish open access without additional article processing charges — a demand that directly challenges the current business model in academic publishing.

The negotiations are not over yet. Both in terms of open access availability and in terms of price, publishers are far from where the universities want them to be. And if the universities did not negotiate themselves, tax payers and governments could simply force them, by putting a cap on the amount of money universities are allowed to spend on journal subscriptions.

Universities are likely to join forces, also across nations. They will determine maximum prices, and will not be willing to make exceptions. The negotiations will be brutal, as the publishers have much to lose and much to fight for.

In all these negotiations it is crucial that universities take back ownership of what they produce. Every single researcher can contribute, simply by making all of their own papers available on their institutional (pure.tudelft.nl for my university) or subject repositories (e.g., arxiv.org). This helps in two ways:

It helps researchers who are cut off from publishers when negotiations fail (Germans and Swedes as we speak).
It reduces the publishers’ power in future negotiations as the negative effects of cancellations have been reduced.

This seems like a simple thing to do, and it is: It should not take an average researcher more than 10 minutes to post a paper on a public repository.

Nevertheless, during my two years as department head I have seen many researchers who fail to see the need or take the time to upload their papers. I have begged, prayed, and pushed, written a green open access FAQ to address any legal concerns researchers might have, and written a step-by-step guide on how to upload a paper.

On top of that, my university, like many others, has made it compulsory for its employees to upload their papers to the institutional repository (this is not surprising since TU Delft plays a leading role in the Dutch negotiations between universities and publishers). Furthermore, both national (NWO) and European (H2020, Horizon Europe) funding agencies insist on open access publications.

Despite all this, my department barely meets the university ambition of having 60% of its 2018 publications available as (green or gold) open access. To the credit of my departmental employees, however, they do better than many other departments. Also pre-print links uploaded to conference sites have typically been less than 60%, suggesting that the culture of self-archiving in computer science leaves much to be desired.

If anything, the recent cut off by Elsevier in Sweden and Germany emphasizes the need for self-archiving.

If you’re too busy to self-archive, you are helping Elsevier get rich from public money.

If you do self-archive, you help your university explain to publishers that their services are only needed when they bring true value to the publishing process at an affordable price.

Euro image credit: pixabay, CC0 Creative Commons.

25Jun2018

My Last Program Committee Meeting?

Posted in Research by Arie van Deursen

This month, I participated in what may very well have been my last physical program committee (PC) meeting, for ESEC/FSE 2018. In 2017, top software engineering conferences like ICSE, ESEC/FSE, ASE and ISSTA (still) had physical PC meetings. In 2019, these four will all switch to on line PC meetings instead.

I participated in almost 20 of such meetings, and chaired one in 2017. Here is what I learned and observed, starting with the positives:

As an author, I learned the importance of helping reviewers to quickly see and concisely formulate the key contributions in a way that is understandable to the full PC.
As a reviewer I learned to study papers so well that I could confidently discuss them in front of 40 (randomly critical) PC members.
During the meetings, I witnessed how reviewers can passionately defend a paper as long as they clearly see its value and contributions, and how they will kill a paper if it has an irreparable flaw.
I started to understand reviewing as a social process in which reviewers need to be encouraged to change their minds as more information unfolds, in order to arrive at consensus.
I learned phrases reviewers use to permit them to change their minds, such as “on the fence”, “lukewarm”, “not embarrassing”, “my +1 can also be read as a -1”, “I am not an expert but”, etc. Essential idioms to reach consensus.
I witnessed how paper discussions can go beyond the individual paper, and trigger broad and important debate about the nature of the arguments used to accept or reject a paper (e.g. on evaluation methods used, impact, data availability, etc).
I saw how overhearing discussions of papers reviewed by others can be useful, both to add insight (e.g. additional related work) and to challenge the (nature of the) arguments used.
I felt, when I was PC co-chair, the pressure from 40 PC members challenging the consistency of any decision we made on paper acceptance. In terms of impact on the reviewing process, this may well be the most important benefit of a physical PC meeting.
I experienced how PC meetings are a great way to build a trusted community and make friends for life. I deeply respected the rigor and well articulated concerns of many PC members. And nothing bonds like spending two full days in a small meeting room with many people and insufficient oxygen.

I also witnessed some of the problems:

My biggest struggle was the incredible inefficiency of PC meetings. They take 1-2 days from 8am-6pm, you’re present at discussions of up to 100 papers discussed in 5-10 minutes each, yet participate in often less than 10 papers, in some cases just one or two.
I had to travel long distances just for meetings. Co-located meetings (e.g. the FSE meeting is typically immediately after ICSE) reduce the footprint, but I have crossed the Atlantic multiple times just for a two day PC meeting.
My family paid a price for my absence caused by almost 20 PC meetings. I have missed multiple family birthdays.
The financial burden on the conference (meeting room + 40 x dinner and 80 lunches, €5000) and each PC member (travel and 2-3 hotel nights, adding up easily to €750 per person paid by the PC members) is substantial.
I saw how vocal PC members can dominate discussions, yielding less opportunity for the more timid PC members who need more time to think before they dare to speak.
At nearly every PC meeting I attended, at least a few PC members eventually had to cancel their trip, and at best participated via Skype. This gives papers reviewed by these PC members a different treatment. When I was PC chair for ESEC/FSE, we had five PC members who could not make it, all for valid (personal, painful) reasons. I myself had to cancel one PC meeting a week before the meeting, when one of my children had serious health problems.
Insisting on a physical PC meeting limits the choice of PC members: When inviting 40 PC members for ESEC/FSE 2017, we had 20 candidates who declined our invitation as they could not commit a year in advance to attending a PC meeting (in Buenos Aires).

Taking the pros and cons together, I have come to believe that the benefits do not outweigh the high costs. It must be possible to organize an on line PC meeting with special actions to keep the good parts (quality control, consistent decisions, overhearing/inspecting each other’s reviews, …).

I look forward to learning from ICSE, ESEC/FSE, ISSTA and ASE experiences in 2019 and beyond about best practices to apply for organizing a successful on line PC meeting.

In principle, ICSE will have on line PC meetings in 2019, 2020, and 2021, after which the steering committee will evaluate the pros and cons.

As ICSE 2021 program co-chairs, Tao Xie and I are very happy about this, and we will do our best to turn the ICSE 2021 on line PC meeting into a great success, for the authors, the PC members, and the ICSE community. Any suggestions on how to achieve this are greatly appreciated.

T-Shirt saying "Last PC Meeting Ever?"

Christian Bird realized the ESEC/FSE 2018 PC meeting may be our last, and realized this nostalgic moment deserved a T-shirt of its own. Thanks!!

15Apr2017

Managing Complex Spreadsheets — The Story of PerfectXL

Posted in Research, Society by Arie van Deursen

This week we finished grading of the software architecture course I’m teaching.

Like many teachers, I use a spreadsheet for grading together with my co-teachers and teaching assistants. In this case, we concurrently worked with five people on a Google Spreadsheet. The resulting spreadsheet is quite interesting:

The spreadsheet currently has 22 sheets (tabs)
There are input sheets for basic information on the over one hundred students in the class, the groups they form, and the rubrics we use.
There are input sheets from various forms the students used to enter course-related information
There are input sheets for different sub-assignments, which the teachers and assistants use to enter subgrades for each rubric: Some grades are individual, others are per team. Such sheets also contain basic formulas to compute grades from rubrics.
There are overview sheets collecting the sub-grades from various sheets, combining them to overall grades. The corresponding formulas can become quite tricky, involving rounding, lookups, sumproducts, thresholds, conditional logic based on absence or presence of certain grades, etc.
There are various output sheets, to report grades to students, to export grades to the university’s administrative systems, and to offer diagrams showing grade distributions for educational assessments of the course.

The spreadsheet used has a history of five years: Each year we take the existing one, copy it, and remove the student data. We then adjust it to the changes we made to the course (additional assignments, new grading policies, better rubrics, etc).

All in all, this spreadsheet has grown quite complex, and it is easy to make a mistake. For example, I once released incorrect grades, a rather stressful event both for my students and myself. And all I did wrong was forget the infamous false argument needed in a vlookup, despite the fact that I was well aware of this “feature”. For this year’s spreadsheet we had duplicate student ids, in a column where each row had to be unique, leading to a missing grade, and again extra effort and stress to resolve this as soon as possible.
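
Both mistakes are easy to illustrate outside a spreadsheet. The sketch below is a simplified Python emulation, not how any spreadsheet engine is implemented: `lookup_approximate` mimics a lookup without the final false argument (it assumes sorted keys and quietly falls back to the nearest smaller key), `lookup_exact` mimics the lookup with false, and `duplicate_ids` checks the uniqueness of an id column. All function names, student ids, and grades are made up for the example.

```python
import bisect
from collections import Counter

def lookup_approximate(rows, key):
    """Lookup WITHOUT the `false` argument: assumes sorted keys and
    silently returns the value of the largest key <= the requested key."""
    keys = [k for k, _ in rows]
    i = bisect.bisect_right(keys, key) - 1
    return rows[i][1] if i >= 0 else None

def lookup_exact(rows, key):
    """Lookup WITH the `false` argument: only an exact key match counts."""
    for k, value in rows:
        if k == key:
            return value
    return None  # a spreadsheet would show an error here instead

def duplicate_ids(ids):
    """Return the ids that occur more than once, in sorted order."""
    return sorted(k for k, n in Counter(ids).items() if n > 1)

# Made-up student ids and grades; student 1003 has no grade yet.
grades = [(1001, 7.5), (1002, 6.0), (1004, 9.0)]

print(lookup_approximate(grades, 1003))  # student 1002's grade: wrong, and no warning
print(lookup_exact(grades, 1003))        # None: the missing grade is visible
print(duplicate_ids([1001, 1002, 1002, 1004]))
```

The approximate variant is the dangerous one precisely because it returns a plausible-looking grade instead of failing.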

I suspect that if you use spreadsheets seriously, for example for grading, you recognize the characteristics of my spreadsheet — and maybe your sheets are even more complicated.

Now I have an interest in spreadsheets that goes beyond that of the casual user: As a software engineering researcher, I have looked extensively at spreadsheets. I did this together with Felienne Hermans, first when she was doing her PhD under my supervision in the context of the Perplex project (co-funded by Microsoft) and then in the context of the Prose project (funded by the Dutch STW agency). From a research perspective, these projects were certainly successful, leading to a series of publications in such venues as ECOOP 2010, ICSE 2011-2013, ICSM, EMSE, and SANER.

But we did our research not just to publish papers: We also had (and have) the ambition to actually help the working spreadsheet user, as well as large organizations that depend on spreadsheets for business-critical decision making.

To that end, we founded a company, Infotron, which offers tooling, services, consultancy, and training to help organizations and individuals become more effective with their spreadsheets.

After several years of operating somewhat under the radar, the Infotron team (headed by CEO Matéo Mol) has now launched an on line service, PerfectXL, in which users can upload a spreadsheet and get it analyzed. The service then helps in the following ways:

PerfectXL can visualize the architectural dependencies between sheets, as shown above for my course sheet;
PerfectXL can identify risks (such as the vlookup I mentioned, interrupted ranges, or overly complex conditional logic);
PerfectXL can assess the structure and quality of the formulas in your sheet.

If this sounds interesting, you can try out the service for free at perfectxl.com. There are various pricing options that help Infotron run and further grow this service — pick the subscription that suits you and your organization best!

Even if you decide not to use the PerfectXL service, the site contains a lot of generally useful information, such as various hints and tips on how to create and maintain well-structured spreadsheets.

Enjoy!

5Jan2017

Golden Open Access for the ACM: Who Should Pay?

Posted in Research by Arie van Deursen

In a move that I greatly support, the ACM Special Interest Group on Programming Languages (SIGPLAN) is exploring various ways to adopt a truly Golden Open Access model, by rolling out a survey asking your opinion, set up by Michael Hicks. Even though I myself am most active in ACM’s Special Interest Group on Software Engineering (SIGSOFT), I do publish at and attend SIGPLAN conferences such as OOPSLA. And I sincerely hope that SIGSOFT will follow SIGPLAN’s leadership on this important issue.

ACM presently supports green open access (self-archiving) and a concept called “Open TOC” in which papers are accessible via a dedicated “Table of Contents” page for a particular conference. While better than nothing, I agree with OOPSLA 2017 program chair Jonathan Aldrich who explains in his blog post that Golden Open Access is much preferred.

This does, however, raise the question of who should pay for making publications open access, which is part of the SIGPLAN survey:

Attendants Pay: Increase the conference fees: SIGPLAN estimates that this would amount to an increase by around $50,- per attendee.
Authors Pay: Introduce Article Processing Charges: SIGPLAN indicates that if a full conference goes open access this would presently amount to $400 per paper.

Note that the math here suggests that the number of registrants is around 8 times the number of papers in the main research track. Also note that it assumes that only papers in the main research track are made open access. A conference like ICSE, however, has many workshops with many papers: It is equally important that these become open access too, which would change the math considerably.
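
To make this arithmetic explicit, here is a small sketch. Only the $400 and $50 figures come from the SIGPLAN estimates quoted above; the paper and attendee counts, and the helper name `fee_increase`, are hypothetical and merely illustrate how workshop papers shift the numbers.

```python
APC_PER_PAPER = 400              # SIGPLAN estimate quoted above
FEE_INCREASE_PER_ATTENDEE = 50   # SIGPLAN estimate quoted above

# Both options must raise the same total, so:
attendees_per_paper = APC_PER_PAPER / FEE_INCREASE_PER_ATTENDEE
print(attendees_per_paper)  # 8.0 registrants per main-track paper

def fee_increase(papers, attendees, apc=APC_PER_PAPER):
    """Per-attendee fee increase needed to cover open access for `papers` papers."""
    return papers * apc / attendees

# Hypothetical conference: 100 main-track papers and 800 attendees.
print(fee_increase(100, 800))        # 50.0
# Same conference, but now also opening up 150 hypothetical workshop papers.
print(fee_increase(100 + 150, 800))  # 125.0
```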

The article processing charges of $400,- are presented as a given: They may seem in line with what commercial publishers charge, but they are certainly very high compared to what, e.g. LIPIcs charges for ECOOP (which is less than $100). These costs of $400,- come from ACM’s desire (need) to continue to make a substantial profit from their publishing activities, and should go down.

In his blog post, Jonathan Aldrich argues for the “author pays” model. His reasoning is that this can be viewed as a “funder pays” model: Most authors are funded by research grants, and usually in those grants funds can be found to cater for the costs involved in publishing open access.

On this point (and this point alone) I disagree with Jonathan. To me it feels fundamentally wrong to punish authors by making them pay $400 more for their registration. If anything, they should get a reduction for delivering the content of the conference.

I see Jonathan’s point that some funding agencies are willing to cover open access costs (e.g. NSF, NWO, H2020), and that it is worthwhile to explore how to tap into that money. But this requires data on what percentage of papers could be labeled as “funded”. For my department, I foresee several cases where it would be the department who’d have to pay for this instead of an external agency.

I do sympathize with Jonathan’s appeal to reduce conference registration costs, which can be very high. But the cost of making publications open access should be borne by the full community (all attendants), not just by those who happen to publish a paper.

Shining examples of open access computer science conferences are the Usenix, AAAI, and NIPS events. Full golden open access of all content, and no extra charges for authors — these conferences are years ahead of the ACM.

Do you have an opinion on “author pays” versus “participant pays”? Fill in the survey!

Thank you SIGPLAN for initiating this discussion!

7Dec2016

Self-Archiving Publications in Elsevier Pure

Posted in Research by Arie van Deursen

In 2016, TU Delft adopted Elsevier Pure as its database to keep track of all publications from its employees.

At the same time, TU Delft has adopted a mandated green open access policy. This means that for papers published after May 2016, an author-prepared version (pdf) must be uploaded into Pure.

I am very happy with this commitment to green open access (and TU Delft is not alone). This decision also means, however, that we as researchers need to do some extra work, to make our author-prepared versions available.

To make it easier for you to upload your papers and comply with the green open access policy, here are some suggestions based on my experience so far working with Pure.

I can’t say I’m a big fan of Elsevier Pure. In the interest of open access, however, I’m doing my best to tolerate the quirks of Pure, in order to help the TU Delft to share all its research papers freely and persistently with everyone in the world.

Elsevier Pure is used at hundreds of different universities. If you work at one of them, this post may help you in using Pure to make your research available as open access.

The Outcome

Anyone can browse publications in Pure, available at https://pure.tudelft.nl.

All pages have persistent URLs, making it easy to refer to a list of all your publications (such as my list), or individual papers (such as my recent one on crash reproduction). For all recent papers I have added a pdf of the version that we as authors prepared ourselves (aka the postprint), as well as a DOI link to the publisher version (often behind a paywall).

Thus, you can use Pure to offer, for each publication, your self-archived (green open access) version as well as the final publisher version.

Moreover, these publications can be aggregated to the section, department, and faculty level, for management reporting purposes.

In this way, Pure data shows the tax payers how their money is spent on academic research, and gives the tax payer free access to the outcomes. The tax payer deserves that we invest some time in populating Pure with accurate data.

Accessing Pure

To enter publications into Pure, you’ll need to log in. On https://pure.tudelft.nl, in the footer at the right, you’ll find “Log into Pure”. Use your TU Delft netid.

If you’re interested in web applications, you will quickly recognize that Pure is a fairly old system, with user interface choices that would not be made these days.

Entering Meta-Data

You can start entering a publication by hitting the big green button “Add new” at the top right of the page. It will open a brand new browser window for you.

In the new window, click “Research Output”, which will turn blue and expand into three items.

Then there are several ways to enter a publication, including:

Import via Elsevier Scopus, found via “Import from Online Source”. This is by far the easiest, if (1) your publication venue is indexed by Scopus, (2) it is already visible at Scopus (which typically takes a few months), and if (3) you can find it on Scopus. To help Scopus, I have set up an ORCID author identifier and connected it to my Scopus author profile.
Import via Bibtex, found via “Import from file”. If you click it, importing from bibtex is one of the options. You can obtain bibtex entries from DBLP, Google Scholar, ACM, your departmental publications server, or write them by hand in your favorite editor, and then copy paste them into Pure.
Entering details via a series of buttons and forms (“Create from template”). I recommend not using this option. If you go against this advice and want to enter a conference paper, make sure you do not pick the template “Paper/contribution to conference”: you should pick “Conference Contribution/Chapter in Conference Proceedings” instead. Don’t ask me why.

In all cases, yet another browser window is opened, in which you can inspect, correct, and save the bibliographic data. After saving, you’ll have a new entry with a unique URL that you can use for sharing your publication. The URL will stay the same after you make additional updates.

Entering your Author-Prepared version

With each publication, you can add various “electronic versions”.

Each can be a file (pdf), a link to a version, or a DOI. For pdfs you want to upload, make sure each one meets the conditions under which your publisher allows self-archiving.

Pure distinguishes various version types, which you can enter via the “Document version” pull down menu. Here you need to include at least the following two versions:

The “accepted author manuscript”. This is also called a postprint, and is the version that (1) is fully prepared by you as authors; and that (2) includes all improvements you made after receiving the reviews. Here you can typically upload the pdf as you prepared it yourself.
The “final published version”. This is the Publisher’s version. It is likely that the final version is copyrighted by the publisher. Therefore, you typically include a link (DOI) to the final version, and do not upload a pdf to Pure. If you import from Scopus, this field is automatically set.

Furthermore, Pure permits setting the “access to electronic version”, and defining the “public access”. Relevant items include:

Open, meaning (green) open access. This is what I typically select for the “accepted author manuscript”.
Restricted, meaning behind a paywall. This is what I typically select for the final published version.
Embargoed, meaning that the pdf cannot be made public until a set date. Can be used for commercial publishers who insist on restricting access to post-prints from institutional repositories in the first 1-2 years.

The vast majority (80%) of the academic publishers permits authors to archive their accepted manuscripts in institutional repositories such as Pure. However, publishers typically permit this under specific conditions, which may differ per publisher. You can check out my Green Open Access FAQ if you want to learn more about these conditions, and how to find them for your (computer science) publisher.

Once uploaded, your pdf is available for download for everyone. Pure adds a cover page with meta-data such as the citation (how it is published) and the DOI to the final version. This cover page is useful, as it helps to meet the intent of the conditions most publishers require on green open access publishing.

Google Scholar indexes Pure, so after a while your paper should also appear on your Scholar page.

A Paper’s Life Cycle

Making papers available early is one of the benefits of self-archiving. This can be done in Pure by setting the paper’s “Publication Status”. This field can have the following values:

“In preparation”: Literally a pre-print. Your paper can be considered a draft and may still change.
“Submitted”: You submitted your paper to a journal or conference where it is now under review.
“Accepted/In press”: Yes, paper accepted! This also means that you as an author can share your “accepted author manuscript”.
“E-Pub ahead of print”: I don’t see how this differs from the Accepted state.
“Published”: The paper is final and has been officially published.

In my Green Open Access FAQ I provide an answer to the question Which Version Should I Self-Archive.

I typically enter publications once accepted, and share the Pure link with the accepted author manuscript as pre-print link on Twitter or on conference sites (e.g. ICSE 2018).

In particular, I do the following once my paper is accepted:

I create a bibtex entry for an @inproceedings (conference, workshop) or @article (journal) publication.
I upload the bibtex entry into Pure.
I add my own pdf with the author-prepared version to the resulting Pure entry.
I set the Publication Status to “Accepted”.
I set the Entry Status (bottom of the page) to “in progress”
I save the entry (bottom of the page)
I share the resulting Pure link on Twitter with the rest of the world so that they can read my paper.
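
For the first step, writing the bibtex entry by hand is perfectly fine, but it can also be generated. The sketch below is just one way to do that: the helper `bibtex_entry`, the citation key, and all field values are placeholders invented for this example, and I make no claim about which fields Pure actually requires on import.

```python
def bibtex_entry(entry_type, key, fields):
    """Format a bibtex entry such as @inproceedings or @article as a string."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)

# Placeholder values only: replace them with the details of your own paper.
entry = bibtex_entry(
    "inproceedings",
    "author2018example",
    {
        "author": "First Author and Second Author",
        "title": "An Example Paper Title",
        "booktitle": "Proceedings of an Example Conference",
        "year": "2018",
    },
)
print(entry)
```

The printed text can then be pasted into the “Import from file” dialog described earlier.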

Once the publisher actually manages to publish this paper as well (this may be several months later!), I update my Pure entry:

I add the DOI link to the final published version.
I provide the missing bibliographic meta-data (page numbers, volume, number, …).
I set the Publication Status to “Published”.
I set the Entry Status to “for approval” (by the library who can then change it into an immutable “approved” if they think this is a valid entry).

My preprint links I shared still contain a pointer to the self-archived pdf, but now also to the official version at the publisher for those who have access through the pay wall.

Permalinks

The Pure page for your paper, including all meta-information and all versions of that paper (example), is in principle stable, and its URL provides a permanent link (unless you delete it).

You can also directly link to the individual pdfs you upload (example). However, these are more volatile: If you upload a newer version the old link will be dead. Moreover, in some cases the (TU Delft) library has moved pdfs around thereby destroying old pdf links.

Therefore, I recommend using links to the full record rather than individual pdfs when sharing Pure links.

Self-Archiving Elsevier Papers

Elsevier does not like it if you self-archive papers published in Elsevier journals into Elsevier Pure. The official rules are that Elsevier journal papers are subject to an embargo, yet at the same time can be published with a CC-BY-NC-ND license on arxiv.

Combining these two leads to the following steps, assuming you have a pre-print (never reviewed), and a post-print (the author-prepared accepted version after review).

1. Upload your pre-print onto Arxiv.
2. Add a footnote to your post-print stating: This manuscript version is made available under the CC-BY-NC-ND 4.0 license.
3. Update your arxiv pre-print with your CC-BY-NC-ND licensed post-print, and add publication details (journal name, volume, issue) to your arxiv entry.
4. Create a Pure entry for your journal paper.
5. Upload the post-print as author-accepted version to your Pure entry, make it available immediately, and set the license to CC-BY-NC-ND.

Note that the Elsevier rules explicitly allow steps 1-3, and in fact insist on the CC-BY-NC-ND license. Elsevier does not suggest you take step 5, but as a consequence of the CC-BY-NC-ND license you are permitted to do so.

What Elsevier would want you to do instead of step 5 is add the post-print to Pure under a 2-year embargo, thus delaying (green) open access availability by 2 years. Elsevier Pure even supports this embargo option as one of the “access” options, where you can enter the end date of such an embargo.
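
To make the cost of that option concrete, here is a small sketch that computes the end date you would have to enter for a 2-year embargo. It assumes the embargo is counted from the publication date, which is not specified above; the function name embargo_end is hypothetical.

```python
from datetime import date


def embargo_end(published: date, years: int = 2) -> date:
    """Return the date on which an embargo of `years` years ends.

    Assumption: the embargo starts on the publication date.
    """
    try:
        return published.replace(year=published.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year:
        # fall back to 28 February of that year.
        return published.replace(year=published.year + years, day=28)


print(embargo_end(date(2019, 3, 8)))  # prints 2021-03-08
```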

Note: Yes, these steps are annoying. But at the time of writing (2019), universities in Germany, Sweden, and California have no access to recent papers published by Elsevier. If you want your paper to be read in any of these places, make sure to upload it into your university repository. If you don’t want to go through these steps and you still want your paper to be read, I recommend you pick a different publisher.

Complicated Author Names

Pure contains official employee names as registered by TU Delft.

Some authors publish under different (variants of their) names. For example, Dutch universities have trouble handling the complex naming habits of Portuguese and Brazilian employees.

If Pure is not able to map an author name to the corresponding employee, find the author name in the publication, click edit, and then click “Replace”. This allows searching the TU Delft employee database for the correct person.

If Pure has found the correct employee, but the name displayed is very different from what is listed on the publication itself, you can edit the author for that publication, and enter a different first and last name for this publication.

Exporting Linked Bibtex (to Orcid)

If you’re logged in, you can download your publication list in various formats, including BibTeX (you’ll find the button for this at the bottom of the page).

I prefer BibTeX entries that have a url back to the place where all the information is. Therefore, I wrote a little Python script that scrapes a Pure web page (mine, yours, or anyone’s) and adds such information.

I use the BibTeX entries produced by this script to populate my Orcid profile as well as our Departmental Publication Server with publications from Pure that link back to their corresponding Pure page.
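
The script itself is not reproduced here. As a rough illustration of its core idea only, the sketch below adds a url field to a single BibTeX entry. It does no scraping, the function name add_url_field is hypothetical, and the URL in the example is a placeholder rather than a real Pure address.

```python
import re


def add_url_field(entry: str, url: str) -> str:
    """Return the BibTeX entry with a url field appended.

    Entries that already have a url field on its own line are returned unchanged.
    """
    if re.search(r"^\s*url\s*=", entry, flags=re.IGNORECASE | re.MULTILINE):
        return entry
    body = entry.rstrip()
    if not body.endswith("}"):
        raise ValueError("entry does not end with a closing brace")
    # Drop the final brace, make sure the last field ends with a comma,
    # then append the url field and close the entry again.
    fields = body[:-1].rstrip()
    if not fields.endswith(","):
        fields += ","
    return f"{fields}\n  url = {{{url}}}\n}}"


example = "@article{key,\n  title = {A Title},\n  year = {2019}\n}"
print(add_url_field(example, "https://example.org/record"))
```

Applying the function twice leaves the entry unchanged, so it is safe to rerun over a publication list that has already been processed.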

Version history

20 November 2016: Version 0.1, for internal purposes.
07 December 2016: Version 0.2, first public version.
14 December 2016: Version 0.3, minor improvements.
13 January 2017: Version 0.4, updated Google Scholar information.
16 March 2017: Version 0.5, updated approval states based on correction from Hans Meijerrathken.
17 March 2017: Version 0.6, life cycle and exporting added.
24 November 2017: Version 0.7, simplified life cycle and approval states.
03 March 2018: Version 0.8, added info on populating Orcid from Pure.
27 July 2018: Version 0.9, added info on permalinks, licensed as CC BY-SA 4.0
08 March 2019: Version 1.0, added info on publishing Elsevier papers.

Acknowledgments: Thanks to Moritz Beller for providing feedback and trying out Pure.

