Academic Leadership, Module 1

Last week I participated in the first (two day) module of a six month TU Delft course on “Academic Leadership” — a course so successful it has been taught every single year for the past 32 years.

Maybe the most impressive content comes from the participants themselves (16 in total this year), who serve in different leadership roles at TU Delft. Participants can bring in “cases” they are currently struggling with — my case relates to moving my department to a new building (with less space for the 150 people involved). The participants can ask questions about these cases, often reflecting their own experience in dealing with similar cases. The questions not only help to drill down to the essence of the case, but also to the (possibly deeply personal) reasons behind the struggle at hand.

The format for this participatory content is that of “intervision”, which in English translates to “peer supervision”. It is a technique common among (mental) health care professionals, who use it to exchange experiences, to analyze how they handle a given complex situation, and to reflect collectively on their professional conduct.

The intervision in this course takes place in smaller groups of four, under the guidance of a coach. In a series of sessions, each participant gets one afternoon to present his or her case, and to discuss it in depth in a trusted, fully confidential setting. A few years back I participated in such an intervision, and I look forward to doing this again.

The actual course content of the first module came from Mathieu Weggeman, a professor and consultant who specializes in management of knowledge-intensive organizations. His lecture carried the title of his book Managing Professionals? Don’t!, emphasizing that professionals usually work best when their managers take a step back. No more “planning and control”, but a focus on shared ambition and employee expertise. I’m sure this resonates with many academics.

Weggeman spent considerable time discussing the characteristics of leaders in excellent professional organizations. Such leaders:

  • develop, together with all employees, a shared ambition;
  • inspire people, and involve them in the organization’s strategy to materialize the ambition;
  • communicate fairly and timely: they are available, and they listen (think management by walking around);
  • are clear about the desired output, and offer clear feedback;
  • are assertive towards employees who are not good at their job anymore;
  • function as a “heat shield” against “noise from above”;
  • have an authoritative yet serving and humble attitude.

Weggeman connected this to a quote from Laozi (老子, 6th century BC):

A leader is best when people barely know he exists,
not so good when people obey and acclaim him,
worse when they despise him.
But a good leader, who talks little, when the work is done, his aim fulfilled,
they will say:
We did it ourselves.

Weggeman also discussed tools to diagnose and design organizations. Such tools need to distinguish (1) setting goals, (2) designing the organization, and (3) executing a strategy to meet the stated objectives (in Dutch nicely summarized as richten, inrichten, verrichten). Weggeman explained how such activities can be influenced through organizational “design variables”, which he (loosely) based on McKinsey’s 7S Framework. This framework distinguishes seven elements, described as (Wikipedia):

  • Strategy: Purpose of the business and the way the organization seeks to enhance its competitive advantage.
  • Structure: Division of activities; integration and coordination mechanisms.
  • Systems: Formal procedures for measurement, reward and resource allocation.
  • Shared Values: Included in culture by Weggeman, who also includes in culture the way of working derived from these values.
  • Skills: The organization’s core competencies and distinctive capabilities.
  • Staff: Organization’s human resources, demographic, educational and attitudinal characteristics.
  • Style: Typical behavior patterns of key groups, such as managers, and other professionals.

McKinsey’s 7S Framework

The basic premise of this framework is that these seven internal aspects of an organization need to be aligned, and that they are interrelated: Changing one element will affect the others.

Like any good management consultant, Weggeman was full of quotes. To explain the need for a shared ambition, he quoted Nietzsche:

He who has a ‘why’ to live for can bear almost any ‘how’.
(“Hat man sein warum des Lebens, so verträgt man sich fast mit jedem wie”, translation by Frankl)

As an academic, it is easy to get lost in the fights of the “how” (getting tenure, submitting a paper, writing a review, applying for funding, managing the class room, handling Blackboard Brightspace, etc., etc.). And naturally, it is our collective duty to improve the ‘how’ wherever we can.

But our ‘why’ is clear: Driven by curiosity, we train young people to become the world’s leading computer scientists and software engineers, and we push the boundaries of what the world knows about computer science. And this we want, in the words of the late David Notkin, “so that society can benefit even more from the amazing potential of software.”

Current Dutch Ius Promovendi Considered Harmful

The Dutch regulations on granting PhD degrees are a disgrace. Unlike most other countries, The Netherlands only gives full professors the right to award a PhD (the full professor has the “ius promovendi”). Thus, in the current regulations, if you’re an assistant or associate professor supervising a PhD student, you will need to find a full professor (deemed “the promotor”) who will be fully responsible for the entire PhD process, even if you’re planning to do all supervision yourself.

I am relieved that this miserable rule is about to change: On February 23rd of this year, the Dutch parliament (Tweede Kamer) passed an Internationalization Law to bring PhD supervision more in line with the rest of the world (for an English summary, see this Science Guide article).

However, the vote for this law in the Dutch Senate (Eerste Kamer) still needs to take place, and is scheduled for June 6. Much to my surprise, the preliminary reports by the Senate committee raised all sorts of concerns. The senators advocate a hierarchical model in which the full professor is “responsible for the discipline”, a notion at odds with the modern principal investigator approach in which all researchers are peers. The senators are afraid of “undesirable friction”, thereby assuming that making the roles of co-advisors explicit increases rather than decreases friction. The senators gratuitously worry that extending the ius promovendi to non-full professors will reduce the quality of the supervision. The senators argue that granting the right to supervise at the end of a career is an attractive way of encouraging young professors to engage in such a career. Lastly, in line with the advice of the Raad van State, the senators wonder “whether there is a real problem requiring a new law”.

Answers to these concerns and questions will be provided by the Dutch Minister of Education, after which the senators will vote. Despite the senators’ concerns, the general expectation is that the law will pass. In preparing her answer, the minister can use a wide range of documents arguing the need for this change, e.g., by the Young Academy and the ILLC Research School.

The Dutch Young Academy lists nine key concerns with the current regulation. They argue that the current law is disconnected from reality in academic life, resulting in insufficient recognition (in reputation, financially, and when competing for grants) for the non-full professors involved. The Young Academy also points out that the present situation makes The Netherlands unattractive for talented researchers seeking to pursue an academic career in The Netherlands (as a point of reference, in my department half of the faculty is non-Dutch). Besides this, the ILLC in its letter (in English) to the executive board of the University of Amsterdam emphasizes problems related to quality control, arguing that in many areas it will be hard to find a full professor who is the actual expert in the field.

Yet besides being problematic for assistant and associate professors, restricting formal responsibility for PhD supervision to full professors directly hurts PhD students:

  • It can lead to unclear expectations at the very start of a PhD project, when students deciding whether to accept a PhD position need to understand whether they will be working with the full professor or the non-full professor doing the daily supervision.

  • It can lead to confusion during the PhD project, for example when the full professor and co-advisor express different ideas or opinions. Since the full professor is in a position of power, the student may be reluctant to disagree with the full professor.

  • It can lead to an unclear CV of the student, who needs to explain the role of extra advisors (and sometimes even extra co-authors) to new international employers unfamiliar with the Dutch laws.

  • It can lead to publication restrictions for the PhD student, even if the full professor abstains from active involvement in the research. A case in point are ACM rules that in a strict interpretation forbid research students to submit a paper if their advisor is program chair even if that advisor ultimately has a mere ceremonial role in the PhD project. Extra advisors here means fewer publication options.

In the past 10 years, I witnessed each of these problems in my many interactions with PhD students and co-advisors under the Dutch regulations (I have worked with seven different non-full professor co-supervisors, and with over 35 past and current PhD students). In my group we try to take the most liberal interpretation of the regulations, handing over supervision responsibilities to the co-advisors as much as possible. Despite this, the bizarre limitations for non-full professors percolate through the PhD process, and affect all involved negatively.

To remedy these problems, I trust the senate will vote in favor of the changes to the law.

After that, it is up to the universities to take advantage of this law, and change their own doctoral regulations. The ILLC has various useful suggestions. For example, they call for always appointing two supervisors (a good idea already in place at various universities). Given the new law, it then becomes possible to make the young assistant professor (who, e.g., has obtained the grant) the primary supervisor, and to involve a second supervisor in a more advisory role (i.e., not carrying executive responsibility for the success of the PhD). Drafting these new regulations in a way that is most beneficial to the PhD students, and getting them approved will take some time, but I am confident the improvement will be worth the effort.

I call on all Dutch senators to vote in favor of the new law. Once approved, I expect that Dutch universities will revise their doctoral regulations to make full use of the new legal possibilities, to the benefit of PhD students in The Netherlands and (young) faculty alike. Where needed, I pledge to use my influence, for example as head of department at TU Delft, to make this happen.


UPDATE 7 June 2017: In response to the questions from the Dutch senators, the VSNU (the Association of Dutch Universities) and the presidents of the Dutch Universities provided an addendum explaining how they would implement the law in their regulations. In particular, this addendum emphasizes that universities will grant the “ius promovendi” to associate professors (“UHDs”), but not to assistant professors. Given this addendum, the Senate passed the internationalization law on June 6, 2017.


Image credit: Joachim Schlosser, Flickr.

Managing Complex Spreadsheets — The Story of PerfectXL

This week we finished grading of the software architecture course I’m teaching.

Like many teachers, I use a spreadsheet for grading together with my co-teachers and teaching assistants. In this case, we concurrently worked with five people on a Google Spreadsheet. The resulting spreadsheet is quite interesting:

  • The spreadsheet currently has 22 sheets (tabs)

  • There are input sheets for basic information on the over one hundred students in the class, the groups they form, and the rubrics we use.

  • There are input sheets from various forms the students used to enter course-related information

  • There are input sheets for different sub-assignments, which the teachers and assistants use to enter subgrades for each rubric: Some grades are individual, others are per team. Such sheets also contain basic formulas to compute grades from rubrics.

  • There are overview sheets collecting the sub-grades from various sheets, combining them to overall grades. The corresponding formulas can become quite tricky, involving rounding, lookups, sumproducts, thresholds, conditional logic based on absence or presence of certain grades, etc.

  • There are various output sheets, to report grades to students, to export grades to the university’s administrative systems, and to offer diagrams showing grade distributions for educational assessments of the course.

The spreadsheet has a history of five years: Each year we take the existing one, copy it, and remove the student data. We then adjust it to the changes we made to the course (additional assignments, new grading policies, better rubrics, etc.).

Visualization of sheet dependencies

All in all, this spreadsheet has grown quite complex, and it is easy to make a mistake. For example, I once released incorrect grades, a rather stressful event both for my students and myself. And all I did wrong was forget the infamous false argument needed in a vlookup, despite the fact that I was well aware of this “feature”. In this year’s spreadsheet we had duplicate student ids in a column where each row had to be unique, leading to a missing grade, and again extra effort and stress to resolve this as soon as possible.
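To illustrate both pitfalls outside a spreadsheet, here is a minimal sketch in Python. It is not the formula from my grading sheet: the function names and the student data are made up, and the lookup only mimics the approximate-match behavior a vlookup falls back to when the last argument is omitted (assuming the rows are sorted by id).

```python
from bisect import bisect_right
from collections import Counter


def vlookup(key, table, approximate=True):
    """Mimic a spreadsheet VLOOKUP over sorted (key, value) rows.

    With approximate=True (what you get when the last argument is left
    out) the largest key that is not bigger than `key` wins, so a missing
    key silently returns a neighbour's value. With approximate=False only
    an exact match counts.
    """
    keys = [row[0] for row in table]
    if approximate:
        pos = bisect_right(keys, key) - 1
        if pos < 0:
            raise KeyError(key)
        return table[pos][1]
    for row_key, value in table:
        if row_key == key:
            return value
    raise KeyError(key)


def duplicate_ids(ids):
    """Return the ids that occur more than once, in sorted order."""
    return sorted(i for i, count in Counter(ids).items() if count > 1)


# Hypothetical data: student 1003 has no grade entered yet.
grades = [(1001, 7.5), (1002, 6.0), (1004, 9.0)]

# Approximate match quietly hands student 1003 the grade of student 1002.
print(vlookup(1003, grades))

# Exact match refuses to guess.
try:
    vlookup(1003, grades, approximate=False)
except KeyError:
    print("no grade found for 1003")

# A uniqueness check worth running before any lookup.
print(duplicate_ids([1001, 1002, 1002, 1004]))
```

The point of the sketch is that the approximate variant never fails loudly, which is exactly why the mistake is so easy to miss.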

I suspect that if you use spreadsheets seriously, for example for grading, you recognize the characteristics of my spreadsheet — and maybe your sheets are even more complicated.

Now I have an interest in spreadsheets that goes beyond that of the casual user: As a software engineering researcher, I have looked extensively at spreadsheets. I did this together with Felienne Hermans, first when she was doing her PhD under my supervision in the context of the Perplex project (co-funded by Microsoft) and then in the context of the Prose project (funded by the Dutch STW agency). From a research perspective, these projects were certainly successful, leading to a series of publications in such venues as ECOOP 2010, ICSE 2011-2013, ICSM, EMSE, and SANER.

But we did our research not just to publish papers: We also had (and have) the ambition to actually help the working spreadsheet user, as well as large organizations that depend on spreadsheets for business-critical decision making.

To that end, we founded a company, Infotron, which offers tooling, services, consultancy, and training to help organizations and individuals become more effective with their spreadsheets.

After several years of operating somewhat under the radar, the Infotron team (headed by CEO Matéo Mol) has now launched an online service, PerfectXL, in which users can upload a spreadsheet and get it analyzed. The service then helps in the following ways:

  • PerfectXL can visualize the architectural dependencies between sheets, as shown above for my course sheet;
  • PerfectXL can identify risks (such as the vlookup I mentioned, interrupted ranges, or overly complex conditional logic);
  • PerfectXL can assess the structure and quality of the formulas in your sheet.
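To give a feel for what a sheet dependency analysis involves, here is a rough sketch in Python. It is emphatically not how PerfectXL works internally: the function name, the regular expression, and the tiny example workbook are my own assumptions, and the approach simply scans formula text for references of the form `Sheet!A1`.

```python
import re
from collections import defaultdict

# Matches references such as Rubrics!A1 or 'Team grades'!B2:B40.
# Simplification: an exclamation mark inside a string literal would
# also be picked up.
SHEET_REF = re.compile(r"(?:'([^']+)'|([A-Za-z_][A-Za-z0-9_.]*))!")


def sheet_dependencies(formulas_by_sheet):
    """Map each sheet name to the set of other sheets its formulas read from."""
    deps = defaultdict(set)
    for sheet, formulas in formulas_by_sheet.items():
        for formula in formulas:
            for quoted, bare in SHEET_REF.findall(formula):
                target = quoted or bare
                if target != sheet:
                    deps[sheet].add(target)
    return dict(deps)


# Hypothetical workbook: sheet name -> formulas found on that sheet.
workbook = {
    "Overview": ["=ROUND(0.4*Essay!C2 + 0.6*'Team grades'!D2, 1)"],
    "Essay": ["=SUMPRODUCT(B2:F2, Rubrics!B1:F1)"],
    "Export": ["=VLOOKUP(A2, Overview!A:G, 7, FALSE)"],
}

for sheet, targets in sheet_dependencies(workbook).items():
    print(sheet, "depends on", sorted(targets))
```

The resulting mapping is a directed graph between sheets, which is the kind of structure a dependency picture can be drawn from.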

If this sounds interesting, you can try out the service for free at perfectxl.com. There are various pricing options that help Infotron run and further grow this service — pick the subscription that suits you and your organization best!

Even if you decide not to use the PerfectXL service, the site contains a lot of generally useful information, such as various hints and tips on how to create and maintain well-structured spreadsheets.

Enjoy!

Novels I Loved Reading in 2016

I enjoy reading a few pages after work, or listening to an audiobook when commuting. Here are the five novels I loved most in 2016.

Arthur Japin: De Gevleugelde (The Winged, 2015)

A novel based on the life of Alberto Santos-Dumont, Brazilian aviation pioneer. Inspired by Jules Verne, the young Alberto sets out to invent flying machines, building his own hot air balloons, dirigibles (Zeppelins), and eventually planes. In 1901 he is the winner of a competition to fly a route of 11km around the Eiffel Tower, and in 1906 he is the first in Europe to make a flight with an “aircraft heavier than air”. Japin describes what drives the engineer Santos-Dumont to make his inventions and reach his fame. But most of all he tells the story of Santos-Dumont’s forbidden love for his mechanic Albert Chapin.

I listened to the audiobook narrated by the author (who is a trained actor). Many of Japin’s novels are translated and I expect a Portuguese and English version to appear soon, maybe in 2017?

In real life, Alberto Santos-Dumont refused to file any patents (he “open-sourced” his designs), as he wanted the world to benefit from the ability to fly as soon as possible. Later, suffering from multiple sclerosis and hugely disappointed by the military use of planes in World War I and during the Brazilian São Paulo revolution of 1932, he burned all his designs and committed suicide. See also this santos-movie for a short biography.

John Green: The Fault in Our Stars (2012)

Love story of 16 year old Hazel Lancaster and 17 year old Augustus Waters, both cancer patients. Heartbreaking and beautiful. Also features the Amsterdam Anne Frank house and Westerkerk.
“Young Adult Literature”: Amazingly well done novel about love and death that is meaningful to parents and kids alike.

Chances are you read The Fault in Our Stars before me: If you loved it, also consider John Green’s “Looking for Alaska” (2005). Most commonly banned book at US schools and libraries: Can you imagine a better recommendation?

John Green and his brother Hank Green set an example to universities around the world with their Crash Course initiative, offering free online courses on such topics as astronomy, psychology, world history, physics, and (soon) computer science.

Griet op de Beeck: Kom hier dat ik U Kus (Come Here so that I Can Kiss You, 2014)

Griet op de Beeck is a Belgian author who was to be appointed as TU Delft “cultural professor” in 2016. Unfortunately she had to cancel due to personal circumstances. Before that, she featured in the highly regarded Dutch TV show Zomergasten where she gave an open and optimistic account of her life and her mental health struggles.

Her novel tells the story of three stages in the life of Mona. Told by Mona herself, the first part is set in the simple and compelling language of a 10 year old. Mona tries to make sense of the world after the death of her mother. In the later parts, 24 and 34 year old Mona seeks to find and understand herself, and her relationship to her parents, stepmother, and her brother and stepsister.

I listened to the audiobook narrated by the author in wonderful Flemish. Translations in German (“Komm her und lass dich küssen”) available, and forthcoming in French and other languages in 2017.

Andy Weir: The Martian (2011)

A delightful page turner written by a software engineer. It is the year 2035, and when his mission to Mars gets into trouble, astronaut Mark Watney gets left behind all alone. Thanks to his knowledge of potatoes, farming, and chemistry, and thanks to his amazing optimism, perseverance, and improvisation skills, Mark manages to travel across Mars on his own to reach a place where he might be picked up to return to Earth.

Thomas Mann. Buddenbrooks: Verfall einer Familie (The Decline of a Family, 1901)

My classic of choice in 2016 was Mann’s first novel, telling the story of four generations of a 19th century German merchant family. Thomas Buddenbrook runs the family business, optimistic at first, but more and more exhausted and depressed as life goes on. His brother Christian suffers from mental health problems; his sister Antonie has bad luck in her marriages, and her daughter Erika is unfortunate in marrying a merchant who ends up in prison. And young Johann’s fate is covered in a heartbreaking chapter just describing the symptoms of typhoid.

German edition on Project Gutenberg. In case you want to start exploring Mann with a shorter novel, consider his “Death in Venice” (1912).

The Collaborative Software Architecture Course

After four exciting years of Teaching Software Architecture Using GitHub, we decided to write a paper reflecting on the course and our experiences, and submit it to SIGCSE, the flagship conference of the ACM Special Interest Group on Computer Science Education, typically attended by more than a thousand educators from around the world.

We’re very happy that our paper was immediately accepted!

In the paper, we identify three challenges in teaching software architecture:

  • C1: The theory of software architecture (design principles, tradeoffs, architectural patterns, product lines, etc.) is often very abstract and therefore hard for a student to master.
  • C2: The problems of software architecture are only visible at scale, and disappear once small example systems are used.
  • C3: A software architect needs a combination of technical and social skills: software architecture is about communication between stakeholders, and the architect needs to be able to achieve and explain consensus.

To address these challenges, the paper proposes a collaborative approach to teaching software architecture. In particular, we report how we organized our software architecture course according to the following principles:

  • Embrace open source: Students pick an open source system of choice and study its architecture. Students use it to learn how to apply architectural theories to realistic systems (C1, C2).
  • Embrace collaboration: Students work in teams of four to study one system in depth (C3).
  • Embrace open learning: Teams share all of their work with other students. Furthermore, students share their main result with the open source community: their architectural description is published as a chapter in an online book resulting from the course (C3).
  • Interact with the architects: Students are required to offer contributions (in the form of GitHub pull requests) to the open source projects, which will expose them to feedback from actual integrators and architects of the open source projects (C1, C2, C3).
  • Combine breadth and depth: Students dive deeply in the system they analyze themselves, and learn broadly from the analyses conducted and presented by other teams (C1, C3).

DESOSA 2016 book cover

In 2016 the resulting book (created in markdown and git using gitbook) described the architectures of 21 open source systems, including Ember.js, Karma, Neo4j, and SonicPi. The chapters are based both on existing architectural theories (such as architectural views, product lines, and technical debt), as well as the students’ first hand experiences in making actual contributions (merged pull requests) to the open source systems under study.

SIGCSE Abstract

Teaching software architecture is hard. The topic is abstract and is best understood by experiencing it, which requires proper scale to fully grasp its complexity. Furthermore, students need to practice both technical and social skills to become good software architects. To overcome these teaching challenges, we developed the Collaborative Software Architecture Course. In this course, participants work together to study and document a large, open source software system of their own choice. In the process, all communication is transparent in order to foster an open learning environment, and the end-result is published as an online book to benefit the larger open source community.

We have taught this course during the past four years to classes of 50-100 students each. Our experience suggests that: (1) open source systems can be successfully used to let students gain experience with key software architecture concepts, (2) students are capable of making code contributions to the open source projects, (3) integrators (architects) from open source systems are willing to interact with students about their contributions, (4) working together on a joint book helps teams to look beyond their own work, and study the architectural descriptions produced by the other teams.

Arie van Deursen, Maurício Aniche, Joop Aué, Rogier Slag, Michael de Jong, Alex Nederlof and Eric Bouwers. “A Collaborative Approach to Teaching Software Architecture.” Proceedings of the 48th ACM Technical Symposium on Computer Science Education (SIGCSE), March 2017, Seattle, USA.

You can download the paper from the TU Delft institutional repository, or have a look at the slides we used at our SIGCSE 2017 presentation.

Golden Open Access for the ACM: Who Should Pay?

In a move that I greatly support, the ACM Special Interest Group on Programming Languages (SIGPLAN) is exploring various ways to adopt a truly Golden Open Access model, and is rolling out a survey, set up by Michael Hicks, asking your opinion. Even though I myself am most active in ACM’s Special Interest Group on Software Engineering (SIGSOFT), I do publish at and attend SIGPLAN conferences such as OOPSLA. And I sincerely hope that SIGSOFT will follow SIGPLAN’s leadership on this important issue.

ACM presently supports green open access (self-archiving) and a concept called “Open TOC” in which papers are accessible via a dedicated “Table of Contents” page for a particular conference. While better than nothing, I agree with OOPSLA 2017 program chair Jonathan Aldrich who explains in his blog post that Golden Open Access is much preferred.

This does, however, raise the question who should pay for making publications open access, which is part of the SIGPLAN survey:

  • Attendants Pay: Increase the conference fees. SIGPLAN estimates that this would amount to an increase of around $50 per attendee.

  • Authors Pay: Introduce Article Processing Charges. SIGPLAN indicates that if a full conference goes open access this would presently amount to $400 per paper.


Note that the math here suggests that the number of registrants is around 8 times the number of papers in the main research track. Also note that it assumes that only papers in the main research track are made open access. A conference like ICSE, however, has many workshops with many papers: It is equally important that these become open access too, which would change the math considerably.
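To make the arithmetic behind this estimate explicit, here is a minimal sketch in Python. The two dollar amounts are the SIGPLAN figures quoted above; the paper and registrant counts in the second part are made-up numbers, only meant to show how adding workshop papers shifts the outcome.

```python
def registrants_per_paper(charge_per_paper, fee_increase_per_attendee):
    """Registrants needed per paper for both models to raise the same amount."""
    return charge_per_paper / fee_increase_per_attendee


def fee_increase(charge_per_paper, papers, registrants):
    """Extra registration fee per attendee if attendees cover all paper charges."""
    return charge_per_paper * papers / registrants


# SIGPLAN figures from the survey: $400 per paper, $50 per attendee.
ratio = registrants_per_paper(400, 50)

# Hypothetical conference: 100 main-track papers and 800 registrants,
# then the same conference with 200 workshop papers added.
main_track_only = fee_increase(400, papers=100, registrants=800)
with_workshops = fee_increase(400, papers=300, registrants=800)

print(ratio, main_track_only, with_workshops)
```

With these invented counts, the fee increase per attendee triples once the workshop papers are included.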

The article processing charges of $400 are presented as a given: They may seem in line with what commercial publishers charge, but they are certainly very high compared to what, e.g., LIPIcs charges for ECOOP (which is less than $100). These costs of $400 come from ACM’s desire (need) to continue to make a substantial profit from their publishing activities, and should go down.

In his blog post, Jonathan Aldrich argues for the “author pays” model. His reasoning is that this can be viewed as a “funder pays” model: Most authors are funded by research grants, and usually in those grants funds can be found to cater for the costs involved in publishing open access.

On this point (and this point alone) I disagree with Jonathan. To me it feels fundamentally wrong to punish authors by making them pay $400 more for their registration. If anything, they should get a reduction for delivering the content of the conference.

I see Jonathan’s point that some funding agencies are willing to cover open access costs (e.g. NSF, NWO, H2020), and that it is worthwhile to explore how to tap into that money. But this requires data on what percentage of papers could be labeled as “funded”. For my department, I foresee several cases where it would be the department who’d have to pay for this instead of an external agency.

I do sympathize with Jonathan’s appeal to reduce conference registration costs, which can be very high. But the cost of making publications open access should be borne by the full community (all attendants), not just by those who happen to publish a paper.

Shining examples of open access computer science conferences are the Usenix, AAAI, and NIPS events. Full golden open access of all content, and no extra charges for authors — these conferences are years ahead of the ACM.

Do you have an opinion on “author pays” versus “participant pays”? Fill in the survey!

Thank you SIGPLAN for initiating this discussion!

Self-Archiving Publications in Elsevier Pure

In 2016, TU Delft adopted Elsevier Pure as its database to keep track of all publications from its employees.

At the same time, TU Delft has adopted a mandated green open access policy. This means that for papers published after May 2016, an author-prepared version (pdf) must be uploaded into Pure.

I am very happy with this commitment to green open access (and TU Delft is not alone). This decision also means, however, that we as researchers need to do some extra work, to make our author-prepared versions available.

To make it easier for you to upload your papers and comply with the green open access policy, here are some suggestions based on my experience so far working with Pure.

I can’t say I’m a big fan of Elsevier Pure. In the interest of open access, however, I’m doing my best to tolerate the quirks of Pure, in order to help the TU Delft to share all its research papers freely and persistently with everyone in the world.

Elsevier Pure is used at hundreds of different universities. If you work at one of them, this post may help you in using Pure to make your research available as open access.

The Outcome

Pure Paper Data

Anyone can browse publications in Pure, available at https://pure.tudelft.nl.

All pages have persistent URLs, making it easy to refer to a list of all your publications (such as my list), or individual papers (such as my recent one on crash reproduction). For all recent papers I have added a pdf of the version that we as authors prepared ourselves (aka the postprint), as well as a DOI link to the publisher version (often behind a paywall).

Thus, you can use Pure to offer, for each publication, your self-archived (green open access) version as well as the final publisher version.

Moreover, these publications can be aggregated to the section, department, and faculty level, for management reporting purposes.

In this way, Pure data shows taxpayers how their money is spent on academic research, and gives them free access to the outcomes. Taxpayers deserve that we invest some time in populating Pure with accurate data.

Accessing Pure

To enter publications into Pure, you’ll need to log in. On https://pure.tudelft.nl, in the footer at the right, you’ll find “Log into Pure”. Use your TU Delft netid.

If you’re interested in web applications, you will quickly recognize that Pure is a fairly old system, with user interface choices that would not be made these days.

Entering Meta-Data

You can start entering a publication by hitting the big green button “Add new” at the top right of the page. It will open a brand new browser window for you.

In the new window, click “Research Output”, which will turn blue and expand into three items.

Then there are several ways to enter a publication, including:

  1. Import via Elsevier Scopus, found via “Import from Online Source”. This is by far the easiest, if (1) your publication venue is indexed by Scopus, (2) it is already visible at Scopus (which typically takes a few months), and if (3) you can find it on Scopus. To help Scopus, I have set up an ORCID author identifier and connected it to my Scopus author profile.

  2. Import via Bibtex, found via “Import from file”. If you click it, importing from bibtex is one of the options. You can obtain bibtex entries from DBLP, Google Scholar, ACM, your departmental publications server, or write them by hand in your favorite editor, and then copy paste them into Pure.

  3. Entering details via a series of buttons and forms (“Create from template”). I recommend against using this option. If you do use it and want to enter a conference paper, do not pick the template “Paper/contribution to conference”; pick “Conference Contribution/Chapter in Conference Proceedings” instead. Don’t ask me why.

In all cases, yet another browser window is opened, in which you can inspect, correct, and save the bibliographic data. After saving, you’ll have a new entry with a unique URL that you can use for sharing your publication. The URL will stay the same after you make additional updates.

Entering your Author-Prepared version

With each publication, you can add various “electronic versions”.

Each can be a file (pdf), a link to a version, or a DOI. For pdfs you want to upload, make sure you check that they meet the conditions under which your publisher allows self-archiving.

Pure distinguishes various version types, which you can enter via the “Document version” pull down menu. Here you need to include at least the following two versions:

  • The “accepted author manuscript”. This is also called a postprint, and is the version that (1) is fully prepared by you as authors; and that (2) includes all improvements you made after receiving the reviews. Here you can typically upload the pdf as you prepared it yourself.

  • The “final published version”. This is the Publisher’s version. It is likely that the final version is copyrighted by the publisher. Therefore, you typically include a link (DOI) to the final version, and do not upload a pdf to Pure. If you import from Scopus, this field is automatically set.

Furthermore, Pure permits setting the “access to electronic version”, and defining the “public access”. Relevant items include:

  • Open, meaning (green) open access. This is what I typically select for the “accepted author manuscript”.

  • Restricted, meaning behind a paywall. This is what I typically select for the final published version.

  • Embargoed, meaning that the pdf cannot be made public until a set date. This can be used for commercial publishers who insist on restricting access to postprints from institutional repositories in the first 1-2 years.

Example Pure cover page.

The vast majority (80%) of academic publishers permit authors to archive their accepted manuscripts in institutional repositories such as Pure. However, publishers typically permit this under specific conditions, which may differ per publisher. You can check out my Green Open Access FAQ if you want to learn more about these conditions, and how to find them for your (computer science) publisher.

Once uploaded, your pdf is available for download to everyone. Pure adds a cover page with meta-data such as the citation (how it is published) and the DOI to the final version. This cover page is useful, as it helps to meet the intent of the conditions most publishers require on green open access publishing.

Google Scholar indexes Pure, so after a while your paper should also appear on your Scholar page.

A Paper’s Life Cycle

Making papers available early is one of the benefits of self-archiving. This can be done in Pure by setting the paper’s “Publication Status”. This field can have the following values:

  1. “In preparation”: Literally a pre-print. Your paper can be considered a draft and may still change.
  2. “Submitted”: You submitted your paper to a journal or conference where it is now under review.
  3. “Accepted/In press”: Yes, paper accepted! This also means that you as an author can share your “accepted author manuscript”.
  4. “E-Pub ahead of print”: I don’t see how this differs from the Accepted state.
  5. “Published”: The paper is final and has been officially published.

In my Green Open Access FAQ I provide an answer to the question Which Version Should I Self-Archive.

I typically enter publications once accepted, and share the Pure link with the accepted author manuscript as pre-print link on Twitter or on conference sites (e.g., ICSE 2017).

In particular, I do the following once my paper is accepted:

  1. I create a bibtex entry for an @inproceedings (conference, workshop) or @article (journal) publication.
  2. I upload the bibtex entry into Pure.
  3. I add my own pdf with the author-prepared version to the resulting Pure entry.
  4. I set the Publication Status to “Accepted/In press”.
  5. I set the Entry Status (bottom of the page) to “in progress”.
  6. I save the entry (bottom of the page).
  7. I share the resulting Pure link on Twitter with the rest of the world so that they can read my paper.
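
Step 1 of this list can be sketched in a few lines of Python. This is only an illustrative sketch, not a tool used in this post: the function name make_bibtex_entry, the citation key, and all bibliographic values below are made-up placeholders, and real entries may need escaping of special characters that this sketch does not handle.

```python
def make_bibtex_entry(entry_type, key, fields):
    """Render one BibTeX entry as text.

    entry_type: "inproceedings" (conference, workshop) or "article" (journal).
    key: the citation key (hypothetical placeholder in the example below).
    fields: dict mapping BibTeX field names to their values.
    """
    if entry_type not in ("inproceedings", "article"):
        raise ValueError(f"unsupported entry type: {entry_type}")
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        # Wrap every value in braces, e.g.  title = {An Example Paper},
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder data, not a real publication.
example = make_bibtex_entry(
    "inproceedings",
    "doe2017example",
    {
        "author": "Jane Doe and John Roe",
        "title": "An Example Paper",
        "booktitle": "Proceedings of an Example Conference",
        "year": "2017",
    },
)
print(example)
```

The printed text can then be copied and pasted into the bibtex import of step 2.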

Once the publisher actually manages to publish this paper as well (this may be several months later!), I update my Pure entry:

  1. I add the DOI link to the final published version.
  2. I provide the missing bibliographic meta-data (page numbers, volume, number, …).
  3. I set the Publication Status to “Published”.
  4. I set the Entry Status to “for approval” (by the library, which can then change it into an immutable “approved” if they consider the entry valid).
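
Before switching the Entry Status, it can help to check that nothing from this list was forgotten. The sketch below models an entry as a plain Python dict; the dict keys (doi, pages, publication_status) and the function name are assumptions made for illustration and do not correspond to any Pure interface.

```python
def missing_before_approval(entry):
    """Return the names of items still missing from an entry.

    entry is a plain dict with hypothetical keys; it is not a Pure object.
    """
    missing = []
    if not entry.get("doi"):
        missing.append("doi")  # link to the final published version
    if not entry.get("pages"):
        missing.append("pages")  # part of the bibliographic meta-data
    if entry.get("publication_status") != "Published":
        missing.append("publication_status")
    return missing


# Placeholder entry that was entered at acceptance time and not yet updated.
accepted_entry = {"publication_status": "Accepted/In press"}
print(missing_before_approval(accepted_entry))
```

An empty list means the entry is ready to be set to “for approval”. Journal papers would likely need extra checks for volume and number, which this sketch leaves out.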

The preprint links I shared still point to the self-archived pdf, but now also to the official version at the publisher for those who have access through the paywall.

Complicated Author Names

Pure contains official employee names as registered by TU Delft.

Some authors publish under different (variants of their) names. For example, Dutch universities have trouble handling the complex naming habits of Portuguese and Brazilian employees.

If Pure is not able to map an author name to the corresponding employee, find the author name in the publication, click edit, and then click “Replace”. This allows searching the TU Delft employee database for the correct person.

If Pure has found the correct employee, but the name displayed is very different from what is listed on the publication itself, you can edit the author for that publication, and enter a different first and last name for this publication.

Exporting To Linked Bibtex

If you’re logged in, you can download your publication list in various formats, including BibTex (you’ll find the button for this at the bottom of the page).

I prefer bibtex entries that have a url back to the place where all info is. Therefore, I wrote a little Python script that scrapes a Pure web page (mine, yours, or anyone’s) and adds such information.

I also use this script to populate our Departmental Publication Server with publications from Pure, which link back to their corresponding Pure page.
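
The idea of linking bibtex entries back to their page can be sketched as follows. This is not the script mentioned above: it skips the scraping entirely and assumes you already have a mapping from citation key to page URL. The function name add_urls, the citation key, and the example.org URL are hypothetical placeholders.

```python
import re


def add_urls(bibtex_text, urls_by_key):
    """Insert a url field right after the opening line of each known entry.

    urls_by_key maps citation keys to URLs; entries whose key is not in
    the mapping are left unchanged.
    """
    def with_url(match):
        url = urls_by_key.get(match.group(2))
        if url is None:
            return match.group(0)
        return f"{match.group(0)}\n  url = {{{url}}},"

    # Matches the opening line of an entry, e.g. "@article{somekey,"
    return re.sub(r"@(\w+)\{([^,\s]+),", with_url, bibtex_text)


# Placeholder entry and placeholder URL.
source = "@article{doe2017example,\n  title = {An Example Paper},\n}"
linked = add_urls(source, {"doe2017example": "https://example.org/pure/doe2017example"})
print(linked)
```

Note that this sketch does not check whether an entry already has a url field, so running it twice on the same text would add the field twice.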


Version history

  • 20 November 2016: Version 0.1, for internal purposes.
  • 07 December 2016: Version 0.2, first public version.
  • 14 December 2016: Version 0.3, minor improvements.
  • 13 January 2017: Version 0.4, updated Google Scholar information.
  • 16 March 2017: Version 0.5, updated approval states based on correction from Hans Meijerrathken.
  • 17 March 2017: Version 0.6, life cycle and exporting added.
  • 24 November 2017: Version 0.7, simplified life cycle and approval states.

Acknowledgments: Thanks to Moritz Beller for providing feedback and trying out Pure.

© Arie van Deursen, December 2016.