Self-Archiving Publications in Elsevier Pure (at TU Delft)

TU Delft recently has adopted Elsevier Pure as its database to keep track of all publications from its employees.

At the same time, TU Delft has adopted a mandated green open access policy. This means that for papers published after May 2016, an author-prepared version (pdf) must be uploaded into Pure.

I am very happy with TU Delft’s commitment to green open access (and TU Delft is not alone). This decision also means, however, that TU Delft researchers need to do some extra work, to make their author-prepared versions available.

To make it easier for TU Delft researchers to upload their papers and comply with the green open access policy, here are some suggestions based on my experience so far working with Pure. Since Pure is used around the world, this post may also be relevant for researchers not working at TU Delft.

The Outcome

Pure Paper Data

Anyone can browse publications in Pure, available at https://pure.tudelft.nl.

All pages have persistent URL’s, making it easy to refer to a list of all your publications (such as my list), or individual papers (such as my recent one on crash reproduction). For all recent papers I have added a pdf of the version that we as authors prepared ourselves (aka the postprint), as well as a DOI link to the publisher version (often behind a paywall).

Thus, you can use Pure to offer, for each publication, your self-archived (green open access) version as well as the final publisher version.

Moreover, these publications can be aggregated to the section, department, and faculty level, for management reporting purposes.

In this way, Pure data shows the tax payers how their money is spent on academic research, and gives the tax payer free access to the outcomes. The tax payer deserves it that we invest some time in populating Pure with accurate data.

Accessing Pure

To enter publications into pure, you’ll need to login. On https://pure.tudelft.nl, in the footer at the right, you’ll find “Log into Pure”. Use your netid.

If you’re interested in web applications, you will clickly recognize that Pure is a fairly old system, with user interface choices that would not be made these days.

I hope, however, that in the interest of open access, you will be willing to tolerate the quirks of Pure, and do whatever it takes to get your pdfs at a place that is persistent and freely accessible.

Entering Meta-Data

You can start entering a publication by hitting the big green button “Add new” at the top right of the page. It will open a brand new browser window for you.

In the new window, click “Research Output”, which will turn blue and expand into three items.

Then there are several ways to enter a publication, including:

  1. Import via Elsevier Scopus, found via “Import from Online Source”. This is by far the easiest, if your publication venue is indexed by Scopus, if it is already visible at Scopus (which typically takes a few months), and if you can find it on Scopus. To help Scopus, I have set up an ORCID author identifier and connected it to my Scopus author profile.

  2. Import via Bibtex, found via “Import from file”. If you click it, importing from bibtex is one of the options. You can obtain bibtex entries from DBLP, Google Scholar, ACM, our departmental publications server and copy paste it to pure.

  3. Entering details via a series of buttons and forms (“Create from template”). I have never used this option.

In all cases, yet another browser window is opened, in which you can inspect, correct, and save the bibliographic data.

Entering your Author-Prepared version

With each publication, you can add various “electronic versions”.

Each can be a file (pdf), a link to a version, or a DOI.

Pure distinguishes various version types, which you can enter via the “Document version” pull down menu. Here you need to include at least the following two versions:

  • The “accepted author manuscript”. This is also called a postprint, and is the version that (1) is fully prepared by you as authors; and that (2) includes all improvements you made after receiving the reviews. Here you can typically upload the pdf as you prepared it yourself.

  • The “final published version”. This is the Publisher’s version. It is likely that the final version is copyrighted by the publisher. Therefore, you typically include a link (DOI) to the final version, and do not upload a pdf to Pure. If you import from Scopus, this field is automatically set.

Furthermore, Pure permits setting the “access to electronic version”, and defining the “public access”. Relevant items include:

  • Open, meaning (green) open access. This is what I typically select for the “accepted author manuscript”.

  • Restricted, meaning behind a paywall. This is what I typically select for the final published version.

  • Embargoed, meaning that the pdf cannot be made public until a set date. Can be used for commercial publishers who insist on restricting access to post-prints from institutional repositories in the first 1-2 years.

The vast majority (80%) of the academic publishers permits authors to archive their accepted manuscripts in institutional repositories such as Pure. However, publishers typically permit this under specific conditions, which may differ per publisher. You can check out my Green Open Access FAQ if you want to learn more about these conditions, and how to find them for your (computer science) publisher.

Updating Entries

A publication you entered can be in two states: “For Validation” or “Validated”.

“For Validation” means that the publication has not been approved yet by a TU Delft Library employee. It also means that you can still make changes yourself.

“Validated” means that a TU Delft library employee has approved the publication. It also means that you yourself cannot change this publication anymore. If your publication does need a correction nevertheless, you will have to email the TU Delft library contact person for your department (Jasper van Dijck for my Department of Software Technology).

Older (before 2015) Entries

Entries from 2015 and before were automatically imported from TU Delfts old Metis system (which in our Department of Software Technology in turn was populated from our ST Publication Server). Since Metis did not support pdf uploads, these older publications do not come with open access post-prints in Pure.

To update such older entries, see the updating procedure described above.

End-of-the-Year Publication Reporting

As any (Dutch) university, the TU Delft has to report to its stakeholders what its “output” is. This information is collected in Pure, and used by the government to analyze the research performance of various universities.

This means that in, say, February of year N, all publications in year N-1 must have been entered into Pure.

If you follow the Scopus approach (like I try to do), this means that due to the delay in Scopus you may have to switch to the bibtex approach to enter publications from November or December.

Note that in the United Kingdom under the REF open access policy, authors must upload their papers within 3 months of being accepted. The TU Delft has no such rule yet as far as I know, but this would simplify the process of end-of-the-year publication collection.

Google Indexing

The TU Delft Pure data itself is not indexed by Google (as far as I know).

However, pdfs uploaded in Pure should be automatically (after validation by the library) copied to https://repository.tudelft.nl, which is indexed, meaning that your papers (and your post-prints) will end up in Google Scholar.

Complicated Author Names

Pure contains official employee names as registered by TU Delft.

Some authors publish under different (variants of their) names. For example, Dutch universities have trouble handling the complex naming habits of Portuguese and Brazilian employees.

If Pure is not able to map an author name to the corresponding employee, find the author name in the publication, click edit, and then click “Replace”. This allows searching the TU Delft employee database for the correct person.

If Pure has found the correct employee, but the name displayed is very differently from what is listed on the publication itself, you can edit the author for that publication, and enter a different first and last name for this publication.


Version history

  • 20 November 2016: Version 0.1, for internal purposes.
  • 07 December 2016: Version 0.2, first public version

Acknowledgments: Thanks to Moritz Beller for providing feedback and trying out Pure.

© Arie van Deursen, December 2016.

Green Open Access FAQ

Green field

Image credit: Flickr, user static_view

(Opinionated) answers to frequently asked questions on (green) open access, from a computer science (software engineering) research perspective.

Disclaimer: IANAL, so if you want to know things for sure you’ll have to study the references provided. Use at your own risk.

Green open access is trickier than I thought, so I might have made mistakes. Corrections are welcome, just as additional questions for this FAQ. Thanks!

Green Open Access Questions

  1. What is Green Open Access?
  2. What is a pre-print?
  3. What is a post-print?
  4. What is a publisher’s version?
  5. Do publishers allow Green Open Access?
  6. Under what conditions is Green Open Access permitted?
  7. What is Yellow Open Access?
  8. What is Gold Open Access?
  9. What is Hybrid Open Access?
  10. What are the Self-Archiving policies of common computer science venues?
  11. Is Green Open Access compulsory?
  12. What is a good place for self-archiving?
  13. Can I use PeerJ Preprints for Self-Archiving?
  14. Can I use ResearchGate or Academia.edu for Self-Archiving?
  15. Which version(s) should I self-archive?
  16. What does Gold Open Access add to Green Open Access?
  17. Will Green Open Access hurt commercial publishers?
  18. What is the greenest publisher in computer science?
  19. Should I use ACM Authorizer for Self-Archiving?
  20. As a conference organizer, can I mandate Green Open Access?
  21. What does Green Open Access cost?
  22. Should I adopt Green Open Access?
  23. Where can I learn more about Green Open Access?

What is Green Open Access?

In Green Open Access you as an author archive a version of your paper yourself, and make it publicly available. This can be at your personal home page, at the institutional repository of your employer (such as the one from TU Delft), or at an e-print server such as arXiv.

The word “archive” indicates that the paper will remain available forever.

What is a pre-print?

A pre-print is a version of a paper that is entirely prepared by the authors.

Since no publisher has been involved in any way in the preparation of such a pre-print, it feels right that the authors can deposit such pre-prints where ever they want to. Before submission, the authors, or their employers such as universities, hold the copyright to the paper, and hence can publish the paper in on line repositories.

Following the definition of SHERPA‘s RoMEO project, pre-prints refer to the version before peer-review organized by a publisher.

What is a post-print?

Following the RoMEO definitions, a post-print is a final draft as prepared by the authors themselves after reviewing. Thus, feedback from the reviewers has typically been included.

Here a publisher may have had some light involvement, for example by selecting the reviewers, making a reviewing system available, or by offering a formatting template / style sheet. The post-print, however, is author-prepared, so copy-editing and final markup by the publisher has not been done.

What is a publisher’s version?

While pre- and post-prints are author-prepared, the final publisher’s version is created by the publisher.

The publishers involvement may vary from very little (camera ready version entirely created by authors) up to substantial (proof reading, new markup, copy editing, etc.).

Publishers typically make their versions available after a transfer of copyright, from the authors to the publisher. And with the copyright owned by the publisher, it is the publisher who determines not only where the publisher’s version can be made available, but also where the original author-prepared pre- or post-prints can be made available.

Do publishers allow Green Open Access?

Self-archiving of non-published material that you own the copyright to is always allowed.

Whether self-archiving of a paper that has been accepted by a publisher for publication is allowed depends on that publisher. You have transferred your copyright, so it is up to the publisher to decide who else can publish it as well.

Different publishers have different policies, and these policies may in turn differ per journal. Furthermore, the policies may vary over time.

The SHERPA project does a great job in keeping track of the open access status of many journals. You’ll need to check the status of your journal, and if it is green you can self-archive your paper (usually under certain publisher-specific conditions).

In the RoMEO definition, green open access means that authors can self-archive both pre-prints and post-prints.

Under what conditions is Green Open Access permitted?

Since the publisher holds copyright on your published paper, it can (and usually does) impose constraints on the self-archived versions. You should always check the specific constraints for your journal or publisher, for example via the RoMEO journal list.

The following conditions are fairly common:

  1. You generally can self-archive pre- and post-prints only, but not the publisher version.

  2. In the meta-data of the self-archived version you need to add a reference to the final version (for example through its DOI).

  3. In the meta-data of the self-archived version you need to include a statement of the current ownership of the copyright, sometimes through specific sentences that must be copy-pasted.

  4. The repository in which you self-archive should be non-commercial. Thus, arXiv and institutional repositories are usually permitted, but commercial ones like PeerJ Preprints, Academia.edu or ResearchGate are not.

  5. Some commercial publishers impose an embargo on post-prints. For example Elsevier permits sharing the post-print version on an institutional repository only after 12-24 months (depending on the journal).

Usually meeting the demands of a single publisher is relatively easy to do. However, every publisher has its own rules. If you publish your papers in a range of different venues (which is what good researchers do), you’ll have to know many different rules if you want to do green open access in the correct way.

What is Yellow Open Access?

Some publishers (such as Wiley) allow self-archiving of pre-prints only, and not of post-prints. This is referred to as yellow open access in RoMEO. Yellow is more restrictive than green.

As an author, I find yellow open access frustrating, as it forbids me to make the version of my paper that was improved thanks to the reviewers available via open access.

As a reviewer, I feel yellow open access wastes my effort: I tried to help authors by giving useful feedback, and the publisher forbids my improvements to be reflected in the open access version.

What is Gold Open Access?

Gold Open Access refers to journals (or conference proceedings) that are completely accessible to the public without requiring paid subscriptions.

Often, gold implies green, for example when a publisher such as PeerJ, PLOS ONE or LIPIcs adopts a Creative Commons license — which allows anyone, including the authors, to share a copy under the condition of proper attribution.

The funding model for open access is usually not based on subscriptions, but on Author Processing Charges, i.e., a payment by the authors for each article they publish (varying between $70 (LIPIcs) up to $1500 (PLOS ONE) per paper).

What is Hybrid Open Access?

Hybrid open access refers to a restricted (subscription-funded) journal that permits authors to pay extra to make their own paper available as open access.

This practice is also referred to as double dipping: The publisher catches revenues from both subscriptions and author processing charges.

University libraries and funding agencies do not like hybrid access, since they feel they have to pay twice, both for the authors and the readers.

Green open access is better than hybrid open access, simply because it achieves the same (an article is available) yet at lower costs.

What are the Self-Archiving policies of common computer science venues?

For your and my convenience, here is the green status of some publishers that are common in software engineering (check links for most up to date information):

  • ACM: Green, e.g., TOSEM, see also the ACM author rights. For ACM conferences, often the author-prepared camera-ready version includes a DOI already, making it easy to adhere to ACM’s meta-data requirements.
  • IEEE: Green, e.g., TSE. The IEEE has a policy that the IEEE makes a version available that meets all IEEE meta-data requirements, and that authors can use for self-archiving. See also their self-archiving FAQ.
  • Springer: Green, e.g., EMSE, SoSyM, LNCS. Pre-print on arXiv, post-print on personal page immediately and in repository in some cases immediately and in others after a 12 month embargo period.
  • Elsevier: Mostly green, e.g., JSS, IST. Pre-print allowed, post-print on personal page immediately and in repository after 12-48 month embargo period.
  • Wiley: Mostly yellow, i.e., only pre-prints can be immediately shared, and post-prints (even on personal pages) only after 12 month embargo. E.g. JSEP.

Luckily, there are also some gold open access publishers (which also permit green archiving):

Is Green Open Access compulsory?

Funding agencies (NWO, EU, Bill and Melinda Gates Foundation, …) as well as universities (TU Delft, University of California, UCL, ETH Zurich, Imperial College, …) are increasingly demanding that all publications resulting from their projects or employees are available in open access.

My own university TU Delft insists, like many others, on green open access:

As of 1 May 2016 the so-called Green Road to Open Access publishing is mandatory for all (co)authors at TU Delft. The (co)author must publish the final accepted author’s version of a peer-reviewed article with the required metadata in the TU Delft Institutional Repository.

This makes sense: The TU Delft wants to have copies of all the papers that its employees produce, and make sure that the TU Delft stakeholders, i.e. the Dutch citizens, can access all results. Note that TU Delft insists on post-prints that include reviewer-induced modifications.

The Dutch national science foundation NWO has a preference for gold open access, but accepts green open access if that’s impossible (“Encourage Gold, require immediate Green“).

What is a good place for self-archiving?

It depends on your needs.

Your employer may require that you use your institutional repository (such as the TU Delft Repository). This still allows you to post a version elsewhere as well.

Subject repositories such as arXiv offer good visibility to your peers. In fields like physics using arXiv is very common, whereas in Computer Science this is less so. A good thing about arXiv is that it permits versioning, making it possible to submit a pre-print first, which can then later be extended with the post-print. You can use several licenses. If you intend publishing your paper, however, you should adopt arXiv’s Non-Exclusive Distribution license (which just allows arXiv to distribute the paper) instead of the more generous Creative Commons license — which would likely conflict with the copyright claims of the publisher of the refereed paper.

Your personal home page is a good place if you want to offer an overview of your own research. Home page URLs may not be very permanent though, so as an approach to self archiving it is less ideal.

Can I use PeerJ Preprints for Self-Archiving?

Probably not — and it’s also not what PeerJ Preprints are intended for.

PeerJ Preprints is a commercial eprint server requiring a Creative Commons license. It is intended to share drafts that have not yet been peer reviewed for formal publication.

It offers good visibility (a preprint on goto statements attracted 15,000 views), and a smooth user interface for posting comments and receiving feedback. Articles can not be removed once uploaded.

The PeerJ Preprint service is compatible with other golden open access publishers (such as PeerJ itself or Usenix).

The PeerJ Preprint service, however, is incompatible with most other publishers (such as ACM, IEEE, or Springer) because (1) the service is commercial; (2) the service requires Creative Commons as license; (3) preprints once posted cannot be removed.

So, if you want to abide with the rules, uploading a pre-print to PeerJ Preprints severely limits your subsequent publication options.

Can I use ResearchGate or Academia.edu for Self-Archiving?

No — unless you only work with liberal publishers with permissive licenses such as Creative Commons.

ResearchGate and Academia.edu are researcher social networks that also offer self-archiving features. As they are commercial repositories, most publishers will not allow sharing your paper on these networks.

The ResearchGate copyright pages provide useful information on this.

The Academia.edu copyright pages state the following:

Many journals will also allow an author to retain rights to all pre-publication drafts of his or her published work, which permits the author to post a pre-publication version of the work on Academia.edu. According to Sherpa, which tracks journal publishers’ approach to copyright, 90% of journals allow uploading of either the pre-print or the post-print of your paper.

This seems misleading to me: Most publishers explicitly dis-allow posting preprints to commercial repositories such as Academia.edu.

In both cases, the safer route is to use permitted places such as your home page or institutional repository for self-archiving, and only share links to your papers with ResearchGate or Academia.edu.

Which version(s) should I self-archive?

It depends.

Publishing a pre-print as soon as it is ready has several advantages:

  • You can receive rapid feedback on a version that is available early.

  • You can extend your pre-print with an appendix, containing material (e.g., experimental data) that does not fit in a paper that you’d submit to a journal

  • It allows you to claim ownership of certain ideas before your competition.

  • You offer most value to society since you allow anyone to benefit as early as possible from your hard work

Nevertheless, publishing a post-print only can also make sense:

  • You may want to keep some results or data secret from your competition until your paper is actually accepted for publication.

  • You may want to avoid confusion between different versions (pre-print versus post-print).

  • You may be scared to leave a trail of rejected versions submitted to different venues.

For these reasons, and primarily to avoid confusion, I typically share just the post-print: The camera-ready version that I create and submit to the publisher is also the version that I self-archive as post-print.

What does Gold Open Access add to Green Open Access?

For open access, gold is better than green since:

  • it removes the burden of making articles publicly available from the researcher to the publisher.
  • it places a paper in a venue that is entirely open access. Thus, also other papers improving upon, or referring to your paper (published in the same journal) will be open access too.
  • gold typically implies green, i.e., the license of the journal is similar to Creative Commons, allowing anyone, including the authors, to share a copy under the condition of proper attribution.

Will Green Open Access hurt commercial publishers?

Maybe. But most academic publishers already allow green open access, and they are doing just fine. So I would not worry about it.

What is the greenest publisher in computer science?

The greenest publisher should be the one imposing the least restrictions on self-archiving.

From that perspective, publishers who want to be the greenest should in fact want to be gold, making their papers available under a permissive Creative Commons license. An example is Usenix.

Among the non-golden publishers, the greenest are probably the non-commercial ones, such as IEEE and ACM: They require simple conditions that are usually easy to meet.

The ACM, “the world’s largest educational and scientific computing society”, claims to be among the “greenest” publishers. Based on their tolerant attitude towards self-archiving of post-prints this may be somewhat justified. Furthermore, their Authorizer mechanism permits setting up free access to the publisher’s version.

But greenest is gold. So I look forward to the day the ACM follows its little sister Usenix in a full embrace of golden open access.

Should I use ACM Authorizer for Self-Archiving?

The ACM offers the Authorizer mechanism to provide free access to the Publisher’s Version of a paper, which only works from one user-specified URL. For example, I can use it to create a dedicated link from my institutional paper page to the publisher’s version.

However, Authorizer links cannot be accessed from other pages, and there is no point in emailing or tweeting them. Since only one authorizer link can exist per paper, I cannot use an authorizer link for both my institutional repository, and for the repository of my funding agency.

These restrictions on Authorizer links make them unsuitable as a replacement for self-archiving (let alone as a replacement for golden open access).

As a conference organizer, can I mandate Green Open Access?

Green open access is self-archiving, giving the authors the permission to archive their own papers.

As a conference organizer working with a non open access (ACM, IEEE, Springer-Verlag) publisher, you are not allowed to archive and distribute all the papers of the conference yourself.

OOSPLA program with DOI and preprint link

What several conferences do instead, though, is collecting links to pre- or post-prints. For example, the on line program of the recent OOPSLA 2016 conference has links to both the publisher’s version (through a DOI) and to an author-provided post-print.

For OOPSLA, 20 out of the 52 (38%) of the authors provided such a link to their paper, a number that is similar in other conferences adopting such preprint linking.

As a conference organizer, you can do your best to encourage authors to submit their pre-print links. Or you can use your influence in the steering committee to push the conference to switch to an open access publisher, such as LIPIcs or Usenix.

As an author, you can help by actually offering a link to your pre-print.

What does Green Open Access cost?

For authors, green open access typically costs no money. University repositories, arXiv, and PeerJ Preprints are all free to use.

It does cost (a bit of) effort though:

  • You need to find out the specific conditions under which the publisher of your current paper permits self-archiving.
  • You need to actually upload your paper to some repository, provide the correct meta-data, and meet the publisher’s constraints.

The fact that open access is free for authors does not mean that there are no costs involved. For example, the money to keep arXiv up and running comes from a series of sponsors, including TU Delft.

Should I adopt Green Open Access?

Yes.

Better availability of your papers will help you in several ways:

  • Impact in Research: Other researchers can access your papers more easily, increasing the chances that they will build upon your results in their work;
  • Impact in Practice: Practitioners may be interested in using your results: A pay-wall is an extra and undesirable impediment for such adoption;
  • Improved Results: Increased usage of your results in either industry or academia will put your results to the real test, and will help you improve your results.

Besides that, (green) open access is a way of delivering to the tax payers what they paid for: Your research results.

Where can I learn more about Green Open Access?

Useful resources include:


Version history:

  • 6 November 2016: Version 0.1, Initial version, call for feedback.
  • 14 November 2016: Version 0.2, update on commercial repositories.
  • 18 November 2016: Version 0.3, update on ACM Authorizer.
  • 20 November 2016: Version 0.4, added TOC, update on commercial repositories.
  • 06 December 2016: Version 0.5, updated information on ACM and IEEE.

Acknowledgments: I thank Moritz Beller (TU Delft) and Dirk Beyer (LMU Munich) for valuable feedback and corrections.

© Arie van Deursen, November 2016. By the end of December 2016 this FAQ will be licensed under CC BY-SA 4.0.

PhD Student Vacancy in Test Amplification

Within the Software Engineering Research Group of Delft University of Technology, we are looking for an enthusiastic and strong PhD student in the area of “test amplification”.

The PhD project will be in the context of the STAMP project funded by the H2020 programme of the European Union.

STAMP is a 3-year R&D project, which leverages advanced research in automatic test generation to push automation in DevOps one step further through innovative methods of test amplification. It will reuse existing assets (test cases, API descriptions, dependency models), in order to generate more test cases and test configurations each time the application is updated. This project has an ambitious agenda towards industry transfer. In this regard, the STAMP project gathers 3 research groups which have strong expertise in software testing and continuous development as well as 6 industry partners that develop innovative open source software products.

The STAMP project is led by Benoit Baudry from INRIA, France. The STAMP consortium consists of the following partners

The PhD student employed by Delft University of Technology will conduct research as part of the STAMP project together with the STAMP partners. Employment will be for a period of four years. The PhD student will enroll in the TU Delft Graduate School.

The primary line of research for the TU Delft PhD student will revolve around runtime test amplification. Online test amplification automatically extracts information from logs collected in production in order to generate new tests that can replicate failures, crashes, anomalies and outlier events. The research will be devoted to (i) defining monitoring techniques and log data analytics to collect run-time information; (ii) detecting interesting behaviors with respect to existing tests; (iii) creating new tests for testing the behaviors of interest, for example through state machine learning or genetic algorithms; (iv) adding new probes and new log messages into the production code to improve its testability.

stamp-wps

Besides this primary line of research, the PhD student will be involved in lines of research led by the other STAMP partners, addressing unit test amplification and configurability test amplification. Furthermore, the PhD student will be involved in case studies and evaluations conducted in collaboration with the industrial partners in the consortium.

From the TU Delft Software Engineering group, several people will be involved, including Arie van Deursen (principal investigator), Andy Zaidman, and Mauricio Aniche. Furthermore, where possible collaborations with existing projects will be setup, such as the 3TU Big Software on the Run and TestRoot projects.

Requirements for the PhD candidate include:

  • Being a team player;
  • Strong writing and presentation skills;
  • Being hungry for new knowledge in software engineering;
  • Ability to develop prototype research tools;
  • Interest in bringing program analysis, testing, and genetic algorithms together;
  • Eagerness to work with the STAMP partners on test amplification in their contexts;
  • Completed MSc degree in computer science

For more information on this vacancy and the STAMP project, please contact Arie van Deursen.

To apply, please send an application letter to Tamara Brusik. The letter should include a clear motivation why you want to work on the STAMP project, and an explanation of what you can bring to the STAMP project. Also provide your CV, (pointers to) written material (e.g. a term paper, an MSc thesis, or published conference or journal papers), and if possible pointers to (open source) software projects you have contributed to. The deadline for applications is November 15, 2016.

We look forward to receiving your application!

Asking Students to Create Exam Questions

Do you also find it hard to come up with good multiple choice questions? Then maybe you will like the idea of letting students propose (rather than just answer) questions. A colleague suggested this idea, arguing that it would benefit the students (creating a question requires mastering the material) and would save me work as well.

I liked this idea, and during the last three years I have applied it in my undergrad software testing course. This is a course for around 200 students which are evaluated based on an individual multiple choice exam (besides programming work conducted in pairs).

In class, I discuss example questions, and I invite students to come up with their own. The logistics are as follows:

  • An exam consists of 40 multiple choice questions.
  • Students can submit their questions until one week before the exam.
  • As a teacher I decide which (if any) of the questions I include, and whether I think changes to the questions are necessary.
  • If I include a student question, the student benefits from knowing the answer and from receiving a small bonus for submitting an included question.

To help the students in creating questions, I point them to Cem Kaner’s post on writing multiple choice test questions. I explain that for each question I need:

  • A clear stem of one or two sentences that is meaningful in itself;
  • One clear correct choice;
  • Three distractors that are approximately equally plausible yet also objectively incorrect.

So far, I have used this procedure for eight exams during the last three years.

The students who have proposed questions that I include consistently turn out to belong to the best. This probably means that only very good students go through the effort of creating a question; It also suggests that trying to come up with a question is a good way of preparing for an exam.

For each exam I receive 10-20 questions from around five students: This very much depends on the individual students and may vary per year. Some students recognize the opportunity and submit 20 questions; But most consider it too difficult and do not come up with any.

I typically include 3-5 student questions in the exam (so one in ten questions comes from a student). This essentially depends on the number of good student questions proposed — I don’t impose an upper limit on the number of questions students can submit nor on the total number of student questions that I’m willing to include.

It is only at the exam that the students find out which questions I ask, and whether any of their questions are included. So while there is the possibility that all students share and know in advance some of the questions that might be asked, the students still need to prepare to answer other questions.

My class wondered whether I would be willing to let all 40 questions be provided by a student: My answer was ‘yes’: if a student masters the material so well that he or she is able come up with 40 usable questions covering all the material, that students deserves the highest grade.

Not all submitted questions are usable. I haven’t done the precise math, but I think I include around 20% of the proposed questions. Reasons not to include a question typically are that the question is too simple, that it is ambiguous (some distractors can be considered correct too), or that it overlaps with another question that I consider better. Some students also propose (small variations on) questions that I had used in exams of earlier years. If the similarity is too big, I reject the question.

In some cases I adopt the underlying “idea” of a proposed question, yet rewrite it substantially. In those cases the proposing student still receives the bonus point; Furthermore, the student will probably still know the correct answer.

The best part about involving students in exam creation is that some of the proposed questions are better than I could have made myself. Such questions relate to the students’ own experience (e.g.: “In an earlier course we had to aim at 80% line coverage. In light of what we learned in this course, which of the following …”).

Overall, I am very happy with this way of involving students in the exam creation process. It not only saves me some (though not much) work — it also results in inspirational questions that I could not have invented myself. And, perhaps most importantly, it makes exam creation a lot more fun to me.


Acknowledgments:

  • The idea to let students propose their own questions was suggested to me by Julia Caussin, programme coordinator of the bachelor computer science at Delft University of Technology.

  • Image credit: bilal-kamoon, flickr, CC BY 2.0.


See also:

Embedded Software Development with C Language Extensions

Arie van Deursen, with Markus Voelter, Bernd Kolb, and Stephan Eberle.

In embedded systems development, C remains the dominant programming language, because it permits writing low level algorithms and producing efficient binaries. Unfortunately, the price to pay for this is limited support for explicit and safe abstractions.

To overcome this, engineers at itemis and fortiss created mbeddr: an extensible version of C that comes with extensions relevant to embedded software development. Examples include explicit support for state machines, variability management, physical units, interfaces and components, or unit testing. The extensions are supported by an IDE created through JetBrains MPS. Furthermore, mbeddr users can introduce their own extensions.

To me, the ideas under mbeddr are extremely appealing. But I also had concerns: Would this work in practice? Does this scale to real world embedded systems? What are the benefits of such an approach? What are the problems?

Therefore, when Markus Voelter, lead architect of mbeddr invited me to join in a critical evaluation of a system created with mbeddr that they just shipped, I happily accepted. Eventually, this resulted in our paper Using C Language Extensions for Developing Embedded Software: A Case Study, which was accepted for publication and presentation at OOPSLA 2015.

The subject system built with mbeddr is an electricity smart meter, which continuously senses the instantaneous voltage and current on a mains line using analog front ends and analog-to-digital converters. It’s mbeddr implementation consists of 80 interfaces and 167 components, corresponding to roughly 44,000 lines of C code.

Main layers, sub-systems, and components of the smart metering system.

Main layers, sub-systems, and components of the smart metering system.

Our goal in analyzing this system was to find out the degree to which C language extensions (as implemented in mbeddr) are useful for developing embedded software. We adopted the case study research method to investigate the use of mbeddr in an actual commercial project, since the true risks and benefits of language extensions can be observed only in such projects. Focussing on a single case allows us to provide significant details about that case.

To achieve this goal, we investigated the following aspects of the smart metering system:

  1. Complexity: Are the abstractions provided by mbeddr beneficial for mastering the complexity encountered in a real-world embedded system? Which additional abstractions would be needed or useful?
  2. Testing: Can the mbeddr extensions help with testing the system? In particular, is hardware-independent testing possible to support automated, continuous integration and build? Is incremental integration and commissioning supported?
  3. Overhead: Is the low-level C code generated from the mbeddr extensions efficient enough for it to be deployable onto a real-world embedded device?
  4. Effort: How much effort is required for developing embedded software with mbeddr?

The detailed analysis and answers are in the paper. Our main findings are the following:

  • The extensions help mastering complexity and lead to software that is more testable, easier to integrate and commission, and that is more evolvable.
  • Despite the abstractions introduced by mbeddr, the additional overhead is very low and acceptable in practice.
  • The development effort is reduced, particularly regarding evolution and commissioning.

In our paper, we also devote four pages to potential threats to the validity of our findings. Most importantly, in our experience with this case study and other projects, introducing mbeddr into an organization may be difficult, despite these benefits, due to a lack of developer skills and the need to adapt the development process.

The key insight for me is that mbeddr can help bring down one of the biggest cost and risk factors in embedded systems development, which is the integration and commissioning on the target hardware. Typically, this phase accounts for 40-50% of the project cost; for the smart meter system this was 13%. This was achieved by extensive unit and integration testing, using interfaces that could be instantiated both in a test as well as a target hardware environment.

Continuous integration is not just about the use of a continuous integration server. It is primarily about carefully modularizing the system into components that can be tested independently in different environments. Unfortunately, modularization is hard, especially in languages without explicit modularization primitives. Our study shows how extending C with language constructs can help to devise a modular, testable architecture, substantially reducing integration and commissioning costs.

For more information, see:

  • Markus Völter, Arie van Deursen, Bernd Kolb, Stephan Eberle. Using C Language Extensions for Developing Embedded Software: A Case Study. OOPSLA/SPLASH 2015 (pdf).
  • Presentation at OOSPLA 2015 by Markus Voelter (youtube, slides)
  • Information on this paper at the OOPSLA program pages.

Delft Technology Fellowship for Top Female (Computer) Scientists

TU Delft Logo

Delft University of Technology is aiming to substantially increase the number of top female faculty members. To help accelerate this, the Delft Technology Fellowship offers high-profile, tenure-track positions to top female scientists in research fields in which Delft University of Technology (TU Delft) is active.

One of those fields is of course Computer Science — so if you’re a female computer scientist (or software engineering researcher!) interested in working as an assistant, associate or even full professor (depending on your experience) at the departments of Computer Science and Engineering of the TU Delft Faculty of Electrical Engineering, Mathematics, and Computer Science (EEMCS), please consider applying.

Previous rounds of the TU Delft Fellowship program were held in 2012 and 2014. In both years, 9 top scientists were hired, in such diverse fields as interactive media design, protein machines, solid state physics, climate change, and more.

Since applicants can come from any field of research, the competition for the TU Delft fellowship program is fierce. The program is highly international, with just four out of the current 18 fellows from The Netherlands. As a fellow, you should be the best in your field, and you should be able to explain to non computer scientists what makes you so good.

As a Delft Technology Fellow, you can propose your own research program. As in previous years, it can be in any research field in which TU Delft is active, such as computer science.

The computer science and engineering research at TU Delft is organized into 12 so-called sections, covering such topics as algorithmics, embedded software, cyber security, pattern recognition, and my own topic software engineering. Each section consists of around four faculty members and 10-15 PhD students, and is typically headed by one full professor. PhD students are usually externally funded, through government subsidies obtained in competition, or via collaborations with industry.

As a fellow at the EEMCS faculty, you are expected to bring your own topic. You would, however, typically be working within one of the existing sections. Thus, if you apply, it makes sense to identify the section that is most related your area of work, and explore if you see collaboration opportunities. To that end, you can contact any of the section leaders, or me if you want to discuss where your topic would fit best. Naturally, if you are in software engineering, also feel free to contact me, or any current SERG group member.

For formal instructions on how to apply, please consult the Fellowship web site. The application procedure is open from 12 October 2015 until 8 January 2016.

PhD/PostDoc Vacancies in Persistent Code Reviews

logo-nwo

In the fall 2015 we are starting a brand new project that we titled Persistent Code Reviewing, funded by NWO. If you’re into code reviews, software quality, or software testing, please consider applying for a position as PhD student or Postdoc within this project.

To quote the abstract of the project proposal:

Code review is the manual assessment of source code by human reviewers. It is mainly intended to identify defects and quality problems in code changes before deployment in production. Code review is widely recommended: Several studies have shown that it supports software quality and reliability crucially. Properly doing code reviews requires expensive developer time and zeal, for each and every reviewed change.

The goal of “Persistent Code Reviews” project is to make the efforts and knowledge that reviewers put in a code review available outside the code change context to which they are directed.

Naturally, given my long term interest in software testing, we will include any test activities (test design and execution, test adequacy considerations) that affect the reviewing process in our analysis.

The project is funded by the Top Programme of NWO, the Netherlands Organization for Scientific Research.

Within the project, we have openings for two PhD students and one postdoctoral researcher. The research will be conducted at the Software Engineering Research Group (SERG) of Delft University of Technology in The Netherlands. At SERG, you will be working in a team of around 25 researchers, including 6 full time faculty members.

In this project you will be supervised by Alberto Bacchelli and myself. To learn more about any of these positions, please contact one of us.

Requirements for all positions include:

  • Being a team player;
  • Strong writing and presentation skills;
  • Being hungry for new knowledge in software engineering;
  • Ability to develop prototype research tools;
  • Interest in bringing program analysis, testing, and human aspects of software engineering together.

To apply, please send us an application letter, a CV, and (pointers) to written material (e.g. a term paper or an MSc thesis for applicants for the PhD positions, and published papers or the PhD thesis for the postdoc).

We are in the process of further distributing this announcement: Final decisions on the appointments will be made end of October.

We look forward to receiving your application as soon as possible.