May 25, 2021

Citation Hustle

hooking into academe for fair code credit

This post is part of a series, Code Credit License.

I started the Code Credit License because I’ve never heard a good argument against public credit for software developers, and the credit system we had, copyright notice, was never sufficient and broke down for network apps more than two decades ago. If you give software away and other people use it to make public work, you should get public credit for contribution. Especially if others do.

Without a strong, consistent norm of credit giving and taking, backed by legal terms as necessary, developers are stuck not only begging for the credit they deserve, but begging awkwardly. Case in point: academic citation.

I’ve long been aware of GNU parallel’s plea for citations. The maintainer went so far as to block execution on first run to show a notice about citation, demanding that the user type in confirmation that they will abide. The proper form of citation remains available behind parallel --citation:

Academic tradition requires you to cite works you base your article on. When using programs that use GNU Parallel to process data for publication please cite:

O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.
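For anyone who keeps references in BibTeX, here is a minimal sketch of how that notice might be turned into an entry. The helper name `bibtex_article` and the entry key `tange2011` are hypothetical, made up for illustration. The field values are copied from the notice above, and the choice of the `@article` entry type is my assumption, not something GNU Parallel prescribes.

```python
def bibtex_article(key: str, fields: dict[str, str]) -> str:
    """Render a BibTeX @article entry from a key and a mapping of fields."""
    lines = [f"@article{{{key},"]
    for name, value in fields.items():
        # Wrap each value in braces so BibTeX keeps it verbatim.
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Field values copied from the citation notice quoted above.
# The key "tange2011" is a hypothetical label chosen for this example.
tange = bibtex_article(
    "tange2011",
    {
        "author": "Tange, O.",
        "title": "GNU Parallel - The Command-Line Power Tool",
        "journal": ";login: The USENIX Magazine",
        "year": "2011",
        "month": "February",
        "pages": "42-47",
    },
)

print(tange)
```

The point of the sketch is only that the notice already carries everything a reference manager needs: author, title, venue, date, and pages.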

Earlier this week, watching Hadley Wickham’s “State of the Tidyverse 2020” talk, I saw the same again, with a twist:

Really excited that we now have a paper that you can cite if you want to cite the entire Tidyverse, which you can get by using the citation function. The kind of idea of that paper is that rather than having to cite every package you might use individually, there’s just, like, one place you can cite. It’s a published article in The Journal of Open Source Software, where, it’s kind of…I have to admit that I’m, like, slightly addicted to checking the citation count. Like, normally when you check the citation count of a paper it, like, changes, like, once every six months or something. But we’ve already had, like, seventeen citations in the two months since it’s been published, which is really, really amazing.

Unlike parallel, the Tidyverse, a constellation of R packages for statistical analysis, stands close to academia both in who makes it and what it does. Hadley is a professor of statistics at three universities, in addition to being chief scientist at RStudio. R and the Tidyverse remain very popular among statistical analysts of all stripes, including academic researchers.

But to get citations, you can’t just publish good work. You have to publish something citable. Academics don’t know how to cite software, and aren’t terribly interested in finding out. They know how to cite PDFs in journals with ISSNs and DOIs. So at least within academia, as it stands now, it’s on you to go above and beyond making software, to engage with a process like that of JOSS, and register yourself in the academic universe. Then, maybe, the ethical and professional norms of credit where due can apply to you.

Daniel S. Katz—a colleague I follow, and recommend following—has been involved on both sides of this realization: earlier in writing and pushing conventions for citing software as software, now in JOSS. That’s work worth acknowledging on its own.

I’m sure there are more examples of software projects jumping through these hoops, with or without the journal process. According to this page, JOSS has published more than a thousand papers. But I think I’ve seen enough to catch the pattern.

I wrote the first drafts of the Code Credit License aware of the efforts toward citation principles, but unaware of JOSS. I think the terms we ended up with still hold:

Give Credit

Give this software and each contributor credit for contributing to goods or services that you develop, test, produce, or provide with the help of this software.

How to Give Credit

In general, give credit in such a way that others can freely and readily find a written notice identifying this software, by name, as a contribution to your goods or services, as well as each contributor, by name, as a contributor to this software. Do not do anything to stop others from sharing, publishing, or using those credits.

Conventions

If widespread convention dictates a particular way to give credit for your kind of goods or services, such as by end credit for a film, citation for an academic paper, acknowledgment for a book, or billing for a show, then follow that convention. …

In order to cover multiple kinds of output (not just academic papers, but films, artworks, books, and so on), the terms had to be somewhat general. That looks ever more like a strength, rather than a weakness. Credit norms evolve. New media arise.

I would certainly read the Conventions section as fully compatible with citation to a JOSS paper for a package or package ecosystem. In the absence of such a paper, I would still read the terms to require academic authors to cite projects and contributors by name, perhaps according to the software citation principles.

Your thoughts and feedback are always welcome by e-mail.
