How journals make assertions: An insight into the publishing industry

Richard Wynne

Dec 12, 2017

When I was 19, I took a trip to the United States. I travelled across the country by bus, and was able to do a lot of reading as I moved from state to state. One of the books I read was The Effective Executive by Peter Drucker. A key message of the book is “know your business.” For example, if you think you’re in the oil business, you’re wrong. You’re actually in the energy business. If you’re in the postal business, you’re really in communications, and you’d better understand the impact of e-mail!

Assertions journals make about their content

What “business” is the scholarly publishing industry in? The content business? No. What business are journals in? I would say that journals are in the “assertions” business. Here are some of the assertions that journals make: This content is worth reading because it is novel and because it is original; These are the authors; These are the institutions; The content has been through peer review; The statistics support the conclusions, etc. If you think about the work that a journal does, it’s really collecting a group of assertions about the content. It’s the assertions that make the content valuable.

Historically, assertions have been implied by journal format. A journal didn’t need to explicitly say “this is the title,” “these are the authors,” “this is the abstract.” Instead, all of this was woven into the format that a journal used. Using format to communicate assertions has worked well for the past 350 years, dating all the way back to Henry Oldenburg’s Philosophical Transactions.

The need to validate assertions

But the assertion business is changing, and fast! To begin with, it’s increasingly easy to make false or inaccurate assertions. If format is all a journal uses to communicate assertions, anyone else can replicate that format and appear to make the same assertions, even when they clearly do not hold. What about the quality of the assertions made? If you happen to make an unsubstantiated assertion these days, there’s a whole industry, including outlets like Retraction Watch, exposing these kinds of failures. Format is no longer a guarantee of assertion quality.

Research funders are spending $1.6 trillion a year, or about $50,000 a second, so understandably they want better tools to evaluate research output. They want higher-quality, more accurate, more granular assertions. Those assertions must be machine readable, not just human readable. The more reliable the assertions, the easier it is to demonstrate a return on research investment.

From a journal perspective, is this an opportunity or a threat?

Let’s not think about content workflow. Rather let’s think about an assertion workflow. An author makes a number of assertions about what they submit to a journal: I’m an author. These are my co-authors. Here’s who funded the paper. Here are the methods I used. The data supports the results, etc. One of the things that the journal does is evaluate those assertions for accuracy. Then, the journal makes its own assertions: This work has been through peer review. This is original. Next, the journal boils the assertions down to published content. This is where it’s important to understand that the process of creating assertions is not the same as the process of creating format.

Tools journals use to validate assertions

Semantic tagging: This means taking a look at the process or, from a technical perspective, at the workflow infrastructure. How assertions are tagged can be critical to their usefulness in workflow. Take the word Brown, for example. Is this word a university? A name? A color? A street address? Let’s start with simple format tagging. For example, <b>Brown</b> tells software applications that Brown should be shown in bold. Still, that is format, which is not a reliable way to communicate assertions, because it doesn’t say anything about what Brown is.

Using semantic tagging, such as <author> Brown </author>, it is now clear what Brown is: an Author. However, because different publications might use different semantic tags to describe an author (e.g. Contributor, Author, or Article Author) there is still some ambiguity. What’s needed is an agreed way of tagging the author. This is where Document Type Definitions (DTDs) come in.
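To make the difference concrete, here is a minimal Python sketch of how software might read the two kinds of tag. The tag names, the list of synonyms, and the find_authors helper are all assumptions made for illustration; they are not taken from any particular publisher’s markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments: the same word tagged for format and for meaning.
FORMAT_TAGGED = "<p><b>Brown</b></p>"
SEMANTIC_TAGGED = "<p><author>Brown</author></p>"

# Assumed synonyms that different publications might use for the same role.
AUTHOR_TAGS = {"author", "contributor", "article-author"}


def find_authors(xml_text):
    """Return the text of every element whose tag asserts authorship."""
    root = ET.fromstring(xml_text)
    return [
        element.text.strip()
        for element in root.iter()
        if element.tag in AUTHOR_TAGS and element.text
    ]


print(find_authors(FORMAT_TAGGED))    # [] : bold says nothing about what Brown is
print(find_authors(SEMANTIC_TAGGED))  # ['Brown']
```

The AUTHOR_TAGS set is exactly the kind of workaround that an agreed tagging standard makes unnecessary: without one, every consumer of the content has to guess which tag names mean “author.”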

Formatting: The Journal Article Tag Suite (JATS) is one agreed standard, available as a DTD, for tagging scholarly journal content. Software knows that Brown is an author from reading the tags around the name. A style sheet can then be used to apply formatting ‘rules,’ for example that anything tagged as an author should be shown in bold. Thus the author’s name is bolded and positioned in the correct area of the manuscript.
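As a rough illustration, the sketch below reads author surnames from a small JATS-style fragment using Python’s standard library. The contrib, name, surname, and given-names elements and the contrib-type attribute follow JATS conventions; the fragment itself, the second contributor, the initials, and the author_surnames helper are invented for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative JATS-style fragment (not taken from a real article).
JATS_FRAGMENT = """
<contrib-group>
  <contrib contrib-type="author">
    <name><surname>Brown</surname><given-names>A.</given-names></name>
  </contrib>
  <contrib contrib-type="editor">
    <name><surname>Green</surname><given-names>B.</given-names></name>
  </contrib>
</contrib-group>
"""


def author_surnames(jats_xml):
    """Return surnames of contributors explicitly tagged as authors."""
    root = ET.fromstring(jats_xml)
    return [
        contrib.findtext("name/surname")
        for contrib in root.findall("contrib[@contrib-type='author']")
    ]


print(author_surnames(JATS_FRAGMENT))  # ['Brown']
```

Because the role is carried by the tags rather than by position or typeface, the same lookup works on any article that follows the agreed tag set.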

Formatting is nothing but presentation. And when there’s talk of formatting, XML cannot be far behind. It may be tempting to think of XML as just a ‘fancy’ and convenient way to do formatting. But it’s more than that: the tags record what each piece of content is, and the presentation is decided separately. Journals can change the styling ruleset to say, for example, that all author names should be shown in blue, and from that point on, that will happen.
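The separation of meaning from presentation can be sketched in a few lines of Python. In a real publishing workflow this job is usually done by a style sheet; the dictionaries and the render function below are hypothetical stand-ins that only show the principle.

```python
from html import escape


def render(role, text, rules):
    """Wrap text in the style that the ruleset assigns to its semantic role."""
    style = rules.get(role)
    safe_text = escape(text)
    if style is None:
        return safe_text  # no rule for this role: leave the text unstyled
    return f'<span style="{style}">{safe_text}</span>'


# Two interchangeable rulesets; the tagged content itself never changes.
bold_rules = {"author": "font-weight: bold"}
blue_rules = {"author": "color: blue"}

print(render("author", "Brown", bold_rules))
# <span style="font-weight: bold">Brown</span>
print(render("author", "Brown", blue_rules))
# <span style="color: blue">Brown</span>
```

Swapping bold_rules for blue_rules restyles every author name without anyone touching the content, which is only possible because the content says “author” rather than “bold.”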

Persistent identifiers: Still, in order to get the maximum value out of XML, persistent identifiers need to be used. Take ORCID and the author named Brown. Here, ORCID will reliably tell the software which specific Brown the journal is referring to. Using API integrations, a journal can validate that it’s not just anyone asserting which Brown we’re talking about, but an authoritative source (i.e. ORCID). If an unsophisticated user decides to manually type ORCID iDs into manuscript XML, that achieves nothing beyond adding a bit of text to the XML output. The problem is that the typed-in ORCID iD hasn’t been validated anywhere. The ‘right’ way to add an ORCID iD to XML is through an API call to the ORCID database. This empowers users themselves to validate their ORCID iD. Outputting a validated iD in XML is a reliable assertion that flows through the workflow.
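The gap between a typed-in iD and a validated one can be sketched as follows. The checksum routine follows the ISO 7064 MOD 11-2 scheme that ORCID iDs use, but it only shows that an iD is well formed, not that it belongs to this author. The confirmed_by_registry argument is a hypothetical stand-in for the authenticated exchange with ORCID; it is not a real ORCID API call, and the author_assertion helper is likewise an assumption for illustration.

```python
def orcid_checksum_ok(orcid_id):
    """Check the ISO 7064 MOD 11-2 check character of an ORCID iD."""
    chars = orcid_id.replace("-", "").upper()
    if len(chars) != 16 or any(c not in "0123456789" for c in chars[:15]):
        return False
    total = 0
    for c in chars[:15]:
        total = (total + int(c)) * 2
    check = (12 - total % 11) % 11
    return chars[15] == ("X" if check == 10 else str(check))


def author_assertion(name, orcid_id, confirmed_by_registry):
    """Build an author assertion and record whether it was validated.

    confirmed_by_registry is a hypothetical callable standing in for an
    authenticated exchange with the ORCID registry.
    """
    validated = orcid_checksum_ok(orcid_id) and confirmed_by_registry(orcid_id)
    return {"author": name, "orcid": orcid_id, "validated": validated}


# Sample iD from ORCID's documentation, paired with "Brown" purely for illustration.
SAMPLE_ID = "0000-0002-1825-0097"

typed_in = author_assertion("Brown", SAMPLE_ID, lambda _id: False)
confirmed = author_assertion("Brown", SAMPLE_ID, lambda _id: True)
print(typed_in["validated"])   # False: well formed, but nobody vouched for it
print(confirmed["validated"])  # True: the authoritative source confirmed it
```

Both records carry the same sixteen characters; only the second is an assertion a funder or another system could rely on.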

This same approach can be taken to add more reliable assertions to workflow, for example by identifying an institution with which an author is associated. This is possible if the author and/or the institution has validated the author’s affiliation to, say, “Northeastern University” in China. This validation is possible through Ringgold institutional identifiers. Now the assertion that author Brown belongs to Northeastern University in China is validated and can persist in workflow. This type of assertion is valuable to funders since it helps them track their investment. But we can go further. We now know who did the research and which institution(s) they belong to, but we do not know what they did. Here CRediT roles can assert the contribution, and the degree of contribution, in workflow. Open Funder Registry identifiers can confirm who funded the research, citations can be asserted using DOI linking, and so on. This logic of interconnected assertions can be applied to many aspects of journal workflow.
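One way to picture these interconnected assertions is as a list of small records, each backed by an identifier scheme, that software can filter and export. The Assertion class, its field names, and every identifier value below are placeholders invented for this sketch; none of them is a real registry entry.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Assertion:
    claim: str        # what is being asserted, e.g. "author"
    value: str        # human-readable value
    id_scheme: str    # identifier scheme backing the claim
    identifier: str   # placeholder identifier, not a real one
    validated: bool   # True only if confirmed against the scheme's registry


def machine_readable(assertions):
    """Export only the validated assertions as JSON."""
    validated = [asdict(a) for a in assertions if a.validated]
    return json.dumps(validated, indent=2, sort_keys=True)


record = [
    Assertion("author", "Brown", "ORCID", "PLACEHOLDER-ORCID", True),
    Assertion("affiliation", "Northeastern University (China)",
              "Ringgold", "PLACEHOLDER-RINGGOLD", True),
    Assertion("contribution", "Formal analysis",
              "CRediT", "PLACEHOLDER-ROLE", True),
    Assertion("funder", "Example Funder",
              "Open Funder Registry", "PLACEHOLDER-FUNDER", False),
    Assertion("citation", "Example cited article",
              "DOI", "PLACEHOLDER-DOI", True),
]

print(machine_readable(record))
```

In this sketch the unvalidated funder claim is simply left out of the export, so anything downstream sees only assertions that an authoritative source has confirmed.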

Tools to aid peer review: Speaking of journal workflow, let’s look at peer review. In peer review systems such as Editorial Manager, there are tools to support making and validating additional journal assertions: Has this content been plagiarized? Is this content novel? (Similarity Check, Meta) Are the citations accurate? (Reference linking) Have conflicts of interest been disclosed? Are the statistics accurate? (StatReviewer)