Preprint or not Preprint? A discussion worth having
by Maria Eichel
Times changed suddenly in spring 2020, when the Covid-19 pandemic began to affect not only the social life but also the daily work life of people around the globe. In a recent article by the Open Science group in collaboration with the Offspring, we highlighted how research opens up during the Covid-19 pandemic and how important transparency and open resources are in times when information needs to be accessed quickly and easily (for more information see: https://www.phdnet.mpg.de/131182/2020-04-14_openscience-covid19?c=22833). In this article we would like to inform you about the benefits and possible downsides of uploading an article to preprint servers such as bioRxiv.
First, let’s go back some months to December 2019. After attending the Open Access Ambassador conference in Berlin, an event organized by the Max Planck Digital Library and Max Planck PhDnet, I came back to my institute not only with more knowledge about Open Science practices but also with an urge to spread the word about Open Access initiatives, for example the availability and rising use of preprint servers like bioRxiv. Once I started talking about it with my peers, I noticed two different reactions: I) those who are really into the topic and sometimes knew even more about it than me and II) those who responded, “sounds good but I would like to know more about this”. Of course, there was also a third, more wary reaction: “Isn’t there any danger in uploading to preprint servers?”. In exactly that moment I decided to summarize what I learned and collect different perspectives and existing data, first to give a talk, “Preprint or not preprint? A discussion worth having”, and then to write this article for a broader audience. Coming from the biological sciences, I will mostly refer to bioRxiv, but some of this information applies to other fields as well. Please keep in mind that this is an opinion piece, even though I did some thorough research; it is not a guideline.
“I want to publish on a preprint server because….” – “I don’t want to publish on a preprint server because…”
This can be a scenario some of you might have experienced already. It can be a discussion between colleagues and friends, between authors of a manuscript, between early career researchers (ECRs) and senior researchers or between collaboration partners; but it is always a discussion between scientists. This often leads to a circle of arguments with no one really convincing the other. Despite these differences, in the end we all share a common goal, to foster good science; therefore we must find a common solution to this problem in publishing. Let us consider some important characteristics of an “ideal” scientist. In my opinion, scientists are rational, based on facts, curious, realistic, open and as transparent as possible, respectful, brave, and with an urge to shape and improve the future of science. Am I a bit optimistic or dreamy? Maybe yes, but isn't this kind of optimism the beauty of young minds?
Second, let's consider why we publish. As an ECR with a passion for my project I would like others, including lay people, to know about my science. Being at the end of my PhD, I have started to plan my future career steps, and here comes the second reason: I want to continue doing what I like, which is working as a scientist. Publishing increases my chances to remain in science after my PhD, and later in my career I will reach the point where I need money to keep doing what I like. Which leads me to one other reason for publishing: we need publications to get money! I am pretty sure all of you can see the circle there and are aware of the many hurdles along the way to just be able to continue doing what you like so much. These hurdles won't be the content of this article though; it is rather about one possible way to jump across some of these hurdles – if there is a change in the way we publish altogether!
A short history of preprints
And now let's travel back in time about 60 years: in the 1960s the National Institutes of Health (NIH) created the first Information Exchange Groups (IEGs), each of which was basically a group of scientists from similar research areas. With the help of the NIH, scientists could send in drafts, which were copied and distributed within the respective groups to foster scientific exchange. Quite some famous names joined in, such as James Watson and Francis Crick, who discovered the structure of DNA. In the end, due to immense costs and a lot of pressure from the publishing sector, the IEGs were forced to shut down (see the full story at: https://www.sciencemag.org/news/2017/08/forgotten-experiment-biologists-almost-launched-preprint-revolution-5-decades-ago#). The first official preprint repository, arXiv.org, was founded in 1991, even before the internet became popular. It mainly contains articles from physics, mathematics and computer science and has grown immensely over the past decades. What has been common in the physics world for decades finally started for the biological sciences in 2013 with the launch of the nonprofit preprint server bioRxiv by Cold Spring Harbor Laboratory (CSHL). And it doesn’t stand alone: between 2007 and 2012 the Nature Publishing Group ran the server “Nature Precedings”, and today there are initiatives promoting preprints such as ASAPbio (https://asapbio.org/) and other preprint servers such as the multidisciplinary platform Preprints (https://www.preprints.org/). One important contribution to the rise of the preprint was that research funders such as the NIH, the Wellcome Trust, the U.K. Medical Research Council and the German DFG legitimized and even encouraged the use of preprints in project proposals. As a reminder: this is where we get the money to continue doing what we like!
In addition, big scientific journals like Cell, Science or Nature and plenty of others formally accept the submission of manuscripts which have already been posted on a preprint server [2, 3]. If you are unsure about the open access policies of the journals you plan to submit to, you can find useful information in the Sherpa Romeo database (http://www.sherpa.ac.uk/romeo/search.php). By 2019 more than 1 million articles were downloaded monthly from bioRxiv, with authors from neuroscience and bioinformatics submitting the majority of studies [4, 5].
Are you still wondering what a preprint actually is? That is a good question, and it is important to know the answer before we start our discussion. Wikipedia gives a nicely summarized description: “[…] a preprint is a version of a scholarly or scientific paper [author's remark: with a citable DOI] that precedes formal peer review and publication in a peer-reviewed scholarly or scientific journal. The preprint may be available, often as a non-typeset version available free, before and/or after a paper is published in a journal”.
Now, let's start with the arguments against bioRxiv that I came across either on the internet or in discussions with scientists at all career stages. Before turning to the positive effects of posting your manuscript on a preprint server, I will comment on some of the negative aspects right away.
“Preprint or not preprint? – a discussion”
One of the main arguments I have heard so far is probably also the most common one: “A journal might not recognize the preprint or reject my manuscript because of it.” Yes, there are journals out there whose policies on open access and on manuscripts previously uploaded to preprint servers are, in my opinion, rather old-fashioned. For the life sciences this is mainly the New England Journal of Medicine. But keep in mind that, especially for patented work as well as for clinical studies, uploading to bioRxiv might come with some danger. Recently, CSHL also launched medRxiv.org, a preprint server with tighter standards for health science, medicine and clinical research (https://www.medrxiv.org/).
Second argument: “A preliminary, non-peer reviewed study might be bad-mouthed publicly or on social media which could affect the decision of future editors or reviewers”. Plainly speaking: yes, this can happen. But this can also happen after publication, and we are all aware of published studies that have minor or even major flaws. Ask yourself: Do you trust in your science? Do you think your manuscript is ready for publishing and would you send it to a journal (whatever journal) in this state? Yes? Then go forward with it. Your manuscript can and will be judged by editors, reviewers and other scientists no matter if you publish in a journal or upload it to bioRxiv. A bioRxiv user survey by Sever and colleagues in 2019 showed that users received feedback on their manuscript mainly via Twitter (44%), privately via email (37%) and in conversation with colleagues (34%). In my small scientific field, I checked various labs on bioRxiv and Twitter and could find no indication of public bad-mouthing of studies which potential reviewers or editors could see. And honestly speaking, would I openly bad-mouth a scientist in my field, or would I rather approach them personally with my criticism? I am not trying to convince anyone, but I want to make critics aware that what they fear might not be happening publicly on social media and might not affect the future prospects of a well-conducted paper. On the contrary, one of the strongest motivations to post manuscripts on bioRxiv is indeed to increase the awareness of your own research (about 80%, just ahead of the motivation to benefit science).
Another critical argument I came across is that rushing to preprint might sacrifice accuracy. This argument is absolutely valid, and that’s why you should never rush a manuscript, whether for a preprint server or a journal. And yes, there are manuscripts on bioRxiv that are not complete. Keep in mind that you set the standard for your science when you upload a manuscript to a preprint server. This way others can also see what your standard of science is and what you consider a finished manuscript. Especially for ECRs this point might be crucial, but I will come back to this later. In my opinion preprint servers should not replace journals but rather serve as an add-on, one which has a lot of advantages for the scientific field and for yourself as a scientist.
“There is no selection and revision, and everyone can just publish everything”. Well, let's be honest here: How often did you come across a published paper that had major mistakes or appeared anything but flawless, or where you wondered, “Really, this came out in journal xxx?” So even with published papers I turn on my brain and try to evaluate the content of the study, and that’s what we all are supposed to do and need to learn (see the characteristics of the ideal scientist above). One could even turn this around and see it as a chance for an ECR to learn to reflect more on the science when reviewing bioRxiv articles instead of trusting journal names and impact factors. However, there is a risk that publicly available scientific data are misused by mainstream media and the private sector, which can be quite dangerous, especially during times of Covid-19. For this reason, bioRxiv (and other preprint servers) reminds everyone that the posted studies are preliminary reports and have not been peer-reviewed. Such misuse can happen with peer-reviewed published papers as well, which is one of the reasons why science communication and education of the media are even more important nowadays.
Last but not least, two more arguments against posting on preprint servers came up regularly in discussions: “Preprints have a lower visibility” and “I fear getting scooped”. I will leave these last two contra-preprint arguments open for you to weigh yourself after reading the pro-preprint part of this article.
The aforementioned survey of 4000 bioRxiv users offers a nice collection of the most common pro-preprint arguments I personally came across (see figure below) and reflects what the majority of ECRs in the department I work in point out.
First, I will refer to the arguments related to quality, impact and discoverability: “To increase awareness of your research, to benefit science, to control when research is available and receive feedback”. When you post your article on a preprint server, the public and private feedback you receive as an author naturally comes from a broader audience, whether on the bioRxiv platform, via email or on Twitter. This way your science can reach a community beyond your lab, your coauthors and selected reviewers. Input at this stage of a manuscript can even improve your chances of getting published, since you can use questions or criticisms to prepare for your revision phase or to revise your manuscript before submission. Because your research is disseminated faster, your manuscript revision can potentially speed up. Further, the word about your science is out there, and your research is visible and citable since you get a DOI with your upload to bioRxiv. Another study on bioRxiv, which was also featured on Natureindex.com, indeed suggests that published articles which were first uploaded to preprint servers get more citations and online visibility than those without a preprint. In addition, a preprint comes in handy for job applications. If you upload a preprint (instead of waiting until your manuscript is published after long revision processes and possible paper hopping), future employers can already see your research and your standard of science and might be more interested in hiring you.
Notably, by now major research funders such as the NIH, MRC, EMBL, DFG and many more take preprints into account for job and grant applications. If you are at a career transition stage, it might be crucial to present the work you did so far so that future supervisors or grant agencies can judge which field you are coming from and get insight into what you did. These points are especially, but not exclusively, crucial for ECRs who are applying for postdocs, new positions or grants, but they can also come in handy for faculty positions and the like. Mentors of ECRs should especially consider these points and support their graduates.
More than 50% of the survey respondents also named staking a priority claim on their research as a motivation to upload to bioRxiv. This motivation should be crucial for everyone, especially in highly competitive fields, because you prove that you were the first to have a manuscript ready at a given time point and thus receive a timestamp for your research. This priority claim is also commonly checked by journals, and there are even examples of side-by-side publication of two studies because one was uploaded to a preprint server before. Nowadays some journals even offer so-called “scooping protection”, which means they will consider your manuscript if it was uploaded to a preprint server within a given time frame, even though a competitor might have published a similar story in the meantime. Nonetheless, it can be a double-edged sword, since competing labs might try to claim priority with an unfinished manuscript. This emphasizes why a discussion with your coauthors and supervisors about uploading a preprint is indispensable; the decision is also vastly dependent on your field of research.
Last but not least, posting articles on preprint servers can prevent redundant work (e.g. through the posting of negative data), foster collaboration between similar projects and provide a platform for controversial findings. In addition, we should not forget that by using preprint servers, underrepresented groups in the scientific field (e.g. smaller, unknown or younger groups, minorities or researchers from less developed countries) have a chance to make their research available without high submission and publishing fees, and can also easily access the research of others. This can spread research even wider and is one major reason for open access policies: to better the scientific field together! And let’s not forget that uploading your manuscript to bioRxiv is free [3, 7, 8].
The development of preprint servers has changed the way we think about distributing data, and in the future their use will most likely increase in several research fields; how quickly (or slowly) this occurs remains to be seen. Some journals and organizations, such as EMBO together with ASAPbio and others, and recently also eLife, have started initiatives called “Review Commons” or “Preprint Review”. These services offer to review your manuscript on bioRxiv and consider its publication in a respective journal alongside [9, 10]. A future article by the Offspring will focus on this initiative, so stay tuned.
If you have read this article up to this point, you will hopefully have an overview of the most common arguments about preprint servers, though by far not all of them. This can serve as a starting point for you to form your own opinion, read up on this topic (see literature below) and increase the awareness amongst your peers. Are you convinced? Go out and have discussions with your fellow scientists, across departments and graduate schools, and with your supervisors about whether your next manuscript will be uploaded to bioRxiv. Keep an open mind, also for opposing arguments. After all, in my opinion scientists are rational, based on facts, curious, realistic, open and as transparent as possible, respectful, brave, and with an urge to shape and improve the future of science. Would you like to join me in this dream?
(also published in eLife: https://elifesciences.org/articles/45133)
(also featured in Nature Index: https://www.natureindex.com/news-blog/preprints-boost-article-citations-and-mentions)