“I have found about 2,000 problematic papers,” says Dr. Elisabeth Bik
When we talk about misconduct in research writing or publishing, we often think of plagiarism or duplicate publication. Another problem that is, unfortunately, all too common under the umbrella of misconduct is image manipulation or duplication. Today, we’re talking to a scientist who has decided to promote and protect the integrity of published research. Meet Dr. Elisabeth Bik, a science consultant and founder of the blog Microbiome Digest, who recently announced her decision to devote all of her time to scanning scientific papers for problematic images, pro bono.
I am taking a year off from paid work to focus more on my science misconduct volunteer work. Science needs more help to detect image duplication, plagiarism, fabricated results, and predatory publishers.— Elisabeth Bik (@MicrobiomDigest) April 26, 2019
Dr. Bik completed her doctorate on cholera at the Dutch National Institute for Health. After gaining four years of experience at the St. Antonius Hospital in Nieuwegein, she moved to the US and spent 15 years working on microbiome research at the Stanford University School of Medicine. She started the Microbiome Digest blog in 2014 as a way to share the latest research in the field on a daily basis, and her work on identifying image manipulation has been widely discussed. She has also held the positions of Science Editor and Scientific and Editorial Director at uBiome, a San Francisco-based biotechnology company, and Director of Science at Astarte Medical, a precision medicine company. In 2019, she became a Microbiome and Science Integrity consultant and began to spend her time identifying problematic images in scientific papers and speaking about research integrity. Dr. Bik has considerable publication and peer review experience. She has been an invited speaker at several conferences and universities, has trained and mentored graduate students and research assistants, and has set up several molecular labs.
In this conversation, Dr. Bik talks about her journey from researcher to science integrity consultant. We discuss her work in detail to understand how she identifies problematic images in scientific papers and how technology could help ensure the integrity of published research. She also talks about how peer reviewers and journals could help weed out cases of image manipulation before publication.
How did you think of setting up the Microbiome Digest? What criteria do you use to include papers about microbiology or microbiome on your blog?
The idea for the Microbiome Digest blog started during a May 2014 happy hour of the David Relman lab at Stanford, the lab I was working in at the time. In the previous year, I had been sending my lab mates weekly or sometimes daily digests of new literature that I found on PubMed. The microbiome research field was growing very fast, and the list of new papers was getting longer and longer. During that happy hour, my coworker Tomer Altman suggested that other labs might be interested in those papers too and that I should turn my emails into a public website. That evening I bought the MicrobiomeDigest.com domain and started the blog.
Around 2017, the blog was starting to take too much of a toll on my private life. Using Twitter, I recruited a wonderful team of volunteers who now run the blog. We still try to post daily. We include a variety of papers published in peer-reviewed journals, as well as articles from popular science websites. Topics range from human-associated microbiome papers to those about the microbiome of soil, plants, animals, or the built environment. We also include papers on other topics within microbiology, general science, or science and art.
You recently made the conscious decision to spend a considerable amount of time helping identify instances of image manipulation or data fabrication in published manuscripts. What led you to take this up and commit yourself to becoming a “data cop”?
I started my work in science integrity by reading about plagiarism in scientific papers. This led me to search for plagiarized text by using sentences from published papers as search terms in Google Scholar and analyzing those that turned up multiple times. Most sentences are unique, but in several cases I found both the original paper and the same sentence reused in a newer paper, suggesting that the text had been recycled. I found and reported about 80 papers and PhD theses that contained large chunks of plagiarized text.
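The search strategy Dr. Bik describes can be sketched in code. The toy function below is an illustration only, not her actual workflow: it flags long runs of words shared between two texts, which is the basic signal that text-recycling checks rely on (long identical word sequences are rare in independently written prose).

```python
def shared_ngrams(text_a, text_b, n=8):
    """Return the word n-grams that appear in both texts.

    Eight-word runs almost never coincide by chance, so any match
    is worth a closer manual look for recycled text.
    """
    def ngrams(text):
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    return ngrams(text_a) & ngrams(text_b)

# Hypothetical example: a newer passage recycling an older sentence.
older = ("The gut microbiome plays a central role in host metabolism "
         "and immune development in early life.")
newer = ("As others have noted, the gut microbiome plays a central role "
         "in host metabolism and immune development.")
matches = shared_ngrams(older, newer)
```

A real check would also normalize punctuation and search at scale, but the core idea of comparing overlapping word windows is the same.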
During one of my analyses of a PhD thesis that contained plagiarized text, I noticed the same blot appearing multiple times across several chapters, presented as different experiments. These chapters had also been published as scientific papers. I reported those two papers - now retracted - and started to search for duplicated images systematically. That was in the summer of 2014. I did those searches for five years in my spare time (weekends and nights). Together with my co-authors Arturo Casadevall and Ferric Fang, I scanned 20,000 biomedical papers containing photos and found duplicated or manipulated photos in about 800 (4%) of them. Our study was published in 2016.
Since then, I have kept on scanning more papers, either by focusing on particular journals or scientists, or by following up on other people's leads or requests. As of today, I have found about 2,000 problematic papers. It is a lot of work to keep track of these papers, write up reports of the problems, and send them to publishers, journals, or institutions. So earlier this year I decided to take at least one year off from paid work and do this full time.
How do you go about identifying instances of misconduct in scientific manuscripts? How easy or difficult is it? And what do you do when you identify issues?
I mainly look for duplicated images, or duplicated parts of images, within a biomedical paper, focusing on protein blots, microscopy images, and other photographic figures. I flip through the images and try to remember all the photos within that paper. If I see two panels that look similar or that might have an overlapping or repeated pattern, I make screenshots and compare them to each other. I might use Mac's Preview contrast enhancer to make an image lighter or darker to better see some features, but I do not use any software to initially find duplicated photos.
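The contrast trick Dr. Bik mentions amounts to a linear contrast stretch: rescaling a narrow band of gray values to the full display range so faint features become visible. A minimal sketch on a grayscale pixel matrix, purely for illustration (real tools like Preview operate on actual image files):

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values to the full 0-255 range.

    Faint features in a nearly uniform image (e.g. background texture
    in a blot photo) become much easier to compare by eye afterwards.
    """
    flat = [v for row in pixels for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                      # perfectly flat image: nothing to stretch
        return [[0 for _ in row] for row in pixels]
    scale = 255 / (hi - lo)
    return [[round((v - lo) * scale) for v in row] for row in pixels]

# A dim 2x3 "image" whose values span only 100..110:
dim = [[100, 105, 110],
       [102, 108, 100]]
enhanced = stretch_contrast(dim)      # values now span the full 0..255 range
```

After stretching, background patterns that repeat between two panels stand out much more clearly, which is exactly why the technique helps when comparing suspect figures side by side.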
The most common problem I encounter is control Western blots, where the same blot of a housekeeping protein, such as actin or globin, is used multiple times to represent different experiments. Another very common problem is overlapping microscopy images, where multiple photos are shown representing different experiments, but parts of the photos show overlapping areas. If two photos overlap, that means only one original sample was used, and one of the experiments might not have taken place.
It took me a while to learn to recognize the different types of overlaps and manipulations; some are easy to find, while others are much harder. It is, of course, a great deal of work to scan hundreds or thousands of papers, especially if they contain many complicated figure panels. I probably still miss a lot of duplications or manipulations, since I scan completely manually. I am hoping for software to replace me! On the other hand, the problems I am finding are all in published papers, meaning they have been peer reviewed by two or three other scientists and screened by editors and publishers' staff. Some of these papers have even been cited many times. On all those occasions, the people reading the paper did not notice the problems. So even though I might miss certain duplications, all the duplications I am finding had gone undetected by others.
It is probably hard to see the duplications if you are not aware that this is a problem. Once I point out the duplications, other people usually see the problem too. By posting some duplications or manipulations on Twitter, I hope to raise awareness with other scientists, so that they might check for duplications when they are a peer-reviewer, or discourage others from including duplicated images in their manuscripts.
Recently, I have also come across papers with extraordinary claims, published in low quality journals. There are many new publishers that misuse the Open Access publishing model, by appearing to care more about cashing in the publication fees than about the quality of their content. These so-called predatory publishers and journals are a threat to science because they appear to publish "peer reviewed" papers while in reality there is no quality check. It is very hard for the general audience to distinguish between a "real" publisher and a predatory one, and certain authors specifically publish in predatory journals to pretend their wild ideas are "peer reviewed".
How do authors typically react or respond when they notice irregularities pointed out in their work?
In the first five years of my image searches, I usually avoided reporting directly to authors. Instead, I reported to journals or institutions and let the editors contact the authors. But recently I started to publish signed (using my full name) concerns on PubPeer, a site where scientists can leave feedback on papers. Most authors do not respond to these posts, even though the website sends them an alert. The authors who do respond will usually say something along the lines of "Thank you for pointing out this error, we will send a correction to the journal, and by the way, this does not affect the conclusions of our paper" - or they will counterattack me personally. One of the authors of a paper that I commented on posted my home address online! Another author replied by reminding me of some legal troubles that my previous employer is facing. Those responses are disturbing, but they also suggest that I must have struck a nerve by posting my concerns.
With increasing publication volumes and the current allure of technology, many journals are choosing to automate some processes in the publication workflow, such as plagiarism checks or basic data sanity checks. What is your view on automation in the screening and evaluation of research papers? Do you think we will be able to develop a tool that provides a foolproof way of detecting malpractice in research or its reporting? Can technology replace the human eye when it comes to ensuring that only the best and highest-quality research is published?
Text-duplication (plagiarism) detection software works well and is used by most publishers of scientific journals. However, human assessment is still needed to rule out false positives. For example, textual similarities in the Methods, Acknowledgements, or References sections of a manuscript are often acceptable, as are definitions and quotes, so not all papers flagged for plagiarism are necessarily "bad".
With respect to image duplication and manipulation, software is still in the development phase, and no tool that I know of is currently in use at any of the big publishers. However, there are some very promising developments, such as the work by Daniel Acuna et al., so I expect such tools to hit the market soon. As with text similarity detection software, human assessment would still be needed, but image screening software will hopefully become a great tool for publishers and independent journals in the near future.
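Automated duplicate-image screening of the kind described here typically compares compact image fingerprints rather than raw pixels. The toy "average hash" below is an illustrative sketch of that general idea on a grayscale pixel matrix, not the method used in any published tool:

```python
def average_hash(pixels):
    """A toy perceptual hash: one bit per pixel, set if above the mean.

    Near-duplicate images (recompressed, slightly brightened) tend to
    produce identical or near-identical bit patterns, so comparing
    hashes is a cheap first pass before detailed visual inspection.
    """
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if v > mean else 0 for v in flat)

def hamming(h1, h2):
    """Number of differing bits; 0 means the fingerprints match."""
    return sum(a != b for a, b in zip(h1, h2))

# Hypothetical figure panels as tiny grayscale matrices:
panel_a = [[10, 200], [220, 15]]
panel_b = [[12, 198], [221, 17]]   # same panel, slightly re-encoded
panel_c = [[200, 10], [20, 230]]   # a genuinely different panel
```

Real systems first downscale each image to a small fixed grid and index the hashes so that millions of published figures can be compared against each other, but the matching principle is the same.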
What role do you think peer review (and reviewers) plays in weeding out misconduct? Do you have any tips for peer reviewers to help identify cases of data or image manipulation?
Peer review does not appear to be used much to detect image duplication. Peer reviewers are usually not trained to find duplicated images, or they might not be aware of the problem. I hope that by tweeting and blogging about such cases, on my new Science Integrity Digest blog, more scientists will become aware of the types of problems to screen for.
In addition, there could be more scrutiny by journals to screen for science integrity issues, e.g., after acceptance of a manuscript. One type of problem that I feel journals are not yet looking for is papers published by non-academic institutions or companies that might have a certain "agenda". I found several papers where such institutions - often consisting of only one or a few people - approved their own human research, or where conflicts of interest had not been disclosed. These are things that journals could pay more attention to.
On a personal level, what has the experience of maintaining your blog Microbiome Digest taught you?
I know it has helped many others in their research - the blog is a very easy way to keep up with the literature by just reading it for five minutes every day. Every now and then someone will write a "thank you" note or tell us personally that they are very grateful for the work the Microbiome Digest team does. It has taught me how great the Internet is at bringing scientists together. Even though we have never met each other in person, we now have an international team - in all corners of the world - that runs a nearly daily blog.
What do you do when you’re not at work or focusing on your blog?
I don't have a paid job anymore, so I work from home all of the time, which is wonderful. I do not miss those long San Francisco Bay Area commutes! Most of my time is spent looking at image duplication in scientific papers or writing emails to journals or institutions. In the past couple of months, I have received a number of private messages pointing out questionable research groups, as well as requests for help in reporting cases, all of which take quite some time to follow up on. And I spend way too much time on Twitter!
When I am not behind a computer, I love working in the yard. There is always a sprinkler that needs to be repaired or a shrub that needs to be pruned.
Thank you, Dr. Bik, for sharing your views with us. It has been great talking to you!
[Photo credits for Dr. Bik's profile pic: Michel & Co. Photography, San Jose, CA.]