New Hack of the Scholarly Publication System: Danger to Google Scholar
December 21, 2018 by mzelenec
By Town Peterson and Jorge Soberon, University of Kansas
Google Scholar (GS) has emerged as one of the principal portals into the scholarly literature, providing broad search capacity, links to papers for which access is open, and summary statistics for journals and for individual scholars. Although cautions have been offered about the need for careful curation of one’s profile on GS, such as this or this, we had not heard of hacks or attacks specifically aimed at this system, until now.
In recent days, several of our colleagues noted the appearance of odd publications in their Google Scholar profiles. These publications are multi-authored, generally (but not always) with exactly 150 authors. Their titles are taken from current publications, and they are indicated as having been published in edited book volumes or in scholarly journals. We have noted these publications particularly in the profiles of highly prolific scholars such as Robert May, Richard Dawkins, or Richard Levins. Here is an example of such a publication: one must sort by publication year and inspect the first several entries (as of 20 December 2018, the first three entries were all “fake” papers).
We set about characterizing this phenomenon… All of the papers that we detected have 144-150 authors, with all but one having 150. Titles are harvested from among current titles in the literature, and meshed with an author list that seems to be drawn from highly cited individuals, but also in some way from among authors who have published either with or in the same field as the supposed author. These publications are listed either in edited books as chapters, or in journals that either do not exist (e.g., Ecological Complexity and Agroecology), or in which the given volume and pages do not make sense. We inspected these “papers” for repeated author names: for example, five such papers had 150 authors each, for a total of 750 names; 605 of them were unique, and no name was common to all of them.
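The repeated-name check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure we actually ran: the function name `summarize_author_overlap` is hypothetical, the author lists are made-up toy data rather than names taken from the real “fake” papers, and names are matched only after trimming whitespace and lowercasing, which will miss variant spellings of the same person.

```python
from collections import Counter


def summarize_author_overlap(author_lists):
    """Return (total names, unique names, names present in every list).

    Hypothetical helper for illustration; names are compared after
    stripping whitespace and lowercasing, nothing more sophisticated.
    """
    normalized = [
        [name.strip().lower() for name in authors] for authors in author_lists
    ]
    # Total number of author slots across all papers.
    total = sum(len(authors) for authors in normalized)
    # For each name, count how many papers it appears in (once per paper).
    papers_per_name = Counter(
        name for authors in normalized for name in set(authors)
    )
    unique = len(papers_per_name)
    # Names that appear in every one of the papers.
    common = sorted(
        name for name, n in papers_per_name.items() if n == len(normalized)
    )
    return total, unique, common


# Toy, made-up author lists (NOT taken from the real profiles).
papers = [
    ["A. Author", "B. Writer", "C. Scholar"],
    ["B. Writer", "D. Reader", "E. Editor"],
    ["C. Scholar", "F. Critic", "A. Author"],
]

total, unique, common = summarize_author_overlap(papers)
print(total, unique, common)
```

Run on the five real author lists, a check of this kind would give the sort of summary reported above (750 names in total, 605 unique, none shared by all five), although the exact counts depend on how name variants are normalized.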
In sum, someone has invented a “bot” that harvests titles, author names, and journals and edited books, and creates article profiles that are ingested by Google Scholar, creating a mess that will, in all probability, have to be cleaned up by the scholars themselves. Why? No idea, to be honest. We initially thought that it might be a ploy by which to gain some sort of fame or credit via listing collaborations with leading scientists (who, by the way, include Charles Darwin!). However, the lack of repeated names suggests that such is not the case. Rather, we hypothesize that this phenomenon reflects an attempt either to “bring down” Google Scholar in particular, or to discredit the entire scholarly communications system more generally. We would greatly appreciate communications from readers who might have further insights.
This is rather odd. Do you have a method for finding these fake ones, or a list of what you’ve found somewhere so others can dig into it?
And have you found out where Google Scholar is picking them up from, the source?
We are still exploring the phenomenon. For example, on Richard Dawkins’ profile page there are such papers that are very social science-related, whereas on Robert May’s page there are papers that are more ecology and pop bio. A colleague in Mexico got one with a few mega-cited scientists on it, as well as a bunch of Mexican authors. We are trying to get people to share the “fake” papers via Twitter, using the hashtag #FakePapers
Do any of these have full text links? If they did, one could think of a mere link spamming/“SEO” effort. I didn’t see any in the examples provided, though.
From Twitter regarding this issue …
Alberto Martín, @albertomartin, Dec 25, 2018
Very interesting! However, after taking a quick look, I don’t think there is a malicious mastermind trying to bring down Google Scholar. I just think something went terribly wrong with GS’s bots when they ingested the citations in those edited books (indexed in Google Books)
The authors that appear in the cited references of certain books and articles have been added (for some unknown reason) as authors of the citing document, thus generating long lists of authors. The limit of 150 is probably GS’s author limit. Hopefully GS will solve this soon.