Using predatory

Marcelo Perlin

2017-11-03

Motivation

The recent rise of publications in predatory journals is a serious problem for the academic community because it hurts the development of science. This is especially true for developing countries such as Brazil, where the academic evaluation system is not yet well established. Surprisingly, even standard impact assessment systems such as JCR and SJR are not immune to predatory publishers.

One of the problems in doing empirical research regarding predatory journals is that the only available resource for their identification is Beall’s list. While it is certainly useful and its author deserves all the credit we can give him, the information in the site is unstructured as it only shows the names and links of predatory publishers and journals.

As part of a research paper on predatory publications in Brazil, I’ve built a database of predatory journals based on Beall’s site. All data is gathered by a web scraping algorithm that searches the web pages listed on Beall’s site for an ISSN pattern and saves the results in a .csv file.
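To make the idea of searching a page for an ISSN pattern concrete, here is a minimal sketch in base R. It is not the code used to build the database: the helper name extract_issn, the regular expression, and the sample text are assumptions made for illustration only.

```r
# Hypothetical helper (not part of predatory): pull ISSN-like strings
# out of raw page text.
extract_issn <- function(page_text) {
  # An ISSN is four digits, a hyphen, three digits, and a final digit or X
  issn_pattern <- "\\b[0-9]{4}-[0-9]{3}[0-9Xx]\\b"
  matches <- regmatches(page_text,
                        gregexpr(issn_pattern, page_text, perl = TRUE))
  # Flatten the per-line matches, normalise the case, and drop duplicates
  unique(toupper(unlist(matches)))
}

# Made-up page fragments used only to exercise the helper
page_text <- c("<p>ISSN 1452-3981 (online)</p>",
               "Print ISSN: 2048-125X; phone 555-1234")
found <- extract_issn(page_text)
print(found)
```

A pattern like this only checks the shape of the string, so a real scraper would probably need further filtering to discard numbers that merely look like an ISSN.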

Based on this file, the package predatory makes it easy for researchers and librarians to check whether a particular journal or publisher is on Beall’s list. It also allows direct access to the database that we are building. A shiny app is also available at https://msperlin.shinyapps.io/shiny-predatory/.

Examples of usage

Name or ISSN lookup

Let’s say you have the name of a journal, International Journal of Electrochemical Science, and you want to check whether it is on Beall’s list. For that, you can simply use the following code:

library(predatory)

name <- 'International Journal of Electrochemical Science'
temp <- find.predatory(x = name)
## 
## Found 1 row(s)
temp
## # A tibble: 1 x 5
##        issn                                             name
##       <chr>                                            <chr>
## 1 1452-3981 International Journal of Electrochemical Science
## # ... with 3 more variables: main.url <chr>, type <chr>, ind.url <chr>

As you can see, this journal is found on Beall’s list and has an ISSN of 1452-3981.

Another example is to look for a journal based on its ISSN. Let’s try the value 0028-0836.

my.issn <- '0028-0836'
temp <- find.predatory(x = my.issn, by = 'issn')
## 
## Found 0 row(s)
temp
## # A tibble: 0 x 5
## # ... with 5 variables: issn <chr>, name <chr>, main.url <chr>,
## #   type <chr>, ind.url <chr>

This time, however, the search returned a dataframe with zero rows. The result is not surprising, since the ISSN belongs to the journal Nature.
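A lookup by ISSN returns zero rows both when the journal is absent from the list and when the ISSN itself is mistyped, so it can help to validate the string before searching. The sketch below checks the format and the standard modulo-11 ISSN check character. The helper is_valid_issn is hypothetical and not part of the predatory package.

```r
# Hypothetical helper (not part of predatory): validate a single ISSN string.
is_valid_issn <- function(issn) {
  issn <- toupper(issn)
  # Format check: NNNN-NNNC, where C is a digit or X
  if (!grepl("^[0-9]{4}-[0-9]{3}[0-9X]$", issn)) return(FALSE)
  chars <- strsplit(gsub("-", "", issn), "")[[1]]
  digits <- as.integer(chars[1:7])
  # Weight the first seven digits by 8..2 and derive the check character
  check <- (11 - sum(digits * 8:2) %% 11) %% 11
  expected <- if (check == 10) "X" else as.character(check)
  chars[8] == expected
}

print(is_valid_issn("0028-0836"))   # well-formed ISSN
print(is_valid_issn("00208-0836"))  # one digit too many
```

Only an ISSN that passes this check is worth passing on to find.predatory with by = 'issn'.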

Partial lookup

Another use case is to look for all predatory journals within a particular subject. Let’s try all journals that have the word finance in their names.

my.str <- 'finance'
temp <- find.predatory(x = my.str, type.match = 'partial')
## 
## Found 3 row(s)
head(temp)
## # A tibble: 3 x 5
##        issn                                                          name
##       <chr>                                                         <chr>
## 1 1931-0285               The Institute for Business and Finance Research
## 2 1944-592X               The Institute for Business and Finance Research
## 3 2048-125X British Journal of Economics, Finance and Management Sciences
## # ... with 3 more variables: main.url <chr>, type <chr>, ind.url <chr>

This time the search returned 3 rows whose names contain the word finance. Note that two of them share the same name but have different ISSNs.
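For readers curious about how a partial match of this kind can work, here is a minimal sketch using grepl on a toy table. The toy table reuses three names and ISSNs printed in the outputs above. The function partial_lookup is hypothetical, and the package's actual matching logic may differ.

```r
# Toy table built from rows shown in the outputs above (not the full database)
toy <- data.frame(
  issn = c("1931-0285", "2048-125X", "1452-3981"),
  name = c("The Institute for Business and Finance Research",
           "British Journal of Economics, Finance and Management Sciences",
           "International Journal of Electrochemical Science"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose name contains x, ignoring case
partial_lookup <- function(tbl, x) {
  tbl[grepl(x, tbl$name, ignore.case = TRUE), , drop = FALSE]
}

hits <- partial_lookup(toy, "finance")
print(hits)
```

Matching is done case-insensitively here because the lowercase search string finance matched names containing Finance in the example above.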

Accessing the database of predatory journals

The database of predatory journals is available within the package. It is stored as a .csv file in the inst/extdata folder. If you are interested in its contents, you can locate the file with the command system.file("extdata", 'predpub.csv', package = "predatory") or simply call the function get.predpubTable from the package, as follows:

df.predpub <- get.predpubTable()

head(df.predpub)
## # A tibble: 6 x 5
##        issn
##       <chr>
## 1 1459-0255
## 2 0195-9131
## 3 1452-3981
## 4 2158-2742
## 5 1991-8178
## 6 2158-2750
## # ... with 4 more variables: name <chr>, main.url <chr>, type <chr>,
## #   ind.url <chr>
str(df.predpub)
## Classes 'tbl_df', 'tbl' and 'data.frame':    1136 obs. of  5 variables:
##  $ issn    : chr  "1459-0255" "0195-9131" "1452-3981" "2158-2742" ...
##  $ name    : chr  "WFL Publisher" "International Agency for Development of Culture, Education and Science" "International Journal of Electrochemical Science" "Scientific Research Publishing" ...
##  $ main.url: chr  "http://world-food.net/" "http://iadces.com/" "http://www.electrochemsci.org/" "http://www.scirp.org/" ...
##  $ type    : chr  "publisher" "publisher" "ind_journal" "publisher" ...
##  $ ind.url : chr  "http://world-food.net/category/journals/" "https://iadces.com/" "http://www.electrochemsci.org/" "http://www.scirp.org/journal/CategoryOfJournal.aspx?CategoryID=1" ...
##  - attr(*, "spec")=List of 2
##   ..$ cols   :List of 5
##   .. ..$ issn    : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ name    : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ main.url: list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ type    : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ ind.url : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   ..$ default: list()
##   .. ..- attr(*, "class")= chr  "collector_guess" "collector"
##   ..- attr(*, "class")= chr "col_spec"