Codemeta intro

Carl Boettiger

2018-02-12

codemetar: generate codemeta metadata for R packages

The ‘CodeMeta’ Project defines a ‘JSON-LD’ format for describing software metadata, as detailed at https://codemeta.github.io. This package provides utilities to generate, parse, and modify codemeta.json files automatically for R packages, as well as tools and examples for working with CodeMeta JSON-LD more generally.

It has three main goals:

For more general information about the CodeMeta Project for defining software metadata, see https://codemeta.github.io. In particular, new users might want to start with the User Guide, while those looking to learn more about JSON-LD and consuming existing codemeta files should see the Developer Guide.

A brief intro to common terms we’ll use:

Installation

You can install codemetar from GitHub with:

# install.packages("devtools")
devtools::install_github("codemeta/codemetar")
library("codemetar")

Example

This basic example shows how to generate a codemeta.json file for an R package (e.g. testthat):

write_codemeta("testthat")

Alternatively, codemetar can take the path to the package root. This may allow codemetar to glean some additional information that is not available from the DESCRIPTION file alone.

write_codemeta(".")

For testthat, the generated codemeta.json looks like this:

{
  "@context": [
    "http://purl.org/codemeta/2.0",
    "http://schema.org"
  ],
  "@type": "SoftwareSourceCode",
  "identifier": "testthat",
  "description": "Software testing is important, but, in part because it is \n    frustrating and boring, many of us avoid it. 'testthat' is a testing framework \n    for R that is easy learn and use, and integrates with your existing 'workflow'.",
  "name": "testthat: Unit Testing for R",
  "issueTracker": "https://github.com/r-lib/testthat/issues",
  "datePublished": "2017-12-13 09:30:12 UTC",
  "license": "https://spdx.org/licenses/MIT",
  "version": "2.0.0",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "version": "3.4.3",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 3.4.3 (2017-11-30)",
  "provider": {
    "@id": "https://cran.r-project.org",
    "@type": "Organization",
    "name": "Central R Archive Network (CRAN)",
    "url": "https://cran.r-project.org"
  },
  "author": [
    {
      "@type": "Person",
      "givenName": "Hadley",
      "familyName": "Wickham",
      "email": "hadley@rstudio.com"
    }
  ],
  "contributor": [
    {
      "@type": "Organization",
      "name": "R Core team"
    }
  ],
  "copyrightHolder": [
    {
      "@type": "Organization",
      "name": "RStudio"
    }
  ],
  "maintainer": {
    "@type": "Person",
    "givenName": "Hadley",
    "familyName": "Wickham",
    "email": "hadley@rstudio.com"
  },
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "covr",
      "name": "covr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "devtools",
      "name": "devtools",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "knitr",
      "name": "knitr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rmarkdown",
      "name": "rmarkdown",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "xml2",
      "name": "xml2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    }
  ],
  "softwareRequirements": [
    {
      "@type": "SoftwareApplication",
      "identifier": "cli",
      "name": "cli",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "crayon",
      "name": "crayon",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "digest",
      "name": "digest",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "magrittr",
      "name": "magrittr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "methods",
      "name": "methods"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "praise",
      "name": "praise",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "R6",
      "name": "R6",
      "version": "2.2.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rlang",
      "name": "rlang",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "withr",
      "name": "withr",
      "version": "2.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Central R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      }
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "R",
      "name": "R",
      "version": "3.1"
    }
  ]
}
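
Because codemeta.json is ordinary JSON, it can be read back into R and inspected like any nested list. The following is a minimal sketch rather than part of codemetar itself: it assumes the jsonlite package is installed (an assumption, since it is not mentioned above) and works on a trimmed-down excerpt of the output shown above, written to a temporary file so that it runs without codemetar or testthat.

```r
# Assumption: the 'jsonlite' package is installed.
library(jsonlite)

# A shortened excerpt of the testthat output above, saved to a temporary file.
excerpt <- '{
  "@type": "SoftwareSourceCode",
  "identifier": "testthat",
  "version": "2.0.0",
  "softwareRequirements": [
    {"@type": "SoftwareApplication", "identifier": "R6", "version": "2.2.0"},
    {"@type": "SoftwareApplication", "identifier": "withr", "version": "2.0.0"},
    {"@type": "SoftwareApplication", "identifier": "R", "version": "3.1"}
  ]
}'
path <- tempfile(fileext = ".json")
writeLines(excerpt, path)

# simplifyVector = FALSE keeps nested JSON objects as plain R lists
meta <- fromJSON(path, simplifyVector = FALSE)

# Top-level properties are list elements
pkg_version <- meta$version

# Collect the identifier of every listed requirement
required <- vapply(meta$softwareRequirements,
                   function(dep) dep$identifier, character(1))

print(pkg_version)
print(required)
```

The same approach should work on a full codemeta.json by pointing path at the real file.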

Modifying or enriching CodeMeta metadata

The best way to ensure codemeta.json is as complete as possible is to begin by making full use of the fields that can be set in an R package DESCRIPTION file, such as BugReports and URL. Using the Authors@R notation allows a much richer specification of author roles, correct parsing of given vs family names, and email addresses.

In the current implementation, developers may specify an ORCID URL for an author in the optional comment field of Authors@R, e.g.

Authors@R: person("Carl", "Boettiger", role=c("aut", "cre", "cph"), email="cboettig@gmail.com", comment="http://orcid.org/0000-0002-1642-628X")

which will allow codemetar to associate an identifier with the person. This is clearly something of a hack, since R’s person object lacks an explicit notion of an id, and it may be frowned upon.
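
To see what the Authors@R entry above makes available, the sketch below builds the same person object with base R and pulls out the separate name parts and the ORCID carried in the comment. The helper orcid_from_comment() is hypothetical and written only for illustration; it is not how codemetar necessarily does this internally.

```r
# person() ships with base R (package 'utils'), so no extra packages are needed.
author <- person(
  given = "Carl", family = "Boettiger",
  role = c("aut", "cre", "cph"),
  email = "cboettig@gmail.com",
  comment = "http://orcid.org/0000-0002-1642-628X"
)

# Given and family names are stored as separate fields
given_name <- author$given
family_name <- author$family

# Hypothetical helper (not a codemetar function): look for a 16-character
# ORCID iD anywhere in the comment field and return it, or NA if none is found.
orcid_from_comment <- function(p) {
  comment <- paste(unname(p$comment), collapse = " ")
  pattern <- "[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]"
  hit <- regmatches(comment, regexpr(pattern, comment))
  if (length(hit) > 0) hit[[1]] else NA_character_
}

author_id <- orcid_from_comment(author)
print(c(given_name, family_name, author_id))
```

Matching the iD pattern rather than the full URL keeps the helper working whether the comment holds a complete ORCID URL or only the bare identifier.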

Using the DESCRIPTION file

The DESCRIPTION file is the natural place to specify any metadata for an R package. The codemetar package can detect certain additional terms in the CodeMeta context. Almost any additional codemeta field can be added to and read from the DESCRIPTION into a codemeta.json file (see codemetar:::additional_codemeta_terms for a list).

CRAN requires that any such additional terms carry a prefix that explicitly indicates the use of schema.org, e.g. keywords would be specified in a DESCRIPTION file as:

X-schema.org-keywords: metadata, codemeta, ropensci, citation, credit, linked-data

Where applicable, these will override values otherwise guessed from the source repository. Use a comma-separated list to give multiple values for a property, e.g. keywords.

See the DESCRIPTION file of the codemetar package for an example.
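
As a rough illustration of how such a prefixed field can be read back, the sketch below writes a small DESCRIPTION file to a temporary location and parses it with base R. The package name examplepkg and its version are made up for the example, and the parsing shown here is only an approximation of what a tool like codemetar has to do, not its actual implementation.

```r
# A hypothetical DESCRIPTION file; 'examplepkg' does not exist.
desc_lines <- c(
  "Package: examplepkg",
  "Version: 0.1.0",
  "X-schema.org-keywords: metadata, codemeta, ropensci, citation, credit, linked-data"
)
path <- tempfile()
writeLines(desc_lines, path)

# read.dcf() is base R; it returns a character matrix with one column per field
desc <- read.dcf(path)

# Split the comma-separated value into individual keywords and trim whitespace
raw <- desc[1, "X-schema.org-keywords"]
keywords <- trimws(strsplit(raw, ",", fixed = TRUE)[[1]])

print(keywords)
```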

Going further

Check out all the codemetar vignettes for tutorials on other cool stuff you can do with codemeta and JSON-LD.