dwapi quickstart

Configuration

Configure the library at the beginning of every new R session by calling dwapi::configure() with the data.world authentication token obtained at https://data.world/settings/advanced

DO NOT SHARE YOUR AUTHENTICATION TOKEN

For your security, do not include your API authentication token in code that is intended to be shared with others.

Whenever possible, call this function from the console.

If you must call it in code, do not include the actual API token. Instead, store the token in a variable in .Renviron, read it with Sys.getenv(), and do not share your .Renviron file. For example:

dwapi::configure(auth_token = Sys.getenv("DW_AUTH_TOKEN"))
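
A slightly more defensive variant is sketched below. It assumes .Renviron contains a line of the form DW_AUTH_TOKEN=<your token>, reads the variable first, and fails with a clear message when it is missing. The helper name get_dw_token is hypothetical and not part of dwapi.

```r
# Hypothetical helper (not part of dwapi): read the token from the environment
# and stop early if it has not been set.
get_dw_token <- function(var = "DW_AUTH_TOKEN") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop(sprintf("Environment variable %s is not set; add it to .Renviron.", var))
  }
  token
}

# Configure only when the token is present and dwapi is installed,
# so this snippet is safe to source in any session.
if (nzchar(Sys.getenv("DW_AUTH_TOKEN")) && requireNamespace("dwapi", quietly = TRUE)) {
  dwapi::configure(auth_token = get_dw_token())
}
```

Failing early keeps an empty token from reaching the API, where the resulting error would be harder to trace back to a missing .Renviron entry.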

Creating and updating datasets

Use dwapi::create_dataset() to create a new dataset. The library includes a number of constructor functions that simplify the preparation of complex requests like this one. The constructor used here is dwapi::dataset_create_request().

create_cars_dataset <- dwapi::dataset_create_request(
  title = sprintf("My cars dataset %s", runif(1)),
  visibility = "PRIVATE",
  license_string = "Other"
)

cars_dataset <- dwapi::create_dataset(Sys.getenv("DW_USER"), create_cars_dataset)
cars_dataset
## $uri
## [1] "https://data.world/testy-tester/my-cars-dataset-0-897647370351478"
## 
## $message
## [1] "Dataset created successfully."
## 
## attr(,"class")
## [1] "create_dataset_response"

Additional information can be added over time, with dataset updates.

update_cars_dataset <- dwapi::dataset_update_request(
  description = "This is a dataset created from R's cars dataset."
)

dwapi::update_dataset(cars_dataset$uri, update_cars_dataset)
## https://api.data.world/v0/datasets/testy-tester/my-cars-dataset-0-897647370351478
## $message
## [1] "Dataset updated successfully."
## 
## attr(,"class")
## [1] "success_message"
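
Both URLs printed above end in the same owner and dataset identifier pair. Where those two parts are needed separately, they could be extracted from the uri with base R, as in this minimal sketch. The helper parse_dataset_uri is hypothetical, not a dwapi function, and assumes the uri always ends in owner/dataset.

```r
# Hypothetical helper (not part of dwapi): split a dataset URL into its
# owner and dataset identifier, assuming they are the last two path parts.
parse_dataset_uri <- function(uri) {
  trimmed <- sub("/+$", "", uri)                      # drop any trailing slash
  parts <- strsplit(trimmed, "/", fixed = TRUE)[[1]]
  n <- length(parts)
  if (n < 2) {
    stop("Expected a URL ending in <owner>/<dataset>.")
  }
  list(owner = parts[n - 1], id = parts[n])
}

# The uri returned by the create_dataset() call in the example above.
ds <- parse_dataset_uri(
  "https://data.world/testy-tester/my-cars-dataset-0-897647370351478"
)
```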

Uploading files

Files can be added via URL, from the local file system, or directly as a data frame.

upload_response <- dwapi::upload_data_frame(cars_dataset$uri, cars, "cars.csv")
## tmp file /tmp/Rtmpmb1ehi/file10c06e0f75c2csv created.
Sys.sleep(10) # Files are processed asynchronously.
upload_response
## $message
## [1] "File uploaded."
## 
## attr(,"class")
## [1] "success_message"
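
Because processing is asynchronous, a fixed pause may turn out too short or needlessly long. One alternative is to retry the next step until it succeeds, as sketched below. The helper retry_until is hypothetical, and the sketch assumes that the follow-up call raises an R error while the file is still being processed, which may not hold for every dwapi function.

```r
# Hypothetical helper (not part of dwapi): call `attempt` until it returns
# without error, waiting between tries, and give up after `max_tries`.
retry_until <- function(attempt, max_tries = 5, wait_seconds = 2) {
  for (i in seq_len(max_tries)) {
    result <- tryCatch(attempt(), error = function(e) NULL)
    if (!is.null(result)) {
      return(result)
    }
    if (i < max_tries) {
      Sys.sleep(wait_seconds)
    }
  }
  stop(sprintf("No result after %d attempts.", max_tries))
}

# Stand-in for a call that fails twice before the file is ready.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("not ready yet")
  "ready"
}
outcome <- retry_until(flaky, max_tries = 5, wait_seconds = 0)
```

In a real session the stand-in could be replaced with something like function() dwapi::sql(cars_dataset$uri, "SELECT * FROM cars"), with a wait of a few seconds between attempts.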

Queries

Datasets can be queried using SQL and SPARQL. Queries refer to tables by name; in this example, the uploaded file cars.csv is queried as the table cars.

sql_query <- "SELECT * FROM cars"
dwapi::sql(cars_dataset$uri, sql_query)
## # A tibble: 50 x 2
##    speed  dist
##    <int> <int>
##  1     4     2
##  2     4    10
##  3    11    28
##  4    12    14
##  5    12    20
##  6    12    24
##  7    12    28
##  8    13    26
##  9    13    34
## 10    13    34
## # ... with 40 more rows
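
The rows above are not in the order of R's built-in cars data frame (whose third row has speed 7 and dist 4), which suggests that result order is not guaranteed without an explicit sort. The sketch below adds an ORDER BY clause and builds the matching local ordering for comparison. ORDER BY support in data.world SQL is assumed here, and the remote call only runs when cars_dataset exists and dwapi is installed.

```r
# R's built-in cars data, sorted the same way the query below asks for.
expected <- cars[order(cars$speed, cars$dist), ]
rownames(expected) <- NULL

# Assumption: data.world SQL accepts a standard ORDER BY clause.
sql_query_sorted <- "SELECT * FROM cars ORDER BY speed, dist"

if (exists("cars_dataset") && requireNamespace("dwapi", quietly = TRUE)) {
  sorted_cars <- dwapi::sql(cars_dataset$uri, sql_query_sorted)
  # Compare row counts; values can be compared column by column if needed.
  stopifnot(nrow(sorted_cars) == nrow(expected))
}
```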

The example dataset is deleted once it is no longer needed (the call itself is not shown here):

## $message
## [1] "Dataset has been successfully deleted."
## 
## attr(,"class")
## [1] "success_message"