Introduction to hansard

Evan Odell

2018-03-02

hansard is an R package to pull data from the UK Parliament through the http://www.data.parliament.uk/ API. It emphasises simplicity and ease of use, so that users unfamiliar with APIs can easily retrieve large volumes of high quality data. Each function accepts a single argument at a time, and functions that require additional information to retrieve the data you requested will ask for it after you execute the function. Functions retrieve data in JSON format and convert it to a tibble. The hansard_generic() function supports building API requests for XML, CSV or HTML formats if required. Note that in some circumstances the API limits responses to 5500 rows per request.

Installing hansard

From CRAN

install.packages("hansard")

From GitHub (Development Version)

install.packages("devtools")
devtools::install_github("EvanOdell/hansard")

Load hansard

library(hansard)

Using hansard

hansard contains functions for retrieving data from the UK Parliament API. Each function is designed to call data from a specific http://www.data.parliament.uk/ API. The parameter options for each function vary, depending on the information available from each API, but four parameters are constant across every function (with the exception of the hansard_generic() and research_topics_list() functions described below): extra_args, tidy, tidy_style and verbose.

extra_args

The extra_args parameter allows additional arguments and queries to be passed to the API. These can include searching, limiting the parameters actually returned and ordering data. extra_args queries must follow the syntax used by the API, which requires an ampersand at the beginning of each argument, e.g. &_search=education. Multiple arguments can be included, e.g. &_search=education&_sort=title. See http://explore.data.parliament.uk/ for details.
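As a rough illustration of that syntax, the sketch below assembles an extra_args string from a named list. build_extra_args() is a hypothetical helper written for this example and is not part of hansard; it only shows how each argument is prefixed with an ampersand and joined together.

```r
# Hypothetical helper (not part of hansard): turn a named list of API
# query parameters into a single extra_args string.
build_extra_args <- function(params) {
  if (length(params) == 0) {
    return("")
  }
  # URL-encode the values so spaces and punctuation survive the request
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  # Every argument starts with "&", as the API syntax requires
  paste0("&", names(params), "=", values, collapse = "")
}

args <- build_extra_args(list(`_search` = "education", `_sort` = "title"))
print(args)
```

The resulting string can then be supplied as the extra_args argument of a hansard function.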

tidy

tidy is a logical parameter accepting either TRUE or FALSE, defaulting to TRUE. If TRUE, hansard will fix variable names, which by default contain non-alphanumeric characters and follow an inconsistent and idiosyncratic naming convention, at least by the standards of the naming conventions used in R. Dates and datetimes are converted to POSIXct class. Some extra URL data included in the API response is also stripped out.

tidy_style

The naming convention for variables used if tidy==TRUE is indicated by tidy_style. tidy_style accepts one of "snake_case", "camelCase" and "period.case", defaulting to "snake_case". All variable names will be converted to match the given naming convention.
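To give a feel for what the conversion does, here is a small base R sketch that maps a few of the raw API names onto snake_case and period.case. It is an illustration only: to_snake_case() and to_period_case() are hypothetical helpers, not hansard's own renaming code, and they do not reproduce special cases such as hansard renaming the first column.

```r
# Illustrative helpers, not hansard's implementation.
to_snake_case <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # break camelCase humps
  x <- gsub("[^A-Za-z0-9]+", "_", x)            # other characters become "_"
  x <- gsub("^_+|_+$", "", x)                   # drop leading/trailing "_"
  tolower(x)
}

to_period_case <- function(x) {
  gsub("_", ".", to_snake_case(x), fixed = TRUE)
}

# Raw names in the style returned by the API when tidy = FALSE
raw_names <- c("homePage", "familyName._value", "gender._value")

snake_names <- to_snake_case(raw_names)
period_names <- to_period_case(raw_names)

print(snake_names)
print(period_names)
```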

verbose

verbose is a logical parameter accepting either TRUE or FALSE, defaulting to FALSE. If TRUE, the function will print the progress of the API query to the console.

hansard_ prefixes

In addition to the more generic sounding function names, each function in hansard has a wrapper where the name is prefixed with hansard_. For example, both bills() and hansard_bills() will return the same result.

Almost all hansard functions (the exceptions being the functions that retrieve more reference style data: bill_stage_types(), commons_division_date(), commons_terms(), constituencies(), election_candidates(), election_results(), members(), members_search(), research_briefings_lists() and hansard_generic()) include start_date and end_date parameters, which can be used to set the earliest (start_date) and latest (end_date) data to be returned from the API.

Example using the commons_divisions() and mp_vote_record() functions

The commons_divisions() function returns divisions in the House of Commons, including the result of votes and information on what was being voted on. mp_vote_record() returns a data frame with the voting record of a given MP on each division they voted in. The example below returns all Commons Divisions where Diane Abbott voted aye in 2017. To find the parliamentary ID of Diane Abbott (or any other member of the House of Commons or House of Lords), use the members_search() function described below.
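A minimal sketch of such a call follows. It assumes that mp_vote_record() takes the member ID as its first argument and has a lobby argument for selecting aye votes; both are assumptions about the signature rather than something stated here. The dates are simply chosen to cover 2017, so the number of rows returned may differ from the output shown.

```r
# Diane Abbott's parliamentary ID, as used elsewhere in this document
abbott_id <- 172
period <- c(start = "2017-01-01", end = "2017-12-31")

# The request needs network access, so it only runs in an interactive
# session where hansard is installed.
if (interactive() && requireNamespace("hansard", quietly = TRUE)) {
  abbott_ayes <- hansard::mp_vote_record(
    abbott_id,
    lobby = "aye",  # assumed argument name for choosing the voting lobby
    start_date = period[["start"]],
    end_date = period[["end"]]
  )
  print(abbott_ayes)
}
```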

## Connecting to API
## Retrieving page 1 of 1
## Observations: 38
## Variables: 5
## $ about         <chr> "http://data.parliament.uk/resources/722300", "h...
## $ title         <chr> "Early Parliamentary General Election", "Pension...
## $ uin           <chr> "CD:2017-04-19:264", "CD:2017-03-29:260", "CD:20...
## $ date_value    <dttm> 2017-04-19, 2017-03-29, 2017-03-29, 2017-03-29,...
## $ date_datatype <chr> "POSIXct", "POSIXct", "POSIXct", "POSIXct", "POS...

Using commons_divisions(), we can see the result of one of those votes, ID 722300, the Early Parliamentary General Election bill that dissolved parliament for the 2017 General Election. The function default is to return a list of every MP and how they voted:
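A sketch of that request, assuming the division ID can be passed as the first argument of commons_divisions():

```r
# ID of the Early Parliamentary General Election division
division_id <- 722300

# Needs network access; only runs interactively with hansard installed.
if (interactive() && requireNamespace("hansard", quietly = TRUE)) {
  # Default behaviour: one row per MP, with how they voted
  election_votes <- hansard::commons_divisions(division_id)
  print(election_votes)
}
```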

## Connecting to API
## Observations: 535
## Variables: 7
## $ number               <chr> "1", "10", "100", "101", "102", "103", "1...
## $ member_party         <chr> "Labour", "Labour", "Labour", "Labour (Co...
## $ type                 <chr> "aye_vote", "aye_vote", "aye_vote", "aye_...
## $ member_printed_value <chr> "Ms Diane Abbott", "Dr Rosena Allin-Khan"...
## $ vote_id              <chr> "722300", "722300", "722300", "722300", "...
## $ about                <chr> "172", "4573", "1579", "4088", "3950", "1...
## $ label_value          <chr> "Biography information for Ms Diane Abbot...

With the summary parameter, we can return a brief summary table of votes:
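Assuming summary is a logical argument, as the text suggests, the summary table for the same division could be requested along these lines:

```r
division_id <- 722300
want_summary <- TRUE

# Needs network access; only runs interactively with hansard installed.
if (interactive() && requireNamespace("hansard", quietly = TRUE)) {
  # One-row summary of the division instead of a row per MP
  election_summary <- hansard::commons_divisions(division_id, summary = want_summary)
  print(election_summary)
}
```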

## Connecting to API
## Observations: 1
## Variables: 13
## $ abstain_count                     <chr> "0"
## $ ayes_count                        <chr> "522"
## $ noes_vote_count                   <chr> "13"
## $ did_not_vote_count                <chr> "0"
## $ error_vote_count                  <chr> "0"
## $ non_eligible_count                <chr> "0"
## $ suspended_or_expelled_votes_count <chr> "0"
## $ margin                            <chr> "509"
## $ date                              <dttm> 2017-04-19
## $ division_number                   <chr> "196"
## $ session                           <chr> "2016/17"
## $ title                             <chr> "Early Parliamentary General...
## $ uin                               <chr> "CD:2017-04-19:264"

The results of votes in the House of Lords can be retrieved with the lords_divisions() function. The voting record of individual Lords can be retrieved using the lords_vote_record() function.

Multiple Parameter Functions

The following functions accept vectors of member IDs and departmental names for applicable parameters:

For example, the following function returns all questions answered by Nicola Blackwood (4019) or Sam Gyimah (3980), asked by Keith Vaz (338) or Diane Abbott (172), and covered by the Department of Health or the Ministry of Justice, between 2016-12-18 and 2017-03-12.
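The sketch below shows what such a vectorised request might look like. The function name all_answered_questions() and the argument names mp_id, tabling_mp_id and answering_body are assumptions about hansard's interface, as are the short department names; check the package documentation before relying on them.

```r
answering_ids <- c(4019, 3980)         # answering members
tabling_ids   <- c(338, 172)           # members who asked the questions
departments   <- c("health", "justice")  # assumed short department names

# Needs network access; only runs interactively with hansard installed.
if (interactive() && requireNamespace("hansard", quietly = TRUE)) {
  answered <- hansard::all_answered_questions(
    mp_id = answering_ids,
    tabling_mp_id = tabling_ids,
    answering_body = departments,
    start_date = "2016-12-18",
    end_date = "2017-03-12"
  )
  print(answered)
}
```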

## Connecting to API
## Retrieving page 1 of 1
## Connecting to API
## Retrieving page 1 of 1
## The request did not return any data.
##               Please check your parameters.
## Connecting to API
## Retrieving page 1 of 1
## The request did not return any data.
##               Please check your parameters.
## Connecting to API
## Retrieving page 1 of 1
## The request did not return any data.
##               Please check your parameters.
## Connecting to API
## Retrieving page 1 of 1
## The request did not return any data.
##               Please check your parameters.
## Connecting to API
## Retrieving page 1 of 1
## Connecting to API
## Retrieving page 1 of 1
## The request did not return any data.
##               Please check your parameters.
## Connecting to API
## Retrieving page 1 of 1
## Observations: 23
## Variables: 31
## $ about                               <chr> "705625", "705626", "70562...
## $ answering_body                      <chr> "Department of Health", "D...
## $ question_text                       <chr> "To ask the Secretary of S...
## $ tabling_member_printed              <chr> "Keith Vaz", "Keith Vaz", ...
## $ uin                                 <chr> "65590", "65591", "65592",...
## $ attachment                          <list> [<c("http://data.parliame...
## $ grouped_question_uin                <list> ["list(c(\"65591\", \"655...
## $ answer_text_value                   <chr> "<p>73 clinical commission...
## $ answering_member_about              <chr> "4019", "4019", "4019", "4...
## $ answering_member_label_value        <chr> "Biography information for...
## $ answering_member_constituency_value <chr> "Oxford West and Abingdon"...
## $ answering_member_printed_value      <chr> "Nicola Blackwood", "Nicol...
## $ date_of_answer_value                <chr> "2017-03-07", "2017-03-07"...
## $ answer_date_time                    <dttm> 2017-03-07 17:15:39, 2017...
## $ date_of_answer_datatype             <chr> "POXIXct", "POXIXct", "POX...
## $ is_ministerial_correction_value     <chr> "false", "false", "false",...
## $ is_ministerial_correction_datatype  <chr> "boolean", "boolean", "boo...
## $ answering_dept_id_value             <chr> "17", "17", "17", "17", "1...
## $ answering_dept_short_name_value     <chr> "Health", "Health", "Healt...
## $ answering_dept_sort_name_value      <chr> "Health", "Health", "Healt...
## $ date_value                          <chr> "2017-02-27", "2017-02-27"...
## $ date_datatype                       <chr> "dateTime", "dateTime", "d...
## $ hansard_heading_value               <chr> "Diabetes", "Diabetes", "D...
## $ house_id_value                      <chr> "1", "1", "1", "1", "1", "...
## $ registered_interest_value           <chr> "false", "false", "false",...
## $ registered_interest_datatype        <chr> "boolean", "boolean", "boo...
## $ tabling_member_about                <chr> "338", "338", "338", "338"...
## $ tabling_member_label_value          <chr> "Biography information for...
## $ tabling_member_constituency_value   <chr> "Leicester East", "Leicest...
## $ legislature_pref_label_value        <chr> "House of Commons", "House...
## $ legislature_about                   <chr> "25259", "25259", "25259",...

Special functions

Several functions have special or experimental features:

The research_briefings() function

The research_briefings() function can request data using lists created with the research_briefings_lists functions:

## [1] "Defence"
## [1] "Falkland Islands"
## [1] "Lords Library notes"

In this case I have given them the same name as their function, but you can assign any name you wish to them.

Having created the lists, they can be used to specify which topics and subtopics to call, although strings can also be used. In the example below, a and c contain the same data.

## Connecting to API
## Retrieving page 1 of 1
## Connecting to API
## Retrieving page 1 of 1
## Connecting to API
## Retrieving page 1 of 1

If a specific subtopic is called, but the topic is not specified, the function will still return all data within that specific subtopic. Note that this is slower than specifying the topic and subtopic.
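A rough way to compare the two approaches is to time each request with system.time(). The sketch assumes research_briefings() accepts topic and subtopic arguments as plain strings, and uses the "Defence" topic and "Falkland Islands" subtopic that appear in the list output earlier. Timings depend on the network and the API, so they will vary between runs.

```r
topic <- "Defence"
subtopic <- "Falkland Islands"

# Needs network access; only runs interactively with hansard installed.
if (interactive() && requireNamespace("hansard", quietly = TRUE)) {
  # Subtopic only: the topic is left for the function to work out
  time_subtopic_only <- system.time(
    by_subtopic <- hansard::research_briefings(subtopic = subtopic)
  )
  # Topic and subtopic both given
  time_both <- system.time(
    by_both <- hansard::research_briefings(topic = topic, subtopic = subtopic)
  )
  print(time_subtopic_only)
  print(time_both)
  # Check whether the two requests returned the same data
  print(isTRUE(all.equal(by_subtopic, by_both)))
}
```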

## Connecting to API
## Retrieving page 1 of 1
##    user  system elapsed 
##   0.462   0.136   7.358
## Connecting to API
## Retrieving page 1 of 1
##    user  system elapsed 
##   0.031   0.003   0.621
## [1] TRUE

If a specified subtopic is not a subtopic of the specified topic, the function will not return any data.

The hansard_generic() function

The hansard_generic() function allows you to put in your own paths to the API. Information on all the paths available in the API can be found on the DDP Explorer website.

Note that the API defaults to returning 10 items per page, but allows up to 500 items per page, the default used by hansard.
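As a sketch, a request might look like the following. The path "elections.json" is an assumption used for illustration, as is passing the path as the first argument; substitute a path taken from the DDP Explorer.

```r
# Assumed example path, relative to the API root
api_path <- "elections.json"

# Needs network access; only runs interactively with hansard installed.
if (interactive() && requireNamespace("hansard", quietly = TRUE)) {
  elections <- hansard::hansard_generic(api_path)
  print(elections)
}
```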

The members_search() function

Looking up information on an individual MP or Lord through the Parliamentary API requires knowing their parliamentary ID number. This can be hard to find on the web, but luckily you can look it up through the API. We want information on the voting record of Diane Abbott, the Labour MP for Hackney North and Stoke Newington, but we don’t know her ID number, so we search for her:
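A sketch of the search, assuming the search term is the first argument of members_search(). The term is deliberately the misspelled "abbot", which still matches both her surname and the Newton Abbot constituency.

```r
search_term <- "abbot"

# Needs network access; only runs interactively with hansard installed.
if (interactive() && requireNamespace("hansard", quietly = TRUE)) {
  matches <- hansard::members_search(search_term)
  print(matches)
}
```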

## Connecting to API
## Retrieving page 1 of 1
## # A tibble: 4 x 12
##   mnis_id home_page   additional_name_… constituency_abo… constituency_la…
##   <chr>   <chr>       <chr>             <chr>             <chr>           
## 1 172     http://www… Julie             http://data.parl… Hackney North a…
## 2 1651    <NA>        Granville         <NA>              <NA>            
## 3 4249    http://www… <NA>              http://data.parl… Newton Abbot    
## 4 3827    http://www… <NA>              <NA>              <NA>            
## # ... with 7 more variables: family_name_value <chr>,
## #   full_name_value <chr>, gender_value <chr>, given_name_value <chr>,
## #   label_value <chr>, party_value <chr>, twitter_value <chr>

The same function with tidy = FALSE, illustrating the differences in the variable names and in the presentation of information in the first column:

## Connecting to API
## Retrieving page 1 of 1
## # A tibble: 4 x 12
##   `_about`   homePage  additionalName.… constituency._ab… constituency.la…
##   <chr>      <chr>     <chr>            <chr>             <chr>           
## 1 http://da… http://w… Julie            http://data.parl… Hackney North a…
## 2 http://da… <NA>      Granville        <NA>              <NA>            
## 3 http://da… http://w… <NA>             http://data.parl… Newton Abbot    
## 4 http://da… http://w… <NA>             <NA>              <NA>            
## # ... with 7 more variables: familyName._value <chr>,
## #   fullName._value <chr>, gender._value <chr>, givenName._value <chr>,
## #   label._value <chr>, party._value <chr>, twitter._value <chr>

The same function as above, but with tidy_style = "period.case", so it returns variables with a different naming convention.
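A sketch of that variant, under the same assumption that the search term is the first argument of members_search():

```r
search_term <- "abbot"
style <- "period.case"

# Needs network access; only runs interactively with hansard installed.
if (interactive() && requireNamespace("hansard", quietly = TRUE)) {
  matches_period <- hansard::members_search(search_term, tidy_style = style)
  print(names(matches_period))
}
```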

## Connecting to API
## Retrieving page 1 of 1
## # A tibble: 4 x 12
##   mnis.id home.page   additional.name.… constituency.abo… constituency.la…
##   <chr>   <chr>       <chr>             <chr>             <chr>           
## 1 172     http://www… Julie             http://data.parl… Hackney North a…
## 2 1651    <NA>        Granville         <NA>              <NA>            
## 3 4249    http://www… <NA>              http://data.parl… Newton Abbot    
## 4 3827    http://www… <NA>              <NA>              <NA>            
## # ... with 7 more variables: family.name.value <chr>,
## #   full.name.value <chr>, gender.value <chr>, given.name.value <chr>,
## #   label.value <chr>, party.value <chr>, twitter.value <chr>

The search function is not case sensitive, and searches the names and constituencies of all MPs and Lords. So even though we spelled her surname incorrectly, we can still find her. This API provides limited biographical details; to retrieve more detailed biographical information, use the mnis package to retrieve data from the Members’ Names Information Service.