The Google Cloud Translation API provides a simple programmatic interface for translating an arbitrary string into any supported language. Translation API is highly responsive, so websites and applications can integrate with Translation API for fast, dynamic translation of source text from the source language to a target language (e.g. French to English).
Read more on the Google Cloud Translation website.
You can detect the language via gl_translate_detect, or translate and detect the language in one step via gl_translate.
Translate text via gl_translate. Note that the results are a lot more refined than the free version on Google’s translation website.
You can choose the target language via the target argument. If you do not supply a source argument, the function detects the source language automatically and reports it in its output. As detection on its own via gl_translate_detect costs the same, it's usually cheaper to detect and translate in one step.
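As a minimal sketch of translating and detecting in one step, the wrapper below calls gl_translate without a source argument and keeps the columns that show the detected language next to the translation. The wrapper name translate_and_detect is hypothetical, and the call assumes you have already authenticated with your own service account key.

```r
# Hypothetical convenience wrapper around gl_translate().
# Assumes authentication is already set up, e.g. via gl_auth() with your own key file.
translate_and_detect <- function(text, target = "en") {
  # No `source` is supplied, so the API detects the source language itself
  res <- googleLanguageR::gl_translate(text, target = target)
  # Keep the original text, the detected language and the translation side by side
  res[, c("text", "detectedSourceLanguage", "translatedText")]
}

# Usage (makes a billable API call):
# translate_and_detect("Hej verden", target = "en")
```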
You can pass a vector of text, which the function first attempts to translate in a single API call. If that fails because the request exceeds the API limits, it tries again, splitting the work across several API calls. This results in more calls and is slower, but costs the same, as you are charged per character translated, not per API call.
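The reason splitting does not change the cost can be illustrated with a small sketch. The helper batch_by_chars below is hypothetical and is not how the package splits requests internally; the max_chars value is an arbitrary placeholder rather than a real API limit. It only shows that however a vector is grouped into calls, the number of characters sent stays the same.

```r
# Hypothetical helper: group a character vector into batches whose combined
# length stays under `max_chars` (an illustrative value, not an API limit).
batch_by_chars <- function(text, max_chars = 5000) {
  n <- nchar(text)
  batch <- integer(length(text))
  current <- 1L
  running <- 0
  for (i in seq_along(text)) {
    # Start a new batch when adding this element would exceed the budget
    if (running > 0 && running + n[i] > max_chars) {
      current <- current + 1L
      running <- 0
    }
    running <- running + n[i]
    batch[i] <- current
  }
  unname(split(text, batch))
}

x <- c("aaaa", "bbbb", "cc")
batches <- batch_by_chars(x, max_chars = 8)

# More batches means more calls, but the characters billed are unchanged
sum(nchar(unlist(batches))) == sum(nchar(x))
```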
You can also supply web HTML and set format = 'html', which handles HTML tags to give you a cleaner translation.
The content to translate can be selected from the page first with rvest - an example is shown below:
# translate webpages
library(rvest)            # read_html(), html_node(), html_text() and the %>% pipe
library(googleLanguageR)  # gl_translate()

my_url <- "http://www.dr.dk/nyheder/indland/greenpeace-facebook-og-google-boer-foelge-apples-groenne-planer"

## in this case the content to translate is in css select .wcms-article-content
read_html(my_url) %>% # read html
  html_node(css = ".wcms-article-content") %>% # select article content
  html_text() %>% # extract text
  gl_translate(format = "html") %>% # translate with html flag
  dplyr::select(translatedText) # show translatedText column of output tibble
gl_translate_detect only detects the language; it does not translate.
The more text it has, the better. And it helps if it's not Danish…
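A detection-only call might look like the sketch below. The wrapper name detect_languages is hypothetical, and the call assumes authentication is already in place; the selected columns are those the function returns in its output tibble.

```r
# Hypothetical wrapper around gl_translate_detect().
# Assumes authentication is already set up with your own key file.
detect_languages <- function(text) {
  res <- googleLanguageR::gl_translate_detect(text)
  # Show each input next to the detected language code and its confidence
  res[, c("text", "language", "confidence")]
}

# Usage (makes a billable API call); longer inputs tend to be detected more reliably:
# detect_languages(c("Hej", "Det er en meget længere sætning på dansk"))
```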
It may be better to use cld2 to detect the language offline first, to avoid charges if the translation is unnecessary (e.g. the text is already in English). You could then verify online for more uncertain cases.
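One way to sketch this offline-first approach is shown below. The function translate_if_needed is hypothetical; it assumes cld2::detect_language returns a language code per element, or NA when it is unsure, and that gl_translate returns its rows in the same order as the input. Elements that cld2 already recognises as the target language are left untouched, and everything else, including the uncertain cases, is sent to the API.

```r
# Hypothetical helper: only call the paid API for text that is not
# already in the target language according to an offline detector.
translate_if_needed <- function(text, target = "en",
                                detect = cld2::detect_language,
                                translate = googleLanguageR::gl_translate) {
  guess <- detect(text)                       # NA when the detector is unsure
  needs_api <- is.na(guess) | guess != target # unsure cases go online as well
  out <- text
  if (any(needs_api)) {
    res <- translate(text[needs_api], target = target)
    out[needs_api] <- res$translatedText
  }
  out
}

# Usage (makes a billable API call for the non-English element only):
# translate_if_needed(c("Already English text here", "Dette er en dansk sætning"))
```

Passing the detector and translator as arguments keeps the sketch easy to try with stand-in functions before spending any quota.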
The API enforces three limits: characters per day, characters per 100 seconds, and API requests per 100 seconds. All can be set in the API manager in the Google Cloud console.
The library throttles its calls to respect the character and request limits per 100 seconds. It automatically retries if you are making requests too quickly, and also pauses to make sure you send no more than 100000 characters per 100 seconds.
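To get a rough feel for how long a large job will be paused, the arithmetic can be sketched as follows. The helper min_wait_seconds is hypothetical and only estimates a lower bound from the figures quoted above; it is not the library's own throttling logic and ignores network time and the request-count limit.

```r
# Hypothetical estimate: the minimum pause implied by a
# characters-per-window quota (defaults taken from the text above).
min_wait_seconds <- function(text, chars_per_window = 100000, window_seconds = 100) {
  total_chars <- sum(nchar(text))
  windows <- ceiling(total_chars / chars_per_window)
  # The first window can be sent immediately; each further one waits a full window
  max(windows - 1, 0) * window_seconds
}

big_job <- strrep("a", 250000)   # 250000 characters needs three windows
min_wait_seconds(big_job)
```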