str_replace_all() with a named vector now respects modifier functions (#207).
str_trunc() is once again vectorised correctly (#203, @austin3dickey).
NA values more gracefully (#217). I’ve also tweaked the sizing policy so that it should work better in notebooks while preserving the existing behaviour in knit documents (#232).
Error : object ‘ignore.case’ is not exported by 'namespace:stringr'. This is because the long-deprecated
perl() have now been removed.
str_glue_data() provide convenient wrappers around
glue_data() from the glue package (#157).
str_flatten() is a wrapper around
stri_flatten() and clearly conveys flattening a character vector into a single string (#186).
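A minimal sketch of the flattening behaviour described above, assuming a stringr version that includes str_flatten(); the input vector `letters3` is made up for illustration.

```r
library(stringr)

# Hypothetical input: three single-letter strings.
letters3 <- c("a", "b", "c")

# Collapse the vector into one string with no separator.
flat <- str_flatten(letters3)

# Collapse with an explicit separator.
flat_sep <- str_flatten(letters3, collapse = ", ")

print(flat)      # "abc"
print(flat_sep)  # "a, b, c"
```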
str_remove_all() functions. These wrap
str_replace_all() to remove patterns from strings. (@Shians, #178)
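As a rough illustration of removing patterns, the sketch below compares str_remove_all() with the equivalent str_replace_all() call using an empty replacement. The date string is an invented example.

```r
library(stringr)

# Hypothetical input: an ISO-style date.
date_string <- "2019-01-31"

# Remove every match of the pattern.
all_removed <- str_remove_all(date_string, "-")

# Remove only the first match.
first_removed <- str_remove(date_string, "-")

# The same result written with str_replace_all() and an empty replacement.
via_replace <- str_replace_all(date_string, "-", "")

print(all_removed)    # "20190131"
print(first_removed)  # "201901-31"
```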
str_squish() removes spaces from both the left and right side of strings, and also converts multiple spaces (or space-like characters) to a single space within strings (@stephlocke, #197).
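A small sketch of the squishing behaviour, using an invented string that mixes spaces, a tab and a trailing newline:

```r
library(stringr)

# Hypothetical input with leading, trailing and repeated whitespace.
messy <- "  too   many\t spaces \n"

# Trim both ends and collapse internal runs of whitespace to one space.
clean <- str_squish(messy)

print(clean)  # "too many spaces"
```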
omit_na argument for ignoring
str_replace() now ignores
NAs and keeps the original strings. (@yutannihilation, #164)
str_trunc() now preserves NAs (@ClaytonJY, #162).
str_trunc() now throws an error when
width is shorter than
ellipsis (@ClaytonJY, #163).
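The str_trunc() changes can be sketched as follows, assuming the default ellipsis of "..." and a stringr version that includes these fixes. The input vector is invented, and the exact wording of the error message is not shown because it may differ between versions.

```r
library(stringr)

# Hypothetical input: a long string, a missing value and a short string.
x <- c("a very long sentence", NA, "short")

# Truncate to 10 characters, including the three-character ellipsis.
truncated <- str_trunc(x, 10)
print(truncated)  # "a very ..." NA "short"

# A width shorter than the ellipsis is an error; capture it instead of stopping.
too_narrow <- try(str_trunc("abcdef", 2), silent = TRUE)
print(inherits(too_narrow, "try-error"))  # TRUE
```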
perl() have now been removed.
str_match_all() now returns NA if an optional group doesn’t match (previously it returned ""). This is more consistent with
str_match() and other match failures (#134).
replacement can now be a function that is called once for each match and whose return value is used to replace the match.
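A brief sketch of a function-valued replacement, assuming a stringr version that supports it. It uses str_replace_all() with base R's toupper(), which works whether the function receives matches one at a time or as a vector.

```r
library(stringr)

# Hypothetical input string.
greeting <- "hello world"

# Each vowel matched by the pattern is passed to toupper(),
# and the return value replaces the match.
shouted_vowels <- str_replace_all(greeting, "[aeiou]", toupper)

print(shouted_vowels)  # "hEllO wOrld"
```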
A new vignette (
vignette("regular-expressions")) describes the details of the regular expressions supported by stringr. The main vignette (
vignette("stringr")) has been updated to give a high-level overview of the package.
str_sort() gain an explicit
numeric argument for sorting mixed numbers and strings.
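To illustrate the numeric argument, the sketch below sorts some invented file names with and without it, using the default "en" locale.

```r
library(stringr)

# Hypothetical input: names that mix letters and numbers.
files <- c("file10", "file2", "file1")

# Default sort compares digits character by character.
plain_sort <- str_sort(files)

# numeric = TRUE compares runs of digits as numbers.
numeric_sort <- str_sort(files, numeric = TRUE)

print(plain_sort)    # "file1" "file10" "file2"
print(numeric_sort)  # "file1" "file2" "file10"
```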
str_replace_all() now throws an error if
replacement is not a character vector. If
NA_character_ it replaces the complete string with
All functions that take a locale (e.g.
str_sort()) default to “en” (English) to ensure that the default is consistent across platforms.
Add sample datasets:
coll() now throw an error if you use them with anything other than a plain string (#60). I’ve clarified that the replacement for
boundary() has improved defaults when splitting on non-word boundaries (#58, @lmullen).
str_detect() now can detect boundaries (by checking for a
str_count() > 0) (#120).
str_subset() works similarly.
str_extract_all() now work with
boundary(). This is particularly useful if you want to extract logical constructs like words or sentences.
str_extract_all() respects the
simplify argument when used with
str_subset() now respects custom options for
fixed() patterns (#79, @gagolews).
str_replace_all() now behave correctly when a replacement string contains
\\\\1, etc. (#83, #99).
str_split() gains a
simplify argument to match
str_view_all() create HTML widgets that display regular expression matches (#96).
NA for indexes greater than number of words (#112).
stringr is now powered by stringi instead of base R regular expressions. This improves Unicode support and makes most operations considerably faster. If you find stringr inadequate for your string processing needs, I highly recommend looking at stringi in more detail.
stringr gains a vignette, currently a straightforward update of the article that appeared in the R Journal.
str_c() now returns a zero-length vector if any of its inputs are zero-length vectors. This is consistent with all other functions, and standard R recycling rules. Similarly, using
str_c("x", NA) now yields
NA. If you want the missing value treated as the string "NA" instead, call
str_replace_na() on the inputs.
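The str_c() rules above can be checked with a short sketch; the variable names are illustrative only.

```r
library(stringr)

# A zero-length input gives a zero-length result.
empty_result <- str_c("x", character(0))

# A missing input makes the combined string missing.
na_result <- str_c("x", NA)

# Convert NA to the string "NA" first if it should appear in the output.
literal_na <- str_c("x", str_replace_na(NA))

print(length(empty_result))  # 0
print(na_result)             # NA
print(literal_na)            # "xNA"
```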
str_replace_all() gains a convenient syntax for applying multiple pairs of pattern and replacement to the same vector:
input <- c("abc", "def")
str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
str_match() now returns NA if an optional group doesn’t match (previously it returned ""). This is more consistent with
str_extract() and other match failures.
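A minimal sketch of the optional-group behaviour, using an invented pattern whose second group cannot match the input:

```r
library(stringr)

# The group (b)? is optional and does not match in "a".
m <- str_match("a", "(a)(b)?")

# Column 1 is the full match, columns 2 and 3 are the groups.
print(m[1, 1])         # "a"
print(m[1, 2])         # "a"
print(is.na(m[1, 3]))  # TRUE: the unmatched optional group is NA, not ""
```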
str_subset() keeps values that match a pattern. It’s a convenient wrapper for
x[str_detect(x, pattern)] (#21, @jiho).
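As a quick check of that equivalence, the sketch below filters an invented vector both ways.

```r
library(stringr)

# Hypothetical input vector and pattern.
fruit_names <- c("apple", "banana", "pear")
pattern <- "an"

# Keep only the elements that match.
kept <- str_subset(fruit_names, pattern)

# The long-hand version using str_detect() and indexing.
kept_by_index <- fruit_names[str_detect(fruit_names, pattern)]

print(kept)  # "banana"
```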
str_sort() allow you to sort and order strings in a specified locale.
str_conv() to convert strings from a specified encoding to UTF-8.
boundary() allows you to count, locate and split by character, word, line and sentence boundaries.
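A short sketch of counting and splitting by word boundaries, assuming the default options of boundary("word"), which skip whitespace and punctuation. The sentence is invented.

```r
library(stringr)

# Hypothetical input text.
text <- "These are   some words."

# Count the words.
n_words <- str_count(text, boundary("word"))

# Split into words; the result is a list with one element per input string.
words <- str_split(text, boundary("word"))[[1]]

print(n_words)  # 4
print(words)    # "These" "are" "some" "words"
```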
The documentation got a lot of love, and very similar functions (e.g. first and all variants) are now documented together. This should make it easier to locate the function you need.
ignore.case(x) has been deprecated in favour of
fixed|regex|coll(x, ignore.case = TRUE),
perl(x) has been deprecated in favour of
str_join() is deprecated, please use
fixed path in
str_wrap example so it works for more R installations.
remove dependency on plyr
Zero input to
str_split_fixed returns 0 row matrix with
perl that switches to Perl regular expressions
str_match now uses new base function
regmatches to extract matches - this should hopefully be faster than my previous pure R algorithm
str_wrap function which gives
strwrap output in a more convenient format
word function extracts words from a string given a user-defined separator (thanks to a suggestion by David Cooper)
str_locate now returns a consistent type when matching an empty string (thanks to Stavros Macrakis)
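A small sketch of word(), using invented inputs. The default separator is a single space, and a custom one is passed through the sep argument.

```r
library(stringr)

# Hypothetical inputs.
sentence <- "The quick brown fox"
dashed <- "a-b-c"

# Extract a single word by position.
second_word <- word(sentence, 2)

# Extract a range of words.
first_two <- word(sentence, 1, 2)

# Negative positions count from the end.
last_word <- word(sentence, -1)

# Use a custom separator.
middle_part <- word(dashed, 2, sep = fixed("-"))

print(second_word)  # "quick"
print(first_two)    # "The quick"
print(last_word)    # "fox"
print(middle_part)  # "b"
```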
str_count counts number of matches in a string.
str_trim receive performance tweaks - for large vectors this should give a speed-up of at least two orders of magnitude
str_length returns NA for invalid multibyte strings
fix small bug in internal