Extract last word in string in R
tail(strsplit('this is a sentence',split=" ")[[1]],1)
Basically as suggested by @Señor O.
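The one-liner above handles a single string. For a character vector, a minimal sketch along the same lines might look like the following; `last_word` is a hypothetical helper name, and splitting on runs of whitespace (rather than on a single space) is an assumption about the input.

```r
# Hypothetical helper: last whitespace-separated word of each element.
last_word <- function(x) {
  pieces <- strsplit(x, "[[:space:]]+")
  vapply(pieces, function(p) {
    p <- p[nzchar(p)]  # drop the empty piece that leading whitespace produces
    if (length(p) == 0) NA_character_ else p[length(p)]
  }, character(1))
}

last_word(c("this is a sentence", "  padded string", "single"))
# [1] "sentence" "string"   "single"
```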
Extract last word in string in R - error faced
I realise that some rows of the Description variable have whitespace at the beginning, which isn't shown when the data is viewed in R. Removing the whitespace using stri_trim() solved the issue.
library(stringi)
# remove leading whitespace
c1$Description = stri_trim(c1$Description, "left")
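If stringi is not available, base R's `trimws()` should do the same job. A small sketch with made-up values (the vector `desc` is hypothetical and stands in for `c1$Description`):

```r
# Made-up values; the second element has no leading whitespace.
desc <- c("  red apple", "green pear", "\tyellow plum")

# Strip leading spaces and tabs only, leaving any trailing whitespace alone.
trimmed <- trimws(desc, which = "left")
trimmed
# [1] "red apple"   "green pear"  "yellow plum"
```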
extracting the second last word between the special characters /
You can use word, but you need to specify the separator:
library(stringr)
word(url, -2, sep = '/')
#[1] "ani" "bmc"
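The `url` vector is not shown in the original question. As an illustration only, the sketch below uses made-up URLs and base R's `strsplit` to pick the second-last piece; `second_last` is a hypothetical helper, not part of stringr.

```r
# Made-up URLs for illustration; the real `url` vector is not shown above.
urls <- c("http://example.com/journals/ani/12",
          "http://example.com/journals/bmc/7")

# Hypothetical helper: second-last piece of each string after splitting on `sep`.
second_last <- function(x, sep = "/") {
  pieces <- strsplit(x, sep, fixed = TRUE)
  vapply(pieces, function(p) {
    if (length(p) < 2) NA_character_ else p[length(p) - 1]
  }, character(1))
}

second_last(urls)
# [1] "ani" "bmc"
```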
extract last word from string only if more than one word R
Maybe something like the following.
x <- c("Genus species", "Genus", "Genus (word) species")
y <- gsub(".*[[:blank:]](\\w+)$", "\\1", x)
is.na(y) <- y == "Genus"
y
[1] "species" NA "species"
Note that it would be very difficult to search for "species" directly, since we don't have a full list of them. That's why I've opted for this approach: set the elements of the result y to NA if they are equal to "Genus".
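If the single-word values are not all "Genus", one alternative (a sketch, not the answer's own method) is to test for a blank first and extract only when one is present. The name `last_if_multi` is hypothetical, and the input is assumed to have no trailing whitespace.

```r
# Hypothetical helper: last word only when the string contains a blank.
last_if_multi <- function(x) {
  has_blank <- grepl("[[:blank:]]", x)
  out <- rep(NA_character_, length(x))
  out[has_blank] <- sub(".*[[:blank:]](\\w+)$", "\\1", x[has_blank])
  out
}

x <- c("Genus species", "Homo", "Genus (word) species")
last_if_multi(x)
# [1] "species" NA        "species"
```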
R remove last word from string
This will work:
gsub("\\s*\\w*$", "", df1$city)
[1] "Middletown" "Sunny Valley" "Hillside"
It removes any substring consisting of zero or more whitespace characters, followed by any number of "word" characters (letters, digits, or underscores), followed by the end of the string.
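The `df1` data frame is not shown here, so the following sketch uses made-up city names (the vector `city` is hypothetical) to illustrate the same pattern.

```r
# Made-up values standing in for df1$city.
city <- c("Middletown Township", "Sunny Valley CDP", "Hillside 2")

# Drop the final word together with the whitespace in front of it.
shortened <- gsub("\\s*\\w*$", "", city)
shortened
# [1] "Middletown"   "Sunny Valley" "Hillside"
```

Because the whitespace part is optional, a value that consists of a single word would be reduced to an empty string, so it may be worth checking for that case first.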
Extract last word in a string after comma if there are multiple words else the first word
You can try sub
df$country <- sub('.*,\\s*', '', df$location)
df$country
#[1] "New Zealand" "USA" "France"
Or
library(stringr)
str_extract(df$location, '\\b[^,]+$')
#[1] "New Zealand" "USA" "France"
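The `df` used above is not shown. A self-contained sketch of the `sub` approach, with a made-up `location` column chosen for illustration:

```r
# Made-up data; the second row has no comma at all.
df <- data.frame(
  location = c("Auckland, New Zealand", "USA", "Paris, France"),
  stringsAsFactors = FALSE
)

# Greedy ".*," consumes everything up to the last comma; rows without a
# comma do not match and are returned unchanged.
df$country <- sub(".*,\\s*", "", df$location)
df$country
# [1] "New Zealand" "USA"         "France"
```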
Extracting the last n characters from a string in R
I'm not aware of anything in base R, but it's straightforward to write a function to do this using substr and nchar:
x <- "some text in a string"
substrRight <- function(x, n){
substr(x, nchar(x)-n+1, nchar(x))
}
substrRight(x, 6)
[1] "string"
substrRight(x, 8)
[1] "a string"
This is vectorised, as @mdsumner points out. Consider:
x <- c("some text in a string", "I really need to learn how to count")
substrRight(x, 6)
[1] "string" " count"
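Since `substr` recycles its start and stop arguments, `n` can also differ per element. The sketch below repeats the function so it runs on its own and shows that case, plus what happens when `n` exceeds the string length; the input values are made up.

```r
substrRight <- function(x, n) {
  substr(x, nchar(x) - n + 1, nchar(x))
}

# A different n for each element.
substrRight(c("abcdef", "xyz"), c(2, 1))
# [1] "ef" "z"

# n longer than the string: substr treats a start below 1 as 1,
# so the whole string comes back.
substrRight("abc", 10)
# [1] "abc"
```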
R: Extract last N words from character column in data.table
I would probably use
n = 5
patt = sprintf("\\w+( \\w+){0,%d}$", n-1)
library(stringi)
test[, ext := stri_extract(original, regex = patt)]
                                       original                          ext
1: the green shirt totally brings out your eyes totally brings out your eyes
2:                         ford focus hatchback         ford focus hatchback
Comments:
- This breaks if you set n=0, but there's probably no good reason to do that.
- This is vectorized, in case you have n differing across rows (e.g., n=3:4).
- @eddi provided a base analogue (for fixed n):
test[, ext := sub('.*?(\\w+( \\w+){4})$', '\\1', original)]
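For use outside data.table, the same pattern can be applied with base R's `regexpr` and `regmatches`. This is only a sketch: `last_n_words` is a hypothetical helper, and it assumes a single fixed `n` of at least 1 and words separated by single spaces.

```r
# Hypothetical helper: last n space-separated words of each element.
last_n_words <- function(x, n) {
  stopifnot(length(n) == 1, n >= 1)
  patt <- sprintf("\\w+( \\w+){0,%d}$", n - 1)
  m <- regexpr(patt, x, perl = TRUE)
  hit <- !is.na(m) & m != -1
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)  # regmatches returns matched elements only
  out
}

original <- c("the green shirt totally brings out your eyes",
              "ford focus hatchback")
last_n_words(original, 5)
# [1] "totally brings out your eyes" "ford focus hatchback"
```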