Extracting the Last N Characters from a String in R

I'm not aware of anything in base R, but it's straightforward to write a function for this using substr and nchar:

x <- "some text in a string"

substrRight <- function(x, n) {
  substr(x, nchar(x) - n + 1, nchar(x))
}

substrRight(x, 6)
[1] "string"

substrRight(x, 8)
[1] "a string"

This is vectorised, as @mdsumner points out. Consider:

x <- c("some text in a string", "I really need to learn how to count")
substrRight(x, 6)
[1] "string" " count"
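
As a rough check of the edge cases, the sketch below repeats the substrRight definition so that it runs on its own. It relies on substr treating a start position below 1 as the beginning of the string, so asking for more characters than a string has simply returns the whole string. The vector short is a made-up example, not part of the original answer.

```r
# Same helper as above, repeated so this snippet is self-contained
substrRight <- function(x, n) {
  substr(x, nchar(x) - n + 1, nchar(x))
}

# Hypothetical input: one string shorter than n, one longer
short <- c("abc", "abcdefgh")

# For "abc" the start position is -1, which substr treats as 1
res <- substrRight(short, 5)
print(res)
```

If silently returning the whole string is not what you want, a length check on nchar(x) before the call would be one way to flag such inputs.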

Extract the first (or last) n characters of a string

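See ?substr

R> a <- "left right"  # assumed example value; any string starting with "left" gives the result below
R> substr(a, 1, 4)
[1] "left"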
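
Since the question asks about both ends of the string, here is a minimal sketch that wraps the two cases in small helpers. The names first_n and last_n and the value of a are assumptions made for illustration; they are not base R functions.

```r
# Hypothetical helpers built on substr and nchar
first_n <- function(x, n) {
  substr(x, 1, n)
}

last_n <- function(x, n) {
  substr(x, nchar(x) - n + 1, nchar(x))
}

a <- "left right"   # assumed example input

head_part <- first_n(a, 4)   # characters 1 to 4
tail_part <- last_n(a, 5)    # characters 6 to 10
print(c(head_part, tail_part))
```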

How would you extract the last 3 characters from a string in a column of an R data frame?

You can use substring with nchar to extract the last 3 characters from a string. When the end position is omitted, substring reads to the end of the string.

substring(x, nchar(x)-2)
#[1] "LKO" "CHE"

Data:

x <- c("WH-LKO", "WH-CHE")
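
The question is about a data frame column, so here is a short sketch of the same call applied to one. The data frame df and its column names code and suffix are assumed for illustration; only the two values come from the answer above.

```r
# Hypothetical data frame holding the example values
df <- data.frame(code = c("WH-LKO", "WH-CHE"), stringsAsFactors = FALSE)

# Start 2 characters before the end; with no end position given,
# substring reads through to the end of each string
df$suffix <- substring(df$code, nchar(df$code) - 2)
print(df)
```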

Extracting the last n characters from a character column in R

nchar fails on a factor, so either wrap the column in as.character inside nchar or pass stringsAsFactors = FALSE when creating the data.frame (this has been the default since R 4.0.0).

library(dplyr)     # provides mutate()
library(magrittr)  # provides the %<>% assignment pipe
df <- data.frame(A = c("Blue", "Orange", "Black"), stringsAsFactors = FALSE)
df %<>% mutate(B = substr(A, nchar(A) - 3 + 1, nchar(A)))
df

       A   B
1   Blue lue
2 Orange nge
3  Black ack
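
The as.character route mentioned above is not shown in that snippet, so here is a minimal base R sketch of it. It assumes the column really is a factor, which is forced here with factor(); no extra packages are needed.

```r
# Column A deliberately stored as a factor to mimic the problem case
df <- data.frame(A = factor(c("Blue", "Orange", "Black")))

# Convert once, then reuse the character vector for nchar and substr
chars <- as.character(df$A)
df$B <- substr(chars, nchar(chars) - 3 + 1, nchar(chars))
print(df)
```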

Extract values based on last n characters

strcapture, as a base R counterpart to the tidyr extract answer from Wiktor:

strcapture("([^-]*)-([^-]*)-([^-]*)$", df$vector, proto=list(Col1="",Col2="",Col3=""))
#  Col1   Col2  Col3
#1  abc    bec   ndj
#2  jfj    nej  ndjk
#3 nemd nemdkd nedke
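
The input data for this answer is not shown, so the sketch below uses made-up strings to illustrate how the pattern behaves. It assumes hyphen-separated values where only the last three fields are wanted; the vector vec and its contents are hypothetical.

```r
# Hypothetical input: hyphen-separated fields with varying prefixes
vec <- c("x-abc-bec-ndj", "p-q-jfj-nej-ndjk")

# Each group captures a run of non-hyphen characters; the trailing $
# anchors the match so the three groups are the last three fields
out <- strcapture("([^-]*)-([^-]*)-([^-]*)$", vec,
                  proto = list(Col1 = "", Col2 = "", Col3 = ""))
print(out)
```

The proto argument fixes both the column names and their types, which is why empty strings are used here to request character columns.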

How to remove last n characters from every element in the R vector

Here is an example of what I would do. I hope it's what you're looking for.

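char_array <- c("foo_bar", "bar_foo", "apple", "beer")
# stringsAsFactors = FALSE keeps data as character, which nchar requires (before R 4.0.0 the default was TRUE)
a <- data.frame(data = char_array, data2 = 1:4, stringsAsFactors = FALSE)
a$data <- substr(a$data, 1, nchar(a$data) - 3)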

a should now contain:

  data data2
1 foo_     1
2 bar_     2
3   ap     3
4    b     4
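
For reuse, the same idea can be wrapped in a small function. This is only a sketch: drop_last is a hypothetical name, and the example also shows that a string no longer than n comes back empty, because substr returns "" when the end position is before the start.

```r
# Hypothetical helper: remove the last n characters from each element
drop_last <- function(x, n) {
  substr(x, 1, nchar(x) - n)
}

# "ab" is shorter than n, so the end position becomes -1 and the result is ""
trimmed <- drop_last(c("foo_bar", "beer", "ab"), 3)
print(trimmed)
```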

Extract only characters after a space that comes after the last number in a string

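We can use sub for this, i.e. match any characters (.*) up to one or more digits (\\d+) followed by one or more spaces (\\s+), and replace the match with an empty string (""). Because .* is greedy, the match extends to the last run of digits that is followed by whitespace.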

sub(".*\\d+\\s+", "", v1)

Output:

[1] "ABC, efg xyz"  "abcdef ghijkl" "ghijkl"

Or use str_remove

library(stringr)
str_remove(v1, ".*\\d+\\s+")
[1] "ABC, efg xyz" "abcdef ghijkl" "ghijkl"

Data:

v1 <- c("54 ABC, efg xyz", "ABC 08 abcdef ghijkl", "ABC 01-02 ghijkl")
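
To make the greedy matching concrete, here is a small sketch with two made-up strings (v2 is hypothetical). The first has two digit groups, and only the text after the last one survives; the second has no digits at all, so sub finds no match and returns it unchanged.

```r
# Hypothetical inputs: two digit groups, and no digits at all
v2 <- c("A 1 B 22 tail text", "no digits here")

# .* is greedy, so the match runs up to the last digits-plus-space run
res <- sub(".*\\d+\\s+", "", v2)
print(res)
```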

How to remove the last character in a string variable in R?

Add this line to your code. It uses substring and nchar to drop the last character:

myData$subject <- substring(myData$subject, 1, nchar(myData$subject)-1)
myData
  subject session       RT
1  AN11GR       1 2.925415
2  AN11GR       2 1.715415
3  BR13ST       1 2.645415
4  BR13ST       2 1.925415
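
The original myData before the change is not shown, so the sketch below uses a made-up vector with one extra trailing character to illustrate the call. It also shows a regular-expression alternative, sub(".$", "", x), which should give the same result for non-empty strings. The vector subjects is hypothetical.

```r
# Hypothetical values with one unwanted trailing character
subjects <- c("AN11GR_", "BR13ST_")

# Keep characters 1 through nchar - 1
by_position <- substring(subjects, 1, nchar(subjects) - 1)

# Alternative: delete whatever single character sits at the end
by_regex <- sub(".$", "", subjects)

print(by_position)
```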

