R get last element from str_split
As the comment on your question suggests, this is a good fit for gsub:
gsub("^.*_", "", string_thing)
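I'd recommend you take note of the following edge cases as well: a string with no underscore is returned unchanged, and a string that ends in an underscore yields an empty string.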
string_thing <- c("I_AM_STRING", "I_AM_ALSO_STRING_THING", "AM I ONE", "STRING_")
gsub("^.*_", "", string_thing)
[1] "STRING" "THING" "AM I ONE" ""
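For comparison, here is a minimal sketch that puts the regex approach next to a split-and-tail approach on the same input. The variable names via_sub and via_split are only illustrative, and sub is used instead of gsub because an anchored pattern can match at most once per string.

```r
# Same input as above, including the edge cases
string_thing <- c("I_AM_STRING", "I_AM_ALSO_STRING_THING", "AM I ONE", "STRING_")

# Regex approach: delete everything up to and including the last underscore
via_sub <- sub("^.*_", "", string_thing)

# Split approach: split on "_" and keep the last piece of each string
parts <- strsplit(string_thing, "_", fixed = TRUE)
via_split <- vapply(parts, tail, character(1), n = 1)

print(via_sub)
print(via_split)
```

The two results agree except for "STRING_": the regex gives "" while the split approach gives "STRING", because strsplit drops a trailing empty piece.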
Use strsplit to get last character in r
For your strsplit method to work, you can use tail with sapply:
df$LastInit <- sapply(strsplit(as.character(df$Name), ""), tail, 1)
df
# Name Sex LastInit
# 1 Anna F a
# 2 Michael M l
# 3 David M d
# 4 Sarah F h
Alternatively, you can use substring:
with(df, substring(Name, nchar(Name)))
# [1] "a" "l" "d" "h"
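A self-contained sketch of both approaches, assuming a data frame shaped like the one printed above (the construction of df is a reconstruction, not the asker's actual code):

```r
# Assumed reconstruction of the example data
df <- data.frame(
  Name = c("Anna", "Michael", "David", "Sarah"),
  Sex = c("F", "M", "M", "F"),
  stringsAsFactors = FALSE
)

# Split every name into single characters and keep the last one
last_by_split <- vapply(strsplit(df$Name, ""), tail, character(1), n = 1)

# Take the substring that starts at the final character position
last_by_substring <- substring(df$Name, nchar(df$Name))

df$LastInit <- last_by_substring
print(df)
```

vapply is used here instead of sapply only so that the result type is checked; either works for this data.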
Write a function to get later elements from str_split()
We may use tail. Because more than one element is returned for each string, return the result as a list column:
Orgsplit_abrev <- function(x){
lapply(str_split(x," "), tail, 2)
}
Testing:
foo %>%
summarise(Orgsplit_abrev(Organisms))
Orgsplit_abrev(Organisms)
1 Enterobacter, aerogenes
2 Enterobacter, aerogenes
3 Klebsiella, pneumoniae
4 Acinetobacter, baumannii
5 Enterobacter, cloacae
6 Klebsiella, pneumoniae
If we want to specify the indices instead, we can use an anonymous function:
Orgsplit_abrev <- function(x){
lapply(str_split(x," "), function(x) x[c(3, 4)])
}
Or we may pass the extraction operator [ directly:
Orgsplit_abrev <- function(x){
lapply(str_split(x," "),`[`, c(3, 4))
}
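The same idea works without stringr. The sketch below uses base strsplit and a hypothetical helper last_words; the organisms vector is invented sample data, since the original foo is not shown.

```r
# Hypothetical sample: labels whose last two words are genus and species
organisms <- c("Isolate 12 Enterobacter aerogenes", "Sample Klebsiella pneumoniae")

# Hypothetical helper: keep the last n space-separated words of each string
last_words <- function(x, n = 2) {
  lapply(strsplit(x, " ", fixed = TRUE), tail, n)
}

res <- last_words(organisms)

# Collapse each pair back into a single label
labels <- vapply(res, paste, character(1), collapse = " ")
print(labels)
```

Counting from the end with tail copes with strings of different lengths, whereas fixed positions such as c(3, 4) return NA when a string has fewer words than expected.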
R split text string into last and first elements
You can use tail to grab the last element:
df$name2 = as.character(lapply(strsplit(as.character(df$PREFIX), split="_"),
tail, n=1))
df
# PREFIX VALUE name1 name2
# 1 A_B 1 A B
# 2 A_C 2 A C
# 3 A_D 3 A D
# 4 B_A 4 B A
# 5 A_B_C 5 A C
# 6 B_D_E 6 B E
# 7 C_B_A 7 C A
# 8 B_A 8 B A
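A minimal sketch that derives both columns from a single split, assuming a small data frame modelled on the one above (only three rows are reconstructed here):

```r
# Assumed subset of the example data
df <- data.frame(
  PREFIX = c("A_B", "A_B_C", "B_D_E"),
  VALUE = 1:3,
  stringsAsFactors = FALSE
)

# Split once, then take the first and last piece of each element
parts <- strsplit(df$PREFIX, "_", fixed = TRUE)
df$name1 <- vapply(parts, head, character(1), n = 1)
df$name2 <- vapply(parts, tail, character(1), n = 1)

print(df)
```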
How to get empty last elements from strsplit() in R?
Here are a couple of ideas:
scan(text="1,2,3,", sep=",", quiet=TRUE)
#[1] 1 2 3 NA
unlist(read.csv(text="1,2,3,", header=FALSE), use.names=FALSE)
#[1] 1 2 3 NA
Those both return numeric vectors. You can wrap as.character around either of them to get the exact output you show in the question:
as.character(scan(text="1,2,3,", sep=",", quiet=TRUE))
#[1] "1" "2" "3" NA
Or, you could specify what="character" in scan, or colClasses="character" in read.csv, for slightly different output:
scan(text="1,2,3,", sep=",", quiet=TRUE, what="character")
#[1] "1" "2" "3" ""
unlist(read.csv(text="1,2,3,", header=FALSE, colClasses="character"), use.names=FALSE)
#[1] "1" "2" "3" ""
You could also specify na.strings="" along with colClasses="character":
unlist(read.csv(text="1,2,3,", header=FALSE, colClasses="character", na.strings=""),
use.names=FALSE)
#[1] "1" "2" "3" NA
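To apply the scan idea to a whole character vector, one option is a small wrapper. This is only a sketch; split_keep_trailing is a hypothetical helper name, and it inherits scan's handling of quotes and whitespace.

```r
# Hypothetical helper: split each string on sep, keeping a trailing empty field
split_keep_trailing <- function(x, sep = ",") {
  lapply(x, function(s) scan(text = s, sep = sep, what = "character", quiet = TRUE))
}

x <- c("1,2,3,", "a,b")

# strsplit drops the empty field after the final comma
dropped <- strsplit(x, ",", fixed = TRUE)

# scan keeps it as an empty string
kept <- split_keep_trailing(x)

print(dropped[[1]])
print(kept[[1]])
```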
accessing individual values split by str_split in R, finding the last one?
Taken from: Find file name from full file path
basename("C:/some_dir/a")
# [1] "a"
dirname("C:/some_dir/a")
# [1] "C:/some_dir"
Although I think the above approach is much better, you can also use the str_split approach, which I really only mention to show how to select the last elements from a list using lapply.
example <- c("C:/some_dir/a","C:/some_dir/sdfs/a","C:/some_dir/asdf/asdf/a")
example.split <- strsplit(example,"/")
files <- unlist(lapply(example.split, tail, 1))
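As a quick check that the two routes agree, here is a sketch using paths with distinct file names (a slight variation on the example above, chosen so the outputs are distinguishable):

```r
# Variation on the example paths, with distinct final components
example <- c("C:/some_dir/a", "C:/some_dir/sdfs/b", "C:/some_dir/asdf/asdf/c")

# Split on "/" and keep the last component of each path
by_split <- vapply(strsplit(example, "/", fixed = TRUE), tail, character(1), n = 1)

# Built-in equivalent for file paths
by_basename <- basename(example)

print(by_split)
print(by_basename)
```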
split string last delimiter
These use no packages. They assume that each element of col2 has at least one underscore. (See the note below if this restriction needs to be lifted.)
1) The first regular expression, (.*)_.*, matches everything up to the last underscore, then the underscore, then everything remaining; the first sub replaces the entire match with the part captured in parentheses. This works because matching is greedy: the first .* takes everything it can, leaving only the rest for the second .*. The second regular expression, .*_, matches everything up to and including the last underscore, and the second sub replaces that with the empty string.
transform(df, col2 = sub("(.*)_.*", "\\1", col2), col3 = sub(".*_", "", col2))
2) Here is a variation that is a bit more symmetric. It uses the same regular expression for both sub calls.
pat <- "(.*)_(.*)"
transform(df, col2 = sub(pat, "\\1", col2), col3 = sub(pat, "\\2", col2))
Note: To handle strings with no underscore at all, so that "xyz" is split into "xyz" and "", use this for the second sub. It tries to match the left-hand side of the | first; if that fails (which happens when there are no underscores), the entire string matches the right-hand side, and sub replaces it with the empty string.
sub(".*_|^[^_]*$", "", col2)
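Putting the pieces together, here is a sketch on invented data that includes one string without an underscore. The data frame and its col1 column are assumptions for illustration. Both expressions inside transform see the original col2, so the order of the two arguments does not matter.

```r
# Hypothetical data, including a value with no underscore
df <- data.frame(
  col1 = 1:3,
  col2 = c("a_b_c", "x_y", "xyz"),
  stringsAsFactors = FALSE
)

res <- transform(
  df,
  col2 = sub("(.*)_.*", "\\1", col2),   # everything before the last underscore
  col3 = sub(".*_|^[^_]*$", "", col2)   # everything after it, or "" if none
)

print(res)
```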