R remove last word from string
This will work:
gsub("\\s*\\w*$", "", df1$city)
[1] "Middletown" "Sunny Valley" "Hillside"
It removes any substring consisting of zero or more whitespace characters, followed by any number of "word" characters (letters, digits, or underscores), followed by the end of the string.
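As a quick check of that description, here is a minimal sketch. The city vector is hypothetical, since the original df1 is not shown, and sub is used instead of gsub because the pattern is anchored at the end of the string and can only match once.

```r
# Hypothetical input; the original df1$city is not shown in the answer
city <- c("Middletown Township", "Sunny Valley City", "Hillside Village")

# Drop optional whitespace plus the final run of word characters
shortened <- sub("\\s*\\w*$", "", city)
print(shortened)
# [1] "Middletown"   "Sunny Valley" "Hillside"

# Caveat: a single-word string matches in full and becomes empty
single <- sub("\\s*\\w*$", "", "Hillside")
print(single)
# [1] ""
```

If single-word values should be kept, requiring at least one whitespace character (\\s+ instead of \\s*) is one way to do that.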
Extract last word in string in R
tail(strsplit('this is a sentence',split=" ")[[1]],1)
Basically as suggested by @Señor O.
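The one-liner above handles a single string. For a whole character vector, something like the following sketch might work; last_word is a hypothetical helper name, and it assumes every element contains at least one word (an empty string would make vapply fail).

```r
# Hypothetical helper: last space-separated word of each element
last_word <- function(x) {
  words <- strsplit(x, " ", fixed = TRUE)
  # tail(w, 1) keeps the final piece of each split
  vapply(words, function(w) tail(w, 1), character(1))
}

res <- last_word(c("this is a sentence", "single"))
print(res)
# [1] "sentence" "single"
```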
Remove last word of string to first word in R
Here is an option using dplyr and stringr.
library(dplyr)
library(stringr)
df %>%
  mutate(temp = str_extract(string, str_c(trail, collapse = '|')),
         # str_trim drops the leading space that temp inherits from trail
         result = ifelse(is.na(temp), string,
                         str_c(str_trim(temp), str_remove(string, temp), sep = ' '))) %>%
  select(-temp)
#                   string                  result
#1      ABA PRIMARY SCHOOL      PRIMARY SCHOOL ABA
#2 BLABLA SECONDARY SCHOOL SECONDARY SCHOOL BLABLA
#3           WAZA INSTITUT           INSTITUT WAZA
#4           INSTITUT WAMA           INSTITUT WAMA
#5     PRIMARY SCHOOL WAMA     PRIMARY SCHOOL WAMA
data
string <- c("ABA PRIMARY SCHOOL", "BLABLA SECONDARY SCHOOL", "WAZA INSTITUT", "INSTITUT WAMA", "PRIMARY SCHOOL WAMA")
df <- data.frame(string)
trail = c(" PRIMARY SCHOOL", " SECONDARY SCHOOL", " INSTITUT")
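For comparison, a base R sketch of the same idea, not the approach used in the answer above: it builds one anchored pattern and swaps the two capture groups with sub. Here trail is redefined without the leading spaces, and the names pattern and result are only illustrative.

```r
string <- c("ABA PRIMARY SCHOOL", "BLABLA SECONDARY SCHOOL", "WAZA INSTITUT",
            "INSTITUT WAMA", "PRIMARY SCHOOL WAMA")
# Assumption: trailing phrases listed without the leading space
trail <- c("PRIMARY SCHOOL", "SECONDARY SCHOOL", "INSTITUT")

# Group 1 = everything before the trailing phrase, group 2 = the phrase itself
pattern <- paste0("^(.*) (", paste(trail, collapse = "|"), ")$")

# Strings that do not end in one of the phrases are left unchanged
result <- sub(pattern, "\\2 \\1", string)
print(data.frame(string, result))
```

Because the pattern is anchored with $, a phrase at the start of the string (rows 4 and 5) does not match and the value is returned as is.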
R - Regex to Remove Last Word from String
We can capture the substrings as groups using sub with a pattern, then add a delimiter (,) between the capture groups in the replacement and use that as sep in read.table. If there are leading/trailing spaces, remove them with str_trim from stringr by looping through the columns.
library(stringr)
d1 <- read.table(text=sub('(.*)\\s+(\\S+)$', '\\1,\\2', v1),sep=',')
d1[] <- lapply(d1, str_trim)
d1
#               V1        V2
#1       PLAYSTORE   BANGKOK
#2   FLOAT@THE BAY SINGAPORE
#3          YANTRA SINGAPORE
#4 AIRASIA_QS9DQQL SINGAPORE
Or as suggested by @RichardScriven, a base R option for trimming leading/trailing spaces is trimws.
d1[] <- lapply(d1, trimws)
data
v1 <- c('PLAYSTORE BANGKOK','FLOAT@THE BAY SINGAPORE',
'YANTRA SINGAPORE',
'AIRASIA_QS9DQQL SINGAPORE')
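To see what each capture group holds without going through read.table, the two pieces can also be pulled out separately. This is only a sketch using the same pattern and data as above; head_part and last_part are illustrative names.

```r
v1 <- c('PLAYSTORE BANGKOK', 'FLOAT@THE BAY SINGAPORE',
        'YANTRA SINGAPORE',
        'AIRASIA_QS9DQQL SINGAPORE')

# Group 1: everything before the last whitespace-separated word
head_part <- sub('(.*)\\s+(\\S+)$', '\\1', v1)
# Group 2: the last word itself
last_part <- sub('(.*)\\s+(\\S+)$', '\\2', v1)

print(head_part)
# [1] "PLAYSTORE"       "FLOAT@THE BAY"   "YANTRA"          "AIRASIA_QS9DQQL"
print(last_part)
# [1] "BANGKOK"   "SINGAPORE" "SINGAPORE" "SINGAPORE"
```

Keeping only head_part is the direct "remove the last word" result.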
extract last word from string only if more than one word R
Maybe something like the following.
x <- c("Genus species", "Genus", "Genus (word) species")
y <- gsub(".*[[:blank:]](\\w+)$", "\\1", x)
is.na(y) <- y == "Genus"
y
[1] "species" NA "species"
Note that it would be very difficult to search for "species" directly, since we don't have a full list of them. That's why I've opted to set the elements of the result y to NA if they are equal to "Genus".
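If the single-word values are not always "Genus", one possible generalisation is to blank out every element of x that contains no whitespace at all. This is a sketch under that assumption, not part of the original answer.

```r
x <- c("Genus species", "Genus", "Genus (word) species")

# Same extraction as above; strings without a blank are returned unchanged
y <- sub(".*[[:blank:]](\\w+)$", "\\1", x)

# Assumption: "only one word" means "contains no blank character"
y[!grepl("[[:blank:]]", x)] <- NA
print(y)
# [1] "species" NA        "species"
```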
Extract last word in string in R - error faced
I realise that there is white space at the beginning of some of the rows of the Description variable, which isn't shown when viewed in R. Removing the whitespace using stri_trim() solved the issue.
library(stringi)
# remove leading whitespace
c1$Description <- stri_trim(c1$Description, side = "left")
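Since the leading whitespace is easy to miss when viewing the data, a quick way to detect it, and a base R way to strip it, might look like this. The desc vector is hypothetical and stands in for c1$Description.

```r
# Hypothetical values standing in for c1$Description
desc <- c("  red apple", "green pear", "\tyellow plum")

# Flag the rows that start with whitespace
has_leading <- grepl("^\\s", desc)
print(has_leading)
# [1]  TRUE FALSE  TRUE

# Base R alternative to stri_trim(..., side = "left")
cleaned <- trimws(desc, which = "left")
print(cleaned)
# [1] "red apple"   "green pear"  "yellow plum"
```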
How to remove the last word in a string using JavaScript
Use:
var str = "I want to remove the last word.";
var lastIndex = str.lastIndexOf(" ");
str = str.substring(0, lastIndex);
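Find the index of the last space, then take the substring up to that index. If the string contains no space, lastIndexOf returns -1 and the result is an empty string.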
Delete the first 2 words and the last 2 words in a string in a dataframe using Regex Python
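You can use a single call to Series.str.replace (pass regex=True, since pandas 2.0 and later treat the pattern as a literal string by default) with
df['Sentence'].str.replace(r'(?<![^,])\s*\w+(?:\W+\w+)?\s*|\s*\w+(?:\W+\w+)?\s*(?![^,])', '', regex=True)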
See the Pandas demo:
>>> pattern = r'(?<![^,])\s*\w+(?:\W+\w+)?\s*|\s*\w+(?:\W+\w+)?\s*(?![^,])'
>>> df['Sentence'].str.replace(pattern, '', regex=True)
0 is jumping off
1 jumped over the,is
2
Regex details
(?<![^,]) - a comma or start of string must appear immediately to the left of the current location
\s* - 0+ whitespaces
\w+ - one or more word chars
(?:\W+\w+)? - an optional occurrence of one or more non-word chars followed with one or more word chars
\s* - 0+ whitespaces
| - or
\s* - 0+ whitespaces
\w+ - a word (one or more word chars)
(?:\W+\w+)? - an optional occurrence of one or more non-word chars followed with one or more word chars
\s* - 0+ whitespaces
(?![^,]) - end of string, or a location that is immediately followed with a comma.