Gsub Partial Replace

You could do something like this:

my_string.gsub(/(<--MARKER_START-->)(.*)(<--MARKER_END-->)/, '\1replace_text\3')
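To see what that one-liner does, here is a minimal sketch run on a made-up input. The sample strings and variable names are assumptions for illustration; only the marker texts and the pattern come from the answer above. It also shows a lazy `.*?` variant, which may be preferable when a string can hold more than one marker pair, since the greedy `.*` spans from the first start marker to the last end marker.

```ruby
# Hypothetical sample input; only the marker strings come from the answer above.
my_string = "head <--MARKER_START-->old text<--MARKER_END--> tail"

# Groups 1 and 3 capture the markers and are put back via \1 and \3;
# group 2 (the text between the markers) is what gets replaced.
replaced = my_string.gsub(/(<--MARKER_START-->)(.*)(<--MARKER_END-->)/, '\1replace_text\3')
puts replaced

# With two marker pairs, a lazy .*? stops at the nearest end marker,
# so each pair is rewritten separately.
two_pairs = "<--MARKER_START-->a<--MARKER_END--> x <--MARKER_START-->b<--MARKER_END-->"
lazy = two_pairs.gsub(/(<--MARKER_START-->)(.*?)(<--MARKER_END-->)/, '\1new\3')
puts lazy
```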

R gsub partial replacement wildcards

I only want to replace variables starting with FOO and ending with 1

Capture FOO and everything after it into Group 1, and match _1 only at the end of the string. Then, in the replacement pattern, use a backreference to the Group 1 value:

str <- c("FOO_1", "FOO_2", "BAR_1", "BAR_2")
sub("^(FOO.*)_1$", "\\1_A", str)
## => [1] "FOO_A" "FOO_2" "BAR_1" "BAR_2"

See this R demo

To match any number of digits at the end of the string, replace 1 with \\d+.

Details

  • ^ - string start
  • (FOO.*) - FOO substring and then any 0+ chars, as many as possible
  • _1 - a _1 substring (if you replace 1 with \\d+, it will match 1 or more digits)
  • $ - end of string.

Partially replace regex pattern in string using gsub in R?

You may use a capturing group (a pair of unescaped parentheses) around the part of the pattern you need to keep after replacement and a backreference to the group value inside the replacement pattern:

gsub('(\\d)"', "\\1IN", uuuu)
      ^   ^     ^^^

See the regex demo.

Pattern details

  • (\d) - Capturing group 1 (whose value can be referenced with a \1 backreference from the replacement pattern): any digit
  • " - a double quote.

R demo:

uuuu<- c('BELT, "V" 5L610, LONG 4.5" WIDE 7.5", TYPE "K"')
cat(gsub('(\\d)"', "\\1IN", uuuu))
## => BELT, "V" 5L610, LONG 4.5IN WIDE 7.5IN, TYPE "K"

How to use Ruby gsub with regex to do partial string substitution

You may replace the first occurrence of 8 digits inside pipes if a string starts with H using

s = "H||CUSTCHQH2H||PHPCCIPHP|1010032000|28092017|25001853||||"
p s.gsub(/\A(H.*?\|)[0-9]{8}(?=\|)/, '\100000000')
# or
p s.gsub(/\AH.*?\|\K[0-9]{8}(?=\|)/, '00000000')

See the Ruby demo. Here, the value is replaced with 8 zeros.

Pattern details

  • \A - start of string (^ is the start of a line in Ruby)
  • (H.*?\|) - Capturing group 1 (you do not need it when using the variation with \K): H, then any 0+ chars other than line breaks, as few as possible, and then a | char
  • \K - match reset operator that discards the text matched so far
  • [0-9]{8} - eight digits
  • (?=\|) - the next char must be |, but it is not added to the match value since it is a positive lookahead that does not consume text.

The \1 in the first gsub is a replacement backreference to the value in Group 1.
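Because `'\100000000'` can look like a reference to a very large group number, here is a hedged sketch of two alternative spellings that make the boundary between the backreference and the literal zeros explicit: the block form with `$1`, and a named group referenced with `\k<head>`. The group name `head` and the second sample string are assumptions for illustration. Since the pattern is anchored with `\A`, it can match at most once, so `sub` is used here.

```ruby
s = "H||CUSTCHQH2H||PHPCCIPHP|1010032000|28092017|25001853||||"

# Block form: $1 holds Group 1, so nothing in the replacement can be misread.
with_block = s.sub(/\A(H.*?\|)[0-9]{8}(?=\|)/) { "#{$1}00000000" }
puts with_block

# Named group form: \k<head> refers to the group called "head" (a made-up name).
with_name = s.sub(/\A(?<head>H.*?\|)[0-9]{8}(?=\|)/, '\k<head>00000000')
puts with_name

# A string that does not start with H is left untouched (hypothetical input).
other = "D||X|28092017|"
unchanged = other.sub(/\A(H.*?\|)[0-9]{8}(?=\|)/) { "#{$1}00000000" }
puts unchanged
```

Note that 1010032000 is skipped: its first eight digits are followed by another digit, not by a pipe, so the lookahead fails there.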

Partial string replace with gsub

A gsub! (with the !, because you iterate with each rather than map) with a plain string instead of a regex should work:

"path/to/s_image.jpg".gsub '/s_', '/xl_'
# => "path/to/xl_image.jpg"
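The each versus map remark can be sketched as follows. The list of paths is a made-up example, and the strings are duplicated first so that they are mutable even when string literals are frozen.

```ruby
# Hypothetical list of paths, used only for illustration.
# dup makes the strings mutable even under frozen_string_literal.
paths = ["path/to/s_image.jpg", "other/s_photo.png"].map(&:dup)

# each discards the block's return value, so the strings must be
# changed in place with gsub!.
paths.each { |path| path.gsub!('/s_', '/xl_') }
p paths

# map collects the block's return values, so the non-mutating gsub is enough.
originals = ["path/to/s_image.jpg", "other/s_photo.png"]
converted = originals.map { |path| path.gsub('/s_', '/xl_') }
p converted
```

One thing to keep in mind: gsub! returns nil when nothing was replaced, so its return value should not be used as the new string.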

Update

As pointed out in the comments, the solution might result in unexpected behavior if the path contains multiple occurrences of '/s_'.

"path/s_thing/s_image.jpg".gsub '/s_', '/xl_'
#=> "path/xl_thing/xl_image.jpg"
          ▲        ▲

Borodin posted a nice, short regex substitution, which works in that case:

"path/s_thing/s_image.jpg".sub %r|/s_(?!.*/)|, '/xl_'
#=> "path/s_thing/xl_image.jpg"
          △       ▲

It only replaces the last occurrence of '/s_'.
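Here is a short sketch of why the lookahead picks the last occurrence, next to an alternative that is not from the answer: splitting the path with File.split and rewriting only the file name. The variable names are assumptions for illustration.

```ruby
path = "path/s_thing/s_image.jpg"

# (?!.*/) is a negative lookahead: the match is rejected if any "/" follows it,
# so only the "/s_" that starts the final path segment can match.
last_only = path.sub(%r{/s_(?!.*/)}, '/xl_')
puts last_only

# Alternative (an assumption, not from the answer): split off the file name
# and replace the prefix only there.
dir, file = File.split(path)
via_split = File.join(dir, file.sub(/\As_/, 'xl_'))
puts via_split
```

The File.split variant behaves differently for a bare file name such as "s_image.jpg": the directory part becomes ".", so the result gains a "./" prefix.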

Using gsub or sub function to only get part of a string?

The following may help here too.

sub("([^:]*):([^:]*).*","\\1:\\2",df$dat)

Output will be as follows.

> sub("([^:]*):([^:]*).*","\\1:\\2",df$dat)
[1] "WBU-ARGU*06:03" "WBU-ARDU*08:01" "WBU-ARFU*11:03" "WBU-ARFU*03:456b"

The input data frame is as follows.

dat <- c("WBU-ARGU*06:03:04","WBU-ARDU*08:01:01","WBU-ARFU*11:03:05","WBU-ARFU*03:456b")
df <- data.frame(dat)

Explanation: The call is split up below for explanation purposes only.

sub("         ## sub() replaces only the first match in each element; gsub() would replace all matches.
([^:]*)       ## Capturing group 1: zero or more characters other than a colon, i.e. everything up to the first colon.
:             ## A literal colon.
([^:]*)       ## Capturing group 2: everything up to the next colon (or the end of the string).
.*"           ## Everything that remains after group 2, i.e. the second colon and what follows; it is matched so that it gets dropped.
,"\\1:\\2"    ## The replacement: group 1, a colon, then group 2 (\\1 and \\2 are backreferences to the two groups).
,df$dat)      ## The input: the dat column of the data frame df.

Replace using gsub

I played around with rubular and I came up with a regexp which works if you don't mind adding a dot manually to the replacement.

Here is what I came up with

"this-is-a-string.jpg".gsub(/\w+\./, 'hash.')

So I guess you could make a simple function which replaces it like

def replace_string(string_to_replace, replacement)
  # Replace the run of word characters right before a dot, keeping the dot.
  string_to_replace.gsub(/\w+\./, "#{replacement}.")
end
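A runnable sketch of that helper, assuming the intent is to call gsub on the string that is passed in. The lookahead variant is an addition, not part of the answer: it asserts the dot instead of matching it, so the dot does not have to be added back in the replacement.

```ruby
# Replaces the run of word characters right before a dot, keeping the dot.
def replace_string(string_to_replace, replacement)
  string_to_replace.gsub(/\w+\./, "#{replacement}.")
end

result = replace_string("this-is-a-string.jpg", "hash")
puts result

# Lookahead variant (an assumption, not from the answer): (?=\.) checks that a
# dot follows without consuming it, so the replacement needs no trailing dot.
lookahead = "this-is-a-string.jpg".gsub(/\w+(?=\.)/, 'hash')
puts lookahead
```

Both forms replace every word that is followed by a dot, so a name with several dots would have each such part replaced; sub would limit the change to the first one.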

In Ruby 1.9.2, I managed to extract the word "string" with a named capture group, but I don't know if that is of any use to you.

  /[-\w]+\-(?<word>(\w+))\.\w+/ =~ "this-is-a-string.jpg"
=> 0
word
=> "string"

I hope I've helped you and given the information you needed

Replace a whole word containing a pattern - gsub and R

You need one of the following:

gsub("\\s*[[:alpha:]]*([[:alpha:]])\\1{2}[[:alpha:]]*", "", string)
gsub("\\s*\\p{L}*(\\p{L})\\1{2}\\p{L}*", "", string, perl=TRUE)
stringr::str_replace_all(string, "\\s*\\p{L}*(\\p{L})\\1{2}\\p{L}*", "")

See an R demo:

string <- "This is a baaaad unnnnecessary short word"
gsub("\\s*[[:alpha:]]*([[:alpha:]])\\1{2}[[:alpha:]]*", "", string)
gsub("\\s*\\p{L}*(\\p{L})\\1{2}\\p{L}*", "", string, perl=TRUE)
library(stringr)
str_replace_all(string, "\\s*\\p{L}*(\\p{L})\\1{2}\\p{L}*", "")

All three calls yield [1] "This is a short word".

See the regex demo. Regex details:

  • \s* - zero or more whitespaces
  • \p{L}* / [[:alpha:]]* - zero or more letters
  • (\p{L}) - Capturing group 1: any single letter
  • \1{2} - two more occurrences of the same value as in Group 1, so the letter appears at least three times in a row
  • \p{L}* / [[:alpha:]]* - zero or more letters.
