Replacing Variable Length String with Some Word

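First collapse each run of underscores into a single marker, then replace the marker.

Here is one method. It assumes @string contains no < or > characters of its own:

select replace(replace(replace(@string, '_', '><'), '<>', ''), '><', 'LASTNAME')

Each underscore becomes '><', so a run such as '___' becomes '><><><'. Removing every '<>' leaves a single '><', which the outer replace turns into 'LASTNAME'.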
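The same three-step trick can be sketched outside SQL. The Python below is only an illustration: the names fill_blank and fill_blank_regex are made up for this example, and like the SQL it assumes the input has no < or > characters of its own. The second function shows the regex shortcut that is available when the language supports it.

```python
import re


def fill_blank(text: str, word: str = "LASTNAME") -> str:
    """Collapse each run of underscores into one marker, then replace it."""
    # Step 1: every underscore becomes "><", so "___" turns into "><><><".
    marked = text.replace("_", "><")
    # Step 2: deleting each "<>" leaves exactly one "><" per original run.
    collapsed = marked.replace("<>", "")
    # Step 3: swap the single remaining marker for the replacement word.
    return collapsed.replace("><", word)


def fill_blank_regex(text: str, word: str = "LASTNAME") -> str:
    """Same result in one step: match a whole run of underscores at once."""
    # A function is used as the replacement so that backslashes in `word`
    # are inserted literally.
    return re.sub(r"_+", lambda m: word, text)


if __name__ == "__main__":
    print(fill_blank("Dear Mr. ____, welcome"))
```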

Find and replace string with variable length between 2 signs

You can also use Regular Expressions to do this.

Option Explicit

Sub BereinigenRgEx()
    Dim Text As String, outText As String
    Dim RgEx As Object

    'Text = ActiveSheet.Cells(1, 1)
    Text = "26125 Oldenburg (Oldenburg), Alexandersfeld"

    ' Late-bound regex object, so no project reference is needed
    Set RgEx = CreateObject("VBScript.RegExp")
    With RgEx
        .Global = True          ' replace every match, not just the first
        .Pattern = "\(.+\)"     ' greedy: first "(" through the last ")"
        outText = .Replace(Text, "")
        Debug.Print outText
    End With
End Sub
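For comparison, here is a rough Python sketch of the same idea. It is not a translation of the VBA: the function name strip_parenthesized is invented for the example, and the pattern uses [^)]* instead of the greedy .+ so that a line with several parenthesized parts has each one removed separately.

```python
import re


def strip_parenthesized(text: str) -> str:
    """Remove every (...) group, parentheses included."""
    # [^)]* stops at the first closing parenthesis, so two groups on the
    # same line are treated as two matches rather than one long one.
    return re.sub(r"\([^)]*\)", "", text)


if __name__ == "__main__":
    print(strip_parenthesized("26125 Oldenburg (Oldenburg), Alexandersfeld"))
```

Note that the space in front of the removed group is left in place, just as it is in the VBA version.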

Replace variable length string with characters to matching original string length

With GNU awk for the 3rd arg to match() and gensub():

$ cat tst.awk
{
    while ( match($3,/(.*)(>\.*<)(.*)/,a) ) {
        $3 = a[1] gensub(/./,"L","g",a[2]) a[3]
    }
    print
}

$ awk -f tst.awk file
field1 field2 >>>>>.>............>>LLLLLLLLLLL<<.......>>>LLLLLLLLL<<<.<.<<<<<.

With any awk:

$ cat tst.awk
{
    while ( match($3,/>\.*</) ) {
        tgt = substr($3,RSTART,RLENGTH)
        gsub(/./,"L",tgt)
        $3 = substr($3,1,RSTART-1) tgt substr($3,RSTART+RLENGTH)
    }
    print
}

$ awk -f tst.awk file
field1 field2 >>>>>.>............>>LLLLLLLLLLL<<.......>>>LLLLLLLLL<<<.<.<<<<<.
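If awk is not a requirement, the length-preserving replacement can be sketched with a replacement function. This is a Python illustration under assumptions: mask_runs is a made-up name, and it works on a single field value instead of splitting the record the way awk does.

```python
import re


def mask_runs(field: str, fill: str = "L") -> str:
    """Replace each '>', dots, '<' run with the same number of fill characters."""
    # The callback receives each match and returns a string of equal length,
    # so the overall field length never changes.
    return re.sub(r">\.*<", lambda m: fill * len(m.group(0)), field)


if __name__ == "__main__":
    print(mask_runs(">>>" + "." * 9 + "<<<"))
```

A single pass is enough here because matches cannot overlap and the fill character can never form a new match.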

Replace found string with different length?

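abc(?![^ZER])

Try this; see the demo:

http://regex101.com/r/lS5tT3/43

The negative lookahead means abc matches only when it is not followed by a character other than Z, E, or R, that is, when it is followed by one of those letters or by the end of the input. The lookahead consumes nothing, so only abc itself is replaced, and the replacement can be any length you want.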
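To see which occurrences the lookahead accepts, here is a small Python sketch. The function name replace_abc and the sample text are invented for the illustration; only the pattern comes from the answer.

```python
import re

# abc, provided the next character (if any) is Z, E, or R.
PATTERN = re.compile(r"abc(?![^ZER])")


def replace_abc(text: str, replacement: str) -> str:
    """Replace qualifying abc occurrences; the replacement may be any length."""
    # A function is used so the replacement is inserted literally.
    return PATTERN.sub(lambda m: replacement, text)


if __name__ == "__main__":
    print(replace_abc("abcZ abcX abc", "whatever"))
```

In the sample, the first abc (followed by Z) and the last one (end of input) are replaced, while abcX is left alone.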

Sed to replace variable length string between 2 known patterns

Using sed loops

You can use sed, though the thinking required is not wholly obvious:

sed ':a;s/^\(Hello\.x*\)[^x]\(.*\.SecondString\)/\1x\2/;t a'

This is for GNU sed; BSD (Mac OS X) sed and other versions may be fussier and require:

sed -e ':a' -e 's/^\(Hello\.x*\)[^x]\(.*\.SecondString\)/\1x\2/' -e 't a'

The logic is identical in both:

  • Create a label a
  • Substitute the lead string and a sequence of x's (capture 1), followed by a non-x, and arbitrary other data plus the second string (capture 2), and replace it with the contents of capture 1, an x and the content of capture 2.
  • If the s/// command made a change, go back to the label a.

It stops substituting when there are no non-x's between the two marker strings.

Two tweaks to the regex allow the code to recognize two copies of the pattern on a single line. Lose the ^ that anchors the match to the beginning of the line, and change .* to [^.]* (so that the regex is not quite so greedy):

$ echo Hello.StringToBeReplaced.SecondString Hello.StringToBeReplaced.SecondString |
> sed ':a;s/\(Hello\.x*\)[^x]\([^.]*\.SecondString\)/\1x\2/;t a'
Hello.xxxxxxxxxxxxxxxxxx.SecondString Hello.xxxxxxxxxxxxxxxxxx.SecondString
$
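The label-and-branch loop may be easier to follow when written as an ordinary loop. The Python below is a sketch that mimics the two sed commands above, not a replacement for them; mask_between, ANCHORED and UNANCHORED are names made up for the example.

```python
import re

# Same regexes as the two sed commands, in Python syntax.
ANCHORED = re.compile(r"^(Hello\.x*)[^x](.*\.SecondString)")
UNANCHORED = re.compile(r"(Hello\.x*)[^x]([^.]*\.SecondString)")


def mask_between(line: str, pattern: re.Pattern = ANCHORED) -> str:
    """Turn one non-x character into x per pass until nothing changes."""
    while True:
        # count=1 mirrors s/// without the g flag: one substitution per pass.
        line, changed = pattern.subn(r"\1x\2", line, count=1)
        if not changed:  # same role as falling through 't a'
            return line


if __name__ == "__main__":
    print(mask_between("Hello.StringToBeReplaced.SecondString"))
```

Each pass grows the run of x's by one character, which is why the loop is guaranteed to stop.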

Using the hold space

hek2mgl suggests an alternative approach in sed using the hold space. This can be implemented using:

$ echo Hello.StringToBeReplaced.SecondString |
> sed 's/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/\1@\3@@\2/
> h
> s/.*@@//
> s/./x/g
> G
> s/\(x*\)\n\([^@]*\)@\([^@]*\)@@.*/\2\1\3/
> '
Hello.xxxxxxxxxxxxxxxxxx.SecondString
$

This script is not as robust as the looping version, but it works as written when each line matches the lead-middle-tail pattern. The steps are:

  • The first s/// splits the line into three sections (the first marker, the bit to be mangled, and the second marker) and reorganizes them so that the two markers are separated by @, followed by @@ and the bit to be mangled.
  • h copies the result to the hold space.
  • s/.*@@// removes everything up to and including the @@, and s/./x/g then replaces each character in the bit to be mangled with x.
  • G appends the material in the hold space after the x's in the pattern space, with a newline separating them.
  • The final s/// captures the x's, the lead marker, and the tail marker, ignoring the newline, the @, and the @@ plus trailing material, and reassembles them as lead marker, x's, and tail marker.

To make it robust, test for the pattern first and enclose the commands in { and } so that they are only executed on lines where the pattern is recognized:

sed '/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/{
s/^\(Hello\.\)\([^.]\{1,\}\)\(\.SecondString\)/\1@\3@@\2/
h
s/.*@@//
s/./x/g
G
s/\(x*\)\n\([^@]*\)@\([^@]*\)@@.*/\2\1\3/
}'

Adjust to suit your needs...
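Python has no hold space, but the split, mask and reassemble idea behind the script can be sketched with capture groups and a replacement function. The name mask_middle is invented here, and the sketch only touches lines that match, like the guarded sed version.

```python
import re

# lead marker, the bit to be mangled (no dots), tail marker
LINE = re.compile(r"^(Hello\.)([^.]+)(\.SecondString)")


def mask_middle(line: str) -> str:
    """Replace the middle section with x's of the same length."""
    def rebuild(m: re.Match) -> str:
        lead, middle, tail = m.groups()
        return lead + "x" * len(middle) + tail

    # Lines that do not match are returned unchanged.
    return LINE.sub(rebuild, line, count=1)


if __name__ == "__main__":
    print(mask_middle("Hello.StringToBeReplaced.SecondString"))
```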

Adjusting to suit your needs

[I tried one of your solutions and it worked fine.]
However when I try to replace the 'hello' by my real string (which is
'1.2.840.') and my second string (which is simply a dot '.'), things stop
working. I guess all these dots confuse the sed command.
What I try to achieve is transform this '1.2.840.10008.' to
'1.2.840.xxxxx.'

And this pattern happens several times in my file with variable number
of characters to be replaced between the '1.2.840.' and the next dot '.'

There are times when it is important to get your question close enough to the real scenario, and this may be one of them. Dot is a metacharacter in sed regular expressions (and in most other dialects of regular expression, shell globbing being the notable exception). If the 'bit to be mangled' is always digits, then we can tighten up the regular expressions, though actually (when I look at the code ahead) the tightening really isn't imposing much in the way of a restriction.

Pretty much any solution using regular expressions is a balancing act that has to pit convenience and abbreviation against reliability and precision.

Revised code plus data

cat <<EOF |
transform this '1.2.840.10008.' to '1.2.840.xxxxx.'
OK, and hence 1.2.840.21. and 1.2.840.20992. should lose the 21 and 20992.
EOF

sed ':a;s/\(1\.2\.840\.x*\)[^x.]\([^.]*\.\)/\1x\2/;t a'

Example output:

transform this '1.2.840.xxxxx.' to '1.2.840.xxxxx.'
OK, and hence 1.2.840.xx. and 1.2.840.xxxxx. should lose the 21 and 20992.

The changes in the script are:

sed ':a;s/\(1\.2\.840\.x*\)[^x.]\([^.]*\.\)/\1x\2/;t a'
  1. Add 1\.2\.840\. as the start pattern.
  2. Revise the 'character to replace' expression to 'not x or .'.
  3. Use just \. as the tail pattern.

You could replace the [^x.] with [0-9] if you're sure you only want digits matched, in which case you won't have to worry about spaces as discussed below.

You may decide you don't want spaces to be matched so that a casual comment like:

The net prefix is 1.2.840. And there are other prefixes too.

does not end up as:

The net prefix is 1.2.840.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.

In which case, you probably need to use:

sed ':a;s/\(1\.2\.840\.x*\)[^x. ]\([^ .]*\.\)/\1x\2/;t a'

And so the changes continue until you've got something precise enough to do what you want without doing anything you don't want on your current data set. Writing bullet-proof regular expressions requires a precise specification of what you want matched, and can be quite hard.
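As an illustration of the digits-only option, here is a Python sketch. It assumes the part to be masked is always digits and is always followed by a dot; mask_oid_suffix and OID_PART are names invented for the example. Unlike the sed loop it masks the whole run in one step, because the replacement function can compute the length.

```python
import re

# Prefix, then the digits to mask, but only when a dot follows them.
OID_PART = re.compile(r"(1\.2\.840\.)([0-9]+)(?=\.)")


def mask_oid_suffix(text: str) -> str:
    """Replace the digits after 1.2.840. with the same number of x's."""
    return OID_PART.sub(lambda m: m.group(1) + "x" * len(m.group(2)), text)


if __name__ == "__main__":
    print(mask_oid_suffix("hence 1.2.840.21. and 1.2.840.20992. are masked"))
```

Because only digits are matched, the casual comment about the net prefix is left untouched.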

Bash: Replace word with spaces equal to the length of the word

Here's one way you could do it, using a combination of shell parameter expansion and the sed command.

$ var="XXXX This is a line"
$ word_to_replace="XXXX"
$ replacement=${word_to_replace//?/ }
$ sed "s/$word_to_replace/$replacement/" <<<"$var"
     This is a line

? matches any character and ${var//find/replace} does a global substitution, so the variable $replacement has the same length as $word_to_replace, but is composed solely of spaces.

You can save the result to a variable in the usual way:

new_var=$(sed "s/$word_to_replace/$replacement/" <<<"$var")
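For comparison, the same length-preserving replacement is short in Python as well. This is a sketch with an invented function name, blank_out; it treats the word as plain text, so characters that would be special in a sed pattern need no escaping.

```python
def blank_out(text: str, word: str) -> str:
    """Replace the first occurrence of word with spaces of the same length."""
    # count=1 matches the sed command, which has no g flag.
    return text.replace(word, " " * len(word), 1)


if __name__ == "__main__":
    print(repr(blank_out("XXXX This is a line", "XXXX")))
```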

How to replace a string if position and length are unknown/variable?

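You need to replace the URL string inside the corresponding area. If area is a string, note that replace returns a new string instead of modifying area in place, so the result has to be assigned or returned.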

area.replace(anchor, createAnchorTag(anchor))
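When the position and length are not known in advance, a regex with a replacement function can find each URL and wrap it. The Python sketch below is purely illustrative: the URL pattern is deliberately simplistic, and create_anchor_tag and link_urls are hypothetical helpers written for this example, not the functions from the question.

```python
import html
import re

# Very rough URL pattern (an assumption): http or https up to the next space.
URL = re.compile(r"https?://\S+")


def create_anchor_tag(url: str) -> str:
    """Hypothetical helper: wrap a URL in an HTML anchor."""
    safe = html.escape(url, quote=True)
    return f'<a href="{safe}">{safe}</a>'


def link_urls(area: str) -> str:
    """Return a copy of area with every URL replaced by its anchor tag."""
    return URL.sub(lambda m: create_anchor_tag(m.group(0)), area)


if __name__ == "__main__":
    print(link_urls("see http://example.com now"))
```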

