How to extract a substring using regex
Assuming you want the part between single quotes, use this regular expression with a Matcher
:
"'(.*?)'"
Example:
String mydata = "some string with 'the data i want' inside";
Pattern pattern = Pattern.compile("'(.*?)'");
Matcher matcher = pattern.matcher(mydata);
if (matcher.find())
{
System.out.println(matcher.group(1));
}
Result:
the data i want
How to extract a substring using regex in java
You can use this regex \{\$(.*?)\}
with pattern like this :
String xmlnode = "<firstname id=\"{$person.id}\"> {$person.firstname} </firstname>";
Pattern pattern = Pattern.compile("\\{\\$(.*?)\\}");
Matcher matcher = pattern.matcher(xmlnode);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Note : you have to escape each character {
$
}
with \
because each one is special character in regex.
Outputs
person.id
person.firstname
How to extract a substring using a regular expression
To make sure that we have one of the answers we need to check what symbol is before it. To do so we can use Positive Lookbehind
like this (?<=Y)X
. Idea is something like "find the X if there is a Y before it". Then all we need is just to take all the text not matching special chars like \
, #
, [
or ]
using [^XYZ]
that matches everything besides X
, Y
and Z
.
To fix the issue from the comment we also need to check what comes next after the answer. There are 2 options: \
or ]
. Now we are going to use Positive Lookahead which is like Lookbehind but checks the text after X
. Example X(?=Y)
means "find the X if there is a Y after it".
Final patterns are:
- Wrong answers:
(?<=\\|\[)[^\\#\[\]]+(?=\\|\])
- Correct answers:
(?<=#)[^\\#\[\]]+(?=\\|\])
How to extract a substring using a regular expression?
REGEX is a filter test, REPLACE is the extraction operation.
SELECT *
WHERE {
?s ?p ?o .
FILTER REGEX(?o, "\\{.*:(.*)\\}")
}
which tests ?o
, and does not extract the ()
part.
Note the double \\
.
To extract use BIND-REPLACE.
SELECT * {
?s ?p ?o .
BIND(REPLACE(?o, "^.*\\{.*:(.*)\\}.*$", "$1") AS ?substring)
}
In the general case, you may need str(?o)
instead of ?o
in functions.
How to extract a substring using this regex pattern? It's give a ValueError: too many values to unpack (expected 1)
let's update patterns string to a logical view and follow main feature.
regex_what_who = r"(que sabes|que sabias|que sabrias|que te referis|que te refieres|que te referias|que te habias referido|que habias referido|a que|que|quienes|quien|con que|con lo que|con la que|con|acerca de que|acerca de quienes|acerca de quien|sobre de que|sobre que|sobre de quienes|sobre quienes|sobre de quien|sobre quien|son|sean|es|serian|seria|iguales|igual|similares|similar|parecidos|parecido|comparables|comparable|asociables|asociable|distinguibles|distinguible|distintos|distinto|diferentes|diferente|diferenciables|diferenciable).*(a|del|de)\s*((?:\w+\s*)+)?"
then, fix error first error in case if we got one result or many:
association, _temp = l.groups()
It Work's! -)
How to extract part of string in Bash using regex
This gnu sed
should work with ignore case flag:
sed -E 's~^(.*/){0,1}((def|foo|bar)-[0-9]{2,6})-.*~\2~I' file
def-1234
foo-12
bar-12345
This sed matches:
(.*/){0,1}
: Match a string upto/
optionally at the start(
: Start capture group #2(def|foo|bar)
: Matchdef
orfoo
orbar
-
: Match a-
[0-9]{2,6}
: Match 2 to 6 digits
)
: End capture group #2-.*
: Match-
followed by anything till end- Substitution is value we capture in group #2
Or you may use this awk
:
awk -v IGNORECASE=1 -F / 'match($NF, /^(def|foo|bar)-[0-9]{2,6}-/) {print substr($NF, 1, RLENGTH-1)}' file
def-1234
foo-12
bar-12345
Awk explanation:
-v IGNORECASE=1
: Enable ignore case matching-F /
: Use/
as field separatormatch($NF, /^(def|foo|bar)-[0-9]{2,6}-/)
: Match text using regex^(def|foo|bar)-[0-9]{2,6}-
in$NF
which is last field using/
as field separator (to ignore text before/
)- If match is successful then using
substr
print text from position1
toRLENGTH-1
(since we matching until-
after digits)
How to extract the substring between two markers?
Using regular expressions - documentation for further reference
import re
text = 'gfgfdAAA1234ZZZuijjk'
m = re.search('AAA(.+?)ZZZ', text)
if m:
found = m.group(1)
# found: 1234
or:
import re
text = 'gfgfdAAA1234ZZZuijjk'
try:
found = re.search('AAA(.+?)ZZZ', text).group(1)
except AttributeError:
# AAA, ZZZ not found in the original string
found = '' # apply your error handling
# found: 1234
Related Topics
Compiling a Java Program into an Executable
Is There a Destructor for Java
Map Implementation with Duplicate Keys
Clone() VS Copy Constructor- Which Is Recommended in Java
How to Search Google Programmatically Java API
How to Calculate a Time Difference in Java
How to Sort an Arraylist in Java
Is the Use of Java's Default Package a Bad Practice
How to Convert Strings to and from Utf8 Byte Arrays in Java
What Is the Easiest/Best/Most Correct Way to Iterate Through the Characters of a String in Java
Jpanel in Puzzle Game Not Updating
Regex for Matching Something If It Is Not Preceded by Something Else