Find JSON Strings in a String

Extracting the JSON string from given text

Since you're looking for a simplistic solution, you can use the following regular expression, which uses recursion to match nested braces. It matches everything between a { and its corresponding }, including any nested {...} groups.

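Note, however, that this isn't guaranteed to work in all cases: it matches any balanced {...} group whether or not it is valid JSON, and an unbalanced { or } inside a string value will throw the matching off. It only serves as a quick JSON-string extraction method.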

$pattern = '
/
\{              # { character
    (?:         # non-capturing group
        [^{}]   # anything that is not a { or }
        |       # OR
        (?R)    # recurses the entire pattern
    )*          # previous group zero or more times
\}              # } character
/x
';

preg_match_all($pattern, $text, $matches);
print_r($matches[0]);

Output:

Array
(
[0] => {"action":"product","options":{...}}
[1] => {"action":"review","options":{...}}
)
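
The caveat above mostly concerns braces inside string values: an input such as {"a":"}"} trips the regex up. Below is a minimal sketch of a string-aware scanner, written in JavaScript rather than PHP. The name extractBraceBlocks and the sample text are made up for illustration, and the scanner still only finds balanced {...} blocks, so each result should be validated afterwards.

```javascript
// Hypothetical helper: returns every top-level {...} block in `text`,
// ignoring braces that appear inside double-quoted string literals.
function extractBraceBlocks(text) {
  const blocks = [];
  let depth = 0;        // number of '{' currently open
  let start = -1;       // index where the current top-level block began
  let inString = false; // true while inside a "..." literal within a block
  let escaped = false;  // true when the previous character was a backslash

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (depth === 0) {
      // Outside any block: only an opening brace is interesting.
      if (ch === '{') { depth = 1; start = i; }
      continue;
    }
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') depth++;
    else if (ch === '}' && --depth === 0) {
      // The outermost block just closed.
      blocks.push(text.slice(start, i + 1));
    }
  }
  return blocks;
}

const sample = 'log: {"msg":"a } in a string","n":{"k":1}} trailing {"ok":true}';
console.log(extractBraceBlocks(sample));
```

Each returned block can then be handed to a JSON parser to confirm that it really is valid JSON.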


Validating the JSON strings

Before PHP 8.3, the only way to know whether a JSON string is valid is to run it through json_decode() and check for errors (PHP 8.3 added json_validate()). If the parser accepts the string as conforming to the JSON standard, json_decode() returns the decoded value, typically an object or array representation of the JSON string.

If you'd like to filter out those that aren't valid JSON, then you can use array_filter() with a callback function:

function isValidJSON($string) {
    json_decode($string);
    return (json_last_error() == JSON_ERROR_NONE);
}

$valid_jsons_arr = array_filter($matches[0], 'isValidJSON');
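
For comparison, the same decode-and-check idea can be sketched in JavaScript, where JSON.parse() throws on malformed input. isValidJson and the candidates array are hypothetical and not part of the answer above.

```javascript
// Hypothetical counterpart of isValidJSON(): true when JSON.parse accepts the string.
function isValidJson(candidate) {
  try {
    JSON.parse(candidate);
    return true;
  } catch (err) {
    return false; // JSON.parse throws a SyntaxError on malformed input
  }
}

// Made-up candidates: one valid object, one with unquoted keys, one with a trailing comma.
const candidates = ['{"action":"product"}', '{a:b, c:d}', '{"n":1,}'];
const validOnly = candidates.filter(isValidJson);
console.log(validOnly); // only the first candidate survives
```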

Find JSON in a string with R

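I use regexpr() and regmatches().

  • regexpr(pattern, text): returns the position of the first match of pattern in text (use gregexpr() to find every match).
  • regmatches(x, m): extracts the matched text from x using the match data m.
  • pattern: in an R string literal each backslash must be doubled, so \{ and \} become \\{ and \\}.

library(magrittr) # provides the %>% pipe
regexpr("\\{(?:[^{}]|(?R))*\\}", txt, perl = TRUE) %>% regmatches(x = txt)
#[1] "{a:b, c:d}"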

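This pattern may be easier to understand, but it is greedy: it matches from the first { to the last } in the text, so it only suits text that contains a single {...} group.

  • The pattern is \\{(\\S|\\s)+\\}:

    • \\{ matches the opening curly bracket "{"
    • (\\S|\\s)+ matches one or more characters of any kind, whitespace or not, between the curly brackets.
    • \\} matches the closing curly bracket "}"

regexpr("\\{(\\S|\\s)+\\}", txt, perl = TRUE) %>% regmatches(x = txt)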
#[1] "{a:b, c:d}"
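
One thing to watch with the second pattern: + is greedy, so when the text holds more than one {...} group, the match runs from the first { to the last }. Here is a small check in JavaScript, where the same pattern works unchanged; greedyMatch and the sample text are made up for illustration.

```javascript
// Hypothetical helper: applies the simpler pattern and returns the matched text (or null).
function greedyMatch(text) {
  const m = text.match(/\{(\S|\s)+\}/);
  return m ? m[0] : null;
}

// With a single group the result is what you would expect.
console.log(greedyMatch('before {a:b, c:d} after'));

// With two groups the match swallows the text between them as well.
console.log(greedyMatch('first {"x":1} middle {"y":2} last'));
```

If the text can contain several objects, the recursive pattern (or a brace-counting scanner) is the safer choice.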

Hope it is useful to you :)

Check for a JSON string inside of a larger string in PHP

Use preg_match_all() to find every JSON object in the text and store the matches in an array, like so:

$text = '[22-Aug-2017 16:19:58 America/New_York] WP_Community_Events::maybe_log_events_response: Valid response received. Details: {"api_url":"https:\/\/api.wordpress.org\/events\/1.0\/","request_args":{"body":{"number":5,"ip":"192.168.99.0","locale":"en_GB","timezone":"America\/New_York"}},"response_code":200,"response_body":{"location":{"ip":"47.197.97.47"},"events":"5 events trimmed."}}';

preg_match_all('/\{(?:[^{}]|(?R))*\}/x', $text, $matches);

echo '<pre>';
print_r($matches[0]);

This yields:

Array
(
[0] => {"api_url":"https:\/\/api.wordpress.org\/events\/1.0\/","request_args":{"body":{"number":5,"ip":"192.168.99.0","locale":"en_GB","timezone":"America\/New_York"}},"response_code":200,"response_body":{"location":{"ip":"47.197.97.47"},"events":"5 events trimmed."}}
)

You can read up more from:
Extracting the JSON string from given text

Or, if you want the opposite (remove the JSON and keep the rest of the string), you can use preg_replace():

$text = '[22-Aug-2017 16:19:58 America/New_York] WP_Community_Events::maybe_log_events_response: Valid response received. Details: {"api_url":"https:\/\/api.wordpress.org\/events\/1.0\/","request_args":{"body":{"number":5,"ip":"192.168.99.0","locale":"en_GB","timezone":"America\/New_York"}},"response_code":200,"response_body":{"location":{"ip":"47.197.97.47"},"events":"5 events trimmed."}}';

$cleantext = preg_replace('~\{(?:[^{}]|(?R))*\}~', '', $text);

echo $cleantext;

This yields:

[22-Aug-2017 16:19:58 America/New_York] WP_Community_Events::maybe_log_events_response: Valid response received. Details:

Credit: PHP: How to extract JSON strings out of a string dump

Extract JSON String from Mixed String with Java

There are two ways to achieve the solution:

  1. Using Regex
  2. Write your own parser to achieve the solution

Using Regex

Regex is not the recommended solution, because regular expressions can be very inefficient on some inputs.

If you still want a regex solution, see this.

Write your own parser to achieve the solution:

import java.util.ArrayList;
import java.util.List;

// Prints every top-level balanced {...} block found in the input.
// Note: braces inside JSON string values are counted like any other brace.
void getJsonFromString(String input) {
    List<String> jsons = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int depth = 0; // number of '{' currently open
    for (char eachChar : input.toCharArray()) {
        if (depth == 0 && eachChar != '{') {
            continue; // skip text outside any object
        }
        current.append(eachChar);
        if (eachChar == '{') {
            depth++;
        } else if (eachChar == '}' && --depth == 0) {
            // the outermost object just closed
            jsons.add(current.toString());
            current.setLength(0);
        }
    }
    for (String json : jsons) {
        System.out.println(json);
    }
}

Regex for capturing *only* the value for a specific key out of a JSON string

The RegEx that you link to,

/"Key1": "(.+?)"/i

is actually perfect - it's just that you need to get the output of the capture group, rather than of the whole RegEx, and the tool that you're using (RegExr) doesn't show this.

If you ask for just that group, then you'll have what you wanted, e.g. in JavaScript:

window.onload = () => {
  document.write('{"Key1": "Value1", "Key2": "Value2"}'.match(/"Key1": "(.+?)"/i)[1]);
};
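
Note that the pattern expects exactly one space after the colon. As a rough sketch, assuming the input is a complete JSON string, you can either loosen the whitespace or parse the string and read the key directly. valueViaRegex and valueViaParse are hypothetical helper names.

```javascript
// Hypothetical helper: same idea as the regex above, but \s* tolerates any spacing
// around the colon. `key` is assumed to contain no regex metacharacters.
function valueViaRegex(json, key) {
  const match = json.match(new RegExp('"' + key + '"\\s*:\\s*"(.+?)"', 'i'));
  return match ? match[1] : null;
}

// Hypothetical helper: parse the whole string and read the property.
// Unlike the regex with the i flag, this lookup is case-sensitive.
function valueViaParse(json, key) {
  return JSON.parse(json)[key];
}

const compact = '{"Key1":"Value1","Key2":"Value2"}'; // no space after the colons
console.log(compact.match(/"Key1": "(.+?)"/i)); // null, the original pattern needs ": "
console.log(valueViaRegex(compact, 'Key1'));
console.log(valueViaParse(compact, 'Key1'));
```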

PHP/Regex: Parse JSON in string

PCRE supports recursive matching for this kind of nested structure. Here is a demo:

$data = 'text text 
{"key":{"key":"value{1}","key2":false}}
{"key":{"key":"value2"}}
{"key":{"key":{"key":"value3"}}} text';

$pattern = '(
  \{ # JSON object start
    (
      \s*
      "[^"]+" # key
      \s*:\s* # colon
      (
        # value
        (?:
          "[^"]+" | # string
          \d+(?:\.\d+)? | # number
          true |
          false |
          null
        ) |
        (?R) # pattern recursion
      )
      \s*
      ,? # comma
    )*
  \} # JSON object end
)x';
preg_replace_callback(
  $pattern,
  function ($match) {
    var_dump(json_decode($match[0]));
  },
  $data
);

