Split a string that has white spaces, unless they are enclosed within quotes?
string input = "one \"two two\" three \"four four\" five six";
var parts = Regex.Matches(input, @"[\""].+?[\""]|[^ ]+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();
C++ Split a string by blank spaces unless it is enclosed in quotes and store in a vector
Here is a working example:
#include <string>
#include <vector>
#include <iostream>
using namespace std;
int main(void) {
    string str = "12345 Hello World \"This is a group\"";
    vector<string> v;
    size_t i = 0, j = 0, begin = 0;
    while(i < str.size()) {
        if(str[i] == ' ' || i == 0) {
            if(i + 1 < str.size() && str[i + 1] == '\"') {
                j = begin + 1;
                while(j < str.size() && str[j++] != '\"');
                v.push_back(std::string(str, begin, j - 1 - i));
                begin = j - 1;
                i = j - 1;
                continue;
            }
            j = begin + 1;
            while(j < str.size() && str[j++] != ' ');
            v.push_back(std::string(str, begin, j - 1 - i - (i ? 1 : 0) ));
            begin = j;
        }
        ++i;
    }
    for(auto& str: v)
        cout << str << endl;
    return 0;
}
Output:
12345
Hello
World
"This is a group"
However, note that this code is for demonstration only, since it doesn't handle all cases. For example, if you have only one double quote in your input, then the loop while(j < str.size() && str[j++] != '\"'); will cause the rest of the string from that point on to not be split.
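One way to make the unterminated-quote case explicit is a small character-by-character state machine. The following is only a sketch in Python, not a port of the C++ code above; the function name split_outside_quotes and the choice to raise ValueError on an unbalanced quote are assumptions made for illustration.

```python
def split_outside_quotes(text, quote='"'):
    """Split on spaces, keeping quoted runs (quotes included) as one token."""
    tokens = []
    current = []
    in_quotes = False
    for ch in text:
        if ch == quote:
            # Toggle quoted state and keep the quote character in the token.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == ' ' and not in_quotes:
            # A space outside quotes ends the current token (if any).
            if current:
                tokens.append(''.join(current))
                current = []
        else:
            current.append(ch)
    if in_quotes:
        # Assumption: treat an unbalanced quote as an error instead of
        # silently swallowing the rest of the input.
        raise ValueError("unterminated quote")
    if current:
        tokens.append(''.join(current))
    return tokens


print(split_outside_quotes('12345 Hello World "This is a group"'))
# ['12345', 'Hello', 'World', '"This is a group"']
```

Raising on an unbalanced quote is a design choice; returning the partial token instead would also be reasonable, depending on how forgiving the caller needs to be.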
Regular Expression to split on spaces unless in quotes
No options required
Regex:
\w+|"[\w\s]*"
C#:
Regex regex = new Regex(@"\w+|""[\w\s]*""");
Or if you need to exclude " characters:
Regex
    .Matches(input, @"(?<match>\w+)|\""(?<match>[\w\s]*)""")
    .Cast<Match>()
    .Select(m => m.Groups["match"].Value)
    .ToList()
    .ForEach(s => Console.WriteLine(s));
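One thing to keep in mind with this pattern is that \w only covers word characters, so punctuation outside or inside the quotes breaks tokens apart. Below is a rough Python sketch of the same regex that shows this; the helper name word_tokens is made up for the example.

```python
import re

# Same pattern as above: a run of word characters, or a double-quoted
# run of word characters and whitespace.
PATTERN = re.compile(r'\w+|"[\w\s]*"')


def word_tokens(text):
    # Hypothetical helper: return every match of the pattern, in order.
    return PATTERN.findall(text)


print(word_tokens('one "two two" three'))
# ['one', '"two two"', 'three']
print(word_tokens('foo-bar "a, b"'))
# ['foo', 'bar', 'a', 'b']
```

If your input can contain punctuation, a pattern based on "anything that is not a space or a quote" is probably a better fit than one based on \w.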
Splitting string on spaces unless in double quotes but double quotes can have a preceding string attached
We can do this using a formal pattern matcher. The secret sauce of the answer below is the little-used Matcher#appendReplacement method. We pause at each match, and then append a custom replacement for anything appearing inside a pair of quotes. The custom method removeSpaces() strips all whitespace from each quoted term.
public static String removeSpaces(String input) {
    return input.replaceAll("\\s+", "");
}

String input = "abc test=\"x y z\" magic=\" hello \" hola";
Pattern p = Pattern.compile("\"(.*?)\"");
Matcher m = p.matcher(input);
StringBuffer sb = new StringBuffer("");
while (m.find()) {
    m.appendReplacement(sb, "\"" + removeSpaces(m.group(1)) + "\"");
}
m.appendTail(sb);
String[] parts = sb.toString().split("\\s+");
for (String part : parts) {
    System.out.println(part);
}
Output:
abc
test="xyz"
magic="hello"
hola
The big caveat here, as the comments above hinted, is that we are really using a regex engine as a rudimentary parser. To see where my solution would fail fast, just remove one of the quotes from a quoted term. But if you are sure your input is well formed, as in the example you have shown us, this answer might work for you.
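For comparison, here is roughly the same idea sketched in Python, where re.sub with a callback plays the role of appendReplacement/appendTail. The function names remove_spaces and split_keep_attached are made up for this example, and str.split() stands in for the Java split("\\s+").

```python
import re


def remove_spaces(text):
    # Mirrors the removeSpaces() helper above: drop every whitespace run.
    return re.sub(r"\s+", "", text)


def split_keep_attached(text):
    # Rewrite each "..." span with its inner whitespace removed, then
    # split the rewritten string on whitespace.
    collapsed = re.sub(
        r'"(.*?)"',
        lambda m: '"' + remove_spaces(m.group(1)) + '"',
        text,
    )
    return collapsed.split()


print(split_keep_attached('abc test="x y z" magic=" hello " hola'))
# ['abc', 'test="xyz"', 'magic="hello"', 'hola']
```

The same caveat applies: if a quote is missing, the regex pairs up the wrong quotes and terms get merged without any error being reported.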
How to split on white spaces not between quotes?
\s(?=(?:[^'"`]*(['"`])[^'"`]*\1)*[^'"`]*$)
You can split on this regex, which uses a lookahead. See the demo:
https://regex101.com/r/5I209k/4
Or, if the input mixes quote types:
https://regex101.com/r/5I209k/7
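To try the lookahead idea in Python, note that re.split also returns the text of any capturing group, so the backreference version above would need extra filtering. The sketch below therefore uses a simplified variant that handles double quotes only (that simplification is mine, not part of the answer above); split_on_lookahead is a hypothetical helper name.

```python
import re

# Split on a whitespace character only if the rest of the string contains
# an even number of double quotes, i.e. we are not inside a quoted run.
# Simplified, double-quote-only variant with no capturing groups.
SPLIT_OUTSIDE_DOUBLE_QUOTES = re.compile(r'\s(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_on_lookahead(text):
    return SPLIT_OUTSIDE_DOUBLE_QUOTES.split(text)


print(split_on_lookahead('one "two two" three'))
# ['one', '"two two"', 'three']
```

Because the pattern matches a single whitespace character, consecutive spaces outside quotes produce empty strings in the result. The lookahead also rescans to the end of the string for every candidate space, so this is probably not the approach to pick for very long inputs.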
Split a string by spaces -- preserving quoted substrings -- in Python
You want split, from the built-in shlex module.
>>> import shlex
>>> shlex.split('this is "a test"')
['this', 'is', 'a test']
This should do exactly what you want.
If you want to preserve the quotation marks, then you can pass the posix=False kwarg.
>>> shlex.split('this is "a test"', posix=False)
['this', 'is', '"a test"']
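One behaviour worth knowing about: shlex.split raises ValueError when a quote is never closed. If your input may be malformed, you could wrap the call along these lines. The wrapper name safe_split and the choice to return None on failure are only illustrative.

```python
import shlex


def safe_split(text):
    """Return the shlex tokens, or None if the quotes are unbalanced."""
    try:
        return shlex.split(text)
    except ValueError:
        # shlex raises ValueError (e.g. "No closing quotation") when a
        # quoted section is never terminated.
        return None


print(safe_split('this is "a test"'))
# ['this', 'is', 'a test']
print(safe_split('this is "a test'))
# None
```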
Regex for splitting a string using space when not surrounded by single or double quotes
I don't understand why all the others are proposing such complex regular expressions or such long code. Essentially, you want to grab two kinds of things from your string: sequences of characters that aren't spaces or quotes, and sequences of characters that begin and end with a quote, with no quotes in between, for two kinds of quotes. You can easily match those things with this regular expression:
[^\s"']+|"([^"]*)"|'([^']*)'
I added the capturing groups because you don't want the quotes in the list.
This Java code builds the list, adding the capturing group if it matched to exclude the quotes, and adding the overall regex match if the capturing group didn't match (an unquoted word was matched).
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile("[^\\s\"']+|\"([^\"]*)\"|'([^']*)'");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    if (regexMatcher.group(1) != null) {
        // Add double-quoted string without the quotes
        matchList.add(regexMatcher.group(1));
    } else if (regexMatcher.group(2) != null) {
        // Add single-quoted string without the quotes
        matchList.add(regexMatcher.group(2));
    } else {
        // Add unquoted word
        matchList.add(regexMatcher.group());
    }
}
If you don't mind having the quotes in the returned list, you can use much simpler code:
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile("[^\\s\"']+|\"[^\"]*\"|'[^']*'");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}
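The same regular expression carries over to other engines more or less unchanged. As a sketch, here is the capturing-group version in Python; split_strip_quotes is a made-up name. Note the explicit "is not None" checks: an empty quoted string such as "" yields an empty group, which is falsy but still a match.

```python
import re

# Unquoted word, or double-quoted text (group 1), or single-quoted text
# (group 2). Same pattern as the Java example above.
TOKEN = re.compile(r"""[^\s"']+|"([^"]*)"|'([^']*)'""")


def split_strip_quotes(text):
    result = []
    for m in TOKEN.finditer(text):
        if m.group(1) is not None:
            # Double-quoted string without the quotes
            result.append(m.group(1))
        elif m.group(2) is not None:
            # Single-quoted string without the quotes
            result.append(m.group(2))
        else:
            # Unquoted word
            result.append(m.group(0))
    return result


print(split_strip_quotes("""one "two two" 'three three' four"""))
# ['one', 'two two', 'three three', 'four']
```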
javascript split string by space, but ignore space in quotes (notice not to split by the colon too)
s = 'Time:"Last 7 Days" Time:"Last 30 Days"'
s.match(/(?:[^\s"]+|"[^"]*")+/g)
// -> ['Time:"Last 7 Days"', 'Time:"Last 30 Days"']
Explained:
(?:         # non-capturing group
  [^\s"]+   # anything that's not a space or a double-quote
  |         # or…
  "         # opening double-quote
    [^"]*   # …followed by zero or more characters that are not a double-quote
  "         # …closing double-quote
)+          # each match is one or more of the things described in the group
Turns out, to fix your original expression, you just need to add a + on the group:
str.match(/(".*?"|[^"\s]+)+(?=\s*|\s*$)/g)
#                         ^ here.
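The first pattern is not specific to JavaScript. As a quick sketch, the same expression with Python's re.findall gives the same tokens; the function name split_keep_prefix is invented for the example. Because the group is non-capturing, findall returns the whole match for each token.

```python
import re


def split_keep_prefix(text):
    # One token = one or more pieces, where each piece is either a run of
    # non-space, non-quote characters or a complete double-quoted string.
    return re.findall(r'(?:[^\s"]+|"[^"]*")+', text)


print(split_keep_prefix('Time:"Last 7 Days" Time:"Last 30 Days"'))
# ['Time:"Last 7 Days"', 'Time:"Last 30 Days"']
```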