Splitting a string into an array in C++ without using vector
You can turn the string into a stream with the std::stringstream
class (its constructor takes a string as a parameter). Once the stream is built, you can use the >>
operator on it, just as on regular file-based streams, to extract whitespace-separated words (tokens) one at a time:
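#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main() {
    string line = "test one two three.";
    string arr[4];
    int count = 0;
    stringstream ssin(line);
    // Stop when the array is full or no more words can be extracted.
    while (count < 4 && ssin >> arr[count]) {
        ++count;
    }
    // Print only the words that were actually read.
    for (int i = 0; i < count; i++) {
        cout << arr[i] << endl;
    }
}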
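The same idea can be wrapped in a small reusable function. The following is only a sketch: the name split_words and its parameters are hypothetical, not part of the answer above. It tests the extraction itself rather than the stream state beforehand, so input that ends in whitespace does not get counted as an extra empty word.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: extracts up to `capacity` whitespace-separated
// words from `line` into `out` and returns how many were stored.
std::size_t split_words(const std::string& line, std::string out[], std::size_t capacity) {
    std::istringstream in(line);
    std::size_t count = 0;
    // The extraction is the loop condition, so a failed read is never counted.
    while (count < capacity && in >> out[count]) {
        ++count;
    }
    return count;
}
```

Called as split_words(line, arr, 4) with a string arr[4], it fills the array and returns the number of words found, which the caller can use as the loop bound when printing.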
Splitting string into words in array without using any pre-made functions in C
You have quite a few errors in your program:
1. arr = (char **)malloc(size * sizeof(char)); is not right, since arr is of type char**. You should use sizeof(char*), or better sizeof(*arr), since sizeof(char) is usually not equal to sizeof(char*) on modern systems.
2. You don't have braces {} around your else statement in ft_split_whitespaces, which you probably intended. So your conditional logic breaks.
3. You are allocating a new char[] for every non-whitespace character in the while loop. You should allocate only one for every new word and then just fill in the characters in that array.
4. *(arr+index2) = &str[index]; This doesn't do what you think it does. It just makes *(arr+index2) point into str at offset index. You either need to copy each character individually or do a memcpy() (which you probably can't use in the question). This explains why your answer just provides offsets into the main string and not the actual tokens.
5. **arr = '\0'; You will lose whatever you store in the 0th index of arr. You need to individually append a \0 to each string in arr.
6. *(arr+index2) = (char*) malloc(index * sizeof(char)); You will end up allocating progressively larger char arrays because you are using index for the count of characters, which keeps on increasing. You need to figure out the correct length of each token in the string and allocate appropriately.
Also, why *(arr + index2)? Why not use the much easier to read arr[index2]?
Further clarifications:
Consider str = "abc de". You'll start with
*(arr + 0) = (char*) malloc(0 * sizeof(char));
// ptr from malloc(0) shouldn't be dereferenced and is mostly pointless (no pun), probably NULL
*(arr + 0) = &str[0];
Here str[0] = 'a' is a location somewhere in memory, so &str[0] stores that address in *(arr + 0).
In the next iteration, you'll have
*(arr + 0) = (char*) malloc(1 * sizeof(char));
*(arr + 0) = &str[1];
This time you replace the earlier malloc'd array at the same index2 with a different address, which leaks it. In the next iteration you get *(arr + 0) = (char*) malloc(2 * sizeof(char)); and so on. You end up resetting the same *(arr + index2) position until you encounter whitespace, after which you do the same thing again for the next word. So don't allocate an array for every index value, but only if and when required. This also shows that the size passed to malloc keeps growing with index, which is what #6 indicated.
Coming to &str[index]: you are setting *(arr + index2), i.e. a char* (pointer to char), to another char*. In C, assigning one pointer to another doesn't copy the contents the second pointer refers to; it only makes both of them point to the same memory location. So when you write something like *(arr + 1) = &str[4], it's just a pointer into the original string at index = 4. If you print *(arr + 1), you'll get the substring from index = 4 to the end of the string, not the word you're trying to obtain.
**arr = '\0' just dereferences the pointer at *arr and sets the character it points to to \0. So if *(arr + 0) pointed to "hello\0", it would become "\0ello\0". Anything iterating over this string never gets past the first '\0' character, so you lose whatever *arr was pointing to earlier.
Also, *(arr + i) and arr[i] are exactly equivalent, and the latter makes for much better readability. It better conveys that arr is an array and that arr[i] is dereferencing the i-th element.
Split string by a character?
stringstream can do all of this.
Split a string and store the pieces in an int array:
// needs <algorithm>, <sstream>, <string>, <vector> and using namespace std
string str = "102:330:3133:76531:451:000:12:44412";
std::replace(str.begin(), str.end(), ':', ' '); // replace ':' by ' '
vector<int> array;
stringstream ss(str);
int temp;
while (ss >> temp)
    array.push_back(temp); // done! now array = {102, 330, 3133, 76531, 451, 0, 12, 44412}
Remove unneeded characters such as $ and # from the string before it's processed, in the same way that : is handled above.
PS: The above solution works only for strings that don't contain spaces. To handle strings with spaces, use an approach based on std::string::find() and std::string::substr().
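For fields that may contain spaces, one alternative (a different technique from the replace trick above) is std::getline with a delimiter argument, which stops only at the chosen character. A minimal sketch, with split_on_char as a hypothetical name:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on a single character. Spaces inside
// a field are preserved because std::getline stops only at `delim`.
std::vector<std::string> split_on_char(const std::string& text, char delim) {
    std::vector<std::string> fields;
    std::istringstream in(text);
    std::string field;
    while (std::getline(in, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}
```

Note that, unlike the >> based version, this keeps empty fields between consecutive delimiters, so "a::b" gives three entries. Numeric fields can then be converted one by one, for example with std::stoi.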
C - Split string into an array of strings at certain characters
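You can use strtok() for this. Take str as "apples,cakes,cupcakes,bananas" and delim as ",". Note that strtok() modifies the buffer it scans, so str must be a writable char array, not a string literal.
#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[] = "apples,cakes,cupcakes,bananas"; /* writable copy */
    const char *delim = ",";
    char *token = strtok(str, delim);
    while (token != NULL)
    {
        printf("%s\n", token);
        token = strtok(NULL, delim); /* NULL continues with the same string */
    }
    return 0;
}
Hope this helps.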
Parse (split) a string in C++ using string delimiter (standard C++)
You can use the std::string::find()
function to find the position of your string delimiter, then use std::string::substr()
to get a token.
Example:
std::string s = "scott>=tiger";
std::string delimiter = ">=";
std::string token = s.substr(0, s.find(delimiter)); // token is "scott"
The find(const string& str, size_t pos = 0) function returns the position of the first occurrence of str in the string, or npos if the string is not found.
The substr(size_t pos = 0, size_t n = npos) function returns a substring of the object, starting at position pos and of length n (or up to the end of the string if n is npos).
If the delimiter occurs more than once, you can remove each token (delimiter included) after extracting it and then proceed with the next extraction. If you want to preserve the original string, work on a copy, or keep a start position and pass it to find() as its second argument instead of erasing:
s.erase(0, s.find(delimiter) + delimiter.length());
This way you can easily loop to get each token.
Complete Example
std::string s = "scott>=tiger>=mushroom";
std::string delimiter = ">=";
size_t pos = 0;
std::string token;
while ((pos = s.find(delimiter)) != std::string::npos) {
    token = s.substr(0, pos);
    std::cout << token << std::endl;
    s.erase(0, pos + delimiter.length());
}
std::cout << s << std::endl;
Output:
scott
tiger
mushroom
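If the original string has to stay intact, the same find()/substr() loop can advance a start position instead of erasing. This is a sketch only; the function name split_by_delimiter and the choice to return a std::vector are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `s` on a multi-character delimiter
// without modifying `s`.
std::vector<std::string> split_by_delimiter(const std::string& s,
                                            const std::string& delimiter) {
    std::vector<std::string> tokens;
    // An empty delimiter would match at every position and never advance.
    if (delimiter.empty()) {
        tokens.push_back(s);
        return tokens;
    }
    std::size_t start = 0;
    std::size_t pos;
    // find() resumes the search from `start` on each iteration.
    while ((pos = s.find(delimiter, start)) != std::string::npos) {
        tokens.push_back(s.substr(start, pos - start));
        start = pos + delimiter.length();
    }
    tokens.push_back(s.substr(start));  // whatever follows the last delimiter
    return tokens;
}
```

For "scott>=tiger>=mushroom" and ">=" it returns the three tokens scott, tiger and mushroom, matching the output above, and s is left unchanged.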
How can I split a string by a delimiter into an array?
Here's my first attempt at this using vectors and strings:
vector<string> explode(const string& str, const char& ch) {
    string next;
    vector<string> result;
    // For each character in the string
    for (string::const_iterator it = str.begin(); it != str.end(); it++) {
        // If we've hit the terminal character
        if (*it == ch) {
            // If we have some characters accumulated
            if (!next.empty()) {
                // Add them to the result vector
                result.push_back(next);
                next.clear();
            }
        } else {
            // Accumulate the next character into the sequence
            next += *it;
        }
    }
    // Add whatever is left after the last delimiter
    if (!next.empty())
        result.push_back(next);
    return result;
}
Hopefully this gives you some sort of idea of how to go about this. On your example string it returns the correct results with this test code:
int main (int, char const **) {
    std::string blah = "___this_ is__ th_e str__ing we__ will use__";
    std::vector<std::string> result = explode(blah, '_');
    for (size_t i = 0; i < result.size(); i++) {
        cout << "\"" << result[i] << "\"" << endl;
    }
    return 0;
}