How to Strip All Non Alphanumeric Characters from a String in C++

Removing non alphanumeric characters from a string in C

The integer bounds you gave don't match the ASCII codes. For example, 'H' is 72.

As the commenters suggest, instead of reading up on the ASCII table, you should use char literals. So,

if ( ( inputString[i] >= 'A' && inputString[i] <= 'Z' )
  || ( inputString[i] >= 'a' && inputString[i] <= 'z' )
  || ( inputString[i] >= '0' && inputString[i] <= '9' ) ) {
    /* inputString[i] is alphanumeric: keep it */
}

You could also avoid all of this by using isalnum from ctype.h.

Remove all non alphabet characters from a string in C -- possible compiler issue

The strcpy function is not guaranteed to work if the ranges overlap, as they do in your case. From C11 7.24.2.3 The strcpy function /2 (my emphasis):

The strcpy function copies the string pointed to by s2 (including the terminating null character) into the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

You can use something like memmove, which does work with overlapping ranges, as per C11 7.24.2.2 The memmove function /2:

The memmove function copies n characters from the object pointed to by s2 into the object pointed to by s1. Copying takes place as if the n characters from the object pointed to by s2 are first copied into a temporary array of n characters that does not overlap the objects pointed to by s1 and s2, and then the n characters from the temporary array are copied into the object pointed to by s1.
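To illustrate the memmove route, here is a small sketch that deletes one character by sliding the rest of the string one position to the left. The helper name remove_char_at is an assumption for this example and does not appear in the original question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: delete the character at index i from a C string in place.
// Source and destination overlap, so memmove is used instead of strcpy.
void remove_char_at(char* s, std::size_t i) {
    assert(s != nullptr);
    std::size_t len = std::strlen(s);
    if (i >= len) return;  // index past the end: nothing to remove
    // Move the tail, including the terminating '\0', one position left.
    std::memmove(s + i, s + i + 1, len - i);
}
```

Calling this once per unwanted character moves the whole tail each time, which is where the quadratic cost of this approach comes from.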


But there's a better solution that's O(n) rather than O(n^2) in time complexity, while still being overlap-safe:

#include <ctype.h>

void strclean (char* src) {
    // Run two pointers in parallel.

    char *dst = src;

    // Process every source character.

    while (*src) {
        // Only copy (and update destination pointer) if suitable.
        // Update source pointer always.
        // The cast avoids undefined behaviour for negative char values.

        if (islower((unsigned char)*src)) *dst++ = *src;
        src++;
    }

    // Finalise destination string.

    *dst = '\0';
}

You'll notice I'm also using islower() (from ctype.h) to detect lower case alphabetic characters. This is more portable since the C standard does not mandate that the alpha characters have consecutive code points (the digits are the only ones guaranteed to be consecutive).

There's also no separate need to check for isalpha() since, as per C11 7.4.1.2 The isalpha function /2, islower() == true implies isalpha() == true:

The isalpha function tests for any character for which isupper or islower is true, or ...

How do I remove all non alphanumeric characters from a string except dash?

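Replace every match of [^a-zA-Z0-9 -] with an empty string. The character class keeps letters, digits, spaces, and the dash; the dash is literal because it is the last character in the class. (The snippet is C#.)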

Regex rgx = new Regex("[^a-zA-Z0-9 -]");
str = rgx.Replace(str, "");
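For a C++ code base, a rough equivalent can be sketched with std::regex from the standard library. The function name keep_alnum_space_dash is invented for this example, and a hand-written loop is typically faster than std::regex, so treat this as an illustration rather than a recommendation.

```cpp
#include <regex>
#include <string>

// Hypothetical helper mirroring the C# call: drop every character that is not
// an ASCII letter, a digit, a space, or a dash.
std::string keep_alnum_space_dash(const std::string& input) {
    // The trailing '-' is literal because it is the last character in the class.
    static const std::regex disallowed(R"([^a-zA-Z0-9 -])");
    return std::regex_replace(input, disallowed, "");
}
```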

Removing non alpha characters in C

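As far as I know, there is no way to do this in better than O(n) time, since every character has to be inspected at least once. Even if a library function or a regex engine did it for you, it would probably be less efficient than a simple linear loop. You can just walk the array and set anything that is not a letter to ' '. Note that a plain range check from 'A' to 'z' is not enough, because ASCII places [ \ ] ^ _ ` between 'Z' and 'a'; isalpha handles this correctly.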

while (*array)
{
    // Cast to unsigned char: passing a negative char to isalpha is undefined.
    if (!isalpha((unsigned char)*array))
        *array = ' ';

    array++;
}

Remove all non-alpha characters from string

You can use std::remove_if along with erase:

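#include <cwctype>
#include <algorithm>
#include <string>
//...
std::wstring FileHandler::removePunctuation(std::wstring word)
{
    // remove_if moves the kept characters to the front; erase trims the leftover tail.
    // The lambda takes wchar_t so wide characters are not truncated to char.
    word.erase(std::remove_if(word.begin(), word.end(),
        [](wchar_t ch){ return !::iswalnum(ch); }), word.end());
    return word;
}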
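For ordinary narrow strings, the same erase/remove_if idiom can be sketched as follows. The free function keep_alnum is a hypothetical name, not part of the FileHandler class above, and it assumes single-byte characters classified by the current C locale.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical narrow-string counterpart: strip everything except letters and digits.
std::string keep_alnum(std::string text) {
    // Taking unsigned char avoids undefined behaviour in isalnum for negative char values.
    auto is_not_alnum = [](unsigned char ch) { return !std::isalnum(ch); };
    // remove_if shifts the kept characters forward and returns the new logical end;
    // erase then drops the leftover tail.
    text.erase(std::remove_if(text.begin(), text.end(), is_not_alnum), text.end());
    return text;
}
```

Passing the string by value lets the caller keep the original while the function edits and returns its own copy.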

cStrings Remove non-alpha/non-space character - C++

Add one line after this loop:

for (int k = i; documentCopy[k] != '\0'; k++)
{
    documentCopy[k] = documentCopy[k+1];
}
i--; // Add this line to your code.

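This works because the shift moves the next character into position i.

For example, if you are checking a[0] and the shift assigns a[0] = a[1], then a[0] now holds the old value of a[1], which has not been checked yet. Decrementing the index makes the outer loop's increment land on the same position again.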


