Add Spaces Before Capital Letters

Insert space before capital letters

You can just add a space before every uppercase character and then trim off the leading and trailing spaces:

s = s.replace(/([A-Z])/g, ' $1').trim()
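For comparison, here is a rough Python sketch of the same idea using re.sub. The function name add_spaces is made up for this example. It also shows the main limitation of the simple pattern: every capital gets a space, so acronyms are pulled apart.

```python
import re

def add_spaces(s: str) -> str:
    """Insert a space before every ASCII capital, then trim both ends."""
    # add_spaces is a hypothetical name used only for this sketch.
    return re.sub(r"([A-Z])", r" \1", s).strip()

print(add_spaces("HelloWorldAgain"))  # Hello World Again
print(add_spaces("parseHTMLPage"))    # parse H T M L Page
```

If the input can contain acronyms, one of the acronym-aware approaches is probably a better fit.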

Add spaces before Capital Letters

The regexes will work fine (I even voted up Martin Brown's answer), but they are expensive (and personally I find any pattern longer than a couple of characters prohibitively obtuse).

This function

// Requires: using System.Text; (for StringBuilder)
string AddSpacesToSentence(string text, bool preserveAcronyms)
{
    if (string.IsNullOrWhiteSpace(text))
        return string.Empty;
    StringBuilder newText = new StringBuilder(text.Length * 2);
    newText.Append(text[0]);
    for (int i = 1; i < text.Length; i++)
    {
        if (char.IsUpper(text[i]))
            // Space before a capital that follows a non-space, non-capital character,
            // or (optionally) before the last capital of an acronym.
            if ((text[i - 1] != ' ' && !char.IsUpper(text[i - 1])) ||
                (preserveAcronyms && char.IsUpper(text[i - 1]) &&
                 i < text.Length - 1 && !char.IsUpper(text[i + 1])))
                newText.Append(' ');
        newText.Append(text[i]);
    }
    return newText.ToString();
}

will do it 100,000 times in 2,968,750 ticks; the regex takes 25,000,000 ticks (and that's with the regex compiled).

It's better, for a given value of better (i.e. faster); however, it's more code to maintain. "Better" is often a compromise between competing requirements.

Hope this helps :)

Update

It's a good long while since I looked at this, and I just realised the timings haven't been updated since the code changed (it only changed a little).

On a string with 'Abbbbbbbbb' repeated 100 times (i.e. 1,000 characters), a run of 100,000 conversions takes the hand-coded function 4,517,177 ticks, and the Regex below takes 59,435,719, so the hand-coded function runs in 7.6% of the time the Regex takes.

Update 2
Will it take Acronyms into account? It will now!
The logic of the if statement is fairly obscure, as you can see by expanding it to this ...

if (char.IsUpper(text[i]))
    if (char.IsUpper(text[i - 1]))
        if (preserveAcronyms && i < text.Length - 1 && !char.IsUpper(text[i + 1]))
            newText.Append(' ');
        else ;
    else if (text[i - 1] != ' ')
        newText.Append(' ');

... doesn't help at all!
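One thing that might help is giving each half of the condition a name. Below is a rough Python port of the C# loop, written only to illustrate the logic. The names add_spaces_to_sentence, starts_word and ends_acronym are my own and do not appear in the original code.

```python
def add_spaces_to_sentence(text: str, preserve_acronyms: bool = True) -> str:
    """Illustrative Python port of the C# loop, with named conditions."""
    if not text or text.isspace():
        return ""
    out = [text[0]]
    for i in range(1, len(text)):
        cur, prev = text[i], text[i - 1]
        if cur.isupper():
            # Case 1: a capital after a non-space, non-capital character
            # starts a new word.
            starts_word = prev != " " and not prev.isupper()
            # Case 2: the last capital in a run, followed by a non-capital,
            # starts the word after an acronym (the "P" in "HTMLParser").
            ends_acronym = (
                preserve_acronyms
                and prev.isupper()
                and i < len(text) - 1
                and not text[i + 1].isupper()
            )
            if starts_word or ends_acronym:
                out.append(" ")
        out.append(cur)
    return "".join(out)

print(add_spaces_to_sentence("HTMLParserTest"))         # HTML Parser Test
print(add_spaces_to_sentence("HTMLParserTest", False))  # HTMLParser Test
```

With preserve_acronyms switched off, a run of capitals is never split, so the acronym stays glued to the word that follows it.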

Here's the original simple method that doesn't worry about acronyms:

string AddSpacesToSentence(string text)
{
    if (string.IsNullOrWhiteSpace(text))
        return "";
    StringBuilder newText = new StringBuilder(text.Length * 2);
    newText.Append(text[0]);
    for (int i = 1; i < text.Length; i++)
    {
        if (char.IsUpper(text[i]) && text[i - 1] != ' ')
            newText.Append(' ');
        newText.Append(text[i]);
    }
    return newText.ToString();
}

Add spaces before Capital Letters then turn them to lowercase string

The reason it gets weird with strings that have more than one capital letter is that every time you find one, you insert a blank space into the array, which shifts every following index by one.

The workaround is simple: keep a counter, splitCount, of how many spaces you've inserted so far, and add it to the index i to correct the position.

function isUpper(str) {
    // true only if str contains an ASCII capital and no lowercase letter
    return !/[a-z]/.test(str) && /[A-Z]/.test(str);
}

function capSpace(txt) {
    const arr = Array.from(txt);
    let splitCount = 0; // number of spaces inserted so far
    for (let i = 1; i < txt.length; i++) {
        if (isUpper(txt[i])) {
            // offset the index by the spaces already inserted
            arr.splice(i + splitCount, 0, ' ');
            splitCount++; // increase every time you split
        }
    }
    return arr.join('').toLowerCase();
}
console.log(capSpace('iLikeSwimming')); // "i like swimming"
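The same offset bookkeeping can be sketched in Python, in case it helps to see it outside of splice. The names cap_space and inserted are just illustrative choices.

```python
def cap_space(txt: str) -> str:
    """Insert a space before each capital (from index 1 on), then lowercase."""
    chars = list(txt)
    inserted = 0  # how many spaces have been inserted so far
    for i in range(1, len(txt)):
        if txt[i].isupper():
            # The character at position i in txt now sits at i + inserted in chars.
            chars.insert(i + inserted, " ")
            inserted += 1
    return "".join(chars).lower()

print(cap_space("iLikeSwimming"))  # i like swimming
```

Walking the string from the end toward the start would avoid the counter altogether, since an insert never shifts positions that have not been visited yet.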

How do I insert space before capital letter if and only if previous letter is not capital?

You should use regular expressions carefully. They can easily turn into gargantuan monsters nobody can understand. You can solve your problem with a simple loop instead of a regexp:

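a = 'SMThingAnotherThingBIGCapitalLetters'
result = a[0]

# Start at index 1: the first character never needs a space before it.
for i in range(1, len(a)):
    letter = a[i]
    # Guard the look-ahead so a trailing capital does not raise IndexError.
    next_is_lower = i + 1 < len(a) and a[i + 1].islower()
    if letter.isupper() and (result[-1].islower() or next_is_lower):
        result += ' '
    result += letter
result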

'SM Thing Another Thing BIG Capital Letters'

Add space before capital letters in a dataframe or column in python using regex

Use Series.str.replace to put a space before each uppercase letter, then strip the leading space:

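import pandas as pd

df = pd.DataFrame({'U.N.Region': ['WestAfghanistan', 'NorthEastAfghanistan']})

# regex=True is required in pandas 2.0+, where str.replace defaults to literal matching
df['U.N.Region'] = df['U.N.Region'].str.replace(r"([A-Z])", r" \1", regex=True).str.strip()
print(df)
               U.N.Region
0        West Afghanistan
1  North East Afghanistan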

Add spaces before capital letter

This is certainly not a query that I'm very proud of, but it does the job (assuming you only care about the capital letters A to Z of the English alphabet):

SELECT 
LTRIM(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
YourCol COLLATE Latin1_General_100_BIN2
,'A',' A')
,'B',' B')
,'C',' C')
,'D',' D')
,'E',' E')
,'F',' F')
,'G',' G')
,'H',' H')
,'I',' I')
,'J',' J')
,'K',' K')
,'L',' L')
,'M',' M')
,'N',' N')
,'O',' O')
,'P',' P')
,'Q',' Q')
,'R',' R')
,'S',' S')
,'T',' T')
,'U',' U')
,'V',' V')
,'W',' W')
,'X',' X')
,'Y',' Y')
,'Z',' Z')
)
FROM YourTable
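If typing out 26 nested calls by hand is the painful part, the query text could be generated instead. Here is a small Python sketch that builds the same expression as a string. The function name nested_replace_sql is made up, and YourCol and YourTable are the placeholder names from the query above. It interpolates identifiers directly, so it is only meant for names you control.

```python
import string

def nested_replace_sql(column: str = "YourCol", table: str = "YourTable") -> str:
    """Build the 26-level nested REPLACE query as a single string."""
    # Innermost expression: the column with a binary (case-sensitive) collation.
    expr = f"{column} COLLATE Latin1_General_100_BIN2"
    for letter in string.ascii_uppercase:
        # Wrap the expression so far in one more REPLACE: A innermost, Z outermost.
        expr = f"REPLACE({expr}, '{letter}', ' {letter}')"
    return f"SELECT LTRIM({expr}) FROM {table}"

sql = nested_replace_sql()
print(sql)
```

The output is one long line rather than the indented layout above, but it should be the same query.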

A pythonic way to insert a space before capital letters

You could try:

>>> import re
>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWord")
'Word Word Word'
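One caveat with that pattern: it consumes the character before each capital, so a run of capitals is only split at every other position. If acronyms matter, a lookaround-based variant might behave better. The sketch below is one possible pattern, not the only one, and the names BOUNDARY and split_words are my own.

```python
import re

# Zero-width boundaries: lowercase/digit followed by a capital,
# or a capital followed by a capital that starts a lowercase word.
BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")

def split_words(s: str) -> str:
    """Insert a space at each word boundary without consuming characters."""
    return BOUNDARY.sub(" ", s)

print(re.sub(r"(\w)([A-Z])", r"\1 \2", "ABCWord"))  # A BC Word
print(split_words("ABCWord"))                        # ABC Word
print(split_words("WordWordWord"))                   # Word Word Word
```

Because the lookarounds match positions rather than characters, nothing is consumed and acronyms such as ABC stay in one piece.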

