Chop a String in Ruby into Fixed Length String Ignoring (Not Considering/Regardless) New Line or Space Characters

"This is some\nText\nThis is some text".scan(/.{1,17}/m)
# => ["This is some\nText", "\nThis is some tex", "t"]
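The one-liner above can be wrapped in a small helper. This is only a sketch: the method name fixed_width_chunks and its width parameter are made up for illustration and are not part of the original answer. The /m flag is what lets . match "\n" in Ruby, so a line break counts like any other character.

```ruby
# Hypothetical helper: split text into pieces of at most `width` characters,
# treating newlines and spaces like any other character.
def fixed_width_chunks(text, width)
  unless width.is_a?(Integer) && width.positive?
    raise ArgumentError, 'width must be a positive integer'
  end

  # /m makes `.` match "\n" as well, so a line break does not end a chunk early.
  text.scan(/.{1,#{width}}/m)
end

source = "This is some\nText\nThis is some text"
chunks = fixed_width_chunks(source, 17)
p chunks
# Joining the chunks gives back the original string, so nothing is dropped.
p chunks.join == source
```

Every chunk except possibly the last has exactly width characters.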

How to split the string by certain amount of characters in Ruby

Using Enumerable#each_slice:

'some_string'.chars.each_slice(3).map(&:join)
# => ["som", "e_s", "tri", "ng"]

Using a regular expression:

'some_string'.scan(/.{1,3}/)
# => ["som", "e_s", "tri", "ng"]
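The two approaches only agree when the string has no line breaks. Without the /m flag, . does not match "\n" in Ruby, so scan silently drops newlines, while each_slice keeps them. A small sketch to show the difference (the method names chunk_by_slice and chunk_by_scan are hypothetical, chosen for this example):

```ruby
# Hypothetical helpers wrapping the two approaches shown above.
def chunk_by_slice(text, size)
  text.chars.each_slice(size).map(&:join)
end

def chunk_by_scan(text, size, multiline: false)
  # With multiline: true the /m flag is set, so `.` also matches "\n".
  pattern = multiline ? /.{1,#{size}}/m : /.{1,#{size}}/
  text.scan(pattern)
end

text = "ab\ncdefg"
p chunk_by_slice(text, 3)                  # => ["ab\n", "cde", "fg"]
p chunk_by_scan(text, 3)                   # => ["ab", "cde", "fg"]  (newline lost)
p chunk_by_scan(text, 3, multiline: true)  # => ["ab\n", "cde", "fg"]
```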

How to remove line break when reading files in Ruby

Your code has a minor issue that causes the results you are experiencing.

When you use:

name1 = File.readlines('first.txt').sample(1)

the returned value is not a String but an Array containing one random element, e.g.:

["Jhon"]

This is why you get the output ["Jhon"] when using print.

Since you expect (and prefer) a string, try this instead:

name1 = File.readlines('first.txt').sample(1)[0]
name2 = File.readlines('middle.txt').sample(1)[0]
name3 = File.readlines('last.txt').sample(1)[0]

or:

name1 = File.readlines('first.txt').sample(1).pop
name2 = File.readlines('middle.txt').sample(1).pop
name3 = File.readlines('last.txt').sample(1).pop

or, probably what you meant: called with no arguments, sample returns a single element instead of an Array:

name1 = File.readlines('first.txt').sample
name2 = File.readlines('middle.txt').sample
name3 = File.readlines('last.txt').sample

Also, when printing, it is better to build a single string that contains all the spaces and formatting you want, e.g.:

name1 = File.readlines('first.txt').sample(1).pop
name2 = File.readlines('middle.txt').sample(1).pop
name3 = File.readlines('last.txt').sample(1).pop

puts "#{name1} #{name2} #{name3}."
# or
print "#{name1} #{name2} #{name3}."
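Note that File.readlines keeps the trailing line break on every line, so the names above would still end in "\n". One way to drop it is the chomp: true keyword (available since Ruby 2.4). The sketch below uses a throwaway file with made-up names in place of first.txt, and the helper name random_line is hypothetical:

```ruby
require 'tmpdir'

# Hypothetical helper: pick one random line from a file, without its line break.
def random_line(path)
  # chomp: true strips the trailing line break from every line that is read.
  File.readlines(path, chomp: true).sample
end

# Demo with a temporary file; the contents are invented sample data.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'first.txt')
  File.write(path, "John\nJane\n")

  p File.readlines(path)               # lines still end in "\n"
  p File.readlines(path, chomp: true)  # line breaks removed
  puts "#{random_line(path)} Smith."
end
```

Calling chomp on the sampled string (File.readlines(path).sample.chomp) is an alternative when only one line is needed.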

C# string manipulation handling white space with \r\n inbetween

Thanks to the patterns provided by @Rik and @Tom Fenech, I found a way around the problem by using:

string pattern = "\\s+";
string replacement = " ";
Regex rgx = new Regex(pattern);
string test = rgx.Replace(str, replacement);

to collapse every run of white space in the string into a single space, and:

test = test.Replace("\r\n", String.Empty);

to remove the unnecessary line breaks. (Because \s also matches \r and \n, the first replacement already turns each line break into a space, so this second call finds nothing left to remove.) I could then match "value item" and get the index.

Thanks for the answers guys.

Regular expression to match a line that doesn't contain a word

The notion that regex doesn't support inverse matching is not entirely true. You can mimic this behavior by using negative look-arounds:

^((?!hede).)*$

Non-capturing variant:

^(?:(?!hede).)*$

The regex above will match any string, or line without a line break, not containing the (sub)string 'hede'. As mentioned, this is not something regex is "good" at (or should do), but still, it is possible.

And if you need to match line break chars as well, use the DOT-ALL modifier (the trailing s in the following pattern):

/^((?!hede).)*$/s

or use it inline:

/(?s)^((?!hede).)*$/

(where the /.../ are the regex delimiters, i.e., not part of the pattern)

If the DOT-ALL modifier is not available, you can mimic the same behavior with the character class [\s\S]:

/^((?!hede)[\s\S])*$/

Explanation

A string is just a list of n characters. Before, and after each character, there's an empty string. So a list of n characters will have n+1 empty strings. Consider the string "ABhedeCD":

    ┌──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┐
S = │e1│ A │e2│ B │e3│ h │e4│ e │e5│ d │e6│ e │e7│ C │e8│ D │e9│
    └──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┘

index    0      1      2      3      4      5      6      7

where the e's are the empty strings. The regex (?!hede). looks ahead to check that the substring "hede" does not start at the current position, and if that is the case (so something else is seen), then the . (dot) will match any character except a line break. Look-arounds are also called zero-width assertions because they don't consume any characters. They only assert/validate something.

So, in my example, every empty string is first validated to see if there's no "hede" up ahead, before a character is consumed by the . (dot). The regex (?!hede). will do that only once, so it is wrapped in a group, and repeated zero or more times: ((?!hede).)*. Finally, the start- and end-of-input are anchored to make sure the entire input is consumed: ^((?!hede).)*$

As you can see, the input "ABhedeCD" will fail because on e3, the regex (?!hede) fails (there is "hede" up ahead!).
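To try this in Ruby, here is a minimal sketch. One Ruby-specific assumption: ^ and $ anchor each line in Ruby rather than the whole input, so the sketch uses \A and \z to require that the entire string is consumed. The constant and method names are hypothetical.

```ruby
# \A and \z anchor the whole string in Ruby (^ and $ anchor each line instead).
WITHOUT_HEDE = /\A(?:(?!hede).)*\z/

# Hypothetical helper: true when `line` does not contain the substring "hede".
def without_hede?(line)
  WITHOUT_HEDE.match?(line)
end

%w[ABCD ABhedeCD hede hed].each do |s|
  puts "#{s.inspect}: #{without_hede?(s)}"
end
# "ABCD": true
# "ABhedeCD": false
# "hede": false
# "hed": true
```

In plain Ruby code, !line.include?('hede') is simpler; the regex form is mainly useful when only a pattern can be supplied.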

How do I match any character across multiple lines in a regular expression?

It depends on the language, but there should be a modifier that you can add to the regex pattern. In PHP it is:

/(.*)<FooBar>/s

The s at the end causes the dot to match all characters including newlines.
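In Ruby the equivalent modifier is m: /.../m makes the dot match newlines too. A small sketch to compare both behaviours (the sample text and the method name before_foobar are made up for illustration):

```ruby
# Hypothetical helper: return whatever the group (.*) captures before <FooBar>.
def before_foobar(text, dot_matches_newline:)
  pattern = dot_matches_newline ? /(.*)<FooBar>/m : /(.*)<FooBar>/
  text[pattern, 1] # String#[] with a regex and a group number returns that capture
end

text = "first line\nsecond line<FooBar>"
p before_foobar(text, dot_matches_newline: false)  # => "second line"
p before_foobar(text, dot_matches_newline: true)   # => "first line\nsecond line"
```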

How to split a string while ignoring the case of the delimiter?

There's no easy way to accomplish this using string.Split. (Well, except for specifying all the permutations of the split string for each char lower/upper case in an array - not very elegant I think you'll agree.)

However, Regex.Split should do the job quite nicely.

Example:

var parts = Regex.Split(input, "aa", RegexOptions.IgnoreCase);
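For comparison, the same idea in Ruby: String#split accepts a regular expression, so a case-insensitive pattern does the job. This is a sketch only; the method name split_ignoring_case and the sample input are hypothetical.

```ruby
# Hypothetical helper: split `input` on `delimiter`, ignoring the delimiter's case.
def split_ignoring_case(input, delimiter)
  # Regexp.escape keeps delimiter characters such as "." or "*" literal.
  input.split(Regexp.new(Regexp.escape(delimiter), Regexp::IGNORECASE))
end

p split_ignoring_case('oneAAtwoaathreeAafour', 'aa')
# => ["one", "two", "three", "four"]
```

When the delimiter is a fixed literal, input.split(/aa/i) is the shorter form.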

