Split String into Equal Slices/Chunks

What about this?

# L is the chunk length; {1,L} avoids the trailing empty match that {,L} produces
string.scan(/.{1,#{L}}/)

Split a string into N equal parts?

import textwrap
print(textwrap.wrap("123456789", 2))
#prints ['12', '34', '56', '78', '9']

Note: be careful with whitespace. textwrap.wrap breaks on word boundaries and normalizes whitespace, so this may or may not be what you want.

"""Wrap a single paragraph of text, returning a list of wrapped lines.

Reformat the single paragraph in 'text' so it fits in lines of no
more than 'width' columns, and return a list of wrapped lines. By
default, tabs in 'text' are expanded with string.expandtabs(), and
all other whitespace characters (including newline) are converted to
space. See TextWrapper class for available keyword args to customize
wrapping behaviour.
"""
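
To make the whitespace caveat concrete, here is a minimal sketch comparing textwrap.wrap with plain slicing. The helper name chunk_by_slicing and the sample string are assumptions for illustration, not part of the original answer.

```python
import textwrap


def chunk_by_slicing(text, size):
    """Hypothetical helper: fixed-size chunks that keep every character."""
    return [text[i:i + size] for i in range(0, len(text), size)]


sample = "ab cd  ef"

# textwrap.wrap breaks between words and drops the whitespace at line breaks.
print(textwrap.wrap(sample, 2))      # ['ab', 'cd', 'ef']

# Slicing keeps spaces, so joining the chunks gives back the input.
print(chunk_by_slicing(sample, 2))   # ['ab', ' c', 'd ', ' e', 'f']
```

If the input has no whitespace at all, as in the "123456789" example, both approaches return the same chunks.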

Splitting a string into chunks of a certain size

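// Needs: using System.Collections.Generic; using System.Linq;
static IEnumerable<string> Split(string str, int chunkSize)
{
    // Integer division: a trailing partial chunk is not returned.
    return Enumerable.Range(0, str.Length / chunkSize)
        .Select(i => str.Substring(i * chunkSize, chunkSize));
}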

Please note that additional code might be required to gracefully handle edge cases (null or empty input string, chunkSize == 0, input string length not divisible by chunkSize, etc.). The original question doesn't specify any requirements for these edge cases and in real life the requirements might vary so they are out of scope of this answer.

Split a string to even sized chunks

Use textwrap.wrap:

>>> import textwrap
>>> s = 'a' * 23
>>> textwrap.wrap(s, 4)
['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaa']

Split the string into different lengths chunks

>>> s = '25c319f75e3fbed5a9f0497750ea12992b30d565'
>>> n = [8, 4, 4, 4, 4, 12]
>>> print('-'.join(s[sum(n[:i]):sum(n[:i+1])] for i in range(len(n))))

Output

25c319f7-5e3f-bed5-a9f0-4977-50ea12992b30
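
The one-liner above recomputes sum(n[:i]) for every piece. A possible alternative, sketched here with itertools.accumulate, computes the running offsets once. The helper name split_by_lengths is hypothetical and not from the original answer.

```python
from itertools import accumulate


def split_by_lengths(text, lengths):
    """Hypothetical helper: cut text into consecutive pieces of the given lengths."""
    ends = list(accumulate(lengths))   # running end offsets: 8, 12, 16, ...
    starts = [0] + ends[:-1]           # each piece starts where the previous one ended
    return [text[a:b] for a, b in zip(starts, ends)]


s = '25c319f75e3fbed5a9f0497750ea12992b30d565'
n = [8, 4, 4, 4, 4, 12]
print('-'.join(split_by_lengths(s, n)))
# 25c319f7-5e3f-bed5-a9f0-4977-50ea12992b30
```

As in the original, any characters past the total of the lengths (here the final 'd565') are left out.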

Split a given string into equal parts where number of sub strings will be of equal size and dynamic in nature?

You could give the length of the substrings and iterate until the end of the adjusted string.

function split(string, size) {
    var splitted = [],
        i = 0;

    // Remove all whitespace before chunking.
    string = string.match(/\S+/g).join('');
    while (i < string.length) splitted.push(string.slice(i, i += size));
    return splitted;
}

console.log(...split('Hello World', 2));
console.log(...split('Hello Worlds', 2));

Split large string in n-size chunks in JavaScript

You can do something like this:

"1234567890".match(/.{1,2}/g);
// Results in:
["12", "34", "56", "78", "90"]

The method will still work with strings whose size is not an exact multiple of the chunk-size:

"123456789".match(/.{1,2}/g);
// Results in:
["12", "34", "56", "78", "9"]

In general, for any string out of which you want to extract at-most n-sized substrings, you would do:

str.match(/.{1,n}/g); // Replace n with the size of the substring

If your string can contain newlines or carriage returns, you would do:

str.match(/(.|[\r\n]){1,n}/g); // Replace n with the size of the substring

As far as performance, I tried this out with approximately 10k characters and it took a little over a second on Chrome. YMMV.

This can also be used in a reusable function:

function chunkString(str, length) {
    return str.match(new RegExp('.{1,' + length + '}', 'g'));
}

What's the best way to split a string into fixed length chunks and work with them in Python?

One solution would be to use this function:

def chunkstring(string, length):
    return (string[0+i:length+i] for i in range(0, len(string), length))

This function returns a generator, built with a generator expression. Each chunk is the slice of the string that starts at a multiple of the chunk length and ends one chunk length later.

You can iterate over the generator like a list, tuple or string (for i in chunkstring(s, n):), or convert it into a list (for instance) with list(generator). Generators are more memory efficient than lists because they generate their elements as they are needed, not all at once; however, they lack certain features like indexing.

This generator also contains any smaller chunk at the end:

>>> list(chunkstring("abcdefghijklmnopqrstuvwxyz", 5))
['abcde', 'fghij', 'klmno', 'pqrst', 'uvwxy', 'z']
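
Because the generator cannot be indexed, one way to take just the first few chunks is itertools.islice. This is a minimal sketch; the variable names first_two and rest are assumptions for illustration, and chunkstring is repeated so the block runs on its own.

```python
from itertools import islice


def chunkstring(string, length):
    return (string[i:i + length] for i in range(0, len(string), length))


chunks = chunkstring("abcdefghijklmnopqrstuvwxyz", 5)

# Pull only the first two chunks; nothing past them is sliced yet.
first_two = list(islice(chunks, 2))
print(first_two)   # ['abcde', 'fghij']

# The generator remembers its position, so the rest follows on from there.
rest = list(chunks)
print(rest)        # ['klmno', 'pqrst', 'uvwxy', 'z']
```

A generator can only be consumed once; call chunkstring again if you need to start over.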

Example usage:

text = """This is the first line.
This is the second line.
The line below is true.
The line above is false.
A short line.
A very very very very very very very very very long line.
A self-referential line.
The last line.
"""

lines = (i.strip() for i in text.splitlines())

for line in lines:
    for chunk in chunkstring(line, 16):
        print(chunk)

