Index or Range of Second Occurrence of Bytes in File

Index or range of second occurrence of bytes in file

You can find the second occurrence by searching again in the slice that starts right after the first match:

if let rg1 = data.range(of: mtrkChunk),
   let rg2 = data[rg1.upperBound...].range(of: mtrkChunk) {
    print(rg2)
}

How to find the second range in the given string in swift

You can run a for-loop over the String, compare each five-character substring with "hello", and collect the matches in an array of ClosedRange<Int>.

let str = "Hi,hello hi hello"
let arr = Array(str)
var ranges = [ClosedRange<Int>]()

for index in stride(from: 0, to: str.count-1, by: 1) {
    if index+4 < str.count {
        if String(arr[index...index+4]) == "hello" {
            ranges.append(index...index+4)
        }
    }
}

Now index into the ranges array to get whichever range you want, e.g. the second one:

print(ranges[1])

index out of range while attempting to write to file

The problem seems to be coming from these two lines inside your ingest function:

n := strings.TrimSpace(a[1])
v := strings.TrimSpace(a[2])

Should that maybe be a[0] and a[1]?

Find indexOf a byte array within another byte array

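Java strings are composed of 16-bit chars, not of 8-bit bytes. A char can hold a byte, so you can convert your byte arrays into strings and use indexOf, as long as you decode with a single-byte charset such as ISO-8859-1 that maps each byte to exactly one char (UTF-8 does this only for values below 128). ASCII characters, control characters, and even zero characters will work fine.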

Here is a demo:

byte[] big = new byte[] {1,2,3,0,4,5,6,7,0,8,9,0,0,1,2,3,4};
byte[] small = new byte[] {7,0,8,9,0,0,1};
// ISO_8859_1 maps every byte to exactly one char; UTF_8 would alter bytes >= 0x80
String bigStr = new String(big, StandardCharsets.ISO_8859_1);
String smallStr = new String(small, StandardCharsets.ISO_8859_1);
System.out.println(bigStr.indexOf(smallStr));

This prints 7.

However, considering that your large array could be up to 10,000 bytes, and the small array is only ten bytes, this solution may not be the most efficient, for two reasons:

  • It requires copying your big array into an array that is twice as large (same capacity, but with char instead of byte). This triples your memory requirements.
  • The string search algorithm of Java is not the fastest one available. You may get significantly faster results if you implement one of the advanced algorithms, for example, the Knuth–Morris–Pratt one. This could potentially reduce the execution time by a factor of up to ten (the length of the small string), and will require additional memory that is proportional to the length of the small string, not the big string.
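To make the Knuth–Morris–Pratt suggestion concrete, here is a minimal sketch written in Python rather than Java, operating directly on bytes so no string conversion is needed. The function name kmp_find is made up for this illustration, and the sample arrays are the ones from the demo above.

```python
def kmp_find(haystack: bytes, needle: bytes) -> int:
    """Return the index of the first occurrence of needle in haystack, or -1."""
    if not needle:
        return 0

    # failure[i] = length of the longest proper prefix of needle[:i + 1]
    # that is also a suffix of it. Extra memory is proportional to the needle.
    failure = [0] * len(needle)
    k = 0
    for i in range(1, len(needle)):
        while k > 0 and needle[i] != needle[k]:
            k = failure[k - 1]
        if needle[i] == needle[k]:
            k += 1
        failure[i] = k

    # Scan the haystack once; the position in the haystack never moves back.
    k = 0
    for i, byte in enumerate(haystack):
        while k > 0 and byte != needle[k]:
            k = failure[k - 1]
        if byte == needle[k]:
            k += 1
        if k == len(needle):
            return i - k + 1
    return -1


big = bytes([1, 2, 3, 0, 4, 5, 6, 7, 0, 8, 9, 0, 0, 1, 2, 3, 4])
small = bytes([7, 0, 8, 9, 0, 0, 1])
print(kmp_find(big, small))  # prints 7, matching the indexOf demo
```

The failure table is what keeps the extra memory tied to the length of the small array. Whether this beats the built-in search on inputs of this size is something to measure, not assume.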

GoLang Get String at Line N in Byte Slice

Dealing with just the question part (and not the sanity of this) - you have a []byte and want to get a specific string line from it - the bytes.Reader has no ReadLine method which you will have already noticed.

You can pass a bytes reader to bufio.NewReader, and gain the ReadLine functionality you are trying to access.

bytesReader := bytes.NewReader([]byte("test1\ntest2\ntest3\n"))
bufReader := bufio.NewReader(bytesReader)
value1, _, _ := bufReader.ReadLine()
value2, _, _ := bufReader.ReadLine()
value3, _, _ := bufReader.ReadLine()
fmt.Println(string(value1))
fmt.Println(string(value2))
fmt.Println(string(value3))

Obviously it is not sensible to ignore the errors, but for the purpose of brevity I do it here.

https://play.golang.org/p/fRQUfmZQke

Results:

test1
test2
test3

From here, it is straightforward to fit this back into your existing code.

Find the nth occurrence of substring in a string

Mark's iterative approach would be the usual way, I think.

Here's an alternative with string-splitting, which can often be useful for finding-related processes:

def findnth(haystack, needle, n):
    parts = haystack.split(needle, n+1)
    if len(parts) <= n+1:
        return -1
    return len(haystack) - len(parts[-1]) - len(needle)

And here's a quick (and somewhat dirty, in that you have to choose some chaff that can't match the needle) one-liner:

'foo bar bar bar'.replace('bar', 'XXX', 1).find('bar')
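For comparison, the iterative approach mentioned at the start of this answer could look roughly like the sketch below. This is not Mark's actual code: the name find_nth and the choice to count non-overlapping matches with a 0-based n (matching findnth above) are assumptions made for the illustration.

```python
def find_nth(haystack, needle, n):
    """Return the index of the n-th (0-based) non-overlapping occurrence
    of needle in haystack, or -1 if there are fewer than n + 1 occurrences."""
    start = haystack.find(needle)
    while start >= 0 and n > 0:
        # Resume the search just past the previous match.
        start = haystack.find(needle, start + len(needle))
        n -= 1
    return start


print(find_nth('foo bar bar bar', 'bar', 1))  # prints 8
```

Because bytes objects also have a find method with the same signature, the same function works on binary data, for example find_nth(data, b'MTrk', 1) for the second occurrence of a chunk marker.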

Reading binary file and looping over each byte

Python >= 3.8

Thanks to the walrus operator (:=) the solution is quite short. We read bytes objects from the file and assign them to the variable byte. The loop ends when read returns an empty bytes object at end of file.

with open("myfile", "rb") as f:
    while (byte := f.read(1)):
        # Do stuff with byte.
        pass

Python >= 3

In older Python 3 versions, we have to use a slightly more verbose way:

with open("myfile", "rb") as f:
    byte = f.read(1)
    while byte != b"":
        # Do stuff with byte.
        byte = f.read(1)

Or as benhoyt says, skip the not equal and take advantage of the fact that b"" evaluates to false. This makes the code compatible between 2.6 and 3.x without any changes. It would also save you from changing the condition if you go from byte mode to text or the reverse.

with open("myfile", "rb") as f:
    byte = f.read(1)
    while byte:
        # Do stuff with byte.
        byte = f.read(1)

Python >= 2.5

In Python 2, it's a bit different. Here we don't get bytes objects, but raw characters:

with open("myfile", "rb") as f:
    byte = f.read(1)
    while byte != "":
        # Do stuff with byte.
        byte = f.read(1)

Note that the with statement is not available in versions of Python below 2.5. To use it in v 2.5 you'll need to import it:

from __future__ import with_statement

In 2.6 this is not needed.

Python 2.4 and Earlier

f = open("myfile", "rb")
try:
    byte = f.read(1)
    while byte != "":
        # Do stuff with byte.
        byte = f.read(1)
finally:
    f.close()
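As a possible variation for Python >= 3.8, the loops above can be wrapped in a generator that reads the file in larger chunks and yields the bytes one at a time. This is only a sketch: the helper name iter_bytes and the default chunk size are assumptions, and the demo writes a throwaway temporary file so that it can run on its own. Note that iterating over a bytes object in Python 3 yields integers, not length-1 bytes objects.

```python
import os
import tempfile


def iter_bytes(path, chunk_size=8192):
    """Yield every byte of the file at path as an int in range(256)."""
    with open(path, "rb") as f:
        # read() returns b"" at end of file, which is falsy and ends the loop.
        while chunk := f.read(chunk_size):
            yield from chunk


# Demo on a temporary file; a tiny chunk size forces more than one read.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00\x01MTrk")
    path = tmp.name

values = list(iter_bytes(path, chunk_size=4))
os.remove(path)
print(values)
```

Reading in chunks avoids one read call per byte, which tends to matter for large files.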

Find Character String In Binary Data

Convert your substring to an NSData object, and search for those bytes in the larger NSData using rangeOfData:options:range:. Make sure that the string encodings match!

On iPhone, where that isn't available, you may have to do this yourself. The C function strstr() will give you a pointer to the first occurrence of a pattern within the buffer (as long as neither contains nulls!), but not the index. Here's a function that should do the job (but no promises, since I haven't tried actually running it...):

- (NSUInteger)indexOfData:(NSData*)needle inData:(NSData*)haystack
{
    // use byte pointers: a void* cannot be indexed or dereferenced
    const unsigned char* needleBytes = [needle bytes];
    const unsigned char* haystackBytes = [haystack bytes];
    NSUInteger needleLength = [needle length];
    NSUInteger haystackLength = [haystack length];

    // NSUInteger is unsigned, so check the lengths before subtracting them
    if (needleLength == 0 || needleLength > haystackLength)
    {
        return NSNotFound;
    }

    // walk the length of the buffer, looking for a byte that matches the start
    // of the pattern; we can skip (|needle|-1) bytes at the end, since we can't
    // have a match that's shorter than needle itself
    for (NSUInteger i = 0; i <= haystackLength - needleLength; i++)
    {
        // walk needle's bytes while they still match the bytes of haystack
        // starting at i; if we walk off the end of needle, we found a match
        NSUInteger j = 0;
        while (j < needleLength && needleBytes[j] == haystackBytes[i+j])
        {
            j++;
        }
        if (j == needleLength)
        {
            return i;
        }
    }
    return NSNotFound;
}

This runs in something like O(nm), where n is the buffer length, and m is the size of the substring. It's written to work with NSData for two reasons: 1) that's what you seem to have in hand, and 2) those objects already encapsulate both the actual bytes, and the length of the buffer.


