What Is Ruby's StringIO Class Really

What is Ruby's StringIO class really?

No, StringIO is more similar to StringReader/StringWriter than to StringBuffer.

In Java, StringBuffer is the mutable version of String (since String is immutable).

StringReader/StringWriter are handy classes meant to be used when you want to fake file access. You can read from or write to a String with the same stream-oriented interface as Reader/Writer, which is immensely useful in unit testing.
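
In Ruby, a single StringIO covers both the reading and the writing role. A minimal sketch of the reading side, assuming a hypothetical helper named count_nonblank_lines (not part of the original discussion) that only calls each_line and therefore accepts a File or a StringIO alike:

```ruby
require 'stringio'

# Hypothetical helper: counts the non-blank lines of any IO-like object.
def count_nonblank_lines(io)
  io.each_line.count { |line| !line.strip.empty? }
end

# In a test, a StringIO stands in for the file.
fake_file = StringIO.new("first\n\nsecond\n")
puts count_nonblank_lines(fake_file)
```

Because the helper never asks where the data comes from, production code can hand it an open File while tests hand it an in-memory string.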

What are the advantages to using StringIO in Ruby as opposed to String?

Basically, it makes a string look like an IO object, hence the name StringIO.

The StringIO class has read and write methods, so it can be passed to parts of your code that were designed to read and write from files or sockets. It's nice if you have a string and you want it to look like a file for the purposes of testing your file code.

require 'stringio'

def foo_writer(file)
  file.write "foo"
end

def test_foo_writer
  s = StringIO.new
  foo_writer(s)
  raise 'fail' unless s.string == 'foo'
end

What is `StringIO` in the context of RSpec testing (Ruby on Rails)?

StringIO is a string-based replacement for an IO object. It acts the same as a file, but it's kept in memory as a String.

In your case I don't think it's really applicable, at least not with your current code. That's because you have a File.open call that creates an IO object and immediately does something with it.

If for example you had something like this:

def write_data(f)
  f.write(snapshot)
end

# your code would be
f = File.open("data.dat", "wb")
write_data(f)

# test would be
test_io = StringIO.new
write_data(test_io)
test_io.string.should == "Hello, world!"
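
A self-contained variant of the same idea, in plain Ruby rather than RSpec. The explicit snapshot parameter and the use of Tempfile for the "real file" side are assumptions made for illustration; they are not taken from the code in the question:

```ruby
require 'stringio'
require 'tempfile'

# Takes any object that responds to #write; the data is passed in explicitly.
def write_data(io, snapshot)
  io.write(snapshot)
end

# Production-style use: a real file on disk (a temporary one here).
Tempfile.create('data') do |f|
  write_data(f, 'Hello, world!')
end

# Test-style use: the same method writes into memory instead.
test_io = StringIO.new
write_data(test_io, 'Hello, world!')
raise 'unexpected content' unless test_io.string == 'Hello, world!'
```

The point of the design is that the method receives the IO object instead of opening the file itself, which is what makes the substitution possible.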

Why doesn't Ruby have a real StringBuffer or StringIO?

I looked at the Ruby documentation for StringIO, and it looks like what you want is StringIO#string, not StringIO#to_s.

Thus, change your code to:

require 'stringio'

s = StringIO.new
s << 'foo'
s << 'bar'
s.string #=> "foobar"
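
To see the difference between the two methods side by side: StringIO does not define its own to_s, so it falls back to the default object description, while string returns the accumulated text. The helper build_text below is a hypothetical name used only for this sketch:

```ruby
require 'stringio'

# Hypothetical helper: uses a StringIO as an append-only buffer.
def build_text(parts)
  buffer = StringIO.new
  parts.each { |part| buffer << part }
  buffer.string
end

puts build_text(%w[foo bar])   # the accumulated text
puts StringIO.new.to_s         # default Object#to_s, along the lines of #<StringIO:0x...>
```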

Using Ruby StringIO's gets method

To see what's actually going on, let's replace print with a more explicit output:

require 'stringio'

def parse(chunked_data, separator)
  data = ""
  until chunked_data.eof?
    if chunk_head = chunked_data.gets(separator)
      puts "chunk_head: #{chunk_head.inspect}"
      if chunk = chunked_data.read(chunk_head.to_i)
        puts " chunk: #{chunk.inspect}"
        data << chunk
      end
    end
  end
  data
end

result = parse(StringIO.new("7\r\nHello, \r\n6\r\nworld!\r\n0\r\n"), "\r\n")
puts " result: #{result.inspect}"

Output:

chunk_head: "7\r\n"
chunk: "Hello, "
chunk_head: "\r\n"
chunk: ""
chunk_head: "6\r\n"
chunk: "world!"
chunk_head: "\r\n"
chunk: ""
chunk_head: "0\r\n"
chunk: ""
result: "Hello, world!"

Now with a "." as the separator:

result = parse(StringIO.new("7.Hello, .6.world!.0."), ".")
puts " result: #{result.inspect}"

Output:

chunk_head: "7."
chunk: "Hello, "
chunk_head: "."
chunk: ""
chunk_head: "6."
chunk: "world!"
chunk_head: "."
chunk: ""
chunk_head: "0."
chunk: ""
result: "Hello, world!"

As you can see, it works either way and the result is identical.

Although the result is correct, there seems to be a bug in your code: you don't read the separator after the chunk. This can be fixed by adding a chunked_data.gets(separator) or chunked_data.read(separator.bytesize) after the inner if/end block:

def parse(chunked_data, separator)
  data = ""
  until chunked_data.eof?
    if chunk_head = chunked_data.gets(separator)
      puts "chunk_head: #{chunk_head.inspect}"
      if chunk = chunked_data.read(chunk_head.to_i)
        puts " chunk: #{chunk.inspect}"
        data << chunk
      end
      chunked_data.read(separator.bytesize)
    end
  end
  data
end

result = parse(StringIO.new("7.Hello, .6.world!.0."), ".")
puts " result: #{result.inspect}"

Output:

chunk_head: "7."
chunk: "Hello, "
chunk_head: "6."
chunk: "world!"
chunk_head: "0."
chunk: ""
result: "Hello, world!"

That looks better.
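
Without the debugging output, the corrected loop can be condensed. This is only a sketch under the same assumptions as the code above (decimal chunk sizes, one separator after every header and after every chunk). The name parse_chunks is hypothetical, and stopping at the zero-length chunk is an addition that the original code does not have:

```ruby
require 'stringio'

# Hypothetical condensed parser: reads "<size><sep><chunk><sep>" records
# until a zero-sized chunk or the end of the stream.
def parse_chunks(io, separator = "\r\n")
  data = +""
  while (head = io.gets(separator))
    size = head.to_i
    break if size.zero?

    data << io.read(size).to_s      # read returns nil at EOF, hence to_s
    io.read(separator.bytesize)     # consume the separator after the chunk
  end
  data
end

puts parse_chunks(StringIO.new("7\r\nHello, \r\n6\r\nworld!\r\n0\r\n"))
puts parse_chunks(StringIO.new("7.Hello, .6.world!.0."), ".")
```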

Is it necessary to close StringIO in ruby?

  • StringIO#close does not free any resources or drop its reference to the accumulated string. Therefore calling it has no effect upon resource usage.

  • Only StringIO#finalize, called during garbage collection, frees the reference to the accumulated string so that it can be freed (provided the caller does not retain its own reference to it).

  • StringIO.open, which briefly creates a StringIO instance, does not keep a reference to that instance after it returns; therefore that StringIO's reference to the accumulated string can be freed (provided the caller does not retain its own reference to it).

  • In practical terms, there is seldom a need to worry about a memory leak when using StringIO. Just don't hang on to references to StringIO once you're done with them and all will be well.
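
The "don't hang on to the StringIO" advice is easy to follow with the block form of StringIO.open. A minimal sketch, where build_report is a hypothetical helper: the StringIO never escapes the block, and only the returned string stays referenced.

```ruby
require 'stringio'

# Hypothetical helper: builds a report inside the block form of StringIO.open.
def build_report(lines)
  StringIO.open do |io|
    lines.each { |line| io.puts line }
    io.string   # the block's value becomes the return value of StringIO.open
  end
end

report = build_report(['line one', 'line two'])
puts report
```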


Diving into the source

The only resource used by a StringIO instance is the string it is accumulating. You can see that in stringio.c (MRI 1.9.3); here we see the structure that holds a StringIO's state:

struct StringIO {
    VALUE string;
    long pos;
    long lineno;
    int flags;
    int count;
};

When a StringIO instance is finalized (that is, garbage collected), its reference to the string is dropped so that the string may be garbage collected if there are no other references to it. Here's the finalize function, which is also called by StringIO.open(&block) in order to close the instance.

static VALUE
strio_finalize(VALUE self)
{
    struct StringIO *ptr = StringIO(self);
    ptr->string = Qnil;
    ptr->flags &= ~FMODE_READWRITE;
    return self;
}

Apart from that use in StringIO.open, the finalize function is called only when the object is garbage collected. No other StringIO method frees the string reference.

StringIO#close just sets a flag. It does not free the reference to the accumulated string or in any other way affect resource usage:

static VALUE
strio_close(VALUE self)
{
    struct StringIO *ptr = StringIO(self);
    if (CLOSED(ptr)) {
        rb_raise(rb_eIOError, "closed stream");
    }
    ptr->flags &= ~FMODE_READWRITE;
    return Qnil;
}

And lastly, when you call StringIO#string, you get a reference to the exact same string that the StringIO instance has been accumulating:

static VALUE
strio_get_string(VALUE self)
{
    return StringIO(self)->string;
}
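
Both observations (close leaves the string in place, and string hands back the very same object each time) can be checked from Ruby. A small sketch, with same_string_after_close? as a hypothetical helper name:

```ruby
require 'stringio'

# Hypothetical helper: true if StringIO#string returns the identical object
# before and after the stream is closed.
def same_string_after_close?(sio)
  before = sio.string
  sio.close
  sio.string.equal?(before)   # equal? compares object identity, not content
end

sio = StringIO.new
sio.write('accumulated text')
puts same_string_after_close?(sio)
puts sio.string
```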

How to leak memory when using StringIO

All of this means that there is only one way for a StringIO instance to cause a resource leak: You must not close the StringIO object, and you must keep it around longer than you keep the string you got when you called StringIO#string. For example, imagine a class having a StringIO object as an instance variable:

class Leaker
  def initialize
    @sio = StringIO.new
    @sio.puts "Here's a large file:"
    @sio.puts
    @sio.write File.read('/path/to/a/very/big/file')
  end

  def result
    @sio.string
  end
end

Imagine that the user of this class gets the result, uses it briefly, and then discards it, yet keeps a reference to the Leaker instance. The Leaker instance retains a reference to the result via the un-closed StringIO instance. This could be a problem if the file is very large, or if there are many extant instances of Leaker.

This simple (and deliberately pathological) example can be fixed by not keeping the StringIO as an instance variable. When you can (and you almost always can), it's better to throw away the StringIO object than to go through the bother of closing it explicitly:

class NotALeaker
  attr_reader :result

  def initialize
    sio = StringIO.new
    sio.puts "Here's a large file:"
    sio.puts
    sio.write File.read('/path/to/a/very/big/file')
    @result = sio.string
  end
end

On top of all this, these leaks matter only when the strings are large or the StringIO instances numerous, and the StringIO instance is long-lived, so explicitly closing a StringIO is seldom, if ever, needed.

StringIO instance mutating original string

This is intentional: if the stream is writable (in the case of StringIO, this means the underlying string is writable), then a set_encoding on the stream also sets the encoding on the underlying string.

https://github.com/ruby/ruby/blob/trunk/ext/stringio/stringio.c#L1602

Ruby Mock a file with StringIO

Your method get_symbols_from_file is never called in the test. You're just testing that StringIO#readlines works, i.e.:

StringIO.new("YHOO,141414").readlines == ["YHOO,141414"] #=> true

If you want to use a StringIO instance as a placeholder for your file, you have to change your method to take a File instance rather than a file name:

def get_symbols_from_file(file)
  file.readlines(',')
end

Both File and StringIO instances respond to readlines, so the above implementation can handle both:

def test_get_symbols_from_file
  s = StringIO.new("YHOO,141414")
  assert_equal(["YHOO,141414"], get_symbols_from_file(s))
end

This test, however, fails: readlines keeps the line separator, so it returns an array with two elements, "YHOO," (note the comma) and "141414". You are expecting an array with one element, "YHOO,141414".

Maybe you're looking for something like this:

def test_get_symbols_from_file
  s = StringIO.new("YHOO,141414")
  assert_equal(["YHOO", "141414"], get_symbols_from_file(s))
end

def get_symbols_from_file(file)
  file.read.split(',')
end
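
The difference between the two approaches can be checked directly. In this sketch the method names symbols_via_readlines and symbols_via_split are hypothetical and exist only to put the two variants side by side:

```ruby
require 'stringio'

# Splits on the separator but keeps it at the end of each piece.
def symbols_via_readlines(io)
  io.readlines(',')
end

# Reads everything, then splits; the separator is dropped.
def symbols_via_split(io)
  io.read.split(',')
end

p symbols_via_readlines(StringIO.new('YHOO,141414'))
p symbols_via_split(StringIO.new('YHOO,141414'))
```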

If you really want to use IO::readlines you could create a Tempfile:

require 'tempfile'

def test_get_symbols_from_file
  Tempfile.open("foo") { |f|
    f.write "YHOO,141414"
    f.close
    assert_equal(["YHOO", "141414"], get_symbols_from_file(f.path))
  }
end

Should I close StringIO instances explicitly?

Well, reads and writes go straight to the underlying string; there are no extra buffers to flush, and no OS-level resources to return.

The only reason you might want to close the StringIO is to make subsequent I/O operations fail, or to make closed? return true, which could be useful if you gave that StringIO to some other component. On the other hand, if you're just going to discard the StringIO a moment later, it doesn't matter in the slightest; the garbage collector doesn't care if it's marked as open or closed.
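
A short sketch of that one use case, where closing signals "this stream is finished" to another component. The append_line method is a hypothetical stand-in for such a component:

```ruby
require 'stringio'

# Hypothetical component: skips streams that have been marked as closed.
def append_line(io, line)
  return false if io.closed?

  io.puts(line)
  true
end

log = StringIO.new
append_line(log, 'first')    # written
log.close
append_line(log, 'second')   # skipped, because log.closed? is now true

begin
  log.write('third')         # writing to a closed StringIO raises IOError
rescue IOError => e
  puts "write refused: #{e.message}"
end

puts log.string              # the text written before close is still there
```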

Is there such a thing as opening a StringIO for writing?

Here's your problem:

# frozen_string_literal: true

Your string's contents are frozen and can't be modified. StringIO reports this with the above-mentioned IOError.
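
The behavior can be reproduced without the magic comment by freezing a string explicitly. In this sketch, write_fails? is a hypothetical helper, and passing an unfrozen copy (via dup) is one possible workaround, not necessarily the one that fits your code best:

```ruby
require 'stringio'

# Hypothetical helper: true if writing to the given stream raises IOError.
def write_fails?(io)
  io.write('x')
  false
rescue IOError
  true
end

frozen = 'abc'.freeze

# With no mode given, a StringIO over a frozen string is read-only.
puts write_fails?(StringIO.new(frozen))

# dup returns an unfrozen copy, so the StringIO is writable again.
puts write_fails?(StringIO.new(frozen.dup))
```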


