What is Ruby's StringIO class really?
No, StringIO is more similar to StringReader/StringWriter than StringBuffer.
In Java, StringBuffer is the mutable version of String (since String is immutable).
StringReader/StringWriter are handy classes meant to be used when you want to fake file access. You can read from or write to a String with the same stream-oriented interface as Reader/Writer, which is immensely useful in unit testing.
What are the advantages to using StringIO in Ruby as opposed to String?
Basically, it makes a string look like an IO object, hence the name StringIO.
The StringIO class has read and write methods, so it can be passed to parts of your code that were designed to read and write from files or sockets. It's nice if you have a string and you want it to look like a file for the purposes of testing your file code.
require 'stringio'
def foo_writer(file)
  file.write "foo"
end
def test_foo_writer
  s = StringIO.new
  foo_writer(s)
  raise 'fail' unless s.string == 'foo'
end
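The same trick works for the reading side. As a minimal sketch (the helper name count_records and the sample data are made up for illustration), code that expects a readable file can be handed a StringIO instead:

```ruby
require 'stringio'

# Hypothetical helper: counts the non-blank lines of any IO-like object.
def count_records(io)
  io.each_line.count { |line| !line.strip.empty? }
end

# A StringIO stands in for the file the helper would normally receive.
fake_file = StringIO.new("alpha\n\nbeta\ngamma\n")
record_count = count_records(fake_file)
puts record_count
```

Because count_records only calls each_line, it does not care whether it was given a File, a socket, or a StringIO.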
What is `StringIO` in the context of RSpec testing (Ruby on Rails)?
StringIO is a string-based replacement for an IO object. It acts the same as a file, but it's kept in memory as a String.
In your case I don't think it's really applicable, at least not with your current code. That's because you have a File.open call that creates an IO object and immediately does something with it.
If for example you had something like this:
def write_data(f)
  f.write(snapshot)
end
# your code would be
f = File.open("data.dat", "wb")
write_data(f)
# test would be
testIO = StringIO.new
write_data(testIO)
testIO.string.should == "Hello, world!"
Why doesn't Ruby have a real StringBuffer or StringIO?
I looked at the Ruby documentation for StringIO, and it looks like what you want is StringIO#string, not StringIO#to_s. Thus, change your code to:
s = StringIO.new
s << 'foo'
s << 'bar'
s.string
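To make the difference concrete, here is a small sketch (the variable names are only illustrative) comparing the two calls. StringIO does not override to_s, so it falls back to the generic object description:

```ruby
require 'stringio'

buffer = StringIO.new
buffer << 'foo' << 'bar'     # << returns the StringIO, so calls can be chained

contents = buffer.string     # the accumulated String
description = buffer.to_s    # generic Object#to_s, something like "#<StringIO:0x...>"

puts contents
puts description
```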
Using Ruby StringIO's gets method
To see what's actually going on, let's replace print with a more explicit output:
require 'stringio'
def parse(chunked_data, separator)
  data = ""
  until chunked_data.eof?
    if chunk_head = chunked_data.gets(separator)
      puts "chunk_head: #{chunk_head.inspect}"
      if chunk = chunked_data.read(chunk_head.to_i)
        puts " chunk: #{chunk.inspect}"
        data << chunk
      end
    end
  end
  data
end
result = parse(StringIO.new("7\r\nHello, \r\n6\r\nworld!\r\n0\r\n"), "\r\n")
puts " result: #{result.inspect}"
Output:
chunk_head: "7\r\n"
chunk: "Hello, "
chunk_head: "\r\n"
chunk: ""
chunk_head: "6\r\n"
chunk: "world!"
chunk_head: "\r\n"
chunk: ""
chunk_head: "0\r\n"
chunk: ""
result: "Hello, world!"
Now with a . as the separator:
result = parse(StringIO.new("7.Hello, .6.world!.0."), ".")
puts " result: #{result.inspect}"
Output:
chunk_head: "7."
chunk: "Hello, "
chunk_head: "."
chunk: ""
chunk_head: "6."
chunk: "world!"
chunk_head: "."
chunk: ""
chunk_head: "0."
chunk: ""
result: "Hello, world!"
As you can see, it works either way and the result is identical.
Although the result is correct, there seems to be a bug in your code: you don't read the separator after chunk. This can be fixed by adding a chunked_data.gets(separator) or chunked_data.read(separator.bytesize) after the if/end block:
def parse(chunked_data, separator)
  data = ""
  until chunked_data.eof?
    if chunk_head = chunked_data.gets(separator)
      puts "chunk_head: #{chunk_head.inspect}"
      if chunk = chunked_data.read(chunk_head.to_i)
        puts " chunk: #{chunk.inspect}"
        data << chunk
      end
      chunked_data.read(separator.bytesize)
    end
  end
  data
end
result = parse(StringIO.new("7.Hello, .6.world!.0."), ".")
puts " result: #{result.inspect}"
Output:
chunk_head: "7."
chunk: "Hello, "
chunk_head: "6."
chunk: "world!"
chunk_head: "0."
chunk: ""
result: "Hello, world!"
That looks better.
Is it necessary to close StringIO in Ruby?
StringIO#close does not free any resources or drop its reference to the accumulated string. Therefore calling it has no effect upon resource usage.
Only StringIO#finalize, called during garbage collection, frees the reference to the accumulated string so that it can be freed (provided the caller does not retain its own reference to it).
StringIO.open, which briefly creates a StringIO instance, does not keep a reference to that instance after it returns; therefore that StringIO's reference to the accumulated string can be freed (provided the caller does not retain its own reference to it).
In practical terms, there is seldom a need to worry about a memory leak when using StringIO. Just don't hang on to references to StringIO once you're done with them and all will be well.
Diving into the source
The only resource used by a StringIO instance is the string it is accumulating. You can see that in stringio.c (MRI 1.9.3); here we see the structure that holds a StringIO's state:
struct StringIO {
    VALUE string;
    long pos;
    long lineno;
    int flags;
    int count;
};
When a StringIO instance is finalized (that is, garbage collected), its reference to the string is dropped so that the string may be garbage collected if there are no other references to it. Here's the finalize method, which is also called by StringIO.open(&block) in order to close the instance.
static VALUE
strio_finalize(VALUE self)
{
    struct StringIO *ptr = StringIO(self);
    ptr->string = Qnil;
    ptr->flags &= ~FMODE_READWRITE;
    return self;
}
Apart from StringIO.open(&block), the finalize method is called only when the object is garbage collected. There is no other method of StringIO which frees the string reference.
StringIO#close just sets a flag. It does not free the reference to the accumulated string or in any other way affect resource usage:
static VALUE
strio_close(VALUE self)
{
    struct StringIO *ptr = StringIO(self);
    if (CLOSED(ptr)) {
        rb_raise(rb_eIOError, "closed stream");
    }
    ptr->flags &= ~FMODE_READWRITE;
    return Qnil;
}
And lastly, when you call StringIO#string, you get a reference to the exact same string that the StringIO instance has been accumulating:
static VALUE
strio_get_string(VALUE self)
{
    return StringIO(self)->string;
}
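The behavior described by the C source above can also be observed from Ruby. The following is only a sketch with illustrative variable names, and it assumes an MRI whose StringIO still behaves as in the quoted source. It checks object identity with equal? before and after close:

```ruby
require 'stringio'

backing = String.new("abc")          # an unfrozen string we keep our own reference to
sio = StringIO.new(backing)          # the StringIO wraps this very object, not a copy
sio.seek(0, IO::SEEK_END)            # move to the end so the write appends
sio.write("def")

same_before_close = sio.string.equal?(backing)

sio.close                            # only flips the read/write flags
same_after_close = sio.string.equal?(backing)

puts backing
puts same_before_close
puts same_after_close
```

If close released the string, the second identity check could not succeed, since StringIO#string would have nothing left to return.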
How to leak memory when using StringIO
All of this means that there is only one way for a StringIO instance to cause a resource leak: You must not close the StringIO object, and you must keep it around longer than you keep the string you got when you called StringIO#string. For example, imagine a class having a StringIO object as an instance variable:
class Leaker
  def initialize
    @sio = StringIO.new
    @sio.puts "Here's a large file:"
    @sio.puts
    @sio.write File.read('/path/to/a/very/big/file')
  end
  def result
    @sio.string
  end
end
Imagine that the user of this class gets the result, uses it briefly, and then discards it, and yet keeps a reference to the instance of Leaker. You can see that the Leaker instance retains a reference to the result via the un-closed StringIO instance. This could be a problem if the file is very large, or if there are many extant instances of Leaker.
This simple (and deliberately pathological) example can be fixed by simply not keeping the StringIO as an instance variable. When you can (and you almost always can), it's better to simply throw away the StringIO object than to go through the bother of closing it explicitly:
class NotALeaker
  attr_reader :result
  def initialize
    sio = StringIO.new
    sio.puts "Here's a large file:"
    sio.puts
    sio.write File.read('/path/to/a/very/big/file')
    @result = sio.string
  end
end
Add to all of this that these leaks only matter when the strings are large or the StringIO instances numerous and the StringIO instance is long lived, and you can see that explicitly closing StringIO is seldom, if ever, needed.
StringIO instance mutating original string
This is intentional - if the stream is writable (in the case of StringIO this would be if the underlying string is writable) then a set_encoding on the stream also sets the encoding on the underlying string.
https://github.com/ruby/ruby/blob/trunk/ext/stringio/stringio.c#L1602
Ruby Mock a file with StringIO
Your method get_symbols_from_file is never called in the test. You're just testing that StringIO#readlines works, i.e.:
StringIO.new("YHOO,141414").readlines == ["YHOO,141414"] #=> true
If you want to use a StringIO instance as a placeholder for your file, you have to change your method to take a File instance rather than a file name:
def get_symbols_from_file(file)
  file.readlines(',')
end
Both File and StringIO instances respond to readlines, so the above implementation can handle both:
def test_get_symbols_from_file
  s = StringIO.new("YHOO,141414")
  assert_equal(["YHOO,141414"], get_symbols_from_file(s))
end
This test, however, fails: readlines includes the line separator, so it returns an array with two elements, "YHOO," (note the comma) and "141414". You are expecting an array with one element, "YHOO,141414".
Maybe you're looking for something like this:
def test_get_symbols_from_file
  s = StringIO.new("YHOO,141414")
  assert_equal(["YHOO", "141414"], get_symbols_from_file(s))
end
def get_symbols_from_file(file)
  file.read.split(',')
end
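To see the separator behavior side by side, here is a short sketch (split_on_commas is a made-up helper, and the temp file prefix is arbitrary) that runs the same readlines(',') call against a StringIO and a real file:

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: works on anything that responds to readlines.
def split_on_commas(io)
  io.readlines(',')
end

from_string = split_on_commas(StringIO.new("YHOO,141414"))

# Tempfile.create with a block removes the file afterwards and
# returns the value of the block.
from_file = Tempfile.create('symbols') do |f|
  f.write("YHOO,141414")
  f.rewind
  split_on_commas(f)
end

p from_string
p from_file
```

Both calls keep the trailing comma on the first element, which is why the read.split(',') version is usually closer to what a symbol list needs.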
If you really want to use IO::readlines you could create a Tempfile:
require 'tempfile'
def test_get_symbols_from_file
  Tempfile.open("foo") { |f|
    f.write "YHOO,141414"
    f.close
    assert_equal(["YHOO", "141414"], get_symbols_from_file(f.path))
  }
end
Should I close StringIO instances explicitly?
Well, reads and writes go straight to the underlying string; there are no extra buffers to flush, and no OS-level resources to return.
The only reason you might want to close the StringIO is to make subsequent I/O operations fail, or if you needed to make closed? return true, which could be useful if you gave that StringIO to some other component. On the other hand, if you're just going to discard the StringIO a moment later, it doesn't matter in the slightest; the garbage collector doesn't care if it's marked as open or closed.
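A quick sketch of that one observable effect (variable names are illustrative): after close, closed? flips to true and further writes raise IOError.

```ruby
require 'stringio'

sio = StringIO.new
sio.write("done")
sio.close

# Capture the error instead of letting it propagate, so it can be inspected.
write_error = begin
  sio.write("more")
  nil
rescue IOError => e
  e
end

puts sio.closed?
puts write_error.class
```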
Is there such a thing as opening a StringIO for writing?
Here's your problem:
# frozen_string_literal: true
Your string contents is frozen and can't be modified. StringIO expresses it with the abovementioned IOError.
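As a sketch of both the failure and one possible way around it (the explicit .freeze stands in for the magic comment, and the sample text is made up), a StringIO over a frozen string is read-only, while one over an unfrozen copy accepts writes:

```ruby
require 'stringio'

# .freeze here plays the role of `# frozen_string_literal: true`.
frozen = "contents".freeze
read_only = StringIO.new(frozen)

frozen_error = begin
  read_only.write("x")
  nil
rescue IOError => e
  e
end

# Unary plus returns an unfrozen string (a copy when the receiver is frozen).
writable = StringIO.new(+"contents")
writable.seek(0, IO::SEEK_END)
writable.write(" updated")

puts frozen_error.class
puts writable.string
```

Starting from StringIO.new with no argument, or from String.new, would avoid the frozen literal in the same way.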