Is it necessary to close StringIO in ruby?
StringIO#close does not free any resources or drop its reference to the accumulated string. Therefore calling it has no effect upon resource usage.
Only StringIO#finalize, called during garbage collection, frees the reference to the accumulated string so that it can be freed (provided the caller does not retain its own reference to it).
StringIO.open, which briefly creates a StringIO instance, does not keep a reference to that instance after it returns; therefore that StringIO's reference to the accumulated string can be freed (provided the caller does not retain its own reference to it).
In practical terms, there is seldom a need to worry about a memory leak when using StringIO. Just don't hang on to references to StringIO instances once you're done with them and all will be well.
Diving into the source
The only resource used by a StringIO instance is the string it is accumulating. You can see that in stringio.c (MRI 1.9.3); here we see the structure that holds a StringIO's state:
struct StringIO {
    VALUE string;
    long pos;
    long lineno;
    int flags;
    int count;
};
When a StringIO instance is finalized (that is, garbage collected), its reference to the string is dropped so that the string may be garbage collected if there are no other references to it. Here's the finalize method, which is also called by StringIO.open(&block) in order to close the instance:
static VALUE
strio_finalize(VALUE self)
{
    struct StringIO *ptr = StringIO(self);
    ptr->string = Qnil;
    ptr->flags &= ~FMODE_READWRITE;
    return self;
}
Apart from the block form of StringIO.open, the finalize method is called only when the object is garbage collected. No other StringIO method frees the string reference.
StringIO#close just sets a flag. It does not free the reference to the accumulated string or in any other way affect resource usage:
static VALUE
strio_close(VALUE self)
{
    struct StringIO *ptr = StringIO(self);
    if (CLOSED(ptr)) {
        rb_raise(rb_eIOError, "closed stream");
    }
    ptr->flags &= ~FMODE_READWRITE;
    return Qnil;
}
And lastly, when you call StringIO#string, you get a reference to the exact same string that the StringIO instance has been accumulating:
static VALUE
strio_get_string(VALUE self)
{
    return StringIO(self)->string;
}
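The same behavior can be observed from Ruby without reading the C source. The following is a minimal sketch, assuming a reasonably recent MRI with the bundled stringio library; the variable names are made up for illustration.

```ruby
require 'stringio'

# StringIO wraps the string it is given; it does not copy it.
buffer = String.new("abc")
sio = StringIO.new(buffer)
same_object = sio.string.equal?(buffer)

# Closing only flips the read/write flags; the string stays reachable
# through the StringIO instance.
sio.close
still_there = sio.string
```

Here same_object is true, and still_there is the very same object as buffer even though the StringIO has been closed.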
How to leak memory when using StringIO
All of this means that there is only one way for a StringIO instance to cause a resource leak: you must keep the StringIO object around longer than you keep the string you got when you called StringIO#string. For example, imagine a class having a StringIO object as an instance variable:
require 'stringio'

class Leaker
  def initialize
    @sio = StringIO.new
    @sio.puts "Here's a large file:"
    @sio.puts
    @sio.write File.read('/path/to/a/very/big/file')
  end

  # Returns the very string that @sio keeps referencing.
  def result
    @sio.string
  end
end
Imagine that the user of this class gets the result, uses it briefly, and then discards it, and yet keeps a reference to the instance of Leaker. You can see that the Leaker instance retains a reference to the result via the StringIO instance it still holds. This could be a problem if the file is very large, or if there are many extant instances of Leaker.
This simple (and deliberately pathological) example can be fixed by simply not keeping the StringIO as an instance variable. When you can (and you almost always can), it's better to simply throw away the StringIO object than to go through the bother of closing it explicitly:
class NotALeaker
  attr_reader :result

  def initialize
    sio = StringIO.new
    sio.puts "Here's a large file:"
    sio.puts
    sio.write File.read('/path/to/a/very/big/file')
    @result = sio.string
  end
end
Add to all of this that these leaks only matter when the strings are large or the StringIO instances numerous and the StringIO instance is long lived, and you can see that explicitly closing StringIO is seldom, if ever, needed.
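As mentioned earlier, the block form of StringIO.open runs the finalize function when the block exits, so it is another way to make sure the StringIO stops referencing the string. The sketch below is only an illustration: BlockBuilder is a hypothetical class, the body argument stands in for the big file, and the @sio variable is kept solely to show what the instance looks like afterwards.

```ruby
require 'stringio'

class BlockBuilder
  attr_reader :result, :sio

  def initialize(body)
    # StringIO.open returns the value of the block.
    @result = StringIO.open do |io|
      @sio = io # kept only so we can inspect it after the block
      io.puts "Here's a large file:"
      io.puts
      io.write body
      io.string
    end
  end
end

builder = BlockBuilder.new("pretend this is very big")
builder.result      # the accumulated string
builder.sio.string  # nil: the reference was dropped when the block exited
```

Even though this class holds on to the StringIO, the StringIO no longer holds on to the string, so discarding the result is enough to make it collectable.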
Should I close StringIO instances explicitly?
Well, reads and writes go straight to the underlying string; there are no extra buffers to flush, and no OS-level resources to return.
The only reasons you might want to close the StringIO are to make subsequent I/O on it fail, or to make closed? return true, which could be useful if you gave that StringIO to some other component. On the other hand, if you're just going to discard the StringIO a moment later, it doesn't matter in the slightest; the garbage collector doesn't care if it's marked as open or closed.
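A small sketch of that one observable effect, assuming the standard stringio library (the variable names are illustrative): once closed, the StringIO reports closed? as true and further writes raise IOError.

```ruby
require 'stringio'

sio = StringIO.new
sio.write "report"
sio.close

# Any component that still holds this StringIO can no longer write to it.
error = begin
  sio.write "more"
  nil
rescue IOError => e
  e
end
```

After this runs, sio.closed? is true and error holds the IOError raised by the rejected write.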
How can I clear a `StringIO` instance?
seek and rewind only affect the position of the next read/write operation, not the content of the internal storage.
You can use StringIO#truncate like File#truncate:
require 'stringio'
io = StringIO.new
io.write("foo")
io.string
# => "foo"
io.truncate(0) # <---------
io.string
# => ""
Alternative:
You can also use StringIO#reopen (NOTE: unlike the reopen that File inherits from IO, which expects another IO or a path, StringIO#reopen accepts a string):
io.reopen("")
io.string
# => ""
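One caveat, shown here as a hedged sketch rather than a complete recipe: truncate changes the content but, as far as I can tell, leaves the current position alone, so a write that follows it starts at the old offset and the gap is filled with NUL bytes. Rewinding as well avoids that. The variable names are illustrative.

```ruby
require 'stringio'

io = StringIO.new
io.write("foo")
io.truncate(0)
io.write("bar")          # position is still 3, so the gap is NUL-padded
padded = io.string.dup   # dup, because io.string is mutated again below

io.truncate(0)
io.rewind                # reset the position as well as the content
io.write("bar")
clean = io.string
```

Here padded is "\0\0\0bar" while clean is "bar".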
What are the advantages to using StringIO in Ruby as opposed to String?
Basically, it makes a string look like an IO object, hence the name StringIO.
The StringIO class has read and write methods, so it can be passed to parts of your code that were designed to read from and write to files or sockets. It's nice if you have a string and you want it to look like a file for the purposes of testing your file code.
require 'stringio'

# Code under test: writes to anything that responds to write.
def foo_writer(file)
  file.write "foo"
end

def test_foo_writer
  s = StringIO.new
  foo_writer(s)
  raise 'fail' unless s.string == 'foo'
end
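The same trick works in the reading direction. In this minimal sketch, count_nonblank_lines is a hypothetical helper written for any IO-like object; a StringIO stands in for the file it would normally receive.

```ruby
require 'stringio'

# Hypothetical helper: works with any object that responds to each_line.
def count_nonblank_lines(io)
  io.each_line.count { |line| !line.strip.empty? }
end

fake_file = StringIO.new("alpha\n\nbeta\ngamma\n")
count = count_nonblank_lines(fake_file) # 3
```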
Ruby's File.open and the need for f.close
I saw many times in ruby codes unmatched File.open calls
Can you give an example? I only ever see that in code written by newbies who lack the "common knowledge in most programming languages that the flow for working with files is open-use-close".
Experienced Rubyists either explicitly close their files, or, more idiomatically, use the block form of File.open, which automatically closes the file for you. Its implementation basically looks something like this:
def File.open(*args, &block)
  return open_with_block(*args, &block) if block_given?
  open_without_block(*args)
end

def File.open_without_block(*args)
  # do whatever ...
end

def File.open_with_block(*args)
  f = open_without_block(*args)
  yield f
ensure
  f.close if f # f is nil if opening itself failed
end
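You can check the effect of the block form directly. This is a small sketch using a temporary directory from the standard tmpdir library; the file name and variable names are arbitrary.

```ruby
require 'tmpdir'

closed_normally = nil
closed_after_error = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'example.txt')

  handle = nil
  File.open(path, 'w') do |f|
    handle = f
    f.write 'hello'
  end
  closed_normally = handle.closed?

  begin
    File.open(path) do |f|
      handle = f
      raise 'boom'
    end
  rescue RuntimeError
    # the exception still propagates, but the file was closed first
  end
  closed_after_error = handle.closed?
end
```

Both closed_normally and closed_after_error end up true: the file is closed when the block returns and also when it raises.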
Scripts are a special case. Scripts generally run for such a short time, and use so few file descriptors, that it simply doesn't make sense to close them, since the operating system will close them anyway when the script exits.
Do we need to explicitly close?
Yes.
If yes then why does the GC autoclose?
Because after it has collected the object, there is no way for you to close the file anymore, and thus you would leak file descriptors.
Note that it's not the garbage collector that closes the files. The garbage collector simply executes any finalizers for an object before it collects it. It just so happens that the File class defines a finalizer which closes the file.
If not then why the option?
Because wasted memory is cheap, but wasted file descriptors aren't. Therefore, it doesn't make sense to tie the lifetime of a file descriptor to the lifetime of some chunk of memory.
You simply cannot predict when the garbage collector will run. You cannot even predict if it will run at all: if you never run out of memory, the garbage collector will never run, therefore the finalizer will never run, therefore the file will never be closed.
Ruby StringIO for concurrent reading and writing
You should consider using a Queue. If you do not need thread safety, then a simple array might be fine too.
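A minimal sketch of the Queue approach, with one producer thread and one consumer thread. The :done sentinel is just a convention chosen for this example, not something Queue requires.

```ruby
queue = Queue.new

producer = Thread.new do
  3.times { |i| queue << "chunk #{i}" }
  queue << :done # sentinel telling the consumer to stop
end

received = []
consumer = Thread.new do
  # pop blocks until an item is available
  while (item = queue.pop) != :done
    received << item
  end
end

[producer, consumer].each(&:join)
```

With a single producer the items arrive in the order they were pushed, so received ends up as ["chunk 0", "chunk 1", "chunk 2"].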
Is there such a thing as opening a StringIO for writing?
Here's your problem:
# frozen_string_literal: true
Your string contents is frozen and can't be modified. StringIO signals this with the IOError mentioned above.
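A short sketch of both the failure and one possible fix. It freezes the string explicitly instead of relying on the magic comment, and the variable names other than contents are illustrative.

```ruby
require 'stringio'

contents = 'abc'.freeze

# A StringIO over a frozen string is opened read-only.
read_only = StringIO.new(contents)
error = begin
  read_only.write('x')
  nil
rescue IOError => e
  e
end

# Passing an unfrozen copy (contents.dup or +contents) makes it writable.
writable = StringIO.new(contents.dup)
writable.write('x')
result = writable.string # "xbc", while contents is still "abc"
```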
Why ruby StringIO does not give different encodings
Let's dissect your code...
a.read(2)
This reads two bytes from the stream and returns a String. As you are reading a specific number of bytes, Ruby can't guarantee any character boundaries. Because of this, it is specified that the returned string will be binary encoded, i.e. Encoding::ASCII_8BIT.
In your next line, you are using
a.read
You are thus reading until the end of the stream and returning all remaining data. The encoding of the returned string can either be given as an argument to the read method or default to your defined external encoding (in your case UTF-8).
Now, as you have read to the end of the stream, any subsequent reads will either result in an error or simply return an empty string. In the case of StringIO, this happens to be a binary string. Although I didn't find any documentation about this specific case, it's clearly defined in MRI's code of the StringIO class.
a.read will thus return an empty string in binary encoding.
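The sequence can be reproduced with a short script. This is only a sketch: the sample string is made up, and the encoding of the empty string returned at end of stream may depend on the StringIO version in use, so it is left for you to inspect rather than stated here.

```ruby
require 'stringio'

a = StringIO.new("h\u00e9llo") # UTF-8 string with one two-byte character

partial = a.read(2) # exactly two bytes, cutting the "é" in half
rest = a.read       # everything that remains
at_eof = a.read     # empty string, since nothing is left

partial_encoding = partial.encoding # binary (ASCII-8BIT)
rest_encoding = rest.encoding       # UTF-8
eof_encoding = at_eof.encoding      # inspect this on your Ruby version
```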