How do I track down a memory leak in my Ruby code?
I did not find ruby-prof very useful when it came to locating memory leaks, because you need a patched Ruby interpreter. Tracking object allocation has become easier in Ruby 2.1, so exploring this yourself is probably the best choice.
I recommend the blog post Ruby 2.1: objspace.so by tmm1, who is one of the Ruby core developers. Basically you can fetch a lot of information while debugging your application:
ObjectSpace.each_object{ |o| ... }
ObjectSpace.count_objects #=> {:TOTAL=>55298, :FREE=>10289, :T_OBJECT=>3371, ...}
require 'objspace'
ObjectSpace.memsize_of(o) #=> 0 /* additional bytes allocated by object */
ObjectSpace.count_tdata_objects #=> {Encoding=>100, Time=>87, RubyVM::Env=>17, ...}
ObjectSpace.count_nodes #=> {:NODE_SCOPE=>2, :NODE_BLOCK=>688, :NODE_IF=>9, ...}
ObjectSpace.reachable_objects_from(o) #=> [referenced, objects, ...]
ObjectSpace.reachable_objects_from_root #=> {"symbols"=>..., "global_tbl"=>...} /* in 2.1 */
With Ruby 2.1 you can even start to track allocation of new objects and gather metadata about every new object:
require 'objspace'
ObjectSpace.trace_object_allocations_start
class MyApp
  def perform
    "foobar"
  end
end
o = MyApp.new.perform
ObjectSpace.allocation_sourcefile(o) #=> "example.rb"
ObjectSpace.allocation_sourceline(o) #=> 6
ObjectSpace.allocation_generation(o) #=> 1
ObjectSpace.allocation_class_path(o) #=> "MyApp"
ObjectSpace.allocation_method_id(o) #=> :perform
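When many objects are involved, it can help to group live objects by the place they were allocated. The following is a minimal sketch of that idea using only the tracing calls shown above; the `retained` array and the `sites` tally are illustrative names, not part of any library.

```ruby
require 'objspace'

# Record allocation metadata only for the code we are interested in.
ObjectSpace.trace_object_allocations_start
retained = Array.new(1_000) { |i| "item #{i}" } # stand-in for suspicious code
ObjectSpace.trace_object_allocations_stop

# Tally live strings by "file:line" of their allocation site.
sites = Hash.new(0)
ObjectSpace.each_object(String) do |obj|
  file = ObjectSpace.allocation_sourcefile(obj)
  next unless file # objects allocated while tracing was off have no metadata
  sites["#{file}:#{ObjectSpace.allocation_sourceline(obj)}"] += 1
end

# Drop the recorded metadata once we are done with it.
ObjectSpace.trace_object_allocations_clear

sites.sort_by { |_site, count| -count }.first(5).each do |site, count|
  puts "#{site} allocated #{count} live strings"
end
```

A site whose count keeps climbing between runs is a good place to start looking.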
Use pry and pry-byebug to explore the memory heap at the points where you think it is likely to grow, and try different segments of your code. Before Ruby 2.1 I always relied on ObjectSpace.count_objects and calculated the difference between two results to see whether one object type grows in particular.
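A minimal sketch of that diffing approach is below. `count_objects_delta` is a hypothetical helper name, and the block passed to it stands in for whatever segment of your code you want to inspect.

```ruby
# Hypothetical helper: report which object types grew while the block ran.
def count_objects_delta
  GC.start
  before = ObjectSpace.count_objects
  yield
  GC.start
  after = ObjectSpace.count_objects
  after.each_with_object({}) do |(type, count), delta|
    diff = count - before.fetch(type, 0)
    delta[type] = diff unless diff.zero?
  end
end

retained = [] # stand-in for a structure that keeps references alive
delta = count_objects_delta do
  10_000.times { |i| retained << "row #{i}" }
end

# Print the object types that grew the most.
delta.select { |_type, diff| diff > 0 }
     .sort_by { |_type, diff| -diff }
     .first(5)
     .each { |type, diff| puts "#{type}: +#{diff}" }
```

Running GC before each snapshot keeps short-lived garbage out of the comparison, so what remains is mostly objects that are still referenced.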
Garbage collection is working properly when the number of objects drops back to a much smaller amount between iterations instead of growing continuously. The garbage collector should run regularly anyway; you can reassure yourself by looking at the garbage collector statistics.
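One quick way to confirm that the collector is actually running is to compare `GC.count` before and after a burst of allocations. This is only a sketch; the loop is an arbitrary stand-in for real work.

```ruby
runs_before = GC.count

# Allocate plenty of short-lived strings so the collector has work to do.
500_000.times { "temporary " * 10 }

runs_after = GC.count
puts "GC ran #{runs_after - runs_before} times during the loop"

# GC.stat exposes more detail; the exact keys vary between Ruby versions.
stat = GC.stat
puts "total GC runs so far: #{stat[:count]}"
```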
From my experience the type that grows is either String or Symbol (T_STRING). Symbols were not garbage collected before Ruby 2.2.0, so make sure your CSV, or parts of it, is not converted into symbols along the way.
If you do not feel comfortable, try to run your code on the JVM with JRuby. At least the memory profiling is a lot better supported with tools like VisualVM.
How can I track down a memory leak in a rails app?
I managed to figure it out. I was instantiating runtime classes on each iteration, which apparently do not get GC'd. Refactoring to not use Class.new fixed the problem.
So if anyone's googling this, Class.new creates memory leaks.
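Whether anonymous classes are reclaimed depends on what still references them, so it is worth measuring on your own interpreter rather than assuming either way. A rough sketch, where `class_count` is a hypothetical helper:

```ruby
# Count live Class objects after forcing a full GC.
def class_count
  GC.start
  ObjectSpace.each_object(Class).count
end

before = class_count

# Create many throwaway classes, as the code in the question did per iteration.
1_000.times do
  Class.new { def call; :ok; end }.new.call
end

after = class_count
puts "classes before: #{before}, after: #{after}, retained: #{after - before}"
```

If the retained number stays close to the number of iterations, the classes are being kept alive and caching or reusing them is the safer route.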
Track down Memory leaks in a Ruby Script
I've heard good things about the Ruby Memory Tracking API but it is not free.
There is also a useful blog post for using valgrind to find ruby memory leaks.
There are other solutions for Ruby on Rails but it doesn't seem like you are using rails at all.
Finding the cause of a memory leak in Ruby
It looks like you are entering The Lost World here. I don’t think the problem is with c-bindings in racc either.
Ruby memory management is both elegant and cumbersome. It stores objects (named RVALUEs) in so-called heaps of approximately 16KB each. On a low level, an RVALUE is a C struct containing a union of the different standard Ruby object representations.
So heaps store RVALUE objects, whose size is no more than 40 bytes. For objects such as String, Array, Hash etc. this means that small objects fit in the heap, but as soon as they reach a threshold, extra memory outside of the Ruby heaps is allocated.
This extra memory is flexible; it will be freed as soon as the object is GC’ed. That’s why your testcase with big_string shows the memory up-down behaviour:
def report
  puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`
    .strip.split.map(&:to_i)[1].to_s + 'KB'
end
report
big_var = " " * 10000000
report
big_var = nil
report
ObjectSpace.garbage_collect
sleep 1
report
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 11788KB
But the heaps themselves (see GC.stat[:heap_length]) are not released back to the OS once acquired. Look, I’ll make a humdrum change to your testcase:
- big_var = " " * 10000000
+ big_var = 1_000_000.times.map(&:to_s)
And, voilà:
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 57448KB
The memory is not released back to the OS anymore, because each element of the array I introduced fits the RVALUE size and is stored in the Ruby heap.
If you examine the output of GC.stat after the GC has run, you’ll find that the GC.stat[:heap_used] value has decreased as expected. Ruby now has a lot of empty heaps, ready for reuse.
To sum up: I don’t think the C code leaks. I think the problem lies in the base64 representation of a huge image in your css. I have no clue what’s happening inside the parser, but it looks like the huge string forces the Ruby heap count to increase.
Hope it helps.
Want to determine whether we have memory leak in ROR
Because of garbage collection, ruby will never have the kind of memory leak that occurs in C/C++ where the program was supposed to free the memory it no longer needs but did not.
What can happen is runaway memory because you are holding on to references which you don't need. Typically this happens when you keep things in class instance collections but don't cull the list as things are unneeded or old.
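As an illustration of that pattern, here is a hypothetical pair of classes: one keeps every entry in a class-level collection forever, the other culls old entries. The class names and the limit of 100 are made up for the example.

```ruby
# Class-level collection that is never culled: it grows for the process lifetime.
class LeakyRegistry
  @entries = []

  class << self
    attr_reader :entries

    def record(entry)
      @entries << entry
    end
  end
end

# Same idea, but old entries are dropped once a limit is reached.
class BoundedRegistry
  MAX_ENTRIES = 100
  @entries = []

  class << self
    attr_reader :entries

    def record(entry)
      @entries << entry
      @entries.shift while @entries.size > MAX_ENTRIES
    end
  end
end

1_000.times do |i|
  LeakyRegistry.record("request #{i}")
  BoundedRegistry.record("request #{i}")
end

puts LeakyRegistry.entries.size   # 1000
puts BoundedRegistry.entries.size # 100
```

Nothing in the first class is a leak in the C sense; the garbage collector simply cannot free objects that are still reachable from the class.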
Another thing which can happen is an interaction between ruby memory management and the OS memory allocator. There is a very good article, What causes Ruby memory bloat by Hongli Lai on this. This might be something you can do very little about since it is not a memory "leak" in your code.
A feature was added in Ruby 2.7 that addresses the issue in Hongli Lai's article: GC.compact, which is not called automatically but will defragment the Ruby heap.
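Because GC.compact is missing on older interpreters and unsupported on some platforms, a guarded call is a reasonable way to try it. This is a minimal sketch; `compact_heap_if_supported` is a hypothetical helper name.

```ruby
# Returns true when compaction ran, false when it is unavailable.
def compact_heap_if_supported
  return false unless GC.respond_to?(:compact)

  GC.compact
  true
rescue NotImplementedError
  # Some platforms define the method but cannot compact.
  false
end

compacted = compact_heap_if_supported
puts "heap compacted: #{compacted}"
```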
ruby/ruby on rails memory leak detection
Some tips to find memory leaks in Rails:
- use the Bleak House plugin
- implement Scout monitoring specifically the memory usage profiler
- try another simple memory usage logger
The first is a graphical exploration of memory usage by objects in the ObjectSpace.
The last two will help you identify specific usage patterns that are inflating memory usage, and you can work from there.
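If you only need a very simple memory usage logger, something along these lines may be enough. It is a sketch that assumes a Linux `/proc` filesystem or a `ps` binary is available; `rss_kb` and `log_memory` are hypothetical helper names, not part of any of the tools listed above.

```ruby
# Resident set size of the current process in KB, or nil when it cannot be read.
def rss_kb
  if File.readable?("/proc/self/status")
    line = File.foreach("/proc/self/status").find { |l| l.start_with?("VmRSS:") }
    return line.split[1].to_i if line
  end
  output = `ps -o rss= -p #{Process.pid}`.strip
  output.empty? ? nil : output.to_i
rescue Errno::ENOENT
  nil # no ps binary on this system
end

# Log the memory change caused by a block and return the block's result.
def log_memory(label)
  before = rss_kb
  result = yield
  after = rss_kb
  if before && after
    puts format("%-20s rss %d KB -> %d KB (%+d KB)", label, before, after, after - before)
  end
  result
end

data = log_memory("build strings") { Array.new(200_000) { |i| "line #{i}" } }
puts "built #{data.size} strings"
```

Wrapping individual requests or jobs in `log_memory` gives a rough picture of which ones inflate the process.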
As for specific coding-patterns, from experience you have to watch anything that's dealing with file io, image processing, working with massive strings and the like.
I would check whether you are using the most appropriate XML library - REXML is known to be slow and believed to be leaky (I have no proof of that!). Also check whether you can memoize expensive operations.
How to deal with Ruby 2.1.2 memory leaks?
From your GC logs it appears the issue is not a Ruby object reference leak, as the heap_live_slot value is not increasing significantly. That would suggest the problem is one of:
- Data being stored outside the heap (Strings, Arrays etc)
- A leak in a gem that uses native code
- A leak in the Ruby interpreter itself (least likely)
It's interesting to note that the problem exhibits on both OSX and Heroku (Ubuntu Linux).
Object data and the "heap"
Ruby 2.1 garbage collection uses the reported "heap" only for Objects that contain a tiny amount of data. When the data contained in an Object goes over a certain limit, the data is moved and allocated to an area outside of the heap. You can get the overall size of each data type with ObjectSpace:
require 'objspace'
ObjectSpace.count_objects_size({})
Collecting this along with your GC stats might indicate where memory is being allocated outside the heap. If you find a particular type, say :T_ARRAY, increasing a lot more than the others, you might need to look for an array you are forever appending to.
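A sketch of how such a comparison could look is below. `size_delta` is a hypothetical helper, and `ever_growing` stands in for the array that is forever appended to.

```ruby
require 'objspace'

# Report the change in bytes per object type while the block runs.
def size_delta
  GC.start
  before = ObjectSpace.count_objects_size
  yield
  GC.start
  after = ObjectSpace.count_objects_size
  after.each_with_object({}) do |(type, bytes), delta|
    diff = bytes - before.fetch(type, 0)
    delta[type] = diff unless diff.zero?
  end
end

ever_growing = []
delta = size_delta { 100_000.times { |i| ever_growing << i } }

# Print the types whose total size grew the most.
delta.sort_by { |_type, diff| -diff }.first(5).each do |type, diff|
  puts "#{type}: #{diff} bytes"
end
```

Sampling this periodically in a long-running process, rather than once, makes a steadily growing type stand out.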
You can use pry-byebug to drop into a console and trawl around specific objects, or even look at all objects from the root:
ObjectSpace.memsize_of(some_object)
ObjectSpace.reachable_objects_from_root
There's a bit more detail on one of the Ruby developers' blogs and also in this SO answer. I like their JRuby/VisualVM profiling idea.
Testing native gems
Use bundle to install your gems into a local path:
bundle install --path=.gems/
Then you can find those that include native code:
find .gems/ -name "*.c"
Which gives you (in my order of suspiciousness):
- digest-stringbuffer-0.0.2
- digest-murmurhash-0.3.0
- nokogiri-1.6.3.1
- json-1.8.1
OSX has a useful dev tool called leaks that can tell you if it finds unreferenced memory in a running process. It is not very useful for identifying where the memory comes from in Ruby, but it will help to identify when it is occurring.
First to be tested is digest-stringbuffer. Grab the example from the Readme and add in some GC logging with gc_tracer:
require "digest/stringbuffer"
require "gc_tracer"
GC::Tracer.start_logging "gclog.txt"
module Digest
  class Prime31 < StringBuffer
    def initialize
      @prime = 31
    end
    def finish
      result = 0
      buffer.unpack("C*").each do |c|
        result += (c * @prime)
      end
      [result & 0xffffffff].pack("N")
    end
  end
end
And make it run lots:
while true do
  a=[]
  500.times do |i|
    a.push Digest::Prime31.hexdigest( "abc" * (1000 + i) )
  end
  sleep 1
end
Run the example:
bundle exec ruby ./stringbuffertest.rb &
pid=$!
Monitor the resident and virtual memory sizes of the ruby process, and the count of leaks identified:
while true; do
  ps=$(ps -o rss,vsz -p $pid | tail -n +2)
  leaks=$(leaks $pid | grep -c Leak)
  echo "$(date) m[$ps] l[$leaks]"
  sleep 15
done
And it looks like we've found something already:
Tue 26 Aug 2014 18:22:36 BST m[104776 2538288] l[8229]
Tue 26 Aug 2014 18:22:51 BST m[110524 2547504] l[13657]
Tue 26 Aug 2014 18:23:07 BST m[113716 2547504] l[19656]
Tue 26 Aug 2014 18:23:22 BST m[113924 2547504] l[25454]
Tue 26 Aug 2014 18:23:38 BST m[113988 2547504] l[30722]
Resident memory is increasing and the leaks tool is finding more and more unreferenced memory. Confirm that the GC heap size and object count still look stable:
tail -f gclog.txt | awk '{ print $1, $3, $4, $7, $13 }'
1581853040832 468 183 39171 3247996
1581859846164 468 183 33190 3247996
1584677954974 469 183 39088 3254580
1584678531598 469 183 39088 3254580
1584687986226 469 183 33824 3254580
1587512759786 470 183 39643 3261058
1587513449256 470 183 39643 3261058
1587521726010 470 183 34470 3261058
Then report the issue.
It appears to my very untrained C eye that they allocate both a pointer and a buffer but only clean up the buffer.
Looking at digest-murmurhash, it seems to only provide functions that rely on StringBuffer, so the leak might go away once stringbuffer is fixed.
When they have patched it, test again and move onto the next gem. It's probably best to use snippets of code from your implementation for each gem test rather than a generic example.
Testing MRI
First step would be to prove the issue on multiple machines under the same MRI to rule out anything local, which you've already done.
Then try the same Ruby version on a different OS, which you've done too.
Try the code on JRuby or Rubinius if possible. Does the same issue occur?
Try the same code on 2.0 or 1.9 if possible, see if the same problem exists.
Try the head development version from github and see if that makes any difference.
If nothing becomes apparent, submit a bug to Ruby detailing the issue and all the things you have eliminated. Wait for a dev to help out and provide whatever they need. They will most likely want to reproduce the issue, so set up the most concise/minimal example of it that you can. Doing that will often help you identify what the issue is anyway.
Ruby code memory leak in loop
Try removing the lines from the bottom up and see whether the memory leak persists. It's possible that the leak is coming from the find method, possibly from JSON.parse (extremely unlikely), or from the custom Queue data structure. If the memory leak is still there after removing all of the lines, it is likely coming from the worker itself and/or the program running the workers.
q = Queue.new("test")
while true do
  m = q.dequeue # Finally remove this and stub the while true with a sleep or something
  body = JSON.parse(m.body) # Then remove these two lines
  user_id = body["Records"][0]
  user = V2::User.find(user_id) # Remove the bottom two lines first
  post = V2::Post.find(post_id)
end
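Instead of deleting lines by hand, you can also run each step in isolation and measure how many objects it leaves behind. The sketch below is an assumption-laden stand-in: the custom Queue and the V2 models from the question are not available here, so it uses a JSON payload and a constant array (`RETAINED`) to imitate a step that holds on to its results. `live_objects` and `retained_after` are hypothetical helpers.

```ruby
require 'json'

# Number of live objects after a full GC.
def live_objects
  GC.start
  counts = ObjectSpace.count_objects
  counts[:TOTAL] - counts[:FREE]
end

# Run the block repeatedly and report how many objects survived.
def retained_after(iterations)
  before = live_objects
  iterations.times { yield }
  live_objects - before
end

RETAINED = [] # stand-in for a structure that is never emptied
payload = '{"Records":[42]}'

steps = {
  "JSON.parse only"  => -> { JSON.parse(payload) },
  "parse and retain" => -> { RETAINED << JSON.parse(payload) }
}

results = steps.transform_values { |step| retained_after(5_000, &step) }
results.each { |name, retained| puts "#{name}: #{retained} objects retained" }
```

A step whose retained count scales with the number of iterations is the one to investigate further.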