Profiling Ruby Code

Profiling Ruby Code

Lots of profilers are like that.
What you need to know is not where the program spends its time, but why. Any references on Dynamic Code Analysis?

ADDED: Here's how I find "bottlenecks" in my code. (I hate that word.)
Here's a list of reasons why.

It is perfectly natural to assume that to find "bottlenecks" you have to somehow do a lot of measuring.
It is so natural that nearly all profilers are based on it.

Actually, finding and measuring are not the same problem. Measuring is needed to see if what you found (and fixed) made a difference. Finding what to fix, to me, is more like debugging than measuring.

The simplest way to explain it is to start from an infinite, or nearly infinite loop. How do you find it? You pause it and look at the stack, right? because you know the problem is somewhere on the stack. You only need to pause it once, and then you need to study the code on the stack. Pause it a few times if you want to be sure you've found it.

Suppose the code only takes twice as long as necessary. That means when you pause it, there is a 50% chance you will see it doing the unnecessary thing. If you pause it and look at it 10 times, you will catch it in the act roughly 5 times. In fact, as soon as you see it doing something you can optimize on as few as 2 samples, you've found a "bottleneck". Fix it, measure the speedup, show it off, and repeat.

Even if your biggest problem is not very big, this method will eventually find it.
Also, there's a magnification phenomenon, where small problems become easier to find after you've removed larger ones. That allows you to keep going until the code is nearly optimal.

P.S. After you've done this, there may still be opportunities for speedup. For example, optimization algorithms can depend on numerical stability. Message-driven architectures can make it harder to trace why code is being executed. In real-time software, a performance problem may only happen occasionally, and be less easy to sample. This requires more cleverness. Falling back on just measuring doesn't do it.

How to read ruby profiler's output

tl;dr self seconds + time spent in called methods

Okay, let us dig into the source. There are two procs, gathering information there: PROFILE_CALL_PROC, PROFILE_RETURN_PROC. The former gets called before stepping into a method, the latter before stepping out.

@@maps[Thread.current] accumulates information about methods, namely calls, total seconds (cost), self seconds and name in this particular order. This very information is served later to the user, in a more wordy fashion. After being aggregated.

@@stacks[Thread.current] stores information about methods being run, in a stack. Namely "when the method was started", and "how much time does called (child) methods took" (tick[1]). This is more of a temporary data structure destined to help gathering data into @@maps.

As one can supposedly see now, self seconds is total seconds minus cost (time spent in called methods). That is, total seconds is a time spent both in the method itself and in the methods it calls.

P.S. Pretty much reminds flat profiles in ruby-prof.

How can I profile a request in Ruby on Rails?

ruby-prof is the way to go. Here's a howto, How to profile your Rails and Ruby applications with ruby-prof.

If you use it in combination with a visualisation tool such as kcachegrind, you can easily separate the code that is your application code from the framework code.

I gave a talk on these tools at my local Ruby-users group a while back: Making your rails app kick ass with ruby-prof and kcachegrind.

Understanding ruby-prof output

Each section of the ruby-prof output is broken up into the examination of a particular function. for instance, look at the first section of your output. The read_frames method on FrameParser is the focus and it is basically saying the following:

  • 100% of the execution time that was profiled was spent inside of FrameParser#read_frames
  • FrameParser#read_frames was called 6 times.
  • 5 out of the 6 calls to read_frames came from FrameParser#receive_data and this accounted 100% of the execution time (this is the line above the read_frames line).
  • The lines below the read_frames (but within that first section) method are all of the methods that FrameParser#read_frames calls (you should be aware of that since this seems like it's your code), how many of that methods total calls read_frames is responsible for (the a/b calls column), and how much time those calls took. They are ordered by which of them took up the most execution time. In your case, that is receive_frame method on the ChatServer class.
  • You can then look down at the section focusing on receive_frames (2 down and centered with the '100%' line on receive_frame) and see how it's performance is broken down. each section is set up the same way and usually the subsequent function call which took the most time is the focus of the next section down. ruby-prof will continue doing this through the full call stack. You can go as deep as you want until you find the bottleneck you'd like to resolve.

How to profile JRuby code?

JRuby 1.5.5 includes a decent --profile command line flag.

Use it like this:

jruby --profile -S <script> <args>


Related Topics



Leave a reply



Submit