How Much Performance Do You Get Out of Heroku Dynos/Workers

How much performance do you get out of Heroku dynos/workers?

This blog entry may be of use. The author gives a great breakdown of the kinds of bottlenecks a Heroku app can run into and how adding dynos can help, and links to Heroku's official performance guide as well as some tools that will help you test your own app.

Worker performance really depends on how your site is built and what you are using the workers for. Background processing (image formatting, account pruning, etc.), typically run through Delayed Job, is how you put them to work.
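With the delayed_job gem, for example, the pattern looks roughly like the sketch below; the AccountPruner class, its method, and the Account model are made-up names standing in for your own code, and a Rails app is assumed so ActiveSupport helpers like 1.year.ago are available.

    # Rough sketch assuming the delayed_job gem is installed and configured;
    # AccountPruner, prune_stale_accounts, and Account are illustrative names,
    # not part of any library.
    class AccountPruner
      def prune_stale_accounts
        Account.where("last_login_at < ?", 1.year.ago).find_each(&:destroy)
      end
    end

    # Enqueue the work so a worker dyno runs it instead of a web dyno:
    AccountPruner.new.delay.prune_stale_accounts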

EDIT // 1 March 2012:
Here is another blog entry that explores Heroku latency and throughput performance for a variable number of dynos.

EDIT // 28 Feb 2013:
There have been some concerns raised in this post regarding Heroku's random routing algorithm and how metrics, specifically those reported by New Relic, may be misreported when scaling dynos. This is still an ongoing issue and is worth noting in the context of my original answer. Heroku's responses are linked within the post.

EDIT // 8 May 2013:
A recent post on the Shelly Cloud blog analyses the impact of the number of dynos and the web server used on application performance. The baseline performance script used there should be useful for running further tests of your own.

Rails & Heroku: How many workers/dynos do I need?

Not sure you're asking the right question. Your real question is "how can I get better performance?" Not "how many dynos?" Just adding dynos won't necessarily give you better performance. More dynos give you more memory...so if your app is running slowly because you're running out of available memory (i.e. you're running on swap), then more dynos could be the answer. If those jobs take 10 seconds each to run, though...memory probably isn't your actual problem. If you want to monitor your memory usage, check out a visualization tool like New Relic.

There are a lot of approaches to solving your problem, but I would start with the code that you wrote. Posting some of that code here might help explain why the job takes 10 seconds (post some code!). Ten seconds is a long time, so optimizing the queries inside that job would almost certainly help.
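To illustrate the kind of thing worth looking for, here is a sketch of one common culprit, an N+1 query inside a job, and the usual fix; the Order model, its associations, and the notify helper are placeholders for whatever your job actually does.

    # Hypothetical slow version: one query for the orders, then more
    # queries per order for its user and line items (an N+1 pattern).
    Order.where(status: "pending").each do |order|
      notify(order.user, order.line_items.sum(:amount))
    end

    # Eager-load the associations and walk the table in batches instead;
    # summing in Ruby over the loaded records avoids the extra queries.
    Order.where(status: "pending")
         .includes(:user, :line_items)
         .find_each(batch_size: 500) do |order|
      notify(order.user, order.line_items.sum(&:amount))
    end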

Another piece of low-hanging fruit: switch from Resque to Sidekiq for your background jobs. It's really easy to use, you'll use less memory, and you should see an instant bump in performance.
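As a rough sketch of what that migration can look like (assuming the sidekiq gem and a Redis add-on; the worker name, queue, and Photo model are illustrative):

    require "sidekiq"

    # A Resque job class like this...
    class ImageResizeJob
      @queue = :images

      def self.perform(photo_id)
        Photo.find(photo_id).resize!
      end
    end

    # ...maps fairly directly onto a Sidekiq worker:
    class ImageResizeWorker
      include Sidekiq::Worker
      sidekiq_options queue: :images

      def perform(photo_id)
        Photo.find(photo_id).resize!
      end
    end

    # Enqueue from anywhere in the app (photo being whatever record you have in hand):
    ImageResizeWorker.perform_async(photo.id)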

What is the tradeoff between using larger dynos vs. more dynos on Heroku?

As far as I know, and as you already mentioned, the Heroku routers don't know or care about the load on a specific dyno and send each request to a dyno chosen by a random selection algorithm. https://devcenter.heroku.com/articles/http-routing

So it is very likely that you will see many consecutive requests going to the same dyno; I am observing that behaviour with my current applications.

But in your case, if a worker needs 230 MB, then you can't run 2 workers in a Standard-1X dyno because its memory limit is 512 MB. So I think you should definitely go for 2X dynos with 3-4 workers each; 2X dynos can use up to 1 GB of memory.
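To make that arithmetic concrete, here is a back-of-the-envelope sketch; the 230 MB figure is from the question, and the 20% headroom for the parent process and memory growth is my own guess, not a Heroku rule.

    # Rough workers-per-dyno estimate, leaving headroom for the parent
    # process, shared libraries, and memory growth over time.
    def workers_that_fit(dyno_memory_mb, per_worker_mb, headroom: 0.20)
      ((dyno_memory_mb * (1 - headroom)) / per_worker_mb).floor
    end

    workers_that_fit(512,  230)  # => 1  (two 230 MB workers leave almost no slack)
    workers_that_fit(1024, 230)  # => 3  (hence 3-4 workers on a 2X dyno)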

What is the difference between four 1X dynos and two 2X dynos in Heroku?

More dynos give you more concurrency. With a single-threaded web server such as thin, 4 x 1X dynos can serve four requests at the same time.

With 2 x 2X dynos you can only serve two.

2x dynos have more memory (1024MB) and more CPU available. This is useful if your application takes up a lot of memory. Most apps should not need 2x dynos.

Heroku have recently added PX dynos as well, which have significantly more power available.

You can read about the different dynos Heroku offers on their website.

Heroku queue times

Basically, break the problem down into its parts and test each part. Simply throwing a bunch of requests at a cluster of unicorns isn't necessarily a good way to measure throughput; you have to consider many variables (side note: check out "Programmers Need To Learn Statistics Or I Will Kill Them All" by Zed Shaw).

Also, you're leaving out critical information from your question for solving the mystery.

  • How many requests is each unicorn handling per second?
  • How long is the total test and are you allowing time for whatever cache you have to warm up?
  • How many total requests were handled by the collection?
  • I see in the chart that queuing time drops significantly from the initial spike at the left-hand side of the chart - any idea why? Is this startup time? Is this cache warming? Is it a flood of requests arriving disproportionately at the beginning of the test?

You're the only person who can answer these questions.

Queuing time, if I understand Heroku's setup correctly, is essentially the time new requests sit waiting for an available unicorn (or, to be more accurate with unicorn, how long requests sit before they are grabbed by a unicorn worker). If you're load testing and feeding the system more than it can handle, then, while your app itself may serve the requests it's ready for very quickly, there will still be a backlog of requests waiting for an available unicorn to process them.
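For context, the number of unicorn workers per dyno is what determines how quickly that backlog drains; a typical Heroku-style config/unicorn.rb looks roughly like the sketch below (the WEB_CONCURRENCY default and the timeout are illustrative values, not recommendations).

    # config/unicorn.rb -- rough sketch; tune worker_processes to your
    # dyno's memory and the timeout to your slowest acceptable request.
    worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
    timeout 15
    preload_app true

    before_fork do |server, worker|
      # Release the master's database connection so each forked worker
      # opens its own.
      defined?(ActiveRecord::Base) && ActiveRecord::Base.connection.disconnect!
    end

    after_fork do |server, worker|
      defined?(ActiveRecord::Base) && ActiveRecord::Base.establish_connection
    end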

Depending on your original setup, try the following variables in your test:

  • Same number of total requests, but run it longer to see if caches warm up more and speed up response times (i.e. unicorns handle more requests per second)
  • Adjust the number of requests per second to the total collection of unicorns available, both up and down, and observe at what thresholds the queuing times get better and worse
  • Simplify the test. First, test just a single unicorn process and figure out how long it takes to warm up, how many requests per second it can handle, and at what point queuing times start to increase due to backlogs (see the sketch after this list). Then add unicorn processes and repeat the tests, trying to find out whether, with 3 unicorns, you get 3x the performance, or whether there's some % overhead in adding more unicorns (e.g. the overhead of load balancing the incoming requests), and whether that overhead is negligible or not, etc.
  • Make sure the requests are all very similar. If some requests just return a front page with 100% cached, non-dynamic content, their processing times will be much shorter than for requests that need to generate a variable amount of dynamic content, which will throw off your test results considerably.
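For the single-process test mentioned above, a crude Ruby sketch along these lines can get you started; the URL, thread count, and request count are placeholders, and a purpose-built tool (ab, siege, wrk) will give you more trustworthy numbers.

    require "net/http"
    require "uri"

    # Crude load-test sketch: fire REQUESTS requests at URL from THREADS
    # threads and report the response-time distribution. Placeholder values.
    URL      = URI("http://localhost:3000/")
    THREADS  = 4
    REQUESTS = 200

    times = Queue.new

    THREADS.times.map do
      Thread.new do
        (REQUESTS / THREADS).times do
          started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
          Net::HTTP.get_response(URL)
          times << Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
        end
      end
    end.each(&:join)

    samples = Array.new(times.size) { times.pop }.sort
    puts "requests: #{samples.size}"
    puts "mean:   #{(samples.sum / samples.size * 1000).round(1)} ms"
    puts "median: #{(samples[samples.size / 2] * 1000).round(1)} ms"
    puts "p95:    #{(samples[(samples.size * 0.95).floor - 1] * 1000).round(1)} ms"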

Also, find out if the test results chart you're showing above is an average, or a 95th percentile with standard deviations, or some other measurement.

Only after you've broken the problem down into its component parts will you know with any predictability whether or not adding more unicorns will help. Looking at this basic chart and asking, "Should I just add more unicorns?" is like having a slow computer and asking, "Should I just add more RAM to my machine?". While it may help, you're skipping the step of actually understanding why something is slow, and adding more of something won't give you any deeper understanding of why it's slow. Because of this (and especially on Heroku), you might wind up overpaying for dynos you don't need; if you can get to the root of what is causing the longer-than-expected queuing times, you'll be in much better shape.

This approach, of course, isn't unique to heroku. Trying experiments, tweaking variables, and recording the outcome measurements will allow you to pick apart what's going on inside those performance numbers. Understanding the "why" will enable you to take specific, educated steps that should have mostly predictable effects on overall performance.

After all of that you may find that yes, the best way to improve performance in your specific case is to add more unicorns, but at least you'll know why and when to do so, and you'll have a really solid guess as to how many to add.

Scaling Dyno worker size dynamically on Heroku Rails application

The best way is the one you mentioned: use the Heroku Platform API to scale your Dyno size up before starting the job, and then down again afterwards.

This is because tools like HireFire only work by inspecting things like application response time and router queue depth, so there's no way for them to know you're about to run some job and scale up just for that.
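If your app is Ruby, a rough sketch with Heroku's platform-api gem might look like this; the app name, process type, dyno sizes, job class, and the HEROKU_OAUTH_TOKEN variable are all placeholders you would swap for your own.

    require "platform-api"

    # Illustrative names -- substitute your own app, process type, and sizes.
    APP     = "my-app"
    PROCESS = "worker"

    heroku = PlatformAPI.connect_oauth(ENV.fetch("HEROKU_OAUTH_TOKEN"))

    # Scale the worker dynos up, run the heavy job, then scale back down
    # even if the job raises.
    heroku.formation.update(APP, PROCESS, "size" => "performance-l")
    begin
      HeavyReportJob.new.perform   # placeholder for your actual job
    ensure
      heroku.formation.update(APP, PROCESS, "size" => "standard-1x")
    end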


