Why Does My Program Run Way Faster When I Enable Profiling

Visual Studio - Program runs faster when profiling

Visual Studio's "CPU Usage" profiler appears to disregard laptop power usage settings, so if you run an application on a laptop that is trying to conserve battery power, it will run slower than if you run the profiler on it.

I discovered this when I got home from work: the speed difference had disappeared. On a hunch, I unplugged my laptop and ran the test several more times. The speed difference returned. What's more, under the profiler, the application runs at about the same speed whether the laptop is plugged in or not.

I was not able to find any sources on this, but I'll be happy to edit them in if someone can find some.

Why is my C# program faster in a profiler?

Luaan posted the solution in the comments above: it's the system-wide timer resolution. The default resolution is 15.6 ms; the profiler sets it to 1 ms.

I had the exact same problem, very slow execution that would speed up when the profiler was opened. The problem went away on my PC but popped back up on other PCs seemingly at random. We also noticed the problem disappeared when running a Join Me window in Chrome.

My application transmits a file over a CAN bus. The app loads a CAN message with eight bytes of data, transmits it and waits for an acknowledgment. With the timer set to 15.6ms each round trip took exactly 15.6ms and the entire file transfer would take about 14 minutes. With the timer set to 1ms round trip time varied but would be as low as 4ms and the entire transfer time would drop to less than two minutes.
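One rough way to see this effect on your own machine is to time a series of short sleeps. The sketch below is only an illustration: the `SleepGranularity` class and `AverageSleepMs` method are hypothetical names, not part of the original answer. On a Windows system at the default resolution you would typically expect an average close to 15.6 ms, and something much closer to 1-2 ms once the resolution has been raised, though exact numbers depend on the OS version and load.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper for illustration; not part of the original answer.
public static class SleepGranularity
{
    // Returns the average wall-clock duration, in milliseconds,
    // of `iterations` consecutive calls to Thread.Sleep(1).
    public static double AverageSleepMs(int iterations)
    {
        if (iterations <= 0)
            throw new ArgumentOutOfRangeException(nameof(iterations));

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            // Requests a 1 ms sleep; the actual wake-up time is
            // governed by the current timer resolution.
            Thread.Sleep(1);
        }
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

Calling `SleepGranularity.AverageSleepMs(100)` once normally and once while the profiler is running should show whether the timer resolution is what separates the two runs.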

You can verify your system timer resolution as well as find out which program increased the resolution by opening a command prompt as administrator and entering:

powercfg -energy -duration 5

The output file will have the following in it somewhere:

Platform Timer Resolution:Platform Timer Resolution
The default platform timer resolution is 15.6ms (15625000ns) and should be used whenever the system is idle. If the timer resolution is increased, processor power management technologies may not be effective. The timer resolution may be increased due to multimedia playback or graphical animations.
Current Timer Resolution (100ns units) 10000
Maximum Timer Period (100ns units) 156001

My current resolution is 1 ms (10,000 units of 100 ns), and the report follows it with a list of the programs that requested the increased resolution.

This information as well as more detail can be found here: https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted/

Here is some code to increase the timer resolution (originally posted as the answer to this question: how to set timer resolution from C# to 1 ms?):

using System.Runtime.InteropServices;
using System.Security;

public static class WinApi
{
    /// <summary>TimeBeginPeriod(). See the Windows API documentation for details.</summary>
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Interoperability", "CA1401:PInvokesShouldNotBeVisible")]
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Security", "CA2118:ReviewSuppressUnmanagedCodeSecurityUsage")]
    [SuppressUnmanagedCodeSecurity]
    [DllImport("winmm.dll", EntryPoint = "timeBeginPeriod", SetLastError = true)]
    public static extern uint TimeBeginPeriod(uint uMilliseconds);

    /// <summary>TimeEndPeriod(). See the Windows API documentation for details.</summary>
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Interoperability", "CA1401:PInvokesShouldNotBeVisible")]
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Security", "CA2118:ReviewSuppressUnmanagedCodeSecurityUsage")]
    [SuppressUnmanagedCodeSecurity]
    [DllImport("winmm.dll", EntryPoint = "timeEndPeriod", SetLastError = true)]
    public static extern uint TimeEndPeriod(uint uMilliseconds);
}

Use it like this to increase the resolution: WinApi.TimeBeginPeriod(1);

And like this to return to the default: WinApi.TimeEndPeriod(1);

The parameter passed to TimeEndPeriod() must match the parameter that was passed to TimeBeginPeriod().
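One way to keep the two calls paired is to wrap them in a disposable scope. This is a minimal sketch under stated assumptions, not part of the original answer: the `TimerResolutionScope` class and its `IsActive` property are hypothetical names, and the P/Invoke declarations are repeated so the example stands alone. It treats a return value of 0 (`TIMERR_NOERROR`) as success and does nothing on non-Windows platforms, where winmm.dll is not available.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration; not part of the original answer.
public sealed class TimerResolutionScope : IDisposable
{
    [DllImport("winmm.dll", EntryPoint = "timeBeginPeriod")]
    private static extern uint TimeBeginPeriod(uint uMilliseconds);

    [DllImport("winmm.dll", EntryPoint = "timeEndPeriod")]
    private static extern uint TimeEndPeriod(uint uMilliseconds);

    private const uint TimerrNoError = 0; // TIMERR_NOERROR

    private readonly uint _periodMs;
    private bool _active;

    public TimerResolutionScope(uint periodMs)
    {
        _periodMs = periodMs;
        // winmm.dll exists only on Windows; elsewhere the scope is a no-op.
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            _active = TimeBeginPeriod(periodMs) == TimerrNoError;
        }
    }

    // True while a successful TimeBeginPeriod request is outstanding.
    public bool IsActive => _active;

    public void Dispose()
    {
        if (_active)
        {
            // Pass the same value that was given to TimeBeginPeriod.
            TimeEndPeriod(_periodMs);
            _active = false;
        }
    }
}
```

With this in place, `using (new TimerResolutionScope(1)) { /* latency-sensitive work */ }` restores the previous resolution even if the body throws.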

Application runs faster with visual studio performance analysis

I found the answer:

The reason is because when you run your application within Visual
Studio, the debugger is attached to it. When you run it using the
profiler, the debugger is not attached.

If you try running the .exe by itself, or running the program through
the IDE with "Debug > Start Without Debugging" (or just press Ctrl+F5)
the application should run as fast as it does with the profiler.

https://stackoverflow.com/a/6629040/1563172

I didn't find it earlier because I thought the reason was concurrency.
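To confirm whether the debugger explanation applies to a particular run, the program can report whether a debugger is attached at startup. The sketch below uses the real `System.Diagnostics.Debugger.IsAttached` property; the `DebuggerCheck` class and its methods are hypothetical names added for illustration.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration; not part of the original answer.
public static class DebuggerCheck
{
    // Builds a short message for the given debugger state.
    public static string Describe(bool debuggerAttached) =>
        debuggerAttached
            ? "Debugger attached: timings may be slower than under the profiler."
            : "No debugger attached: timings should be comparable to a profiler run.";

    // Prints the message for the current process.
    public static void Report() =>
        Console.WriteLine(Describe(Debugger.IsAttached));
}
```

Calling `DebuggerCheck.Report()` at the start of `Main` and comparing an F5 run with a Ctrl+F5 run should show which case each timing belongs to.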

One could use a profiler, but why not just halt the program?

On Java servers it's always been a neat trick to hit Ctrl-Break 2-3 times in quick succession and get 2-3 thread dumps of all running threads. Simply looking at where all the threads "are" can very quickly pinpoint where your performance problems are.

This technique can reveal more performance problems in 2 minutes than any other technique I know of.

Why instrumented C program runs faster?

Since there are not many details in the question, I can only recommend some factors to consider while investigating the problem.

Even a small amount of additional work (such as incrementing a counter) might alter the compiler's decision on whether to apply certain optimizations. The compiler does not always have enough information to make the perfect choice. It may optimize for speed where the bottleneck is code size. It may auto-vectorize computations when there is too little data to make vectorization worthwhile. The compiler may not know what kind of data will be processed, or the exact model of CPU that will execute the code.

  1. Incrementing a counter may increase the size of a loop and prevent loop unrolling. This may decrease code size (and improve code locality, which is good for the instruction or micro-op caches or for the loop buffer, and lets the CPU fetch and decode instructions quickly).
  2. Incrementing a counter may increase the size of a function and prevent inlining. This may also decrease code size.
  3. Incrementing a counter may prevent auto-vectorization, which again may decrease code size.

Even if this change does not affect compiler optimization, it may alter the way the CPU executes the code.

  1. If you insert counter-incrementing code in a place full of branch targets, this may make the branch targets less dense and improve branch prediction.
  2. If you insert counter-incrementing code in front of a particular branch target, this may leave the branch target's address better aligned and make code fetch faster.
  3. If you place counter-incrementing code after some data is written but before the same data is loaded again (and store-to-load forwarding did not work for some reason), the extra instructions give the store time to complete, so the load may finish earlier.
  4. Inserting counter-incrementing code may separate two loads that would otherwise conflict on the same bank of the L1 data cache.
  5. Inserting counter-incrementing code may alter a CPU scheduler decision and make an execution port available just in time for a performance-critical instruction.

To investigate the effects of compiler optimization, compare the generated assembly code before and after adding the counter-incrementing code.

To investigate CPU effects, use a profiler that can read processor performance counters.


