iOS 11 CPU Throttling and Idle Timer

Create a timed operation that controls memory and CPU consumption in iOS

Once your app is in the background, you no longer have control of your resources. There are a few things you can do to keep some of them, but if your app attempts to hog or deliberately consume resources, it will be terminated.

While your app is in the foreground, you can use as much memory and CPU time as you want, but at your own risk. If you stall the main thread for too long, the watchdog will kill your app. If the system believes a background thread is hung at 100% CPU usage, your app will also be killed.
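To illustrate both points with a hedged sketch (the function names here are hypothetical, not from the original question): keep heavy work off the main thread so the watchdog never sees a stalled UI, and wrap it in a background-task assertion so the system grants a little extra time instead of terminating the app if it moves to the background mid-task.

import UIKit

// Hypothetical stand-in for the real work.
func doExpensiveCalculation() {
    _ = (1...1_000_000).reduce(0, +)
}

func startHeavyWork() {
    // Ask for a little extra runtime in case the app is backgrounded mid-task.
    var taskID: UIBackgroundTaskIdentifier = .invalid
    taskID = UIApplication.shared.beginBackgroundTask {
        // Expiration handler: the system is about to reclaim resources; end the task promptly.
        UIApplication.shared.endBackgroundTask(taskID)
        taskID = .invalid
    }

    // Keep the expensive work off the main thread so the watchdog never sees a stalled UI.
    DispatchQueue.global(qos: .utility).async {
        doExpensiveCalculation()

        DispatchQueue.main.async {
            if taskID != .invalid {
                UIApplication.shared.endBackgroundTask(taskID)
                taskID = .invalid
            }
        }
    }
}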

Why are you trying to do this? If we knew your goals, we may be able to offer better ways to achieve them. What you are suggesting now is a dead end.

iOS app slows down after a few seconds, speeds back up if paused and resumed in debugger

First of all, thank you @thst for trying really hard to help me out. My question was kind of dumb since it really could have been anything.

I eventually solved the problem by rendering (via OpenGL, a detail I forgot to include, which again shows how poor my question was) only when there is actually a change to the state of the objects and textures being rendered. Previously it rendered at 60 FPS all the time.
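For anyone else hitting this, here is a minimal sketch of that render-on-demand idea, assuming a dirty flag and a hypothetical render() placeholder for the actual OpenGL drawing:

import UIKit

class RenderViewController: UIViewController {
    // Set whenever the scene state changes (an object moves, a texture updates, etc.).
    private var needsRender = true
    private var displayLink: CADisplayLink?

    override func viewDidLoad() {
        super.viewDidLoad()
        displayLink = CADisplayLink(target: self, selector: #selector(step(_:)))
        displayLink?.add(to: .main, forMode: .common)
    }

    func sceneDidChange() {
        // Call this from whatever mutates the scene; the next display-link tick draws once.
        needsRender = true
    }

    @objc private func step(_ link: CADisplayLink) {
        guard needsRender else { return }   // skip the frame entirely when nothing changed
        needsRender = false
        render()
    }

    private func render() {
        // Hypothetical placeholder for the actual OpenGL draw call.
    }
}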

The app also uses CIDetector to detect faces. I think, but am not sure, that CIDetector uses the GPU to perform its detection. If so, there might have been contention for GPU resources, and CIDetector blocking on a wait may have caused the slowdown throughout the app.
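If you want to test that GPU-contention theory, one hedged option (my assumption, not something verified above) is to route Core Image through a CPU-only context and see whether the behavior changes:

import CoreImage

// Assumption: run CIDetector against a software-rendering CIContext so face detection
// stays on the CPU, removing any contention with the GPU render loop.
let cpuContext = CIContext(options: [.useSoftwareRenderer: true])

let detector = CIDetector(ofType: CIDetectorTypeFace,
                          context: cpuContext,
                          options: [CIDetectorAccuracy: CIDetectorAccuracyLow])

func detectFaces(in image: CIImage) -> [CIFaceFeature] {
    // features(in:) returns [CIFeature]; faces come back as CIFaceFeature instances.
    return detector?.features(in: image).compactMap { $0 as? CIFaceFeature } ?? []
}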

Thanks everyone.

Why are Xcode and Time Profiler reporting higher CPU usage for faster iOS devices?

I wrote a routine that performed a consistent calculation (calculating π by summing the Gregory-Leibniz series, throttled to 1.2 million iterations every 60th of a second, with a semaphore/display-link dance similar to the one in your example). Both the iPad mini 2 and the iPhone Xs Max were able to sustain the target 60 fps (the iPad mini 2 only just barely), and I saw CPU usage values more consistent with what one would expect. Specifically, the CPU usage was 47% on the iPhone Xs Max (iOS 13), but 102% on the iPad mini 2 (iOS 12.3.1):

iPhone Xs Max:

[screenshot]

iPad mini 2:

[screenshot]

I then ran this through the “Time Profiler” in Instruments with the following settings:

  • “High Frequency” sampling;
  • “Record Waiting Threads”;
  • “Deferred” or “Windowed” capture; and
  • Changed the call tree to sort by “state”.

For a representative time sample, the iPhone Xs Max reported that this thread was running only 48.2% of the time (basically, just waiting around for over half of the time):

[screenshot]

Whereas on the iPad mini 2, that thread was running 95.7% of the time (almost no spare capacity; it was calculating nearly all the time):

[screenshot]

Bottom line, this suggests that the particular queue on the iPhone Xs Max could probably do roughly twice as much as the iPad mini 2 could.

You can see that the Xcode debugger CPU graph and the Instruments “Time Profiler” are telling us fairly consistent stories. And they’re both consistent with our expectations: the iPhone Xs Max is going to be considerably less taxed than the iPad mini 2 by the exact same task.

In the interest of full disclosure, when I dropped the workload down (e.g., from 1.2 million iterations every 60th of a second to just 800k), the CPU utilization difference was less stark: the CPU usage was 48% on the iPhone Xs Max and 59% on the iPad mini 2. But the more powerful iPhone was still using less CPU than the iPad.

You asked:

  1. Why might the CPU usage % be higher for devices with faster CPUs? Any reason to think the results are not accurate?

A couple of observations:

  • I’m not sure you’re comparing apples to apples here. If you’re going to do this sort of comparison, make absolutely sure that the work done on each thread on each device is identical. (I love that quote I heard in a WWDC presentation years ago; to paraphrase, “in theory, there’s no difference between theory and practice; in practice, there’s a world of difference.”)

    If you had dropped frames or other time-based differences that split up the computations differently, the numbers might not be comparable, because other factors (context switches and the like) come into play. I’d make 100% sure that the calculations on the two devices are identical, or else comparisons will be misleading.

  • The debugger’s CPU “Percentage Used” is, IMHO, just an interesting barometer. That is, you want to ensure that the meter is nice and low when you don’t have anything going on, to confirm there isn’t some rogue task floating out there. Conversely, when doing something massively parallelized and computationally intensive, you can use it to make sure you don’t have some mistake that is preventing the device from being fully utilized (see the sketch after this list).

    But this debugger “Percentage Used” is not a number on which I’d hang my hat, in general. It’s always more illuminating to look at Instruments, identify threads that are blocked, look at utilization by CPU core, etc.

  • In your example, you’re placing a lot of emphasis on the debugger’s reporting of CPU “Percentage Used” of 47% on the iPad mini 2 vs. 85% on the iPhone Xs Max. But that ignores the capacity of each device: the iPad mini 2 has two cores (a 200% ceiling), so 47% is about a quarter of its overall capacity, whereas the iPhone Xs Max has six cores (a 600% ceiling), so 85% is only in the neighborhood of a sixth of its capacity. Bottom line, measured against overall capacity, the meter is less worrying than those raw percentages suggest.
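As a hypothetical illustration of that fully-utilized check (the function name and workload below are mine, not from the question), something trivially parallel dispatched with concurrentPerform should push the Xcode gauge close to cores × 100%; if it doesn’t, something is accidentally serializing the work:

import Foundation

// A sketch, assuming a trivially parallel workload: each of the `chunks`
// iterations runs on whatever core is free, so the Xcode CPU gauge should
// approach cores × 100% while this runs.
func saturateAllCores() {
    let chunks = ProcessInfo.processInfo.activeProcessorCount * 4
    DispatchQueue.concurrentPerform(iterations: chunks) { chunk in
        var sum = 0.0
        for i in 1...5_000_000 {
            sum += 1.0 / Double(i + chunk)   // arbitrary busy work
        }
        _ = sum
    }
}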


  2. How could I better interpret the figures or get better benchmarks between devices?

Yep, Instruments will always give you more meaningful, more actionable results.


  3. Why does “Record Waiting Threads” result in lower CPU usage percentages (but still not significantly different and sometimes higher for the faster device)?

I’m not sure which “percentages” you’re talking about. Most of the call tree percentages answer “when my code is running, what percentage of the time is spent where,” but in the absence of “Record Waiting Threads,” you’re missing a big part of the equation, namely where your code is waiting for something else. Both are important, and by including “Record Waiting Threads” you capture a more holistic picture (i.e., where the app is slow).


FWIW, here is the code that generated the above:

import UIKit

class ViewController: UIViewController {

    @IBOutlet weak var fpsLabel: UILabel!
    @IBOutlet weak var piLabel: UILabel!

    let calculationSemaphore = DispatchSemaphore(value: 0)
    let displayLinkSemaphore = DispatchSemaphore(value: 0)
    let queue = DispatchQueue(label: Bundle.main.bundleIdentifier! + ".pi", qos: .userInitiated)
    var times: [CFAbsoluteTime] = []

    override func viewDidLoad() {
        super.viewDidLoad()

        let displayLink = CADisplayLink(target: self, selector: #selector(handleDisplayLink(_:)))
        displayLink.add(to: .main, forMode: .common)

        queue.async {
            self.calculatePi()
        }
    }

    /// Calculate pi using the Gregory-Leibniz series.
    ///
    /// I wouldn’t generally hardcode the number of iterations, but this is just what I empirically
    /// verified I could bump it up to without starting to see too many dropped frames on the iPad.
    /// I wanted to max out the iPad mini 2 while not pushing it over the edge, where the numbers
    /// might no longer be comparable.
    func calculatePi() {
        var iterations = 0
        var i = 1.0
        var sign = 1.0
        var value = 0.0
        repeat {
            iterations += 1
            if iterations % 1_200_000 == 0 {
                displayLinkSemaphore.signal()
                DispatchQueue.main.async {
                    self.piLabel.text = "\(value)"
                }
                calculationSemaphore.wait()
            }
            value += 4.0 / (sign * i)
            i += 2
            sign *= -1
        } while true
    }

    @objc func handleDisplayLink(_ displayLink: CADisplayLink) {
        displayLinkSemaphore.wait()
        calculationSemaphore.signal()
        times.insert(displayLink.timestamp, at: 0)
        let count = times.count
        if count > 60 {
            let fps = 60 / (times.first! - times.last!)
            times.removeLast(count - 60)   // trim to the most recent 60 samples
            fpsLabel.text = String(format: "%.1f", fps)
        }
    }
}

Bottom line, given that my experimentation with the above correlates with our expectations, whereas yours doesn’t, I have to wonder whether your calculations are actually doing precisely the same work every 60th of a second, regardless of device, like the above does. Once you have dropped frames, different calculations for different time intervals, etc., all sorts of other variables come into play and make the comparisons invalid.


For what it’s worth, the above includes all of the semaphore and display link logic. When I simplified it to just sum 50 million terms of the series on a single thread as quickly as possible, the iPhone Xs Max did it in 0.12 seconds, whereas the iPad mini 2 took 0.38 seconds. Clearly, with simple calculations and no timers or semaphores, the hardware difference comes into stark relief. Bottom line, I wouldn’t rely on CPU usage figures in the debugger or Instruments to determine what theoretical performance you can achieve.
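If you want to reproduce that simpler benchmark, here is a minimal sketch of what I mean: sum 50 million terms of the Gregory-Leibniz series on a single thread and just time it (the function name is mine):

import Foundation

// A minimal sketch of the single-threaded benchmark: sum 50 million terms of the
// Gregory-Leibniz series as fast as possible and report the elapsed wall-clock time.
func benchmarkPi() {
    let iterations = 50_000_000
    let start = CFAbsoluteTimeGetCurrent()

    var value = 0.0
    var sign = 1.0
    var i = 1.0
    for _ in 0..<iterations {
        value += 4.0 / (sign * i)
        i += 2
        sign *= -1
    }

    let elapsed = CFAbsoluteTimeGetCurrent() - start
    print("π ≈ \(value), computed in \(String(format: "%.2f", elapsed)) seconds")
}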


