How to Improve Camera Quality in ARKit

How to improve camera quality in ARKit

Update: Congrats to whoever filed feature requests! In iOS 11.3 (aka "ARKit 1.5"), you can control at least some of the capture settings. And you now get 1080p with autofocus enabled by default.

Check ARWorldTrackingConfiguration.supportedVideoFormats for a list of ARConfiguration.VideoFormat objects, each of which defines a resolution and frame rate. The first in the list is the default (and best) option supported on your current device, so if you just want the best resolution/framerate available you don't have to do anything. (And if you want to step down for performance reasons by setting videoFormat, it's probably better to do that based on array order rather than hardcoding sizes.)
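Here's a minimal sketch (assuming iOS 11.3 or later) of enumerating the formats and stepping down by array position rather than by hardcoded size:

import ARKit

let configuration = ARWorldTrackingConfiguration()

// List every format this device supports; the first one is the default (and best).
for format in ARWorldTrackingConfiguration.supportedVideoFormats {
    print(format.imageResolution, format.framesPerSecond)
}

// To trade quality for performance, pick by position in the array instead of
// hardcoding sizes (assumption here: a later entry is a less demanding format).
if let lowerFormat = ARWorldTrackingConfiguration.supportedVideoFormats.last {
    configuration.videoFormat = lowerFormat
}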

Autofocus is on by default in iOS 11.3, so your example picture (with a subject relatively close to the camera) should come out much better. If for some reason you need to turn it off, there's a switch for that.
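The switch lives on the world-tracking configuration (a small sketch, again assuming iOS 11.3 or later; sceneView stands for your ARSCNView):

// Autofocus is on by default; disable it only if your scene really needs a fixed focus.
let configuration = ARWorldTrackingConfiguration()
configuration.isAutoFocusEnabled = false
sceneView.session.run(configuration)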


There's still no API for changing the camera settings for the underlying capture session used by ARKit.

According to engineers back at WWDC, ARKit uses a limited subset of camera capture capabilities to ensure a high frame rate with minimal impact on CPU and GPU usage. There's some processing overhead to producing higher quality live video, but there's also some processing overhead to the computer vision and motion sensor integration systems that make ARKit work — increase the overhead too much, and you start adding latency. And for a technology that's supposed to show users a "live" augmented view of their world, you don't want the "augmented" part to lag camera motion by multiple frames. (Plus, on top of all that, you probably want some CPU/GPU time left over for your app to render spiffy 3D content on top of the camera view.)

The situation is the same between iPhone and iPad devices, but you notice it more on the iPad just because the screen is so much larger — 720p video doesn't look so bad on a 4-5" screen, but it looks awful stretched to fill a 10-13" screen. (Luckily you get 1080p by default in iOS 11.3, which should look better.)

The AVCapture system does provide for taking higher resolution / higher quality still photos during video capture, but ARKit doesn't expose its internal capture session in any way, so you can't use AVCapturePhotoOutput with it. (Capturing high resolution stills during a session probably remains a good feature request.)
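If all you need is a picture of what ARKit is already seeing, one workaround (a hedged sketch, not a substitute for a true high-resolution still; captureCurrentFrame is just an illustrative helper name) is to convert the current ARFrame's capturedImage pixel buffer into a UIImage:

import ARKit
import CoreImage
import UIKit

// Returns the most recent camera frame at the session's video resolution.
func captureCurrentFrame(from session: ARSession) -> UIImage? {
    guard let frame = session.currentFrame else { return nil }
    let ciImage = CIImage(cvPixelBuffer: frame.capturedImage)
    let context = CIContext()
    guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else { return nil }
    return UIImage(cgImage: cgImage)
}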

iOS ARKit breaks regular camera quality

No workarounds found. However, this is definitely an Apple bug, as it doesn't happen on newer devices. Looking forward to an iPhone 7 update.

ARKit - configure video quality

Nope, and nope.

ARKit owns and entirely controls its underlying video capture session. It's hard to know why, but there are some likely guesses... to ensure that it gets video samples in a format and rate that works well for the computer vision work it does to provide world tracking. And/or to make sure said work is done efficiently enough to leave headroom for your app to do awesome things with SceneKit, Metal, etc. And/or to make world tracking performance/accuracy consistent across all supported hardware.

More capture session flexibility might be a good feature request to send to Apple, though.

What is the real Focal Length of the camera used in RealityKit?

ARKit and RealityKit definitely use identical focal length values, because the two frameworks are designed to work together. And although ARView has no focal length instance property at the moment, you can easily print the focal length of an ARSCNView or SCNView camera in the console.

@IBOutlet var sceneView: ARSCNView!

// SCNCamera's focalLength is expressed in millimeters
print(sceneView.pointOfView?.camera?.focalLength)
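And although ARView has no such property, it does expose its underlying ARSession, so you can still read the camera intrinsics of the current frame (a small sketch; note that fx and fy inside that matrix are focal lengths expressed in pixels, not millimeters):

import RealityKit
import ARKit

let arView = ARView(frame: .zero)

// The 3x3 intrinsics matrix of the current frame's camera
if let camera = arView.session.currentFrame?.camera {
    print(camera.intrinsics)
}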

However, take into account that the ARKit, RealityKit and SceneKit frameworks don't use the screen resolution; they use a viewport size instead. On iPhones the viewport is usually 1/2 or 1/3 of the native resolution in each dimension.


Intrinsic Camera Matrix

[Image: the 3x3 intrinsic matrix layout – fx and fy on the diagonal, ox and oy in the last column]

As you said, in ARKit there's a 3x3 camera matrix that lets you convert between the 2D camera image plane and 3D world coordinate space.

var intrinsics: simd_float3x3 { get }

Using this matrix you can print 4 important parameters: fx, fy, ox and oy. Let's print them all:

DispatchQueue.main.asyncAfter(deadline: .now() + 2.0) {

    print(" Focal Length: \(self.sceneView.pointOfView?.camera?.focalLength)")
    print("Sensor Height: \(self.sceneView.pointOfView?.camera?.sensorHeight)")
    // SENSOR HEIGHT IN mm

    let frame = self.sceneView.session.currentFrame

    // INTRINSICS MATRIX
    print("Intrinsics fx: \(frame?.camera.intrinsics.columns.0.x)")
    print("Intrinsics fy: \(frame?.camera.intrinsics.columns.1.y)")
    print("Intrinsics ox: \(frame?.camera.intrinsics.columns.2.x)")
    print("Intrinsics oy: \(frame?.camera.intrinsics.columns.2.y)")
}

For iPhone X the following values are printed:

[Image: console output – Focal Length: 20.78 (mm), Sensor Height, and the intrinsics fx, fy, ox, oy]

When you apply your formulas you'll get an implausible result (read on to find out why).
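For reference, here's a hedged sketch of the standard pinhole projection that fx, fy, ox and oy encode (the function name and the assumption that p.z is the depth along the optical axis are mine):

import simd

// Maps a point in camera space to 2D pixel coordinates:
//   u = fx * X / Z + ox
//   v = fy * Y / Z + oy
func project(cameraSpacePoint p: SIMD3<Float>, intrinsics K: simd_float3x3) -> SIMD2<Float> {
    let fx = K.columns.0.x
    let fy = K.columns.1.y
    let ox = K.columns.2.x
    let oy = K.columns.2.y
    return SIMD2<Float>(fx * p.x / p.z + ox,
                        fy * p.y / p.z + oy)
}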


About Wide-Angle Lens and OIS

The iPhone X has two rear camera sensors, and both those modules are equipped with an optical image stabilizer (OIS). The wide-angle lens offers a 28-millimeter focal length and an aperture of f/1.8, while the telephoto lens is 56 millimeters and f/2.4.

ARKit and RealityKit use the wide-angle rear camera module. In the iPhone X's case it's a 28 mm lens. But what about the printed focal length of 20.78 mm? I believe the discrepancy between 28 mm and 20.78 mm comes from the fact that video stabilization eats up about 25% of the total image area, which is done so that the final, cropped image ends up with an effective focal length of 28 mm.

[Image: the red frame marks the cropping margin at the stabilization stage]


Conclusion

This is my own conclusion. I didn't find any reference material on the subject, so don't judge me too harshly if my opinion turns out to be wrong (I admit it may be).

We all know that camera shake is magnified as focal length increases, so the lower the focal length, the less camera shake there is. That's very important for jitter-free, high-quality world tracking in an AR app. I also firmly believe that optical image stabilizers work much better at lower focal lengths. Hence, it's no surprise that ARKit engineers chose a lower focal length for the AR experience (capturing a wider image area), and then, after stabilization, we get a modified version of the image, as if it had a focal length of 28 mm.

So, in my humble opinion, it makes no sense to calculate a REAL focal length for RealityKit and ARKit, because a "FAKE" focal length is already baked in by Apple engineers for a robust AR experience.

ARKit – Viewport Size vs Real Screen Resolution

Why is there a difference?

Let's explore some important display characteristics of your iPhone 7:

  • a resolution of 750 (W) x 1,334 (H) pixels (16 : 9)
  • viewport rez of 375 (W) x 667 (H) pixels (16 : 9)

Because mobile devices with the same screen size can have very different resolutions, developers often use viewports when creating 3D scenes or mobile-friendly webpages. In VR and AR, the lower the resolution, the faster the renderer and the considerably lighter the CPU/GPU burden. The idea of viewports is used mainly on mobile devices; on macOS the screen resolution and the viewport resolution are identical.


On iPhone, as on other mobile devices, the viewport is a scaled-down version of the resolution (usually 2 or 3 times smaller in each axis) that allows 3D scene viewports and websites to be displayed more consistently across different devices and (very important!) with less energy consumption. Viewport sizes are more standardized and smaller than native resolutions.
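You can see that 2x/3x relationship directly in UIKit (a quick sketch; the values in the comments are for iPhone 7, as discussed above):

import UIKit

let pointSize = UIScreen.main.bounds.size        // 375 x 667 points  (viewport)
let scale = UIScreen.main.scale                  // 2.0 on iPhone 7
let pixelSize = UIScreen.main.nativeBounds.size  // 750 x 1334 pixels (screen rez)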

Snapshots almost always reflect the real screen resolution:

let screenSize = sceneView.snapshot().size

/* 750 x 1,334 */
/* iPhone 7 rez */

The scene view's size reflects the standardized viewport resolution (4 times smaller by area than the spec'd rez, i.e. half in each axis):

let viewportSize = sceneView.bounds.size 

/* 375 x 667 */
/* ViewPort rez */

Viewport rez to screen rez size ratio (1/4 by area) on iPhone 7 – schematic depiction:

[Image: schematic comparison of viewport size and screen resolution]

Viewport size and its real layout on a mobile device – real depiction:

[Image: the viewport laid out on the actual device screen]

Additional reference: iPhone X has a viewport resolution (375 x 812) nine times smaller by area than its screen resolution (1125 x 2436).


What coordinates are used in Hit-Testing?

Hit-testing and ray-casting use viewport coordinates.

Let's make 3 taps using the hit-testing method – the first tap in the upper left corner (near x=0 and y=0), the second in the center of the screen, and the third in the lower right corner (near x=667 and y=375):

let point: CGPoint = gestureRecognize.location(in: sceneView)

print(point)

[Image: the three tap locations on the screen]

The iPhone 7 viewport coordinates are printed in the console:

[Image: console output of the three tapped points in viewport coordinates]

Quod Erat Demonstrandum!
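As a follow-up (a hedged continuation of the tap snippet above; the plane hit test is just an illustrative choice), the same viewport-space point can go straight into ARSCNView's hit-test API:

let results = sceneView.hitTest(point, types: .existingPlaneUsingExtent)

// Distance from the camera to the nearest detected plane, in meters
if let nearest = results.first {
    print("Hit a detected plane \(nearest.distance) m away")
}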

