How to Track More Than 4 Images at a Time with ARKit

Can I track more than 4 images at a time with ARKit?

ARKit 5.0

Today's answer is YES, you can. Apple announced at WWDC 2021 that ARKit 5.0 lets developers detect up to 100 images at a time. Let's check this out.

ARKit 4.0

There's no workaround in ARKit 4.0 to simultaneously track more than FOUR images (as ARImageAnchor instances) in a session running ARImageTrackingConfiguration. This limitation applies even though the total number of detectable images in a scene can be up to 100 in ARKit 4.0.

You can read the comments in the ARConfiguration class if you choose the Jump to Definition option.

[Sample image: the comments in the ARConfiguration class]
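To see what this limit looks like in practice, here is a minimal sketch of an ARKit 4.0 world-tracking setup. The "AR Resources" group name and `sceneView` are illustrative assumptions, not part of the original answer:

import ARKit

// Load the reference images registered in the asset catalog (assumed group name)
guard let referenceImages = ARReferenceImage.referenceImages(
    inGroupNamed: "AR Resources", bundle: nil) else {
    fatalError("Missing expected asset catalog resources.")
}

let configuration = ARWorldTrackingConfiguration()
// Up to 100 images may be registered for detection in ARKit 4.0...
configuration.detectionImages = referenceImages
// ...but no more than 4 of them are ever tracked simultaneously.
configuration.maximumNumberOfTrackedImages = 4
sceneView.session.run(configuration)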

I believe this feature was limited by Cupertino software engineers deliberately, not accidentally. The ARImageAnchor subclass inherits from the ARAnchor parent class and conforms to the ARTrackable protocol, so it tracks not only static images but moving images as well (like a logo on a car's body). Hence, tracking more than 4 images is highly CPU/GPU intensive (the most notorious way of draining a phone's battery), because your device must detect and track several different objects at once.
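Because ARImageAnchor conforms to ARTrackable, you can watch its isTracked flag to see when an image is actively being tracked rather than merely detected. A minimal sketch, assuming your class is the session's ARSessionDelegate:

func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
    for anchor in anchors {
        guard let imageAnchor = anchor as? ARImageAnchor else { continue }
        // isTracked comes from ARTrackable; it becomes false when the image
        // leaves the camera's view or can no longer be followed.
        let name = imageAnchor.referenceImage.name ?? "unnamed"
        print("\(name) is currently tracked: \(imageAnchor.isTracked)")
    }
}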

I suppose it will become possible to simultaneously track more than 4 images in a newer ARKit version running on the considerably more powerful 5nm devices, like the iPhone 12, that we'll see this fall.

Thus, Apple's software engineers sacrificed app functionality for the sake of a robust AR experience.

P.S.

It is incorrect to compare ARCore with ARKit, because these frameworks work differently inside, even though they share similar fundamental principles – the World Tracking, Scene Understanding and Rendering stages. In addition, ARCore has more modest functionality than ARKit, which makes ARCore more "lightweight" in terms of CPU load (although I admit that sounds rather subjective).

Only able to detect and track up to 4 images at a time with ARKit 3.0

Found the official answer in a comment on ARImageTrackingConfiguration:

@discussion Image tracking provides 6 degrees of freedom tracking of known images. Four images may be tracked simultaneously.

EDIT: Found in ARConfiguration.h, line 336

ARKit – how many tracking images can it track?

Having a look at the Apple docs, they don't seem to specify a limit. As such, it is reasonable to assume the limit depends on memory management etc.

Regarding creating images on the fly, this is definitely possible.

According to the docs, this can be done in one of two ways:

  1. Creating a new reference image from a Core Graphics image object:

    init(CGImage, orientation: CGImagePropertyOrientation, physicalWidth: CGFloat)
  2. Creating a new reference image from a Core Video pixel buffer:

    init(CVPixelBuffer, orientation: CGImagePropertyOrientation, physicalWidth: CGFloat)

Here is an example of creating an ARReferenceImage on the fly using an image from the standard Assets bundle, although this can easily be adapted to load an image from a URL etc.:

// Create ARReferenceImages From Somewhere Other Than The Default Folder
func loadDynamicImageReferences() {

    //1. Get The Image From The Bundle
    guard let imageFromBundle = UIImage(named: "moonTarget"),
        //2. Convert It To A CIImage
        let imageToCIImage = CIImage(image: imageFromBundle),
        //3. Then Convert The CIImage To A CGImage
        let cgImage = convertCIImageToCGImage(inputImage: imageToCIImage) else { return }

    //4. Create An ARReferenceImage (Remembering Physical Width Is In Metres)
    let arImage = ARReferenceImage(cgImage, orientation: CGImagePropertyOrientation.up, physicalWidth: 0.2)

    //5. Name The Image
    arImage.name = "CGImage Test"

    //6. Set The ARWorldTrackingConfiguration Detection Images (Assuming A Configuration Is Running)
    configuration.detectionImages = [arImage]
}


/// Converts A CIImage To A CGImage
///
/// - Parameter inputImage: CIImage
/// - Returns: CGImage
func convertCIImageToCGImage(inputImage: CIImage) -> CGImage? {

    let context = CIContext(options: nil)
    if let cgImage = context.createCGImage(inputImage, from: inputImage.extent) {
        return cgImage
    }

    return nil
}

We can then test this within the ARSCNViewDelegate, e.g.:

func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {

    //1. If Our Target Image Has Been Detected Then Get The Corresponding Anchor
    guard let currentImageAnchor = anchor as? ARImageAnchor else { return }

    let x = currentImageAnchor.transform
    print(x.columns.3.x, x.columns.3.y, x.columns.3.z)

    //2. Get The Target's Name
    let name = currentImageAnchor.referenceImage.name!

    //3. Get The Target's Width & Height In Meters
    let width = currentImageAnchor.referenceImage.physicalSize.width
    let height = currentImageAnchor.referenceImage.physicalSize.height

    print("""
    Image Name = \(name)
    Image Width = \(width)
    Image Height = \(height)
    """)

    //4. Create A Plane Geometry To Cover The ARImageAnchor
    let planeNode = SCNNode()
    let planeGeometry = SCNPlane(width: width, height: height)
    planeGeometry.firstMaterial?.diffuse.contents = UIColor.white
    planeNode.opacity = 0.25
    planeNode.geometry = planeGeometry

    //5. Rotate The PlaneNode To Horizontal
    planeNode.eulerAngles.x = -.pi/2

    //The Node Is Centered In The Anchor (0,0,0)
    node.addChildNode(planeNode)

    //6. Create An SCNBox
    let boxNode = SCNNode()
    let boxGeometry = SCNBox(width: 0.1, height: 0.1, length: 0.1, chamferRadius: 0)

    //7. Create A Different Colour For Each Face
    let faceColours = [UIColor.red, UIColor.green, UIColor.blue, UIColor.cyan, UIColor.yellow, UIColor.gray]
    var faceMaterials = [SCNMaterial]()

    //8. Apply One To Each Of The Six Faces
    for face in 0 ..< 6 {
        let material = SCNMaterial()
        material.diffuse.contents = faceColours[face]
        faceMaterials.append(material)
    }
    boxGeometry.materials = faceMaterials
    boxNode.geometry = boxGeometry

    //9. Set The Box's Position So It Sits On The Plane (node.y + Half The Box's Height)
    boxNode.position = SCNVector3(0, 0.05, 0)

    //10. Add The Box To The Node
    node.addChildNode(boxNode)
}

As you can see, the process is fairly easy. So in your case, you are probably most interested in the conversion function above, which uses this method to create the dynamic images:

init(CGImage, orientation: CGImagePropertyOrientation, physicalWidth: CGFloat)

Tracking multiple images and playing their videos in ARKit 2

As @SilentK has stated, in order to track multiple ARImageAnchors, you need to set the maximum number of images to track, e.g.:

configuration.maximumNumberOfTrackedImages = 10

This is available as of iOS 12 for both ARWorldTrackingConfiguration and ARImageTrackingConfiguration, as sketched below.
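A minimal sketch of both options (the `referenceImages` set and `sceneView` are illustrative assumptions):

// Option A: image-only tracking
let imageConfiguration = ARImageTrackingConfiguration()
imageConfiguration.trackingImages = referenceImages
imageConfiguration.maximumNumberOfTrackedImages = 10
sceneView.session.run(imageConfiguration)

// Option B: image detection inside a world-tracking session
let worldConfiguration = ARWorldTrackingConfiguration()
worldConfiguration.detectionImages = referenceImages
worldConfiguration.maximumNumberOfTrackedImages = 10
sceneView.session.run(worldConfiguration)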

As you will also know, when you add an ARReferenceImage to your Assets bundle, you have to assign it a name, e.g.:

[Sample image: naming a reference image in the asset catalog]

Now, since each target has a unique name, you can easily use this to load your different videos. So it makes the most sense to name each video the same as its image target.

An example of what you are attempting to do might look like this:

//-------------------------
//MARK: - ARSCNViewDelegate
//-------------------------

extension ViewController: ARSCNViewDelegate {

    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {

        //1. Check We Have Detected An ARImageAnchor
        guard let validAnchor = anchor as? ARImageAnchor else { return }

        //2. Create A Video Player Node For Each Detected Target
        node.addChildNode(createdVideoPlayerNodeFor(validAnchor.referenceImage))
    }


    /// Creates An SCNNode With An AVPlayer Rendered Onto An SCNPlane
    ///
    /// - Parameter target: ARReferenceImage
    /// - Returns: SCNNode
    func createdVideoPlayerNodeFor(_ target: ARReferenceImage) -> SCNNode {

        //1. Create An SCNNode To Hold Our VideoPlayer
        let videoPlayerNode = SCNNode()

        //2. Create An SCNPlane & An AVPlayer
        let videoPlayerGeometry = SCNPlane(width: target.physicalSize.width, height: target.physicalSize.height)
        var videoPlayer = AVPlayer()

        //3. If We Have A Valid Name & A Valid Video URL Then Instantiate The AVPlayer
        if let targetName = target.name,
            let validURL = Bundle.main.url(forResource: targetName, withExtension: "mp4", subdirectory: "/art.scnassets") {
            videoPlayer = AVPlayer(url: validURL)
            videoPlayer.play()
        }

        //4. Assign The AVPlayer & The Geometry To The Video Player
        videoPlayerGeometry.firstMaterial?.diffuse.contents = videoPlayer
        videoPlayerNode.geometry = videoPlayerGeometry

        //5. Rotate It
        videoPlayerNode.eulerAngles.x = -.pi / 2

        return videoPlayerNode
    }
}

Note that infinitely looping playback using this method is quite CPU intensive if you have several instances rendered:

NotificationCenter.default.addObserver(forName: NSNotification.Name.AVPlayerItemDidPlayToEndTime, object: nil, queue: nil) { notification in
    player.seek(to: CMTime.zero)
    player.play()
}

As such, you will probably want to add logic to play only one video at a time, e.g. by performing an SCNHitTest on the video node to trigger playback, as sketched below.
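Here is a minimal sketch of that idea, assuming a UITapGestureRecognizer is attached to `sceneView` (both the recognizer wiring and the view name are assumptions, not part of the original answer):

@objc func handleTap(_ gesture: UITapGestureRecognizer) {

    //1. Hit-Test The Tap Location To See Whether A Video Plane Was Touched
    let location = gesture.location(in: sceneView)
    guard let hit = sceneView.hitTest(location, options: nil).first,
        // The AVPlayer Was Assigned As The Plane's Diffuse Contents Above
        let player = hit.node.geometry?.firstMaterial?.diffuse.contents as? AVPlayer
    else { return }

    //2. Toggle Playback So Only The Tapped Video Plays
    if player.timeControlStatus == .playing {
        player.pause()
    } else {
        player.play()
    }
}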

Hope it helps...

Distinguish between multiple tracked images at the same time?

You can easily do it using instance properties such as referenceImage and name.

// The detected image referenced by the image anchor.
var referenceImage: ARReferenceImage { get }

and:

// A descriptive name for your reference image.
var name: String? { get set }

Here's how they look in code:

func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {

    guard let imageAnchor = anchor as? ARImageAnchor,
        let _ = imageAnchor.referenceImage.name
    else { return }

    anchorsArray.append(imageAnchor)

    if imageAnchor.referenceImage.name == "apple" {
        print("Image with apple is successfully detected...")
    }
}

Can I do ARKit Continuous Image Tracking in a World Tracking Configuration with RealityKit?

Continuous image tracking does work out of the box with RealityKit ARViews in world tracking configurations. A mistake in my original code led me to think otherwise.

Incorrect anchor entity initialization (for what I was trying to accomplish):

currentImageAnchor = AnchorEntity(world: imageAnchor.transform)

Since I wanted to track the ARImageAnchor assigned to the matched reference image, I should have done it like this:

currentImageAnchor = AnchorEntity(anchor: imageAnchor)

The corrected example below places one virtual marker that is fixed to the reference image's initial position, and another that smoothly tracks the reference image in a world tracking configuration:

import ARKit
import RealityKit

class ViewController: UIViewController, ARSessionDelegate {

    @IBOutlet var arView: ARView!

    let ballRadius: Float = 0.02

    override func viewDidLoad() {
        super.viewDidLoad()

        guard let referenceImages = ARReferenceImage.referenceImages(
            inGroupNamed: "AR Resources", bundle: nil) else {
            fatalError("Missing expected asset catalog resources.")
        }

        arView.session.delegate = self
        arView.automaticallyConfigureSession = false
        arView.debugOptions = [.showStatistics]
        arView.renderOptions = [.disableCameraGrain, .disableHDR,
                                .disableMotionBlur, .disableDepthOfField,
                                .disableFaceOcclusions, .disablePersonOcclusion,
                                .disableGroundingShadows, .disableAREnvironmentLighting]

        let configuration = ARWorldTrackingConfiguration()
        configuration.detectionImages = referenceImages
        configuration.maximumNumberOfTrackedImages = 1

        arView.session.run(configuration)
    }

    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        guard let imageAnchor = anchors[0] as? ARImageAnchor else { return }

        if let imageName = imageAnchor.name, imageName == "target_image" {

            // AnchorEntity(world: imageAnchor.transform) results in anchoring
            // virtual content to the real world. Content anchored like this
            // will remain in position even if the reference image moves.
            let originalImageAnchor = AnchorEntity(world: imageAnchor.transform)
            let originalImageMarker = makeBall(radius: ballRadius, color: .systemPink)
            originalImageMarker.position.y = ballRadius + (ballRadius * 2)
            originalImageAnchor.addChild(originalImageMarker)
            arView.scene.addAnchor(originalImageAnchor)

            // AnchorEntity(anchor: imageAnchor) results in anchoring
            // virtual content to the ARImageAnchor that is attached to the
            // reference image. Content anchored like this will appear
            // stuck to the reference image.
            let currentImageAnchor = AnchorEntity(anchor: imageAnchor)
            let currentImageMarker = makeBall(radius: ballRadius, color: .systemTeal)
            currentImageMarker.position.y = ballRadius
            currentImageAnchor.addChild(currentImageMarker)
            arView.scene.addAnchor(currentImageAnchor)
        }
    }

    func makeBall(radius: Float, color: UIColor) -> ModelEntity {
        let ball = ModelEntity(mesh: .generateSphere(radius: radius),
                               materials: [SimpleMaterial(color: color, isMetallic: false)])
        return ball
    }
}

