ARKit – Raycasting using a world ray instead of a screen point

In Apple's RealityKit and ARKit frameworks you can find three main types of raycast methods: ARView raycast, ARSession raycast and Scene raycast (also called World raycast). All methods are written in Swift:


ARView.raycast(from:allowing:alignment:)

This instance method performs a ray cast, where a ray is cast into the scene from the center of the camera through a point in the view, and the results are returned immediately. The method belongs to RealityKit's ARView, but it returns ARKit's ARRaycastResult objects.

func raycast(from point: CGPoint, 
             allowing target: ARRaycastQuery.Target,
             alignment: ARRaycastQuery.TargetAlignment) -> [ARRaycastResult]
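
For instance, a one-shot ray cast from the center of the view might look like this (a minimal sketch, assuming an existing arView and a detected horizontal surface):

let results: [ARRaycastResult] = arView.raycast(from: arView.center,
                                            allowing: .estimatedPlane,
                                           alignment: .horizontal)

if let firstResult = results.first {
    print(firstResult.worldTransform)    // 4x4 transform of the point hit on the surface
}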


ARView.scene.raycast(origin:direction:length:query:mask:relativeTo:)

WORLD RAYCAST

This instance method performs a convex ray cast against all the geometry in the scene for a ray of a given origin, direction, and length.

func raycast(origin: SIMD3<Float>, 
             direction: SIMD3<Float>,
             length: Float = 100,
             query: CollisionCastQueryType = .all,
             mask: CollisionGroup = .all,
             relativeTo referenceEntity: Entity? = nil) -> [CollisionCastHit]
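
As a quick illustration, a one-off world ray cast along the negative Z axis might look like this (a minimal sketch; arView, the ray values and the 5-meter length are assumptions, and hit entities need collision shapes):

let hits: [CollisionCastHit] = arView.scene.raycast(origin: [0, 1, 0],
                                                 direction: [0, 0, -1],
                                                    length: 5,
                                                     query: .all,
                                                      mask: .all,
                                                relativeTo: nil)

if let firstHit = hits.first {
    print(firstHit.entity.name, firstHit.distance)
}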


ARView.session.trackedRaycast(_:updateHandler:)

This instance method repeats a ray-cast query over time to notify you of updated surfaces in the physical environment. You can use this type of raycast in ARKit 3.0 and later.

func trackedRaycast(_ query: ARRaycastQuery, 
                    updateHandler: @escaping ([ARRaycastResult]) -> Void) -> ARTrackedRaycast?


ARView.trackedRaycast(from:allowing:alignment:updateHandler:)

This RealityKit instance method also performs a tracked ray cast, but here the ray is cast into the scene from the center of the camera through a point in the view.

func trackedRaycast(from point: CGPoint, 
                    allowing target: ARRaycastQuery.Target,
                    alignment: ARRaycastQuery.TargetAlignment,
                    updateHandler: @escaping ([ARRaycastResult]) -> Void) -> ARTrackedRaycast?
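
A tracked ray cast on RealityKit's ARView might look like this (a minimal sketch; arView and model are assumptions, and you should keep the returned ARTrackedRaycast if you want to stop it later):

let tracked = arView.trackedRaycast(from: arView.center,
                                allowing: .estimatedPlane,
                               alignment: .horizontal) { results in
    guard let result = results.first else { return }
    model.setTransformMatrix(result.worldTransform, relativeTo: nil)
}

// Later, when the placement is final:
tracked?.stopTracking()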


Code snippet 01:

import RealityKit

let startPosition: SIMD3<Float> = [3,-2,0]
let endPosition: SIMD3<Float> = [10,7,-5]
let query: CollisionCastQueryType = .all
let mask: CollisionGroup = .all

let raycasts: [CollisionCastHit] = arView.scene.raycast(from: startPosition,
                                                          to: endPosition,
                                                       query: query,
                                                        mask: mask,
                                                  relativeTo: nil)

guard let rayCast: CollisionCastHit = raycasts.first
else { return }

Code snippet 02:

import ARKit
import RealityKit

// `object` is assumed to be an Entity that was placed earlier.
guard let query = arView.makeRaycastQuery(from: screenCenter,
                                      allowing: .estimatedPlane,
                                     alignment: .any)
else { return }

let raycast = arView.session.trackedRaycast(query) { results in

    if let result = results.first {
        object.transform = Transform(matrix: result.worldTransform)
    }
}

raycast?.stopTracking()

What is the real benefit of using Raycast in ARKit and RealityKit?

Simple ray-casting, like hit-testing, helps you locate a 3D point on a real-world surface by projecting an imaginary ray from a screen point onto a detected plane. Apple's documentation (2019) gave the following definition of ray-casting:

Ray-casting is the preferred method for finding positions on surfaces in the real-world environment, but the hit-testing functions remain present for compatibility. With tracked ray-casting, ARKit and RealityKit continue to refine the results to increase the position accuracy of virtual content you place with a ray-cast.

When the user wants to place virtual content onto a detected surface, it's a good idea to give them a visual hint. Many AR apps draw a focus circle or square that gives the user visual confirmation of the shape and alignment of the surfaces that ARKit is aware of. So, to find out where to put a focus circle or square in the real world, you can use an ARRaycastQuery to ask ARKit where any surfaces exist, as in the sketch below.
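
Here's a minimal sketch of that idea: each time it is called, it raycasts from the center of the screen and snaps a focusEntity onto the nearest detected surface. The names arView and focusEntity are assumptions.

func updateFocusEntity() {
    guard let query = arView.makeRaycastQuery(from: arView.center,
                                          allowing: .estimatedPlane,
                                         alignment: .any),
          let result = arView.session.raycast(query).first
    else { return }

    // Move the focus indicator onto the surface found by the ray.
    focusEntity.setTransformMatrix(result.worldTransform, relativeTo: nil)
}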


UIKit implementation

Here's an example that shows how to use the session's raycast(_:) instance method:

import UIKit
import ARKit
import RealityKit

class ViewController: UIViewController {
    
    @IBOutlet var arView: ARView!
    let model = try! Entity.loadModel(named: "usdzModel")

    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        self.raycasting()
    }

    fileprivate func raycasting() {
        
        guard let query = arView.makeRaycastQuery(from: arView.center,
                                              allowing: .estimatedPlane,
                                             alignment: .horizontal)
        else { return }

        guard let result = arView.session.raycast(query).first
        else { return }

        let raycastAnchor = AnchorEntity(world: result.worldTransform)
        raycastAnchor.addChild(model)
        arView.scene.anchors.append(raycastAnchor)
    }
}

If you want to know how to use convex ray-casting in RealityKit, read this post.

If you want to know how to use hit-testing in RealityKit, read this post.


SwiftUI implementation

Here's sample code that shows how to implement raycasting logic in SwiftUI:

import SwiftUI
import ARKit
import RealityKit

struct ContentView: View {
    
    @State private var arView = ARView(frame: .zero)
    var model = try! Entity.loadModel(named: "robot")

    var body: some View {
        ARViewContainer(arView: $arView)
            .onTapGesture(count: 1) { self.raycasting() }
            .ignoresSafeArea()
    }

    fileprivate func raycasting() {
        guard let query = arView.makeRaycastQuery(from: arView.center,
                                              allowing: .estimatedPlane,
                                             alignment: .horizontal)
        else { return }

        guard let result = arView.session.raycast(query).first
        else { return }

        let raycastAnchor = AnchorEntity(world: result.worldTransform)
        raycastAnchor.addChild(model)
        arView.scene.anchors.append(raycastAnchor)
    }
}

and then...

struct ARViewContainer: UIViewRepresentable {
    
    @Binding var arView: ARView

    func makeUIView(context: Context) -> ARView { return arView }
    func updateUIView(_ uiView: ARView, context: Context) { }
}

P.S.

If you're building either of these two app variations from scratch (i.e. not using the Xcode AR template), don't forget to add the Privacy – Camera Usage Description key in the target's Info tab.

Create an ARRaycastQuery from a 3D object

If you're using ARKit, you can use three different ray-casting methods:

  • raycast(_:)

This instance method checks once for intersections between a ray (the ray you create from a screen point you're interested in) and real-world surfaces.

  • raycastQuery(from:allowing:alignment:)

This instance method creates a raycast query that originates from a point on the view, aligned with the center of the camera's field of view.

  • trackedRaycast(_:updateHandler:)

This instance method repeats a raycast query over time to notify you of updated surfaces in the physical environment.

But as I understand it, you need a method that casts a ray from one 3D object so that it can hit another 3D object in the scene. In that case you need RealityKit's instance method that works on ARView's scene:

raycast(from:to:query:mask:relativeTo:)

This method performs a convex ray cast against all the geometry in the scene for a ray between two end points.

func raycast(from startPosition: SIMD3<Float>, 
             to endPosition: SIMD3<Float>,
             query: CollisionCastQueryType = .all,
             mask: CollisionGroup = .all,
             relativeTo referenceEntity: Entity? = nil) -> [CollisionCastHit]

In real code it might look like this:

import RealityKit

let raycasts: [CollisionCastHit] = arView.scene.raycast(from: [2, 2, 2],
                                                          to: [9, 9, 9],
                                                       query: .all,
                                                        mask: .all,
                                                  relativeTo: nil)

guard let rayCast: CollisionCastHit = raycasts.first
else { return }

print(rayCast.entity.name)
print(rayCast.distance)

But remember!

The method ignores entities that lack a CollisionComponent. So, before using the aforementioned method, you need to assign a collision shape to any model you expect the ray to hit.
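
A common way to do that is RealityKit's generateCollisionShapes(recursive:) method, shown in this minimal sketch (the hitModel name is an assumption):

// Give the model a collision shape so scene raycasts can detect it.
// `hitModel` is a hypothetical ModelEntity loaded earlier.
hitModel.generateCollisionShapes(recursive: true)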

How to use Raycast methods in RealityKit?

Simple Ray-Casting

If you want to find out how to position a model made in Reality Composer in a RealityKit scene (on a detected horizontal plane) using the ray-casting method, use the following code:

import RealityKit
import ARKit

class ViewController: UIViewController {
    
    @IBOutlet var arView: ARView!
    let scene = try! Experience.loadScene()

    @IBAction func onTap(_ sender: UITapGestureRecognizer) {
        
        scene.steelBox!.name = "Parcel"

        let tapLocation: CGPoint = sender.location(in: arView)
        let estimatedPlane: ARRaycastQuery.Target = .estimatedPlane
        let alignment: ARRaycastQuery.TargetAlignment = .horizontal

        let result: [ARRaycastResult] = arView.raycast(from: tapLocation,
                                                   allowing: estimatedPlane,
                                                  alignment: alignment)

        guard let rayCast: ARRaycastResult = result.first
        else { return }

        let anchor = AnchorEntity(world: rayCast.worldTransform)
        anchor.addChild(scene)
        arView.scene.anchors.append(anchor)

        print(rayCast)
    }
}

Pay attention to the ARRaycastQuery class. It comes from ARKit, not from RealityKit.

Convex-Ray-Casting

A convex-ray-casting method like raycast(from:to:query:mask:relativeTo:) is the operation of sweeping a convex shape along a straight line and stopping at the very first intersection with any collision shape in the scene. The scene's raycast() method performs a hit-test against all entities with collision shapes in the scene. Entities without a collision shape are ignored.

You can use the following code to perform a convex ray cast from a start position to an end position:

import RealityKit

let startPosition: SIMD3<Float> = [0, 0, 0]
let endPosition: SIMD3<Float> = [5, 5, 5]
let query: CollisionCastQueryType = .all
let mask: CollisionGroup = .all

let raycasts: [CollisionCastHit] = arView.scene.raycast(from: startPosition,
                                                          to: endPosition,
                                                       query: query,
                                                        mask: mask,
                                                  relativeTo: nil)

guard let rayCast: CollisionCastHit = raycasts.first
else { return }

print(rayCast.distance) /* The distance from the ray origin to the hit */
print(rayCast.entity.name) /* The entity's name that was hit */

A CollisionCastHit structure represents a single hit result of a collision cast in a RealityKit scene.

P.S.

When you use the raycast(from:to:query:mask:relativeTo:) method to measure the distance from the camera to an entity, the orientation of the ARCamera doesn't matter; only its position in world coordinates matters.
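
A minimal sketch of such a measurement, assuming an arView and a targetEntity that already has a collision shape (both names are assumptions):

let cameraPosition = arView.cameraTransform.translation
let entityPosition = targetEntity.position(relativeTo: nil)

let hits = arView.scene.raycast(from: cameraPosition,
                                  to: entityPosition,
                               query: .nearest,
                                mask: .all,
                          relativeTo: nil)

// CollisionCastHit.distance is measured from the ray's origin (here, the camera position).
if let hit = hits.first {
    print("Distance to \(hit.entity.name): \(hit.distance) m")
}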

How to recolor all of a model's parts when using raycasting?

Separate-parts-model approach

You can easily retrieve all three models, but you have to specify the whole long hierarchical path:

let scene = try! Experience.loadFanfare()

// Fanfare – .children[0].children[0]
let fanfare = scene.children[0] ..... children[0].children[0] as! ModelEntity
fanfare.model?.materials[0] = UnlitMaterial(color: .darkGray)

// Flag – .children[1].children[0]
let flag = scene.children[0] ..... children[1].children[0] as! ModelEntity
flag.model?.materials[0] = UnlitMaterial(color: .darkGray)

// Star – .children[2].children[0]
let star = scene.children[0] ..... children[2].children[0] as! ModelEntity
star.model?.materials[0] = UnlitMaterial(color: .darkGray)

I don't see much difference when retrieving model entities from .rcproject, .reality or .usdz files. According to the printed hierarchy, all three model entities are located at the same level; they are children of the same parent entity. The condition in the if statement can take its simplest form: if a ray hits the collision shape of fanfare or (||) flag or (||) star, then all three models must be recolored, as in the sketch below.
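
A minimal sketch of that condition, assuming the three ModelEntity references retrieved above and a CollisionCastHit named hit returned by a scene raycast (the hit name is an assumption):

// `hit` is a CollisionCastHit from arView.scene.raycast(...).
if hit.entity === fanfare || hit.entity === flag || hit.entity === star {
    for part in [fanfare, flag, star] {
        part.model?.materials[0] = UnlitMaterial(color: .darkGray)
    }
}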

Mono-model approach

The best solution for interacting with 3D models through raycasting is the mono-model approach. A mono-model is a solid 3D object that does not have separate parts – all parts are combined into a whole model. Textures for mono-models are always mapped in UV editors. The mono-model can be made in 3D authoring apps like Maya or Blender.


P.S.

All seasoned AR developers know that a wow AR experience isn't about code but rather about 3D content. There is no "miracle pill" for an easy solution if your 3D model consists of many parts. A competently made AR model is 75% of the success when working with code.

ARKit – Tap node with raycastQuery instead of hitTest, which is deprecated

About Hit-Testing

The official documentation says that only ARKit's hitTest(_:types:) instance method is deprecated in iOS 14 (SceneKit's hit-testing is not); however, you can still use it in iOS 15. ARKit's hit-testing method is supposed to be replaced with the raycasting methods.

Deprecated hit-testing:

let results: [ARHitTestResult] = sceneView.hitTest(sceneView.center, 
                                            types: .existingPlaneUsingGeometry)


Raycasting equivalent

guard let raycastQuery: ARRaycastQuery = sceneView.raycastQuery(
                                              from: sceneView.center,
                                          allowing: .estimatedPlane,
                                         alignment: .any)
else { return }

let results: [ARRaycastResult] = sceneView.session.raycast(raycastQuery)

If you prefer a raycasting method for hitting a node (entity), use the RealityKit module instead of SceneKit:

let arView = ARView(frame: .zero)

let query: CollisionCastQueryType = .nearest
let mask: CollisionGroup = .default

let raycasts: [CollisionCastHit] = arView.scene.raycast(from: [0, 0, 0],
                                                          to: [5, 6, 7],
                                                       query: query,
                                                        mask: mask,
                                                  relativeTo: nil)

guard let raycast: CollisionCastHit = raycasts.first else { return }

print(raycast.entity.name)

P.S.

There is no need to look for a replacement for SceneKit's hitTest(_:options:) instance method returning [SCNHitTestResult], because it works fine and it is not deprecated.
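
For reference, hitting an SCNNode that way might look like this (a minimal sketch; sceneView and the gesture recognizer are assumptions):

// SceneKit hit-test against the view's node hierarchy (not deprecated).
let tapLocation = gestureRecognizer.location(in: sceneView)
let hitResults: [SCNHitTestResult] = sceneView.hitTest(tapLocation, options: nil)

if let hit = hitResults.first {
    print(hit.node.name ?? "unnamed node")
}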

RealityKit – Keep object always in front of screen

The best solution in this case is to use the trackedRaycast method:

// The origin and direction below are zero placeholders – pass a real world-space ray here.
let query: ARRaycastQuery = .init(origin: SIMD3<Float>(),
                               direction: SIMD3<Float>(),
                                allowing: .estimatedPlane,
                               alignment: .horizontal)

let repeated = arView.session.trackedRaycast(query) { results in

    guard let result: ARRaycastResult = results.first
    else { return }

    let anchor = AnchorEntity(world: result.worldTransform)
    anchor.addChild(model)
    arView.scene.anchors.append(anchor)
}

repeated?.stopTracking()

Also, you can use ARView's ray(through:) instance method to get a world-space ray from a screen point:

@MainActor func ray(through screenPoint: CGPoint) -> (origin: SIMD3<Float>, 
                                                   direction: SIMD3<Float>)?
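
Putting the two together, a tracked raycast driven by a world ray instead of a screen point might look like this (a minimal sketch; arView and model are assumptions):

// Convert a screen point (here, the view's center) into a world-space ray.
guard let ray = arView.ray(through: arView.center) else { return }

let query = ARRaycastQuery(origin: ray.origin,
                        direction: ray.direction,
                         allowing: .estimatedPlane,
                        alignment: .horizontal)

let tracked = arView.session.trackedRaycast(query) { results in
    guard let result = results.first else { return }
    model.setTransformMatrix(result.worldTransform, relativeTo: nil)
}

// Stop refining when the placement is final.
tracked?.stopTracking()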

Positioning an ARKit model

When you do

node.position = SCNVector3Make(0, 0, 0)
self.sceneView.scene.rootNode.addChildNode(node)

you are positioning your model node at the rootNode of the scene, the position of which is determined by ARKit when the scene starts.

If you want the model to appear in front of the camera, you need to grab the camera transform (session.currentFrame?.camera.transform), offset it along the camera's -Z axis, and set that as the node's position, as in the sketch below. Or, if you want to place your model on a surface, do a raycast and use the translation of the raycast result's worldTransform as your node's simdWorldPosition.
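
A minimal sketch of the camera-relative placement, assuming a running session and an existing node (the 0.5 m offset is an arbitrary choice):

if let camera = sceneView.session.currentFrame?.camera {
    var offset = matrix_identity_float4x4
    offset.columns.3.z = -0.5    // half a meter along the camera's -Z axis
    node.simdWorldTransform = camera.transform * offset
}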

There are definitely examples of raycast around, but in brief, you first need to define a query:

let bounds = sceneView.bounds
let screenCenter = CGPoint(x: bounds.midX, y: bounds.midY)

guard let query = sceneView.raycastQuery(from: screenCenter, 
                                     allowing: .estimatedPlane, 
                                    alignment: .horizontal) else { return }

The above defines a raycast extending from the center of the screen and finding only horizontal surfaces. Please explore the docs for more information about allowing and alignment options.

You can then use the session to cast your ray:

session.raycast(query)

This will return an array of ARRaycastResult. You can examine them and choose one if you wish, but I usually just take the .first one (note: the array can be empty).

You would then want to take the translation component of your raycastResult transform and assign it as the simdWorldPosition of your node. Note that you can assign the entire transform to the node's simdWorldTransform, but this will also alter your node's orientation, which you may or may not want.

if let raycastResult = session.raycast(query).first {
    node.simdWorldPosition = raycastResult.worldTransform.translation
}

Oh, also, I find this little extension to be handy:

extension float4x4 {
    /**
     Treats matrix as a (right-hand column-major convention) transform matrix
     and factors out the translation component of the transform.
     */
    var translation: SIMD3<Float> {
        let translation = columns.3
        return SIMD3<Float>(translation.x, translation.y, translation.z)
    }
}

It's what allows you to do the worldTransform.translation in my code above.


