Using Swift, I need to detect and track text (words and phrases) in an augmented reality experience, similar to ARKit's existing face, image, and object detection. For e