I have noticed that many object detection papers use the same three terms: backbone, neck, and head. I know the neck is for feature extraction, but I don't know the difference between n