I removed the classification part of a Faster R-CNN model and added a new output layer to produce feature embeddings. I trained it with a Siamese network and contrastive