Is there a way to get bounding boxes from Microsoft's Custom Vision object detection model.pb file?

Question

Is there a way to get the bounding boxes of a particular object detected by a Microsoft Custom Vision model.pb file? I know we can get them through API calls to the Azure Custom Vision service. For example, we can get bounding boxes from an SSD frozen inference graph (.pb file) because it exposes tensors for them. Can we do the same with Custom Vision's model.pb file?

This is the code that I am using to print out the operations of a TensorFlow model, followed by its output.

import tensorflow as tf  # TensorFlow 1.x API

detection_graph = tf.Graph()

with detection_graph.as_default():
    graph_def = tf.GraphDef()
    # Read the frozen graph and import it into detection_graph
    with tf.gfile.GFile('model.pb', 'rb') as fid:
        serialized_graph = fid.read()
        graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(graph_def, name='')

with tf.Session(graph=detection_graph) as sess:
    # Print the name of every output tensor of every operation
    ops = tf.get_default_graph().get_operations()
    for op in ops:
        for output in op.outputs:
            print(output.name)


Placeholder:0
layer1_conv/weights:0
layer1_conv/weights/read:0
layer1_conv/Conv2D:0
layer1_conv/biases:0
layer1_conv/biases/read:0
layer1_conv/BiasAdd:0
layer1_leaky/alpha:0
layer1_leaky/mul:0
layer1_leaky:0
pool1:0
layer2_conv/weights:0
layer2_conv/weights/read:0
layer2_conv/Conv2D:0
layer2_conv/biases:0
layer2_conv/biases/read:0
layer2_conv/BiasAdd:0
layer2_leaky/alpha:0
layer2_leaky/mul:0
layer2_leaky:0
pool2:0
layer3_conv/weights:0
layer3_conv/weights/read:0
layer3_conv/Conv2D:0
layer3_conv/biases:0
layer3_conv/biases/read:0
layer3_conv/BiasAdd:0
layer3_leaky/alpha:0
layer3_leaky/mul:0
layer3_leaky:0
pool3:0
layer4_conv/weights:0
layer4_conv/weights/read:0
layer4_conv/Conv2D:0
layer4_conv/biases:0
layer4_conv/biases/read:0
layer4_conv/BiasAdd:0
layer4_leaky/alpha:0
layer4_leaky/mul:0
layer4_leaky:0
pool4:0
layer5_conv/weights:0
layer5_conv/weights/read:0
layer5_conv/Conv2D:0
layer5_conv/biases:0
layer5_conv/biases/read:0
layer5_conv/BiasAdd:0
layer5_leaky/alpha:0
layer5_leaky/mul:0
layer5_leaky:0
pool5:0
layer6_conv/weights:0
layer6_conv/weights/read:0
layer6_conv/Conv2D:0
layer6_conv/biases:0
layer6_conv/biases/read:0
layer6_conv/BiasAdd:0
layer6_leaky/alpha:0
layer6_leaky/mul:0
layer6_leaky:0
pool6:0
layer7_conv/weights:0
layer7_conv/weights/read:0
layer7_conv/Conv2D:0
layer7_conv/biases:0
layer7_conv/biases/read:0
layer7_conv/BiasAdd:0
layer7_leaky/alpha:0
layer7_leaky/mul:0
layer7_leaky:0
layer8_conv/weights:0
layer8_conv/weights/read:0
layer8_conv/Conv2D:0
layer8_conv/biases:0
layer8_conv/biases/read:0
layer8_conv/BiasAdd:0
layer8_leaky/alpha:0
layer8_leaky/mul:0
layer8_leaky:0
m_outputs0/weights:0
m_outputs0/weights/read:0
m_outputs0/Conv2D:0
m_outputs0/biases:0
m_outputs0/biases/read:0
m_outputs0/BiasAdd:0
model_outputs:0

Placeholder:0 and model_outputs:0 are the input and the output. Placeholder:0 takes a tensor of shape (?, 416, 416, 3), and model_outputs:0 produces a tensor of shape (1, 13, 13, 30). If I am detecting just a single object, how do I get the bounding boxes from the model_outputs:0 tensor?

Where am I going wrong? Any suggestions are welcome.

Answer 1:

You seem to be using Python, so you can export the object detection model from the Custom Vision UI (select the TensorFlow option):

https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/export-model-python

The export gives you a zip file containing:

labels.txt
model.pb
python/object_detection.py
python/predict.py

Put everything in one directory, then run:

python predict.py image.jpg

Hey presto! This prints a list of dictionaries like:

{'boundingBox': {'width': 0.92610852, 'top': -0.06989955, 'height': 0.85869097, 'left': 0.03279033}, 'tagId': 3, 'tagName': 'myTagName', 'probability': 0.24879535}

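The coordinates are measured from the top-left corner and normalized to the width and height of the image, so a box that extends past the image edge can have values slightly outside [0, 1] (note the negative 'top' above).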
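To draw a box on the image, the normalized values have to be scaled back to pixels. Here is a minimal sketch, assuming the dictionary layout printed above. `to_pixel_box` is a hypothetical helper, not part of the exported scripts, and clamping to the image bounds is my own choice for handling values that fall outside the image. The 416x416 image size is also just an assumption taken from the model input shape in the question.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a normalized boundingBox dict to integer pixel corners.

    Hypothetical helper, not part of the exported Custom Vision scripts.
    Returns (left, top, right, bottom), clamped to the image bounds.
    """
    left = bounding_box['left'] * image_width
    top = bounding_box['top'] * image_height
    right = left + bounding_box['width'] * image_width
    bottom = top + bounding_box['height'] * image_height

    def clamp(value, upper):
        # Round to the nearest pixel and keep the result inside [0, upper]
        return int(min(max(round(value), 0), upper))

    return (clamp(left, image_width), clamp(top, image_height),
            clamp(right, image_width), clamp(bottom, image_height))


# The prediction printed above, copied verbatim
prediction = {'boundingBox': {'width': 0.92610852, 'top': -0.06989955,
                              'height': 0.85869097, 'left': 0.03279033},
              'tagId': 3, 'tagName': 'myTagName', 'probability': 0.24879535}

# Assumed image size of 416x416 pixels
print(to_pixel_box(prediction['boundingBox'], 416, 416))
```

The returned tuple is in the (left, top, right, bottom) order that most drawing APIs expect for a rectangle.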

Here is main (not my code!):

def main(image_filename):
    # Load a TensorFlow model
    graph_def = tf.GraphDef()
    with tf.gfile.FastGFile(MODEL_FILENAME, 'rb') as f:
        graph_def.ParseFromString(f.read())

    # Load labels
    with open(LABELS_FILENAME, 'r') as f:
        labels = [l.strip() for l in f.readlines()]

    od_model = TFObjectDetection(graph_def, labels)

    image = Image.open(image_filename)
    predictions = od_model.predict_image(image)
    print(predictions)

which you can modify as you see fit. Good luck!
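If you are curious what has to happen to the raw model_outputs:0 tensor, here is a rough sketch of the idea, not the exported code. It assumes a YOLOv2-style layout, where the 30 channels would be 5 anchors x (4 box values + 1 objectness + 1 class score) for a single tag. The anchor sizes, the channel order and all helper names below are assumptions for illustration only; the real values and logic live in the exported object_detection.py.

```python
import math

# Hypothetical anchor sizes (width, height) in grid-cell units. The real
# values are defined in the exported object_detection.py.
ANCHORS = [(0.5, 0.5), (1.0, 2.0), (2.0, 1.0), (4.0, 4.0), (8.0, 8.0)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def decode_grid(grid, anchors, threshold=0.5):
    """Decode a YOLOv2-style grid into normalized boxes.

    grid is a nested list shaped [rows][cols][len(anchors) * (5 + classes)].
    Assumed channel order per anchor: x, y, w, h, objectness, class scores.
    """
    rows, cols = len(grid), len(grid[0])
    per_anchor = len(grid[0][0]) // len(anchors)
    boxes = []
    for row in range(rows):
        for col in range(cols):
            cell = grid[row][col]
            for index, (anchor_w, anchor_h) in enumerate(anchors):
                base = index * per_anchor
                x, y, w, h, obj = cell[base:base + 5]
                class_scores = softmax(cell[base + 5:base + per_anchor])
                best = max(range(len(class_scores)),
                           key=class_scores.__getitem__)
                probability = sigmoid(obj) * class_scores[best]
                if probability < threshold:
                    continue
                # Box centre is an offset inside the cell; size scales the anchor
                center_x = (col + sigmoid(x)) / cols
                center_y = (row + sigmoid(y)) / rows
                width = math.exp(w) * anchor_w / cols
                height = math.exp(h) * anchor_h / rows
                boxes.append({'left': center_x - width / 2,
                              'top': center_y - height / 2,
                              'width': width, 'height': height,
                              'classIndex': best,
                              'probability': probability})
    return boxes


# Synthetic 13x13x30 grid: objectness is very low everywhere except one anchor
grid = [[[0.0, 0.0, 0.0, 0.0, -10.0, 0.0] * len(ANCHORS) for _ in range(13)]
        for _ in range(13)]
grid[6][6][4] = 10.0  # anchor 0 in the centre cell fires
boxes = decode_grid(grid, ANCHORS)
print(boxes)
```

A real pipeline would also run non-maximum suppression over the returned boxes to drop overlapping duplicates, which is one more reason to start from the exported script rather than from this sketch.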

Source: https://stackoverflow.com/questions/54048657/is-there-a-way-to-get-bounding-boxes-from-the-microsofts-custom-vision-object-d

Tags

tensorflow

deep-learning

object-detection

bounding-box

microsoft-custom-vision