Short version:
When using Dataset map operations, is it possible to specify that any 'rows' where the map invocation results in an error are quietly filtered out, rather than having the error bubble up and kill the whole session?
Specifics:
I have an input pipeline set up that (more or less) does the following:
- reads a set of file paths of images stored locally (images of varying dimensions)
- reads a suggested set of 'bounding boxes' from a csv
- produces the set of all image path to bounding box combinations
- reads and decodes the image, then produces the set of 'cropped' images for each of these combinations using tf.image.crop_to_bounding_box
My issue is that there are (very rare) instances where my suggested bounding boxes fall outside the bounds of a given image, so (understandably) tf.image.crop_to_bounding_box throws an error something like this:
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [width must be >= target + offset.]
which kills the session.
I'd prefer it if these errors were simply ignored and the pipeline moved on to the next combination.
(I understand that the correct fix for this specific issue would be to commit the time to checking, in the step before, that each bounding box fits within its image's dimensions, and to drop the ones that don't with a filter operation before they reach the map with the cropping operation. I was wondering if there was an easy way to just ignore an error and move on to the next case, both for ease of implementation in this specific case and also in more general cases.)
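The up-front check described above can be sketched in plain Python. This is only a minimal sketch: `box_fits_image` and its parameter names are hypothetical, and the conditions are assumed from the assertion message quoted above. Inside a real pipeline the same comparisons would have to be written with tensor ops in the predicate passed to the filter operation.

```python
def box_fits_image(image_height, image_width,
                   offset_height, offset_width,
                   target_height, target_width):
    """Return True if the crop box lies entirely inside the image.

    Hypothetical helper: it mirrors the kind of bounds that the
    'width must be >= target + offset' assertion enforces.
    """
    return (
        offset_height >= 0
        and offset_width >= 0
        and target_height > 0
        and target_width > 0
        and image_height >= offset_height + target_height
        and image_width >= offset_width + target_width
    )


# Keep only the (offset_h, offset_w, target_h, target_w) boxes
# that fit inside a 100x200 (height x width) image.
boxes = [(0, 0, 50, 50), (80, 0, 50, 50), (0, 180, 10, 30)]
valid_boxes = [b for b in boxes if box_fits_image(100, 200, *b)]
```

Here the second box runs past the bottom edge and the third runs past the right edge, so only the first one would be handed on to the cropping step.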
There is tf.contrib.data.ignore_errors. I've never tried this myself, but according to the docs the usage is simply:
dataset = dataset.map(some_map_function)
dataset = dataset.apply(tf.contrib.data.ignore_errors())
It should simply pass the elements through (i.e. it returns the same dataset) but silently drop any element whose computation throws an error.
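To make that behaviour concrete, here is a plain-Python model of "map, then drop whatever raised". It is an analogy only, not how TensorFlow implements the transformation: `map_ignoring_errors` and `crop_row` are hypothetical names, and the rows are toy data standing in for images and bounding boxes.

```python
def map_ignoring_errors(fn, items, errors=(Exception,)):
    """Yield fn(item) for each item, skipping items for which fn raises."""
    for item in items:
        try:
            yield fn(item)
        except errors:
            # Drop this element and carry on with the next one.
            continue


def crop_row(args):
    """Toy stand-in for the cropping map: take `width` values from `offset`."""
    row, offset, width = args
    if len(row) < offset + width:
        raise ValueError("width must be >= target + offset.")
    return row[offset:offset + width]


# The second entry asks for a crop that runs past the end of its row.
rows = [([1, 2, 3, 4], 1, 2), ([1, 2], 1, 5), ([5, 6, 7], 0, 3)]
cropped = list(map_ignoring_errors(crop_row, rows))
```

The out-of-bounds entry is skipped and the other two come through unchanged in order. Note that tf.contrib was removed in TensorFlow 2.x; in later releases the same transformation is available as tf.data.experimental.ignore_errors().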
Source: https://stackoverflow.com/questions/51966969/tensorflow-dataset-map-is-it-possible-to-ignore-errors