Seeing the picture of the edges/non edges pixel in the classifier, we can see that the gradient image of your input already basically gives the result of the classifier you have learnt. But the confidence map shows a good solution except that:
1. they are connected levelsets, with varying sizes.
2. you have noisy bright spots in the cells that cause false outputs from the classifier. (maybe some smoothing could be considered)
3. I guess it would be probably easier to characterize the internal of each cell: the grayscale variations, the average size. Learning these distributions would probably get you better detection results. Topologically we have a set of low grayscale values nested in large grayscale values.
To perform this one could use Graphcuts with GMM model for the unitary costs and a learnt gradient distribution for the pairwise term