I've rolled my own data generator that produces random data conforming to regular expressions. The basic idea is to use the validation rules twice: first to generate valid random data, and then to validate new input in production.
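To make the "use the rule twice" idea concrete, here is a minimal sketch in Python. It is not the code of my utility; the names `generate`, `MAX_REPEAT` and `RULE` are made up for illustration, and it only understands a small regex subset (literals, `\d`, escaped literals, simple character classes like `[a-z0-9]`, and the quantifiers `?`, `*`, `+`, `{m}`, `{m,n}`, `{m,}`). Groups, alternation, anchors and negated classes are not handled.

```python
import random
import re
import string

MAX_REPEAT = 5  # hypothetical cap for the unbounded quantifiers * and +


def _parse_class(pattern, i):
    """Parse a character class body starting just after '['.

    Returns (list of allowed characters, index just after ']').
    """
    chars = []
    while pattern[i] != "]":
        is_range = (
            i + 2 < len(pattern)
            and pattern[i + 1] == "-"
            and pattern[i + 2] != "]"
        )
        if is_range:
            lo, hi = ord(pattern[i]), ord(pattern[i + 2])
            chars.extend(chr(c) for c in range(lo, hi + 1))
            i += 3
        else:
            chars.append(pattern[i])
            i += 1
    return chars, i + 1


def _parse_quantifier(pattern, i):
    """Return (min, max, next index) for an optional quantifier at index i."""
    if i >= len(pattern):
        return 1, 1, i
    c = pattern[i]
    if c == "?":
        return 0, 1, i + 1
    if c == "*":
        return 0, MAX_REPEAT, i + 1
    if c == "+":
        return 1, MAX_REPEAT, i + 1
    if c == "{":
        end = pattern.index("}", i)
        parts = pattern[i + 1:end].split(",")
        lo = int(parts[0])
        # "{m}" repeats exactly m times; "{m,}" is capped at m + MAX_REPEAT.
        hi = int(parts[-1]) if parts[-1] else lo + MAX_REPEAT
        return lo, hi, end + 1
    return 1, 1, i


def generate(pattern, rng=random):
    """Generate a random string matching `pattern` (supported subset only)."""
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "[":
            choices, i = _parse_class(pattern, i + 1)
        elif c == "\\":
            nxt = pattern[i + 1]
            # Only \d is treated specially; anything else is an escaped literal.
            choices = string.digits if nxt == "d" else [nxt]
            i += 2
        else:
            choices = [c]
            i += 1
        lo, hi, i = _parse_quantifier(pattern, i)
        out.extend(rng.choice(choices) for _ in range(rng.randint(lo, hi)))
    return "".join(out)


# The same rule is used twice: once to generate, once to validate.
RULE = r"[A-Z]{2}-\d{4}"
sample = generate(RULE)
assert re.fullmatch(RULE, sample) is not None
```

In production you would keep only the `re.fullmatch(RULE, user_input)` half; the generator half is for producing test data from the same rule, so the two can't drift apart.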
I've started a rewrite of the utility, as it seems like a nice learning project. It's available on Google Code.