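Looking at the Java 6 source, you can see that HashSet is actually implemented on top of HashMap, with a dummy object as the value of every entry. It is the same shared instance each time, so the per-entry cost is one extra reference, not one extra object.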
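To make the delegation concrete, here is a minimal sketch of the pattern. The class name `DelegatingSet` is made up for illustration and this is not the JDK source; it only mirrors the idea of storing one shared dummy value under every key.

```java
import java.util.HashMap;

// Hypothetical, simplified sketch of a set backed by a HashMap.
// Not the actual JDK implementation.
class DelegatingSet<E> {
    // One shared dummy value referenced by every entry,
    // so no extra object is allocated per element.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<E, Object>();

    // Returns true if the element was not already in the set.
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    // Returns true if the element was in the set.
    public boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    public int size() {
        return map.size();
    }
}
```

Every set operation is a one-line call into the map, which is why this approach needs so little code of its own.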
I am guessing that it has never turned up as a significant problem for real applications or important benchmarks. Why complicate the code for no real benefit?
Also note that object sizes are rounded up in many JVM implementations, so the extra value field may not actually increase the size of each entry (I don't know whether it does in this case). In addition, HashMap is used so widely that its code is likely to be already compiled and in cache, whereas a separate HashSet implementation would add code of its own. Other things being equal, more code => more cache misses => lower performance.
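As a rough illustration of the rounding point, the sketch below computes an entry's size with and without the value field. The numbers are assumptions, not measurements: an 8-byte object header, 4-byte references and ints, and 8-byte alignment, with an entry holding key, value, next and hash. Real JVMs and settings differ, and the class and method names are made up for the example.

```java
// Hypothetical back-of-the-envelope calculation; all sizes are assumptions.
class EntrySizeSketch {
    static final int HEADER_BYTES = 8; // assumed object header size
    static final int FIELD_BYTES = 4;  // assumed size of a reference or an int
    static final int ALIGNMENT = 8;    // assumed object alignment

    // Rounds size up to the next multiple of alignment.
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Aligned size of an object with the given number of 4-byte fields.
    static int objectSize(int fieldCount) {
        return alignUp(HEADER_BYTES + fieldCount * FIELD_BYTES, ALIGNMENT);
    }

    public static void main(String[] args) {
        // key, value, next, hash
        System.out.println("with value field:    " + objectSize(4));
        // key, next, hash
        System.out.println("without value field: " + objectSize(3));
    }
}
```

Under these assumptions both layouts come out at 24 bytes, so dropping the value field would save nothing. With a different header size or reference width the result can change, which is why it has to be checked per JVM.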