After reading this old article measuring the memory consumption of several object types, I was amazed to see how much memory Strings use in Java:
Remember that there are many types of compression. Huffman encoding is a good general-purpose approach, but it is relatively CPU intensive. For a B+Tree implementation I worked on a few years back, we knew that the keys would likely have common leading characters, so we implemented a leading-character compression algorithm for each page in the B+Tree. The code was easy and very fast, and it cut memory usage to 1/3 of what we started with. In our case, the real reason for doing this was to save space on disk and to reduce the time spent on disk-to-RAM transfers (and that reduction made a huge difference in effective disk performance).
The reason that I bring this up is that a custom String implementation wouldn't have helped very much here. We were only able to achieve the gains we did because we worked at the layer of the container that the strings live in.
Trying to optimize a few bytes here and there inside the String object may not be worth it in comparison.
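To make the leading-character idea concrete, here is a minimal sketch of that kind of compression applied to one sorted run of keys, such as the keys on a single B+Tree page. This is not the code from the project mentioned above; the class name `PrefixCompression`, the `Entry` type, and the sample keys are all hypothetical and only there for illustration. Each key is stored as the number of leading characters it shares with the previous key plus the remaining suffix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of leading-character (prefix) compression over sorted keys.
public class PrefixCompression {

    /** One compressed key: shared-prefix length with the previous key, plus the rest. */
    public static final class Entry {
        public final int shared;
        public final String suffix;

        public Entry(int shared, String suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }
    }

    /** Compresses keys that are assumed to be already sorted, as they are within a page. */
    public static List<Entry> compress(List<String> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int max = Math.min(prev.length(), key.length());
            int shared = 0;
            // Count the leading characters this key has in common with the previous one.
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            out.add(new Entry(shared, key.substring(shared)));
            prev = key;
        }
        return out;
    }

    /** Rebuilds the original keys by replaying the entries in order. */
    public static List<String> decompress(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (Entry e : entries) {
            String key = prev.substring(0, e.shared) + e.suffix;
            out.add(key);
            prev = key;
        }
        return out;
    }
}
```

With keys like `customer:1001` and `customer:1002`, the second key is stored as a count of 12 plus the one-character suffix `2`. The saving comes entirely from knowing that neighbouring keys in the container are similar, which is information an individual String object never has.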