I recently stumbled on an interesting TCP performance issue while running some tests that compared network performance with loopback performance. In my case the n
1 or 2) I'm not sure why you're bothering to use loopback at all. I personally don't know how closely it mimics a real interface, or how valid the comparison will be. I do know that Microsoft disables Nagle's algorithm for the loopback interface (if you care). Take a look at this link; there's a discussion about this.
3) I would look closely at the first few packets in both cases and see whether you're getting a severe delay in the first five packets. See here
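
If Nagle's algorithm is the difference between the two runs, one way to rule it out is to disable it explicitly on the real-interface sockets as well, so both tests run under the same setting. Here is a minimal sketch in Python, assuming your test harness lets you configure the socket before connecting. The helper names `make_nodelay_socket` and `nagle_disabled` are made up for illustration; `TCP_NODELAY` itself is the standard socket option.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 asks the stack to send small writes immediately
    # instead of coalescing them while waiting for outstanding ACKs.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Return True if TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

Checking `nagle_disabled(sock)` on both the loopback and the network socket before each run would at least confirm that the two tests are being compared under the same option, whatever the platform does by default.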