I use the following code to limit the download speed of a file in Java:
package org;
import java.io.IOException;
import java.io.InputStream;
import java.net
Your rate limit doesn't work the way you think it does, because the data is not sent byte by byte but in packets. These packets are buffered, and what you observe (the download continuing without a connection) is just your stream reading from that buffer. Once it reaches the end of the buffered data, it waits 5 seconds before the timeout is thrown, because that is the timeout you configured.
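To see this behaviour in isolation, here is a minimal sketch on the loopback interface. It is not your code: the class `BufferedReadDemo` and its method are hypothetical names, and instead of cutting the connection the sender simply goes silent after one burst, which is the closest thing that can be reproduced locally. The reader keeps receiving the already-buffered bytes and only then hits the read timeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of the question's code.
final class BufferedReadDemo {

    /**
     * The sender writes payloadBytes once and then stays silent.
     * Returns how many bytes the reader received before its read timed out.
     */
    static int readUntilTimeout(int payloadBytes, int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {

            // One burst of data; after this the sender never writes again.
            OutputStream out = peer.getOutputStream();
            out.write(new byte[payloadBytes]);
            out.flush();

            client.setSoTimeout(timeoutMillis); // read timeout, like the 5 s in the question
            InputStream in = client.getInputStream();
            byte[] chunk = new byte[1024];
            int total = 0;
            try {
                int n;
                while ((n = in.read(chunk)) >= 0) {
                    total += n; // these reads are served from data that already arrived
                }
            } catch (SocketTimeoutException e) {
                // Buffered data is exhausted and nothing new arrived within the timeout.
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes read before timeout: " + readUntilTimeout(16 * 1024, 500));
    }
}
```

The payload here is kept small (16 kB) on the assumption that it fits into the loopback socket buffers, so the single `write` does not block.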
You set the rate to 8 kB/s. A packet is typically around 1 kB and can go up to 64 kB, so you could spend up to 8 seconds still reading the same packet. It is also possible that several packets had already been sent and buffered. On top of that there is a receive buffer, which can be as small as 8 to 32 kB or as large as several MB. So really you are just reading from the buffer.
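The arithmetic behind those numbers is just buffered bytes divided by the read rate. A small sketch (the class and method names are made up for illustration, and the 1 MB buffer is only an example size):

```java
// Hypothetical helper: how long a throttled reader can keep going on buffered data alone.
final class BufferDrainTime {

    /** Seconds needed to read bufferedBytes at rateBytesPerSecond. */
    static double secondsToDrain(long bufferedBytes, long rateBytesPerSecond) {
        if (rateBytesPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        return (double) bufferedBytes / rateBytesPerSecond;
    }

    public static void main(String[] args) {
        long rate = 8 * 1024; // 8 kB/s, the limit from the question

        // A single 64 kB packet keeps the reader busy for 8 seconds.
        System.out.println(secondsToDrain(64 * 1024, rate));   // prints 8.0

        // A 1 MB receive buffer (example size) lasts about two minutes.
        System.out.println(secondsToDrain(1024 * 1024, rate)); // prints 128.0
    }
}
```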
[EDIT]
Just to clarify, you are doing the right thing. On average, the rate will be limited to what you specify. The server will send a bunch of data, then wait until the client has emptied its buffer enough to receive more data.
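For reference, one common way to get that average-rate behaviour is to wrap the stream and sleep whenever the reader is ahead of schedule. The following is only a sketch of that idea, not a reconstruction of the code in the question; `ThrottledInputStream` and `demo()` are hypothetical names, and the demo reads from an in-memory array instead of a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical wrapper that limits the average read rate of the underlying stream.
final class ThrottledInputStream extends FilterInputStream {
    private final long bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long bytesRead = 0;

    ThrottledInputStream(InputStream in, long bytesPerSecond) {
        super(in);
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) throttle(1);
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) throttle(n);
        return n;
    }

    // Sleep until the elapsed time matches what bytesRead should have taken at the target rate.
    private void throttle(int justRead) throws IOException {
        bytesRead += justRead;
        long expectedNanos = (long) (bytesRead * 1e9 / bytesPerSecond);
        long sleepNanos = expectedNanos - (System.nanoTime() - startNanos);
        if (sleepNanos > 0) {
            try {
                Thread.sleep(sleepNanos / 1_000_000L, (int) (sleepNanos % 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while throttling", e);
            }
        }
    }

    /** Reads 2000 in-memory bytes at 10,000 bytes/s; returns {totalBytes, elapsedMillis}. */
    static long[] demo() {
        long start = System.nanoTime();
        long total = 0;
        try (InputStream in = new ThrottledInputStream(
                new ByteArrayInputStream(new byte[2000]), 10_000)) {
            byte[] chunk = new byte[500];
            int n;
            while ((n = in.read(chunk)) >= 0) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { total, (System.nanoTime() - start) / 1_000_000L };
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println(r[0] + " bytes in about " + r[1] + " ms");
    }
}
```

Because the wrapper only slows down the reader, it never controls how much data is already sitting in the buffers underneath it. That is why the limit holds on average while individual reads can still be served from data that arrived earlier.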