Real-time audio streaming using HTTP - choosing the protocol and Java implementation

Question

I'm trying to implement a simple HTTP server for real-time audio (in Java). Suppose there is a website where you can see a list of songs that play one after another. When a client connects to the server, let's say in the middle of a song, I'm thinking of using the "Range" HTTP header and sending the data range starting from that part of the song. But if the connection is temporarily lost during the download (and the song has finished in the meantime), should the server send the rest of the previous song and finish it, or should it send the part of the song that is playing at that moment? What are the best practices/principles?

PS - I'm not looking for 3rd party software for audio streaming.

EDIT:
After some research into the available real-time streaming technologies, I see these goals:
1. Choosing a protocol for simple real-time audio streaming
2. Implementing the protocol in Java (server side)

Answer 1:

You cannot arbitrarily cut media and expect a player to be able to play it. This works with bare MPEG streams, but other containers and codecs can have trouble. Because of this, don't send partial files unless the client already has the rest of the file.

You also have the problem of what to do when the song ends and you're on to the next.

There are two ways to implement this. One is to make static media files available to your clients and then seek to the correct time in the audio on the client side.
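For that first approach, the server still has to tell each client which song is current and how far into it to seek. Here is a minimal sketch of that arithmetic. It assumes the playlist loops forever and that you know every song's duration up front; the class name PlaylistClock and the method positionAt are made up for illustration.

```java
import java.util.List;

class PlaylistClock {
    /**
     * Returns {songIndex, offsetMillis} for the moment nowMillis.
     * Assumption: the playlist started at startMillis and repeats forever.
     */
    static long[] positionAt(List<Long> durationsMillis, long startMillis, long nowMillis) {
        long total = 0;
        for (long d : durationsMillis) {
            total += d;
        }
        if (total <= 0 || nowMillis < startMillis) {
            throw new IllegalArgumentException("empty playlist or time before start");
        }
        // Position inside the current pass through the playlist.
        long elapsed = (nowMillis - startMillis) % total;
        for (int i = 0; i < durationsMillis.size(); i++) {
            long d = durationsMillis.get(i);
            if (elapsed < d) {
                return new long[] { i, elapsed };
            }
            elapsed -= d;
        }
        throw new IllegalStateException("not reachable for non-negative durations");
    }
}
```

The client would then load the static file for that song index and seek to the returned offset, so a reconnect simply asks the server for the position again.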

The way I would choose is to create a true internet radio stream, where everyone hears the same thing at the same time because you effectively have a common buffer from which chunks are copied and sent to all clients at about the same time. If you do this, you will either need to use codecs/containers that support arbitrary splicing (MP3 or AAC), or re-wrap the streams in their container as they are sent to the client. This is a complicated problem, so it's best to use something off-the-shelf that does this, like Icecast. I know you say you're not looking for third-party solutions, but that's the best way. If you want to do it all yourself, you'll have to reimplement all of it, or support MPEG streams only.
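To make the "common buffer" idea concrete, here is a rough sketch of the fan-out part only: one source thread reads audio and hands each chunk to every connected client. The class name Broadcaster and its methods are hypothetical, and this is not how Icecast does it internally. It also writes to clients one after another, so a single slow client would stall the rest; a real server would give each client its own queue or thread.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.CopyOnWriteArrayList;

class Broadcaster {
    // Copy-on-write list so clients can join or be dropped while we iterate.
    private final CopyOnWriteArrayList<OutputStream> listeners = new CopyOnWriteArrayList<>();

    void addListener(OutputStream clientOut) {
        listeners.add(clientOut);
    }

    int listenerCount() {
        return listeners.size();
    }

    /** Sends the same chunk to every client; a client whose write fails is dropped. */
    void broadcast(byte[] chunk, int length) {
        for (OutputStream out : listeners) {
            try {
                out.write(chunk, 0, length);
                out.flush();
            } catch (IOException e) {
                // Treat any I/O error as a disconnect.
                listeners.remove(out);
            }
        }
    }
}
```

A client that connects mid-song just gets added to the list and starts receiving from the next chunk, which is why the codec has to tolerate arbitrary splicing.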

EDIT: From your comment:

Could you explain more about the data stream format, which is [24,576 bytes of stream] [metablock] [24,576 bytes of stream] [metablock] etc.? How do I separate the blocks, and what are the contents of the metablocks?

If you wish, you can mux SHOUTcast-style metadata into your stream. Not all clients support this. If they do, they will send you the following header in the request:

Icy-MetaData: 1

If you see that header and value, you can optionally include metadata in the stream. The metadata is simply injected after every chunk of stream data. To include metadata, first you need to decide how big your stream chunks are. Too far apart, and the metadata won't align well with the stream. Too close together, and in theory you are wasting bandwidth (but not much, since an unchanging metadata block is only a byte long). I usually stick with 8KB. It's not uncommon to see 16KB and sometimes 32KB. Announce that chunk size (the metadata interval) in the response headers:

Icy-MetaInt: 8192
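In Java, the negotiation could look roughly like the sketch below. It assumes your server has already parsed the request headers into a Map. IcyNegotiation and its method names are invented for this example, and the audio/mpeg content type assumes an MP3 stream.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class IcyNegotiation {
    static final int META_INTERVAL = 8192; // 8KB of audio between metadata blocks

    /** True if the client sent "Icy-MetaData: 1" (header names are case-insensitive). */
    static boolean wantsMetadata(Map<String, String> requestHeaders) {
        for (Map.Entry<String, String> e : requestHeaders.entrySet()) {
            if ("Icy-MetaData".equalsIgnoreCase(e.getKey())
                    && e.getValue() != null
                    && "1".equals(e.getValue().trim())) {
                return true;
            }
        }
        return false;
    }

    /** Response headers; Icy-MetaInt is only announced when metadata will be sent. */
    static Map<String, String> responseHeaders(boolean withMetadata) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "audio/mpeg"); // assumption: MP3 stream
        if (withMetadata) {
            headers.put("Icy-MetaInt", String.valueOf(META_INTERVAL));
        }
        return headers;
    }
}
```

If the client did not ask for metadata, leave the header out and send plain audio with no metadata blocks at all, otherwise the player will treat the length bytes as audio.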

To get things started, send 8192 bytes (8KB) of audio stream data to the client.

Now it's time for a metadata block. Start with a string, like this:

StreamTitle='This is my stream title';StreamUrl='';

You can pass in StreamUrl or even other fields, but only StreamTitle is really used by clients these days. (StreamUrl used to be able to pop up a browser if you capitalized some letters or something; I don't remember for certain what the trigger was. It's no longer used.) Then convert this string to a buffer and pad it with null bytes (0x00) up to the next multiple of 16. That is, if the string version of your metadata block is 51 bytes long, you need it to be 64 bytes long, so you will add 13 bytes of NUL padding.

A quick note on character sets. Many clients support UTF-8 in their metadata. Some don't. Also, if you have to use an apostrophe ' in your metadata, it needs to be escaped. Unfortunately there doesn't seem to be a truly standard way of doing this. Backslash sometimes works. Repeating the character sometimes works. Different players work differently. Experiment with Winamp and see what it likes, as that would be about as "official" as you can get. Everything else is probably just a broken client. (If you wanted to get really crafty, you could determine the client from the User-Agent request header and adjust your escaping accordingly.)

Now that you have the padded metadata, you just need to add one byte to the front of it that indicates its length divided by 16. So if we now have 64 bytes of metadata, we add the byte 0x04 to the front of it, which tells the client that the metadata is 64 bytes long. Since the length has to fit in a single byte, the padded metadata can be at most 255 × 16 = 4080 bytes. In total this gives a 65-byte metadata block that we now send to the client. Send it.
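The padding and length-byte steps can be sketched in a few lines of Java. IcyMetadata.buildBlock is a name I made up, and encoding as UTF-8 is an assumption (see the character set note above), so treat this as an illustration rather than a reference implementation.

```java
import java.nio.charset.StandardCharsets;

class IcyMetadata {
    /** Builds [length byte][metadata text][NUL padding] for one metadata block. */
    static byte[] buildBlock(String metadata) {
        byte[] text = metadata.getBytes(StandardCharsets.UTF_8); // assumption: client accepts UTF-8
        // Round up to the next multiple of 16 (0 stays 0).
        int paddedLength = ((text.length + 15) / 16) * 16;
        if (paddedLength > 255 * 16) {
            throw new IllegalArgumentException("metadata too long for a one-byte length");
        }
        // New arrays are zero-filled, so the padding is already NUL bytes.
        byte[] block = new byte[1 + paddedLength];
        block[0] = (byte) (paddedLength / 16);
        System.arraycopy(text, 0, block, 1, text.length);
        return block;
    }
}
```

Calling buildBlock with the 51-byte example string from above gives a 65-byte array whose first byte is 0x04. Calling it with an empty string gives the single byte 0x00.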

From here, we just enter the loop again, sending another 8KB of stream data before inserting metadata. This time around, though, we don't want to change the metadata, so we just send 0x00 as our metadata block. Again, the first byte indicates the length of the block, and since we're not updating the title, we tell the client that the length is 0. We only send the strings when something changes.
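Putting the loop together, a per-client writer might look something like the sketch below. IcyStreamWriter is a hypothetical class, it only handles StreamTitle, and it does no apostrophe escaping. It counts audio bytes and inserts a metadata block every metaInt bytes: the new title if one is pending, otherwise a single 0x00.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class IcyStreamWriter {
    private final OutputStream out;
    private final int metaInt;     // audio bytes between metadata blocks
    private int untilMeta;         // audio bytes left before the next block
    private String pendingTitle;   // null when the title has not changed

    IcyStreamWriter(OutputStream out, int metaInt) {
        this.out = out;
        this.metaInt = metaInt;
        this.untilMeta = metaInt;
    }

    void setTitle(String title) {
        pendingTitle = title;
    }

    /** Writes audio, splitting it wherever a metadata block is due. */
    void writeAudio(byte[] data, int off, int len) throws IOException {
        while (len > 0) {
            int n = Math.min(len, untilMeta);
            out.write(data, off, n);
            off += n;
            len -= n;
            untilMeta -= n;
            if (untilMeta == 0) {
                writeMetadata();
                untilMeta = metaInt;
            }
        }
    }

    private void writeMetadata() throws IOException {
        if (pendingTitle == null) {
            out.write(0); // nothing changed: zero-length block
            return;
        }
        byte[] text = ("StreamTitle='" + pendingTitle + "';").getBytes(StandardCharsets.UTF_8);
        int padded = ((text.length + 15) / 16) * 16;
        if (padded > 255 * 16) {
            throw new IOException("metadata too long");
        }
        out.write(padded / 16);                     // length byte
        out.write(text);                            // metadata string
        out.write(new byte[padded - text.length]);  // NUL padding
        pendingTitle = null;                        // send each title only once
    }
}
```

You would create one of these per client with metaInt set to the value you announced in Icy-MetaInt, call setTitle when the song changes, and feed it the same audio chunks every other client gets.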

Source: https://stackoverflow.com/questions/31476798/real-time-audio-streaming-using-http-choosing-the-protocol-and-java-implementa

Tags

java

http

audio-streaming

internet-radio