Does anybody have any references on the internals of TeamSpeak? I'd like to figure out why doubling (two people transmitting at once) sounds so bad, and fix it if at all possible.
It's pretty obvious from tracing the packets that everybody talks UDP to a central node (ts1.teamspeakhost.com) that repeats whatever it hears.
There just aren't very many ways to do this. One is for the central node to reflect the individual bit stream(s) exactly as received, without decoding. Each listener decodes them and, if more than one person talks at once, sums the decoded PCM. (This is how Eagle's voice conferencing should work.)
The other way has the central node decode the incoming packet stream(s), sum the decoded audio if there's more than one active stream, and re-encode the result into a single forward stream. This has the moderate advantage of not overrunning any slow dialup modems that might be in use.
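To make the two designs concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the function names are made up, and the "codec" is just raw 16-bit little-endian PCM standing in for whatever TeamSpeak really uses.

```python
import struct
from typing import List

Frame = List[int]  # one frame of 16-bit linear PCM samples

def decode(packet: bytes) -> Frame:
    """Stand-in decoder: raw little-endian 16-bit PCM (hypothetical codec)."""
    return list(struct.unpack("<%dh" % (len(packet) // 2), packet))

def encode(frame: Frame) -> bytes:
    """Stand-in encoder; raises struct.error if a sample overflows 16 bits."""
    return struct.pack("<%dh" % len(frame), *frame)

def mix(frames: List[Frame]) -> Frame:
    """Sum decoded frames sample by sample (no clipping protection here)."""
    return [sum(samples) for samples in zip(*frames)]

def reflect(packets: List[bytes]) -> List[bytes]:
    """Design 1, server side: forward every packet untouched."""
    return list(packets)

def listener_mix(packets: List[bytes]) -> Frame:
    """Design 1, listener side: decode each stream, then sum the PCM."""
    return mix([decode(p) for p in packets])

def server_mix(packets: List[bytes]) -> bytes:
    """Design 2, server side: decode, sum, re-encode as one forward stream."""
    return encode(mix([decode(p) for p in packets]))

talker_a = encode([100, -200, 300])
talker_b = encode([1, 2, 3])
heard_1 = listener_mix(reflect([talker_a, talker_b]))
heard_2 = decode(server_mix([talker_a, talker_b]))
```

With this lossless stand-in both paths deliver identical audio. With a real lossy voice codec the second design also runs the audio through an extra encode/decode pass, which costs some quality, but nothing like the distortion being heard here.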
This isn't an FM repeater, so either way there needn't be any significant distortion when two people talk at once. Even with loud talkers and central summing, gain scaling would avoid clipping.
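A minimal sketch of what gain scaling could look like, assuming 16-bit linear PCM frames held as Python lists. The function name and the per-frame scaling policy are my own illustration, not anything known about TeamSpeak.

```python
from typing import List

FULL_SCALE = 32767  # largest magnitude used for 16-bit output here

def mix_scaled(frames: List[List[int]]) -> List[int]:
    """Sum the frames, then scale the sum down only if its peak would clip."""
    summed = [sum(samples) for samples in zip(*frames)]
    peak = max((abs(x) for x in summed), default=0)
    if peak <= FULL_SCALE:
        return summed  # fits as is; leave it alone
    gain = FULL_SCALE / peak
    return [round(x * gain) for x in summed]

loud = [30000, -30000, 100]
mixed = mix_scaled([loud, loud])   # raw sum would peak at 60000
quiet = mix_scaled([[10, 20], [1, 2]])
```

Changing the gain abruptly from frame to frame would itself be audible, so a practical mixer would presumably smooth it over time. The point is only that the sum can always be made to fit.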
The fact that the audio *is* so badly distorted whenever a double occurs, even when one talker isn't saying anything, means that something is broken somewhere. Perhaps the summing is being done incorrectly at the server. Maybe it's trying to sum mu-law PCM, which is nonlinear.
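To show why summing mu-law values directly goes wrong, here is a small sketch using the continuous mu-law companding curve (mu = 255) on samples normalized to the range -1..1. This is not the bit-exact G.711 codec, and whether the server uses mu-law at all is only my guess.

```python
import math

MU = 255.0

def compress(x: float) -> float:
    """Continuous mu-law companding of a linear sample in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y: float) -> float:
    """Inverse of compress(): companded value back to linear."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

a, b = 0.1, 0.1  # two talkers, same linear sample value

# Wrong: add in the companded (logarithmic) domain, then expand.
wrong = expand(compress(a) + compress(b))

# Right: expand each stream to linear PCM first, then add.
right = expand(compress(a)) + expand(compress(b))
```

Adding in the companded domain behaves more like multiplying the signals than adding them, so `wrong` comes out far above the true sum of 0.2, while `right` recovers it.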
The doubling distortion, combined with the fairly long round-trip delay, makes this system a lot more fatiguing and a lot less usable than it could be.
--Phil