The first and only time I used Apple's screen sharing, I was getting dropped a lot. Vine never did this to me. Furthermore, Vine is a more useful VNC solution anyway since it can run (the way I use it) as a system daemon and allows a connection even before you log in. This is what I think makes it more stable.
I would not even try to get these to sync, personally. What's needed is sophisticated programming to delay the screenshots to match the audio buffer, which is affected mostly by the client connection. You will never be able to predict this, methinks, unless the video and audio are sent to the client in the same packets of data (a unified software solution).
As far as bit depth goes, I've always controlled that on the client side, but that's a good point. Having to spit out millions of colors when you really only want to send 256 kinda sucks. I found a PDF that contains the command line info you alluded to. You just schooled me, so thanks.

Too bad that isn't just an option in the preferences of the server. Bummer.