WebRTC is cool, and hard. Group audio chat with WebRTC is even harder. Besides learning the jargon, including STUN, TURN, ICE, OPUS, BUNDLE, RTP, RTCP and so on, you need to decide between mesh and SFU, or even MCU.
I built a real-time group video chat app a few years ago, so I was familiar with these concepts. But it still felt frustrating to pick them up again when we recently decided to add E2EE group audio chat to our Mixin Messenger app. Anyway, after we finally mastered those things again and tried to choose an open source SFU solution, we failed.
Because we need E2EE, MCU is not an option: an MCU decodes plain audio frames to merge them into a single frame, and encryption prevents the audio codecs from working. Many experiments have proven that a mesh of more than four peers makes a group chat almost unusable, because every peer needs to upload its stream to all other peers. So an SFU is our only option, as it allows each peer to upload only one stream regardless of the participant count.
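To make the bandwidth argument concrete, here is a quick sketch of the upload stream counts for each topology (plain arithmetic, not Kraken code):

```javascript
// Upload streams required per peer, and in total across a room,
// for mesh vs SFU topologies.
function uploadsPerPeer(topology, n) {
  // mesh: each peer sends its stream to every other peer
  // sfu: each peer sends exactly one stream, to the server
  return topology === "mesh" ? n - 1 : 1;
}

function totalUploads(topology, n) {
  return n * uploadsPerPeer(topology, n);
}

console.log(totalUploads("mesh", 5)); // 20
console.log(totalUploads("sfu", 5));  // 5
```

At five peers a mesh already quadruples everyone's upload bandwidth, which matches the "almost unusable beyond four peers" experience above.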
We used Janus in our previous WebRTC application years ago, and it was our first candidate when we chose an SFU for the new project, because we knew its API and flow, and it has very good performance and stability. But after reading through its docs and issues, we found its development too slow: it still has no UNIFIED-BUNDLE support, which was our goal to make the service more stable and development easier. We also recalled our painful experience tuning Janus RPC calls, which were too complicated, with some unexpected behaviors to hack around. Most importantly, it's written in C/C++, which makes it difficult for us to maintain or improve its codebase.
Then we researched Jitsi and Mediasoup, but their APIs are designed too obscurely and both recommend the use of their client libraries. That would be too much to learn, and we doubted they would perform better even after we mastered their usage. And again, their primary languages would make it difficult for us to maintain the code and ship urgent fixes.
I had never thought about rolling out our own SFU server, because I knew the complications of doing all the WebRTC pieces correctly and combining them together, until I read about ION on Hacker News. I was amazed that there was such an SFU in pure Go, which is our primary language. I played around with it, and soon built a working audio conferencing service, Mornin. We put it in production, and it even made the front page of Product Hunt.
However, after trying to integrate ION with our existing Mixin Messenger API, we still found its API too difficult to handle correctly, and its codebase is really huge. Most of its code relates to video streaming and live broadcast, but we don't need those features. So we tried to remove them from the code to better fit our audio-only usage. We failed at that too.
While refactoring ION, we learned enough about Pion, the pure Go WebRTC library that ION is built on. Why not write an audio-only SFU with Pion? And so Kraken was born: it works flawlessly and we are able to fix things very fast. While creating Kraken, we even made some contributions to the Pion code.
Kraken is by far the simplest and most intuitive WebRTC SFU: it doesn't bind you to any client libraries or strange protocols, just a plain HTTP JSON-RPC API. Four easy steps to create or join an existing group audio chat:
- Create a PeerConnection, offer SDP, then RPC `publish` the SDP to a room.
- In the pc.onicecandidate callback, do RPC `trickle`.
- Do RPC `answer` to get notified in your pc.ontrack callback whenever a new audio stream is available.
- In pc.ontrack, play the audio stream.
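The steps above can be sketched in browser JavaScript. The RPC method names (`publish`, `trickle`) come from the list; the `/rpc` endpoint path, the envelope fields, and the parameter shapes are assumptions for illustration, not Kraken's exact API:

```javascript
// Hypothetical JSON-RPC envelope; Kraken's real field names may differ.
function rpcBody(id, method, params) {
  return JSON.stringify({ id, method, params });
}

async function rpc(method, params) {
  const res = await fetch("/rpc", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: rpcBody(String(Date.now()), method, params),
  });
  return (await res.json()).data;
}

async function joinRoom(roomId, userId) {
  const pc = new RTCPeerConnection();

  // Step 2: trickle each ICE candidate to the server as it is gathered.
  pc.onicecandidate = (ev) => {
    if (ev.candidate) rpc("trickle", [roomId, userId, JSON.stringify(ev.candidate)]);
  };

  // Step 4: play every remote audio stream the SFU forwards to us.
  pc.ontrack = (ev) => {
    const audio = new Audio();
    audio.srcObject = ev.streams[0];
    audio.play();
  };

  // Step 1: capture audio, offer SDP, then publish it to the room.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  stream.getTracks().forEach((t) => pc.addTrack(t, stream));
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  const answer = await rpc("publish", [roomId, userId, JSON.stringify(offer)]);
  await pc.setRemoteDescription(JSON.parse(answer));
  return pc;
}
```

The point is how little there is: one PeerConnection, one publish call, and the standard browser callbacks do the rest.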
Kraken utilizes the most recent WebRTC features, e.g. Unified Plan, which allows all audio streams to share a single UDP port and PeerConnection. Assume you have 100 participants in a room: only 100 UDP ports are needed, instead of 100×100.
Another cool thing is that `subscribe` notifies the client of all available audio streams, so when anyone joins a room with 100 participants, they just need to subscribe once instead of 100 times.
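A minimal sketch of what one subscribe round might look like on the client, assuming `subscribe` returns the server's SDP offer (or nothing when there is no change) and `answer` sends ours back; the parameter shapes are assumptions, and the `rpc` helper is passed in so the snippet stands alone:

```javascript
// Returns true when a subscribe response carries an SDP offer to answer.
function needsRenegotiation(offerSdp) {
  return typeof offerSdp === "string" && offerSdp.length > 0;
}

// One subscribe call per renegotiation, no matter how many peers joined.
async function runSubscribe(pc, roomId, userId, rpc) {
  const offer = await rpc("subscribe", [roomId, userId]);
  if (!needsRenegotiation(offer)) return false;
  await pc.setRemoteDescription(JSON.parse(offer));
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  await rpc("answer", [roomId, userId, JSON.stringify(answer)]);
  return true;
}
```

However many participants are in the room, the client does one setRemoteDescription per renegotiation rather than one subscription per peer.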
Give Kraken a try on our GitHub, where we also have Mornin open sourced.