Cedric Fung


End-to-End Encrypted Audio Conferencing

Jun 23, 2020

Whenever I build something, my first principle is always security and privacy, preferably with end-to-end encryption (E2EE). If E2EE is not applicable, I will give up that product or feature most of the time. E2EE alone is not enough either: the product must be productive and comparable in user experience to one without E2EE.

We are a fully remote team and mainly collaborate asynchronously, but at some point we need a synchronous conferencing service to connect colleagues. Zoom is obviously not an option for us because of its bad privacy practices and reputation. Worse, we failed to find anything better than Zoom, and we could not even find a working E2EE conferencing service.

Once the Mixin Messenger (MM) desktop app worked well enough, we ditched Slack and migrated all our stuff to MM, and by eating our own dog food we have tailored MM into a pretty good tool for secure remote collaboration. When we started the project, we wanted MM to be an E2EE replacement for email that acts as a simple instant messenger delivering messages and files, all secured with the Signal protocol. All the collaboration features we rely on are implemented as bots, which are very flexible and work reliably.

In the roughly two years since the first version of MM was released, we added nothing besides these core features. The only further step was E2EE voice calls, because we need a secure tool for urgent conversations and that feature is difficult or impossible to build with a bot. So we included it in MM, improved its service quality for months, and we now think it is usable enough. It is only a one-to-one call service, because we did not think group calls belong in an email replacement client. We kept looking for a secure group voice call service, but literally nothing both secure and usable exists. We were shocked by this: technology improves so fast that lots of group conferencing apps are published in the app stores, yet none is both usable and E2EE.

So we decided to roll our own E2EE group audio call feature in Mixin Messenger, and it is almost finished. The service is also built on the WebRTC stack, and to keep large group conferences stable we use an SFU (selective forwarding unit) to connect all peers instead of forming a mesh. Don't assume that the SFU can decode the audio streams: we built a custom SFU with Pion and added our own encryption layer on top of WebRTC using the new insertable streams feature! Moreover, the key for this layer is delivered to participants via the Signal protocol.
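To make the idea of an extra encryption layer more concrete, here is a minimal sketch in Go of how each encoded audio frame could be sealed before it reaches the SFU and opened by another participant. It is an illustration under assumptions, not the actual Mixin Messenger code: the choice of AES-GCM for this layer, the nonce layout (a 4-byte sender ID plus an 8-byte counter), and the names `FrameSealer`, `NewFrameSealer` and `OpenFrame` are all hypothetical. The group key is assumed to have already reached each participant through a Signal protocol message.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
)

// newAEAD builds an AES-GCM instance from a 16- or 32-byte key.
func newAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// FrameSealer is a hypothetical helper that encrypts outgoing frames.
// It is not safe for concurrent use.
type FrameSealer struct {
	aead     cipher.AEAD
	senderID uint32 // assumed unique per participant so nonces never collide
	counter  uint64 // incremented for every frame
}

func NewFrameSealer(key []byte, senderID uint32) (*FrameSealer, error) {
	aead, err := newAEAD(key)
	if err != nil {
		return nil, err
	}
	return &FrameSealer{aead: aead, senderID: senderID}, nil
}

// Seal returns nonce || ciphertext || tag for one encoded audio frame.
func (s *FrameSealer) Seal(frame []byte) []byte {
	s.counter++
	nonce := make([]byte, s.aead.NonceSize()) // 12 bytes for standard GCM
	binary.BigEndian.PutUint32(nonce[:4], s.senderID)
	binary.BigEndian.PutUint64(nonce[4:], s.counter)
	return s.aead.Seal(nonce, nonce, frame, nil)
}

// OpenFrame decrypts a payload produced by Seal with the shared group key.
func OpenFrame(key, payload []byte) ([]byte, error) {
	aead, err := newAEAD(key)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(payload) < n+aead.Overhead() {
		return nil, errors.New("payload too short")
	}
	return aead.Open(nil, payload[:n], payload[n:], nil)
}

func main() {
	// In the real system this key would arrive via a Signal protocol message.
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	alice, err := NewFrameSealer(key, 1)
	if err != nil {
		panic(err)
	}
	payload := alice.Seal([]byte("encoded audio frame"))

	// The SFU only forwards payload; a participant holding the key can open it.
	frame, err := OpenFrame(key, payload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("participant decoded: %q\n", frame)

	// A party without the group key (the all-zero key stands in for the SFU).
	_, err = OpenFrame(make([]byte, 32), payload)
	fmt.Println("SFU decryption failed:", err != nil)
}
```

The sketch leaves out replay protection and key rotation when members join or leave, both of which a production design would need to address.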

All messages transferred in Mixin Messenger are E2EE with the Signal protocol; files are encrypted with AES-GCM, and the AES keys are themselves E2EE with the Signal protocol. Now the group audio call is E2EE as well, and its quality is excellent.
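The file scheme described above can be sketched roughly as follows, again in Go and again as an assumption-laden illustration rather than the real implementation. The `EncryptedFile` type and the `EncryptFile` and `DecryptFile` functions are hypothetical names, as are the 32-byte key size and the random-nonce layout. Only the encrypted blob would be uploaded; the key would travel inside a Signal-encrypted message, which is not shown here.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// EncryptedFile separates what may be stored on a server (Blob) from
// what must only travel inside an E2EE message (Key).
type EncryptedFile struct {
	Key  []byte // fresh 32-byte AES key, one per file
	Blob []byte // nonce || ciphertext || tag
}

// EncryptFile encrypts plain with AES-GCM under a newly generated key.
func EncryptFile(plain []byte) (*EncryptedFile, error) {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		return nil, err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return &EncryptedFile{Key: key, Blob: aead.Seal(nonce, nonce, plain, nil)}, nil
}

// DecryptFile reverses EncryptFile and fails if the blob was tampered with.
func DecryptFile(key, blob []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(blob) < n {
		return nil, fmt.Errorf("blob too short: %d bytes", len(blob))
	}
	return aead.Open(nil, blob[:n], blob[n:], nil)
}

func main() {
	file, err := EncryptFile([]byte("quarterly report"))
	if err != nil {
		panic(err)
	}
	// file.Blob goes to storage; file.Key goes into an E2EE message.
	plain, err := DecryptFile(file.Key, file.Blob)
	if err != nil {
		panic(err)
	}
	fmt.Printf("recipient recovered: %q\n", plain)
}
```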

References

  1. Mixin Messenger Source Code
  2. WebRTC Insertable Streams
  3. Pion WebRTC: A pure Go implementation of the WebRTC API

About the Author

Core developer of Mixin Network. Passionate about security and privacy. Strives to write elegant code, simple designs and friendly machines.

25566 @ Mixin Messenger