WebRTC is a technology that enables real-time peer-to-peer media and data communication between web browsers without external plugins. It is available natively in modern web browsers such as Chrome, Firefox, Safari, Opera and Edge. WebRTC shares most of its components with existing real-time communication technologies such as SIP and H.323. What is really new here is the set of JavaScript APIs that has been introduced to the browsers.

WebRTC Browser API

  • MediaStream API
    The MediaStream API provides access to the camera, microphone or screen using JavaScript.

  • RTCPeerConnection API
    The RTCPeerConnection API takes care of NAT traversal, codec processing, SDP negotiation, media transfer and much more, including securing the connection between peers.

  • RTCDataChannel API
    The RTCDataChannel API allows you to set up a bidirectional data transfer channel between peers.
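
As a rough sketch of how the three APIs fit together in browser code, the snippet below captures the camera and microphone, attaches the tracks to a peer connection and opens a data channel. The function names `supportsWebRTC` and `startLocalPeer` and the channel label "chat" are illustrative assumptions, and the signaling step (exchanging the offer and answer) is deliberately left out.

```typescript
// Feature detection: returns false outside a WebRTC-capable browser.
function supportsWebRTC(): boolean {
  return (
    typeof navigator !== "undefined" &&
    !!navigator.mediaDevices &&
    typeof RTCPeerConnection !== "undefined"
  );
}

// Hypothetical helper: wires local media and a data channel into a new peer connection.
async function startLocalPeer(): Promise<RTCPeerConnection | null> {
  if (!supportsWebRTC()) return null;

  // MediaStream API: ask the user for camera and microphone access.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });

  // RTCPeerConnection API: add every captured track so it can be sent to the remote peer.
  const pc = new RTCPeerConnection();
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));

  // RTCDataChannel API: open a bidirectional channel for arbitrary data.
  const channel = pc.createDataChannel("chat");
  channel.onopen = () => channel.send("hello");

  return pc;
}
```

Nothing is actually transmitted until an offer and answer have been exchanged with the remote peer through a signaling channel.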


Signaling is the process by which peers exchange the information needed to establish a connection. It can be achieved with WebSockets, XMPP, SIP or even a copy-and-paste mechanism.
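
To make the idea concrete, here is a minimal sketch of a signaling exchange. WebRTC does not prescribe any message format, so the `SignalingMessage` shape and the `InMemorySignaling` class are assumptions made for illustration; the class stands in for whatever real transport (WebSockets, XMPP, SIP) carries the messages.

```typescript
// Illustrative message shapes; WebRTC itself does not define a signaling format.
type SignalingMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "candidate"; from: string; to: string; candidate: string };

type SignalingHandler = (msg: SignalingMessage) => void;

// Hypothetical in-memory stand-in for a real signaling server.
class InMemorySignaling {
  private handlers: Record<string, SignalingHandler> = {};

  // Register a peer so it can receive messages addressed to it.
  join(peerId: string, onMessage: SignalingHandler): void {
    this.handlers[peerId] = onMessage;
  }

  // Deliver a message to its recipient; returns false if the peer is unknown.
  send(msg: SignalingMessage): boolean {
    const handler = this.handlers[msg.to];
    if (!handler) return false;
    handler(msg);
    return true;
  }
}

// Usage: Alice sends an offer and Bob replies with an answer.
const signalingHub = new InMemorySignaling();
const aliceInbox: SignalingMessage[] = [];
const bobInbox: SignalingMessage[] = [];

signalingHub.join("alice", (msg) => {
  aliceInbox.push(msg);
});
signalingHub.join("bob", (msg) => {
  bobInbox.push(msg);
  if (msg.kind === "offer") {
    signalingHub.send({ kind: "answer", from: "bob", to: msg.from, sdp: "<answer sdp>" });
  }
});

signalingHub.send({ kind: "offer", from: "alice", to: "bob", sdp: "<offer sdp>" });
```

In a real application the `sdp` strings would come from the peer connection's offer and answer, and the hub would be replaced by a network service reachable by both peers.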

Signaling Process
  • Session Description Protocol
    Also known as SDP, it is a protocol used to describe and negotiate media capabilities between peers before establishing a connection.
  • ICE (Interactive Connectivity Establishment)
    ICE is a framework used for NAT traversal. ICE collects all available candidates (local IP addresses, reflexive addresses obtained via STUN and relayed addresses obtained via TURN). All the collected candidates are then sent to the remote peer via SDP.
  • STUN server
    The STUN server allows clients to find out their public address, the type of NAT they are behind and the Internet side port associated by the NAT with a particular local port.
  • TURN server
    TURN is used to relay media via a TURN server when a direct connection using the addresses discovered through STUN isn't possible.
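
The three kinds of address that ICE gathers show up as the `typ` field of each candidate line in the SDP. The sketch below parses such a line and reports where the address came from. `parseCandidate` and `candidateSource` are hypothetical helper names, the sample line uses documentation-range IP addresses, and the regular expression only covers the leading fields of the candidate attribute, so treat it as an illustration rather than a complete parser.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

interface ParsedCandidate {
  protocol: string;
  address: string;
  port: number;
  type: CandidateType;
}

// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
const CANDIDATE_RE =
  /^(?:a=)?candidate:(\S+) (\d+) (\S+) (\d+) (\S+) (\d+) typ (host|srflx|prflx|relay)/;

// Returns the parsed fields, or null if the line is not a recognizable candidate.
function parseCandidate(line: string): ParsedCandidate | null {
  const m = CANDIDATE_RE.exec(line.trim());
  if (!m) return null;
  return {
    protocol: m[3].toLowerCase(),
    address: m[5],
    port: Number(m[6]),
    type: m[7] as CandidateType,
  };
}

// Maps a candidate type to the mechanism that produced the address.
function candidateSource(type: CandidateType): string {
  if (type === "host") return "local interface address";
  if (type === "srflx") return "public address discovered via a STUN server";
  if (type === "relay") return "relayed address allocated on a TURN server";
  return "peer-reflexive address learned during connectivity checks";
}

// Sample line (illustrative values only).
const sampleCandidate = parseCandidate(
  "candidate:842163049 1 udp 1677729535 203.0.113.5 46154 typ srflx raddr 192.168.1.2 rport 46154"
);
```

Here `sampleCandidate` describes a server-reflexive (STUN) candidate at 203.0.113.5:46154.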

WebRTC Topologies


Mesh topology

In a mesh network, every peer sends its stream directly to each of the other connected peers individually.


SFU topology

SFU stands for Selective Forwarding Unit. An SFU receives the incoming media streams from all users and then decides which streams to send to which users. It works as a video relay and does not mix the video streams together to generate a composite conference video. If there is a video conference with 3 users, user A sends their stream to the SFU (here JVB, the Jitsi Videobridge), and JVB relays that stream to users B and C.
In this case, each user uploads one video stream and downloads two. An advantage of an SFU over the P2P model is that each user has only one upstream, although the number of downstreams is the same as in the P2P model. Compared to an MCU, an advantage of an SFU is that it does not need a lot of computing power, since the video streams are not mixed. One major disadvantage of an SFU is that the server and the participants need a lot of available bandwidth. So remember: if you set up an SFU, make sure you have good bandwidth.


MCU topology

MCU stands for Multipoint Control Unit. An MCU receives the incoming media streams from all users, decodes them all, composes a new layout of everything and sends it out to all users as a single stream.
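
The bandwidth trade-offs between the three topologies can be made concrete by counting streams. The sketch below assumes every participant publishes exactly one stream and that no simulcast or stream selection is in use; `streamsPerParticipant` and `streamsAtServer` are illustrative helper names.

```typescript
type Topology = "mesh" | "sfu" | "mcu";

interface StreamCount {
  upload: number;
  download: number;
}

// Rejects participant counts that do not describe a real call.
function checkParticipants(n: number): void {
  if (n < 2 || Math.floor(n) !== n) {
    throw new RangeError("a call needs a whole number of at least 2 participants");
  }
}

// Streams each participant sends and receives in a call with n participants.
function streamsPerParticipant(topology: Topology, n: number): StreamCount {
  checkParticipants(n);
  if (topology === "mesh") return { upload: n - 1, download: n - 1 }; // one copy per other peer
  if (topology === "sfu") return { upload: 1, download: n - 1 }; // server forwards the others unmixed
  return { upload: 1, download: 1 }; // MCU mixes everything into one stream
}

// Streams the central server receives (upload) and sends (download).
function streamsAtServer(topology: Topology, n: number): StreamCount {
  checkParticipants(n);
  if (topology === "mesh") return { upload: 0, download: 0 }; // no media server involved
  if (topology === "sfu") return { upload: n, download: n * (n - 1) };
  return { upload: n, download: n };
}

// The 3-user conference from the text.
const meshCount = streamsPerParticipant("mesh", 3); // { upload: 2, download: 2 }
const sfuCount = streamsPerParticipant("sfu", 3); // { upload: 1, download: 2 }
const mcuCount = streamsPerParticipant("mcu", 3); // { upload: 1, download: 1 }
```

The server-side count for an SFU grows with n × (n − 1) outgoing streams, which is why the SFU needs plenty of bandwidth, while the MCU trades that bandwidth for the CPU cost of decoding and re-encoding.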

Read our blog on Media Server Comparison to get a better understanding of popular open-source media servers.

