Video communication has worked its way into our everyday lives, and there’s no escaping it. If you are a product owner looking to add video conferencing to your SaaS application, there are many ways to go about it. In the old days, the choice was easy: should I use Zoom, or should I use Teams? Since then, the market has grown immensely, and various techniques for truly integrating video conferencing have emerged.
In what follows, we’ll focus on embedded technologies and help you get an overview of the available options. Not all of them will suit the needs of your business. In fact, you may be better off simply launching Zoom or Teams from your UI, which is totally fine, but at least you will make that decision consciously.
There are standalone video conferencing tools, and then there are platforms that allow you to embed a video conferencing solution.
We are all familiar with standalone video conferencing tools such as Zoom, Google Meet, and Microsoft Teams (known in the industry as “off the shelf” or OTS tools). While you can embed an OTS tool, it’s not a true embedded solution, often requiring you to perform meeting management via admin screens and hardcode static meeting links into your code. We will not discuss the merits of going down this route.
A true embeddable video conferencing platform will give you access to an API and SDK. In essence, such a platform contains three elements:
Video conferencing is hugely popular, and countless SaaS products could benefit from jumping on the bandwagon and embedding a video conferencing platform. Here are just a few ideas.
Project management tools:
Legal technology platforms:
Real estate management software:
Healthcare management systems:
Learning management systems:
Obviously, this list is not exhaustive. Any vertical or horizontal SaaS application that already includes some element of communication could be modernised with video conferencing features.
There are various approaches you can take when considering an embedded video conferencing solution, but it boils down to two main categories.
You can build the whole thing—or large parts of it—yourself. We call this the custom-built approach. This approach gives you a lot of flexibility, but it comes with increased financial and time costs.
Or you can take advantage of an embedded platform that already has a lot of functionality included. We call this the prebuilt approach. This approach is quicker, but you sacrifice some flexibility.
Let’s dive a little deeper into both.
Custom-built means that you are effectively building your embedded video conferencing solution from scratch. We distinguish between two variants of custom-built:
If you are extremely courageous and can afford to dedicate a few years, you could start by building your own media server. Barring that, you will kick off your in-house build by selecting an existing media server, such as Janus, Jitsi, or Mediasoup, to name a few. These low-level media servers provide you with the basic functionality that is required to send and receive streams.
You have to build all the collaborative logic yourself, though. You’ll need to write frontend code to capture the webcam in the browser, wrap it into a stream, connect to the media server, send the stream to the media server, and then write backend logic that instructs other users where they can find the stream in order to view it.
Then, you have to write room creation and management code, bandwidth management code, scaling code, error detection and correction code, and a whole host of low-level backend logic.
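To give a feel for the backend bookkeeping described above, here is a minimal sketch of room management and of telling participants where to find each other's streams. It keeps everything in memory, and every name in it (`RoomRegistry`, `StreamInfo`, the method names) is hypothetical; none of it is part of Janus, Jitsi, or Mediasoup. A real build would also need signalling over the network, persistence, authentication, and cleanup on disconnect.

```typescript
// Hypothetical in-memory room registry, for illustration only.
type StreamInfo = { participantId: string; streamId: string };

class RoomRegistry {
  // roomId -> (participantId -> published streamId, or null if not publishing)
  private rooms = new Map<string, Map<string, string | null>>();

  createRoom(roomId: string): void {
    if (this.rooms.has(roomId)) {
      throw new Error(`Room ${roomId} already exists`);
    }
    this.rooms.set(roomId, new Map<string, string | null>());
  }

  // Adds a participant and returns the streams they should subscribe to.
  join(roomId: string, participantId: string): StreamInfo[] {
    const room = this.getRoom(roomId);
    if (!room.has(participantId)) {
      room.set(participantId, null);
    }
    return this.streamsFor(roomId, participantId);
  }

  // Records a published stream and returns the participants to notify.
  publish(roomId: string, participantId: string, streamId: string): string[] {
    const room = this.getRoom(roomId);
    if (!room.has(participantId)) {
      throw new Error(`${participantId} has not joined ${roomId}`);
    }
    room.set(participantId, streamId);
    return Array.from(room.keys()).filter((id) => id !== participantId);
  }

  // Removes a participant and deletes the room once it is empty.
  leave(roomId: string, participantId: string): void {
    const room = this.getRoom(roomId);
    room.delete(participantId);
    if (room.size === 0) {
      this.rooms.delete(roomId);
    }
  }

  // Lists every published stream in the room except the viewer's own.
  streamsFor(roomId: string, viewerId: string): StreamInfo[] {
    const result: StreamInfo[] = [];
    this.getRoom(roomId).forEach((streamId, participantId) => {
      if (participantId !== viewerId && streamId !== null) {
        result.push({ participantId, streamId });
      }
    });
    return result;
  }

  private getRoom(roomId: string): Map<string, string | null> {
    const room = this.rooms.get(roomId);
    if (!room) {
      throw new Error(`Room ${roomId} does not exist`);
    }
    return room;
  }
}
```

Even this toy version leaves out bandwidth management, scaling, and error handling, which is where most of the effort tends to go.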
Once you have done that, you can add basic stream-related features (e.g., muting), followed by basic conferencing features (e.g., video tile layouts). Now, you have a basic video conference.
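Video tile layouts are a good example of a feature that sounds trivial and still needs real logic. The sketch below is one possible approach, not a standard algorithm from any particular product: it tries every column count and keeps the grid that gives the largest tiles at a fixed aspect ratio. The function name `computeGrid` and the `GridLayout` type are assumptions made for this example.

```typescript
// Hypothetical layout helper: picks the grid that maximises tile size.
type GridLayout = {
  columns: number;
  rows: number;
  tileWidth: number;
  tileHeight: number;
};

function computeGrid(
  tileCount: number,
  containerWidth: number,
  containerHeight: number,
  aspectRatio: number = 16 / 9
): GridLayout {
  if (tileCount < 1) {
    throw new Error("tileCount must be at least 1");
  }
  let best: GridLayout = { columns: 1, rows: tileCount, tileWidth: 0, tileHeight: 0 };
  for (let columns = 1; columns <= tileCount; columns++) {
    const rows = Math.ceil(tileCount / columns);
    // Size of one grid cell for this column count.
    const cellWidth = containerWidth / columns;
    const cellHeight = containerHeight / rows;
    // Largest tile with the target aspect ratio that fits inside the cell.
    const tileWidth = Math.min(cellWidth, cellHeight * aspectRatio);
    const tileHeight = tileWidth / aspectRatio;
    if (tileWidth > best.tileWidth) {
      best = { columns, rows, tileWidth, tileHeight };
    }
  }
  return best;
}
```

For four participants in a 1600 × 900 container, this picks a 2 × 2 grid with tiles of roughly 800 × 450. A production layout engine would also have to handle screen sharing, pinned speakers, and mobile viewports.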
If you require collaborative features, such as a participant list or chat, you will have to build those yourself, too.
You get the point. You have to build everything yourself.
Rather than starting with a media server, you could step it up a level and choose a Video Platform as a Service (VPaaS) provider (e.g. Agora, Daily, or Twilio) as your starting point. These are still low-level, but they get you past the point where you have to worry about basic video streaming logic.
You start by positioning the video elements in your UI and then writing code for things like muting and other basic stream controls.
Then you can move on to more advanced application logic (the brain of your application), including video layout engine, user role management, scaling logic, etc.
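User role management is one piece of that application logic you would own yourself. As a minimal sketch, assuming three made-up roles and a handful of made-up permissions (none of these names come from Agora, Daily, or Twilio), it could start as a simple lookup table:

```typescript
// Hypothetical roles and permissions, for illustration only.
type Role = "moderator" | "speaker" | "attendee";
type Permission =
  | "publishVideo"
  | "publishAudio"
  | "shareScreen"
  | "muteOthers"
  | "removeParticipant";

const rolePermissions: Record<Role, ReadonlySet<Permission>> = {
  moderator: new Set<Permission>([
    "publishVideo",
    "publishAudio",
    "shareScreen",
    "muteOthers",
    "removeParticipant",
  ]),
  speaker: new Set<Permission>(["publishVideo", "publishAudio", "shareScreen"]),
  attendee: new Set<Permission>(), // view-only in this sketch
};

// Returns true if the given role includes the given permission.
function can(role: Role, permission: Permission): boolean {
  return rolePermissions[role].has(permission);
}

// Mutes a participant only if the actor's role allows muting others.
// Returns whether the mute was applied.
function muteParticipant(
  actorRole: Role,
  targetId: string,
  mutedParticipants: Set<string>
): boolean {
  if (!can(actorRole, "muteOthers")) {
    return false;
  }
  mutedParticipants.add(targetId);
  return true;
}
```

In practice these checks have to be enforced on the server as well as in the UI, otherwise a modified client could bypass them.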
And then, if your use case demands it, you have to build the collaborative features, such as participant list, chat, whiteboard, polling, question & answer, etc.
So custom (with or without a low-level VPaaS) is a lot of work. What you gain is flexibility; you can do anything you want, but you have to build it yourself.
Prebuilt solutions are effectively higher-level VPaaS solutions (e.g. Digital Samba, Whereby, or Daily Prebuilt). Prebuilt means that certain things you would otherwise have to build yourself have already been built for you, which also means that decisions were made for you. There is a wide range of prebuilt solutions, and they are not easy to classify, because the differences lie not only in the features a user can see but also in the architectural and logical decisions made by the vendor.
Some prebuilt solutions offer few features, while others offer many. Some make decisions you can live with, and some make decisions you can’t live with (for example, a video layout that simply does not work for you).
Prebuilt solutions take away some flexibility, but in return, they offer a faster time to market coupled with lower development and maintenance costs.
Maybe you’d like to see that information in a table, so here it is:
| Feature | Custom-built approach | Prebuilt approach |
| --- | --- | --- |
| Flexibility | You can do anything you want since you develop everything yourself. | You are “stuck” with vendor decisions, but those decisions might work for you. |
| Development effort | Depends on your use case, but real-time video is no joke, and you could be developing for years. | All you have to do is write integration code, which is easier if the vendor has good support and documentation. |
| Time to market | Generally slow, since you need to develop everything yourself. | Generally fast, since the vendor has already built significant parts of the application for you. |
| Cost | Generally high, since you have development, infrastructure, and maintenance costs. | Generally low, since vendor pricing models tend to be usage-based. |
| Feature set | You develop everything yourself, so the world is your oyster when it comes to features. | Depends very much on the vendor that you choose. |
| Future proof | You own the roadmap and can roll with the punches. | High dependence on the direction the vendor takes. Some vendors will allow you to influence their roadmap. |
As previously mentioned, the range of prebuilt solutions on the market is vast. Do your research before diving into a custom-built solution. Video is hard, very hard.
The trick with getting prebuilt right:
A prebuilt vendor simply can’t make decisions that 100% of people are going to be 100% happy with, and settings are not as flexible as developing something exactly according to the specifications of your use case.
That said, a prebuilt solution that makes smart decisions and allows those decisions to be massaged with appropriate settings can get you pretty close to the flexibility of a custom-built solution. It all depends on what you are looking for.
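To make "massaging decisions with settings" concrete, here is a small sketch of how an integrator might override a vendor's defaults. The setting names and default values are invented for this example and do not describe any specific vendor's API.

```typescript
// Hypothetical room settings; names and defaults are made up for illustration.
type RoomSettings = {
  layout: "grid" | "speaker";
  chatEnabled: boolean;
  maxParticipants: number;
  brandColor: string;
};

// The "decisions made for you" by the vendor.
const vendorDefaults: RoomSettings = {
  layout: "grid",
  chatEnabled: true,
  maxParticipants: 25,
  brandColor: "#3771e0",
};

// Applies the integrator's overrides on top of the vendor defaults.
// Properties in `overrides` win; everything else keeps its default.
function resolveSettings(overrides: Partial<RoomSettings> = {}): RoomSettings {
  return { ...vendorDefaults, ...overrides };
}

// Usage: switch to a speaker layout without chat, keep the other defaults.
const webinarSettings = resolveSettings({ layout: "speaker", chatEnabled: false });
```

The more of its decisions a vendor exposes this way, the closer you can get to custom-built behaviour without writing the underlying logic.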
If you are new to the game and are leaning towards going custom-built, make sure that this is absolutely the right move for you. How? Take an hour or two out of your day to embed Digital Samba, check out the decisions we have made, and play around with the features we offer. The time you spend on that is just a blip compared to the time you’ll need to develop something custom. And if you still choose custom-built afterwards, you will at least make that decision with full confidence that prebuilt is not the right path for you.
If you have already decided to go prebuilt, consider Digital Samba as your embedded solution. Our company has its roots in standalone video conferencing, and we’ve gone through the painstaking process of creating a custom-built OTS product from the ground up. It’s safe to say that we know custom-built and we know prebuilt.
Backed by 20 years of experience, we have taken all the features you would expect from an OTS video conferencing product and wrapped them into a highly customisable prebuilt platform. Our time-tested perspective in this market allows us to make smart prebuilt decisions that are applicable to almost any use case - probably yours, too. Give us a try!