Joining Video Calls from Mozilla Hubs on Windows
I’m going to start with the disclaimer: this process involves a number of wonky settings on your computer and isn’t something that runs smoothly “out of the box” with a single click, nor is it an officially supported feature of either Hubs or Zoom. However, if you’re interested in upping your video meetings by appearing as an avatar in a virtual world, and you like messing around with applications and settings, read on.
I’ve worked on virtual world software for the past 4 years, and I generally feel way more comfortable interacting in a virtual space as an avatar. Despite having a painstakingly (and lovingly) assembled camera setup for my video calls, complete with dedicated desk key light and DSLR “webcam”, I’m increasingly finding that I prefer to be grounded in a virtual space when I’m taking virtual calls. I wrote more about that for the Mozilla Mixed Reality blog, which you can read about here.
The thing is, a lot of people aren’t ready to replace their video calls with virtual world meetings, and that’s A-OK, because now I have a way to join them without being on camera myself.
At a high level, here’s how I’m setting up my computer to stream my Hubs avatar into Zoom:
- I’m using virtual audio devices for both input and output, so that my physical microphone drives my Hubs room audio, and the audio from my Hubs room is used as a virtual microphone input into the video conferencing system
- I’m using OBS Studio to stream a desktop tab into Zoom instead of my webcam
- I have a virtual “green screen” streaming room in Hubs that I use with two tabs, one where my avatar is on screen, and one to record from
Virtual Audio Devices
I initially installed software to create virtual audio devices so I could stream my desktop video into Hubs with audio playing. I’m using VB-Audio Virtual Cable, which adds one virtual audio input and one virtual audio output to my computer. I’ve been told that other applications do this more effectively or easily, but this is the one I installed and I’m stubborn, so it’s the one that I’m sticking with for now.
When I go into a Hubs room to stream into a video application, I set my microphone to use my physical mic input and change my system audio output to be my virtual audio output device. This means that I can’t directly hear what’s happening in the room on that tab (I really should check out those other software applications that allow me to virtually mix sources), but I don’t need to for this first test, so the limited setup I have now is serving my needs.
In Zoom, I then set my microphone input to the virtual audio device, which now carries my desktop audio. Be warned that if you don’t have system and application notifications silenced, those are going to come through your microphone too, so make sure that you’re keeping things quiet as needed. I turn all of this audio off anyway, but you can anticipate it being annoying if you’re not careful. Then, I set my Zoom speaker output to my physical headphones only, so I can hear whatever is happening on the call. If I had a virtual mixer, I could hear audio from both the Hubs room and my Zoom call, but that’s something to tackle in a future post.
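To make the routing easier to reason about, here is a minimal sketch that models the audio paths described above as a small graph and traces what ends up in each destination. Every name in it (`ROUTES`, `sources_reaching`, and the node labels) is a made-up label for illustration, not a real device identifier or an API from Hubs, Zoom, or VB-Audio.

```python
# Hypothetical labels for the audio routing described in this post.
# Each key sends its audio to every node in its list.
ROUTES = {
    "physical_mic": ["hubs_tab"],               # Hubs uses the physical mic
    "hubs_tab": ["system_output"],              # the browser plays to the system output
    "system_notifications": ["system_output"],  # so do notification sounds
    "system_output": ["virtual_cable"],         # system output is the virtual device
    "virtual_cable": ["zoom_mic"],              # Zoom's microphone is the virtual device
    "zoom_call": ["headphones"],                # Zoom's speaker is the physical headphones
}


def sources_reaching(target, routes):
    """Return every node whose audio eventually arrives at `target`."""
    reached = set()
    for start in routes:
        seen = set()
        stack = [start]
        # Depth-first walk of everything downstream of `start`.
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(routes.get(node, []))
        if target in seen and start != target:
            reached.add(start)
    return reached


if __name__ == "__main__":
    print("Heard by the Zoom call:", sorted(sources_reaching("zoom_mic", ROUTES)))
    print("Heard in my headphones:", sorted(sources_reaching("headphones", ROUTES)))
```

Tracing the graph this way shows both caveats at once: notification sounds travel the same path as the Hubs room audio into the call, and nothing from the Hubs tab reaches the headphones without a mixer.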
OBS Studio with VirtualCam
Okay, so audio is one part of the setup. Let’s talk about the video setup. If you have separate monitors, this might be easier for you. (I have an ultrawide monitor, which works great, but after this experiment I’m feeling a lot more likely to invest in a better window management system, let me tell you.)
The way that I’m doing this is to set up two tabs in Hubs, one that acts as the camera and one that acts as “me” in the call. They face each other uncomfortably close and then I use OBS to set up the shot.
OBS allows me to set up a scene that can be streamed out of the application and detected as a virtual webcam feed. You can arrange your scene however you like – I set it up so my avatar appears to be sitting in front of a webcam (head and shoulders up) and then stream out to video applications like Zoom and Hangouts.
This is the part where having separate monitors can be helpful. Let’s be realistic, at this point in my setup, I’ve got any number of Firefox windows and tabs open, so window management is critical. Once you start streaming out as a virtual webcam feed, you essentially have a portable avatar to take around with you to your meetings.
Then, when you’re joining your next video call, jump in and choose your virtual camera instead of your physical one, and you, too, can delight everyone with something entirely unexpected.
A Virtual Green Screen
Once you have the basic setup in place that successfully routes your camera and audio into a tool like Zoom, you can really get into the fun stuff. For the examples above, I started by streaming an entire room into Zoom, but with the virtual background feature of Zoom, it’s possible to do some pseudo-AR type video calls where your avatar seems to be streaming from your own living room, in front of the Golden Gate Bridge, or in front of any arbitrary video.
To do this, I use my regular setup described above, but instead of being in a virtual room and broadcasting out the entire space, I go into a virtual “green screen” studio, where I position my actor avatar in front of a green wall and set up my camera view so that the avatar and green screen are the only things in the shot. From there, in Zoom, it’s just a matter of enabling the virtual background and selecting “I have a green screen”! Then, the world is your oyster. In the image above, I used my webcam to take a picture of my usual background, and superimposed the avatar over it.
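If you’re curious why a flat green wall makes this work, the general idea is chroma keying: any pixel that is “green enough” gets swapped for the matching pixel of the background image. The sketch below is a toy version of that idea in plain Python, not how Zoom actually implements it; the function names and the `dominance` and `min_green` thresholds are assumptions I picked for illustration.

```python
def is_green(pixel, dominance=1.3, min_green=90):
    """Treat a pixel as green screen when green clearly dominates red and blue.

    The threshold values are illustrative guesses, not tuned numbers.
    """
    r, g, b = pixel
    return g >= min_green and g > dominance * r and g > dominance * b


def composite(foreground, background):
    """Replace green-screen pixels in `foreground` with `background` pixels.

    Both images are lists of rows of (r, g, b) tuples with the same size.
    """
    return [
        [bg if is_green(fg) else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


if __name__ == "__main__":
    avatar_frame = [[(0, 255, 0), (200, 50, 50)]]   # green wall, then an avatar pixel
    living_room = [[(120, 100, 80), (90, 90, 90)]]  # stand-in background photo
    print(composite(avatar_frame, living_room))
```

This is also why it helps to frame the shot so that only the avatar and the green wall are visible: anything else in view that isn’t green stays in the picture, and any green on the avatar itself would be keyed out along with the wall.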
You can click the link to the Streaming Room scene in the tweet above to get a copy of the streaming room and remix it, or use it as-is to create a room for yourself! If you give this a try, please share your results – I’d love to see what you come up with!
So, why do all of this?
I wrote about some of the things that make virtual worlds more compelling to me than video calls, and this is an extension of that experimentation. Since more of the world has turned to video conferencing, there’s been an increase in research exploring the cognitive load of looking at faces in a grid on-screen, and applying our existing models of social interactions and norms to a 2D window of faces. For me, having a virtual “place” I can ground myself in makes me feel far more comfortable than being on-camera, and I don’t have to stare at my actual face on the screen. It’s a win-win!
I also struggle with facial affect – both recognizing it and displaying it – so removing the expectation that I have facial expressions at all is actually quite helpful for some types of meetings. You also get a perfectly attentive avatar, privacy for the people in your home, and the ability to emote more intentionally, which I absolutely love.
I’m not planning to replace all of my video calls with this – but it’s a compelling use case for me as more of my meetings become mixed reality hybrids. No longer do I have to choose between meeting in a virtual world or a physical one – I can bring my video into Hubs, or Hubs into my video calls – or be fully in one world or the other.
There are some definite improvements to be made here, which I’ll be looking at in the coming weeks as I continue to explore the viability and desirability of joining video calls from virtual spaces. Specifically, I want to get a new audio mixer setup, so that I can actually hear audio from both the Hubs room and the Zoom call, and figure out the right set of configurations so I can more easily do this from a VR headset. Unfortunately, my setup is limited right now because my desk is temporarily moved to the living room, but I have high hopes for what this entails and the creativity that can stem from it.