by Irene Viola, Jack Jansen, and Pablo Cesar (CWI)
Extended Reality (XR) telecommunication systems promise to overcome the limitations of current real-time teleconferencing solutions by enabling a stronger sense of immersion and fostering more natural interpersonal interactions. To achieve truly immersive communication, high-fidelity representations of our own bodies and faces are essential. Enter VR2Gather: a customisable end-to-end system that enables multi-party communication with real-time volumetric acquisition.
The future of media communication is immersive. XR technologies can overcome the limitations of current telecommunication systems by offering enhanced realism, a better sense of presence, a higher degree of interactivity, and more natural remote social communication. Current commercial solutions for immersive teleconferencing, however, all employ synthetic avatars to represent their users. Ranging from low-fidelity, cartoonish representations to sophisticated avatars designed to mimic our appearance, these solutions nonetheless lack realism, appear artificial, and can trigger the uncanny valley effect [L1].
Another solution is to directly capture and transmit photorealistic representations of the users in 3D, using volumetric media, similarly to what we already do in traditional video-conferencing. Several solutions are available nowadays to achieve volumetric video capture, using either commercial 3D sensors [L2] or AI algorithms that generate 3D content from traditional video [L3]. However, such media require large amounts of data: for example, an uncompressed 30 frames per second (fps) point cloud video with one million points requires around 5 Gigabits per second (Gbps). To enable remote communication, we therefore need to lower the bandwidth requirements by orders of magnitude. We can do so by using compression, or by adaptively transmitting only the parts of the content that are of interest to each user, tailoring the delivery to each participant’s needs. This calls for a system that can handle the capture, delivery, and rendering of such data in an efficient way. This is what VR2Gather aims to do: to provide an end-to-end system for volumetric real-time communication that can enable social XR experiences for various sectors, such as healthcare, education, entertainment, and cultural heritage [1].
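To see where the 5 Gbps figure comes from, a rough back-of-the-envelope estimate helps; the per-point size below is an illustrative assumption (roughly 20 bytes for position and colour), not a value stated in the article:

# Rough estimate of the raw bandwidth of an uncompressed point cloud stream.
# Assumption (illustrative only): ~20 bytes per point, e.g. three 4-byte
# floats for XYZ plus colour attributes and padding.
points_per_frame = 1_000_000
bytes_per_point = 20
frames_per_second = 30

bits_per_second = points_per_frame * bytes_per_point * 8 * frames_per_second
print(f"{bits_per_second / 1e9:.1f} Gbps")  # prints 4.8 Gbps, i.e. roughly 5 Gbps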
Figure 1: VR2Gather is used to enable a shared celebration for CWI's 75th anniversary in 2021, complete with a (virtual) cake.
The system is open-source and customisable, so that different capturing setups, transport protocols, and rendering applications can be selected. It comprises several modules: a capture module, which produces a stream of timestamped point clouds; a tile module, which splits each point cloud into parts that can be processed separately, enabling parallelisation as well as user and network adaptation; an encode module, which compresses the data for more efficient transmission; a transport module, with three different implementations (direct TCP, socketIO, DASH) for transmission over the internet; a receive module, which collects the transmitted packets; a decode module, which decompresses the packets back into point cloud data; and finally, a render module, which displays the received data.
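As an illustration of how these modules chain together on the sending and receiving sides, here is a minimal conceptual sketch in Python; the module and function names are hypothetical placeholders, not the actual VR2Gather API:

# Conceptual sketch of the per-participant pipeline (illustrative only;
# names are hypothetical and do not correspond to the real VR2Gather code).

def sender_loop(capturer, tiler, encoder, transport):
    while True:
        cloud = capturer.next_point_cloud()      # timestamped point cloud frame
        for tile in tiler.split(cloud):          # split into tiles for parallel processing
            packet = encoder.compress(tile)      # per-tile compression
            transport.send(packet)               # direct TCP, socketIO, or DASH

def receiver_loop(transport, decoder, renderer):
    while True:
        packet = transport.receive()             # incoming compressed tile
        tile = decoder.decompress(packet)        # back into point cloud data
        renderer.display(tile)                   # shown in the shared virtual scene

Because the content is tiled, the system can also prioritise or skip individual tiles per receiver, which is what enables the user and network adaptation mentioned above.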
We have demonstrated the use of VR2Gather in several use cases. The first experience was centred on cultural heritage [L4]. Current museum experiences offer very limited interaction with the artefacts on display, not to mention all the collection pieces that are never shown because of space limitations or other constraints. The VR2Gather platform was used to present visitors of the Netherlands Institute for Sound and Vision museum with a costume from the collection, historically worn for a pop performance. Each user could interact with the costume and wear it while recreating the historical performance, for example by playing instruments and singing on stage [2].
Figure 2: VR2Gather is used to enable remote doctor consultation using consumer-grade phones, over a 5G network.
The second experience was about connecting with others while being apart [L5]. In particular, we showed how the VR2Gather platform could be used to bring people together to share a virtual cake slice. Our setup included a dedicated capturing system for the cake itself in addition to the users, who were able to chat and interact in real time while located in different cities.
The last experience was about remote consultation with doctors [L6]. Meeting remotely in an immersive environment opens new possibilities for healthcare, for example by reducing the time patients spend travelling to the clinic and waiting for their appointment, and by enabling people with mobility impairments to access healthcare advice in real time while waiting for healthcare personnel to be dispatched on site. However, patients might not have access to high-end volumetric cameras or stable connections. We therefore demonstrated the consultation using a consumer-grade phone to acquire a volumetric representation of the patient, which was transmitted over a 5G network.
This work was supported through the “PPS-programmatoeslag TKI” Fund of the Dutch Ministry of Economic Affairs and Climate Policy and CLICKNL, the European Commission H2020 program under grant agreement 762111 (VRTogether) [L7], and the European Commission Horizon Europe program under grant agreement 101070109 (TRANSMIXR) [L8].
Links:
[L1] https://www.wired.com/story/gadget-lab-podcast-630/
[L2] https://www.intelrealsense.com/
[L3] https://volucap.com/
[L4] https://www.dis.cwi.nl/news/2021-12-10-cwi-and-the-netherlands-institute-of-sound-and-vision-gave-a-sneak-peek-into-the-future-of-cultural-heritage/
[L5] https://cacm.acm.org/news/256173-the-outlook-for-virtual-meetups/
[L6] https://www.dis.cwi.nl/news/2020-11-10-dis-group-realises-worlds-first-volumetric-video-conference-over-public-5g-network/
[L7] http://vrtogether.eu/
[L8] https://transmixr.eu/
References:
[1] I. Viola, et al., “VR2Gather: A collaborative social VR system for adaptive multi-party real-time communication,” IEEE MultiMedia, 2023.
[2] I. Reimat, et al., “Mediascape XR: a cultural heritage experience in social VR,” in Proc. of the 30th ACM International Conference on Multimedia, pp. 6955–6957, 2022.
Please contact:
Irene Viola, CWI, The Netherlands