
Studio Marijn Cinjee | Spatial Sound Technology Research

​

INTRODUCTION

​

Studio Marijn Cinjee was tasked with the design and development of a new, innovative spatial audio environment.

Together with Tilburg University's DAF Technology Lab and Genelec, we developed a comprehensive new hardware and software framework for delivering ultra-realistic spatial audio in XR.

​

In a nutshell

Imagine being able to physically move this installation of 44 loudspeakers through a virtual world in real time, with each speaker producing its own coherent perspective on that world, using spatial physical-modelling simulation to accurately project real-world acoustics to every listening position. The system truly transports the audience to a new environment: it recreates the world in volumetric sonic holography, aligning visual and auditory perception into one cohesive whole.

​

How it started

The original project brief, five years ago, was to develop a new system for Unity3D and Unreal Engine to render sound objects in real time on a 360-degree multi-speaker system extending both horizontally and vertically, combined with 3D visuals.

​

Research process

Over five years of intensive research into sound perception, physical modelling simulation, and spatial audio technology design have led to the development of a groundbreaking immersive audio experience.



42.2 Genelec Smart IP loudspeakers, positioned in the holographic projection canvas layout, producing continuous phantom imaging in all directions, from all listening positions.

Design overview

New research goals: a big opportunity to innovate spatial sound technology

After delivering the first proof of concept, we quickly recognized the true research potential: the opportunity to create a system aimed at indistinguishably replicating acoustic reality within a VR/XR context. The system needed to be precise and reliable enough for cognitive neuroscience and psychoacoustic experiments, while also delivering big cinematic impact for simulations. Lastly, the software architecture needed to give us innovative new tools to build creatively.

 

​


Volumetric, holographic sound objects

with simulated physical properties such as material density, shape dimensions, orientation, scale, distance, speed, air resistance and more.  

 

Creating a holographic sound projection canvas.

​

What would it take to build an audio experience indistinguishable from reality?

​

To answer that question, we went back to the drawing board. After a long research period, we began reverse-engineering every physical property of sound in the real world in order to construct a model for an ultra-realistic sound simulation. We realised we would have to invent the technology ourselves, because it simply did not exist yet, at least not in the form we envisioned.

​

Volumetric sound objects

The goal was to make it completely impossible to hear any loudspeakers at all. We only wanted to hear volumetric sound objects that could be projected precisely into the space, ideally also in front of the loudspeakers, as holographic sound projections that do not depend on a traditional sweet spot, enabling multiple people to have the same optimised experience. The holographic sound projection canvas is designed to make it impossible to tell whether a sound is coming from a loudspeaker or from any other point in space, with an accuracy of ±9 mm.
 

Building the algorithm in combination with the speaker setup.

To create a projection canvas for holographic sound objects, we aimed to give our brains as few cues as possible about the physical locations of the speakers. After extensive research into spatialisation methods, we developed a custom speaker layout: we measured acoustic energy at evenly distributed points in the CAVE room and calculated ideal speaker locations to ensure a continuous acoustic energy presence throughout the space. The result is an optimised audio system with an equal acoustic energy distribution, both measured and perceived, from all listening positions.
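As a rough illustration of the underlying idea, the sketch below (ours, not the studio's proprietary method) scores a candidate loudspeaker layout by how evenly it spreads acoustic energy over a grid of listening positions, using a simple free-field inverse-square model; the room size and layout are hypothetical.

```python
import numpy as np

def energy_uniformity(speakers: np.ndarray, listeners: np.ndarray) -> float:
    """speakers: (S, 3) positions in metres; listeners: (L, 3) grid points.
    Returns the relative spread of summed energy; lower is more uniform."""
    # Distance from every listening point to every speaker.
    d = np.linalg.norm(listeners[:, None, :] - speakers[None, :, :], axis=-1)
    d = np.maximum(d, 0.1)              # avoid singularities right at a driver
    energy = (1.0 / d**2).sum(axis=1)   # free-field inverse-square law, summed
    return float(energy.std() / energy.mean())  # coefficient of variation

# Example: score a random, hypothetical 42-speaker layout in a 6 x 6 x 3 m room
# over a grid of ear-height listening points.
rng = np.random.default_rng(0)
candidate = rng.uniform([0.0, 0.0, 0.0], [6.0, 6.0, 3.0], size=(42, 3))
grid = np.stack(np.meshgrid(np.arange(1.0, 6.0), np.arange(1.0, 6.0), [1.7]),
                axis=-1).reshape(-1, 3)
print(f"uniformity score: {energy_uniformity(candidate, grid):.3f}")
```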

​

This approach ensures a direct, discrete transition from digital data to acoustic energy. Essentially, the loudspeakers themselves become inaudible, as sounds can be physically projected from any location with the exact same acoustic energy as if they were emitted directly from the loudspeakers.

 

The combination of not being able to tell where the speakers are and the ability to physically project acoustic energy at any location lets us present sounds exactly as they occur in the real world.
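For readers unfamiliar with phantom imaging, here is a minimal sketch of classic pairwise constant-power amplitude panning, the textbook mechanism behind phantom sources; the CAVE's own projection algorithm is proprietary and far more sophisticated.

```python
import numpy as np

def phantom_gains(angle: float, left: float, right: float) -> tuple[float, float]:
    """Constant-power gains placing a phantom source at `angle` (degrees)
    between two loudspeakers at `left` and `right` degrees."""
    p = np.clip((angle - left) / (right - left), 0.0, 1.0)  # 0 = left, 1 = right
    return float(np.cos(p * np.pi / 2)), float(np.sin(p * np.pi / 2))

g_left, g_right = phantom_gains(10.0, -30.0, 30.0)  # source slightly right of centre
print(f"L={g_left:.3f} R={g_right:.3f} power={g_left**2 + g_right**2:.3f}")  # power stays 1.0
```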

 

Perceptual refinements

After focusing on getting the physical acoustic energy distribution right, we spent years refining the perceptual parameters. What can be measured is not the same as what can be heard; psychoacoustics is equally important in designing the experience. We created the tools to perfect the illusion of depth, dimensionality, and movement in the sound.

 

Audiovisual correlation

We conducted dedicated audiovisual perceptual tests to closely match audio localisation with 3D visuals. Based on extensive testing with listeners, the system can now project virtual sounds as small as 9 mm at any location: below, above, behind, and even in front of the loudspeakers.

 

We are convinced that this level of precision can only be achieved by developing the software in tandem with the hardware, ensuring total architectural integration at every level of the process.

​

World overlay to scale 1:1

The system is designed as a life-sized transparent overlay in order to provide an accurate acoustic reality. One meter of virtual movement is one meter of real movement; the line between the real and the virtual is blurred as much as possible.


SOFTWARE DEVELOPMENT

​

After developing the spatialisation algorithm, the next objective was to research how to use the system's ultra-precise holographic rendering capabilities to simulate real-world behaviours and physics, in order to achieve convincingly realistic audio simulations. For this, we designed a comprehensive set of features for vast interactive world building.

​

Triggers

Program spatial sound events & actions based on in-simulation user interaction 

​

IO

Configure parameters for holographic sound objects

​

Visual 

3D representation of the CAVE audio system 

​

Environments

Simulate acoustic behaviour and ambience sounds. Navigate through immersive environments and simulate virtual acoustics based on the physical properties of the virtual world.



Why build a custom software environment?

The difference between custom-developed spatial audio software and regular commercial solutions is that commercial software must be compatible with a wide range of devices. This inevitably leads to compromises in spatial audio quality in favour of efficiency. Consequently, no commercially available VR audio product can currently render spatial audio at such a high resolution.

 

With custom software and in-house spatial audio expertise, we were able to develop a software package focused solely on achieving quality. By designing the hardware and software together, we eliminate the need for any compromises.

​


Physical & perceptual modelling 

The system is designed to make sounds feel realistic and materially rigid by simulating real-world physics. For example, sounds have a distinct front and back; they are volumetric in nature and can be oriented in any position relative to us. A virtual person can face us or be turned away from us. Sound objects affect their virtual surroundings in the same way that real objects do: sounds can be blocked by obstacles and disappear behind virtual corners, just like in the real world. Sound objects can collide with each other, and they can have different material densities, among other properties that influence their sonic footprint.
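The following hypothetical data model (the names are illustrative, not the engine's actual API) shows how such physical properties might attach to a sound object and shape its direct-path gain.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SoundObject:
    position: np.ndarray             # metres, world space
    orientation: np.ndarray          # unit forward vector of the emitter
    size: float = 0.009              # extent in metres (9 mm at the smallest)
    material_density: float = 1000.0 # kg/m^3, shapes the sonic footprint
    occluded: bool = False           # True when behind a wall or corner

    def direct_gain(self, listener: np.ndarray) -> float:
        """Direct-path gain from inverse distance, facing, and occlusion."""
        to_listener = listener - self.position
        dist = max(float(np.linalg.norm(to_listener)), self.size)
        facing = float(np.dot(self.orientation, to_listener / dist))  # +1 = facing us
        directivity = 0.5 + 0.5 * facing  # cardioid-like front/back weighting
        blocked = 0.1 if self.occluded else 1.0  # direct sound diminishes behind walls
        return directivity * blocked / dist

obj = SoundObject(np.array([0.0, 2.0, 1.7]), np.array([0.0, -1.0, 0.0]))
print(obj.direct_gain(np.array([0.0, 0.0, 1.7])))  # facing us from 2 m away
```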

 


Motion simulation

Motion of virtual sound objects is enhanced by simulating the resistance of the air as they move through virtual space.
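As an illustration, here is a minimal sketch of two standard ingredients of such motion simulation, Doppler shift and distance-dependent air absorption; the coefficients are illustrative, not the engine's.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s at roughly 20 °C

def doppler_ratio(radial_velocity: float) -> float:
    """Pitch ratio for a source moving at `radial_velocity` m/s toward (+)
    or away from (-) a stationary listener."""
    return SPEED_OF_SOUND / (SPEED_OF_SOUND - radial_velocity)

def air_absorption_cutoff(distance: float) -> float:
    """Illustrative low-pass cutoff (Hz) that falls with distance, mimicking
    how the air progressively swallows high frequencies."""
    return float(np.clip(20000.0 / (1.0 + 0.05 * distance), 500.0, 20000.0))

print(doppler_ratio(10.0))           # ~1.03: an approaching source sounds higher
print(air_absorption_cutoff(50.0))   # a distant source loses its treble first
```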

​

The CAVE Spatial Audio Engine is designed for easy integration with Unity3D and Unreal Engine, enabling developers to control large numbers of loudspeakers directly from game-engine environments. It offers all the capabilities expected of a modern VR audio environment builder and brings this functionality to high-resolution multi-speaker spatial audio for the first time.
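As a sketch of what such an integration could look like, the example below streams object transforms from a game loop to an external renderer over OSC, a common engine-to-audio bridge. The /cave/... address schema, host, and port are invented for illustration; the engine's real control protocol is not public.

```python
from pythonosc.udp_client import SimpleUDPClient  # pip install python-osc

client = SimpleUDPClient("127.0.0.1", 9000)  # renderer host/port are assumptions

def on_engine_tick(object_id: int, x: float, y: float, z: float,
                   yaw: float, pitch: float, roll: float) -> None:
    """Called once per frame by the game loop with world-space transforms."""
    client.send_message(f"/cave/object/{object_id}/position", [x, y, z])
    client.send_message(f"/cave/object/{object_id}/orientation", [yaw, pitch, roll])

on_engine_tick(7, 1.2, 0.0, 1.7, 90.0, 0.0, 0.0)  # object 7, roughly 1 m ahead
```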

​

DYNAMIC CLUSTERS

The CAVE Audio Engine automatically assigns new sound objects as we navigate through the virtual world.

Sound sources can be spawned into existence dynamically and generatively; the number of sources that can be generated is theoretically endless. The CAVE Spatial Audio Engine has built-in capabilities to scatter, swarm, and deviate large groups of sources so they feel and sound more organic.
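A minimal sketch of the scatter-and-deviate idea (illustrative, not the engine's API):

```python
import numpy as np

def spawn_cluster(centre: np.ndarray, count: int, spread: float,
                  rng: np.random.Generator) -> np.ndarray:
    """Scatter `count` source positions (metres) around a cluster centre."""
    return centre + rng.normal(scale=spread, size=(count, 3))

def deviate(positions: np.ndarray, jitter: float,
            rng: np.random.Generator) -> np.ndarray:
    """Per-frame random-walk deviation so the swarm feels organic."""
    return positions + rng.normal(scale=jitter, size=positions.shape)

rng = np.random.default_rng(42)
asteroids = spawn_cluster(np.array([0.0, 0.0, 2.0]), count=200, spread=5.0, rng=rng)
asteroids = deviate(asteroids, jitter=0.02, rng=rng)  # one simulation step
```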



Large numbers of generated asteroids flying through space, making sounds as they bump into each other.


Virtual Acoustics Engine

The ability to precisely simulate spatial acoustic reflections and place sound objects in their acoustic environment

 

​

Example: simulated spatial acoustics of a life-sized church

​

Virtual Acoustics Engine

​

An important aspect of making sounds feel real is presenting them in their corresponding acoustic context.

The CAVE Spatial Audio Engine has a new, in-house-developed 42.2-channel virtual acoustics processor built in.

 

Nuanced sonic reflections are generated in real time and adapt to the virtual environment. As we navigate the 42.2 speaker system through different parts of the virtual world, the acoustics change accordingly to match the visual surroundings, based on the parameters of the virtual environment (a minimal sketch of one such mapping follows the list):

  • Size of the room

  • Walls, corners and sound blocking obstacles

  • Materials of the space (concrete, wood, metal, etc)

  • Physical modelling of acoustic meshes

  • Line of sight raytracing 

  • Live array microphone input 
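As promised above, here is a minimal sketch of how room size and material can drive a reverberation time, using Sabine's classic formula; the absorption values are illustrative, and the engine's actual mapping is far more elaborate.

```python
# Illustrative mid-band absorption coefficients per material.
ABSORPTION = {"concrete": 0.05, "wood": 0.10, "metal": 0.05, "curtain": 0.50}

def rt60(volume_m3: float, surface_m2: float, material: str) -> float:
    """Sabine's estimate of reverberation time from room size and material."""
    return 0.161 * volume_m3 / (surface_m2 * ABSORPTION[material])

print(f"concrete hall:     {rt60(4000, 1600, 'concrete'):.1f} s")  # long, church-like
print(f"small wooden room: {rt60(100, 130, 'wood'):.2f} s")        # much shorter
```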


Example: the virtual acoustics engine rendering the four corridors of the church, generating 840,000 decorrelated reflections based on 1008 real data points


Early reflections, 42.2 discrete reverberance

An important part of sound object localisation is the dispersion of early reflections. The initial reflections (up to 30 ms) give our brain important cues about the size of the room and the location of the sound object within its contextual environment.

The CAVE Spatial Engine generates accurately simulated early reflections to provide the brain with these vital contextual cues. The engine renders reflections with the same 0.9 cm phantom precision and with timing resolution as fine as 300 microseconds, all discretely spatialised on the 42.2 system. This resolution comes close to what humans are biologically capable of perceiving.
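One standard way to derive such sub-30 ms reflection patterns is the image-source method; the sketch below computes first-order reflections in a simple shoebox room (the engine itself is far more detailed than this).

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def first_order_reflections(src, lst, room):
    """Delays (s) and inverse-distance gains of the six wall reflections in an
    axis-aligned 'shoebox' room of dimensions `room` = (Lx, Ly, Lz) metres."""
    src, lst, room = (np.asarray(a, dtype=float) for a in (src, lst, room))
    reflections = []
    for axis in range(3):
        for wall in (0.0, room[axis]):            # the two walls on this axis
            image = src.copy()
            image[axis] = 2.0 * wall - src[axis]  # mirror the source across the wall
            dist = float(np.linalg.norm(image - lst))
            reflections.append((dist / SPEED_OF_SOUND, 1.0 / dist))
    return reflections

for delay, gain in first_order_reflections([2, 3, 1.5], [4, 3, 1.7], [6, 8, 3]):
    print(f"delay {delay * 1000:5.1f} ms, gain {gain:.3f}")
```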

​


Position-based reflections

The scattering dispersion of audio reflections is physically simulated from the live position of the sound object. 

​

Obstacles

When a sound object is behind a corner or wall, the direct (line-of-sight) sound will diminish, and indirect reflections are perceived just like in reality.

 

Orientations

The volumetric rotation of sound objects has a significant influence on virtual acoustics. When a sound is facing away from us, the direct sound is blocked based on the thickness and material of the emitter (aperture). In addition, the directivity of the source determines the paths the indirect reflections take.

​

Frequency-based

Low frequencies tend to travel through obstacles more easily than high frequencies. This behaviour is simulated based on the material properties of both the object and the obstacle to calculate the corresponding acoustic presence.
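A minimal sketch of this frequency-dependent occlusion, modelled as a one-pole low-pass whose cutoff depends on the obstacle; the cutoff values are illustrative placeholders, not measured data.

```python
import numpy as np

# Illustrative transmission cutoffs; denser obstacles block more treble.
OBSTACLE_CUTOFF_HZ = {"curtain": 4000.0, "wood": 800.0, "concrete": 150.0}

def occlude(signal: np.ndarray, sample_rate: float, obstacle: str) -> np.ndarray:
    """One-pole low-pass modelling transmission through `obstacle`."""
    cutoff = OBSTACLE_CUTOFF_HZ[obstacle]
    alpha = 1.0 - np.exp(-2.0 * np.pi * cutoff / sample_rate)
    out = np.empty_like(signal)
    state = 0.0
    for i, x in enumerate(signal):   # y[n] = y[n-1] + alpha * (x[n] - y[n-1])
        state += alpha * (x - state)
        out[i] = state
    return out

noise = np.random.default_rng(1).standard_normal(48000)
muffled = occlude(noise, 48000.0, "concrete")  # only the low rumble gets through
```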

 

Custom 42.2 Spatial Convolution IR

Convolution is a technique for capturing real-world acoustics by playing a sweep (all frequencies) in an acoustic space and recording the reverberation trail. This data can then be used to place any virtual source back into that environment by "convolving" the audio with the captured acoustic profile.
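The convolution step itself can be sketched in a few lines; the signals below are synthetic stand-ins for real recordings, and in the CAVE this happens discretely per channel.

```python
import numpy as np
from scipy.signal import fftconvolve

def convolve_into_space(dry: np.ndarray, impulse_response: np.ndarray) -> np.ndarray:
    """Place a dry signal in a captured space by convolving it with one IR."""
    wet = fftconvolve(dry, impulse_response)   # fast FFT-based convolution
    return wet / np.max(np.abs(wet))           # normalise to avoid clipping

# Stand-ins: white noise as the 'dry' signal, decaying noise as a 2 s IR.
rng = np.random.default_rng(2)
dry = rng.standard_normal(48000)
ir = np.exp(-np.linspace(0.0, 8.0, 96000)) * rng.standard_normal(96000)
wet = convolve_into_space(dry, ir)             # dry signal now carries the room's tail
```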

​

For the CAVE acoustics, we captured our own impulse responses using 42 omnidirectional microphones, positioned exactly where the speakers are. This approach aligns with our philosophy of transferring digital data into acoustic energy as realistically as possible. By capturing custom impulse responses with the CAVE speaker setup, each loudspeaker can reproduce the exact corresponding acoustics. The sweeps are played through 24 omnidirectional speakers from Bloomline Acoustics, chosen for their ability to radiate sound equally in all directions, which excites the room's reflections in detail.

​

The combination of 24 loudspeakers and 42 microphones results in 1008 discrete directions of convolution reverb. This extensive positional data allows us to accurately position sound objects and create a convolution reverb that updates acoustic reflection placement in real time as volumetric sound objects move through space.

​

Hybrid System Based on a Trained Model

After capturing the highest resolution spatial acoustic convolution currently in existence, we needed a dynamic way to utilize this data in simulations.

 

Our spatial sampling resolution of 1008 acoustic reflection angles allowed us to train a sophisticated model for interpolating and predicting different virtual rooms based on size, materials, and other properties.

Our system intelligently combines the early reflection patterns captured in our 42.2 convolution CAVE format with our discrete 42.2 generative reflection engine for late reflections. This hybrid approach offers the realism of convolution (early reflections) and the flexibility of algorithmic virtual acoustics (late reflections) for live simulations.
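A minimal sketch of the hybrid idea, assuming a 30 ms early/late split and a noise-based generative tail as stand-ins for the engine's actual components:

```python
import numpy as np
from scipy.signal import fftconvolve

def hybrid_reverb(dry, measured_ir, sample_rate=48000, split_ms=30.0, rt60=2.5):
    """Measured IR for the early part, generative decaying noise for the tail."""
    split = int(sample_rate * split_ms / 1000.0)
    early = fftconvolve(dry, measured_ir[:split])     # realism: convolved early part
    t = np.arange(int(sample_rate * rt60)) / sample_rate
    tail = np.random.default_rng(4).standard_normal(t.size) * 10 ** (-3.0 * t / rt60)
    late = fftconvolve(dry, tail) * 0.05              # flexibility: algorithmic tail
    out = np.zeros(max(early.size, late.size))
    out[:early.size] += early
    out[:late.size] += late
    return out
```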


Blending the Real with the Virtual

As an interactive layer, we have installed a WFS array microphone to spatially capture sounds created by participants. Simply put, in a virtual church, your own voice will sound like it is right there. The system spatially distributes live reflection patterns across all 42 loudspeakers to accurately reproduce the early and late reflections of the acoustic environment. The microphone processing is smart enough to eliminate sound emanating from the loudspeakers and focus only on the people inside.

By blending real acoustics with the simulated environment, we can completely blur the boundaries between the real and the virtual world.
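One standard technique for removing loudspeaker sound from a live microphone is adaptive echo cancellation; below is a minimal single-channel NLMS sketch (the CAVE's actual array processing is undisclosed and certainly more sophisticated).

```python
import numpy as np

def nlms_echo_cancel(mic, speaker_feed, taps=256, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the speaker echo from the mic signal."""
    w = np.zeros(taps)       # adaptive FIR estimate of the speaker-to-mic path
    x_buf = np.zeros(taps)   # most recent speaker samples, newest first
    out = np.zeros_like(mic)
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = speaker_feed[n]
        error = mic[n] - w @ x_buf                        # what remains: the participants
        w += mu * error * x_buf / (x_buf @ x_buf + eps)   # NLMS weight update
        out[n] = error
    return out
```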


Spatial Environments Engine

Navigating through procedural spatial ambiences, weather, and music in real time.

 

​

A different way of interacting with the CAVE audio system is by using our real-time immersive environments engine to simulate entire spatial audio worlds at once.

For example, when simulating a rainstorm, forest, or background wind, we want sounds to fill the space naturally. It would be inefficient to render thousands of real-time sound objects to accomplish this. That's where spatial environments come in. This system allows us to create and play a spatial render from a single file, complete with object-based spatial metadata for real-time navigation on 42 loudspeakers. 

​

Environments are designed to provide a natural background for our holographically projected sound objects.

​

Details for tech people:

  • Based on 12th order ambisonics (custom encoder and decoder; see the sketch after this list)

  • Integrated with our 42.2 Virtual Acoustics Convolution Engine

  • Navigable ambience zones

  • Endless use of environments in a simulation, with up to 5 layers mixed together at the same time

  • We use our content creation framework for the CAVE Spatial Engine in Ableton Live to create spatial sound design, music, and more.

  • Designs can be compiled to our playable file format for use in simulation.

  • Backwards compatible with any immersive audio format.
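For the ambisonics bullet above, a minimal sketch of the bookkeeping involved: an order-N stream carries (N+1)² channels, so 12th order means 169. Encoding is shown at first order (traditional B-format) for brevity; the CAVE's 12th-order encoder is custom and not reproduced here.

```python
import numpy as np

def channel_count(order: int) -> int:
    return (order + 1) ** 2        # 12th order -> 169 channels

def encode_first_order(mono: np.ndarray, azimuth: float, elevation: float) -> np.ndarray:
    """Encode a mono signal to traditional B-format (W, X, Y, Z); angles in radians."""
    w = mono / np.sqrt(2.0)                       # omnidirectional component
    x = mono * np.cos(azimuth) * np.cos(elevation)
    y = mono * np.sin(azimuth) * np.cos(elevation)
    z = mono * np.sin(elevation)
    return np.stack([w, x, y, z])

print(channel_count(12))           # 169 channels at 12th order
```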

​

In-house developed Recording technique

We have developed an in-house spatial recording technique using 42 omnidirectional microphones placed exactly where the loudspeakers are. This means every speaker plays back exactly what it would if it were really there, from its perspective, just like with the virtual acoustics engine. This approach makes sure that recordings are as close to reality as they can be. 

​


The cars are localisable holographic sound objects; the rest of the city noises are part of an ambience zone rendered by our environments engine.

 

Lastly, the cars and the background ambience are combined in the 42.2 virtual acoustics hybrid convolution engine to place them perceptually in the same acoustic context.

​


​

Software System Architecture

The holographic sound object projection canvas, the Environments Engine, and the Virtual Acoustics Engine are all designed to work together, providing our brains with the greatest possible acoustic depth.

 


1. Object Projection Canvas

The loudspeaker system's hardware and the software algorithms are engineered together in order to make the loudspeakers totally imperceptible. Instead, we hear defined, detailed holographic audio sources as small as 9 mm. These virtual objects can come very close to us inside the CAVE, and they behave like real-world volumetric objects.




2. Environments Engine

Detailed atmospheric background environments, accurately represented at 1:1 scale. Environments are designed to provide sonic depth of field. Navigate interactively through ultra-realistic landscapes of sound, captured with 42 microphones or creatively designed specifically for the CAVE audio system using our in-house creation tools.

 


3. Virtual Acoustics Engine

Using a combination of 42.2 spatial convolution and physical modelling techniques to bring all the sound together in a coherent, realistic acoustic environment. Acoustics adapt intelligently in real time as we move through different virtual locations, based on live raytracing. The system is designed to faithfully simulate real-world acoustics in full spatial dimensionality.



Demo

One of the challenges with this spatial audio system is that it is impossible to grasp its true potential without hearing it for yourself. To learn more about this spatial audio hardware and software system, do not hesitate to reach out for a demonstration (marijn@cinjee.com).

​

Reach out for projects

At Studio Marijn Cinjee, we are always looking for exciting new immersive audio/visual projects. We can play an important role both technically and artistically, bringing spatial audio expertise, creative vision, and technical development skills to the project.

​

Studio Marijn Cinjee works on a bespoke project basis in order to achieve the highest quality.

If you are looking to include immersive audio, acoustic architecture, or interactive sound design in your project, or if you just happen to have a crazy idea for an installation, please feel free to get in touch and involve us in the drawing process so we can create an integrated, high-quality audio solution for your specific needs.

​

Studio Marijn Cinjee can play an important role in the following tasks:

  • Architecture and conceptualization of the sound system

  • Consultancy on hardware and software

  • Acoustical engineering

  • Custom software development

  • Virtual acoustics engine design | real-time position-based reflection simulation

  • Design and development of creative tools for simulation content creation

  • Spatial sound design for simulations | artistic vision

  • Education and training

  • Presentations and demos on request

​

Please reach out to marijn@cinjee.com to schedule an introductory meeting.

 

​


FAQ 

Is your XR spatial audio framework publicly available?  

The CAVE spatial audio engine framework is shared intellectual property of Studio Marijn Cinjee and DAF Technology Lab, therefore the software is not publicly available as a product. Studio Marijn Cinjee works on a bespoke project basis in order to achieve the highest quality.

If you have a great idea for a new project involving hyperrealistic spatial audio design, please reach out to us. Together we can create a design specifically tailored to your use case.

​

Can I use the CAVE audio installation to play existing simulations from other formats?

Yes, an important part of the design is the capability to run existing VR simulations from various formats. 

​

The CAVE audio system is compatible with:  

  • Dolby Atmos

  • Ambisonics (up to 12th order)

  • Any discrete speaker configuration (via virtualisation)


I am a researcher / teacher, and I want to use spatial audio in my practice

Studio Marijn Cinjee can help formulate clearly how and why spatial audio should be implemented in your project and offer practical solutions to support your research. Studio Marijn Cinjee can assist in the creation of content and custom software, and can conduct scientifically accurate measurements to verify the results.

​

I am an organiser / entrepreneur and I want to build an immersive audio system like this

If you are looking to include immersive audio, acoustic architecture, or interactive sound design in your project, or if you just happen to have a crazy idea for an installation, please feel free to get in touch and involve us in the drawing process so we can create an integrated, high-quality audio solution for your specific needs. See "Reach out for projects" above for the full list of tasks Studio Marijn Cinjee can take on.

​

I am a game developer for VR / XR and I would like to work with this system

We advise visiting the DAF Technology Lab for a demonstration and in-depth technical explanation to get started. 

