[FFmpeg-devel] [PATCH] Support for Ambisonics and OpusProjection* API.

Drew Allen bitllama at google.com
Mon Apr 23 19:02:03 EEST 2018


Hi Rostislav,

Here is my feedback:

> I am sorry, I don't agree that this should be merged quite yet.
>
> First of all, the draft that defines how the channel mapping family is
> interpreted (draft-ietf-codec-ambisonics-04) isn't finalized yet. What if
> it changes after we make a release? We can't really change it then. This
> affects both encoding and decoding.

The draft has been relatively unchanged for the past two years, aside from
minor edits. It is headed to a working group for finalization very soon,
and no one has yet raised a single issue regarding any of the changes this
patch introduces. I wrote the OpusProjection* API, and it has been adopted
in all Opus-related Xiph master branches.

> Second, the API isn't in any libopus release yet. What if we make a
> release and then the draft changes, so the API in libopus needs to be
> changed? Or the API in the current git master of libopus changes? We
> can't rely on an unstable API in a library.

I worked closely with Jean-Marc Valin to design the API in Opus 1.3 to his
specification. The Opus 1.3 beta already contains this new API, and
Jean-Marc has assured me that the OpusProjection* API will be supported in
the 1.3 RC upon release.

> Third, this patch makes the decoder always do demixing with the data in
> extradata. What if someone wants to just decode and do their own demixing
> with positional information? If we later changed this so the decoder
> outputs raw ambisonics, we'd break decoding for anyone depending on the
> current behaviour. We never do any mixing or conversions in decoders, so
> the ambisonics data needs to be exposed directly, and anyone wanting to
> demix it should use a filter (there's an unmerged filter to do that) or
> do it themselves.
> What we need to do to properly handle ambisonics is:
> 1. Be able to describe ambisonics. This needs a new API.
> 2. Be able to expose the demixing matrix via frame side data.
> 3. Have a filter which can use both to provide a demixed output, better
> yet with positional information.

I disagree that a filter or some other layer of abstraction is necessary
here. OpusProjection* does not code the ambisonic channels directly;
instead, they are mixed using a mixing matrix that minimizes coding
artifacts over the sphere. The demixing matrix on the decoder side is
vital to recover the original ambisonic channels, and
OpusProjectionDecoder handles this automatically.

> I think the draft should become an RFC first. That gives us enough time
> to work on 1), which should take the longest to do and agree on. 2) is
> trivial, and 3) is, from what I know, mostly done.

I completely disagree. The IETF draft has been stable for over a year, and
the same changes supporting the new API are already present in Opus,
libopusenc, opusfile and opus-tools.

