[FFmpeg-devel] [PATCH 4/5] x86/opusdsp: implement FMA3 accelerated postfilter and deemphasis

Hendrik Leppkes h.leppkes at gmail.com
Tue Apr 2 15:16:26 EEST 2019


On Mon, Apr 1, 2019 at 4:59 PM James Almer <jamrial at gmail.com> wrote:
>
> On 4/1/2019 9:13 AM, Lynne wrote:
> > Mar 31, 2019, 11:49 PM by jamrial at gmail.com:
> >
> >> A float ret value needs to be in xmm0, and you swapped m0 with m2 on
> >> Win64. This is the source of the fate failure.
> >>
> > Attached a patch to fix this.
> >
> >> %if WIN64
> >> -    SWAP 0, 2
> >> -%endif
> >> +    shufps m0, m2, 0
> >> +%else
> >>       shufps m0, m0, 0
> >> +%endif
> >> %endif
> >
> > 0001-x86-opusdsp-fix-WIN64-return-value.patch
> >
> > From 98e93b6f322a2a9dee7499fe87b74cf50a33b022 Mon Sep 17 00:00:00 2001
> > From: Lynne <dev at lynne.ee>
> > Date: Mon, 1 Apr 2019 13:06:34 +0100
> > Subject: [PATCH] x86/opusdsp: fix WIN64 return value
> >
> > ---
> >  libavcodec/x86/opusdsp.asm | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/libavcodec/x86/opusdsp.asm b/libavcodec/x86/opusdsp.asm
> > index ed65614e06..a7bff4f0b3 100644
> > --- a/libavcodec/x86/opusdsp.asm
> > +++ b/libavcodec/x86/opusdsp.asm
> > @@ -40,9 +40,10 @@ cglobal opus_deemphasis, 4, 4, 8, out, in, coeff, len
> >      VBROADCASTSS m0, coeffm
> >  %else
> >  %if WIN64
> > -    SWAP 0, 2
> > -%endif
> > +    shufps m0, m2, 0
>
> No, like this shufps will interleave the values from m2 and m0. The
> correct instruction would be
>
> shufps m0, m2, m2, 0
>
> Where m2 is used for both input arguments, and the VEX encoding lets us
> use another reg as output.
> Tested and pushed it with that change. Thanks.

Opus tests still fail on win32, both mingw and msvc.

- Hendrik


More information about the ffmpeg-devel mailing list