[FFmpeg-devel] [PATCH] swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_1

Lauri Kasanen cand at gmx.com
Sun Apr 7 09:18:14 EEST 2019


On Sun, 31 Mar 2019 17:18:47 +0300
Lauri Kasanen <cand at gmx.com> wrote:

> ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
>         -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
>         -cpuflags 0 -v error -
>
> Uses 32-bit multiplies, so POWER8 only.
>
> 1.8-2.3x speedup:
>
> rgb24
>   18192 UNITS in yuv2packed1,   32767 runs,      1 skips
>    9983 UNITS in yuv2packed1,   32760 runs,      8 skips
> bgr24
>   18665 UNITS in yuv2packed1,   32766 runs,      2 skips
>    9925 UNITS in yuv2packed1,   32763 runs,      5 skips
> rgba
>   20239 UNITS in yuv2packed1,   32767 runs,      1 skips
>    8794 UNITS in yuv2packed1,   32759 runs,      9 skips
> bgra
>   20354 UNITS in yuv2packed1,   32768 runs,      0 skips
>    8770 UNITS in yuv2packed1,   32761 runs,      7 skips
> argb
>   20185 UNITS in yuv2packed1,   32768 runs,      0 skips
>    8761 UNITS in yuv2packed1,   32761 runs,      7 skips
> bgra
>   20360 UNITS in yuv2packed1,   32766 runs,      2 skips
>    8759 UNITS in yuv2packed1,   32764 runs,      4 skips
>
> This is a modest speedup, but the x86 MMX version also gets only ~2x. The MMX version
> is also heavily inaccurate, while the VSX version has high accuracy.
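
For reference, the quoted 1.8-2.3x range can be re-derived from the UNITS figures above. This is only a minimal sketch: it assumes the first line of each pair is the unpatched path and the second is the VSX path (the message does not label them), and the names `timings` and `speedup` are made up for this illustration.

```python
# UNITS per yuv2packed1 call, copied from the quoted benchmark output.
# Assumption: first figure = before the patch, second figure = with VSX.
timings = [
    ("rgb24", 18192, 9983),
    ("bgr24", 18665, 9925),
    ("rgba", 20239, 8794),
    ("bgra", 20354, 8770),
    ("argb", 20185, 8761),
    ("bgra (second listing)", 20360, 8759),
]


def speedup(before, after):
    """Return how many times faster the 'after' measurement is."""
    return before / after


# Map each format label to its speedup ratio.
ratios = {name: speedup(before, after) for name, before, after in timings}

for name, ratio in ratios.items():
    print(f"{name:>22}: {ratio:.2f}x")

print(f"range: {min(ratios.values()):.1f}x to {max(ratios.values()):.1f}x")
```

The 24-bit formats land near the low end of the range and the 32-bit formats near the high end.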

Applying.

- Lauri

