[FFmpeg-devel] [PATCH] libavcodec/zmbvenc: Add support for RGB formats

Wed Mar 13 19:46:57 EET 2019

> On 12 Mar 2019, at 11:46, Tomas Härdin <tjoppen at acc.umu.se> wrote:
> 
> tis 2019-03-12 klockan 10:27 +0000 skrev Matthew Fearnley:
>>> On 11 Mar 2019, at 10:37, Tomas Härdin <tjoppen at acc.umu.se> wrote:
>>> 
>>> 
>>> There's some justification for adding sub-8bpp, like BMP. bmp.c
>>> converts all of them except GRAY8 to PAL8. Bitdepths besides 1, 4
>>> and 8 don't work at all.
>>> 
>>> One way to at least allow both the bmp and zmbv encoders to do
>>> sub-8bpp from PAL8 would be to keep track of the maximum number of
>>> colors in some appropriate struct.
>>> 
>>> Adding proper sub-8bpp support would involve a lot of libsws
>>> headache I suspect.
>> 
>> It occurs to me that adding sub-8bpp has some implications:
>> 
>> My current understanding (I could be wrong) is that FFmpeg tends to
>> detect the pix_fmt based on the first frame. If FFmpeg detects the
>> first frame as e.g. PAL4, and chooses that as its output, that means
>> the rest of the video will have to be encodable as PAL4, otherwise it
>> (obviously) won’t be encoded properly.
>> 
>> So adding a PAL4 format puts a new constraint on encoders (inside and
>> outside FFmpeg) to not encode frames in a way that looks like PAL4,
>> unless the whole video will be encodable that way.
> 
> Yes, FFmpeg will probe the initial format of the video and audio.
> Nothing says these are constant. There are FATE samples specifically
> for files that change resolution. Since ZMBV is a DOS capture codec,
> and DOS programs frequently change resolution and colordepth, this is
> indeed something we have to think about. Example: DOS boots into mode
> 3h, 80x25 16-color text. A recording may start in this mode, then
> switch to mode 13h (320x200 256 colors graphics), if I understand the
> format correctly.

DOSBox actually avoids this issue by outputting to a new file whenever it detects a change in colour depth, dimensions or FPS. So in practice, these remain constant over a single video.
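
As a rough illustration of that behaviour (a sketch only, not DOSBox's actual code; the struct and function names are made up for this example), the check amounts to comparing the parameters of the incoming frame against those of the file currently being written:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of the capture state; not DOSBox's real types. */
struct capture_params {
    int width;
    int height;
    int bpp;     /* colour depth in bits per pixel */
    double fps;
};

/* Returns true if the next frame no longer matches the file being written,
 * i.e. the capture should close it and start a new output file. */
bool needs_new_file(const struct capture_params *cur,
                    const struct capture_params *next)
{
    assert(cur && next);
    /* Exact comparison is intended here: any change at all counts. */
    return cur->width  != next->width  ||
           cur->height != next->height ||
           cur->bpp    != next->bpp    ||
           cur->fps    != next->fps;
}
```

With a check like this in front of the encoder, each output file sees a single resolution, depth and frame rate from start to finish.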

I’ve tried stitching two AVI files together (palette + RGB, same dimensions), and found that if a palette-capable format like ‘png’ is chosen, the whole resulting AVI file is encoded with a palette, and the RGB sections are dithered. That possibly suggests FFmpeg doesn’t like changing format halfway through videos, for AVI at least?
> 
>> If FFmpeg supports PAL8 only, then it can be tempting to optimise
>> videos to encode as sub-8bpp whenever possible, knowing they will
>> always (in FFmpeg at least) decode to PAL8. But this could break
>> format detection for tools outside FFmpeg, if they choose to add
>> sub-8bpp support.
>> 
>> The safest thing FFmpeg can do is to always decode sub-8bpp to PAL8,
>> and to emit PAL8 frames as exactly 8bpp (where applicable). It could
>> still offer encoding formats for PAL1/2/4, but these formats could
>> only be detected by scanning the whole video.
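
To illustrate the last point, here is a minimal sketch, assuming each PAL8 frame is a plain array of palette indices (the function names and frame layout are hypothetical, not FFmpeg API). The usable depth is only known once every frame has been checked:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest of 1, 2, 4 or 8 bits that can hold palette index max_index. */
int min_pal_bits(uint8_t max_index)
{
    if (max_index < 2)  return 1;
    if (max_index < 4)  return 2;
    if (max_index < 16) return 4;
    return 8;
}

/* Scan every frame of a video; frames[i] points to frame_size index bytes.
 * Returns the smallest palette depth that fits the whole video. */
int video_min_pal_bits(const uint8_t *const *frames, size_t nb_frames,
                       size_t frame_size)
{
    uint8_t max_index = 0;

    assert(frames || nb_frames == 0);
    for (size_t i = 0; i < nb_frames; i++)
        for (size_t j = 0; j < frame_size; j++)
            if (frames[i][j] > max_index)
                max_index = frames[i][j];

    return min_pal_bits(max_index);
}
```

A first frame that only uses indices 0 and 1 looks like PAL1 on its own, but a later frame using index 200 forces the whole video to 8 bits, which is why probing just the first frame is not enough.
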
>> 
>> The suggestion of bits_per_raw_sample sounds interesting. What would
>> that look like in practice?
> 
> The decoder would set bits_per_raw_sample on every keyframe. It and
> resolution might change during the course of a ZMBV file, as explained
> above.
> 
> I think frivolously changing the bitdepth is a bad idea, at least
> inside the encoder. If the encoder is fed with changing bitdepths then
> it's a different story.
> 
>>> Just a small thing to be clear: ZMBV_ENABLE_24BPP is not defined
>>> anywhere, so we're free to do however we want with it. It's not
>>> going to break anyone's workflow unless they were foolish enough to
>>> encode 24-bit ZMBVs outside of the non-existing spec.
>> 
>> True. But they might be understandably puzzled if they encode as
>> 24-bit, and then find the channels swapped when they decode.
> 
> Yeah, that's why we'd need to get this nailed down
> 
>> Thanks for writing the email to the DOSBox crew.
>> If they choose a channel order, then we have good grounds for fixing
>> the encoder (if need be), and implementing the decoder in the same
>> way.
>> It occurs to me that they might (in theory) also want to specify 2/4
>> byte alignment on RGB, like with the MVs.  My gut says there’d be
>> very little benefit though, and it would only be seen with strange
>> video / block widths.
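
For illustration, a sketch of how much padding such alignment would add, assuming (hypothetically) that each row of a 24-bit block were padded separately. Nothing like this is in the format today, and the function names are made up:

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of alignment (e.g. 2 or 4). */
size_t align_up(size_t size, size_t alignment)
{
    assert(alignment > 0);
    return (size + alignment - 1) / alignment * alignment;
}

/* Padding bytes needed after one row of block_width pixels at
 * 3 bytes per pixel, for the given alignment. */
size_t rgb24_row_padding(size_t block_width, size_t alignment)
{
    size_t row_bytes = block_width * 3;
    return align_up(row_bytes, alignment) - row_bytes;
}
```

With 16-pixel-wide blocks a row is 48 bytes, so no padding is needed for either alignment; a 5-pixel-wide edge block (15 bytes) would need 1 byte.
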
>> 
>> It also occurs to me this may warrant a version bump in the
>> format, to give an easy error case for decoders that don’t expect it.
>> Particularly if our decoder has to redefine its channel order.
> 
> Are there any decoders besides ours and dosbox's?

I don’t know of any public implementations.

(That said, I have written a stand-alone tool that encodes/decodes ZMBV, but I’ve not published it anywhere. It’s based heavily on the DOSBox implementation.)

> It also turns out the creator of this codec is Harekiet, who hangs out
> in #revision on IRCnet

Ah ok, do you know him?
It sounds like he’s not concerned too much about what direction we take the format in. But I guess anything we implement may not get made official unless DOSBox adds decoding support.

By the way, I’m happy for this patch to be committed as-is (possibly without the extra note on unsupported bit depths, if that causes any issues). Any new additions outside the existing spec would warrant a new patch I think.

Matthew