Topics

BRAW codec - analysis

Daniel Rozsnyó
 

Hi,
  I see many people seeking details on the Blackmagic RAW codec, so here are mine:

  I have looked at the SDK as well, from the perspective of its DLL files. There are three files: a main library containing the bitstream decoding on CPU using SSE/AVX, and then the GPU acceleration libraries, which catch the eye since they contain a clear-text (non-encrypted) chunk of obfuscated CUDA and OpenCL shader code. These actually constitute 97 percent of the CUDA lib and 93 percent of the OpenCL lib; each had 4 fragments.

  De-obfuscation was not that hard, since they rely on the C pre-processor... I just ran "gcc -E", which produced somewhat readable source code, still with obfuscated variable/function names. Looking at the code, it is pretty easy to tell what it does (and I am not a CUDA/OpenCL programmer). I produced a set of 400 names for those functions and variables. It looks like a human error made in a rush (no thinking after the 19th coffee?), or maybe just an intent to put in a "protection to be violated" for the sake of the DMCA. Well... this is just hiding the elephant behind a tree.


What there is:

    8x8 DCT (same as in jpeg-ext, prores, prores-raw and other codecs) - but in 12 bit precision
        (with specific funny constants from one academic paper telling you how to do it super-fast)

    The iDCT has separate shaders for full, half, quarter and eighth of the resolution

    The DCT-compressed data is in a pseudo-YCbCr colour space, i.e. it encodes a weighted RGB average plus differences towards R and B

    Transfer curve that is partially linear and partially quadratic, with a threshold
    Linearization LUT, with 32K points

    There is a yet-unknown feature that selects one of 4 tables for how to mix the colors (or that might be the quantization)

    Decoding quality is either fast/rough or high-quality
    Decoding to 4 buffer formats (RGBA, 32bit, 32bit planar, 16bit planar)
    Decoding to 2 not yet understood formats with 1 and 4 elements (maybe yuv/rgb or rgb/raw)

    Simple and straightforward processing (blacklevel, gain, linearization, demosaic, saturation, colorspace conversion)
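The linear-plus-quadratic transfer curve and the 32K-point LUT above can be sketched in a few lines. A minimal Python illustration only - the constants S, T, K here are made up, since the actual BRAW curve parameters are unknown to me:

```python
# All constants are made up for illustration; the real BRAW curve parameters
# are unknown. S = linear slope, T = threshold, K = quadratic gain.
S, T, K, N = 1.0, 0.25, 8.0, 32768

def decode_curve(v):
    """Piecewise transfer curve: linear below the threshold T, with a
    quadratic term added above it (C1-continuous at the join)."""
    return S * v if v <= T else S * v + K * (v - T) ** 2

# a 32K-entry linearization LUT, like the one the decoder appears to carry
lut = [decode_curve(i / (N - 1)) for i in range(N)]
```

Baking the curve into a LUT trades 32K table entries for a single lookup per sample, which is exactly the kind of thing you want inside a GPU shader.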


What there is not:

    Wavelets (dirac-pro, TICO/jpeg-xs)

    Encoding (at least no API exists for inserting raw data and creating files)

    No advanced de-mosaicing that would trace contours or do other funny stuff



Looking into a sample 4K6 file:

    it is based on the ISO MEDIA format (QuickTime MOV); I had no trouble parsing it with my MP4 library

    it is clearly an I-frame-only file, and the sample I got might be constant bit-rate, since the frame sizes were almost the same

    there is some acquisition metadata about lens, shutter and ISO per frame; these take a few 64-byte chunks of each frame, in a custom QT atom format (256 in my sample)

    then goes the binary header, with information about the resolution and slicing the file

    and the slice index and those unknown mixing flags per slice
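Since the container is plain ISO MEDIA, walking the atom tree takes only a few lines. A generic sketch, nothing BRAW-specific - it just parses the standard size/type atom headers:

```python
import struct

def iter_atoms(buf, offset=0, end=None):
    """Yield (type, payload_offset, payload_size) for each QT/ISO atom
    found in buf[offset:end]."""
    end = len(buf) if end is None else end
    while offset + 8 <= end:
        size, = struct.unpack_from(">I", buf, offset)      # 32-bit big-endian size
        atype = buf[offset + 4:offset + 8].decode("latin-1")
        header = 8
        if size == 1:                                      # 64-bit extended size follows
            size, = struct.unpack_from(">Q", buf, offset + 8)
            header = 16
        elif size == 0:                                    # atom runs to end of data
            size = end - offset
        yield atype, offset + header, size - header
        offset += size
```

Recursing into container atoms (moov, trak, ...) is the same loop applied to the payload range.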


The 4K6 file seems to be partitioned into 240 slices arranged in an 8x30 matrix. The 8-wide layout should correspond to the camera capabilities seen earlier with the JPEG extended profile in the 3:1 and 4:1 codecs, yet there the slices spanned the whole (or half) height of the picture. Here they span just 88 pixels in height. More slices means more potential to decode things in parallel (whether on CPU or GPU).
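For what it's worth, mapping a slice index to its tile position in such an 8x30 grid is trivial arithmetic. The slice width below is a pure placeholder (frame width divided by 8) - only the 88-pixel height and the 8x30 layout come from the file:

```python
def slice_origin(index, cols=8, slice_w=576, slice_h=88):
    """Top-left pixel of a slice in an 8-wide grid of 88-pixel-tall tiles.
    slice_w = 576 is only a placeholder; the real value depends on the
    recording resolution."""
    row, col = divmod(index, cols)
    return col * slice_w, row * slice_h
```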

I have not yet found a way to decode the slice bitstream, to tell whether it is just JPEG, or whether it has subsections that allow even faster /progressive/ decoding when processed partially.


My opinion on some choices:

    Use of ISO MEDIA is fine, and getting a new extension is good to avoid people saying "hey, I can't open this MP4"
    (which was and is a problem with the non-standard coding in the 3:1/4:1 formats)

    Including the shady shader source code makes the library future-hardware proof. When you get a new GPU architecture, it will still run optimally.
    Canon, for example, pre-compiled these functions in their SDK for the current GPU architectures, which means it might not even run on something newer, or will run sub-optimally.

    Instead of obfuscation, they might just have used compression/encryption, and then I would probably never have found these shaders. At least not at first sight.

    The partial de-mosaic may just mean:
        - take the RGGB raw data, create R, (Gr+Gb)/2, B, convert it to pseudo-YUV444, and encode with JPEG(444). The little Gb-Gr residue is encoded extra
        - for fast decoding at 1/2 resolution, just decompress the JPEG and convert to the final color space, no de-mosaic - look ma, what a speed!
        - for high-quality decoding, get the Gr/Gb difference back, compose the original RGGB Bayer data and do a full-resolution de-mosaic


Funny fact:

    When I worked on a 4x4K60 camera that dumped its 3GB/s of sensor data to files with no headers we named them *.braw (b for beast).
    That was the rawest raw you could ever get.
   

If anybody from BMD is reading - I would very much like Blackmagic to release the codec specs, at least in the form of an RDD - the same way Apple reveals the ProRes internal structure - so that 3rd-party applications and products can be created. Releasing a binary SDK is just what RED and Canon (and likely others) have done for ages with their custom formats.

Oh... and what patent numbers are applicable to BRAW? I heard it has some patents baked in?


Ing. Daniel Rozsnyo
camera developer
Prague, Czech Republic


Bob Kertesz
 

  I see many people seeking details on the Blackmagic RAW codec, so here are mine:
Nice comprehensive post, Daniel. Thank you.

-Bob

Bob Kertesz
BlueScreen LLC
Hollywood, California

DIT, Video Controller, and live compositor extraordinaire.

High quality images for more than four decades - whether you've wanted
them or not.©

* * * * * * * * * *

axel.mertes
 

Hi Daniel,

Very in-depth analysis.

Am I getting this right:
You think they store away R, (G1+G2)/2 and B to decode RGB at 1/2 res directly.
Then they also store G1-(G1+G2)/2 as a difference layer to recover the full RGGB, and have the demosaic on the CPU/GPU side, not in the camera?

Mmmh, might indeed be a valid possibility. Similar to what CineForm does when decoding RAW, or REDCODE. Makes sense.

However, it would not be very computationally intense at all on the camera side, given that the cameras can do a full demosaic plus color processing/baking to create the source for in-camera ProRes or DNxHD encoding. Given that, the cameras should have an easy battery life when running in RAW mode...

Doing the demosaic except for the color processing in camera would fit a bit better with the split-demosaic idea IMHO, but maybe this will be explained or identified at some point.

They did a lot of homework to cover CPU- and GPU-side decoding and to give third-party software access to the frame buffers. So if you have a GPU-based processing pipeline, you could create the decoded image there and continue processing on the GPU. No data movement from RAM to GPU etc. as in many other codecs. That's clever and fast. Adobe & Co. will surely love it.

I just wonder, with the imminent advent of JPEG XS (including RAW options), that we see a new codec that is DCT-based. Quality might be OK, just that JPEG XS (ex-TICO, with serious improvements) will surely be beyond it. Well, licensing and availability are another thing, and BRAW is here now. And maybe they could add a JPEG XS encoding scheme in forthcoming versions...?

Mit freundlichen Grüßen,
Best regards,

Axel Mertes

Workflow, IT, Research and Development
Geschäftsführer/CTO/Founder
Tel: +49 69 978837-20
eMail: Axel.Mertes@...

Magna Mana Production
Bildbearbeitung GmbH
Jakob-Latscha-Straße 3
60314 Frankfurt am Main
Germany
Tel: +49 69 978837-0
Fax: +49 69 978837-34
eMail: Info@...
Web: http://www.MagnaMana.com/



Olaf Matthes
 

Daniel Rozsnyó schrieb:
What there is:
    8x8 DCT (same as in jpeg-ext, prores, prores-raw and other codecs) - but in 12 bit precision
ProRes also does it in 12 bit... or at least the value range is
0..4095. The precision of the variables used in the DCT is of course
different for each implementation.

    The iDCT has separate shaders for full, half, quarter and eighth of the resolution
Actually ProRes can also decode to lower resolutions, just that almost
nobody seems to know about it or make use of it.
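Reduced-resolution decoding falls straight out of the DCT: keep only the top-left (low-frequency) k x k coefficients of each 8x8 block and run a k x k inverse transform, and you get a k/8-scale image. A sketch with a textbook orthonormal DCT, independent of whatever constants BRAW's shaders actually use:

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix, rows indexed by frequency."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][j] * b[j][c] for j in range(len(b)))
             for c in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def dct2(block):
    m = dct_matrix(len(block))
    return matmul(matmul(m, block), transpose(m))

def idct2(coeff):
    m = dct_matrix(len(coeff))
    return matmul(matmul(transpose(m), coeff), m)

def idct2_downscaled(coeff, k):
    """Decode an n x n coefficient block at k x k resolution: keep only the
    top-left k x k coefficients, rescaled by k/n for the orthonormal basis."""
    n = len(coeff)
    return idct2([[coeff[y][x] * k / n for x in range(k)] for y in range(k)])
```

This is why the library can ship separate full/half/quarter/eighth shaders: each is just a differently sized inverse transform over the same coefficient data.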

You can use the IBlackmagicRawManualDecoderFlow2 interface from the SDK
to get access to the decoded image data (just before it gets stuffed
into the "processing"). Its size (for the sample frame that comes with
the SDK) is 36,622,336 bytes. A little more than I expected, so it
might have some sort of header in front of it.

BTW, OpenCL has a kernel cache. So unless they spent the extra time to
disable it and write their own caching (using encryption/obfuscation
again), you can (at least on Linux) see the de-obfuscated OpenCL code there.

Olaf Matthes
EU-based freelance software developer

Paul Curtis
 

On 18 Sep 2018, at 19:53, Daniel Rozsnyó <daniel@...> wrote:

The partial de-mosaic may just mean:
- take this RGGB raw data, create R, (Gr+Gb)/2, B, convert it to pseudoYUV444, encode with JPEG(444). And the little of Gb-Gr residue is encoded extra
- for fast decoding at 1/2 of resolution just decompress the JPEG and convert to final color space, no de-mosaic, look ma - what a speed!
- for high quality decoding, get the Gr/Gb difference back, compose the original RGGB bayer data and do a full resolution de-mosaic
Nice analysis Daniel,

I was trying to work out whether there is any benefit to storing both greens rather than an average; it is feasible that a CFA has two different greens to help extend dynamic range. Not sure if BMD does this though.

As I mentioned before, I suspect the beauty is in the fact that it is a well-documented, supported SDK that is easy for others to integrate, without relying on legacy QT code.

I'd guess they're storing native RGB values from the sensor, which is different from storing them in a wide colour space. This would mean the decoders need to understand the exact camera to be able to work with the colours, but it would allow all the white-balancing goodness that comes with RAW.

It doesn't appear that the decode process makes any use of the demosaic, as that is no longer part of the file.

cheers
Paul

Paul Curtis, VFX & Post | Canterbury, UK

Daniel Rozsnyó
 

Hi Axel,

On 09/19/2018 03:43 AM, axel.mertes wrote:
I just wonder that with the soon advent of JPEG XS (including RAW options) we see a new codec that is DCT based. Quality might be OK, just that JPEG XS (ex TICO with serious improvements) will surely be beyond that. Well, licensing and availability is another thing and BRAW is now here. And maybe they could add JPEG XS encoding scheme in forthcoming versions...?


TICO is a 1D wavelet codec!

Citing from a paper "Linelet, an Ultra-Low Complexity, Ultra-Low Latency Video Codec for Adaptation of HD-SDI to Ethernet"
https://brage.bibsys.no/xmlui/bitstream/handle/11250/2400689/10976_FULLTEXT.pdf?sequence=1

    "3.1 TICO

    From publicly available information [36], the codec is based on compressing single scanlines.
    A Le Gall 5/3 wavelet transform is applied to the scanline horizontally."

    [36]  “JTNM   request   for   technology,    intoPIX,” http://videoservicesforum.org/download/jtnm/JTNM012-1.zip, [Online; accessed 2014-04-25].


If standardization as JPEG-XS means the algorithm will be openly described - great, but that still does not guarantee it is patent-free / license-free from IntoPIX, or that somebody would not want royalties or try to stop you from making software/systems that work with such files.
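For reference, the Le Gall 5/3 transform cited above is a two-step integer lifting scheme, fully reversible - a very different animal from the 8x8 DCT found in the BRAW shaders. A minimal sketch of one horizontal pass (even-length input, symmetric extension at the borders):

```python
def le_gall_53_forward(x):
    """Reversible 5/3 lifting: returns (lowpass, highpass) half-bands."""
    L, N = len(x), len(x) // 2
    xm = lambda i: x[i if i < L else 2 * L - 2 - i]            # mirror right edge
    # predict step: highpass = odd sample minus average of even neighbours
    d = [x[2*n + 1] - (x[2*n] + xm(2*n + 2)) // 2 for n in range(N)]
    # update step: lowpass = even sample plus smoothed highpass
    s = [x[2*n] + (d[max(n - 1, 0)] + d[n] + 2) // 4 for n in range(N)]
    return s, d

def le_gall_53_inverse(s, d):
    N = len(s)
    L = 2 * N
    x = [0] * L
    for n in range(N):                                          # undo update step
        x[2*n] = s[n] - (d[max(n - 1, 0)] + d[n] + 2) // 4
    for n in range(N):                                          # undo predict step
        right = x[2*n + 2] if 2*n + 2 < L else x[2*L - 2 - (2*n + 2)]
        x[2*n + 1] = d[n] + (x[2*n] + right) // 2
    return x
```

Because every lifting step is an exact integer operation, the inverse reverses them perfectly - which is what makes the 5/3 filter the reversible transform of JPEG 2000 and TICO.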



Ing. Daniel Rozsnyo
camera developer
Prague, Czech Republic

Geoff Boyle
 

I’ve started doing practical tests with BRAW - I got the camera update before IBC - and am comparing scenes at different compression settings, as I have done with other cameras.

 

So far I can see no difference between uncompressed and the best constant-quality setting.

 

Variable compression does seem to be a great way to go.

 

cheers
Geoff Boyle NSC
EU based cinematographer
+31 637155076

www.gboyle.nl

www.cinematography.net

 

 


Mark Grgurev
 

I was so confident I was onto something when I came up with the theory that the demosaicing involved doing gradient-direction analysis in-camera and then storing it in the file as a kind of direction map, so that the decoder would just have to do the final interpolations. Is there anything you found that might hint at that? It seems like the perfect way to split up the demosaicing process so that the heavy lifting is done camera-side.