Decoding Video files

DeFFcode's FFdecoder API accepts the path of a multimedia video file as input to its source parameter. With its frame_format parameter, you can decode video frames in any pixel format supported by well-known Computer Vision libraries (such as OpenCV).

The following recipes briefly cover its video-file support and pixel-format capabilities:

DeFFcode APIs require the FFmpeg executable

DeFFcode APIs require a valid FFmpeg executable for all of their core functionality, and any failure to detect it raises a RuntimeError immediately. Follow the dedicated FFmpeg Installation doc ➶ for its installation.
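
As a quick pre-flight check before running the recipes, the sketch below looks for an `ffmpeg` executable on the system PATH using only the standard library. It is an illustration under assumptions, not DeFFcode's own detection logic: the helper name `find_ffmpeg` is hypothetical, and DeFFcode may also look in other locations.

```python
import shutil


def find_ffmpeg(executable="ffmpeg"):
    """Return the absolute path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("FFmpeg was not found on PATH; install it before using DeFFcode.")
    else:
        print(f"FFmpeg found at: {path}")
```

A `None` result only means the executable is not on PATH; it does not rule out an installation elsewhere.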

Additional Python dependencies for the following recipes

The following recipes require additional Python dependencies, which can be installed as shown below:

OpenCV: OpenCV is required for previewing video frames. You can install it directly via pip:
```
pip install opencv-python
```

OpenCV installation from source

You can also follow online tutorials for building and installing OpenCV manually from source on Windows, Linux, macOS, and Raspberry Pi machines.

Do not install both the pip and source versions together; otherwise the installation will fail to work!

Other OpenCV binaries

OpenCV maintainers also provide additional binaries via pip: opencv-contrib-python contains both the main modules and the contrib/extra modules, while opencv-python-headless and opencv-contrib-python-headless target server (headless) environments. You can install any one of them in a similar manner. More information can be found here.

Always use FFdecoder API's terminate() method at the end to avoid undesired behavior.
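
One way to guarantee that terminate() always runs, even when frame processing raises an exception, is a try/finally block. The sketch below uses a hypothetical `StubDecoder` stand-in (not part of DeFFcode) that mimics only the generateFrame() and terminate() method names used in these recipes, so it runs without FFmpeg; with the real API you would pass a formulated FFdecoder object instead.

```python
class StubDecoder:
    """Hypothetical stand-in exposing the two method names used in the recipes."""

    def __init__(self, frames):
        self._frames = list(frames)
        self.terminated = False

    def generateFrame(self):
        # yield the stored frames one by one
        yield from self._frames

    def terminate(self):
        self.terminated = True


def process_all(decoder, handle):
    """Pass every frame to `handle`; always terminate the decoder afterwards."""
    count = 0
    try:
        for frame in decoder.generateFrame():
            if frame is None:
                break
            handle(frame)
            count += 1
    finally:
        # runs on normal exit and also when `handle` raises
        decoder.terminate()
    return count
```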

Never name your Python script deffcode.py

When trying out these recipes, never name your Python script deffcode.py; otherwise it will result in a ModuleNotFound error.
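
The reason is that Python places the running script's directory first on its import path, so a script called deffcode.py is found before the installed package. A minimal sketch of a guard follows; the helper name `shadows_module` is hypothetical and only compares file names.

```python
from pathlib import Path


def shadows_module(script_path, module_name="deffcode"):
    """Return True if the script's file name would shadow `module_name` on import."""
    return Path(script_path).stem == module_name


print(shadows_module("deffcode.py"))
print(shadows_module("decode_video.py"))
```

In a real script you could call `shadows_module(__file__)` before importing the package.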

Accessing RGB frames from a video file

The default function of FFdecoder API is to decode 24-bit RGB video frames from the given source.

FFdecoder API's generateFrame() function can be used in multiple ways to access RGB frames from a given source: as a Generator (Recommended Approach), by calling it within a with statement, or as an Iterator.

In this example we will decode the default RGB24 video frames from a given video file (say foo.mp4) using the access methods mentioned above:

As a Generator (Recommended) | Calling with Statement | As an Iterator

This is the recommended approach for faster and error-proof access to decoded frames. We'll use it throughout the recipes.

# import the necessary packages
from deffcode import FFdecoder

# initialize and formulate the decoder
decoder = FFdecoder("foo.mp4").formulate()

# grab RGB24(default) frame from decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # lets print its shape
    print(frame.shape) # for e.g. (1080, 1920, 3)

# terminate the decoder
decoder.terminate()

The with statement approach can make the code simpler, cleaner, and much more readable. It also automatically manages the formulate() and terminate() methods of the FFdecoder API, so you don't need to call them explicitly. See PEP 343 -- The "with" Statement for more information on this approach.

# import the necessary packages
from deffcode import FFdecoder

# initialize and formulate the decoder
with FFdecoder("foo.mp4") as decoder:

    # grab RGB24(default) frames from decoder
    for frame in decoder.generateFrame():

        # check if frame is None
        if frame is None:
            break

        # {do something with the frame here}

        # lets print its shape
        print(frame.shape)  # for e.g. (1080, 1920, 3)

The Iterator approach closely resembles the OpenCV-Python (Python API for OpenCV) coding syntax, which makes it easier to learn and remember.

# import the necessary packages
from deffcode import FFdecoder

# initialize and formulate the decoder
decoder = FFdecoder("foo.mp4").formulate()

# loop over frames
while True:

    # grab RGB24(default) frames from decoder
    frame = next(decoder.generateFrame(), None)

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # lets print its shape
    print(frame.shape) # for e.g. (1080, 1920, 3)

# terminate the decoder
decoder.terminate()
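
A possible variant of the iterator approach is to create the generator object once and call next() on it with a None default, which avoids building a new generator object on every loop pass. This is a sketch rather than the documented DeFFcode pattern: `fake_frames` is a hypothetical stand-in for decoder.generateFrame() so that the snippet runs without FFmpeg.

```python
def fake_frames(count):
    """Hypothetical stand-in for decoder.generateFrame(): yields dummy frames."""
    for index in range(count):
        yield f"frame-{index}"


def drain(frame_iter):
    """Pull frames with next(..., None) until the iterator is exhausted."""
    seen = []
    while True:
        # the second argument is returned instead of raising StopIteration
        frame = next(frame_iter, None)
        if frame is None:
            break
        seen.append(frame)
    return seen


# create the generator once, then iterate over it
frames = drain(fake_frames(3))
print(frames)
```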

Capturing and Previewing BGR frames from a video file

In this example we will decode OpenCV-supported BGR24 video frames from a given video file (say foo.mp4) with the FFdecoder API using two access methods, and preview them using the OpenCV library's cv2.imshow() method.

By default, OpenCV expects BGR format frames in its cv2.imshow() method.

As a Generator (Recommended) | Calling with Statement

# import the necessary packages
from deffcode import FFdecoder
import cv2

# initialize and formulate the decoder for BGR24 pixel format output
decoder = FFdecoder("foo.mp4", frame_format="bgr24").formulate()

# grab the BGR24 frames from decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()

The with statement approach can make the code simpler, cleaner, and much more readable. It also automatically manages the formulate() and terminate() methods of the FFdecoder API, so you don't need to call them explicitly. See PEP 343 -- The "with" Statement for more information on this approach.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# initialize and formulate the decoder for BGR24 pixel format output
with FFdecoder("foo.mp4", frame_format="bgr24") as decoder:

    # grab the BGR24 frames from decoder
    for frame in decoder.generateFrame():

        # check if frame is None
        if frame is None:
            break

        # {do something with the frame here}

        # Show output window
        cv2.imshow("Output", frame)

        # check for 'q' key if pressed
        key = cv2.waitKey(1) & 0xFF
        if key == ord("q"):
            break

# close output window
cv2.destroyAllWindows()

Playing with any other FFmpeg pixel formats

Similar to BGR, you can request any pixel format (supported by the installed FFmpeg) through the frame_format parameter of the FFdecoder API to get video frames in the desired format.

In this example we will decode Grayscale and YUV video frames from a given video file (say input_foo.mp4) with the FFdecoder API, and preview them using the OpenCV library's cv2.imshow() method.

Use the ffmpeg -pix_fmts terminal command to list all pixel formats supported by FFmpeg.
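
If you want the format names programmatically, the command's output can be parsed. The sketch below is a rough illustration: `parse_pix_fmts` and `list_pix_fmts` are hypothetical helpers, the embedded sample is an abbreviated copy of typical output, and the exact column layout may differ between FFmpeg versions.

```python
import shutil
import subprocess

# Abbreviated sample of `ffmpeg -pix_fmts` output, used when FFmpeg is absent
SAMPLE_OUTPUT = """Pixel formats:
I.... = Supported Input  format for conversion
.O... = Supported Output format for conversion
..H.. = Hardware accelerated format
...P. = Paletted format
....B = Bitstream format
FLAGS NAME            NB_COMPONENTS BITS_PER_PIXEL
-----
IO... yuv420p                3            12
IO... rgb24                  3            24
IO... bgr24                  3            24
IO... gray                   1             8
"""


def parse_pix_fmts(text):
    """Return the pixel-format names listed after the '-----' separator line."""
    names = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("-----"):
            in_table = True
            continue
        parts = line.split()
        if in_table and len(parts) >= 2:
            names.append(parts[1])  # second column holds the format name
    return names


def list_pix_fmts():
    """Query the installed FFmpeg if available, otherwise parse the sample."""
    if shutil.which("ffmpeg") is None:
        return parse_pix_fmts(SAMPLE_OUTPUT)
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-pix_fmts"],
        capture_output=True, text=True, check=False,
    )
    return parse_pix_fmts(result.stdout)
```

Any name returned this way is a candidate value for the frame_format parameter.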

Decode Grayscale | Decode Grayscale via YUV (fastest) | Decode YUV frames

# import the necessary packages
from deffcode import FFdecoder
import cv2

# initialize and formulate the decoder for GRAYSCALE output
decoder = FFdecoder("input_foo.mp4", frame_format="gray", verbose=True).formulate()

# grab the GRAYSCALE frames from the decoder
for gray in decoder.generateFrame():

    # check if frame is None
    if gray is None:
        break

    # {do something with the gray frame here}

    # Show output window
    cv2.imshow("Gray Output", gray)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()

Fastest RAW-to-Grayscale via -extract_luma

Every YUV/NV bytestream stores the Luma (Y) plane uncompressed at the start of each frame. The exclusive -extract_luma boolean attribute makes FFdecoder slice that Y-plane directly and hand back a 2D (H, W) grayscale ndarray, with no colorspace conversion in FFmpeg and no cv2.cvtColor call in Python. This avoids the yuv→gray conversion that frame_format="gray" still asks FFmpeg to perform on every frame, so it should be the faster of the two.

Combined with the reduced pipe-bytes of YUV 4:2:0 ingest, this is the fastest grayscale pipeline the API can produce.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# enable direct Luma (Y-plane) extraction
ffparams = {"-extract_luma": True}

# initialize the decoder with a YUV pixel-format
decoder = FFdecoder(
    "input_foo.mp4", frame_format="yuv420p", verbose=True, **ffparams
).formulate()

# grab the 2D (H, W) grayscale frames from the decoder
for gray in decoder.generateFrame():

    # check if frame is None
    if gray is None:
        break

    # {do something with the gray frame here}

    # Show output window
    cv2.imshow("Gray Output", gray)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
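
The Y-plane slicing described for -extract_luma can be pictured with plain bytes. The sketch below is not DeFFcode's internal implementation (the real API returns a NumPy ndarray); `extract_luma_rows` is a hypothetical helper that assumes one raw yuv420p frame with even width and height, where the first width × height bytes are luma.

```python
def extract_luma_rows(raw, width, height):
    """Return the Y plane of one raw yuv420p frame as a list of `height` rows."""
    expected = width * height * 3 // 2  # Y plane plus quarter-size U and V planes
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    y_plane = raw[: width * height]  # luma comes first in the frame
    return [y_plane[r * width:(r + 1) * width] for r in range(height)]


# a tiny 4x2 frame: 8 luma bytes followed by 2 U bytes and 2 V bytes
frame = bytes(range(8)) + bytes([128, 128]) + bytes([128, 128])
rows = extract_luma_rows(frame, width=4, height=2)
print([list(row) for row in rows])
```

The chroma bytes at the end of the buffer are simply ignored, which is why no colorspace conversion is needed.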

With the FFdecoder API, frames extracted with YUV pixel formats (yuv420p, yuv444p, nv12, nv21, etc.) are generally incompatible with OpenCV APIs. You can make them compatible by setting the exclusive -enforce_cv_patch boolean attribute in its ffparams dictionary parameter.

Performance Mode — Faster Decoding via YUV420p

Ingesting frames as 12-bit YUV 4:2:0 instead of 24-bit RGB/BGR halves the bytes moving through the FFmpeg pipe, so the subprocess pipeline spends less time blocked on I/O. In community benchmarks on 1080p MP4 (see issue #15), RAW ingest jumped from ~96 FPS (RGB24) to ~213 FPS (YUV420p), and reached ~155 FPS when converted to BGR inside Python via OpenCV, which is roughly 1.6× the RGB24 rate going by these figures. The gain applies to the majority of common video sources, which are already YUV420 on disk.

Use this mode when you're throughput-bound on decoding and can afford a single cv2.cvtColor call per frame. Skip it for scientific workloads where the implicit chroma subsampling of YUV 4:2:0 is unacceptable.
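
The halving of pipe traffic follows directly from the bits-per-pixel figures of the two formats. The short calculation below illustrates it for a 1080p frame; the helper `frame_bytes` and the lookup table are hypothetical names introduced only for this sketch.

```python
# bits per pixel for the pixel formats discussed in these recipes
BITS_PER_PIXEL = {"rgb24": 24, "bgr24": 24, "yuv420p": 12, "gray": 8}


def frame_bytes(width, height, pix_fmt):
    """Return the size in bytes of one raw frame in the given pixel format."""
    return width * height * BITS_PER_PIXEL[pix_fmt] // 8


rgb = frame_bytes(1920, 1080, "rgb24")
yuv = frame_bytes(1920, 1080, "yuv420p")
print(rgb, yuv, rgb / yuv)
```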

Let's try decoding YUV420p pixel-format frames in the following Python code:

You can also use other YUV pixel formats such as yuv422p (4:2:2 subsampling) or yuv444p (4:4:4 subsampling) in a similar manner for higher chroma resolution.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# enable OpenCV patch for YUV frames
ffparams = {"-enforce_cv_patch": True}

# initialize and formulate the decoder for YUV420p output
decoder = FFdecoder(
    "input_foo.mp4", frame_format="yuv420p", verbose=True, **ffparams
).formulate()

# grab the YUV420p frames from the decoder
for yuv in decoder.generateFrame():

    # check if frame is None
    if yuv is None:
        break

    # convert it to `BGR` pixel format,
    # since imshow() method only accepts `BGR` frames
    bgr = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR_I420)

    # {do something with the bgr frame here}

    # Show output window
    cv2.imshow("Output", bgr)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
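
To see why the cv2.COLOR_YUV2BGR_I420 conversion works on a single 2D array, it helps to look at the I420 memory layout: OpenCV expects a single-channel 8-bit array with height × 3/2 rows and width columns, holding the Y plane followed by the U and V planes. The sketch below computes that shape and the plane offsets. The helper names are hypothetical, even dimensions are assumed, and it is an assumption here that frames returned with -enforce_cv_patch follow exactly this shape.

```python
def i420_array_shape(width, height):
    """Return (rows, cols) of the single-channel array holding one I420 frame."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even width and height")
    return (height * 3 // 2, width)


def plane_offsets(width, height):
    """Return (start, end) byte offsets of the Y, U and V planes in the buffer."""
    y_size = width * height
    c_size = (width // 2) * (height // 2)  # each chroma plane is quarter size
    return {
        "Y": (0, y_size),
        "U": (y_size, y_size + c_size),
        "V": (y_size + c_size, y_size + 2 * c_size),
    }


print(i420_array_shape(1920, 1080))
print(plane_offsets(1920, 1080))
```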

Capturing and Previewing frames from a Looping Video

In this example we will decode BGR24 video frames from a looping video using different means in the FFdecoder API, and preview them using the OpenCV library's cv2.imshow() method.

Using -stream_loop option | Using loop filter

The recommended way to loop a video is to pass the -stream_loop option via the -ffprefixes list attribute of the ffparams dictionary parameter in the FFdecoder API. It accepts integer values: a value > 0 sets the number of loops, 0 means no loop, and -1 means infinite loop.

Using -stream_loop 3 will play the video 4 times in total (the first pass plus 3 loops).

# import the necessary packages
from deffcode import FFdecoder
import cv2

# define `-stream_loop 3` for looping 4 times
ffparams = {"-ffprefixes":["-stream_loop", "3"]}

# initialize and formulate the decoder with suitable source
decoder = FFdecoder("input.mp4", frame_format="bgr24", verbose=True, **ffparams).formulate()

# print metadata as a JSON string
print(decoder.metadata)

# grab the BGR24 frame from the decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
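
Because -stream_loop counts repeats rather than total plays, an off-by-one is easy to make. The sketch below wraps the conversion in a small helper; `stream_loop_params` is a hypothetical name, and it only builds the same ffparams dictionary shape used in the recipe above.

```python
def stream_loop_params(total_plays):
    """Build an ffparams dict that plays the input `total_plays` times.

    Pass None for endless looping (-stream_loop -1).
    """
    if total_plays is None:
        repeats = -1
    elif total_plays < 1:
        raise ValueError("total_plays must be at least 1")
    else:
        repeats = total_plays - 1  # the first pass is not counted as a loop
    return {"-ffprefixes": ["-stream_loop", str(repeats)]}


print(stream_loop_params(4))
```

The resulting dictionary can be unpacked into the decoder call as `**stream_loop_params(4)`.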

Another way to loop a video is to use the loop complex filter via the -filter_complex FFmpeg flag as an attribute of the ffparams dictionary parameter in the FFdecoder API.

This filter places all frames into memory (RAM), so applying the trim filter first is strongly recommended. Otherwise you may run out of memory.

Using loop filter for looping video

The filter accepts the following options:

loop: Sets the number of loops for integer values > 0. Setting this value to -1 results in infinite loops. Default is 0 (no loops).
size: Sets the maximal size in number of frames. Default is 0.
start: Sets the first frame of the loop. Default is 0.

Using loop=3 will play the looped segment 4 times in total.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# play the looped segment 4 times: 15 frames per loop, starting at frame 25
ffparams = {
    "-filter_complex": "loop=loop=3:size=15:start=25"  # Or use: `loop=3:15:25`
}

# initialize and formulate the decoder with suitable source
decoder = FFdecoder(
    "input.mp4", frame_format="bgr24", verbose=True, **ffparams
).formulate()

# print metadata as a JSON string
print(decoder.metadata)

# grab the BGR24 frame from the decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
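
The filter string can also be assembled from its three options to keep the values readable. This is a minimal sketch: `loop_filter` is a hypothetical helper that only formats the string shown in the recipe above and performs basic range checks based on the option descriptions.

```python
def loop_filter(loop, size, start=0):
    """Build a `loop` filter string such as 'loop=loop=3:size=15:start=25'."""
    if loop < -1:
        raise ValueError("loop must be -1 (infinite), 0 (none) or a positive count")
    if size < 0 or start < 0:
        raise ValueError("size and start must be non-negative")
    return f"loop=loop={loop}:size={size}:start={start}"


ffparams = {"-filter_complex": loop_filter(loop=3, size=15, start=25)}
print(ffparams)
```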