
Decoding Video files

DeFFcode's FFdecoder API readily accepts the path of a multimedia video file as input to its source parameter. With its frame_format parameter, you can decode video frames in any pixel format supported by well-known Computer Vision libraries (such as OpenCV).

We'll discuss its video files support and pixel format capabilities briefly in the following recipes:

DeFFcode APIs require FFmpeg executable

DeFFcode APIs require a valid FFmpeg executable for all of their core functionality, and any failure to detect one raises a RuntimeError immediately. Follow the dedicated FFmpeg Installation doc ➶ for its installation.
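Before running the recipes, it can help to confirm that an FFmpeg executable is reachable at all. The following is a minimal sketch using only the standard library; the helper name find_ffmpeg is hypothetical, and it only checks the system PATH, so DeFFcode's own detection logic may behave differently.

```python
import shutil
from typing import Optional


def find_ffmpeg(executable: str = "ffmpeg") -> Optional[str]:
    """Return the absolute path of `executable` if it is on PATH, else None.

    Hypothetical helper: this is not part of DeFFcode.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("FFmpeg not found on PATH; install it before using DeFFcode.")
    else:
        print(f"FFmpeg found at: {path}")
```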

Additional Python Dependencies for following recipes

The following recipes require additional Python dependencies, which can be installed as shown below:

  • OpenCV: OpenCV is required for previewing video frames. You can easily install it directly via pip:

    pip install opencv-python

    OpenCV installation from source

    You can also follow online tutorials to build and install OpenCV manually from its source on Windows, Linux, macOS, and Raspberry Pi machines.

    ⚠ Make sure not to install both the pip and source versions together. Otherwise the installation will fail to work!

    Other OpenCV binaries

    OpenCV maintainers also provide additional binaries via pip: opencv-contrib-python, which contains both the main modules and the contrib/extra modules, and opencv-python-headless and opencv-contrib-python-headless for server (headless) environments. You can install any one of them in a similar manner. More information can be found here.
    

Always use FFdecoder API's terminate() method at the end to avoid undesired behavior.

Never name your python script deffcode.py

When trying out these recipes, never name your Python script deffcode.py; otherwise it will result in a ModuleNotFound error.
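The reason is that Python places the script's own directory first on its module search path, so an import of deffcode from a file named deffcode.py resolves to the script itself rather than the installed package. A minimal sketch of a guard against this, where the helper name shadows_package is hypothetical:

```python
from pathlib import Path


def shadows_package(script_path: str, package: str = "deffcode") -> bool:
    """Return True if the script's file name (without extension) equals
    the package name, meaning the script would shadow that package.

    Hypothetical helper: this is not part of DeFFcode.
    """
    return Path(script_path).stem == package


print(shadows_package("recipes/deffcode.py"))     # True
print(shadows_package("recipes/decode_video.py")) # False
```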

Accessing RGB frames from a video file

The default function of FFdecoder API is to decode 24-bit RGB video frames from the given source.

FFdecoder API's generateFrame() function can be used in multiple ways to access RGB frames from a given source: as a Generator (Recommended Approach), by calling it with a with Statement, and as an Iterator.

In this example we will decode the default RGB24 video frames from a given video file (say foo.mp4) using each of the accessing methods mentioned above:

As a Generator: this is the recommended approach for faster and error-proof access to decoded frames. We'll use it throughout the recipes.

# import the necessary packages
from deffcode import FFdecoder

# initialize and formulate the decoder
decoder = FFdecoder("foo.mp4").formulate()

# grab RGB24(default) frame from decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # lets print its shape
    print(frame.shape) # for e.g. (1080, 1920, 3)

# terminate the decoder
decoder.terminate()
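The shape printed in the recipe above follows from the RGB24 layout: one byte each for the red, green, and blue channels of every pixel. A small worked example of that arithmetic, using a hypothetical helper name and the 1920x1080 resolution from the comment above:

```python
def rgb24_frame_info(width: int, height: int) -> dict:
    """Return the expected array shape and byte size of one RGB24 frame.

    Hypothetical helper for illustration; not part of DeFFcode.
    """
    channels = 3  # R, G, B: one byte per channel
    return {
        "shape": (height, width, channels),
        "nbytes": height * width * channels,
    }


info = rgb24_frame_info(1920, 1080)
print(info["shape"])   # (1080, 1920, 3)
print(info["nbytes"])  # 6220800
```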

The with Statement approach makes the code simpler, cleaner, and much more readable. It also automatically manages the formulate() and terminate() methods of FFdecoder API, so you don't need to call them explicitly. See PEP 343 -- The "with" Statement for more information on this approach.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# initialize and formulate the decoder
with FFdecoder("foo.mp4") as decoder:

    # grab RGB24 (default) frames from decoder
    for frame in decoder.generateFrame():

        # check if frame is None
        if frame is None:
            break

        # {do something with the frame here}

        # lets print its shape
        print(frame.shape)  # for e.g. (1080, 1920, 3)
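To illustrate what the with statement does behind the scenes, here is a minimal sketch of the context-manager protocol from PEP 343. DummyDecoder is a hypothetical stand-in that only mimics the formulate()/terminate() lifecycle; it is not FFdecoder's actual implementation.

```python
class DummyDecoder:
    """Hypothetical stand-in that records lifecycle calls in order."""

    def __init__(self):
        self.events = []

    def formulate(self):
        self.events.append("formulate")
        return self

    def terminate(self):
        self.events.append("terminate")

    def generateFrame(self):
        # yield two placeholder "frames"
        for index in range(2):
            self.events.append(f"frame-{index}")
            yield index

    def __enter__(self):
        # called when the `with` block is entered
        return self.formulate()

    def __exit__(self, exc_type, exc_value, traceback):
        # called when the `with` block exits, even after an exception
        self.terminate()
        return False  # do not suppress exceptions


decoder = DummyDecoder()
with decoder:
    for frame in decoder.generateFrame():
        pass

print(decoder.events)
# ['formulate', 'frame-0', 'frame-1', 'terminate']
```

Because __exit__ runs even when the block raises, cleanup is not skipped on errors, which is the main advantage over calling terminate() by hand.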

The Iterator approach closely resembles OpenCV-Python (Python API for OpenCV) coding syntax, which makes it easier to learn and remember.

# import the necessary packages
from deffcode import FFdecoder

# initialize and formulate the decoder
decoder = FFdecoder("foo.mp4").formulate()

# loop over frames
while True:

    # grab RGB24(default) frames from decoder
    frame = next(decoder.generateFrame(), None)

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # lets print its shape
    print(frame.shape) # for e.g. (1080, 1920, 3)

# terminate the decoder
decoder.terminate()
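The Iterator recipe relies on the two-argument form of Python's built-in next(), which returns the given default (here None) instead of raising StopIteration when the generator is exhausted. A minimal sketch of that idiom with a plain generator; fake_frames is a hypothetical stand-in yielding integers rather than decoded frames, and this stand-in creates the generator once before the loop, as a plain Python generator requires.

```python
def fake_frames(count):
    """Hypothetical stand-in generator yielding integers in place of frames."""
    for index in range(count):
        yield index


frames = fake_frames(3)  # create the generator once
seen = []

while True:
    # returns None instead of raising StopIteration at the end
    frame = next(frames, None)
    if frame is None:
        break
    seen.append(frame)

print(seen)  # [0, 1, 2]
```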

 

Capturing and Previewing BGR frames from a video file

In this example we will decode OpenCV-supported BGR24 video frames from a given video file (say foo.mp4) in FFdecoder API, and preview them using OpenCV Library's cv2.imshow() method, using two accessing methods.

By default, OpenCV's cv2.imshow() method expects frames in BGR format.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# initialize and formulate the decoder for BGR24 pixel format output
decoder = FFdecoder("foo.mp4", frame_format="bgr24").formulate()

# grab the BGR24 frames from decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()

The with Statement approach makes the code simpler, cleaner, and much more readable. It also automatically manages the formulate() and terminate() methods of FFdecoder API, so you don't need to call them explicitly. See PEP 343 -- The "with" Statement for more information on this approach.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# initialize and formulate the decoder for BGR24 pixel format output
with FFdecoder("foo.mp4", frame_format="bgr24") as decoder:

    # grab the BGR24 frames from decoder
    for frame in decoder.generateFrame():

        # check if frame is None
        if frame is None:
            break

        # {do something with the frame here}

        # Show output window
        cv2.imshow("Output", frame)

        # check for 'q' key if pressed
        key = cv2.waitKey(1) & 0xFF
        if key == ord("q"):
            break

# close output window
cv2.destroyAllWindows()
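The only difference between the RGB24 and BGR24 pixel formats is the byte order within each pixel. The following sketch shows that swap on a raw packed buffer using only the standard library; the helper name rgb24_to_bgr24 is hypothetical. With NumPy frames, the equivalent channel reversal is frame[..., ::-1], although requesting frame_format="bgr24" as in the recipes above avoids the extra step.

```python
def rgb24_to_bgr24(buf: bytes) -> bytes:
    """Swap the R and B bytes of every 3-byte pixel in a packed buffer.

    Hypothetical helper for illustration; not part of DeFFcode.
    """
    if len(buf) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3")
    out = bytearray(len(buf))
    out[0::3] = buf[2::3]  # B takes the first slot
    out[1::3] = buf[1::3]  # G stays in the middle
    out[2::3] = buf[0::3]  # R takes the last slot
    return bytes(out)


# two pixels: pure red, then (10, 20, 30)
rgb = bytes([255, 0, 0, 10, 20, 30])
print(list(rgb24_to_bgr24(rgb)))  # [0, 0, 255, 30, 20, 10]
```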

 

Playing with any other FFmpeg pixel formats

Similar to BGR, you can specify any pixel format (supported by the installed FFmpeg) via the frame_format parameter of FFdecoder API to get the desired video frame format.

In this example we will decode Grayscale and YUV video frames from a given video file (say foo.mp4) in FFdecoder API, and preview them using OpenCV Library's cv2.imshow() method.

Use the ffmpeg -pix_fmts terminal command to list all FFmpeg-supported pixel formats.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# initialize and formulate the decoder for GRAYSCALE output
decoder = FFdecoder("input_foo.mp4", frame_format="gray", verbose=True).formulate()

# grab the GRAYSCALE frames from the decoder
for gray in decoder.generateFrame():

    # check if frame is None
    if gray is None:
        break

    # {do something with the gray frame here}

    # Show output window
    cv2.imshow("Gray Output", gray)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
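To check whether a format such as gray is available before passing it to frame_format, the output of ffmpeg -pix_fmts can be parsed. The sketch below works on a short, hand-written excerpt in the layout that command typically prints; the layout can vary between FFmpeg versions, and parse_pix_fmts is a hypothetical helper.

```python
# Abbreviated, hand-written excerpt in the usual `ffmpeg -pix_fmts` layout.
SAMPLE_OUTPUT = """\
Pixel formats:
I.... = Supported Input  format for conversion
.O... = Supported Output format for conversion
..H.. = Hardware accelerated format
...P. = Paletted format
....B = Bitstream format
FLAGS NAME            NB_COMPONENTS BITS_PER_PIXEL
-----
IO... yuv420p                3            12
IO... rgb24                  3            24
IO... bgr24                  3            24
IO... gray                   1             8
"""


def parse_pix_fmts(text: str) -> list:
    """Return pixel format names listed after the '-----' separator line."""
    names = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("-----"):
            in_table = True
            continue
        if in_table:
            parts = line.split()
            if len(parts) >= 2:
                names.append(parts[1])  # second column holds the name
    return names


formats = parse_pix_fmts(SAMPLE_OUTPUT)
print(formats)            # ['yuv420p', 'rgb24', 'bgr24', 'gray']
print("gray" in formats)  # True
```

On a real system, the text could come from subprocess.run(["ffmpeg", "-hide_banner", "-pix_fmts"], capture_output=True, text=True).stdout instead of the sample string.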

With FFdecoder API, frames extracted in YUV pixel formats (yuv420p, yuv444p, nv12, nv21, etc.) are generally incompatible with OpenCV APIs. You can make them compatible by setting the exclusive -enforce_cv_patch boolean attribute in its ffparams dictionary parameter.

Let's try decoding YUV420p pixel-format frames in the following Python code:

You can also use other YUV pixel formats, such as yuv422p (4:2:2 subsampling) or yuv444p (4:4:4 subsampling), in a similar manner for higher chroma resolution. The cv2.cvtColor() conversion code must then match the chosen format.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# enable OpenCV patch for YUV frames
ffparams = {"-enforce_cv_patch": True}

# initialize and formulate the decoder for YUV420p output
decoder = FFdecoder(
    "input_foo.mp4", frame_format="yuv420p", verbose=True, **ffparams
).formulate()

# grab the YUV420p frames from the decoder
for yuv in decoder.generateFrame():

    # check if frame is None
    if yuv is None:
        break

    # convert it to `BGR` pixel format,
    # since imshow() method only accepts `BGR` frames
    bgr = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR_I420)

    # {do something with the bgr frame here}

    # Show output window
    cv2.imshow("Output", bgr)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
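OpenCV's COLOR_YUV2BGR_I420 conversion used above expects a single-channel array with height * 3 / 2 rows: the full-resolution Y plane followed by the quarter-resolution U and V planes. The worked example below computes that layout; it assumes -enforce_cv_patch delivers frames in this shape, and i420_layout is a hypothetical helper.

```python
def i420_layout(width: int, height: int) -> dict:
    """Return plane sizes and the OpenCV-style shape of one I420 frame.

    Hypothetical helper for illustration; not part of DeFFcode.
    """
    if width % 2 or height % 2:
        raise ValueError("4:2:0 subsampling needs even width and height")
    y_bytes = width * height                    # full-resolution luma
    chroma_bytes = (width // 2) * (height // 2)  # each of U and V
    return {
        "y_bytes": y_bytes,
        "chroma_bytes": chroma_bytes,
        "nbytes": y_bytes + 2 * chroma_bytes,
        "shape": (height * 3 // 2, width),
    }


layout = i420_layout(1920, 1080)
print(layout["shape"])   # (1620, 1920)
print(layout["nbytes"])  # 3110400
```

A YUV420p frame therefore takes half the bytes of the same frame in RGB24 or BGR24.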

 

Capturing and Previewing frames from a Looping Video

In this example we will decode BGR24 video frames from a looping video using different means in FFdecoder API, and preview them using OpenCV Library's cv2.imshow() method.

The recommended way to loop a video is to pass the -stream_loop option via the -ffprefixes list attribute of the ffparams dictionary parameter in FFdecoder API. It takes an integer value: a value > 0 sets the number of loops, 0 means no loop, and -1 means infinite loop.

Using -stream_loop 3 will play the video 4 times in total (the original pass plus 3 loops).

# import the necessary packages
from deffcode import FFdecoder
import cv2

# define `-stream_loop 3` for looping 4 times
ffparams = {"-ffprefixes":["-stream_loop", "3"]}

# initialize and formulate the decoder with suitable source
decoder = FFdecoder("input.mp4", frame_format="bgr24", verbose=True, **ffparams).formulate()

# print metadata as a JSON string
print(decoder.metadata)

# grab the BGR24 frame from the decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
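The off-by-one between the -stream_loop value and the number of playthroughs is easy to get wrong, so here is a small sketch that builds the ffparams dictionary used above and computes the total. Both helper names are hypothetical.

```python
from typing import Optional


def stream_loop_params(extra_loops: int) -> dict:
    """Build the ffparams dictionary for a given -stream_loop value."""
    if extra_loops < -1:
        raise ValueError("-stream_loop accepts -1, 0, or a positive integer")
    return {"-ffprefixes": ["-stream_loop", str(extra_loops)]}


def total_plays(extra_loops: int) -> Optional[int]:
    """Return how many times the input plays; None means endless (-1)."""
    if extra_loops == -1:
        return None
    return extra_loops + 1  # the original pass plus the extra loops


print(stream_loop_params(3))  # {'-ffprefixes': ['-stream_loop', '3']}
print(total_plays(3))         # 4
print(total_plays(0))         # 1
```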

Another way to loop a video is to use the loop complex filter via the -filter_complex FFmpeg flag as an attribute of the ffparams dictionary parameter in FFdecoder API.

This filter places all frames into memory (RAM), so applying the trim filter first is strongly recommended. Otherwise you may run out of memory.

Using loop filter for looping video

The filter accepts the following options:

  • loop: Sets the number of loops for integer values > 0. Setting this value to -1 will result in infinite loops. Default is 0 (no loops).
  • size: Sets maximal size in number of frames. Default is 0.
  • start: Sets first frame of loop. Default is 0.

Using loop=3 will loop video 4 times.

# import the necessary packages
from deffcode import FFdecoder
import cv2

# define loop 4 times, each loop is 15 frames, each loop skips the first 25 frames
ffparams = {
    "-filter_complex": "loop=loop=3:size=15:start=25" # Or use: `loop=3:15:25`
}  

# initialize and formulate the decoder with suitable source
decoder = FFdecoder(
    "input.mp4", frame_format="bgr24", verbose=True, **ffparams
).formulate()

# print metadata as a JSON string
print(decoder.metadata)

# grab the BGR24 frame from the decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
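The filter string in the recipe above can also be assembled from the three documented options rather than written by hand. A minimal sketch, where loop_filter is a hypothetical helper that only formats the string and does not validate values against FFmpeg:

```python
def loop_filter(loop: int, size: int, start: int = 0) -> str:
    """Format the loop filter string from its loop, size, and start options.

    Hypothetical helper for illustration; not part of DeFFcode.
    """
    return f"loop=loop={loop}:size={size}:start={start}"


# same values as the recipe: 3 extra loops of 15 frames starting at frame 25
ffparams = {"-filter_complex": loop_filter(3, 15, 25)}
print(ffparams["-filter_complex"])  # loop=loop=3:size=15:start=25
```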