Hardware-Accelerated Video Decoding¶
FFmpeg offers access to dedicated GPU hardware, with varying support across platforms, that allows a range of video-related tasks to be completed faster or with less use of other resources (particularly the CPU).
By default, DeFFcode's FFdecoder API decodes its input with the source's own video decoder (extracted using the Sourcer API). However, you can easily switch to any supported video decoder of your choice by passing FFmpeg options through its `ffparams` dictionary parameter. This feature provides easy access to GPU-accelerated hardware decoders in FFdecoder API that generate video frames faster while using little to no CPU power, as opposed to CPU-intensive software decoders.

We'll briefly discuss these Hardware-Accelerated Video Decoding capabilities in the following recipes:
DeFFcode APIs require FFmpeg executable

DeFFcode APIs strictly require a valid FFmpeg executable for all of their core functionality, and any failure in its detection will immediately raise a `RuntimeError`. Follow the dedicated FFmpeg Installation doc ➶ for installation instructions.
Additional Python Dependencies for following recipes

The following recipes require additional python dependencies, which can be easily installed as below:

- OpenCV: OpenCV is required for previewing video frames. You can easily install it directly via `pip`:

  ```sh
  pip install opencv-python
  ```

  OpenCV installation from source

  You can also follow online tutorials for building & installing OpenCV manually from source on Windows, Linux, MacOS and Raspberry Pi machines. Make sure NOT to install both the pip and source versions together, otherwise the installation will fail to work!

  Other OpenCV binaries

  OpenCV maintainers also provide additional binaries via pip that contain both main modules and contrib/extra modules (`opencv-contrib-python`), as well as binaries for server (headless) environments (`opencv-python-headless` and `opencv-contrib-python-headless`). You can install any one of them in a similar manner. More information can be found here.
Always use FFdecoder API's `terminate()` method at the end to avoid undesired behavior.
Never name your python script deffcode.py

When trying out these recipes, never name your python script `deffcode.py`, otherwise it will result in a `ModuleNotFoundError`.
CUVID-accelerated Hardware-based Video Decoding and Previewing¶
Example Assumptions

Please note that the following recipe explicitly assumes:

- You're running a Linux operating system with a supported NVIDIA GPU.
- You're using FFmpeg 4.4 or newer, configured with at least the `--enable-nonfree --enable-cuda-nvcc --enable-libnpp --enable-cuvid --enable-nvenc` configuration flags during compilation. For compilation, follow these instructions ➶
- You're using the `h264_cuvid` decoder: Remember to check whether your FFmpeg is compiled with H.264 CUVID decoder support by executing the following one-liner command in your terminal, and observing whether the output contains something similar to the following:

  Verifying H.264 CUVID decoder support in FFmpeg

  ```sh
  $ ffmpeg -hide_banner -decoders | grep cuvid

  V..... av1_cuvid            Nvidia CUVID AV1 decoder (codec av1)
  V..... h264_cuvid           Nvidia CUVID H264 decoder (codec h264)
  V..... hevc_cuvid           Nvidia CUVID HEVC decoder (codec hevc)
  V..... mjpeg_cuvid          Nvidia CUVID MJPEG decoder (codec mjpeg)
  V..... mpeg1_cuvid          Nvidia CUVID MPEG1VIDEO decoder (codec mpeg1video)
  V..... mpeg2_cuvid          Nvidia CUVID MPEG2VIDEO decoder (codec mpeg2video)
  V..... mpeg4_cuvid          Nvidia CUVID MPEG4 decoder (codec mpeg4)
  V..... vc1_cuvid            Nvidia CUVID VC1 decoder (codec vc1)
  V..... vp8_cuvid            Nvidia CUVID VP8 decoder (codec vp8)
  V..... vp9_cuvid            Nvidia CUVID VP9 decoder (codec vp9)
  ```

  You can also use any of the above decoders in a similar way, if supported. Use the `ffmpeg -decoders` terminal command to list all decoders supported by your FFmpeg.
- You already have the appropriate Nvidia video drivers and related software installed on your machine.
- If the stream is not decodable in hardware (for example, it uses an unsupported codec or profile), then it will still be decoded in software automatically, but hardware filters won't be applicable.

These assumptions MAY or MAY NOT suit your current setup. Kindly use parameters suitable for your system platform and hardware settings only.
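If you'd rather perform this check programmatically, the decoder listing above is easy to parse from Python. The helper below is a hypothetical convenience sketch (not part of DeFFcode's API): it scans `ffmpeg -hide_banner -decoders` output for video decoders whose names end in `_cuvid`.

```python
import subprocess


def list_cuvid_decoders(decoders_text):
    """Parse `ffmpeg -hide_banner -decoders` output and return the
    names of all NVIDIA CUVID video decoders found in it."""
    names = []
    for line in decoders_text.splitlines():
        parts = line.split()
        # decoder rows look like: "V..... h264_cuvid  Nvidia CUVID H264 decoder ..."
        if len(parts) >= 2 and parts[0].startswith("V") and parts[1].endswith("_cuvid"):
            names.append(parts[1])
    return names


if __name__ == "__main__":
    try:
        # query the local FFmpeg build (requires ffmpeg on PATH)
        output = subprocess.run(
            ["ffmpeg", "-hide_banner", "-decoders"],
            capture_output=True, text=True, check=True,
        ).stdout
        print(list_cuvid_decoders(output))
    except FileNotFoundError:
        print("ffmpeg executable not found on PATH")
```

An empty result means your FFmpeg build lacks CUVID support and you should recompile it with the flags listed above.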
In this example, we will be using Nvidia's H.264 CUVID video decoder in FFdecoder API to achieve GPU-accelerated hardware video decoding of YUV420p frames from a given video file (say `foo.mp4`), and preview them using the OpenCV library's `cv2.imshow()` method.

With FFdecoder API, frames extracted with YUV pixel formats (`yuv420p`, `yuv444p`, `nv12`, `nv21`, etc.) are generally incompatible with OpenCV APIs such as `imshow()`. But you can easily make them compatible by using the exclusive `-enforce_cv_patch` boolean attribute of its `ffparams` dictionary parameter.
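To see why the conversion step in the recipe below works, note that a raw YUV420p (I420) frame arrives as a single plane of height × 3/2 rows: a full-size Y plane followed by quarter-size U and V planes. The following is a minimal plain-numpy sketch of that layout, using a hypothetical 640x360 frame (this is illustration only, not DeFFcode code):

```python
import numpy as np

# hypothetical 640x360 frame delivered as a raw I420 (yuv420p) buffer:
# a full-size Y plane followed by quarter-size U and V planes
width, height = 640, 360
i420 = np.zeros((height * 3 // 2, width), dtype=np.uint8)

# split the buffer back into its three planes
y = i420[:height, :]  # luma, full resolution
u = i420[height:height + height // 4, :].reshape(height // 2, width // 2)
v = i420[height + height // 4:, :].reshape(height // 2, width // 2)

print(y.shape, u.shape, v.shape)  # chroma planes are half width, half height
```

This height × 3/2 shape is exactly what `cv2.cvtColor(frame, cv2.COLOR_YUV2BGR_I420)` expects as input, which is why the `-enforce_cv_patch` attribute is needed to keep the frames in that OpenCV-compatible layout.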
More information on Nvidia's CUVID can be found here ➶
```python
# import the necessary packages
from deffcode import FFdecoder
import cv2

# define suitable FFmpeg parameters
ffparams = {
    "-vcodec": "h264_cuvid",  # use H.264 CUVID video decoder
    "-enforce_cv_patch": True,  # enable OpenCV patch for YUV (YUV420p) frames
}

# initialize and formulate the decoder with `foo.mp4` source
decoder = FFdecoder(
    "foo.mp4",
    frame_format="yuv420p",  # use YUV420p frame pixel format
    verbose=True,  # enable verbose output
    **ffparams  # apply various params and custom filters
).formulate()

# grab YUV420p frames from the decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # convert frame to `BGR` pixel format,
    # since the imshow() method only accepts `BGR` frames
    frame = cv2.cvtColor(frame, cv2.COLOR_YUV2BGR_I420)

    # {do something with the BGR frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
```
CUDA-accelerated Hardware-based Video Decoding and Previewing¶
Example Assumptions

Please note that the following recipe explicitly assumes:

- You're running a Linux operating system with a supported NVIDIA GPU.
- You're using FFmpeg 4.4 or newer, configured with at least the `--enable-nonfree --enable-cuda-nvcc --enable-libnpp --enable-cuvid --enable-nvenc` configuration flags during compilation. For compilation, follow these instructions ➶

  Verifying NVDEC/CUDA support in FFmpeg

  To use the CUDA video decoder (`cuda`), remember to check whether your FFmpeg is compiled with it by executing the following commands in your terminal, and observing whether the output contains something similar to the following:

  ```sh
  $ ffmpeg -hide_banner -pix_fmts | grep cuda
  ..H.. cuda                   0              0      0

  $ ffmpeg -hide_banner -filters | egrep "cuda|npp"
  ... bilateral_cuda    V->V       GPU accelerated bilateral filter
  ... chromakey_cuda    V->V       GPU accelerated chromakey filter
  ... colorspace_cuda   V->V       CUDA accelerated video color converter
  ... hwupload_cuda     V->V       Upload a system memory frame to a CUDA device.
  ... overlay_cuda      VV->V      Overlay one video on top of another using CUDA
  ... scale_cuda        V->V       GPU accelerated video resizer
  ... scale_npp         V->V       NVIDIA Performance Primitives video scaling and format conversion
  ... scale2ref_npp     VV->VV     NVIDIA Performance Primitives video scaling and format conversion to the given reference.
  ... sharpen_npp       V->V       NVIDIA Performance Primitives video sharpening filter.
  ... thumbnail_cuda    V->V       Select the most representative frame in a given sequence of consecutive frames.
  ... transpose_npp     V->V       NVIDIA Performance Primitives video transpose
  T.. yadif_cuda        V->V       Deinterlace CUDA frames
  ```
- You already have the appropriate Nvidia video drivers and related software installed on your machine.
- If the stream is not decodable in hardware (for example, it uses an unsupported codec or profile), then it will still be decoded in software automatically, but hardware filters won't be applicable.

These assumptions MAY or MAY NOT suit your current setup. Kindly use parameters suitable for your system platform and hardware settings only.
In this example, we will be using Nvidia's CUDA internal hwaccel video decoder (`cuda`) in FFdecoder API to automatically detect the best NV-accelerated video codec and keep video frames in GPU memory (for applying hardware filters), thereby achieving GPU-accelerated decoding of NV12 pixel-format frames from a given video file (say `foo.mp4`), and preview them using the OpenCV library's `cv2.imshow()` method.

`NV12` (for `4:2:0` input) and `NV21` (for `4:4:4` input) are the only supported pixel formats. You cannot change the pixel format to any other, since the NV-accelerated video codec supports only these two.
NV12 is a biplanar format with a full-sized Y plane followed by a single chroma plane with interleaved U and V values. NV21 is the same but with interleaved V and U values. The 12 in NV12 refers to 12 bits per pixel: NV12 has a half-width, half-height chroma channel, and is therefore a 4:2:0 subsampling. NV16 is 16 bits per pixel, with half-width, full-height chroma (i.e. 4:2:2). NV24 is 24 bits per pixel with a full-sized chroma channel (i.e. 4:4:4). Most NV12 functions allow the destination Y pointer to be NULL.
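As a rough illustration of that biplanar layout, here is a plain-numpy sketch that splits a hypothetical 640x360 raw NV12 buffer into its Y plane and its interleaved chroma plane (illustration only, not DeFFcode code):

```python
import numpy as np

# hypothetical 640x360 raw NV12 buffer: a full-size Y plane followed by
# one half-height plane of interleaved U and V samples
width, height = 640, 360
nv12 = np.zeros((height * 3 // 2, width), dtype=np.uint8)

y = nv12[:height, :]   # luma, full resolution
uv = nv12[height:, :]  # interleaved chroma, half height
u = uv[:, 0::2]        # even columns hold U (in NV21 these would hold V)
v = uv[:, 1::2]        # odd columns hold V (in NV21 these would hold U)

print(y.shape, uv.shape, u.shape, v.shape)
```

This height × 3/2 layout is what `cv2.cvtColor(frame, cv2.COLOR_YUV2BGR_NV12)` consumes in the recipe below.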
With FFdecoder API, frames extracted with YUV pixel formats (`yuv420p`, `yuv444p`, `nv12`, `nv21`, etc.) are generally incompatible with OpenCV APIs such as `imshow()`. But you can easily make them compatible by using the exclusive `-enforce_cv_patch` boolean attribute of its `ffparams` dictionary parameter.
More information on Nvidia's GPU Accelerated Decoding can be found here ➶
```python
# import the necessary packages
from deffcode import FFdecoder
import cv2

# define suitable FFmpeg parameters
ffparams = {
    "-vcodec": None,  # skip source decoder and let FFmpeg choose
    "-enforce_cv_patch": True,  # enable OpenCV patch for YUV (NV12) frames
    "-ffprefixes": [
        "-vsync",
        "0",  # prevent duplicate frames
        "-hwaccel",
        "cuda",  # accelerator
        "-hwaccel_output_format",
        "cuda",  # output accelerator
    ],
    "-custom_resolution": "null",  # discard source `-custom_resolution`
    "-framerate": "null",  # discard source `-framerate`
    "-vf": "scale_cuda=640:360,"  # scale to 640x360 in GPU memory
    + "fps=60.0,"  # set framerate to 60.0fps in GPU memory
    + "hwdownload,"  # download hardware frames to system memory
    + "format=nv12",  # convert downloaded frames to NV12 pixel format
}

# initialize and formulate the decoder with `foo.mp4` source
decoder = FFdecoder(
    "foo.mp4",
    frame_format="null",  # discard source frame pixel format
    verbose=True,  # enable verbose output
    **ffparams  # apply various params and custom filters
).formulate()

# grab NV12 frames from the decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # convert frame to `BGR` pixel format,
    # since the imshow() method only accepts `BGR` frames
    frame = cv2.cvtColor(frame, cv2.COLOR_YUV2BGR_NV12)

    # {do something with the BGR frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# close output window
cv2.destroyAllWindows()

# terminate the decoder
decoder.terminate()
```