Sourcer API

Sourcer API Functional Diagram

Sourcer API acts as a Source Probing Utility. Unlike other FFmpeg wrappers, which mostly rely on the ffprobe module, it attempts to open the given Input Source directly with FFmpeg inside a subprocess pipe, then parses the standard output (stdout) using various pattern-matching methods to recognize all the properties (metadata) of each media stream contained in it.
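The probing approach described above can be sketched roughly as follows. This is a minimal illustration, not Sourcer's actual implementation: the helper names `run_ffmpeg_probe` and `parse_video_stream`, the two regular expressions, and the sample text are all assumptions made for this example. The sample is hand-written in the general shape of FFmpeg's stream report and was not captured from a real run.

```python
import re
import subprocess

# Illustrative sample only (hand-written, NOT real FFmpeg output).
SAMPLE_OUTPUT = """\
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'foo.mp4':
  Duration: 00:00:10.00, start: 0.000000, bitrate: 2100 kb/s
  Stream #0:0(und): Video: h264 (High), yuv420p, 1280x720, 2000 kb/s, 25 fps, 25 tbr, 12800 tbn
  Stream #0:1(und): Audio: aac (LC), 44100 Hz, stereo, fltp, 128 kb/s
"""


def run_ffmpeg_probe(ffmpeg: str, source: str) -> str:
    """Hypothetical helper: open `source` with FFmpeg and capture its report.

    FFmpeg prints its stream report on stderr, so stderr is merged into stdout.
    Not called in this sketch because it needs an FFmpeg binary and a media file.
    """
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-i", source],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


def parse_video_stream(output: str) -> dict:
    """Hypothetical helper: resolution and framerate of the first video stream."""
    for line in output.splitlines():
        if "Video:" not in line:
            continue
        resolution = re.search(r"\b(\d{2,5})x(\d{2,5})\b", line)
        framerate = re.search(r"(\d+(?:\.\d+)?)\s*fps", line)
        return {
            "resolution": (
                (int(resolution.group(1)), int(resolution.group(2))) if resolution else None
            ),
            "framerate": float(framerate.group(1)) if framerate else None,
        }
    # no video stream line found
    return {}


info = parse_video_stream(SAMPLE_OUTPUT)
print(info)
```

With a real binary available, `parse_video_stream(run_ffmpeg_probe("ffmpeg", "foo.mp4"))` would apply the same parsing to live output, assuming both the binary and the file exist.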

Sourcer API primarily acts as a backend for FFdecoder API, gathering, processing, and validating the metadata of all multimedia streams available in the given Input Source. Sourcer shares this information with FFdecoder API, which uses it to formulate its default FFmpeg pipeline parameters for real-time video-frame generation.

Sourcer API is designed as a standalone Metadata Extraction API for easily parsing information from the multimedia streams available in the given Input Source. Its retrieve_metadata() method returns this information as either a human-readable JSON string or a machine-readable dictionary object.
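The two return types can be illustrated with a small standalone sketch. The function `format_metadata` is a hypothetical stand-in for the formatting step of `retrieve_metadata()`, and the metadata values are invented for illustration; only the key names are taken from the source listing on this page.

```python
from __future__ import annotations

import json
from typing import Any


def format_metadata(metadata: dict[str, Any], pretty_json: bool = False) -> dict[str, Any] | str:
    """Hypothetical stand-in: return a JSON string if requested, else the dictionary."""
    return json.dumps(metadata, indent=2) if pretty_json else metadata


# Invented values for illustration; not probed from a real file.
metadata = {
    "source": "foo.mp4",
    "source_video_resolution": [1280, 720],
    "source_video_framerate": 25.0,
    "source_has_audio": False,
}

as_dict = format_metadata(metadata)  # machine-readable dictionary
as_json = format_metadata(metadata, pretty_json=True)  # human-readable JSON string
print(as_json)
```

With the real API, the equivalent call chain would be `Sourcer("foo.mp4").probe_stream().retrieve_metadata(pretty_json=True)`, since `probe_stream()` returns a reference to the instance. It requires an FFmpeg installation and an existing source file.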

All metadata attributes available with Sourcer API (on Windows) are discussed here ➶.

Furthermore, Sourcer's sourcer_params dictionary parameter can be used to define almost any FFmpeg parameter as well as alter internal API settings.
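As a rough sketch of how such a dictionary might be handled, the snippet below mirrors the sanitization step visible in the constructor source on this page: keys and string values are stripped of whitespace, and internal settings are popped out before the rest is treated as FFmpeg parameters. The function name `sanitize_params` and the example values are assumptions for illustration. `-ffprefixes` and `-force_validate_source` are the internal keys used in the source, while `-vf` is an ordinary FFmpeg option.

```python
from typing import Any


def sanitize_params(**params: Any) -> dict[str, Any]:
    """Hypothetical helper: strip keys, and strip values that are not containers or numbers."""
    return {
        str(k).strip(): (
            str(v).strip() if not isinstance(v, (dict, list, int, float, tuple)) else v
        )
        for k, v in params.items()
    }


# Example values are invented; note the stray whitespace.
raw = {
    " -ffprefixes ": ["-re"],
    "-vf": "  scale=640:360 ",
    "-force_validate_source": True,
}

params = sanitize_params(**raw)

# Internal API settings are removed from the dictionary first...
prefixes = params.pop("-ffprefixes", [])
force_validate = params.pop("-force_validate_source", False)

# ...and whatever remains is treated as FFmpeg parameters.
print(prefixes, force_validate, params)
```

Popping the internal keys first keeps API-only settings from leaking into the FFmpeg command line.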

For usage examples, kindly refer to our Basic Recipes 🍰 and Advanced Recipes 🥐.

Sourcer API parameters are explained here ➶

Source code in deffcode/sourcer.py
class Sourcer:
    """
    > Sourcer API acts as a **Source Probing Utility**. Unlike other FFmpeg wrappers, which mostly rely on the [`ffprobe`](https://ffmpeg.org/ffprobe.html) module,
    it attempts to open the given Input Source directly with [**FFmpeg**](https://ffmpeg.org/) inside a [`subprocess`](https://docs.python.org/3/library/subprocess.html) pipe,
    then parses the standard output (stdout) using various pattern-matching methods to recognize all the properties (metadata) of each
    media stream contained in it.

    Sourcer API primarily acts as a **backend for [FFdecoder API](../../reference/ffdecoder)** for gathering, processing, and validating
    all multimedia streams metadata available in the given Input Source. Sourcer shares this information with FFdecoder API which helps in
    formulating its default FFmpeg pipeline parameters for real-time video-frames generation.

    Sourcer API is designed as a standalone **Metadata Extraction API** for easily parsing information from multimedia streams available in the
    given Input Source and returns it in either Human-readable _(JSON string)_ or Machine-readable _(Dictionary object)_ type with its
    [`retrieve_metadata()`](#deffcode.sourcer.Sourcer.retrieve_metadata) method.

    !!! info "All metadata attributes available with Sourcer API(On :fontawesome-brands-windows: Windows) are discussed [here ➶](../../recipes/basic/#display-source-video-metadata)."

    Furthermore, Sourcer's [`sourcer_params`](params/#sourcer_params) dictionary parameter can be used to define almost any FFmpeg parameter as well as alter internal API settings.

    !!! example "For usage examples, kindly refer to our **[Basic Recipes :cake:](../../recipes/basic)** and **[Advanced Recipes :croissant:](../../recipes/advanced)**"

    !!! info "Sourcer API parameters are explained [here ➶](params/)"
    """

    def __init__(
        self,
        source: str | list[str],
        source_demuxer: str | list[str] | None = None,
        custom_ffmpeg: str = "",
        verbose: bool = False,
        **sourcer_params: Any,
    ) -> None:
        """
        This constructor method initializes the object state and attributes of the Sourcer Class.

        Parameters:
            source (str, list): defines the input(`-i`) source filename/URL/device-name/device-path, or a list of such sources for multi-input.
            source_demuxer (str, list): specifies the demuxer(`-f`) for the input source, or a list with one demuxer per source for multi-input.
            custom_ffmpeg (str): assigns the custom path/directory of the FFmpeg executable.
            verbose (bool): enables/disables verbose logging.
            sourcer_params (dict): provides the flexibility to control supported internal and FFmpeg parameters.
        """
        # checks machine OS
        self.__machine_OS = platform.system()

        # define internal parameters
        self.__verbose_logs = (  # enable verbose if specified
            verbose if (verbose and isinstance(verbose, bool)) else False
        )

        # handle metadata received
        self.__ffsp_output = None

        # sanitize sourcer_params
        self.__sourcer_params = {
            str(k).strip(): (
                str(v).strip() if not isinstance(v, (dict, list, int, float, tuple)) else v
            )
            for k, v in sourcer_params.items()
        }

        # handle whether to force validate source
        self.__forcevalidatesource = self.__sourcer_params.pop("-force_validate_source", False)
        if not isinstance(self.__forcevalidatesource, bool):
            # reset improper values
            self.__forcevalidatesource = False

        # sanitize externally accessible parameters and setup list mapping
        self.__is_multi = isinstance(source, list)
        self.__source_list = source if self.__is_multi else [source]

        # validate source list early so downstream errors stay coherent
        if self.__is_multi and not self.__source_list:
            raise ValueError("Input `source` list is empty!")

        # handle user-defined FFmpeg pre-headers (parameters such as `-re`); value must be a list
        _prefixes = self.__sourcer_params.pop("-ffprefixes", [])
        if not isinstance(_prefixes, list):
            # log it
            logger.warning(
                "Discarding invalid `-ffprefixes` value of wrong type `{}`!".format(
                    type(_prefixes).__name__
                )
            )
            # reset improper values
            _prefixes = []

        if self.__is_multi:
            # multi-input requires per-source list-of-lists with matching length
            # to keep prefix routing unambiguous; flat lists are rejected.
            if _prefixes:
                if not all(isinstance(p, list) for p in _prefixes):
                    raise ValueError(
                        "Multi-input `-ffprefixes` must be a list of per-input lists "
                        "(e.g. `[['-re'], ['-stream_loop', '-1']]`). "
                        "Flat lists are ambiguous in multi-input mode."
                    )
                if len(_prefixes) != len(self.__source_list):
                    raise ValueError(
                        "`-ffprefixes` length ({}) must match `source` list length ({})!".format(
                            len(_prefixes), len(self.__source_list)
                        )
                    )
                self.__ffmpeg_prefixes_list = _prefixes
                self.__ffmpeg_prefixes = _prefixes[0]
            else:
                self.__ffmpeg_prefixes_list = [[] for _ in self.__source_list]
                self.__ffmpeg_prefixes = []
        else:
            # single-input keeps the original flat-list contract; nested lists are
            # only meaningful in multi-input mode, so reject them with a warning.
            if any(isinstance(p, list) for p in _prefixes):
                logger.warning(
                    "Nested lists in `-ffprefixes` are only supported for multi-input sources. Discarding!"
                )
                _prefixes = []
            self.__ffmpeg_prefixes = _prefixes
            self.__ffmpeg_prefixes_list = [_prefixes]

        # handle source_demuxer list mapping
        if self.__is_multi:
            if isinstance(source_demuxer, list):
                if len(source_demuxer) != len(self.__source_list):
                    raise ValueError(
                        "`source_demuxer` length ({}) must match `source` list length ({})!".format(
                            len(source_demuxer), len(self.__source_list)
                        )
                    )
                self.__source_demuxer_list = source_demuxer
            else:
                self.__source_demuxer_list = [source_demuxer] * len(self.__source_list)
        else:
            self.__source_demuxer_list = [source_demuxer]

        # initialize per-source metadata buffer so retrieve_metadata can be
        # called safely (e.g. via the recursive primary probe in probe_stream)
        # without polluting the result with stale `sources` keys.
        self.__multi_source_metadata: list[Any] = []

        # handle where to save the downloaded FFmpeg Static assets on Windows(if specified)
        __ffmpeg_download_path = self.__sourcer_params.pop("-ffmpeg_download_path", "")
        if not isinstance(__ffmpeg_download_path, str):
            # reset improper values
            __ffmpeg_download_path = ""

        # validate the FFmpeg assets and return location (also downloads static assets on windows)
        self.__ffmpeg = get_valid_ffmpeg_path(
            str(custom_ffmpeg),
            self.__machine_OS == "Windows",
            ffmpeg_download_path=__ffmpeg_download_path,
            verbose=self.__verbose_logs,
        )

        # check if valid FFmpeg path returned
        if self.__ffmpeg:
            self.__verbose_logs and logger.debug(
                "Found valid FFmpeg executable: `{}`.".format(self.__ffmpeg)
            )
        else:
            # else raise error
            raise RuntimeError(
                "[DeFFcode:ERROR] :: Failed to find FFmpeg assets on this system. Kindly compile/install FFmpeg or provide a valid custom FFmpeg binary path!"
            )

        # sanitize externally accessible parameters and assign them
        # Use primary index 0 for fallback properties validation
        if not self.__source_list:
            raise ValueError("Input `source` parameter is empty!")
        source = self.__source_list[0]
        source_demuxer = self.__source_demuxer_list[0]

        # handles source demuxer
        if source is None:
            # first check if source value is empty
            # raise error if true
            raise ValueError("Input `source` parameter is empty!")
        elif isinstance(source_demuxer, str):
            # assign if valid demuxer value
            self.__source_demuxer = source_demuxer.strip().lower()
            # the `auto` demuxer is only valid when the source is a device index
            assert self.__source_demuxer != "auto" or validate_device_index(source), (
                "Invalid `source_demuxer='auto'` value detected with source: `{}`. Aborting!".format(
                    source
                )
            )
        else:
            # otherwise find valid default source demuxer value
            # enforce "auto" if valid index device
            self.__source_demuxer = "auto" if validate_device_index(source) else None
            # log if a non-string demuxer value was given and discarded
            self.__verbose_logs and source_demuxer is not None and logger.warning(
                "Discarding invalid `source_demuxer` parameter value of wrong type: `{}`".format(
                    type(source_demuxer).__name__
                )
            )
            # log if source is a valid device index and the 'auto' demuxer is enforced
            self.__verbose_logs and self.__source_demuxer == "auto" and logger.critical(
                "Given source `{}` is a valid device index. Enforcing 'auto' demuxer.".format(
                    source
                )
            )

        # handles source stream
        self.__source = source

        # creates shallow copy for further usage #TODO
        self.__source_org = copy.copy(self.__source)
        self.__source_demuxer_org = copy.copy(self.__source_demuxer)

        # handles all extracted devices names/paths list
        # when source_demuxer = "auto"
        self.__extracted_devices_list: list[Any] = []

        # various source stream params
        self.__default_video_resolution = ""  # handles stream resolution
        self.__default_video_orientation = ""  # handles stream's video orientation
        self.__default_video_framerate = ""  # handles stream framerate
        self.__default_video_bitrate = ""  # handles stream's video bitrate
        self.__default_video_pixfmt = ""  # handles stream's video pixfmt
        self.__default_video_decoder = ""  # handles stream's video decoder
        self.__default_source_duration = ""  # handles stream's video duration
        self.__approx_video_nframes = ""  # handles approx stream frame number
        self.__default_audio_bitrate = ""  # handles stream's audio bitrate
        self.__default_audio_samplerate = ""  # handles stream's audio samplerate

        # handle various stream flags
        self.__contains_video = False  # contains video
        self.__contains_audio = False  # contains audio
        self.__contains_images = False  # contains image-sequence

        # handles output parameters through filters
        self.__metadata_output = None  # handles output stream metadata
        self.__output_frames_resolution = ""  # handles output stream resolution
        self.__output_framerate = ""  # handles output stream framerate
        self.__output_frames_pixfmt = ""  # handles output frame pixel format
        self.__output_orientation = ""  # handles output frame orientation

        # check whether metadata probed or not?
        self.__metadata_probed = False

    def probe_stream(self, default_stream_indexes: list[int] | tuple[int, int] = (0, 0)) -> Sourcer:
        """
        This method parses/probes the FFmpeg `subprocess` pipe's standard output for the given input source and populates the information in private class variables.

        Parameters:
            default_stream_indexes (list, tuple): selects specific video and audio stream index in case of multiple ones. Value can be of format: `(int,int)`. For example `(0,1)` is ("0th video stream", "1st audio stream").

        **Returns:** Reference to the instance object.
        """
        assert (
            isinstance(default_stream_indexes, (list, tuple))
            and len(default_stream_indexes) == 2
            and all(isinstance(x, int) for x in default_stream_indexes)
        ), "Invalid default_stream_indexes value!"
        # validate source and extract metadata
        self.__ffsp_output = self.__validate_source(
            self.__source,
            source_demuxer=self.__source_demuxer,
            forced_validate=(self.__forcevalidatesource if self.__source_demuxer is None else True),
        )
        # parse resolution and framerate
        video_rfparams = self.__extract_resolution_framerate(
            default_stream=default_stream_indexes[0]
        )
        if video_rfparams:
            self.__default_video_resolution = video_rfparams["resolution"]
            self.__default_video_framerate = video_rfparams["framerate"]
            self.__default_video_orientation = video_rfparams["orientation"]

        # parse output parameters through filters (if available)
        if self.__metadata_output is not None:
            # parse output resolution and framerate
            out_video_rfparams = self.__extract_resolution_framerate(
                default_stream=default_stream_indexes[0], extract_output=True
            )
            if out_video_rfparams:
                self.__output_frames_resolution = out_video_rfparams["resolution"]
                self.__output_framerate = out_video_rfparams["framerate"]
                self.__output_orientation = out_video_rfparams["orientation"]
            # parse output pixel-format
            self.__output_frames_pixfmt = self.__extract_video_pixfmt(
                default_stream=default_stream_indexes[0], extract_output=True
            )

        # parse pixel-format
        self.__default_video_pixfmt = self.__extract_video_pixfmt(
            default_stream=default_stream_indexes[0]
        )

        # parse video decoder
        self.__default_video_decoder = self.__extract_video_decoder(
            default_stream=default_stream_indexes[0]
        )
        # parse rest of metadata
        if not self.__contains_images:
            # parse video bitrate
            self.__default_video_bitrate = self.__extract_video_bitrate(
                default_stream=default_stream_indexes[0]
            )
            # parse audio bitrate and samplerate
            audio_params = self.__extract_audio_bitrate_nd_samplerate(
                default_stream=default_stream_indexes[1]
            )
            if audio_params:
                self.__default_audio_bitrate = audio_params["bitrate"]
                self.__default_audio_samplerate = audio_params["samplerate"]
            # parse video duration
            self.__default_source_duration = self.__extract_duration()
            # calculate all flags
            if (
                self.__default_video_bitrate
                or (self.__default_video_framerate and self.__default_video_resolution)
            ) and (self.__default_audio_bitrate or self.__default_audio_samplerate):
                self.__contains_video = True
                self.__contains_audio = True
            elif self.__default_video_bitrate or (
                self.__default_video_framerate and self.__default_video_resolution
            ):
                self.__contains_video = True
            elif self.__default_audio_bitrate or self.__default_audio_samplerate:
                self.__contains_audio = True
            else:
                raise ValueError(
                    "Invalid source with no decodable audio or video stream provided. Aborting!"
                )
        # calculate approximate number of video frames
        if self.__default_video_framerate and self.__default_source_duration:
            self.__approx_video_nframes = np.rint(
                self.__default_video_framerate * self.__default_source_duration
            ).astype(int, casting="unsafe")

        # signal metadata has been probed
        self.__metadata_probed = True

        if self.__is_multi:
            # collect per-source metadata for the `sources` key. The primary
            # source's flat metadata is captured first via retrieve_metadata;
            # the guard inside retrieve_metadata (checks for non-empty
            # __multi_source_metadata) prevents `sources: []` self-pollution.
            self.__multi_source_metadata.append(self.retrieve_metadata(force_retrieve_missing=True))
            for idx in range(1, len(self.__source_list)):
                _src = self.__source_list[idx]
                _demux = self.__source_demuxer_list[idx]
                _prefixes = self.__ffmpeg_prefixes_list[idx]
                _params = self.__sourcer_params.copy()
                _params["-ffprefixes"] = _prefixes
                # spawn an independent single-source Sourcer per extra input;
                # this reuses the resolved ffmpeg path and isolates per-source
                # parsing state (which probe_stream otherwise clobbers).
                # Resolve to an absolute path: on Unix `self.__ffmpeg` may be
                # the bare command "ffmpeg" found via PATH, which the nested
                # `get_valid_ffmpeg_path()` would reject as "not a file".
                _custom_ffmpeg = (
                    self.__ffmpeg
                    if self.__ffmpeg and os.path.isfile(self.__ffmpeg)
                    else (shutil.which(self.__ffmpeg) or "")
                )
                _s = Sourcer(
                    _src,
                    source_demuxer=_demux,
                    custom_ffmpeg=_custom_ffmpeg,
                    verbose=self.__verbose_logs,
                    **_params,
                )
                _s.probe_stream(default_stream_indexes)
                self.__multi_source_metadata.append(
                    _s.retrieve_metadata(force_retrieve_missing=True)
                )

        # return reference to the instance object.
        return self

    def retrieve_metadata(
        self, pretty_json: bool = False, force_retrieve_missing: bool = False
    ) -> dict[str, Any] | str | tuple[dict[str, Any] | str, dict[str, Any] | str]:
        """
        This method returns Parsed/Probed Metadata of the given source.

        Parameters:
            pretty_json (bool): whether to return metadata as JSON string(if `True`) or Dictionary(if `False`) type?
            force_retrieve_missing (bool): whether to also return metadata missing in current Pipeline. This method returns `(metadata, metadata_missing)` tuple if `force_retrieve_missing=True` instead of `metadata`.

        **Returns:** `metadata` or `(metadata, metadata_missing)`, formatted as JSON string or python dictionary.
        """
        # check if metadata has been probed or not
        assert self.__metadata_probed, (
            "Source metadata has not been probed yet! Check if you called the `probe_stream()` method."
        )
        # log it
        self.__verbose_logs and logger.debug("Extracting Metadata...")
        # create metadata dictionary from information populated in private class variables
        metadata = {
            "ffmpeg_binary_path": self.__ffmpeg,
            "source": self.__source,
        }
        metadata_missing = {}
        # Only either `source_demuxer` or `source_extension` attribute can be
        # present in metadata.
        if self.__source_demuxer is None:
            metadata.update({"source_extension": os.path.splitext(self.__source)[-1]})
            # update missing
            force_retrieve_missing and metadata_missing.update({"source_demuxer": ""})
        else:
            metadata.update({"source_demuxer": self.__source_demuxer})
            # update missing
            force_retrieve_missing and metadata_missing.update({"source_extension": ""})
        # add source video metadata properties
        metadata.update(
            {
                "source_video_resolution": self.__default_video_resolution,
                "source_video_pixfmt": self.__default_video_pixfmt,
                "source_video_framerate": self.__default_video_framerate,
                "source_video_orientation": self.__default_video_orientation,
                "source_video_decoder": self.__default_video_decoder,
                "source_duration_sec": self.__default_source_duration,
                "approx_video_nframes": (
                    int(self.__approx_video_nframes)
                    if self.__approx_video_nframes
                    and not any(
                        "loop" in x for x in self.__ffmpeg_prefixes
                    )  # check if any loops in prefix
                    and not any(
                        "loop" in x for x in dict2Args(self.__sourcer_params)
                    )  # check if any loops in filters
                    else None
                ),
                "source_video_bitrate": self.__default_video_bitrate,
                "source_audio_bitrate": self.__default_audio_bitrate,
                "source_audio_samplerate": self.__default_audio_samplerate,
                "source_has_video": self.__contains_video,
                "source_has_audio": self.__contains_audio,
                "source_has_image_sequence": self.__contains_images,
            }
        )
        # add output metadata properties (if available)
        if self.__metadata_output is not None:
            metadata.update(
                {
                    "output_frames_resolution": self.__output_frames_resolution,
                    "output_frames_pixfmt": self.__output_frames_pixfmt,
                    "output_framerate": self.__output_framerate,
                    "output_orientation": self.__output_orientation,
                }
            )
        else:
            # since output stream metadata properties are only available when additional
            # FFmpeg parameters(such as filters) are defined manually, thereby missing
            # output stream properties are handled by assigning them counterpart source
            # stream metadata property values
            force_retrieve_missing and metadata_missing.update(
                {
                    "output_frames_resolution": self.__default_video_resolution,
                    "output_frames_pixfmt": self.__default_video_pixfmt,
                    "output_framerate": self.__default_video_framerate,
                    "output_orientation": self.__default_video_orientation,
                }
            )

        # Only emit the `sources` key after per-source metadata is populated.
        # probe_stream() calls retrieve_metadata() once for the primary input
        # *before* populating __multi_source_metadata; without this guard the
        # primary's per-source dict would carry a stray empty `sources: []`
        # field that pollutes metadata["sources"][0].
        if self.__is_multi and self.__multi_source_metadata:
            metadata["sources"] = [m[0] for m in self.__multi_source_metadata]
            force_retrieve_missing and metadata_missing.update(
                {"sources": [m[1] for m in self.__multi_source_metadata]}
            )

        # log it
        self.__verbose_logs and logger.debug("Metadata Extraction completed successfully!")
        # parse as JSON string(`json.dumps`), if defined
        metadata = json.dumps(metadata, indent=2) if pretty_json else metadata
        metadata_missing = (
            json.dumps(metadata_missing, indent=2) if pretty_json else metadata_missing
        )
        # return `metadata` or `(metadata, metadata_missing)`
        return metadata if not force_retrieve_missing else (metadata, metadata_missing)

    @property
    def enumerate_devices(self) -> dict[int, Any]:
        """
        A property object that enumerates the names of all probed Camera Devices connected to your system,
        along with their respective "device indexes" or "camera indexes", as a python dictionary.

        **Returns:** Probed Camera Devices as python dictionary.
        """
        # check if metadata has been probed or not
        assert self.__metadata_probed, (
            "Source metadata has not been probed yet! Check if you called the `probe_stream()` method."
        )

        # log if specified
        self.__verbose_logs and logger.debug("Enumerating all probed Camera Devices.")

        # return probed Camera Devices as python dictionary.
        return dict(enumerate(self.__extracted_devices_list))

    def __validate_source(
        self,
        source: str,
        source_demuxer: str | None = None,
        forced_validate: bool = False,
    ) -> str:
        """
        This Internal method validates source and extracts its metadata.

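        Parameters:
            source (str): defines the input(`-i`) source filename/URL/device-name/device-path.
            source_demuxer (str): specifies the demuxer(`-f`) for the input source.
            forced_validate (bool): whether to skip validation tests or not?

        **Returns:** Extracted metadata of the validated source, as a string.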
        """
        # validate source demuxer(if defined)
        if source_demuxer is not None:
            # check if "auto" demuxer is specified
            if source_demuxer == "auto":
                # integerise source to get index
                index = int(source)
                # extract devices list and actual demuxer value
                (
                    self.__extracted_devices_list,
                    source_demuxer,
                ) = extract_device_n_demuxer(
                    self.__ffmpeg,
                    machine_OS=self.__machine_OS,
                    verbose=self.__verbose_logs,
                )
                # valid indexes range
                valid_indexes = list(
                    range(
                        -len(self.__extracted_devices_list),
                        len(self.__extracted_devices_list),
                    )
                )
                # check index is within valid range
                if self.__extracted_devices_list and index in valid_indexes:
                    # overwrite actual source device name/path/index
                    if self.__machine_OS == "Windows":
                        # Windows OS requires "video=" prefix
                        self.__source = source = "video={}".format(
                            self.__extracted_devices_list[index]
                        )
                    elif self.__machine_OS == "Darwin":
                        # Darwin OS requires only device indexes
                        self.__source = source = (
                            str(index)
                            if index >= 0
                            else str(len(self.__extracted_devices_list) + index)
                        )
                    else:
                        # Linux OS requires /dev/video format
                        self.__source = source = next(
                            iter(self.__extracted_devices_list[index].keys())
                        )
                    # overwrite source_demuxer global variable
                    self.__source_demuxer = source_demuxer
                    self.__verbose_logs and logger.debug(
                        "Successfully configured device `{}` at index `{}` with demuxer `{}`.".format(
                            (
                                self.__extracted_devices_list[index]
                                if self.__machine_OS != "Linux"
                                else next(iter(self.__extracted_devices_list[index].values()))[0]
                            ),
                            (index if index >= 0 else len(self.__extracted_devices_list) + index),
                            self.__source_demuxer,
                        )
                    )
                else:
                    # raise error otherwise
                    raise ValueError(
                        "Given source `{}` is not a valid device index. Possible index values can be: {}".format(
                            source,
                            ",".join(f"{x}" for x in valid_indexes),
                        )
                    )
            # otherwise validate against supported demuxers
            elif source_demuxer not in get_supported_demuxers(self.__ffmpeg):
                # raise if fails
                raise ValueError(
                    "Installed FFmpeg failed to recognize `{}` demuxer. Check `source_demuxer` parameter value again!".format(
                        source_demuxer
                    )
                )
            else:
                pass

        # assert if valid source
        assert source and isinstance(source, str), "Input `source` parameter is of invalid type!"

        # Differentiate input
        if forced_validate:
            source_demuxer is None and logger.critical(
                "Forcefully passing validation test for given source!"
            )
            self.__source = source
        elif os.path.isfile(source):
            self.__source = os.path.abspath(source)
        elif is_valid_image_seq(self.__ffmpeg, source=source, verbose=self.__verbose_logs):
            self.__source = source
            self.__contains_images = True
        elif is_valid_url(self.__ffmpeg, url=source, verbose=self.__verbose_logs):
            self.__source = source
        else:
            logger.error("`source` value is unusable or unsupported!")
            # discard the value otherwise
            raise ValueError("Input source is invalid. Aborting!")
        # format command
        if self.__sourcer_params:
            # handle additional params separately
            meta_cmd = (
                [self.__ffmpeg]
                + (["-hide_banner"] if not self.__verbose_logs else [])
                + ["-t", "0.0001"]
                + self.__ffmpeg_prefixes
                + (["-f", source_demuxer] if source_demuxer else [])
                + ["-i", source]
                + dict2Args(self.__sourcer_params)
                + ["-f", "null", "-"]
            )
        else:
            meta_cmd = (
                [self.__ffmpeg]
                + (["-hide_banner"] if not self.__verbose_logs else [])
                + self.__ffmpeg_prefixes
                + (["-f", source_demuxer] if source_demuxer else [])
                + ["-i", source]
            )
        # extract metadata, decode, and filter
        metadata = (
            check_sp_output(
                meta_cmd,
                force_retrieve_stderr=True,
            )
            .decode("utf-8")
            .strip()
        )
        # separate input and output metadata (if available)
        if "Output #" in metadata:
            # split only at the first marker so that unpacking cannot fail
            (metadata, self.__metadata_output) = metadata.split("Output #", 1)
        # return metadata based on params
        return metadata

    def __extract_video_bitrate(self, default_stream: int = 0) -> str:
        """
        This Internal method parses default video-stream bitrate from metadata.

        Parameters:
            default_stream (int): selects specific video-stream in case of multiple ones.

        **Returns:** Default Video bitrate as string value.
        """
        identifiers = ["Video:", "Stream #"]
        video_bitrate_text = [
            line.strip()
            for line in self.__ffsp_output.split("\n")
            if all(x in line for x in identifiers)
        ]
        if video_bitrate_text:
            selected_stream = video_bitrate_text[
                (
                    default_stream
                    if default_stream > 0 and default_stream < len(video_bitrate_text)
                    else 0
                )
            ]
            filtered_bitrate = re.findall(r",\s[0-9]+\s\w\w[\/]s", selected_stream.strip())
            if len(filtered_bitrate):
                default_video_bitrate = filtered_bitrate[0].split(" ")[1:3]
                final_bitrate = "{}{}".format(
                    int(default_video_bitrate[0].strip()),
                    "k" if (default_video_bitrate[1].strip().startswith("k")) else "M",
                )
                return final_bitrate
        return ""

    def __extract_video_decoder(self, default_stream: int = 0) -> str:
        """
        This Internal method parses default video-stream decoder from metadata.

        Parameters:
            default_stream (int): selects specific video-stream in case of multiple ones.

        **Returns:** Default Video decoder as string value.
        """
        assert isinstance(default_stream, int), "Invalid input!"
        identifiers = ["Video:", "Stream #"]
        meta_text = [
            line.strip()
            for line in self.__ffsp_output.split("\n")
            if all(x in line for x in identifiers)
        ]
        if meta_text:
            selected_stream = meta_text[
                (default_stream if default_stream > 0 and default_stream < len(meta_text) else 0)
            ]
            filtered_decoder = re.findall(r"Video:\s[a-z0-9_-]*", selected_stream.strip())
            if filtered_decoder:
                return filtered_decoder[0].split(" ")[-1]
        return ""

    def __extract_video_pixfmt(self, default_stream: int = 0, extract_output: bool = False) -> str:
        """
        This Internal method parses default video-stream pixel-format from metadata.

        Parameters:
            default_stream (int): selects specific video-stream in case of multiple ones.
            extract_output (bool): Whether to extract from output(if true) or input(if false) stream?

        **Returns:** Default Video pixel-format as string value.
        """
        identifiers = ["Video:", "Stream #"]
        meta_text = (
            [
                line.strip()
                for line in self.__ffsp_output.split("\n")
                if all(x in line for x in identifiers)
            ]
            if not extract_output
            else [
                line.strip()
                for line in self.__metadata_output.split("\n")
                if all(x in line for x in identifiers)
            ]
        )
        if meta_text:
            selected_stream = meta_text[
                (default_stream if default_stream > 0 and default_stream < len(meta_text) else 0)
            ]
            filtered_pixfmt = re.findall(r",\s[a-z][a-z0-9_-]*", selected_stream.strip())
            if filtered_pixfmt:
                return filtered_pixfmt[0].split(" ")[-1]
        return ""

    def __extract_audio_bitrate_nd_samplerate(self, default_stream: int = 0) -> dict[str, str]:
        """
        This Internal method parses default audio-stream bitrate and sample-rate from metadata.

        Parameters:
            default_stream (int): selects specific audio-stream in case of multiple ones.

        **Returns:** Default Audio-stream bitrate and sample-rate as dictionary value.
        """
        identifiers = ["Audio:", "Stream #"]
        meta_text = [
            line.strip()
            for line in self.__ffsp_output.split("\n")
            if all(x in line for x in identifiers)
        ]
        result = {}
        if meta_text:
            selected_stream = meta_text[
                (default_stream if default_stream > 0 and default_stream < len(meta_text) else 0)
            ]
            # filter data
            filtered_audio_bitrate = re.findall(
                r"fltp,\s[0-9]+\s\w\w[\/]s", selected_stream.strip()
            )
            filtered_audio_samplerate = re.findall(r",\s[0-9]+\sHz", selected_stream.strip())
            # get audio bitrate metadata
            if filtered_audio_bitrate:
                filtered = filtered_audio_bitrate[0].split(" ")[1:3]
                result["bitrate"] = "{}{}".format(
                    int(filtered[0].strip()),
                    "k" if (filtered[1].strip().startswith("k")) else "M",
                )
            else:
                result["bitrate"] = ""
            # get audio samplerate metadata
            result["samplerate"] = (
                filtered_audio_samplerate[0].split(", ")[1] if filtered_audio_samplerate else ""
            )
        return result if result and (len(result) == 2) else {}

    def __extract_resolution_framerate(
        self, default_stream: int = 0, extract_output: bool = False
    ) -> dict[str, Any]:
        """
        This Internal method parses default video-stream resolution, orientation, and framerate from metadata.

        Parameters:
            default_stream (int): selects specific video-stream in case of multiple ones.
            extract_output (bool): Whether to extract from output(if true) or input(if false) stream?

        **Returns:** Default Video resolution and framerate as dictionary value.
        """
        identifiers = ["Video:", "Stream #"]
        # use output metadata if available
        meta_text = (
            [
                line.strip()
                for line in self.__ffsp_output.split("\n")
                if all(x in line for x in identifiers)
            ]
            if not extract_output
            else [
                line.strip()
                for line in self.__metadata_output.split("\n")
                if all(x in line for x in identifiers)
            ]
        )
        # extract video orientation metadata if available
        identifiers_orientation = ["displaymatrix:", "rotation"]
        meta_text_orientation = (
            [
                line.strip()
                for line in self.__ffsp_output.split("\n")
                if all(x in line for x in identifiers_orientation)
            ]
            if not extract_output
            else [
                line.strip()
                for line in self.__metadata_output.split("\n")
                if all(x in line for x in identifiers_orientation)
            ]
        )
        # use metadata if available
        result = {}
        if meta_text:
            selected_stream = meta_text[
                (default_stream if default_stream > 0 and default_stream < len(meta_text) else 0)
            ]

            # filter data
            filtered_resolution = re.findall(r"([1-9]\d+)x([1-9]\d+)", selected_stream.strip())
            filtered_framerate = re.findall(r"\d+(?:\.\d+)?\sfps", selected_stream.strip())
            filtered_tbr = re.findall(r"\d+(?:\.\d+)?\stbr", selected_stream.strip())

            # extract framerate metadata
            if filtered_framerate:
                # calculate actual framerate
                result["framerate"] = float(re.findall(r"[\d\.\d]+", filtered_framerate[0])[0])
            elif filtered_tbr:
                # guess from TBR(if fps unavailable)
                result["framerate"] = float(re.findall(r"[\d\.\d]+", filtered_tbr[0])[0])

            # extract resolution metadata
            if filtered_resolution:
                result["resolution"] = [int(x) for x in filtered_resolution[0]]

            # extract video orientation metadata
            if meta_text_orientation:
                selected_stream = meta_text_orientation[
                    (
                        default_stream
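                        if default_stream > 0 and default_stream < len(meta_text_orientation)
                        else 0
                    )
                ]
                filtered_orientation = re.findall(r"[-]?\d+\.\d+", selected_stream.strip())
                # fall back to zero rotation if no angle could be parsed
                result["orientation"] = (
                    float(filtered_orientation[0]) if filtered_orientation else 0.0
                )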
            else:
                result["orientation"] = 0.0

        return result if result and (len(result) == 3) else {}

    def __extract_duration(self, inseconds: bool = True) -> float | list[str]:
        """
        This Internal method parses stream duration from metadata.

        Parameters:
            inseconds (bool): whether to parse time in second(s) or `HH:mm:ss`?

        **Returns:** Stream duration in seconds as float value, or as a list of `HH:mm:ss` strings if `inseconds=False`.
        """
        identifiers = ["Duration:"]
        stripped_data = [
            line.strip()
            for line in self.__ffsp_output.split("\n")
            if all(x in line for x in identifiers)
        ]
        if stripped_data:
            t_duration = re.findall(
                r"(?:[01]\d|2[0123]):(?:[012345]\d):(?:[012345]\d+(?:\.\d+)?)",
                stripped_data[0],
            )
            if t_duration:
                return (
                    sum(float(x) * 60**i for i, x in enumerate(reversed(t_duration[0].split(":"))))
                    if inseconds
                    else t_duration
                )
        return 0
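
The duration parsing at the end of the listing above can be tried in isolation. The following is a minimal sketch, assuming a `Duration:` line shaped like FFmpeg's usual console output; the function name `duration_in_seconds` and the sample line are hypothetical and not part of the Sourcer API.

```python
import re

# Same pattern the listing above uses to find an `HH:MM:SS(.fraction)` timestamp.
DURATION_PATTERN = r"(?:[01]\d|2[0123]):(?:[012345]\d):(?:[012345]\d+(?:\.\d+)?)"


def duration_in_seconds(ffmpeg_line: str) -> float:
    """Return the duration found in `ffmpeg_line` in seconds, or 0.0 if absent."""
    matches = re.findall(DURATION_PATTERN, ffmpeg_line)
    if not matches:
        return 0.0
    # Reverse to [seconds, minutes, hours] and weight each field by 60**i.
    fields = reversed(matches[0].split(":"))
    return float(sum(float(value) * 60**i for i, value in enumerate(fields)))


# Hypothetical sample line (an assumption, not captured from a real run).
sample = "Duration: 00:01:30.50, start: 0.000000, bitrate: 1205 kb/s"
print(duration_in_seconds(sample))
```

Note that the hour field of the pattern only accepts values from `00` to `23`, so a source longer than a day would not match and would fall back to zero.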

enumerate_devices property

A property object that enumerates the names of all probed Camera Devices connected to your system, along with their respective "device indexes" or "camera indexes", as a python dictionary.

Returns: Probed Camera Devices as python dictionary.
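
The dictionary returned here is keyed by device index, and the source listing accepts negative indexes as well. Below is a minimal sketch of that index handling, assuming a plain list of device names; `resolve_device_index` and the device names are hypothetical and only illustrate the range check.

```python
def resolve_device_index(devices: list, index: int) -> int:
    """Map a possibly negative `index` onto a non-negative device index."""
    # Valid values run from -len(devices) up to len(devices) - 1.
    valid_indexes = range(-len(devices), len(devices))
    if not devices or index not in valid_indexes:
        raise ValueError(f"`{index}` is not a valid device index.")
    return index if index >= 0 else len(devices) + index


# Hypothetical device names (an assumption, not probed from a real system).
devices = ["Integrated Camera", "USB2.0 Camera"]
print(dict(enumerate(devices)))
print(resolve_device_index(devices, -1))
```

With two devices, `-1` refers to the last device, in the same way negative indexes work on Python lists.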

__init__(source, source_demuxer=None, custom_ffmpeg='', verbose=False, **sourcer_params)

This constructor method initializes the object state and attributes of the Sourcer Class.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| source | (str, list) | defines the input(-i) source filename/URL/device-name/device-path, or a list of them for multi-input. | required |
| source_demuxer | (str, list) | specifies the demuxer(-f) for the input source. | None |
| custom_ffmpeg | str | assigns the location of custom path/directory for custom FFmpeg executable. | '' |
| verbose | bool | enables/disables verbose. | False |
| sourcer_params | dict | provides the flexibility to control supported internal and FFmpeg parameters. | {} |

Source code in deffcode/sourcer.py
def __init__(
    self,
    source: str | list[str],
    source_demuxer: str | list[str] | None = None,
    custom_ffmpeg: str = "",
    verbose: bool = False,
    **sourcer_params: Any,
) -> None:
    """
    This constructor method initializes the object state and attributes of the Sourcer Class.

    Parameters:
        source (str, list): defines the input(`-i`) source filename/URL/device-name/device-path, or a list of them for multi-input.
        source_demuxer (str, list): specifies the demuxer(`-f`) for the input source, or one demuxer per listed source.
        custom_ffmpeg (str): assigns the location of custom path/directory for custom FFmpeg executable.
        verbose (bool): enables/disables verbose.
        sourcer_params (dict): provides the flexibility to control supported internal and FFmpeg parameters.
    """
    # checks machine OS
    self.__machine_OS = platform.system()

    # define internal parameters
    self.__verbose_logs = (  # enable verbose if specified
        verbose if (verbose and isinstance(verbose, bool)) else False
    )

    # handle metadata received
    self.__ffsp_output = None

    # sanitize sourcer_params
    self.__sourcer_params = {
        str(k).strip(): (
            str(v).strip() if not isinstance(v, (dict, list, int, float, tuple)) else v
        )
        for k, v in sourcer_params.items()
    }

    # handle whether to force validate source
    self.__forcevalidatesource = self.__sourcer_params.pop("-force_validate_source", False)
    if not isinstance(self.__forcevalidatesource, bool):
        # reset improper values
        self.__forcevalidatesource = False

    # sanitize externally accessible parameters and setup list mapping
    self.__is_multi = isinstance(source, list)
    self.__source_list = source if self.__is_multi else [source]

    # validate source list early so downstream errors stay coherent
    if self.__is_multi and not self.__source_list:
        raise ValueError("Input `source` list is empty!")

    # handle user-defined FFmpeg prefix parameters such as `-re` (must be a list)
    _prefixes = self.__sourcer_params.pop("-ffprefixes", [])
    if not isinstance(_prefixes, list):
        # log it
        logger.warning(
            "Discarding invalid `-ffprefixes` value of wrong type `{}`!".format(
                type(_prefixes).__name__
            )
        )
        # reset improper values
        _prefixes = []

    if self.__is_multi:
        # multi-input requires per-source list-of-lists with matching length
        # to keep prefix routing unambiguous; flat lists are rejected.
        if _prefixes:
            if not all(isinstance(p, list) for p in _prefixes):
                raise ValueError(
                    "Multi-input `-ffprefixes` must be a list of per-input lists "
                    "(e.g. `[['-re'], ['-stream_loop', '-1']]`). "
                    "Flat lists are ambiguous in multi-input mode."
                )
            if len(_prefixes) != len(self.__source_list):
                raise ValueError(
                    "`-ffprefixes` length ({}) must match `source` list length ({})!".format(
                        len(_prefixes), len(self.__source_list)
                    )
                )
            self.__ffmpeg_prefixes_list = _prefixes
            self.__ffmpeg_prefixes = _prefixes[0]
        else:
            self.__ffmpeg_prefixes_list = [[] for _ in self.__source_list]
            self.__ffmpeg_prefixes = []
    else:
        # single-input keeps the original flat-list contract; nested lists are
        # only meaningful in multi-input mode, so reject them with a warning.
        if any(isinstance(p, list) for p in _prefixes):
            logger.warning(
                "Nested lists in `-ffprefixes` are only supported for multi-input sources. Discarding!"
            )
            _prefixes = []
        self.__ffmpeg_prefixes = _prefixes
        self.__ffmpeg_prefixes_list = [_prefixes]

    # handle source_demuxer list mapping
    if self.__is_multi:
        if isinstance(source_demuxer, list):
            if len(source_demuxer) != len(self.__source_list):
                raise ValueError(
                    "`source_demuxer` length ({}) must match `source` list length ({})!".format(
                        len(source_demuxer), len(self.__source_list)
                    )
                )
            self.__source_demuxer_list = source_demuxer
        else:
            self.__source_demuxer_list = [source_demuxer] * len(self.__source_list)
    else:
        self.__source_demuxer_list = [source_demuxer]

    # initialize per-source metadata buffer so retrieve_metadata can be
    # called safely (e.g. via the recursive primary probe in probe_stream)
    # without polluting the result with stale `sources` keys.
    self.__multi_source_metadata: list[Any] = []

    # handle where to save the downloaded FFmpeg Static assets on Windows(if specified)
    __ffmpeg_download_path = self.__sourcer_params.pop("-ffmpeg_download_path", "")
    if not isinstance(__ffmpeg_download_path, str):
        # reset improper values
        __ffmpeg_download_path = ""

    # validate the FFmpeg assets and return location (also downloads static assets on windows)
    self.__ffmpeg = get_valid_ffmpeg_path(
        str(custom_ffmpeg),
        self.__machine_OS == "Windows",
        ffmpeg_download_path=__ffmpeg_download_path,
        verbose=self.__verbose_logs,
    )

    # check if valid FFmpeg path returned
    if self.__ffmpeg:
        self.__verbose_logs and logger.debug(
            "Found valid FFmpeg executable: `{}`.".format(self.__ffmpeg)
        )
    else:
        # else raise error
        raise RuntimeError(
            "[DeFFcode:ERROR] :: Failed to find FFmpeg assets on this system. Kindly compile/install FFmpeg or provide a valid custom FFmpeg binary path!"
        )

    # sanitize externally accessible parameters and assign them
    # Use primary index 0 for fallback properties validation
    if not self.__source_list:
        raise ValueError("Input `source` parameter is empty!")
    source = self.__source_list[0]
    source_demuxer = self.__source_demuxer_list[0]

    # handles source demuxer
    if source is None:
        # first check if source value is empty
        # raise error if true
        raise ValueError("Input `source` parameter is empty!")
    elif isinstance(source_demuxer, str):
        # assign if valid demuxer value
        self.__source_demuxer = source_demuxer.strip().lower()
        # "auto" demuxer is only valid with a device index as source
        assert self.__source_demuxer != "auto" or validate_device_index(source), (
            "Invalid `source_demuxer='auto'` value detected with source: `{}`. Aborting!".format(
                source
            )
        )
    else:
        # otherwise find valid default source demuxer value
        # enforce "auto" if valid index device
        self.__source_demuxer = "auto" if validate_device_index(source) else None
        # log if given demuxer value was discarded for being of invalid type
        self.__verbose_logs and source_demuxer is not None and logger.warning(
            "Discarding invalid `source_demuxer` parameter value of wrong type: `{}`".format(
                type(source_demuxer).__name__
            )
        )
        # log if valid index device and "auto" demuxer is enforced
        self.__verbose_logs and self.__source_demuxer == "auto" and logger.critical(
            "Given source `{}` is a valid device index. Enforcing 'auto' demuxer.".format(
                source
            )
        )

    # handles source stream
    self.__source = source

    # creates shallow copy for further usage #TODO
    self.__source_org = copy.copy(self.__source)
    self.__source_demuxer_org = copy.copy(self.__source_demuxer)

    # handles all extracted devices names/paths list
    # when source_demuxer = "auto"
    self.__extracted_devices_list: list[Any] = []

    # various source stream params
    self.__default_video_resolution = ""  # handles stream resolution
    self.__default_video_orientation = ""  # handles stream's video orientation
    self.__default_video_framerate = ""  # handles stream framerate
    self.__default_video_bitrate = ""  # handles stream's video bitrate
    self.__default_video_pixfmt = ""  # handles stream's video pixfmt
    self.__default_video_decoder = ""  # handles stream's video decoder
    self.__default_source_duration = ""  # handles stream's video duration
    self.__approx_video_nframes = ""  # handles approx stream frame number
    self.__default_audio_bitrate = ""  # handles stream's audio bitrate
    self.__default_audio_samplerate = ""  # handles stream's audio samplerate

    # handle various stream flags
    self.__contains_video = False  # contains video
    self.__contains_audio = False  # contains audio
    self.__contains_images = False  # contains image-sequence

    # handles output parameters through filters
    self.__metadata_output = None  # handles output stream metadata
    self.__output_frames_resolution = ""  # handles output stream resolution
    self.__output_framerate = ""  # handles output stream framerate
    self.__output_frames_pixfmt = ""  # handles output frame pixel format
    self.__output_orientation = ""  # handles output frame orientation

    # check whether metadata probed or not?
    self.__metadata_probed = False
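
The `-ffprefixes` handling in the constructor above follows two different contracts: a flat list for a single source, and a list of per-input lists for a multi-input source. The following is a minimal standalone sketch of those rules, not the library's code; `normalize_prefixes` is a hypothetical helper and the file names are placeholders.

```python
def normalize_prefixes(source, prefixes):
    """Return one prefix list per input, following the rules in the constructor."""
    sources = source if isinstance(source, list) else [source]
    if not isinstance(prefixes, list):
        prefixes = []  # wrong type: the value is discarded
    if isinstance(source, list):
        if not prefixes:
            return [[] for _ in sources]
        if not all(isinstance(p, list) for p in prefixes):
            raise ValueError("Multi-input prefixes must be a list of per-input lists.")
        if len(prefixes) != len(sources):
            raise ValueError("Prefix list length must match the source list length.")
        return prefixes
    # Single input: nested lists are discarded, flat lists are kept as-is.
    if any(isinstance(p, list) for p in prefixes):
        prefixes = []
    return [prefixes]


print(normalize_prefixes("foo.mp4", ["-re"]))
print(normalize_prefixes(["a.mp4", "b.mp4"], [["-re"], ["-stream_loop", "-1"]]))
```

Judging by the constructor signature, the second call would correspond to passing a two-item `source` list together with `**{"-ffprefixes": [["-re"], ["-stream_loop", "-1"]]}`.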

probe_stream(default_stream_indexes=(0, 0))

This method parses/probes the FFmpeg subprocess pipe's standard output for the given input source and populates the information in private class variables.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| default_stream_indexes | (list, tuple) | selects specific video and audio stream index in case of multiple ones. Value can be of format: (int,int). For example (0,1) is ("0th video stream", "1st audio stream"). | (0, 0) |

Returns: Reference to the instance object.
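
To illustrate how a stream index selects one of several streams, here is a minimal sketch that mirrors the selection and pattern matching in the source listing, assuming FFmpeg-style `Stream #` lines. The function `parse_video_stream` and the sample text are hypothetical; the real method stores its results in private class variables instead of returning them.

```python
import re


def parse_video_stream(ffmpeg_output: str, default_stream: int = 0) -> dict:
    """Pick one `Video:` stream line and parse its resolution and framerate."""
    lines = [
        line.strip()
        for line in ffmpeg_output.split("\n")
        if "Stream #" in line and "Video:" in line
    ]
    if not lines:
        return {}
    # Out-of-range indexes fall back to the first video stream.
    selected = lines[default_stream if 0 < default_stream < len(lines) else 0]
    resolution = re.findall(r"([1-9]\d+)x([1-9]\d+)", selected)
    framerate = re.findall(r"(\d+(?:\.\d+)?)\sfps", selected)
    result = {}
    if resolution:
        result["resolution"] = [int(x) for x in resolution[0]]
    if framerate:
        result["framerate"] = float(framerate[0])
    return result


# Hypothetical FFmpeg output with two video streams (an assumption).
sample = """
  Stream #0:0: Video: h264 (High), yuv420p, 1280x720, 1205 kb/s, 25 fps, 25 tbr
  Stream #0:1: Video: mjpeg, yuvj420p, 640x480, 30 fps, 30 tbr
"""
print(parse_video_stream(sample, default_stream=1))
```

An index beyond the number of available video streams silently selects the first stream rather than raising an error, which matches the fallback in the listing.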

Source code in deffcode/sourcer.py
def probe_stream(self, default_stream_indexes: list[int] | tuple[int, int] = (0, 0)) -> Sourcer:
    """
    This method parses/probes the FFmpeg `subprocess` pipe's standard output for the given input source and populates the information in private class variables.

    Parameters:
        default_stream_indexes (list, tuple): selects specific video and audio stream index in case of multiple ones. Value can be of format: `(int,int)`. For example `(0,1)` is ("0th video stream", "1st audio stream").

    **Returns:** Reference to the instance object.
    """
    assert (
        isinstance(default_stream_indexes, (list, tuple))
        and len(default_stream_indexes) == 2
        and all(isinstance(x, int) for x in default_stream_indexes)
    ), "Invalid default_stream_indexes value!"
    # validate source and extract metadata
    self.__ffsp_output = self.__validate_source(
        self.__source,
        source_demuxer=self.__source_demuxer,
        forced_validate=(self.__forcevalidatesource if self.__source_demuxer is None else True),
    )
    # parse resolution and framerate
    video_rfparams = self.__extract_resolution_framerate(
        default_stream=default_stream_indexes[0]
    )
    if video_rfparams:
        self.__default_video_resolution = video_rfparams["resolution"]
        self.__default_video_framerate = video_rfparams["framerate"]
        self.__default_video_orientation = video_rfparams["orientation"]

    # parse output parameters through filters (if available)
    if self.__metadata_output is not None:
        # parse output resolution and framerate
        out_video_rfparams = self.__extract_resolution_framerate(
            default_stream=default_stream_indexes[0], extract_output=True
        )
        if out_video_rfparams:
            self.__output_frames_resolution = out_video_rfparams["resolution"]
            self.__output_framerate = out_video_rfparams["framerate"]
            self.__output_orientation = out_video_rfparams["orientation"]
        # parse output pixel-format
        self.__output_frames_pixfmt = self.__extract_video_pixfmt(
            default_stream=default_stream_indexes[0], extract_output=True
        )

    # parse pixel-format
    self.__default_video_pixfmt = self.__extract_video_pixfmt(
        default_stream=default_stream_indexes[0]
    )

    # parse video decoder
    self.__default_video_decoder = self.__extract_video_decoder(
        default_stream=default_stream_indexes[0]
    )
    # parse rest of metadata
    if not self.__contains_images:
        # parse video bitrate
        self.__default_video_bitrate = self.__extract_video_bitrate(
            default_stream=default_stream_indexes[0]
        )
        # parse audio bitrate and samplerate
        audio_params = self.__extract_audio_bitrate_nd_samplerate(
            default_stream=default_stream_indexes[1]
        )
        if audio_params:
            self.__default_audio_bitrate = audio_params["bitrate"]
            self.__default_audio_samplerate = audio_params["samplerate"]
        # parse video duration
        self.__default_source_duration = self.__extract_duration()
        # calculate all flags
        if (
            self.__default_video_bitrate
            or (self.__default_video_framerate and self.__default_video_resolution)
        ) and (self.__default_audio_bitrate or self.__default_audio_samplerate):
            self.__contains_video = True
            self.__contains_audio = True
        elif self.__default_video_bitrate or (
            self.__default_video_framerate and self.__default_video_resolution
        ):
            self.__contains_video = True
        elif self.__default_audio_bitrate or self.__default_audio_samplerate:
            self.__contains_audio = True
        else:
            raise ValueError(
                "Invalid source with no decodable audio or video stream provided. Aborting!"
            )
    # calculate approximate number of video frame
    if self.__default_video_framerate and self.__default_source_duration:
        self.__approx_video_nframes = np.rint(
            self.__default_video_framerate * self.__default_source_duration
        ).astype(int, casting="unsafe")

    # signal metadata has been probed
    self.__metadata_probed = True

    if self.__is_multi:
        # collect per-source metadata for the `sources` key. The primary
        # source's flat metadata is captured first via retrieve_metadata;
        # the guard inside retrieve_metadata (checks for non-empty
        # __multi_source_metadata) prevents `sources: []` self-pollution.
        self.__multi_source_metadata.append(self.retrieve_metadata(force_retrieve_missing=True))
        for idx in range(1, len(self.__source_list)):
            _src = self.__source_list[idx]
            _demux = self.__source_demuxer_list[idx]
            _prefixes = self.__ffmpeg_prefixes_list[idx]
            _params = self.__sourcer_params.copy()
            _params["-ffprefixes"] = _prefixes
            # spawn an independent single-source Sourcer per extra input;
            # this reuses the resolved ffmpeg path and isolates per-source
            # parsing state (which probe_stream otherwise clobbers).
            # Resolve to an absolute path: on Unix `self.__ffmpeg` may be
            # the bare command "ffmpeg" found via PATH, which the nested
            # `get_valid_ffmpeg_path()` would reject as "not a file".
            _custom_ffmpeg = (
                self.__ffmpeg
                if self.__ffmpeg and os.path.isfile(self.__ffmpeg)
                else (shutil.which(self.__ffmpeg) or "")
            )
            _s = Sourcer(
                _src,
                source_demuxer=_demux,
                custom_ffmpeg=_custom_ffmpeg,
                verbose=self.__verbose_logs,
                **_params,
            )
            _s.probe_stream(default_stream_indexes)
            self.__multi_source_metadata.append(
                _s.retrieve_metadata(force_retrieve_missing=True)
            )

    # return reference to the instance object.
    return self
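
The approximate frame count computed above is the rounded product of framerate and duration. The sketch below shows that arithmetic on its own, with Python's built-in `round` standing in for `np.rint`; the helper name `approx_frame_count`, the `ffmpeg_args` parameter, and returning `None` for unusable input are assumptions made for illustration.

```python
def approx_frame_count(framerate, duration_sec, ffmpeg_args=()):
    """Estimate the frame count, or return None when it cannot be trusted."""
    if not framerate or not duration_sec:
        return None
    # Looping flags (e.g. `-stream_loop`) make a duration-based estimate meaningless.
    if any("loop" in str(arg) for arg in ffmpeg_args):
        return None
    return round(framerate * duration_sec)


print(approx_frame_count(29.97, 10.0))
print(approx_frame_count(25.0, 4.0, ["-stream_loop", "-1"]))
```

The value is only an estimate, since it relies on the reported duration and a constant framerate.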

retrieve_metadata(pretty_json=False, force_retrieve_missing=False)

This method returns Parsed/Probed Metadata of the given source.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| pretty_json | bool | whether to return metadata as JSON string(if True) or Dictionary(if False) type? | False |
| force_retrieve_missing | bool | whether to also return metadata missing in current Pipeline. This method returns (metadata, metadata_missing) tuple if force_retrieve_missing=True instead of metadata. | False |

Returns: metadata or (metadata, metadata_missing), formatted as JSON string or python dictionary.
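
Because the return shape depends on both parameters, calling code may want to normalize it. The following is a minimal sketch of such a consumer-side helper, assuming that JSON strings appear when `pretty_json=True` and a two-item tuple when `force_retrieve_missing=True`; `unpack_metadata` and the sample values are hypothetical and not part of the Sourcer API.

```python
import json


def unpack_metadata(result):
    """Normalize either return shape into `(metadata, metadata_missing)` dicts."""

    def to_dict(value):
        # JSON strings are decoded; dictionaries are copied as-is.
        return json.loads(value) if isinstance(value, str) else dict(value)

    if isinstance(result, tuple):
        metadata, missing = result
        return to_dict(metadata), to_dict(missing)
    return to_dict(result), {}


# Hypothetical stand-ins for what `retrieve_metadata()` might hand back.
as_dict = {"source": "foo.mp4", "source_video_framerate": 25.0}
as_json_pair = (json.dumps(as_dict), json.dumps({"source_demuxer": ""}))

print(unpack_metadata(as_dict))
print(unpack_metadata(as_json_pair))
```

Normalizing once at the call site keeps later code independent of which flags were used when the metadata was retrieved.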

Source code in deffcode/sourcer.py
def retrieve_metadata(
    self, pretty_json: bool = False, force_retrieve_missing: bool = False
) -> dict[str, Any] | str | tuple[dict[str, Any] | str, dict[str, Any] | str]:
    """
    This method returns Parsed/Probed Metadata of the given source.

    Parameters:
        pretty_json (bool): whether to return metadata as a JSON string (if `True`) or as a dictionary (if `False`).
        force_retrieve_missing (bool): whether to also return the metadata missing from the current pipeline. When `True`, this method returns a `(metadata, metadata_missing)` tuple instead of `metadata` alone.

    **Returns:** `metadata` or `(metadata, metadata_missing)`, each formatted as a JSON string or a Python dictionary.
    """
    # check if metadata has been probed or not
    assert self.__metadata_probed, (
        "Source metadata has not been probed yet! Check that you called the `probe_stream()` method."
    )
    # log it
    self.__verbose_logs and logger.debug("Extracting Metadata...")
    # create metadata dictionary from information populated in private class variables
    metadata = {
        "ffmpeg_binary_path": self.__ffmpeg,
        "source": self.__source,
    }
    metadata_missing = {}
    # Only either `source_demuxer` or `source_extension` attribute can be
    # present in metadata.
    if self.__source_demuxer is None:
        metadata.update({"source_extension": os.path.splitext(self.__source)[-1]})
        # update missing
        force_retrieve_missing and metadata_missing.update({"source_demuxer": ""})
    else:
        metadata.update({"source_demuxer": self.__source_demuxer})
        # update missing
        force_retrieve_missing and metadata_missing.update({"source_extension": ""})
    # add source video metadata properties
    metadata.update(
        {
            "source_video_resolution": self.__default_video_resolution,
            "source_video_pixfmt": self.__default_video_pixfmt,
            "source_video_framerate": self.__default_video_framerate,
            "source_video_orientation": self.__default_video_orientation,
            "source_video_decoder": self.__default_video_decoder,
            "source_duration_sec": self.__default_source_duration,
            "approx_video_nframes": (
                int(self.__approx_video_nframes)
                if self.__approx_video_nframes
                and not any(
                    "loop" in x for x in self.__ffmpeg_prefixes
                )  # check if any loops in prefix
                and not any(
                    "loop" in x for x in dict2Args(self.__sourcer_params)
                )  # check if any loops in filters
                else None
            ),
            "source_video_bitrate": self.__default_video_bitrate,
            "source_audio_bitrate": self.__default_audio_bitrate,
            "source_audio_samplerate": self.__default_audio_samplerate,
            "source_has_video": self.__contains_video,
            "source_has_audio": self.__contains_audio,
            "source_has_image_sequence": self.__contains_images,
        }
    )
    # add output metadata properties (if available)
    if self.__metadata_output is not None:
        metadata.update(
            {
                "output_frames_resolution": self.__output_frames_resolution,
                "output_frames_pixfmt": self.__output_frames_pixfmt,
                "output_framerate": self.__output_framerate,
                "output_orientation": self.__output_orientation,
            }
        )
    else:
        # Output stream metadata is only available when additional FFmpeg
        # parameters (such as filters) are defined manually. Otherwise, the
        # output properties are reported as missing (when requested), with the
        # counterpart source stream values assigned as fallbacks.
        force_retrieve_missing and metadata_missing.update(
            {
                "output_frames_resolution": self.__default_video_resolution,
                "output_frames_pixfmt": self.__default_video_pixfmt,
                "output_framerate": self.__default_video_framerate,
                "output_orientation": self.__default_video_orientation,
            }
        )

    # Only emit the `sources` key after per-source metadata is populated.
    # probe_stream() calls retrieve_metadata() once for the primary input
    # *before* populating __multi_source_metadata; without this guard the
    # primary's per-source dict would carry a stray empty `sources: []`
    # field that pollutes metadata["sources"][0].
    if self.__is_multi and self.__multi_source_metadata:
        metadata["sources"] = [m[0] for m in self.__multi_source_metadata]
        force_retrieve_missing and metadata_missing.update(
            {"sources": [m[1] for m in self.__multi_source_metadata]}
        )

    # log it
    self.__verbose_logs and logger.debug("Metadata Extraction completed successfully!")
    # serialize to a JSON string (`json.dumps`) if `pretty_json` is set
    metadata = json.dumps(metadata, indent=2) if pretty_json else metadata
    metadata_missing = (
        json.dumps(metadata_missing, indent=2) if pretty_json else metadata_missing
    )
    # return `metadata` or `(metadata, metadata_missing)`
    return metadata if not force_retrieve_missing else (metadata, metadata_missing)
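
When `force_retrieve_missing=True` and `pretty_json=False`, the second dictionary carries fallback values for keys that are absent from the first (for example, `output_*` keys filled from their source counterparts). One plausible way for a caller to combine the two is sketched below. `apply_fallbacks` is a hypothetical helper, not a DeFFcode function, and the values (including the list form of the resolution) are made up for illustration.

```python
def apply_fallbacks(metadata, metadata_missing):
    """Hypothetical helper: fill absent keys from `metadata_missing`; probed values win."""
    merged = dict(metadata_missing)  # start from the fallbacks
    merged.update(metadata)          # probed values overwrite fallbacks
    return merged


# Made-up values: no filters were defined, so no output_* keys were probed.
metadata = {
    "source": "foo.mp4",
    "source_extension": ".mp4",
    "source_video_resolution": [1280, 720],
    "source_video_framerate": 25.0,
}
metadata_missing = {
    "source_demuxer": "",
    "output_frames_resolution": [1280, 720],
    "output_framerate": 25.0,
}

merged = apply_fallbacks(metadata, metadata_missing)
```

This merge is shallow. For multi-source input both dictionaries contain a `sources` key, so the probed list would replace the per-source fallbacks unless those entries are merged separately.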