
################################################################################
# SPDX-FileCopyrightText: Copyright (c) 2019-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

Prerequisites:
- DeepStreamSDK 8.0
- NVIDIA Triton Inference Server (optional)
- Python 3.12
- Gst-python

To set up Triton Inference Server (optional):
For x86_64 and Jetson Docker:
  1. Use the provided docker container and follow directions for
     Triton Inference Server in the SDK README --
     be sure to prepare the detector models.
  2. Run the Docker container with this Python Bindings directory mapped
  3. Install required Python packages inside the container:
     $ apt update
     $ apt install python3-gi python3-dev python3-gst-1.0 -y
     $ pip3 install pathlib
  4. Build and install pyds bindings:
     Follow the instructions in bindings README in this repo to build and install
     pyds wheel for Ubuntu 24.04
  5. For Triton gRPC setup, follow the instructions at the location below:
     /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app-triton-grpc/README

For Jetson without Docker:
  1. Follow instructions in the DeepStream SDK README to set up
     Triton Inference Server:
     1.1 Compile and install the nvdsinfer_customparser
     1.2 Prepare at least the Triton detector models
  2. Build and install pyds bindings:
     Follow the instructions in bindings README in this repo to build and install
     pyds wheel for Ubuntu 24.04
  3. Clear the GStreamer cache if pipeline creation fails:
     rm ~/.cache/gstreamer-1.0/*
  4. For Triton gRPC setup, follow the instructions at the location below:
     /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app-triton-grpc/README

To set up the PeopleNet model and configs (optional):
Download the PeopleNet model:
  $ mkdir -p /opt/nvidia/deepstream/deepstream/samples/models/peoplenet
  $ cd /opt/nvidia/deepstream/deepstream/samples/models/peoplenet
  $ wget --content-disposition 'https://api.ngc.nvidia.com/v2/models/org/nvidia/team/tao/peoplenet/pruned_quantized_decrypted_v2.3.4/files?redirect=true&path=resnet34_peoplenet_int8.onnx' -O resnet34_peoplenet_int8.onnx
  $ wget --content-disposition 'https://api.ngc.nvidia.com/v2/models/org/nvidia/team/tao/peoplenet/pruned_quantized_decrypted_v2.3.4/files?redirect=true&path=resnet34_peoplenet_int8.txt' -O resnet34_peoplenet_int8.txt
  $ wget --content-disposition 'https://api.ngc.nvidia.com/v2/models/org/nvidia/team/tao/peoplenet/pruned_quantized_decrypted_v2.3.4/files?redirect=true&path=labels.txt' -O labels.txt

Additionally, for Triton and Triton gRPC:
   $ cp config.pbtxt /opt/nvidia/deepstream/deepstream/samples/models/peoplenet
   $ mkdir -p /opt/nvidia/deepstream/deepstream/samples/models/peoplenet/1
   $ /usr/src/tensorrt/bin/trtexec --onnx=/opt/nvidia/deepstream/deepstream/samples/models/peoplenet/resnet34_peoplenet_int8.onnx --fp16 \
   --saveEngine=/opt/nvidia/deepstream/deepstream/samples/models/peoplenet/1/resnet34_peoplenet_int8.onnx_b2_gpu0_fp16.engine \
   --minShapes="input_1:0":1x3x544x960 \
   --optShapes="input_1:0":2x3x544x960 \
   --maxShapes="input_1:0":2x3x544x960

To run:
  $ python3 deepstream_test_3.py -i <uri1> [uri2] ... [uriN] [--no-display] [--silent]
e.g.
  $ python3 deepstream_test_3.py -i file:///home/ubuntu/video1.mp4 file:///home/ubuntu/video2.mp4
  $ python3 deepstream_test_3.py -i rtsp://127.0.0.1/video1 rtsp://127.0.0.1/video2 -s

To run PeopleNet, test3 supports three modes:

  1. nvinfer + peoplenet: this mode uses TensorRT (TRT) for inferencing.

     $ python3 deepstream_test_3.py -i <uri1> [uri2] ... [uriN] --pgie nvinfer -c config_infer_primary_peoplenet.txt [--no-display] [--silent]

  2. nvinferserver + peoplenet: this mode uses Triton for inferencing.

     $ python3 deepstream_test_3.py -i <uri1> [uri2] ... [uriN] --pgie nvinferserver -c config_triton_infer_primary_peoplenet.txt [--no-display] [-s]

  3. nvinferserver (gRPC) + peoplenet: this mode uses Triton gRPC for inferencing.

     $ mkdir -p /opt/nvidia/deepstream/deepstream/samples/models/peoplenet-grpc/peoplenet
     $ cp /opt/nvidia/deepstream/deepstream/samples/models/peoplenet/config.pbtxt /opt/nvidia/deepstream/deepstream/samples/models/peoplenet-grpc/peoplenet/config.pbtxt
     $ cp -a /opt/nvidia/deepstream/deepstream/samples/models/peoplenet/1 /opt/nvidia/deepstream/deepstream/samples/models/peoplenet-grpc/peoplenet/1
     $ tritonserver --model-repository=/opt/nvidia/deepstream/deepstream/samples/models/peoplenet-grpc
     $ python3 deepstream_test_3.py -i <uri1> [uri2] ... [uriN] --pgie nvinferserver-grpc -c config_triton_grpc_infer_primary_peoplenet.txt [--no-display] [--silent]
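
Before starting tritonserver, it can help to confirm that the copy commands in
mode 3 produced the expected directory layout. The following is a minimal
sketch, not part of the sample: the function name check_model_repository and
its parameters are hypothetical, and the only layout it assumes is the one
implied by the commands above (config.pbtxt next to a version directory "1"
that holds an .engine file).

```python
from pathlib import Path


def check_model_repository(repo_root, model_name="peoplenet", version="1"):
    """Return a list of problems found in a Triton-style model repository.

    Expected layout (taken from the copy commands in this README):
        <repo_root>/<model_name>/config.pbtxt
        <repo_root>/<model_name>/<version>/<any name>.engine
    """
    model_dir = Path(repo_root) / model_name
    problems = []

    # The model configuration must sit directly in the model directory.
    if not (model_dir / "config.pbtxt").is_file():
        problems.append(f"missing {model_dir / 'config.pbtxt'}")

    # The version directory must exist and contain at least one engine file.
    version_dir = model_dir / version
    if not version_dir.is_dir():
        problems.append(f"missing version directory {version_dir}")
    elif not any(version_dir.glob("*.engine")):
        problems.append(f"no .engine file in {version_dir}")

    return problems


if __name__ == "__main__":
    repo = "/opt/nvidia/deepstream/deepstream/samples/models/peoplenet-grpc"
    for problem in check_model_repository(repo):
        print(problem)
```

An empty result means the layout matches; it does not validate the contents of
config.pbtxt or the engine itself.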

e.g.
  $ python3 deepstream_test_3.py -i file:///home/ubuntu/video1.mp4 file:///home/ubuntu/video2.mp4 --pgie nvinfer -c config_infer_primary_peoplenet.txt --no-display --silent
  $ python3 deepstream_test_3.py -i rtsp://127.0.0.1/video1 rtsp://127.0.0.1/video2 --pgie nvinferserver -c config_triton_infer_primary_peoplenet.txt -s
  $ python3 deepstream_test_3.py -i rtsp://127.0.0.1/video1 rtsp://127.0.0.1/video2 --pgie nvinferserver-grpc -c config_triton_grpc_infer_primary_peoplenet.txt --no-display --silent
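
When launching test3 from another script, the command lines above can be
assembled programmatically. This is a hedged sketch only: build_test3_command
is a hypothetical helper, not part of the sample. It uses only the options
shown in this README and enforces that --pgie and -c are given together.

```python
import shlex

# The three inference modes listed in this README.
PGIE_MODES = ("nvinfer", "nvinferserver", "nvinferserver-grpc")


def build_test3_command(uris, pgie=None, config=None,
                        no_display=False, silent=False):
    """Assemble the argv list for deepstream_test_3.py."""
    if not uris:
        raise ValueError("at least one input URI is required")
    # --pgie and -c must be provided together for custom models.
    if (pgie is None) != (config is None):
        raise ValueError("--pgie and -c must be provided together")
    if pgie is not None and pgie not in PGIE_MODES:
        raise ValueError(f"unknown pgie mode: {pgie}")

    cmd = ["python3", "deepstream_test_3.py", "-i", *uris]
    if pgie is not None:
        cmd += ["--pgie", pgie, "-c", config]
    if no_display:
        cmd.append("--no-display")
    if silent:
        cmd.append("--silent")
    return cmd


if __name__ == "__main__":
    cmd = build_test3_command(
        ["file:///home/ubuntu/video1.mp4", "file:///home/ubuntu/video2.mp4"],
        pgie="nvinfer",
        config="config_infer_primary_peoplenet.txt",
        no_display=True,
        silent=True,
    )
    # Print a shell-quoted form; pass the list itself to subprocess.run().
    print(shlex.join(cmd))
```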

Note:
1) If --pgie is not specified, test3 uses nvinfer and the default model, not PeopleNet.
2) Both --pgie and -c must be provided for custom models.
3) Configs other than PeopleNet can also be provided using the above approach.
4) The --no-display option disables on-screen video display.
5) The -s/--silent option suppresses verbose output.
6) The --file-loop option loops input files after EOS.
7) The --disable-probe option disables the probe function and uses nvdslogger for perf measurements.
8) To enable Pipeline Latency Measurement, set the environment variable NVDS_ENABLE_LATENCY_MEASUREMENT=1.
9) To enable Component Level Latency Measurement, set the environment variable NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1 in addition to NVDS_ENABLE_LATENCY_MEASUREMENT=1.
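
The two latency variables in notes 8 and 9 can be set for a single run without
changing the parent shell. A minimal sketch follows; latency_env is a
hypothetical helper name, and the only facts it relies on are the variable
names and values given in the notes.

```python
import os


def latency_env(component_level=False, base_env=None):
    """Return a copy of the environment with latency measurement enabled.

    base_env defaults to the current process environment; the caller's
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NVDS_ENABLE_LATENCY_MEASUREMENT"] = "1"
    if component_level:
        # Component-level measurement is enabled in addition to the
        # pipeline-level variable, as note 9 requires.
        env["NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT"] = "1"
    return env


# Possible usage (not executed here):
#   import subprocess
#   subprocess.run(
#       ["python3", "deepstream_test_3.py", "-i", "file:///home/ubuntu/video1.mp4"],
#       env=latency_env(component_level=True),
#   )
```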

This document describes the sample deepstream-test3 application. It
demonstrates how to:

 * Use multiple sources in the pipeline.
 * Use a uridecodebin so that any type of input (e.g. RTSP/File), any GStreamer
   supported container format, and any codec can be used as input.
 * Configure the stream-muxer to generate a batch of frames and infer on the
   batch for better resource utilization.
 * Extract the stream metadata, which contains useful information about the
   frames in the batched buffer.
 * Enable latency measurement using a probe function.

Refer to the deepstream-test1 sample documentation for an example of simple
single-stream inference, bounding-box overlay, and rendering.

This sample accepts one or more H.264/H.265 video streams as input. It creates
a source bin for each input and connects the bins to an instance of the
"nvstreammux" element, which forms the batch of frames. The batch of
frames is fed to the primary inference element ("nvinfer" by default, or
"nvinferserver" when selected with --pgie) for batched inferencing. The batched
buffer is composited into a 2D tile array using "nvmultistreamtiler". The rest
of the pipeline is similar to the deepstream-test1 sample.

The "width" and "height" properties must be set on the stream-muxer to set the
output resolution. If the input frame resolution differs from the
stream-muxer's "width" and "height", the input frame is scaled to the muxer's
output resolution.

The stream-muxer waits for a user-defined timeout before forming the batch. The
timeout is set using the "batched-push-timeout" property. If the complete batch
is formed before the timeout is reached, the batch is pushed to the downstream
element. If the timeout is reached before the complete batch can be formed
(which can happen with RTSP sources), the batch is formed from the
available input buffers and pushed. Ideally, the timeout of the stream-muxer
should be set based on the framerate of the fastest source. It can also be set
to -1 to make the stream-muxer wait indefinitely.
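
As a rough illustration of basing the timeout on the fastest source, the sketch
below returns one frame interval of that source. It rests on two assumptions
that are not stated in this README: that "batched-push-timeout" is expressed in
microseconds, and that one frame interval is a sensible starting value. The
function name batched_push_timeout_usec is hypothetical.

```python
def batched_push_timeout_usec(source_fps):
    """Return one frame interval of the fastest source, in microseconds.

    source_fps is an iterable of framerates (frames per second), one per
    input source. Assumes the muxer timeout is given in microseconds.
    """
    fastest = max(source_fps)
    if fastest <= 0:
        raise ValueError("framerates must be positive")
    return round(1_000_000 / fastest)


if __name__ == "__main__":
    # Three sources at 25, 30 and 15 fps: the 30 fps source sets the timeout.
    print(batched_push_timeout_usec([25, 30, 15]))
```

The result would then be passed to the stream-muxer, for example with
streammux.set_property("batched-push-timeout", value) in a Gst-python pipeline.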

The "nvmultistreamtiler" composites streams based on their stream-ids in
row-major order (starting from stream 0, left to right across the top row, then
across the next row, etc.).
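
The row-major placement can be sketched in a few lines. The tile_position
function follows directly from the description above. The tiler_grid function
is an assumption: it picks a near-square grid from the number of sources, which
is one common choice and not necessarily what the sample does. Both names are
hypothetical.

```python
import math


def tiler_grid(num_sources):
    """Pick a near-square (rows, columns) grid that fits all sources.

    Assumed heuristic: rows = floor(sqrt(n)), columns = ceil(n / rows).
    """
    if num_sources < 1:
        raise ValueError("at least one source is required")
    rows = math.isqrt(num_sources)
    columns = math.ceil(num_sources / rows)
    return rows, columns


def tile_position(stream_id, columns):
    """Return (row, column) of a stream in row-major order.

    Stream 0 is top-left; ids increase left to right, then top to bottom.
    """
    return divmod(stream_id, columns)


if __name__ == "__main__":
    rows, columns = tiler_grid(5)
    for stream_id in range(5):
        print(stream_id, tile_position(stream_id, columns))
```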