A high-performance real-time pipeline for detecting, tracking, and analyzing people in video streams. Processes video at ≥25 FPS and reports analytics that include age and gender estimation.
Abstract: Multi-label image classification, which involves recognizing multiple objects within a single image, is a fundamental task in computer vision. Recently, Visual-Language Models (VLMs) have ...
Abstract: In this paper, we propose an automatic topic-name extraction method that combines explainable AI (XAI) and a large language model (LLM) for text datasets. In the past, topic models such as ...