

Method based on Separation Confidence Computation and Scale Synthesis Optimization for Real-Time Target Detection in Streetscape Videos

Volume 15, Number 6, June 2019, pp. 1538-1547
DOI: 10.23940/ijpe.19.06.p5.15381547

Jianmin Liu (a, b), Minhua Yang (b), and Jianmei Tan (a)

(a) School of Information and Statistics, Guangxi University of Finance and Economics, Nanning, 530003, China
(b) School of Geosciences and Info-Physics, Central South University, Changsha, 410000, China


(Submitted on March 20, 2019; Revised on April 8, 2019; Accepted on June 6, 2019)


This study proposes a method for the real-time detection and recognition of targets in streetscape videos, based on separation confidence computation and scale synthesis optimization. First, building on the generalization ability of transfer learning, we combine a fine-tuning method suited to non-convex optimization with adaptive moment estimation in high-dimensional space. We then dynamically adjust the learning rates of the parameters using first- and second-moment estimates of the gradient. We establish the framework and implementation steps of the method by combining regularization-term hyperparameter generalization with hard-example mining. We apply the method to detect and recognize targets in high-frame-rate, high-definition streetscape videos. Experiments show that its accuracy and robustness are superior to those of conventional methods.
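The per-parameter learning-rate adjustment described above matches the general shape of adaptive moment estimation (Adam). The following is a minimal sketch of the standard Adam update on a toy quadratic, not the authors' implementation: the function names `adam_step` and `minimize_quadratic`, the hyperparameter defaults, and the objective are all assumptions chosen for illustration.

```python
import math


def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one standard Adam update and return the new (params, m, v).

    m and v are running estimates of the first and second moments of the
    gradient; t is the 1-based step count used for bias correction.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1 - beta1 ** t)             # bias-corrected moments
        v_hat = v_i / (1 - beta2 ** t)
        # The effective step for each parameter is scaled by its own
        # gradient history, so steep and shallow directions are balanced.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def minimize_quadratic(steps=2000, lr=0.01):
    """Minimize the toy objective f(x, y) = (x - 3)^2 + 10 * (y + 1)^2."""
    params, m, v = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        grads = [2 * (params[0] - 3), 20 * (params[1] + 1)]
        params, m, v = adam_step(params, grads, m, v, t, lr=lr)
    return params


if __name__ == "__main__":
    print(minimize_quadratic())
```

On the first step the bias-corrected ratio reduces to the sign of the gradient, so each parameter moves by roughly `lr` regardless of gradient magnitude. That is the sense in which the moment estimates set a separate effective learning rate per parameter.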



