Wong, Yew Chien (2022) Construction Site Object Detection API Comparative Study. Final Year Project, UTAR.
Abstract
Object detection has been applied widely in the construction industry, albeit not in depth, and there is considerable potential for further implementation. In this project, three object detection models (YOLOv4, Detectron2, and EfficientDet) were coded in Python on Google Colab. A modified image dataset containing 100 images of construction objects was prepared. The object classes were person, hat, vest, and slogan. The image dataset was pre-processed on Roboflow before being fed into the models. Two of the three models produced meaningful results, whereas one required more data to produce results with an acceptable level of accuracy. The more data-hungry model was dissected and analysed. The YOLOv4 model took 1 hour 30 minutes and 4 seconds to train, Detectron2 took 2 hours 42 minutes and 4 seconds, and EfficientDet took 28 minutes and 38 seconds. YOLOv4 took 4143 milliseconds to detect an image, Detectron2 took 742 milliseconds, and EfficientDet took 305 milliseconds. Between YOLOv4 and Detectron2, the longer training time corresponded to a shorter average detection time. EfficientDet was the fastest in both training and detection, but its speed came at the expense of accuracy. The results also suggested that some object detectors were better at detecting specific objects. For instance, YOLOv4 detected hat and slogan better than Detectron2, whereas Detectron2 detected person and vest better. Therefore, it is recommended that different object detectors be used depending on the objective of the detection. Among the three object detectors, the highest accuracies achieved for person, hat, slogan, and vest were 94.44%, 78.00%, 81.25%, and 60.00% respectively. All three models successfully detected small and large objects. YOLOv4 and Detectron2 were suitable for application in the field, but the EfficientDet model was not. Surprisingly, the model was able to detect objects that were left unannotated in the pre-processing phase.
This indicated that the model had received enough data on some classes to make predictions by itself. Despite that, more data should be provided in order to produce a more robust object detector. This also means that better hardware should be acquired to provide more computational power. In this study, the deployment phase was not included because of limited resources. However, this report shows the steps needed to code a working object detector for custom image data. In the future, it is recommended that more object classes be detected and that the model move into the deployment phase so that its real-time accuracy can be studied.
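As a rough check on the timing figures reported in the abstract, the following minimal Python sketch converts each training time to seconds and sets it beside the average detection time per image. Only the numbers come from the abstract; the names `TRAINING_TIME_HMS`, `DETECTION_TIME_MS`, `to_seconds`, and `timing_summary` are illustrative assumptions and are not taken from the project.

```python
# Timing figures as reported in the abstract; all names here are illustrative.
TRAINING_TIME_HMS = {
    "YOLOv4": (1, 30, 4),
    "Detectron2": (2, 42, 4),
    "EfficientDet": (0, 28, 38),
}
DETECTION_TIME_MS = {"YOLOv4": 4143, "Detectron2": 742, "EfficientDet": 305}


def to_seconds(hours, minutes, seconds):
    """Convert an (hours, minutes, seconds) duration to whole seconds."""
    return hours * 3600 + minutes * 60 + seconds


def timing_summary():
    """Return (model, training s, detection ms, images/s) rows sorted by training time."""
    rows = []
    for model, hms in TRAINING_TIME_HMS.items():
        train_s = to_seconds(*hms)
        detect_ms = DETECTION_TIME_MS[model]
        # Throughput is the reciprocal of the per-image detection time.
        rows.append((model, train_s, detect_ms, 1000.0 / detect_ms))
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for model, train_s, detect_ms, images_per_s in timing_summary():
        print(f"{model:12s} train={train_s:5d} s  detect={detect_ms:4d} ms  "
              f"({images_per_s:.2f} images/s)")
```

Sorted by training time (EfficientDet, YOLOv4, Detectron2), the detection times read 305 ms, 4143 ms, and 742 ms, so a longer training time goes with a shorter detection time for YOLOv4 versus Detectron2 but not for EfficientDet.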