Conclusion
Overall, the results of this internship are interesting and promising, even if not conclusive.
Regarding datasets, the new dataset that was created is promising for several reasons. First, its spatial extent can easily be extended, since the raw data it relies on is publicly available and covers the whole of the Netherlands. Second, its quality should keep increasing with future iterations of the images and the point clouds, at the cost of small annotation modifications to account for new trees, felled trees, and tree growth. The diversity of trees and environments in the Netherlands is nowhere near what can be found globally, so it would not be a great dataset for training a global model, but it has the potential to be an excellent playground for testing new methods. Finally, the main drawback of this dataset is the spatial and temporal shifts between the different types of raw data. These shifts have at least proven to be manageable by the deep learning models that were trained here. They are also interesting in their own right, because perfectly aligned RGB images and point clouds are even less likely to be available than the two data sources themselves.
Regarding the model, it is unclear whether having multiple layers of CHM really improves the results. These layers would have the biggest impact on the detection of covered trees, which are a specific case that is harder than the other trees. Since the training dataset was too small, the model overfitted quickly and could not really reach the stage at which it starts learning to find these harder trees. Therefore, more experiments on a bigger dataset, possibly with better augmentation techniques, would be required to get an answer. Besides that, the architecture itself delivered good performance and quickly learns to detect medium and large trees. Some interesting improvements could easily be added to the model, such as the prediction of masks instead of bounding boxes, which only requires changing the detection heads, or the prediction of species. However, these changes would require the dataset to be substantially enriched with species labels and precise delineations for all trees.