1 State-of-the-art
1.2 Datasets
1.2.1 Requirements
Before presenting the different annotated tree datasets and the reasons why they were not fully usable for the project, let's enumerate the conditions and requirements I was looking for to properly train the model:
- Multiple types of data:
  - Aerial RGB images
  - LiDAR point clouds (preferably aerial)
  - Aerial infrared (CIR) images (optional)
  - Tree crown annotations or bounding boxes
- High-enough resolution:
  - For images, about 25 cm
  - For point clouds, about 10 cm
Here are the explanations for these requirements. Regarding the types of data, RGB images and point clouds are required to experiment on the ability of the model to combine the two very different kinds of information they hold. Infrared data can also improve tree detection, but it was optional for this work because RGB images are enough to study the combination. Regarding tree annotations, it is necessary to have a way to spatially identify each tree individually, using crown contours or simply bounding boxes. Since the model outputs bounding boxes, any other annotation format can easily be converted to bounding boxes, as shown in the sketch below. Finally, the resolution has to be high enough to identify all individual trees, including the smallest ones. For the point clouds especially, the whole idea is to see if and how the topology of the trees can be learnt, using at least the trunks and, if possible, the biggest branches. This is why, even though the two resolutions are not directly comparable, the requirement is stricter for the point clouds.
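As an illustration of this conversion between annotation formats, here is a minimal sketch assuming crown contours stored as polygons with the shapely library; the `crown_to_bbox` helper and the sample coordinates are hypothetical.

```python
from shapely.geometry import Polygon

def crown_to_bbox(crown: Polygon) -> tuple[float, float, float, float]:
    """Reduce a crown contour to an axis-aligned bounding box (xmin, ymin, xmax, ymax)."""
    return crown.bounds  # shapely exposes the envelope of any geometry

# Hypothetical crown contour in map coordinates (metres)
crown = Polygon([(120.0, 480.2), (123.5, 481.0), (124.1, 484.7), (119.2, 483.9)])
print(crown_to_bbox(crown))  # (119.2, 480.2, 124.1, 484.7)
```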
1.2.2 Existing datasets with annotated trees
As explained above, there are quite a lot of requirements to fulfill for a dataset with annotated trees to be suitable for the task. In practice, almost all the available datasets with annotated trees are missing something, as they mainly focus on one kind of raw data (either spectral/hyperspectral images or LiDAR point clouds) and try to make the most out of it, instead of trying to use all the types of data together.
The most comprehensive list of tree annotation datasets was published in OpenForest (Ouaknine et al. 2023). FoMo-Bench (Bountos, Ouaknine, and Rolnick 2023) also lists several interesting datasets, even though most of them can also be found in OpenForest. Without enumerating all of them, there are several kinds of datasets that all have their own flaws regarding the requirements of this work.
Firstly, there are the forest inventories. TALLO (Jucker et al. 2022) is probably the most interesting one in this category, because it contains a lot of spatial information about almost 500K trees, with their locations, their crown radii and their heights. Therefore, everything needed to localize trees is in the dataset. However, I didn’t manage to find RGB images or LiDAR point clouds of the areas where the trees are located, making it impossible to use these annotations to train tree detection.
Secondly, there are the RGB datasets. Two examples of such datasets with high image quality are ReforesTree (Reiersen et al. 2022) and MillionTrees (B. Weinstein 2023). Their only, but major, drawback is that they do not provide any kind of point cloud, which makes them unsuitable for the task.
Thirdly, there are the LiDAR datasets, such as (Kalinicheva et al. 2022) and (Puliti et al. 2023). Similarly to the RGB datasets, they lack one of the data sources needed for the task I worked on. Unlike them, however, they have the advantage that the missing data could be much easier to acquire from another source, since RGB aerial or satellite images are much more common than LiDAR point clouds. This solution was nevertheless abandoned for two main reasons. First, it is often quite challenging to find the exact locations where the point clouds were acquired. Second, even when the location is known, it is often in the middle of a forest where the quality of openly available satellite imagery is very low.
Finally, I also found two datasets with both RGB and LiDAR components. The first one is MDAS (Hu et al. 2023). This benchmark dataset encompasses RGB images, hyperspectral images and Digital Surface Models (DSM). There are however two major flaws. The obvious one is that this dataset was created with land-cover semantic segmentation tasks in mind, so there are no tree annotations. The less obvious one is that a DSM is not a point cloud, even though it is a kind of 3D information and is often derived from a LiDAR point cloud. As a consequence, this substantially limits the ability to experiment with the point cloud.
The only real dataset with both RGB and LiDAR comes from NEON (B. Weinstein, Marconi, and White 2022). This dataset contains exactly the data I was looking for, with RGB images, hyperspectral images and LiDAR point clouds. With 30975 tree annotations spanning multiple varied forests, it is also quite a large dataset. The main reason why I decided not to use it in the end is the quality of the data, which is not bad, but not as good as that of the data available for the Netherlands, which I describe in Section 1.2.3.
1.2.3 Public data
After rejecting all the available datasets I had found, the only remaining solution was to create my own dataset. I won't dive too deep into this process here, as I explain it in Chapter 3. I just want to mention all the publicly available raw data that I used or could have used to create this custom dataset.
For practical reasons, the two countries where I mostly searched for available data are France and the Netherlands. I was looking for three different data types independently:
- RGB (and if possible CIR) images
- LiDAR point clouds
- Tree annotations
These three types of data are available in similar ways in both countries, although the Netherlands have a small edge over France. RGB images are really easy to find in France with the BD ORTHO (Institut national de l'information géographique et forestière (IGN) 2021) and in the Netherlands with the Luchtfotos (Beeldmateriaal Nederland 2024), but the resolution is better in the Netherlands (8 cm vs 20 cm). Infrared (CIR) images are also available in both countries, although for those the resolution is only 25 cm in the Netherlands.
As for LiDAR point clouds, the Netherlands are also ahead of France, because they have already completed their fourth version covering the whole country with AHN4 (Actueel Hoogtebestand Nederland 2020), and are working on the fifth version. In France, data acquisition for the first LiDAR point cloud covering the whole country started a few years ago (Institut national de l'information géographique et forestière (IGN) 2020). It is not finished yet, even though the data is already available for half of the country. The other advantage of the Dutch LiDAR point clouds is that all flights are performed during winter, which allows light beams to penetrate more deeply into the trees and reach trunks and branches. This is not the case in France, where data is acquired throughout the whole year, adding a level of variation and therefore more difficulty.
The part that is missing in both countries is tree annotations. Many municipalities maintain datasets containing information about all the public trees they manage. This is for example the case for Amsterdam (Gemeente Amsterdam 2024) and Bordeaux (Bordeaux Métropole 2024). However, these datasets cannot really be used as ground truth for a custom dataset, for several reasons. First, many of them do not contain coordinates indicating the position of each tree in the city. Then, even those that do contain coordinates are most of the time missing any information from which a bounding box of the tree crown could be deduced. Finally, even if they contained everything, they only cover public trees and miss every tree located on private land. Since public and private areas are intertwined in all cities, any area we try to train the model on would be missing all the private trees, which makes training impossible because images cannot be only partially annotated.
The other tree annotation source that could have been used is the Boomregister (Coöperatief Boomregister U.A. 2014). This work covers the whole of the Netherlands, including public and private trees. However, the precision of the masks is far from perfect, and many trees are missing or incorrectly segmented, especially when they are less than 9 m high or have a crown diameter smaller than 4 m. Even though it is a very impressive piece of work, I therefore decided that it could not be used as training data for deep learning models because of its biases and imperfections. The only remaining solution was thus to annotate trees myself, creating my own dataset of annotated trees from the available data.
1.2.4 Dataset augmentation techniques
When a dataset is too small to train a model, there are several ways of artificially enlarging it.
The most common way is to randomly apply deterministic or random transformations to the data during the training process, in order to generate several unique, different and realistic data instances from one real data instance. A lot of different transformations can be applied to images, divided into two categories: pixel-level and spatial-level (Buslaev et al. 2020). Pixel-level transformations modify the value of individual pixels by applying different filters, such as random noise, color shifts and more complex effects like fog and sun flare. Spatial-level transformations modify the spatial arrangement of the image without changing the pixel values; in other words, they move the pixels in the image. They range from simple rotations and croppings to complex spatial distortions. In the end, all these transformations simply produce one artificial image out of one real image.
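As a minimal sketch of such a transformation pipeline, assuming the albumentations library cited above (Buslaev et al. 2020), pixel-level and spatial-level transformations can be composed while keeping the bounding boxes consistent; the image content, bounding box and label values are hypothetical placeholders.

```python
import numpy as np
import albumentations as A

# Pixel-level (noise, brightness) and spatial-level (flip, crop) transformations,
# each applied with a given probability at every training iteration.
transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.3),
        A.GaussNoise(p=0.2),
        A.RandomCrop(height=512, width=512, p=1.0),  # assumes tiles of at least 512 x 512 pixels
    ],
    # Spatial-level transformations are also applied to the bounding boxes.
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)

image = np.random.randint(0, 255, (1024, 1024, 3), dtype=np.uint8)  # stand-in for a real RGB tile
bboxes = [(34, 50, 120, 140)]  # hypothetical tree bounding box (xmin, ymin, xmax, ymax)
augmented = transform(image=image, bboxes=bboxes, labels=["tree"])
aug_image, aug_bboxes = augmented["image"], augmented["bboxes"]
```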
Another way to enlarge a dataset is to generate completely new input data sharing the same properties as the initial dataset. This can be done using Generative Adversarial Networks (GANs). These models usually have two parts, a generator and a discriminator, which are trained in parallel. The generator learns to produce realistic artificial data, while the discriminator learns to distinguish real data from the artificial data produced by the generator. If the training is successful, the generator can then be used with random seeds to generate random but realistic artificial data similar to the dataset. This method has for example been successfully used to generate artificial tree height maps (Sun et al. 2022).
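To make the adversarial setup more concrete, here is a heavily simplified PyTorch sketch of one training step, with a generator mapping a random seed to an artificial sample and a discriminator scoring real versus artificial data; the network architectures and sizes are placeholders, not the ones used in (Sun et al. 2022).

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 128 * 128  # hypothetical sizes (e.g. a flattened height map)
generator = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_batch: torch.Tensor) -> None:
    batch_size = real_batch.shape[0]
    fake_batch = generator(torch.randn(batch_size, latent_dim))

    # Discriminator step: real samples are labelled 1, generated samples 0.
    d_loss = bce(discriminator(real_batch), torch.ones(batch_size, 1))
    d_loss = d_loss + bce(discriminator(fake_batch.detach()), torch.zeros(batch_size, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator label generated samples as real.
    g_loss = bce(discriminator(fake_batch), torch.ones(batch_size, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```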
However, training GANs can be very unstable, and I have not found any paper applying this technique to generate LiDAR and RGB data at the same time. The artificial instances would need to be consistent between the two types of data, which might be very difficult. Therefore, I only used the random image transformations during the training process, because the chances of successfully training a GAN seemed too low to be worth it.
1.3 Algorithms and models
In this section, the different algorithms and methods are grouped according to the type of data they use as input.
1.3.1 Images only
First, there are methods that perform tree detection using only visible or hyperspectral images, or both. Several algorithms have been developed to analytically delineate tree crowns from images, using the particular shape of trees and its effect on images (Gomes and Maillard 2016). Without diving into the details, here are a few of them. The watershed algorithm treats tree crowns as inverted watersheds in the grey-scale image, and crown frontiers are found by incrementally flooding the watersheds (Vincent and Soille 1991). Local maxima filtering uses the intensity of the pixels in the grey-scale image to identify the locally brightest points and use them as treetops (Wulder, Niemann, and Goodenough 2000). Conversely, the valley-following algorithm uses the darkest pixels, which are considered as the junctions between trees, since shaded areas correspond to the lower parts of the tree crowns (Gougeon et al. 1998). Another interesting algorithm is template matching, which simulates the appearance of simple tree templates under given lighting conditions and tries to identify similar patterns in the grey-scale image (Pollock 1996). Combinations of these techniques and others have also been proposed.
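To give an idea of how these analytical methods translate into practice, here is a minimal sketch, assuming the scikit-image library, of local maxima filtering followed by a watershed segmentation on a grey-scale image; the smoothing radius, minimum distance and Otsu-based foreground mask are arbitrary assumptions, not the exact parameters of the cited works.

```python
import numpy as np
from skimage import filters
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def delineate_crowns(grey: np.ndarray) -> np.ndarray:
    """Return a label image in which each detected crown has its own integer id."""
    smoothed = filters.gaussian(grey, sigma=2)           # reduce within-crown texture
    treetops = peak_local_max(smoothed, min_distance=5)  # locally brightest points used as treetops
    markers = np.zeros(grey.shape, dtype=int)
    markers[tuple(treetops.T)] = np.arange(1, len(treetops) + 1)
    # Flood the inverted image from the treetops: crown frontiers appear
    # where the basins of neighbouring trees meet.
    vegetation = smoothed > filters.threshold_otsu(smoothed)  # rough foreground mask (assumption)
    return watershed(-smoothed, markers, mask=vegetation)
```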
But with the recent developments of deep learning in image analysis, deep learning models are increasingly used to detect trees in RGB images. In some cases, deep learning is used to extract features that then become the input of one of the algorithms described above. One example is the use of two neural networks to predict masks, outlines and distance transforms, which are then fed to a watershed algorithm (Freudenberg, Magdon, and Nölke 2022). In other cases, a deep learning model, often a CNN, is responsible for directly detecting tree masks or bounding boxes from the images (B. G. Weinstein et al. 2020).
1.3.2 LiDAR only
Conversely, some of the methods to identify individual trees use LiDAR data only. There are a lot of different ways to use and analyze point clouds, but the one mostly used for trees is based on height maps, or Canopy Height Models (CHM).
A CHM is a raster computed by subtracting the Digital Terrain Model (DTM) from the Digital Surface Model (DSM). This means that each pixel of a CHM contains the height above ground of the highest point in the corresponding area. The CHM can for example be used as the input raster of the watershed algorithm, as it contains the height values that can be used to determine local maxima (Kwak et al. 2007). The other analytical methods described in the previous section (Section 1.3.1) also have their equivalents using the CHM.
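A minimal sketch of this computation, assuming the rasterio library and DSM/DTM rasters that are already aligned on the same grid; the file names are hypothetical.

```python
import rasterio

# Hypothetical, co-registered rasters (same grid, resolution and extent)
with rasterio.open("dsm.tif") as dsm_src, rasterio.open("dtm.tif") as dtm_src:
    dsm = dsm_src.read(1)
    dtm = dtm_src.read(1)
    profile = dsm_src.profile

chm = dsm - dtm        # height above ground of the highest point per pixel
chm = chm.clip(min=0)  # remove small negative artefacts from interpolation

with rasterio.open("chm.tif", "w", **profile) as dst:
    dst.write(chm, 1)
```

The resulting raster can then directly feed the local maxima or watershed steps sketched in Section 1.3.1.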
A lot of different analytical methods and variations of the simple CHM have been proposed to perform individual tree detection, but in the end, most of them still use the concept of local maxima (Eysn et al. 2015; Wang et al. 2016). A CHM can also be used as the input of any kind of convolutional neural network (CNN), because it has exactly the same shape as an image. This makes it possible to apply a lot of techniques usually used for object detection in images.
Then, even though I finally used an approach similar to the CHM, I want to mention other kinds of deep learning techniques that exist and could potentially leverage all the information contained in a point cloud. These techniques can be divided into two categories: projection-based and point-based methods (Diab, Kashef, and Shaker 2022). The main difference between the two is that projection-based techniques are based on grids, while point-based methods take unstructured point clouds as input. Among projection-based methods, the most basic one is the 2D CNN, which is how a CHM can be processed. Then, multiview representations try to tackle the 3D aspect by projecting the point cloud in multiple directions before merging the projections together. To really deal with 3D data, the volumetric grid representation consists in using 3D occupancy grids, which are processed using 3D CNNs. Among point-based methods, there are methods based on PointNet, which can extract features and perform the classical computer vision tasks by taking point clouds as input. Finally, Convolutional Point Networks use a continuous generalization of convolutions to apply convolution kernels to arbitrarily distributed point clouds.
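As an illustration of the volumetric grid representation mentioned above, here is a minimal sketch, using only numpy, that turns an unstructured point cloud into a binary 3D occupancy grid that a 3D CNN could consume; the voxel size and the random point cloud are arbitrary placeholders.

```python
import numpy as np

def voxelize(points: np.ndarray, voxel_size: float = 0.5) -> np.ndarray:
    """Convert an (N, 3) array of x, y, z coordinates into a binary occupancy grid."""
    origin = points.min(axis=0)
    indices = np.floor((points - origin) / voxel_size).astype(int)
    shape = indices.max(axis=0) + 1
    grid = np.zeros(shape, dtype=np.uint8)
    grid[indices[:, 0], indices[:, 1], indices[:, 2]] = 1  # mark occupied voxels
    return grid

# Hypothetical point cloud of 10 000 points in a 20 m x 20 m x 15 m volume
points = np.random.rand(10_000, 3) * np.array([20.0, 20.0, 15.0])
occupancy = voxelize(points)  # e.g. shape (40, 40, 30) with a 0.5 m voxel size
```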
1.3.3 LiDAR and images
Let’s now talk about the models of interest for this work, which are machine learning pipelines using both LiDAR point cloud data and RGB images.
One example is a pipeline which uses a watershed algorithm to extract crown boundaries, before extracting individual tree features from the LiDAR point cloud, the hyperspectral images and the RGB images (Qin et al. 2022). These features are then fed to a random forest classifier to identify which species each tree belongs to. This pipeline therefore makes the most of all the data to identify species, but sticks to an improved variant of the watershed algorithm, which only uses a CHM raster, for individual tree segmentation.
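A schematic sketch of the classification stage of such a pipeline, assuming per-tree features have already been extracted from the point cloud and the images; the feature matrix and species labels are randomly generated placeholders, not the actual feature set of (Qin et al. 2022).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical per-tree features: height, crown area, mean NIR, point density, ...
features = np.random.rand(500, 6)
species = np.random.randint(0, 4, size=500)  # hypothetical species labels

X_train, X_test, y_train, y_test = train_test_split(features, species, test_size=0.25, random_state=0)
classifier = RandomForestClassifier(n_estimators=200, random_state=0)
classifier.fit(X_train, y_train)
print("held-out accuracy:", classifier.score(X_test, y_test))
```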
Other works focused on end-to-end models able to take both the CHM and the RGB data as input and combine them to make the most of all the available data. Among other examples, there are ACE R-CNN (Li et al. 2022), an evolution of the Mask region-based convolutional neural network (Mask R-CNN), and AMF GD YOLOv8 (Zhong et al. 2024), an evolution of YOLOv8. These two models have been shown to give much better results when using both the images and the LiDAR data as a CHM than when using only one of them.
In this work, I focus on AMF GD YOLOv8, which looks promising and flexible, inheriting the flexibility of YOLOv8 and of the YOLO family of models (Redmon et al. 2016). Indeed, YOLOv8 has different variants allowing it to deal with different computer vision tasks.
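To illustrate this flexibility, here is a minimal sketch assuming the ultralytics package that distributes YOLOv8; the checkpoint names are the standard pretrained ones, the image path is hypothetical, and this only shows the off-the-shelf model, not the AMF GD modifications.

```python
from ultralytics import YOLO

# Different task variants of the same architecture family
detector = YOLO("yolov8n.pt")       # object detection (bounding boxes)
segmenter = YOLO("yolov8n-seg.pt")  # instance segmentation (masks)

results = detector("tile.png")      # hypothetical aerial tile
for box in results[0].boxes:
    print(box.xyxy, box.conf)       # predicted bounding box and confidence
```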