Objective The number, location, and crown spread of nursery stock are fundamental data for its scientific management. Traditional nursery stock inventories rely on on-site surveys of individual plants, which are labor-intensive and time-consuming. As a result, researchers have begun to use low-cost, convenient unmanned aerial vehicles (UAVs) to collect nursery stock data on site and to compile nursery stock statistics through technical means such as image processing. During data collection, the number of trees in a single image increases as the flight altitude of the UAV increases. Although anchor boxes can capture more information about each tree, annotating them is very costly when there are many images of densely packed trees. To address tree adhesion and scale differences in UAV-captured nursery stock images while keeping annotation costs low, an improved dense detection and counting model is proposed. The model uses point-labeled data as supervisory signals. Because it obtains the location, size, and count of the targets, the model is called LSC-CNN. Method To increase the diversity of nursery stock samples, a dense nursery stock dataset was constructed from three publicly available tree datasets: the spruce dataset, Yosemite, and KCL-London. A total of 1 520 nursery stock images were acquired and divided into training and testing sets at a ratio of 7:3. To improve the model's adaptability to trees of different scales and to variations in lighting, data augmentation methods such as contrast adjustment and image resizing were applied to the training set. After augmentation, the training set contained 3 192 images and the testing set contained 456 images.
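The dataset sizes reported above can be checked with a short sketch. It assumes the 7:3 split is computed with integer arithmetic and, as an inference from the reported totals rather than a fact stated in the text, that each training image is kept alongside two augmented variants (3 192 = 1 064 × 3). The names `split_counts` and `VARIANTS_PER_IMAGE` are hypothetical.

```python
# Hypothetical helper: reproduce the dataset sizes reported in the text.
TOTAL_IMAGES = 1520              # from the text
TRAIN_PARTS, TEST_PARTS = 7, 3   # 7:3 split, from the text
VARIANTS_PER_IMAGE = 3           # ASSUMPTION: original + 2 augmented copies


def split_counts(total, train_parts, test_parts):
    """Return (n_train, n_test) for a train:test ratio using integer math."""
    n_train = total * train_parts // (train_parts + test_parts)
    return n_train, total - n_train


n_train, n_test = split_counts(TOTAL_IMAGES, TRAIN_PARTS, TEST_PARTS)
n_train_augmented = n_train * VARIANTS_PER_IMAGE

print(n_train, n_test, n_train_augmented)  # prints: 1064 456 3192
```

Under this assumption the testing set is left unaugmented, which matches the reported 456 testing images.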
Because each image contains a large number of trees, the trees were labeled by marking their center points to reduce annotation cost. The LSC-CNN model was selected as the base model for this study. Through point-supervised training, this model can detect the number, location, and size of trees, thereby providing more information about them. The LSC-CNN model was improved to address the missed detections and false positives that occurred during testing. First, to address missed detections caused by severe adhesion among densely packed trees, the last convolutional layer of the feature extraction network was replaced with a dilated convolution. This change enlarges the receptive field of the convolutional kernel on the input while preserving the detailed features of the trees, so the network can capture a broader range of contextual information and better understand the overall scene. Second, the convolutional block attention module (CBAM) was introduced at the beginning of each scale branch. This allowed the network to focus on the key features of trees at different scales and spatial locations, thereby improving the model's sensitivity to multi-scale information. Dilated convolution enlarges the model's receptive field, while the attention mechanism guides the model to focus on the important parts within that receptive field. Combining the two captured contextual information over a larger range and improved recognition of multi-scale dense trees. Finally, the model was trained with a label-smoothing cross-entropy loss function and a grid winner-takes-all strategy, which emphasizes the regions with the highest losses to strengthen tree feature recognition.
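Two of the ingredients described above can be illustrated with a minimal, framework-free sketch: how dilation widens the span of a convolution kernel, and how a label-smoothing cross-entropy loss is computed for one sample. The text gives neither the dilation rate nor the smoothing factor, so the values used here (dilation 2, `epsilon=0.1`) and both function names are assumptions for illustration only, not the authors' settings or implementation.

```python
import math


def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel along one axis: k + (k - 1)(d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def label_smoothing_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy of one sample against a smoothed target distribution.

    ASSUMPTION: uniform smoothing. The true class receives
    (1 - epsilon) + epsilon / K and every other class epsilon / K,
    where K is the number of classes.
    """
    k = len(logits)
    # Numerically stable log-softmax.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_sum for x in logits]
    loss = 0.0
    for i, lp in enumerate(log_probs):
        q = epsilon / k + ((1.0 - epsilon) if i == target else 0.0)
        loss -= q * lp
    return loss


# A 3x3 kernel spans 3 pixels per axis; with dilation 2 it spans 5,
# using the same nine weights.
print(effective_kernel_size(3, 1), effective_kernel_size(3, 2))  # prints: 3 5

# Hypothetical logits for a three-class prediction whose true class is 0.
print(label_smoothing_cross_entropy([2.0, 0.5, -1.0], target=0))
```

Setting `epsilon=0` recovers the ordinary cross-entropy, which is a convenient sanity check when experimenting with the smoothing factor.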
Results and Discussions The mean counting accuracy (MCA), mean absolute error (MAE), and root mean square error (RMSE) were adopted as evaluation metrics. Ablation studies and comparative experiments were designed to demonstrate the performance of the improved LSC-CNN model. The ablation experiments showed that the improved LSC-CNN model effectively resolved the missed detections and false positives of the original LSC-CNN model, which were caused by the high density and large scale variations in the nursery stock dataset. The comparative experiments used the IntegrateNet, PSGCNet, CANet, CSRNet, CLTR, and LSC-CNN models as baselines. The improved LSC-CNN model achieved an MCA, MAE, and RMSE of 91.23%, 14.24, and 22.22, respectively. It combines the advantages of point-supervised learning from density estimation methods with the generation of target bounding boxes from detection methods. Compared with the IntegrateNet, PSGCNet, CANet, CSRNet, CLTR, and LSC-CNN models, the improved LSC-CNN model increased MCA by 6.67%, 2.33%, 6.81%, 5.31%, 2.09%, and 2.34%, respectively; reduced MAE by 21.19, 11.54, 18.92, 13.28, 11.30, and 10.26, respectively; and reduced RMSE by 28.22, 28.63, 26.63, 14.18, 24.38, and 12.15, respectively. These results indicate that the improved LSC-CNN model achieves high counting accuracy and exhibits strong generalization ability. Conclusions The results demonstrate that the improved LSC-CNN model detects and counts trees with greater accuracy, precision, and reliability. This study uses point-labeled data as supervisory signals to count nursery stock and to detect its location and size, providing a reference for downstream crop phenotypic research. It also has practical reference value for inventories of other types of nursery stock.
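The three evaluation metrics named in the Results can be sketched as follows. MAE and RMSE follow their standard definitions. The text does not define MCA, so the form used here, the mean of 1 - |predicted - true| / true expressed as a percentage, is an assumption based on common usage in counting studies. The per-image counts in the example are hypothetical and are not taken from the study.

```python
import math


def mae(pred, true):
    """Mean absolute error between predicted and true per-image counts."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def rmse(pred, true):
    """Root mean square error between predicted and true per-image counts."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def mca(pred, true):
    """Mean counting accuracy in percent.

    ASSUMPTION: MCA = mean(1 - |pred - true| / true) * 100; the source text
    does not give the formula. Every true count must be greater than zero.
    """
    return 100.0 * sum(1.0 - abs(p - t) / t for p, t in zip(pred, true)) / len(true)


# Hypothetical counts for two images (not data from the study).
true_counts = [100, 200]
pred_counts = [90, 210]

print(mae(pred_counts, true_counts))
print(rmse(pred_counts, true_counts))
print(mca(pred_counts, true_counts))
```

Because MAE and RMSE are expressed in trees per image while MCA is normalized by the true count, MCA is the easier of the three to compare across images with very different tree densities.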