[Significance] Fruit-picking robots are a crucial solution for achieving intelligent fruit harvesting. Although significant progress has been made in developing foundational methods for picking robots, such as fruit recognition, orchard navigation, path planning for picking, and robotic arm control, the practical implementation of a seamless picking system that integrates sensing, movement, and picking capabilities still encounters substantial technical hurdles. In contrast to current picking systems, the next generation of fruit-picking robots aims to replicate the autonomous skills exhibited by human fruit pickers, continuously performing perception, movement, and picking tasks without human intervention. To tackle this challenge, this review examines the latest research methodologies and real-world applications in this field, critically assesses the strengths and limitations of existing methods, and categorizes the essential components of continuous operation into three sub-modules: local target recognition, global mapping, and operation planning. [Progress] First, the review explores methods for recognizing nearby fruit and obstacle targets. These methods encompass four main approaches: low-level feature fusion, high-level feature learning, RGB-D information fusion, and multi-view information fusion. Each of these approaches incorporates advanced algorithms and sensor technologies for cluttered orchard environments. For example, low-level feature fusion uses basic attributes such as color, shape, and texture to distinguish fruits from backgrounds, while high-level feature learning employs more complex models such as convolutional neural networks to interpret the contextual relationships within the data. RGB-D information fusion adds depth perception, allowing robots to gauge the distance to each fruit accurately.
Multi-view information fusion tackles the issue of occlusions by combining data from multiple cameras and sensors around the robot, providing a more comprehensive view of the environment and enabling more reliable sensing. Next, the review shifts focus to orchard mapping and scene comprehension on a broader scale. It points out that current mapping methods, while effective, still struggle with dynamic changes in the orchard, such as variations in the fruits and in lighting conditions. Improved adaptation techniques, possibly through machine learning models that can learn and adjust to different environmental conditions, are suggested as a way forward. Building upon the foundation of local and global perception, the review investigates strategies for planning and controlling autonomous behaviors. These include not only the latest advancements in planning movement paths for robot mobility but also adaptive strategies that allow robots to react to unexpected obstacles or changes in the environment. Enhanced strategies for effective fruit picking using the Eye-in-Hand system involve the development of more dexterous robotic hands and improved algorithms for precisely predicting the optimal picking point of each fruit. The review also identifies a crucial need for further advancements in the dynamic behavior and autonomy of these technologies, emphasizing the importance of continuous learning and adaptive control systems to improve operational efficiency in diverse orchard environments. [Conclusions and Prospects] The review underscores the critical importance of coordinating perception, movement, and picking modules to facilitate the transition from a basic functional prototype to a practical machine. Moreover, it emphasizes the necessity of enhancing the robustness and stability of the core algorithms governing perception, planning, and control, while ensuring their seamless coordination, which emerges as a central challenge.
Additionally, the review raises unresolved questions regarding the application of picking robots and outlines future trends, including deeper integration of stereo vision and deep learning, enhanced global vision sampling, and the establishment of standardized evaluation criteria for overall operational performance. The paper can serve as a reference for the eventual development of robust, autonomous, and commercially viable picking robots.
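
As an illustration of how RGB-D information fusion lets a robot gauge the distance to a fruit, the following is a minimal sketch and not a method taken from the reviewed literature. It assumes a standard pinhole camera model and a depth image already aligned to the color image; the names `Intrinsics`, `pixel_to_camera_point`, and `fruit_distance`, as well as the numeric intrinsics in the usage note, are hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical container)."""
    fx: float  # focal length along the image x axis
    fy: float  # focal length along the image y axis
    cx: float  # principal point, x coordinate
    cy: float  # principal point, y coordinate


def pixel_to_camera_point(u, v, depth_m, k):
    """Back-project pixel (u, v) with depth depth_m (meters, measured
    along the optical axis) into a 3D point in the camera frame."""
    if depth_m <= 0:
        raise ValueError("depth must be positive; the pixel has no valid reading")
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


def fruit_distance(u, v, depth_m, k):
    """Euclidean distance from the camera origin to the fruit center,
    where (u, v) is the detected fruit center in the color image."""
    x, y, z = pixel_to_camera_point(u, v, depth_m, k)
    return math.sqrt(x * x + y * y + z * z)
```

For instance, with assumed intrinsics `Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)`, a fruit detected at the image center with a depth reading of 0.8 m maps to the camera-frame point (0.0, 0.0, 0.8). In a real system the detected pixel would come from the recognition module and the depth value would typically be filtered over the fruit region rather than read from a single pixel.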