[1] |
Mitchell T M. Machine learning[M]. New York: McGraw-hill, 1997:4-16.
|
[2] Winston P H. Artificial intelligence[M]. London: Addison-Wesley Longman Publishing Company, 1984: 12-18.
[3] Murphy K P. Machine learning: A probabilistic perspective[M]. Cambridge: MIT Press, 2012: 62-63.
[4] Neunert M, Abdolmaleki A, Wulfmeier M, et al. Continuous-discrete reinforcement learning for hybrid control in robotics[C]. Osaka: Conference on Robot Learning, 2020: 735-751.
[5] Xie Z, Berseth G, Clary P, et al. Feedback control for Cassie with deep reinforcement learning[C]. Madrid: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018: 1241-1246.
[6] 张佳鹏, 李琳, 朱叶. 基于强化学习的无人驾驶车辆行为决策方法研究进展[J]. 电子科技, 2021, 34(5): 66-71.
Zhang Jiapeng, Li Lin, Zhu Ye. A review of research on decision-making method of autonomous vehicle based on reinforcement learning[J]. Electronic Science and Technology, 2021, 34(5): 66-71.
[7] Vinyals O, Babuschkin I, Czarnecki W M, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning[J]. Nature, 2019, 575(7782): 350-354.
[8] Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]. Stockholm: International Conference on Machine Learning, 2018: 1861-1870.
[9] 王骏超. 基于SAC算法的机械臂控制策略的研究[D]. 南昌: 华东交通大学, 2020: 33-42.
Wang Junchao. Research on manipulator control strategy based on SAC algorithm[D]. Nanchang: East China Jiaotong University, 2020: 33-42.
[10] Wong C C, Chien S Y, Feng H M, et al. Motion planning for dual-arm robot based on soft actor-critic[J]. IEEE Access, 2021, 9: 26871-26885.
[11] De Jesus J C, Kich V A, Kolling A H, et al. Soft actor-critic for navigation of mobile robots[J]. Journal of Intelligent & Robotic Systems, 2021, 102(2): 31-36.
[12] Banerjee C, Chen Z, Noman N. Improved soft actor-critic: Mixing prioritized off-policy samples with on-policy experiences[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022: 1-9.
[13] Chen Y, Ying F, Li X, et al. Deep reinforcement learning in maximum entropy framework with automatic adjustment of mixed temperature parameters for path planning[C]. Taizhou: The Seventh International Conference on Robotics, Control and Automation, 2023: 78-82.
[14] 肖硕, 黄珍珍, 张国鹏, 等. 基于SAC的多智能体深度强化学习算法[J]. 电子学报, 2021, 49(9): 1675-1681. doi: 10.12263/DZXB.20200243
Xiao Shuo, Huang Zhenzhen, Zhang Guopeng, et al. Deep reinforcement learning algorithm of multi-agent based on SAC[J]. Acta Electronica Sinica, 2021, 49(9): 1675-1681. doi: 10.12263/DZXB.20200243
[15] 范静宇. 基于熵的深度强化学习优化算法[D]. 苏州: 苏州大学, 2021: 7-15.
Fan Jingyu. Optimized algorithm of deep reinforcement learning based on entropy[D]. Suzhou: Soochow University, 2021: 7-15.
[16] Konda V, Tsitsiklis J. Actor-critic algorithms[J]. Advances in Neural Information Processing Systems, 1999, 12: 1008-1014.
[17] Mnih V, Badia A P, Mirza M, et al. Asynchronous methods for deep reinforcement learning[C]. New York: International Conference on Machine Learning, 2016: 1928-1937.
[18] Prianto E, Kim M S, Park J H, et al. Path planning for multi-arm manipulators using deep reinforcement learning: Soft actor-critic with hindsight experience replay[J]. Sensors, 2020, 20(20): 5911-5933.
[19] Kim S K, Shin W H, Ko S Y, et al. Design of a compact 5-DOF surgical robot of a spherical mechanism: CURES[C]. Xi'an: IEEE/ASME International Conference on Advanced Intelligent Mechatronics, 2008: 990-995.
[20] 马保平. 微创手术机器人机构设计与人机交互研究[D]. 上海: 上海工程技术大学, 2020: 11-21.
Ma Baoping. Research on mechanism design and human-robot interaction of minimally invasive surgical robot[D]. Shanghai: Shanghai University of Engineering Science, 2020: 11-21.
[21] Farley A, Wang J, Marshall J A. How to pick a mobile robot simulator: A quantitative comparison of CoppeliaSim, Gazebo, MORSE and Webots with a focus on accuracy of motion[J]. Simulation Modelling Practice and Theory, 2022, 120: 102629.