reinforcement learning algorithm meaning in English

强化式学习算法
强化学习算法

Examples

  1. In this paper, joint action is introduced into traditional reinforcement learning, a new multi-agent reinforcement learning algorithm based on behavior prediction is presented, and several methods for predicting other agents' behaviors are discussed.
    在传统强化学习方式中引入组合动作的基础上,本文提出了一种基于行为预测的多智能体强化学习方法,研究了对其他智能体行为进行预测的几种可行方法。
  2. The reinforcement learning algorithm was also introduced, since it has some relation to the ant colony algorithm and can be used in scheduling problems. (4) Some new concepts and scheduling algorithms for batch chemical processes were proposed in our studies.
    由于蚁群算法与人工智能中的强化学习算法之间有着某种联系,同时强化学习近年来也应用于求解调度问题,因此本文也涉及到了一些强化学习的主要算法。
  3. Reinforcement learning algorithms that use the cerebellar model articulation controller (CMAC) are studied to estimate the optimal value function of Markov decision processes (MDPs) with continuous states and discrete actions. The state discretization for MDPs using SARSA-learning algorithms based on CMAC networks and direct gradient rules is analyzed. Two new coding methods for CMAC neural networks are proposed so that the learning efficiency of CMAC-based direct gradient learning algorithms can be improved.
    在求解离散行为空间Markov决策过程(MDP)最优策略的增强学习算法研究方面,研究了小脑模型关节控制器(CMAC)在MDP行为值函数逼近中的应用,分析了基于CMAC的直接梯度算法对MDP状态空间离散化的特点,研究了两种改进的CMAC编码结构,即:非邻接重叠编码和变尺度编码,以提高直接梯度学习算法的收敛速度和泛化性能。
  4. By means of the proposed reinforcement learning algorithm and a modified genetic algorithm, a neural network controller with optimized weights can generate time-series small-perturbation signals that convert the chaotic oscillations of chaotic systems into desired regular ones. Computer simulations on controlling the Henon map and the logistic chaotic system demonstrate the capacity of the presented strategy to stabilize lower periodic orbits such as period-1 and period-2. Meanwhile, when the periodic control methodology is used, higher-period orbits such as period-4 can also be successfully directed to the expected periodic orbits.
    该控制方法无需了解系统的动态特性和精确的数学模型,也不需监督学习所要求的训练数据,通过增强学习训练方式,采用改进遗传算法优化神经网络权系数,使之成为混沌控制器,便可产生控制混沌系统的时间序列小扰动信号,仿真实验结果表明它不仅能有效镇定混沌周期1、2等低周期轨道,而且在周期控制技术基础上,也可成功将高周期混沌轨道(如周期4轨道)变成期望周期行为。
  5. Based on the organization rules of Internet data, the distribution laws of hyperlinks, and the naming rules of URLs, an algorithm for TVM rebuilding is established, and satisfactory experimental results are obtained by applying it. Furthermore, the application of TVM to browsing navigation, web page classification, and reinforcement learning algorithms is explored.
    结合互联网资源的构建规则、链接分布规律和URL命名规则,论文提出了树藤共生数据模型的重建算法,实验结果验证了树藤共生模型的有效性与合理性,在此基础上初步讨论了树藤共生模型在浏览导航、网页分类和reinforcement learning算法中的应用。

Related Words

  1. vertical reinforcement
  2. braid reinforcement
  3. distribution reinforcement
  4. binary reinforcement
  5. external reinforcement
  6. neural reinforcement
  7. autogenic reinforcement
  8. reinforcement material
  9. adventitious reinforcement
  10. fiber reinforcement
  11. reinforcement gymnastics
  12. reinforcement lay up
  13. reinforcement learning system
  14. reinforcement limitation
Copyright © 2018 WordTech Co.