{"status":"ok","message-type":"work","message-version":"1.0.0","message":{
"indexed":{"date-parts":[[2024,6,22]],"date-time":"2024-06-22T14:01:20Z","timestamp":1719064880826},
"reference-count":35,
"publisher":"Elsevier BV",
"license":[
{"start":{"date-parts":[[2018,11,1]],"date-time":"2018-11-01T00:00:00Z","timestamp":1541030400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.elsevier.com\/tdm\/userlicense\/1.0\/"},
{"start":{"date-parts":[[2017,12,30]],"date-time":"2017-12-30T00:00:00Z","timestamp":1514592000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],
"funder":[
{"DOI":"10.13039\/501100001863","name":"New Energy and Industrial Technology Development Organization (NEDO)","doi-asserted-by":"publisher"},
{"DOI":"10.13039\/501100001700","name":"MEXT KAKENHI","doi-asserted-by":"publisher","award":["16H06563","17H06042"]},
{"DOI":"10.13039\/501100004199","name":"Okinawa Institute of Science and Technology Graduate University","doi-asserted-by":"publisher"}],
"content-domain":{"domain":["elsevier.com","sciencedirect.com"],"crossmark-restriction":true},
"short-container-title":["Neural Networks"],
"published-print":{"date-parts":[[2018,11]]},
"DOI":"10.1016\/j.neunet.2017.12.012",
"type":"journal-article",
"created":{"date-parts":[[2018,1,11]],"date-time":"2018-01-11T14:27:27Z","timestamp":1515680847000},
"page":"3-11",
"update-policy":"http:\/\/dx.doi.org\/10.1016\/elsevier_cm_policy",
"source":"Crossref",
"is-referenced-by-count":678,
"title":["Sigmoid-weighted linear units for neural network function approximation in reinforcement learning"],
"prefix":"10.1016",
"volume":"107",
"author":[
{"ORCID":"http:\/\/orcid.org\/0000-0001-6689-1000","authenticated-orcid":false,"given":"Stefan","family":"Elfwing","sequence":"first","affiliation":[]},
{"given":"Eiji","family":"Uchibe","sequence":"additional","affiliation":[]},
{"given":"Kenji","family":"Doya","sequence":"additional","affiliation":[]}],
"member":"78",
"reference":[
{"key":"10.1016\/j.neunet.2017.12.012_b1","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1613\/jair.3912","article-title":"The Arcade learning environment: an evaluation platform for general agents","volume":"47","author":"Bellemare","year":"2013","journal-title":"Journal of Artificial Intelligence Research"},
{"key":"10.1016\/j.neunet.2017.12.012_b2","unstructured":"Bertsekas, D. P., & Ioffe, S. (1996). Temporal differences-based policy iteration and applications in neuro-dynamic programming. Technical report LIDS-P-2349, MIT."},
{"key":"10.1016\/j.neunet.2017.12.012_b3","doi-asserted-by":"crossref","first-page":"194","DOI":"10.2307\/3619195","article-title":"How to lose at Tetris"},
{"key":"10.1016\/j.neunet.2017.12.012_b4","series-title":"CVPR09","article-title":"ImageNet: A large-scale hierarchical image database","author":"Deng","year":"2009"},
{"issue":"3","key":"10.1016\/j.neunet.2017.12.012_b5","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1016\/j.neunet.2014.09.006","article-title":"Expected energy-based restricted Boltzmann machine for classification","volume":"64","author":"Elfwing","year":"2015","journal-title":"Neural Networks"},
{"key":"10.1016\/j.neunet.2017.12.012_b6","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.neunet.2016.07.013","article-title":"From free energy to expected energy: Improving energy-based value function approximation in reinforcement learning","volume":"84","author":"Elfwing","year":"2016","journal-title":"Neural Networks"},
{"key":"10.1016\/j.neunet.2017.12.012_b7","unstructured":"Elfwing, S., Uchibe, E., & Doya, K. (2017). Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. arXiv:1702.03118\u00a0[cs.LG]."},
{"key":"10.1016\/j.neunet.2017.12.012_b8","unstructured":"Fahey, C. (2003). Tetris AI, computer plays Tetris. colinfahey.com\/tetris\/tetris.html [Online; accessed 22-February-2017]."},
{"key":"10.1016\/j.neunet.2017.12.012_b9","first-page":"1","article-title":"Neural network ensembles in reinforcement learning","author":"Fau\u00dfer","year":"2013","journal-title":"Neural Processing Letters"},
{"key":"10.1016\/j.neunet.2017.12.012_b10","series-title":"Proceedings of advances in neural information processing systems","article-title":"Unsupervised learning of distributions on binary vectors using two layer networks","author":"Freund","year":"1992"},
{"key":"10.1016\/j.neunet.2017.12.012_b11","unstructured":"Gabillon, V., Ghavamzadeh, M., & Scherrer, B. (2013). Approximate dynamic programming finally performs well in the game of Tetris. In Proceedings of advances in neural information processing systems (pp. 1754\u20131762)."},
{"key":"10.1016\/j.neunet.2017.12.012_b12","doi-asserted-by":"crossref","first-page":"947","DOI":"10.1038\/35016072","article-title":"Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit"},
{"key":"10.1016\/j.neunet.2017.12.012_b13","doi-asserted-by":"crossref","first-page":"1771","DOI":"10.1162\/089976602760128018","article-title":"Training products of experts by minimizing contrastive divergence","volume":"12","author":"Hinton","year":"2002","journal-title":"Neural Computation"},
{"key":"10.1016\/j.neunet.2017.12.012_b14","doi-asserted-by":"crossref","unstructured":"Jaskowski, W., Szubert, M. G., Liskowski, P., & Krawiec, K. (2015). High-dimensional function approximation for knowledge-free reinforcement learning: a case study in SZ-Tetris. In Proceedings of the genetic and evolutionary computation conference (pp. 567\u2013573).","DOI":"10.1145\/2739480.2754783"},
{"key":"10.1016\/j.neunet.2017.12.012_b15","unstructured":"Krizhevsky, A. (2009). Learning multiple layers of features from tiny images. Tech. rep., University of Toronto."},
{"key":"10.1016\/j.neunet.2017.12.012_b16","unstructured":"Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T. P., & Harley, T., et al. (2016). Asynchronous methods for deep reinforcement learning. In Proceedings of the international conference on machine learning (pp. 1928\u20131937)."},
{"issue":"7540","key":"10.1016\/j.neunet.2017.12.012_b17","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},
{"key":"10.1016\/j.neunet.2017.12.012_b18","unstructured":"Nair, A., Srinivasan, P., Blackwell, S., Alcicek, C., Fearon, R., & Maria, A. D., et al. (2015). Massively parallel methods for deep reinforcement learning. arXiv:1507.04296\u00a0[cs.LG]."},
{"key":"10.1016\/j.neunet.2017.12.012_b19","unstructured":"Ramachandran, P., Zoph, B., & Le, Q. V. (2017). Searching for activation functions. arXiv:1710.05941\u00a0[cs.NE]."},
{"key":"10.1016\/j.neunet.2017.12.012_b20","unstructured":"Rummery, G. A., & Niranjan, M. (1994). On-line Q-learning using connectionist systems. Tech. rep. CUED\/F-INFENG\/TR 166, Cambridge University Engineering Department."},
{"key":"10.1016\/j.neunet.2017.12.012_b21","unstructured":"Schaul, T., Quan, J., Antonoglou, I., & Silver, D. (2016). Prioritized experience replay. In International conference on learning representations, Puerto Rico."},
{"key":"10.1016\/j.neunet.2017.12.012_b22","first-page":"1629","article-title":"Approximate modified policy iteration and its application to the game of Tetris","volume":"16","author":"Scherrer","year":"2015","journal-title":"Journal of Machine Learning Research (JMLR)"},
{"key":"10.1016\/j.neunet.2017.12.012_b23","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1038\/nature16961","article-title":"Mastering the game of Go with deep neural networks and tree search","volume":"529","author":"Silver","year":"2016","journal-title":"Nature"},
{"key":"10.1016\/j.neunet.2017.12.012_b24","doi-asserted-by":"crossref","unstructured":"Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., & Guez, A., et al. (2017). Mastering the game of Go without human knowledge. Vol. 550, (pp. 354\u2013359).","DOI":"10.1038\/nature24270"},
{"key":"10.1016\/j.neunet.2017.12.012_b25","series-title":"Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations","article-title":"Information processing in dynamical systems: Foundations of harmony theory","author":"Smolensky","year":"1986"},
{"key":"10.1016\/j.neunet.2017.12.012_b26","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1007\/BF00115009","article-title":"Learning to predict by the methods of temporal differences","volume":"3","author":"Sutton","year":"1988","journal-title":"Machine Learning"},
{"key":"10.1016\/j.neunet.2017.12.012_b27","series-title":"Proceedings of advances in neural information processing systems","first-page":"1038","article-title":"Generalization in reinforcement learning: Successful examples using sparse coarse coding","author":"Sutton","year":"1996"},
{"key":"10.1016\/j.neunet.2017.12.012_b28","year":"1998"},
{"key":"10.1016\/j.neunet.2017.12.012_b29","unstructured":"Szita, I., & Szepesv\u00e1ri, C. (2010). SZ-Tetris as a benchmark for studying key problems of reinforcement learning. In ICML 2010 workshop on machine learning and games."},
{"issue":"2","key":"10.1016\/j.neunet.2017.12.012_b30","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1162\/neco.1994.6.2.215","article-title":"TD-Gammon, a self-teaching backgammon program, achieves master-level play","volume":"6","author":"Tesauro","year":"1994","journal-title":"Neural Computation"},
{"key":"10.1016\/j.neunet.2017.12.012_b31","article-title":"Improvements on learning Tetris with cross entropy","volume":"32","author":"Thiery","year":"2009","journal-title":"International Computer Games Association Journal"},
{"key":"10.1016\/j.neunet.2017.12.012_b32","unstructured":"Thrun, S., & Schwartz, A. (1993). Issues in using function approximation for reinforcement learning. In Proceedings of the 1993 connectionist models summer school (pp. 255\u2013263)."},
{"key":"10.1016\/j.neunet.2017.12.012_b33","unstructured":"van Hasselt, H. (2010). Double Q-learning. In Proceedings of advances in neural information processing systems (pp. 2613\u20132621)."},
{"key":"10.1016\/j.neunet.2017.12.012_b34","unstructured":"van Hasselt, H., Guez, A., & Silver, D. (2015). Deep reinforcement learning with double Q-learning. arXiv:1509.06461 [cs.LG]."},
{"key":"10.1016\/j.neunet.2017.12.012_b35","unstructured":"Wang, Z., Schaul, T., Hessel, M., van Hasselt, H., Lanctot, M., & de\u00a0Freitas, N. (2016). Dueling network architectures for deep reinforcement learning. In Proceedings of the international conference on machine learning (pp. 1995\u20132003)."}],
"container-title":["Neural Networks"],
"original-title":[],
"language":"en",
"link":[
{"URL":"https:\/\/api.elsevier.com\/content\/article\/PII:S0893608017302976?httpAccept=text\/xml","content-type":"text\/xml","content-version":"vor","intended-application":"text-mining"},
{"URL":"https:\/\/api.elsevier.com\/content\/article\/PII:S0893608017302976?httpAccept=text\/plain","content-type":"text\/plain","content-version":"vor","intended-application":"text-mining"}],
"deposited":{"date-parts":[[2019,10,9]],"date-time":"2019-10-09T04:05:18Z","timestamp":1570593918000},
"score":1,
"resource":{"primary":{"URL":"https:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S0893608017302976"}},
"subtitle":[],
"short-title":[],
"issued":{"date-parts":[[2018,11]]},
"references-count":35,
"alternative-id":["S0893608017302976"],
"URL":"http:\/\/dx.doi.org\/10.1016\/j.neunet.2017.12.012",
"relation":{},
"ISSN":["0893-6080"],
"issn-type":[{"value":"08936080","type":"print"}],
"subject":[],
"published":{"date-parts":[[2018,11]]},
"assertion":[
{"value":"Elsevier","name":"publisher","label":"This article is maintained by"},
{"value":"Sigmoid-weighted linear units for neural network function approximation in reinforcement learning","name":"articletitle","label":"Article Title"},
{"value":"Neural Networks","name":"journaltitle","label":"Journal Title"},
{"value":"https:\/\/doi.org\/10.1016\/j.neunet.2017.12.012","name":"articlelink","label":"CrossRef DOI link to publisher maintained version"},
{"value":"article","name":"content_type","label":"Content Type"},
{"value":"\u00a9 2017 The Author(s). Published by Elsevier Ltd.","name":"copyright","label":"Copyright"}]}}