{"abstract":"Morph length is one of the indicative features that help in learning the morphology of languages, in particular agglutinative languages. In this paper, we introduce a simple unsupervised model for morphological segmentation and study how knowledge of morph length affects the performance of the segmentation task under the Bayesian framework. The model is based on the (Goldwater et al., 2006) unigram word segmentation model and assumes a simple prior distribution over morph length. We experiment with this model on two highly related agglutinative languages, Tamil and Telugu, and compare our results with the state-of-the-art Morfessor system. We show that knowledge of morph length has a positive impact and provides competitive results in terms of overall performance.","arxivId":null,"authors":[{"authorId":"30725143","name":"L. Ramasamy","url":"https://www.semanticscholar.org/author/30725143"},{"authorId":"2468732","name":"Z. \u017dabokrtsk\u00fd","url":"https://www.semanticscholar.org/author/2468732"},{"authorId":"2070714","name":"Sowmya Vajjala","url":"https://www.semanticscholar.org/author/2070714"}],"citationVelocity":0,"citations":[{"arxivId":null,"authors":[{"authorId":"2631315","name":"Ananthi Sheshasaayee"},{"authorId":"1410547837","name":"Angela Deepa. V. R"}],"doi":"10.1109/GET.2016.7916723","intent":[],"isInfluential":false,"paperId":"73c2bd43381c0abfe2b4b2a4af2909cc06e378d7","title":"Ascertaining the morphological components of Tamil language using unsupervised approach","url":"https://www.semanticscholar.org/paper/73c2bd43381c0abfe2b4b2a4af2909cc06e378d7","venue":"2016 Online International Conference on Green Engineering and Technologies (IC-GET)","year":2016},{"arxivId":null,"authors":[{"authorId":"3354873","name":"Koenraad J M J De Smedt"},{"authorId":"25766664","name":"E. Hinrichs"},{"authorId":"1831176","name":"Walt Detmar Meurers"},{"authorId":"1889060","name":"I. Skadina"},{"authorId":"144627695","name":"Bolette S. Pedersen"},{"authorId":"1795757","name":"Costanza Navarretta"},{"authorId":"1742458","name":"N\u00faria Bel"},{"authorId":"13206584","name":"Krister Lind\u00e9n"},{"authorId":"1711187","name":"Mark\u00e9ta Lopatkov\u00e1"},{"authorId":"144002335","name":"Jan Hajic"},{"authorId":"40374369","name":"Gisle Andersen"},{"authorId":"1734626","name":"P. Lenkiewicz"}],"doi":"10.15496/PUBLIKATION-10974","intent":["background"],"isInfluential":false,"paperId":"c955a20d79ae4c6dc91e3919ace0ec721470a077","title":"CLARA: A New Generation of Researchers in Common Language Resources and Their Applications","url":"https://www.semanticscholar.org/paper/c955a20d79ae4c6dc91e3919ace0ec721470a077","venue":"LREC","year":2014},{"arxivId":null,"authors":[{"authorId":"1643929295","name":"Manh-Ke Tran"}],"doi":null,"intent":["methodology"],"isInfluential":false,"paperId":"38a3a0adab4eca91aa91e4536af2432baa1ad0f4","title":"Unsupervised and semi-supervised multilingual learning for resource-poor languages","url":"https://www.semanticscholar.org/paper/38a3a0adab4eca91aa91e4536af2432baa1ad0f4","venue":"","year":2012}],"corpusId":2666779,"doi":null,"fieldsOfStudy":["Computer Science"],"influentialCitationCount":0,"isOpenAccess":false,"isPublisherLicensed":true,"is_open_access":false,"is_publisher_licensed":true,"numCitedBy":3,"numCiting":26,"paperId":"e6bffa8c2b39f57ab9a614be2f442c95bcb7305f","references":[{"arxivId":null,"authors":[{"authorId":"2300343","name":"Jason Naradowsky"},{"authorId":"3259253","name":"Kristina Toutanova"}],"doi":null,"intent":[],"isInfluential":false,"paperId":"d698cfb8f82628499f640b962bb860cc4e328c15","title":"Unsupervised Bilingual Morpheme Segmentation and Alignment with Context-rich Hidden Semi-Markov Models","url":"https://www.semanticscholar.org/paper/d698cfb8f82628499f640b962bb860cc4e328c15","venue":"ACL","year":2011},{"arxivId":null,"authors":[{"authorId":"5541455","name":"G. Kiranmai"},{"authorId":"2105916171","name":"K. Mallika"},{"authorId":"37874481","name":"M. A. Kumar"},{"authorId":"2862769","name":"V. Dhanalakshmi"},{"name":"K. Soman"}],"doi":"10.1007/978-3-642-15766-0_68","intent":["background"],"isInfluential":false,"paperId":"56a5323865c0f08abb9be1a98b64210a50f00dd0","title":"Morphological Analyzer for Telugu Using Support Vector Machine","url":"https://www.semanticscholar.org/paper/56a5323865c0f08abb9be1a98b64210a50f00dd0","venue":"ICT","year":2010},{"arxivId":null,"authors":[{"authorId":"2082506592","name":"D. V."},{"authorId":"2289019418","name":"Anandkumar M Rekha"},{"authorId":"2299021564","name":"R. U. Arunkumar"},{"authorId":null,"name":"Soman K"},{"authorId":"2073026993","name":"R. S."}],"doi":"10.1109/ARTCom.2009.184","intent":["methodology"],"isInfluential":false,"paperId":"670d80ca8a3b9bf50c66e29fdc5a65b5206518c2","title":"Morphological Analyzer for Agglutinative Languages Using Machine Learning Approaches","url":"https://www.semanticscholar.org/paper/670d80ca8a3b9bf50c66e29fdc5a65b5206518c2","venue":"ARTCom","year":2009},{"arxivId":null,"authors":[{"authorId":"1991315","name":"S. Goldwater"},{"authorId":"1799860","name":"T. Griffiths"},{"authorId":"152465203","name":"Mark Johnson"}],"doi":"10.1016/j.cognition.2009.03.008","intent":["methodology"],"isInfluential":false,"paperId":"46f31f9069bb934498a288126053bcab01ff34aa","title":"A Bayesian framework for word segmentation: Exploring the effects of context","url":"https://www.semanticscholar.org/paper/46f31f9069bb934498a288126053bcab01ff34aa","venue":"Cognition","year":2009},{"arxivId":null,"authors":[{"authorId":"1759772","name":"Hoifung Poon"},{"authorId":"144507724","name":"Colin Cherry"},{"authorId":"3259253","name":"Kristina Toutanova"}],"doi":"10.3115/1620754.1620785","intent":["background"],"isInfluential":false,"paperId":"681f9d533274d2a20f96349cac05ee0e68869b4c","title":"Unsupervised Morphological Segmentation with Log-Linear Models","url":"https://www.semanticscholar.org/paper/681f9d533274d2a20f96349cac05ee0e68869b4c","venue":"NAACL","year":2009},{"arxivId":null,"authors":[{"authorId":"144163566","name":"Benjamin Snyder"},{"authorId":"1741283","name":"R. Barzilay"}],"doi":null,"intent":["methodology"],"isInfluential":false,"paperId":"36ffcc1cc218ca36de384a107fb48e5abe2e6359","title":"Unsupervised Multilingual Learning for Morphological Segmentation","url":"https://www.semanticscholar.org/paper/36ffcc1cc218ca36de384a107fb48e5abe2e6359","venue":"ACL","year":2008},{"arxivId":null,"authors":[{"authorId":"40644569","name":"Sajib Dasgupta"},{"authorId":"145106110","name":"Vincent Ng"}],"doi":null,"intent":["background"],"isInfluential":false,"paperId":"97ba3b789e092ef283e8c0a5f07779178d956f2d","title":"High-Performance, Language-Independent Morphological Segmentation","url":"https://www.semanticscholar.org/paper/97ba3b789e092ef283e8c0a5f07779178d956f2d","venue":"NAACL","year":2007},{"arxivId":null,"authors":[{"authorId":"2869436","name":"Vera Demberg"}],"doi":null,"intent":["background"],"isInfluential":false,"paperId":"646d43654bc1bb9ebb3e0127829485747a220dc8","url":"https://www.semanticscholar.org/paper/646d43654bc1bb9ebb3e0127829485747a220dc8","venue":"ACL","year":2007},{"arxivId":null,"authors":[{"authorId":"1991315","name":"S. Goldwater"},{"authorId":"1799860","name":"T. Griffiths"},{"authorId":"145177220","name":"Mark Johnson"}],"doi":"10.3115/1220175.1220260","intent":[],"isInfluential":false,"paperId":"a99d6ebe6e583b752d1639a9002dd60581983dc1","title":"Contextual Dependencies in Unsupervised Word Segmentation","url":"https://www.semanticscholar.org/paper/a99d6ebe6e583b752d1639a9002dd60581983dc1","venue":"ACL","year":2006},{"arxivId":null,"authors":[{"authorId":"2110153639","name":"Young-suk Lee"}],"doi":"10.3115/1613984.1613999","intent":["methodology"],"isInfluential":false,"paperId":"a369fd029dff13c4148fca10ff6b0ebd66033aa2","title":"Morphological Analysis for Statistical Machine Translation","url":"https://www.semanticscholar.org/paper/a369fd029dff13c4148fca10ff6b0ebd66033aa2","venue":"NAACL","year":2004},{"arxivId":null,"authors":[{"authorId":"144442621","name":"R. Xiao"},{"authorId":"2344182","name":"Tony McEnery"},{"authorId":"145706651","name":"Paul Baker"}],"isInfluential":false,"paperId":"055a8d08a35e439de9f029ad60e4a4cd7727b804","title":"Developing Asian language corpora: standards and practice","url":"https://www.semanticscholar.org/paper/055a8d08a35e439de9f029ad60e4a4cd7727b804","venue":"","year":2004},{"arxivId":null,"authors":[{"authorId":"2257165738","name":"Mathias Creutz"}],"doi":"10.3115/1075096.1075132","intent":["methodology","background"],"isInfluential":false,"paperId":"8b9a274eec5f938595e0ebcbd0ce5f6cc619e75d","title":"Unsupervised Segmentation of Words Using Prior Distributions of Morph Length and Frequency","url":"https://www.semanticscholar.org/paper/8b9a274eec5f938595e0ebcbd0ce5f6cc619e75d","venue":"ACL","year":2003},{"arxivId":null,"authors":[{"authorId":"32278254","name":"M. Snover"},{"authorId":"1731375","name":"M. Brent"}],"doi":"10.3115/1073012.1073075","intent":[],"isInfluential":false,"paperId":"f7a8bd1a62d22de3a3db5edfcc364b462cd4558e","title":"A Bayesian Model for Morpheme and Paradigm Identification","url":"https://www.semanticscholar.org/paper/f7a8bd1a62d22de3a3db5edfcc364b462cd4558e","venue":"ACL","year":2001},{"arxivId":null,"authors":[{"authorId":"145926632","name":"J. Goldsmith"}],"doi":"10.1162/089120101750300490","intent":["background"],"isInfluential":false,"paperId":"9f834ee11902ada79b874e7fe5072159d72a0f9f","title":"Unsupervised Learning of the Morphology of a Natural Language","url":"https://www.semanticscholar.org/paper/9f834ee11902ada79b874e7fe5072159d72a0f9f","venue":"CL","year":2001},{"arxivId":null,"authors":[{"authorId":"1693517","name":"David Yarowsky"},{"authorId":"2009300","name":"R. Wicentowski"}],"doi":"10.3115/1075218.1075245","intent":["background"],"isInfluential":false,"paperId":"4d1dac08bb3d960baa88e4ff3477ec834446d056","title":"Minimally Supervised Morphological Analysis by Multimodal Alignment","url":"https://www.semanticscholar.org/paper/4d1dac08bb3d960baa88e4ff3477ec834446d056","venue":"ACL","year":2000},{"arxivId":null,"authors":[{"authorId":"114544415","name":"Anne Lohrli"}],"doi":"10.1093/NOTESJ/32.377-A","intent":[],"isInfluential":false,"paperId":"0d75f47ef4ac8024d3c03efab0d05dbc19b056ea","title":"Chapman and Hall","url":"https://www.semanticscholar.org/paper/0d75f47ef4ac8024d3c03efab0d05dbc19b056ea","venue":"","year":1985},{"arxivId":null,"authors":[{"authorId":"2466657","name":"K. Koskenniemi"}],"doi":"10.3115/980491.980529","intent":["background"],"isInfluential":false,"paperId":"ba817882de1a7093ee83bfa14a4005e5c8a5977e","title":"A General Computational Model for Word-Form Recognition and Production","url":"https://www.semanticscholar.org/paper/ba817882de1a7093ee83bfa14a4005e5c8a5977e","venue":"ACL","year":1984},{"arxivId":null,"authors":[{"authorId":"2184438728","name":"Uma Maheshwar"},{"authorId":"11970212","name":"Amba P. Kulkarni"},{"authorId":"2082333913","name":"C. Mala"}],"doi":null,"intent":["background"],"isInfluential":false,"paperId":"029aa0ff23e4f420d6c3fabdc12f56200f1dd2fa","title":"A TELUGU MORPHOLOGICAL ANALYZER","url":"https://www.semanticscholar.org/paper/029aa0ff23e4f420d6c3fabdc12f56200f1dd2fa","venue":"","year":2011},{"arxivId":null,"authors":[{"authorId":"32747279","name":"M. Ganapathiraju"},{"authorId":"1686960","name":"Lori S. Levin"}],"doi":null,"intent":["background"],"isInfluential":false,"paperId":"17f9e21a0ab8f52bfc4949caa8baf074d2f90b22","title":"TelMore: Morphological Generator for Telugu Nouns and Verbs","url":"https://www.semanticscholar.org/paper/17f9e21a0ab8f52bfc4949caa8baf074d2f90b22","venue":"","year":2006},{"arxivId":null,"authors":[{"authorId":"2219854","name":"Mathias Creutz"},{"authorId":"2395884","name":"K. Lagus"}],"doi":null,"intent":["methodology","background"],"isInfluential":true,"paperId":"2d6a97f83bb8207ea9d88118618ed3ab52054a88","title":"Unsupervised Morpheme Segmentation and Morphology Induction from Text Corpora Using Morfessor 1.0","url":"https://www.semanticscholar.org/paper/2d6a97f83bb8207ea9d88118618ed3ab52054a88","venue":"","year":2005},{"arxivId":null,"authors":[{"authorId":"2219854","name":"Mathias Creutz"},{"authorId":"116724829","name":"Bo Krister Johan Linden"}],"doi":null,"intent":["methodology"],"isInfluential":false,"paperId":"19b8d1def48288a22cc4256a9a620cdf7f294d2f","title":"Morpheme Segmentation Gold Standards for Finnish and English","url":"https://www.semanticscholar.org/paper/19b8d1def48288a22cc4256a9a620cdf7f294d2f","venue":"","year":2004},{"arxivId":null,"authors":[{"authorId":"1740202","name":"Akshar Bharati"},{"authorId":"3341712","name":"R. Sangal"},{"authorId":"35246152","name":"D. Sharma"},{"authorId":"1829635","name":"R. Mamidi"}],"doi":null,"intent":["methodology"],"isInfluential":false,"paperId":"c66a257a4f55a6b5c5638223a5e1dee4353e1042","title":"Generic Morphological Analysis Shell","url":"https://www.semanticscholar.org/paper/c66a257a4f55a6b5c5638223a5e1dee4353e1042","venue":"","year":2004},{"arxivId":null,"authors":[{"authorId":"145392702","name":"P. Green"}],"doi":null,"intent":[],"isInfluential":false,"paperId":"3f1bb45d5d20c107daa9dbc489019cf22a3a6e6b","title":"Markov chain Monte Carlo in Practice","url":"https://www.semanticscholar.org/paper/3f1bb45d5d20c107daa9dbc489019cf22a3a6e6b","venue":"","year":1996}],"s2FieldsOfStudy":[{"category":"Computer Science","source":"external"},{"category":"Computer Science","source":"s2-fos-model"},{"category":"Linguistics","source":"s2-fos-model"}],"title":"The Study of Effect of Length in Morphological Segmentation of Agglutinative Languages","topics":[{"topic":"Unsupervised learning","topicId":"7721","url":"https://www.semanticscholar.org/topic/7721"},{"topic":"Text segmentation","topicId":"64636","url":"https://www.semanticscholar.org/topic/64636"},{"topic":"F1 score","topicId":"719","url":"https://www.semanticscholar.org/topic/719"},{"topic":"N-gram","topicId":"236768","url":"https://www.semanticscholar.org/topic/236768"},{"topic":"Galaxy morphological classification","topicId":"8320","url":"https://www.semanticscholar.org/topic/8320"}],"url":"https://www.semanticscholar.org/paper/e6bffa8c2b39f57ab9a614be2f442c95bcb7305f","venue":"","year":2012}