hanzi_char_featurizer: a Chinese character feature extractor (featurizer) that extracts features of Chinese characters (pronunciation features, glyph features) for use as deep learning features | Download

【File properties】:
File name: hanzi_char_featurizer: a Chinese character feature extractor (featurizer) that extracts features of Chinese characters (pronunciation features, glyph features) for use as deep learning features
File size: 252KB
File format: ZIP
Updated: 2021-05-08 05:59:22
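hanzi feature-engineering chinese-char-feature-dataset chinese-char-feature character-level-featurize

Chinese character feature extractor (featurizer)
Many deep learning tasks need features extracted from Chinese characters (pronunciation features, glyph features). This project provides a general-purpose character feature extraction framework with built-in features for pinyin, glyph shape (four-corner code), and radical decomposition.

Feature extractors
Pinyin feature extractor: extracts a character's pinyin as a feature; characters that sound alike should have similar encodings. Example: 胡 -> hú, 福 -> fú
Glyph (four-corner code) extractor: extracts a character's shape as a feature; characters that look alike should have similar encodings. Example: 门 -> 37001, 闩 -> 37101
Radical decomposition extractor: extracts a character's breakdown into radicals and components as a feature; characters that look alike should have similar encodings. Example: 闩 -> ['门', '一'], 闫 -> ['门', '三']

Usage
from hanzi_char_featurizer import Featurizor
featurizor = Featurizor()
result = featurizor.featur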
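The claim that similar characters receive similar encodings can be checked by hand on the example values quoted above. The following is only a minimal sketch: the lookup tables are copied from those examples, and the names `FOUR_CORNER`, `PINYIN`, `RADICALS`, `positional_overlap`, and `jaccard` are hypothetical helpers invented for illustration. None of them belong to hanzi_char_featurizer, and the similarity measures are an assumption, not the project's own method.

```python
import unicodedata

# Hand-written tables copied from the examples in the description.
# These are NOT part of the hanzi_char_featurizer API.
FOUR_CORNER = {"门": "37001", "闩": "37101"}
PINYIN = {"胡": "hú", "福": "fú"}
RADICALS = {"闩": ["门", "一"], "闫": ["门", "三"]}


def positional_overlap(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length codes agree."""
    # Normalize so an accented vowel such as "ú" counts as one symbol.
    a = unicodedata.normalize("NFC", a)
    b = unicodedata.normalize("NFC", b)
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def jaccard(a, b) -> float:
    """Jaccard similarity between two collections of components."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


# 门 vs 闩: the four-corner codes differ in one digit out of five.
four_corner_sim = positional_overlap(FOUR_CORNER["门"], FOUR_CORNER["闩"])
# 闩 vs 闫: the decompositions share the component 门.
radical_sim = jaccard(RADICALS["闩"], RADICALS["闫"])
# 胡 vs 福: same final "ú", different initial.
pinyin_sim = positional_overlap(PINYIN["胡"], PINYIN["福"])

print(four_corner_sim, radical_sim, pinyin_sim)
```

In a real model these codes would more likely be embedded or one-hot encoded per position rather than compared directly; the overlap scores here only make the "similar characters, similar codes" property concrete.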

【File preview】:
hanzi_char_featurizer-master
----example_code.py(126B)
----image()
--------structure.jpg(127KB)
--------huya_tv.png(123KB)
--------diagram.pptx(38KB)
----example_as_tensor.py(306B)
----raw_requirements.txt(0B)
----.idea()
--------misc.xml(4KB)
--------thriftCompiler.xml(140B)
--------hanzi_char_featurizer.iml(520B)
--------modules.xml(294B)
--------workspace.xml(17KB)
----makefile(3KB)
----hanzi_char_featurizer()
--------featurizers()
--------__init__.py(3KB)
----usage()
--------four_corner_example.py(3KB)
--------pinyin_parts_example.py(2KB)
--------data.txt(95B)
----setup.py(333B)
----dev_requirements.txt(44B)
----README.md(2KB)
----LICENSE.txt(11KB)
