3.zlib
wget http://zlib.net/zlib-1.2.11.tar.gz
tar xzf zlib-1.2.11.tar.gz
cd zlib-1.2.11
./configure
make
make install
4.bzip2
wget https://fossies.org/linux/misc/bzip2-1.0.6.tar.gz
tar xzvf bzip2-1.0.6.tar.gz
cd bzip2-1.0.6/
make
make install
5.libbz2-dev
apt-get install libbz2-dev
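Before building kenlm, it can help to confirm that the compression libraries installed in the steps above are visible to the system. The following is a minimal sketch, assuming a Linux environment where shared libraries are registered with the dynamic linker. The helper name `find_compression_libs` is hypothetical and not part of any of the packages above.

```python
from ctypes.util import find_library


def find_compression_libs(names=("z", "bz2")):
    """Map each library short name to what the linker resolves it to (or None)."""
    # find_library("z") looks for libz, find_library("bz2") for libbz2.
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, resolved in find_compression_libs().items():
        status = resolved if resolved else "not found"
        print(f"lib{name}: {status}")
```

A "not found" result is only a hint: `find_library` depends on tools such as `ldconfig`, so a library installed in a non-standard prefix may still be usable by the build.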
6.kenlm
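Detailed instructions are available on GitHub at https://github.com/kpu/kenlm. After downloading and extracting the source:
cd kenlm
mkdir -p build
cd build
cmake ..
make -j 4  # run 4 parallel jobs to speed up compilation
cd ..
python setup.py install
To test the installation, import kenlm in a Python session; if no error is raised, kenlm was installed successfully. Alternatively, run the kenlm/python/example.py file.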
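The import test described above can be sketched as a small script. This is an illustration only: the helper names `kenlm_available` and `score_sentence` are hypothetical, and the model path in the commented-out call is a placeholder for a model you have built yourself.

```python
import importlib.util


def kenlm_available():
    """Return True if the kenlm Python module can be found."""
    return importlib.util.find_spec("kenlm") is not None


def score_sentence(model_path, sentence):
    """Return the log10 probability of a sentence under a kenlm model."""
    import kenlm  # imported lazily so the check above works without it

    model = kenlm.Model(model_path)
    # bos/eos add begin- and end-of-sentence markers around the input.
    return model.score(sentence, bos=True, eos=True)


if __name__ == "__main__":
    if kenlm_available():
        print("kenlm is importable")
        # Hypothetical path: replace with an ARPA or binary model you built.
        # print(score_sentence("path/to/model.arpa", "this is a test"))
    else:
        print("kenlm not found; re-run the install step above")
```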
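nltk can be installed directly with pip. The nltk_data files are large, so you may prefer to download them offline and then add their location to the search path. On Windows 10, simply place nltk_data on the D: drive and nltk will find it automatically. On Linux, however, you need to add the nltk_data location to nltk.data.path, or move the directory to one of the paths listed in the output below. For convenience, I created a symbolic link instead:
sudo ln -s /mnt/d/nltk_data /usr/local/nltk_data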
import nltk
nltk.data.find(".")
# Searched in:
# - '/root/nltk_data'
# - '/usr/local/nltk_data'
# - '/usr/local/share/nltk_data'
# - '/usr/local/lib/nltk_data'
# - '/usr/share/nltk_data'
# - '/usr/local/share/nltk_data'
# - '/usr/lib/nltk_data'
# - '/usr/local/lib/nltk_data'
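To add the path to nltk.data.path for the current session only:
from nltk import data
data.path.append(r"path/to/your/nltk_data")  # the directory where you downloaded nltk_data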
After adding the path, use nltk.data.path to view the paths currently registered.
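Appending a path blindly can leave duplicates or non-existent directories in the search list. Below is a minimal sketch of a more defensive version. The helpers `add_data_path` and `register_with_nltk` are hypothetical names introduced here, not part of nltk; only `nltk.data.path` (a plain Python list) comes from the library.

```python
import os


def add_data_path(search_path, candidate):
    """Append candidate to search_path if it is an existing directory
    and not already listed. Returns True when the path was added."""
    candidate = os.path.abspath(os.path.expanduser(candidate))
    if not os.path.isdir(candidate):
        return False
    if candidate in search_path:
        return False
    search_path.append(candidate)
    return True


def register_with_nltk(candidate):
    """Apply add_data_path to nltk's own search list."""
    import nltk  # imported lazily; requires nltk to be installed

    return add_data_path(nltk.data.path, candidate)
```

For example, `register_with_nltk("/mnt/d/nltk_data")` would register the directory used in the symbolic-link command earlier, provided it exists. As an alternative that persists across sessions, nltk also reads the `NLTK_DATA` environment variable when it is imported.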
A quick test
from nltk.tokenize import word_tokenize
sentence = "since the 1890s , and beginning in france , the term ''libertarianism '' has often been used as an synonym for anarchism and was used almost exclusively in this sense until the 1950s in the united states ; its use as an synonym is still common outside the united states ."
print(word_tokenize(sentence))
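If the tokenizer data cannot be located, `word_tokenize` raises a `LookupError`, which usually means the nltk_data path has not been registered correctly. The sketch below, with a hypothetical helper name `safe_tokenize`, reports that case and falls back to plain whitespace splitting so the rest of a script can keep running. The fallback is an assumption made for illustration and is not equivalent to nltk's tokenizer.

```python
def safe_tokenize(sentence):
    """Tokenize with nltk when possible, else fall back to whitespace."""
    try:
        from nltk.tokenize import word_tokenize

        return word_tokenize(sentence)
    except ImportError:
        print("nltk is not installed; falling back to str.split()")
    except LookupError:
        # Raised when the tokenizer data is not found in nltk.data.path.
        print("nltk_data not found; falling back to str.split()")
    return sentence.split()


if __name__ == "__main__":
    print(safe_tokenize("hello world"))
```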
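Summary
This article covered preparing a Python development environment on the Windows 10 Linux subsystem, along with installing and using kenlm and nltk. I hope it is helpful; if you have any questions, please leave a comment.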