python - Top Programmers (顶级程序员)

NLTK extracting terms of chunker parse tree

John Edward Grey started running now that he knows he is fat. She was listening to smack that by that awful singer. I want to extract interesting terms from a sentence. I currently use POS tagging to identify the grammatical type of each entity. Then I add each token to a counter (with different weights for nouns, verbs and adjectives). I now wish to use a chunker for this. I think the lea

2018-06-26 06:21:02

NLTK extracting terms of chunker parse tree

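John Edward Grey started running now that he knows he is fat. She was listening to smack that by that awful singer. I want to extract interesting terms from a sentence. I currently use POS tagging to identify the grammatical type of each entity. Then I add each token to a counter (with different weights for nouns, verbs and adjectives). I now wish to use a chunker for this. I think the leaf nodes of the parse tree contain all the interesting words and phrases. How do I extract terms from the chunker output? In linguistics, "interesting words" are called open class words. And what you are referring to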

2018-06-26 06:21:02

Combining a Tokenizer into a Grammar and Parser with NLTK

I am making my way through the NLTK book and I can't seem to do something that would appear to be a natural first step for building a decent grammar. My goal is to build a grammar for a particular text corpus. (Initial question: Should I even try to start a grammar from scratch or should I start with a predefined grammar? If I should start with another grammar, which is a good one to star

2018-06-26 06:20:00

Combining a Tokenizer into a Grammar and Parser with NLTK

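I am making my way through the NLTK book and I can't seem to do something that would appear to be a natural first step for building a decent grammar. My goal is to build a grammar for a particular text corpus. (Initial question: Should I even try to start a grammar from scratch or should I start with a predefined grammar? If I should start with another grammar, which is a good one to start with for English?) Suppose I have the following simple grammar: simple_grammar = nltk.parse_cfg(""" S -> NP VP PP -> P NP NP -> Det N | Det N PP VP -> V NP | VP PP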

2018-06-26 06:20:00

Comparing lists containing NaNs

I am trying to compare two different lists to see if they are equal, and was going to remove NaNs, only to discover that my list comparisons still work, despite NaN == NaN -> False. Could someone explain why the following evaluate to True or False, as I am finding this behavior unexpected. Thanks. I have read the following, which don't seem to resolve the issue: Why in numpy nan == na

2018-06-26 05:21:00

Comparing lists containing NaNs

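I am trying to compare two different lists to see if they are equal, and was going to remove NaNs, only to discover that my list comparisons still work, despite NaN == NaN -> False. Could someone explain why the following evaluate to True or False, as I am finding this behavior unexpected. Thanks. I have read the following, which don't seem to resolve the issue: Why in numpy nan == nan is False while nan in [nan] is True? Why is NaN not equal to NaN? [duplicate] (Python 2.7.3, numpy-1.9.2) I have marked the surprising evaluations with a * at the end >>> nan = np.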

2018-06-26 05:21:00
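
The behaviour described in this question can be reproduced without numpy. The following is a minimal sketch, assuming plain Python floats rather than the poster's numpy values: list equality in CPython treats two elements as equal when they are the same object, before falling back to ==, so a list holding the same NaN object on both sides compares equal even though nan == nan is False. The helper name nan_aware_equal is hypothetical and not part of the original question.

```python
import math

nan = float("nan")

# NaN never compares equal to anything, including itself.
nan_equals_itself = (nan == nan)

# List comparison checks identity first, so two lists that hold
# the *same* NaN object are reported as equal.
same_object_lists_equal = ([1.0, nan] == [1.0, nan])

# Two separately created NaN objects are distinct objects, so the
# comparison falls back to ==, which is False for NaN.
distinct_object_lists_equal = ([1.0, float("nan")] == [1.0, float("nan")])


def nan_aware_equal(a, b):
    """Hypothetical helper: compare two lists of floats, treating NaN as equal to NaN."""
    if len(a) != len(b):
        return False
    return all(
        x == y or (math.isnan(x) and math.isnan(y))
        for x, y in zip(a, b)
    )


print(nan_equals_itself, same_object_lists_equal, distinct_object_lists_equal)
print(nan_aware_equal([1.0, float("nan")], [1.0, float("nan")]))
```

If NaN positions should count as matching, an explicit helper like the one above avoids relying on object identity, which is an implementation detail of how the values were created.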

Python: sort function breaks in the presence of nan

sorted([2, float('nan'), 1]) returns [2, nan, 1] (at least on the ActiveState Python 3.1 implementation). I understand nan is a weird object, so I wouldn't be surprised if it shows up in random places in the sort result. But it also messes up the sort for the non-nan numbers in the container, which is really unexpected. I asked a related question about max, and based on that I und

2018-06-26 05:19:58

Python: sort function breaks in the presence of nan

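sorted([2, float('nan'), 1]) returns [2, nan, 1] (at least on the ActiveState Python 3.1 implementation). I understand nan is a weird object, so I wouldn't be surprised if it shows up in random places in the sort result. But it also messes up the sort for the non-nan numbers in the container, which is really unexpected. I asked a related question about max, and based on that I understand why sort behaves this way. But should this be considered a bug? The documentation just says "Return a new sorted list [...]" without specifying any details. EDIT: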

2018-06-26 05:19:58
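
One way to get a predictable order despite NaN is to sort on a key that never compares NaN against an ordinary number. This is a minimal sketch, not the answer from the original thread; the function name sort_with_nans_last is hypothetical, and it assumes every element is a real number that math.isnan accepts.

```python
import math


def sort_with_nans_last(values):
    """Sort numbers ascending and place every NaN after the ordinary numbers."""
    # The key is a tuple: ordinary numbers get (False, v), NaNs get (True, v).
    # False sorts before True, so a NaN is only ever compared with another NaN.
    return sorted(values, key=lambda v: (math.isnan(v), v))


result = sort_with_nans_last([2, float("nan"), 1])
print(result)
```

Filtering the NaNs out first with a list comprehension and math.isnan is an equally simple alternative when the NaN entries are not needed in the result.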

Calling external C++ template functions within Cython

I have a number of C++ template functions declared and implemented in a C++ header file and I want to access some of the functions within Cython. Suppose the C++ code is in header.hpp as follows: template <class T> T doublit(T& x) { return 2*x; } What do I need to write in the .pyx file and in the setup.py file so that I can use the function in Python as >>> import mod

2018-06-26 04:36:37

Calling external C++ template functions within Cython

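I have a number of C++ template functions declared and implemented in a C++ header file and I want to access some of the functions within Cython. Suppose the C++ code is in header.hpp as follows: template <class T> T doublit(T& x) { return 2*x; } What do I need to write in the .pyx file and in the setup.py file so that I can use the function in Python as >>> import modname >>> print modname.doublit(3) 6 PS: Is it possible to access the same function from PyPy? If so, how? Thanks for your help. But when I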

2018-06-26 04:36:36

Python NameError: global name 'assertEqual' is not defined

I'm following Learn Python the Hard Way and I'm on Exercise 47 - Automated Testing (http://learnpythonthehardway.org/book/ex47.html). I am using Python 3 (vs the book's use of Python 2.x) and I realize that assert_equals (which is used in the book) is deprecated. I am using assertEqual. I am trying to build a test case but for some reason, when using nosetests in cmd, I get the err

2018-06-26 01:10:40

Python NameError: global name 'assertEqual' is not defined

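I'm following Learn Python the Hard Way and I'm on Exercise 47 - Automated Testing (http://learnpythonthehardway.org/book/ex47.html). I am using Python 3 (vs the book's use of Python 2.x) and I realize that assert_equals (which is used in the book) is deprecated. I am using assertEqual. I am trying to build a test case but for some reason, when using nosetests in cmd, I get the error: NameError: global name 'assertEqual' is not defined The code is as follows: from nose.tools import *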

2018-06-26 01:10:40
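
The poster's code is cut off above, so the actual cause cannot be confirmed from this excerpt. One common reason for this NameError is that assertEqual is a method of unittest.TestCase rather than a free function, so it has to be called as self.assertEqual inside a test case class. The following is a minimal sketch under that assumption; the class and test names are hypothetical.

```python
import unittest


class ArithmeticTest(unittest.TestCase):  # hypothetical test case
    def test_addition(self):
        # assertEqual is an instance method, so it is reached through self.
        self.assertEqual(1 + 1, 2)


def run_suite():
    """Load and run the test case programmatically, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ArithmeticTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_suite()
print(result.wasSuccessful())
```

For function-style tests run with nose, nose.tools exposes the underscore-named helper assert_equal, which can be called without a class.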

Resources within a project directory targeting Python 2.5.1

I have Python .egg files that are stored in a location relative to some .py code. The problem is, I am targeting Python 2.5.1 computers which require that my project be self-contained in a folder (hundreds of thousands of OLPC XO 8.2.1 release laptops running Sugar). This means I cannot just ./ez_install to perform a system-wide setuptools/pkg_resources installation. Example directory structure:

2018-06-26 00:37:27

Resources within a project directory targeting Python 2.5.1

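I have Python .egg files that are stored in a location relative to some .py code. The problem is, I am targeting Python 2.5.1 computers which require that my project be self-contained in a folder (hundreds of thousands of OLPC XO 8.2.1 release laptops running Sugar). This means I cannot just ./ez_install to perform a system-wide setuptools/pkg_resources installation. Example directory structure: My Application/ My Application/library1.egg My Application/libs/library2.egg My Application/test.py I want to know how, from te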

2018-06-26 00:37:27

HOME Error with rpy2

I know there are quite a few posts on getting up and running with rpy2 on Windows 7 32-bit. I have referenced a good number of them and attempted their solutions, including the use of PypeR. I don't explicitly have an R_HOME variable set in my path, but per this question, I confirmed that R is in my PATH (I can type R at the command line and get R to run) and even copied all of the files from t

2018-06-25 23:00:27

HOME Error with rpy2

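I know there are quite a few posts on getting up and running with rpy2 on Windows 7 32-bit. I have referenced a good number of them and attempted their solutions, including the use of PypeR. I don't explicitly have an R_HOME variable set in my path, but per this question, I confirmed that R is in my PATH (I can type R at the command line and get R to run) and even copied all of the files from the i386 folder to the parent bin folder. My problem is pasted below. Any ideas? In [5]: from rpy2 import robjects -------------------------------------------------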

2018-06-25 23:00:26

trivial sums of outer products without temporaries in numpy

The actual problem I wish to solve is: given a set of N unit vectors and another set of M vectors, calculate for each of the unit vectors the average of the absolute value of its dot product with every one of the M vectors. Essentially this is calculating the outer product of the two matrices and summing and averaging with an absolute value stuck in between. For N and M not too large this

2018-06-25 22:58:14

Trivial sums of outer products without temporaries in numpy

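The actual problem I wish to solve is: given a set of N unit vectors and another set of M vectors, calculate for each of the unit vectors the average of the absolute value of its dot product with every one of the M vectors. Essentially this is calculating the outer product of the two matrices and summing and averaging with an absolute value stuck in between. For N and M not too large this is not hard and there are many ways to proceed (see below). The problem is that when N and M are large, the temporaries created are huge and impose a practical limit on the approaches provided. Can this calculation be done without creating temporaries? The main difficulty I have is due to the abs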

2018-06-25 22:58:13

What substitutes xreadlines() in Python 3?

In Python 2, file objects had an xreadlines() method which returned an iterator that would read the file one line at a time. In Python 3, the xreadlines() method no longer exists, and readlines() still returns a list (not an iterator). Does Python 3 have something similar to xreadlines()? I know I can do for line in f: instead of for line in f.xreadlines(): But I would also like to use xrea

2018-06-25 20:21:35

What substitutes xreadlines() in Python 3?

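In Python 2, file objects had an xreadlines() method which returned an iterator that would read the file one line at a time. In Python 3, the xreadlines() method no longer exists, and readlines() still returns a list (not an iterator). Does Python 3 have something similar to xreadlines()? I know I can do for line in f: instead of for line in f.xreadlines(): But I would also like to use xreadlines() without a for loop: print(f.xreadlines()[7]) #read lines 0 to 7 and prints line 7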

2018-06-25 20:21:35
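
In Python 3 a text file object is already a lazy line iterator, so it can be passed to itertools.islice to pick out a single line without an explicit for loop and without building the full list. The following is a minimal sketch, using io.StringIO as a stand-in for an open file; the helper name nth_line is hypothetical.

```python
import io
from itertools import islice


def nth_line(lines, n):
    """Return line n (0-based) from an iterable of lines, or None if it is too short."""
    # islice consumes lines 0..n lazily and yields only line n.
    return next(islice(lines, n, n + 1), None)


# io.StringIO stands in for a file opened with open(path).
f = io.StringIO("".join("line {}\n".format(i) for i in range(10)))
line_7 = nth_line(f, 7)
print(repr(line_7))
```

Because the file object is consumed as it is read, a second call on the same object continues from the line after the one returned, unless the file is reopened or rewound with seek(0).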