Chunking with nltk
How can I obtain all the chunks from a sentence, given a pattern? Example:
NP:{<NN><NN>}
Tagged sentence:
[("money", "NN"), ("market", "NN"), ("fund", "NN")]
When I parse it, I obtain:
(S (NP money/NN market/NN) fund/NN)
I would also like to get the other alternative, which is:
(S money/NN (NP market/NN fund/NN))
I think your question is about getting the n most likely parses of a sentence. Am I right? If so, see the nbest_parse(sent, n=None) function in the 2.0 documentation.
@mbatchkarov is right about the nbest_parse documentation. For the sake of a code example, see:
import nltk
# Define the CFG grammar.
# (NLTK 2.0 spelled this nltk.parse_cfg; NLTK 3 uses nltk.CFG.fromstring.)
grammar = nltk.CFG.fromstring("""
S -> NP
S -> NN NP
S -> NP NN
NP -> NN NN
NN -> 'market'
NN -> 'money'
NN -> 'fund'
""")
# Split the string into a list of tokens.
sentence = "money market fund".split(" ")
# Load the grammar into the ChartParser.
cp = nltk.ChartParser(grammar)
# Generate and print every parse the grammar allows for the sentence tokens.
# (NLTK 2.0 spelled this cp.nbest_parse(sentence); NLTK 3 uses cp.parse(sentence).)
for tree in cp.parse(sentence):
    print(tree)
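If you would rather keep the tag pattern from the question than write a CFG, the alternatives can also be enumerated directly over the tagged tokens. Below is a minimal sketch in plain Python, not an NLTK API: the helper names chunk_alternatives and to_bracketed are made up for illustration, and it assumes the pattern is a fixed sequence of tags (here NN NN) rather than a full chunk regular expression.

```python
def chunk_alternatives(tagged, pattern, label="NP"):
    """Yield every bracketing of `tagged` that contains at least one chunk.

    `tagged` is a list of (word, tag) pairs and `pattern` is a fixed
    sequence of tags (an assumption; real chunk patterns can be regexes).
    Chunks never overlap within a single alternative.
    """
    n, k = len(tagged), len(pattern)

    def matches(i):
        # True if the tags starting at position i equal the pattern.
        return i + k <= n and all(tagged[i + j][1] == pattern[j] for j in range(k))

    def walk(i):
        # Yield lists of items (plain tokens or chunks) covering tagged[i:].
        if i >= n:
            yield []
            return
        # Option 1: leave token i outside any chunk.
        for rest in walk(i + 1):
            yield [tagged[i]] + rest
        # Option 2: open a chunk at token i if the pattern matches here.
        if matches(i):
            for rest in walk(i + k):
                yield [(label, list(tagged[i:i + k]))] + rest

    for items in walk(0):
        # Skip the trivial alternative that contains no chunk at all.
        if any(isinstance(body, list) for _, body in items):
            yield items


def to_bracketed(items):
    """Render one alternative in the bracketed style used in the question."""
    parts = []
    for head, body in items:
        if isinstance(body, list):
            inner = " ".join("%s/%s" % (w, t) for w, t in body)
            parts.append("(%s %s)" % (head, inner))
        else:
            parts.append("%s/%s" % (head, body))
    return "(S " + " ".join(parts) + ")"


tagged = [("money", "NN"), ("market", "NN"), ("fund", "NN")]
alternatives = [to_bracketed(a) for a in chunk_alternatives(tagged, ["NN", "NN"])]
for line in alternatives:
    print(line)
```

For the sentence in the question this yields the two bracketings asked for, one with the NP over "money market" and one with the NP over "market fund". On longer inputs it also yields combinations of several non-overlapping chunks, so the number of alternatives can grow quickly.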