Extract Word from Synset using Wordnet in NLTK 3.0
Some time ago, someone on SO asked how to retrieve a list of words for a given synset using NLTK's wordnet wrapper. Here is one of the suggested responses:
for synset in wn.synsets('dog'):
    print synset.lemmas[0].name
Running this code with NLTK 3.0 yields TypeError: 'instancemethod' object is not subscriptable.
I tried each of the solutions proposed on the page linked above, but every one of them throws an error. Is it possible to print the words for a list of synsets with NLTK 3.0? I would be thankful for any advice.
WordNet works fine in NLTK 3.0. You are just accessing the lemmas (and names) in the wrong way. Try this instead:
>>> import nltk
>>> nltk.__version__
'3.0.0'
>>> from nltk.corpus import wordnet as wn
>>> for synset in wn.synsets('dog'):
...     for lemma in synset.lemmas():
...         print(lemma.name())
...
dog
domestic_dog
Canis_familiaris
frump
dog
dog
cad
bounder
blackguard
...
synset.lemmas is a method and does not have a __getitem__() method (and so is not subscriptable).
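To see why the snippet from the question fails, here is a minimal sketch that needs neither NLTK nor the WordNet data. FakeSynset is a hypothetical stand-in, not an NLTK class; it only mimics the fact that lemmas is a method in NLTK 3.0.

```python
class FakeSynset:
    """Hypothetical stand-in for an NLTK 3.0 Synset (not the real class)."""

    def __init__(self, names):
        self._names = list(names)

    def lemmas(self):
        # Like Synset.lemmas in NLTK 3.0, this is a method and must be called.
        return self._names


synset = FakeSynset(["dog", "domestic_dog", "Canis_familiaris"])

# Subscripting the bound method itself fails, as in the question.
try:
    synset.lemmas[0]
    subscript_error = None
except TypeError as exc:
    subscript_error = exc

# Calling the method first returns a list, which can be indexed.
first_name = synset.lemmas()[0]

print(type(subscript_error).__name__)
print(first_name)
```

The exact wording of the error message differs between Python versions, but in each case the cause is the same: the method object is indexed instead of its return value.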
You can also go directly to the lemma names with lemma_names():
>>> wn.synset('dog.n.1').lemma_names()
['dog', 'domestic_dog', 'Canis_familiaris']
And it works for multiple languages (provided the Open Multilingual WordNet data is installed):
>>> wn.synset('dog.n.1').lemma_names(lang='jpn')
['イヌ', 'ドッグ', '洋犬', '犬', '飼犬', '飼い犬']
Use:
wn.synset('dog.n.1').name()
instead of:
wn.synset('dog.n.1').name
because NLTK 3.0 changed Synset properties into getter methods. See https://github.com/nltk/nltk/commit/ba8ab7e23ea2b8d61029484098fd62d5986acd9c
This is a good list of changes to NLTK's API to suit py3.x: https://github.com/nltk/nltk/wiki/Porting-your-code-to-NLTK-3.0
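If the same code has to run against both pre-3.0 and 3.0 releases, one option is a small helper that calls the attribute when it is callable and returns it unchanged otherwise. This is only a sketch and not part of NLTK; attr_value, OldStyle and NewStyle are hypothetical names used for illustration.

```python
def attr_value(obj, attr):
    """Return obj.attr, calling it first if it turns out to be a method."""
    value = getattr(obj, attr)
    return value() if callable(value) else value


class OldStyle:
    """Hypothetical object exposing name as a plain attribute (pre-3.0 style)."""
    name = "dog.n.01"


class NewStyle:
    """Hypothetical object exposing name as a method (3.0 style)."""
    def name(self):
        return "dog.n.01"


old_name = attr_value(OldStyle(), "name")
new_name = attr_value(NewStyle(), "name")
print(old_name == new_name)
```

With a real synset the call would presumably look like attr_value(wn.synset('dog.n.1'), 'name'). For new code it is simpler to target NLTK 3.0 and call the methods directly.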