Extract relationships from a sentence in NLTK
I am using NLTK 3.2.1 to extract the relationship between a PERSON and an ORGANIZATION, and also between an ORGANIZATION and a LOCATION.
I have run part-of-speech (POS) tagging and named entity recognition (NER) on the sentence, and drawn the parse tree for the NER result.
However, I am not able to extract these relationships from the sentence.
Here is the code:
import nltk, re
from nltk import word_tokenize
sentence = "Mark works at JPMC in London every day"
pos_tags = nltk.pos_tag(word_tokenize(sentence)) # POS tagging of the sentence
ne = nltk.ne_chunk(pos_tags) # Named Entity Recognition
ne.draw() # Draw the Parse Tree
IN = re.compile(r'.*\bin\b(?!\b.+ing)')
for rel1 in nltk.sem.extract_rels('PER', 'ORG', pos_tags, pattern = IN):
    print(nltk.sem.rtuple(rel1))
for rel2 in nltk.sem.extract_rels('ORG', 'LOC', pos_tags, pattern = IN):
    print(nltk.sem.rtuple(rel2))
How can I extract the 'Person - Organization' and 'Organization - Location' relationships?
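I think the problem is the document you pass to extract_rels: it should not be the POS-tagged list, it should be the named-entity (NE) chunk tree returned by ne_chunk.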
Working code
import nltk, re
from nltk import word_tokenize
# Note "works in": the pattern below only matches fillers that contain the word "in"
sentence = "Mark works in JPMC in London every day"
pos_tags = nltk.pos_tag(word_tokenize(sentence)) # POS tagging of the sentence
ne = nltk.ne_chunk(pos_tags) # Named Entity Recognition
# ne.draw() # Draw the Parse Tree
print(pos_tags)
IN = re.compile(r'.*\bin\b(?!\b.+ing)')
# Pass the NE chunk tree (ne), not the POS-tagged list
for rel in nltk.sem.extract_rels('PERSON', 'ORGANIZATION', ne, corpus='ace', pattern=IN):
    print(nltk.sem.rtuple(rel))
Output
[PER: 'Mark/NNP'] 'works/VBZ in/IN' [ORG: 'JPMC/NNP']
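One more thing worth checking is the pattern itself. As far as I can tell, extract_rels matches the pattern against the filler text between the two entities, in the same word/TAG form shown in the output above. The sketch below uses only the re module to show which fillers the IN pattern accepts. The three filler strings are hand-written examples, and the "works/VBZ at/IN" one is my assumption of what the original "works at" sentence would produce.

```python
import re

# Same pattern as in the code above: the filler must contain the word "in",
# and that "in" must not be followed later by text ending in "ing".
IN = re.compile(r'.*\bin\b(?!\b.+ing)')

# Hand-written filler strings in word/TAG form (illustrative, not NLTK output).
fillers = [
    "works/VBZ in/IN",                   # filler shown in the output above
    "works/VBZ at/IN",                   # assumed filler for the "works at" sentence
    "success/NN in/IN supervising/VBG",  # "in" followed by a gerund
]

def matches_in(filler):
    """Return True when the IN pattern matches the filler text."""
    return IN.match(filler) is not None

results = {filler: matches_in(filler) for filler in fillers}
for filler, ok in results.items():
    print(filler, "->", ok)
```

Only the first filler matches. So for the sentence with "works at", this pattern would probably still return nothing even when the NE tree is passed; a pattern that matches "at" would be needed in that case.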