NLG: building a sentence

I would like to generate a sentence from a set of input words. For example:

Input:

Mary
chase
the monkey

Output:

Mary chases the monkey.

This could be done with the SimpleNLG library (http://code.google.com/p/simplenlg/) in the following way:

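import simplenlg.framework.NLGFactory;
import simplenlg.lexicon.Lexicon;
import simplenlg.phrasespec.SPhraseSpec;
import simplenlg.realiser.english.Realiser;

// Setup assumed here, following the SimpleNLG v4 API: lexicon, factory, realiser.
Lexicon lexicon = Lexicon.getDefaultLexicon();
NLGFactory nlgFactory = new NLGFactory(lexicon);
Realiser realiser = new Realiser(lexicon);
String subject = "Mary";
String verb = "chase";
String object = "the monkey";

SPhraseSpec p = nlgFactory.createClause(); // an empty clause to fill with roles
p.setSubject(subject);
p.setVerb(verb);
p.setObject(object);

String output = realiser.realiseSentence(p); // inflects the verb, adds punctuation
System.out.println(output);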

This generates the sentence "Mary chases the monkey." But I would like to automate this, so that I input words and the sentence gets generated. That requires some preprocessing to determine which word is the subject, which is the verb, and which is the object. I know there are POS (part-of-speech) tagging libraries, but they don't specify whether a word is a subject or an object. Any suggestions on how this could be done? It should also work for bigger sentences with multiple objects, adverbs, etc.


In order to obtain the subject, verb or object for the input sentence you need to perform syntactic analysis or parsing.

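There are two main groups of parsing tools: constituent parsers and dependency parsers. Usually the former is the more direct path to what you need: in a constituent parse of a simple English clause, the subject is typically the NP directly under S, and the object is the NP inside the VP.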

These are some research constituent parsers that you may try:

  • Stanford parser
  • Berkeley parser
  • BUBS parser

This related question may also help: Simple Natural Language Processing Startup for Java
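To illustrate how roles can be read off a constituent parse, here is a minimal sketch in plain Java. It does not call any of the parsers above. It assumes you already have their output as a Penn Treebank style bracketed string, and the class name SvoFromParse and the example parse string are made up for illustration. The heuristic (first NP under S is the subject, first VB* and first NP under VP are the verb and object) only covers simple declarative clauses.

```java
import java.util.ArrayList;
import java.util.List;

class SvoFromParse {
    /** One node of a bracketed tree: (LABEL child...) or a leaf (TAG word). */
    static final class Node {
        final String label;
        String word; // set only on leaves such as (NN monkey)
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        /** First direct child whose label starts with the given prefix, or null. */
        Node child(String prefix) {
            for (Node c : children) {
                if (c.label.startsWith(prefix)) return c;
            }
            return null;
        }

        /** The words under this node, left to right, joined by spaces. */
        String text() {
            if (word != null) return word;
            StringBuilder sb = new StringBuilder();
            for (Node c : children) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(c.text());
            }
            return sb.toString();
        }
    }

    private final String[] tokens;
    private int pos = 0;

    private SvoFromParse(String bracketed) {
        // Put spaces around brackets so that a plain whitespace split tokenises the tree.
        tokens = bracketed.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
    }

    /** Reads one "( LABEL ... )" group starting at the current token. */
    private Node readNode() {
        pos++; // consume "("
        Node node = new Node(tokens[pos++]);
        while (!tokens[pos].equals(")")) {
            if (tokens[pos].equals("(")) node.children.add(readNode());
            else node.word = tokens[pos++];
        }
        pos++; // consume ")"
        return node;
    }

    /** Returns {subject, verb, object}; an entry is null if that role was not found. */
    static String[] extract(String bracketed) {
        Node root = new SvoFromParse(bracketed).readNode();
        Node s = root.label.equals("S") ? root : root.child("S");
        if (s == null) return new String[] {null, null, null};
        Node subjectNp = s.child("NP");
        Node vp = s.child("VP");
        Node verb = vp == null ? null : vp.child("VB");
        Node objectNp = vp == null ? null : vp.child("NP");
        return new String[] {
            subjectNp == null ? null : subjectNp.text(),
            verb == null ? null : verb.text(),
            objectNp == null ? null : objectNp.text()
        };
    }

    public static void main(String[] args) {
        // Hand-written example parse, not the output of a real parser run.
        String parse = "(ROOT (S (NP (NNP Mary)) (VP (VBZ chases) (NP (DT the) (NN monkey))) (. .)))";
        String[] svo = extract(parse);
        System.out.println(svo[0] + " | " + svo[1] + " | " + svo[2]); // Mary | chases | the monkey
    }
}
```

The three strings could then be handed to setSubject, setVerb and setObject as in the question. Sentences with coordination, passives or embedded clauses would need more careful tree traversal than this.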


    The most common approach is to build n-gram statistics and then construct the most probable sequence of words. One famous example can be found here: http://scribe.googlelabs.com/
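As a rough sketch of the n-gram idea, the plain Java below counts bigrams in a tiny training corpus and then picks the ordering of the input words with the highest add-one smoothed bigram probability. The class name BigramOrdering and the four training sentences are invented for illustration; a real system would train on a large corpus and would not enumerate every permutation. Note that this only orders words; it does not inflect "chase" into "chases".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class BigramOrdering {
    private final Map<String, Integer> bigramCounts = new HashMap<>();
    private final Map<String, Integer> historyCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();

    /** Counts adjacent word pairs; <s> and </s> mark the sentence boundaries. */
    void train(List<String> sentences) {
        for (String sentence : sentences) {
            String[] w = ("<s> " + sentence.toLowerCase() + " </s>").split("\\s+");
            vocabulary.addAll(Arrays.asList(w));
            for (int i = 0; i + 1 < w.length; i++) {
                bigramCounts.merge(w[i] + " " + w[i + 1], 1, Integer::sum);
                historyCounts.merge(w[i], 1, Integer::sum);
            }
        }
    }

    /** Log-probability of a word order under the add-one smoothed bigram model. */
    double logProbability(List<String> order) {
        List<String> padded = new ArrayList<>();
        padded.add("<s>");
        padded.addAll(order);
        padded.add("</s>");
        double total = 0.0;
        for (int i = 0; i + 1 < padded.size(); i++) {
            String prev = padded.get(i);
            int pair = bigramCounts.getOrDefault(prev + " " + padded.get(i + 1), 0);
            int history = historyCounts.getOrDefault(prev, 0);
            total += Math.log((pair + 1.0) / (history + vocabulary.size()));
        }
        return total;
    }

    /** Tries every permutation (factorial cost) and keeps the most probable one. */
    List<String> bestOrder(List<String> words) {
        List<String> best = new ArrayList<>(words);
        permute(new ArrayList<>(words), 0, best);
        return best;
    }

    private void permute(List<String> items, int start, List<String> best) {
        if (start == items.size()) {
            if (logProbability(items) > logProbability(best)) {
                best.clear();
                best.addAll(items);
            }
            return;
        }
        for (int i = start; i < items.size(); i++) {
            Collections.swap(items, start, i);
            permute(items, start + 1, best);
            Collections.swap(items, start, i); // undo the swap before trying the next choice
        }
    }

    public static void main(String[] args) {
        BigramOrdering model = new BigramOrdering();
        // Hypothetical toy corpus, lowercased by train().
        model.train(Arrays.asList(
            "mary chases the monkey",
            "the monkey eats a banana",
            "mary likes the dog",
            "the dog chases the cat"));
        System.out.println(model.bestOrder(Arrays.asList("the", "monkey", "mary", "chases")));
        // prints [mary, chases, the, monkey]
    }
}
```

Input words are expected in lower case here, matching the lowercased training data.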


    It would depend on the order of the words. If the order is "Mary", "chase", "the monkey", then the output would be "Mary chases the monkey." If the order is "the monkey", "chase", "Mary", then the output would be "The monkey chases Mary."
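If the input order can be trusted like this, the role assignment itself is purely positional. The sketch below shows that idea in plain Java; PositionalRoles is a hypothetical helper, not part of SimpleNLG, and it assumes the phrases arrive as subject, verb, and an optional object.

```java
import java.util.Arrays;
import java.util.List;

class PositionalRoles {
    final String subject;
    final String verb;
    final String object; // null when no object was given

    PositionalRoles(String subject, String verb, String object) {
        this.subject = subject;
        this.verb = verb;
        this.object = object;
    }

    /** Assumes the phrases are already ordered: subject, verb, optional object. */
    static PositionalRoles fromOrdered(List<String> phrases) {
        if (phrases.size() < 2 || phrases.size() > 3) {
            throw new IllegalArgumentException("expected subject, verb and an optional object");
        }
        String object = phrases.size() == 3 ? phrases.get(2) : null;
        return new PositionalRoles(phrases.get(0), phrases.get(1), object);
    }

    public static void main(String[] args) {
        PositionalRoles roles = fromOrdered(Arrays.asList("the monkey", "chase", "Mary"));
        System.out.println(roles.subject + " | " + roles.verb + " | " + roles.object);
        // prints: the monkey | chase | Mary
    }
}
```

The three fields map directly onto the setSubject, setVerb and setObject calls from the question, which leaves inflection and capitalisation to the realiser.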

    I had a look at the OpenNLP parser, but it takes a complete sentence as input and parses it. What I have as input is words, and I need to build a sentence from them.

    And in any case, when I look at the example:

    The quick brown fox jumps over the lazy dog .

    The parser should now print the following to the console.

    (TOP (NP (NP (DT The) (JJ quick) (JJ brown) (NN fox) (NNS jumps)) (PP (IN over) (NP (DT the) (JJ lazy) (NN dog))) (. .)))

    All I can see is parts of speech. I can't see it identifying subjects, objects, etc., unless the API has a function for that.

    If I am wrong, correct me.
