
OpenNLP and Apache commons.lang

I will be integrating these two libraries into a text-based game engine! (Since the primary focus is on these APIs, I decided to post on the API board instead of the game development board.)

Specifically, from the OpenNLP (Open Natural Language Processing) project, I am focused on the SentenceDetector and the Parser, along with their related classes.

Here in my constructor I have initialized these two tools with their "Trainers", which give them the 'brains' to do what they do: breaking down and recognizing natural language in English.

And now the reason I created this thread: is that an appropriate way to initialize these objects/tools?

Their purpose will be to analyze and parse various kinds of input from the user, to determine what the user is doing. By text-based,
I don't just mean choose-your-own-adventure; I'm talking about implementing a full-on complex control structure.

> Go west, grab that sword and swing at the orc with it.

The sentence detector would do its thing to that, and tell me, "Yup, that is one sentence."


The parser will do some crazy %!#@ to it which I have a hard time describing, but it comes out looking like this...

Input: The quick brown fox jumps over the lazy dog .

Output: (TOP (NP (NP (DT The) (JJ quick) (JJ brown) (NN fox) (NNS jumps)) (PP (IN over) (NP (DT the) (JJ lazy) (NN dog))) (. .)))

I have yet to memorize the glossary for the generated tags: NN = noun, and so on.

The parentheses are, I'm assuming, the framework for a tree mapping of the sentence structure, but I have yet to work out how to apply it.
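
One way to make the tree visible is to read the bracketed string back into nested nodes and print it with indentation. This is only a minimal sketch that works on the text output shown above; it does not use OpenNLP itself, and the names ParseTreeSketch and Node are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reads a bracketed parse such as "(TOP (NP (DT the) (NN dog)))". */
final class ParseTreeSketch {

    /** One node: a tag label, plus either a word (leaf) or child nodes. */
    static final class Node {
        final String label;
        String word;                               // non-null only for leaves like (DT the)
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
    }

    private final String text;
    private int pos = 0;

    private ParseTreeSketch(String text) { this.text = text; }

    static Node parse(String bracketed) {
        return new ParseTreeSketch(bracketed.trim()).readNode();
    }

    /** Reads "(" label, then any mix of nested nodes or one word, then ")". */
    private Node readNode() {
        expect('(');
        Node node = new Node(readSymbol());
        skipSpaces();
        while (pos < text.length() && text.charAt(pos) != ')') {
            if (text.charAt(pos) == '(') {
                node.children.add(readNode());
            } else {
                node.word = readSymbol();
            }
            skipSpaces();
        }
        expect(')');
        return node;
    }

    /** Reads characters up to the next whitespace or parenthesis. */
    private String readSymbol() {
        skipSpaces();
        int start = pos;
        while (pos < text.length()) {
            char c = text.charAt(pos);
            if (Character.isWhitespace(c) || c == '(' || c == ')') break;
            pos++;
        }
        return text.substring(start, pos);
    }

    private void skipSpaces() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) pos++;
    }

    private void expect(char c) {
        skipSpaces();
        if (pos >= text.length() || text.charAt(pos) != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at position " + pos);
        }
        pos++;
    }

    /** Appends one line per node, indenting children two spaces deeper. */
    static void print(Node node, String indent, StringBuilder out) {
        out.append(indent).append(node.label);
        if (node.word != null) out.append(" -> ").append(node.word);
        out.append('\n');
        for (Node child : node.children) print(child, indent + "  ", out);
    }

    public static void main(String[] args) {
        Node root = parse("(TOP (NP (DT the) (JJ lazy) (NN dog)) (. .))");
        StringBuilder out = new StringBuilder();
        print(root, "", out);
        System.out.print(out);
    }
}
```

Once the sentence is a tree of nodes, a game engine could walk it looking for particular labels (for example, every NP under a PP) instead of scanning the flat string.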


I am wondering if anyone is familiar with the OpenNLP and commons.lang libraries from Apache,
and whether someone might be available to reply on this thread going forward, because I will be working
heavily with these libraries, implementing their interfaces into my engine.



Edit: Turns out I scrapped the parser tool! All I needed were the other tools: the POSTagger (part-of-speech tagger) and the tokenizer!

Process of using these tools:

1. The input is split into sentences, each stored as one String element of an array.
2. Each sentence is tokenized, and the tokenized sentences are stored in an ArrayList<String[]>. (Each String array contains the tokens of one sentence.)
3. Then the POSTagger iterates through each sentence's tokens and generates an ArrayList<String[]> of tags! (Each element of each array corresponds to the matching token from the sentence.)
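
The three steps above can be sketched roughly as follows. To keep the example runnable without the OpenNLP jar and model files, the three tools are passed in as plain functions; the class name TaggingPipelineSketch and the toy stand-ins in main are assumptions for illustration only. In a real engine the stand-ins would presumably be replaced by OpenNLP's SentenceDetectorME.sentDetect, TokenizerME.tokenize and POSTaggerME.tag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of the sentence -> tokens -> tags pipeline. */
final class TaggingPipelineSketch {

    /** One String[] of tokens per sentence. */
    final List<String[]> tokens = new ArrayList<>();
    /** One String[] of tags per sentence; tags.get(i)[j] belongs to tokens.get(i)[j]. */
    final List<String[]> tags = new ArrayList<>();

    static TaggingPipelineSketch run(String input,
                                     Function<String, String[]> sentenceSplitter,
                                     Function<String, String[]> tokenizer,
                                     Function<String[], String[]> tagger) {
        TaggingPipelineSketch result = new TaggingPipelineSketch();
        // Step 1: split the raw input into sentences.
        for (String sentence : sentenceSplitter.apply(input)) {
            // Step 2: tokenize each sentence.
            String[] sentenceTokens = tokenizer.apply(sentence);
            // Step 3: tag the tokens; the tag array must line up with the token array.
            String[] sentenceTags = tagger.apply(sentenceTokens);
            if (sentenceTags.length != sentenceTokens.length) {
                throw new IllegalStateException("Tagger returned a different number of tags than tokens");
            }
            result.tokens.add(sentenceTokens);
            result.tags.add(sentenceTags);
        }
        return result;
    }

    public static void main(String[] args) {
        // Crude stand-ins, NOT real NLP: split after end punctuation, split on spaces, tag everything "UNK".
        Function<String, String[]> splitter = text -> text.split("(?<=[.!?])\\s+");
        Function<String, String[]> tokenizer = sentence -> sentence.split("\\s+");
        Function<String[], String[]> tagger = toks -> {
            String[] out = new String[toks.length];
            Arrays.fill(out, "UNK");
            return out;
        };

        TaggingPipelineSketch result = run("Go west. Grab that sword.", splitter, tokenizer, tagger);
        for (int i = 0; i < result.tokens.size(); i++) {
            System.out.println(Arrays.toString(result.tokens.get(i))
                    + " " + Arrays.toString(result.tags.get(i)));
        }
    }
}
```

Keeping the tokens and tags in two parallel lists mirrors the ArrayList<String[]> layout described in the steps, so the tag for any token can be looked up by the same pair of indices.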

The results yield:

Sentence: Crypto_PRP ,_, please_VB execute_VB command_NN zero_NN ._.
> Crypto, please execute command zero.
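
For turning a tagged line like the one above into something the engine can act on, here is a small sketch that splits each token_TAG pair and then picks out tokens by tag prefix. It works on the printed text only, and the names TaggedTokensSketch and TaggedToken are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits "token_TAG" pairs from a tagged sentence. */
final class TaggedTokensSketch {

    static final class TaggedToken {
        final String token;
        final String tag;
        TaggedToken(String token, String tag) {
            this.token = token;
            this.tag = tag;
        }
    }

    /** Splits on whitespace, then on the LAST underscore so tokens containing '_' survive. */
    static List<TaggedToken> parse(String tagged) {
        List<TaggedToken> result = new ArrayList<>();
        for (String pair : tagged.trim().split("\\s+")) {
            int cut = pair.lastIndexOf('_');
            if (cut <= 0) continue;            // skip anything that is not token_TAG
            result.add(new TaggedToken(pair.substring(0, cut), pair.substring(cut + 1)));
        }
        return result;
    }

    /** Returns the tokens whose tag starts with the given prefix, e.g. "VB" or "NN". */
    static List<String> tokensWithTagPrefix(List<TaggedToken> tokens, String prefix) {
        List<String> matches = new ArrayList<>();
        for (TaggedToken t : tokens) {
            if (t.tag.startsWith(prefix)) matches.add(t.token);
        }
        return matches;
    }

    public static void main(String[] args) {
        List<TaggedToken> tokens =
                parse("Crypto_PRP ,_, please_VB execute_VB command_NN zero_NN ._.");
        System.out.println(tokensWithTagPrefix(tokens, "VB"));   // prints [please, execute]
        System.out.println(tokensWithTagPrefix(tokens, "NN"));   // prints [command, zero]
    }
}
```

Note that in this sample "please" comes back tagged as a verb, so a command interpreter would probably want its own list of filler words rather than trusting every VB as an action.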
