reading info from english dictionary

 
Ranch Hand
Posts: 82
Hi all,

Here is what I am trying to do: natural language processing in Java.
The first step is to classify words into their parts of speech. For that, I need to either refer to an existing dictionary or build a database myself. Building a noun/verb database by hand seems like wasted effort, so I was wondering whether the program could go online whenever it finds a word that is not in its local database, look it up in some online dictionary, and add it to the local database.

If anything in the question is unclear, please ask.
If you think there is a better way of doing this, please share your ideas.
If you know how to do it, please guide me.
Thanks to all.
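
The "local database with online fallback" idea described above could be sketched roughly as follows. This is only an illustration under assumptions: `PosDictionary` and its methods are hypothetical names, the "local database" is just an in-memory map, and the online dictionary is represented by a plain function that you would have to implement against whatever source you end up using. A word maps to a set of tags because the same word can belong to more than one part of speech.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical local part-of-speech store that falls back to a remote lookup on a miss. */
class PosDictionary {
    // Stand-in for the local database: lower-cased word -> set of part-of-speech tags.
    private final Map<String, Set<String>> local = new HashMap<>();
    // Assumed remote source: returns the tags for a word, or an empty set if it is unknown.
    private final Function<String, Set<String>> remote;
    private int remoteCalls = 0;

    PosDictionary(Function<String, Set<String>> remote) {
        this.remote = remote;
    }

    /** Returns the tags for a word, asking the remote source only on the first miss. */
    Set<String> lookup(String word) {
        String key = word.toLowerCase(Locale.ROOT);
        Set<String> cached = local.get(key);
        if (cached != null) {
            return cached;
        }
        remoteCalls++;
        // Empty results are stored too, so unknown words are not fetched again.
        Set<String> fetched = Set.copyOf(remote.apply(key));
        local.put(key, fetched);
        return fetched;
    }

    /** Number of times the remote source has been consulted so far. */
    int remoteCalls() {
        return remoteCalls;
    }
}
```

A real version would persist the map (a file or an actual database) and would need to handle network failures from the remote source.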
 
Author
Posts: 12617
Seems reasonable, although if the dictionary in question doesn't have an API it'll be a lot of work. You should probably check for existing word classification work since NLP isn't a new field.

Be aware that classification depends on context, and NLP in general is a non-trivial problem.
 
Marshal
Posts: 79177
Not a "beginning" question. Moving.

As David Newton says, natural language processing is a major problem; it is really a science in its own right.
 
Shrinath M Aithal
Ranch Hand
Posts: 82
OK, thank you, guys.
But how do you read a page on the web and extract only the information you want? For example, how would you look up a word in an online thesaurus and find out whether it is a verb, a noun, or some other part of speech?
I googled a bit and couldn't find much Java source code that does what I want. Any help would be enlightening and appreciated.
 
David Newton
Author
Posts: 12617
Without an API you'd have to screen-scrape.

As I said, I'd seriously consider looking for existing datasets, although naive, non-contextual usage may not be what you want.

I'd probably join the ACM (if you're not already a member) and start reading papers; a ton of dissertations and theses have been written on what you're trying to accomplish.
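
To give a rough idea of what screen-scraping involves, here is a minimal sketch that pulls part-of-speech labels out of a page's HTML. Everything about the markup is an assumption: it pretends the dictionary page marks each label as `<span class="pos">noun</span>`, which a real site almost certainly will not match exactly. `PosScraper` and `extractPartsOfSpeech` are made-up names, and fetching the page itself is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical scraper for part-of-speech labels embedded in dictionary HTML. */
class PosScraper {
    // Assumed markup: <span class="pos">noun</span>. Adjust to the real page structure.
    private static final Pattern POS_SPAN = Pattern.compile(
            "<span\\s+class=\"pos\"\\s*>\\s*([^<]+?)\\s*</span>",
            Pattern.CASE_INSENSITIVE);

    /** Returns every label found in the HTML, lower-cased, in document order. */
    static List<String> extractPartsOfSpeech(String html) {
        List<String> labels = new ArrayList<>();
        Matcher matcher = POS_SPAN.matcher(html);
        while (matcher.find()) {
            labels.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return labels;
    }
}
```

Regular expressions like this break as soon as the site changes its markup, which is part of why scraping is a lot of work. A proper HTML parser is usually more robust than a regex, and the site's terms of use are worth checking before scraping it.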
 
Shrinath M Aithal
Ranch Hand
Posts: 82
OK. So you're saying I should use an existing dataset. What do you think about WordNet? Would that be easier?
By the way, thanks for pointing me to the ACM; I wasn't aware of it. There are loads of the things I was looking for there.
 
Shrinath M Aithal
Ranch Hand
Posts: 82
I found a good part-of-speech tagger that offers both an API and a command-line interface, the Stanford POS Tagger. I thought I'd mention it in case anyone else is looking for one. Thank you, guys.
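
For anyone who runs a tagger from the command line and wants to consume its output in Java, here is a small sketch of splitting tagged text into word/tag pairs. It assumes the output is whitespace-separated tokens of the form `word_TAG`; the separator character is a parameter because it may differ between versions and settings, so check what your tagger actually prints. `TaggedToken` and `TaggedOutputParser` are hypothetical names, not part of the tagger's API.

```java
import java.util.ArrayList;
import java.util.List;

/** A word paired with its part-of-speech tag. */
final class TaggedToken {
    final String word;
    final String tag;

    TaggedToken(String word, String tag) {
        this.word = word;
        this.tag = tag;
    }
}

/** Hypothetical parser for "word<separator>TAG" tokens separated by whitespace. */
class TaggedOutputParser {
    static List<TaggedToken> parse(String tagged, char separator) {
        List<TaggedToken> tokens = new ArrayList<>();
        for (String token : tagged.trim().split("\\s+")) {
            // Split on the last separator so words containing the separator stay intact.
            int split = token.lastIndexOf(separator);
            if (split <= 0 || split == token.length() - 1) {
                continue; // skip tokens with no word or no tag
            }
            tokens.add(new TaggedToken(token.substring(0, split), token.substring(split + 1)));
        }
        return tokens;
    }
}
```

For example, parsing the hand-written string `The_DT book_NN is_VBZ good_JJ` with `'_'` as the separator gives four tokens, the second being the word `book` with the tag `NN`.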
 