
How to get Company name from domain name

 
Greenhorn
Posts: 3
Hi, I want to write a Java API that takes a domain name as a parameter and returns the company name.

Some Examples

URL -> Company name

xxx.amazon.com -> Amazon (.com domain)
xxx.amazon.co.uk -> Amazon (2-level country domain (.uk, .au, .il))
xxx.amazon.edu.uk -> amazon.edu.uk (other 2-level country domain)
xxx.amazon.ca -> Amazon (Single-level country domain (.ca, .de))
xxx.adap.tv -> adap.tv (other top-level domain (.edu, .tv))


Can anyone share ideas, opinions, an algorithm, or a regex?

Thanks in advance.
 
Marshal
Posts: 79178
Difficult to be certain, but maybe a job for a regular expression.

You want the part before either the last . or the last-but-one . (remember you have to escape the dot: \. in the regex itself, \\. inside a Java string literal). If you want the last-but-one ., there are only a few combinations which can follow it (e.g. co. ac. org. me. ltd. gov.), which you should be able to find from the W3C website or similar. BTW: we don't use .edu in Britain, only .ac.uk.

And remember there are only a few countries, e.g. .uk, where we use two dots, e.g. .co.uk.
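The regex idea above could be sketched roughly like this. The class and method names are made up for illustration, and the list of second-level labels is just the handful mentioned in this post, so it is nowhere near complete. The sketch only returns the capitalised label; it does not try to reproduce the "adap.tv" or "amazon.edu.uk" rows from the original question.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of any library.
final class CompanyNameRegex {
    // Captures the label that sits before an optional second-level label
    // (co, ac, org, ... taken from the post above; an incomplete list)
    // followed by the final top-level label.
    private static final Pattern NAME = Pattern.compile(
            "(?:^|\\.)([^.]+)(?:\\.(?:co|ac|org|me|ltd|gov))?\\.[^.]+$");

    static Optional<String> companyName(String host) {
        Matcher m = NAME.matcher(host.toLowerCase(Locale.ROOT));
        if (!m.find()) {
            return Optional.empty(); // e.g. a host with no dot at all
        }
        String label = m.group(1);
        // Capitalise the first character: "amazon" becomes "Amazon".
        return Optional.of(Character.toUpperCase(label.charAt(0)) + label.substring(1));
    }
}
```

With this, `CompanyNameRegex.companyName("xxx.amazon.co.uk")` and `CompanyNameRegex.companyName("xxx.amazon.com")` both give `Optional[Amazon]`. Any two-level suffix missing from the alternation (say .com.au) will produce the wrong label, which is the weak point of this approach.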
 
Sheriff
Posts: 22783
The problem is finding out which these are. I can think of 3 easily: .co.jp, .co.uk and .com.au. But do you know all the others? I doubt it.
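If a list of suffixes is available from somewhere, the lookup itself is simple: try the longest candidate suffix first and take the label to its left. Below is a minimal sketch under that assumption. The class name and the tiny sample set are hypothetical; a real list (the Public Suffix List at publicsuffix.org is one maintained source) has thousands of entries.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper; the suffix set is a deliberately tiny sample.
final class SuffixLookup {
    private static final Set<String> SUFFIXES = new HashSet<>(Arrays.asList(
            "com", "ca", "de", "tv", "uk", "co.uk", "ac.uk", "jp", "co.jp", "au", "com.au"));

    /** Returns the label immediately left of the longest known suffix. */
    static Optional<String> registrableLabel(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        // i = 0 tests the whole host, then progressively shorter suffixes,
        // so the first hit is always the longest matching suffix.
        for (int i = 0; i < labels.length; i++) {
            String candidate = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (SUFFIXES.contains(candidate)) {
                // If the whole host is itself a suffix, there is no label to return.
                return i == 0 ? Optional.empty() : Optional.of(labels[i - 1]);
            }
        }
        return Optional.empty(); // unknown suffix
    }
}
```

For example, `SuffixLookup.registrableLabel("xxx.amazon.com.au")` gives `Optional[amazon]`, because "com.au" is tested before "au".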
 
author
Posts: 14112
As an aside, tv is a *country* domain: http://en.wikipedia.org/wiki/.tv
 
Campbell Ritchie
Marshal
Posts: 79178
I found this document about syntax from this Wikipedia page.
 
Campbell Ritchie
Marshal
Posts: 79178

Originally posted by Ilja Preuss:
As an aside, tv is a *country* domain.


Not any more; the people of Tuvalu sold it a few years ago for millions.
 
Ranch Hand
Posts: 49
I have the same question one year later.
Is there any support in Java for finding the company name and top-level domain from an email address without hardcoding .co.uk, .com.au, .com.tw, etc.?
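One way to avoid hardcoding, sketched here under some assumptions: keep the suffixes in a text file in the style of the Public Suffix List (one rule per line, `//` for comments) and load them at startup. The class and method names below are hypothetical, and the sketch skips the list's wildcard (`*.`) and exception (`!`) rules, so it is not a full implementation of that format. Guava's `InternetDomainName` (`publicSuffix()`, `topPrivateDomain()`) may be worth a look if a library dependency is acceptable.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper; not part of any library.
final class EmailDomain {
    /** Builds a suffix set from lines such as those returned by Files.readAllLines. */
    static Set<String> parseSuffixes(List<String> lines) {
        Set<String> suffixes = new HashSet<>();
        for (String raw : lines) {
            String line = raw.trim();
            // Skip blanks and comments; wildcard and exception rules are ignored in this sketch.
            if (line.isEmpty() || line.startsWith("//")
                    || line.startsWith("*") || line.startsWith("!")) {
                continue;
            }
            suffixes.add(line.toLowerCase(Locale.ROOT));
        }
        return suffixes;
    }

    /** Returns the company label plus suffix, e.g. "amazon.co.uk". */
    static Optional<String> registrableDomain(String email, Set<String> suffixes) {
        int at = email.lastIndexOf('@');
        if (at < 0 || at == email.length() - 1) {
            return Optional.empty(); // no domain part
        }
        String host = email.substring(at + 1).toLowerCase(Locale.ROOT);
        if (suffixes.contains(host)) {
            return Optional.empty(); // the host is itself a bare suffix
        }
        // Walk the dots left to right; the first remainder found in the set
        // is the longest matching suffix.
        int start = 0;
        int dot = host.indexOf('.');
        while (dot >= 0) {
            if (suffixes.contains(host.substring(dot + 1))) {
                return Optional.of(host.substring(start));
            }
            start = dot + 1;
            dot = host.indexOf('.', start);
        }
        return Optional.empty(); // unknown suffix
    }
}
```

The company label is then everything before the first dot of the result, and the suffix is everything after it.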



