Specific problem domains in which Mahout is best

 
Greenhorn
Posts: 15

Welcome to JavaRanch.

My question is this: we use Mahout for clustering, but are there specific scenarios in which Mahout performs better? That is, are there specific problem types for which it gives better results than other tools?

Thanks
 
author
Posts: 21
Better results than what? And do you mean "better" in the sense of faster, or more accurate?

The clustering algorithms in Mahout are fairly standard algorithms, not some special approach. So in terms of quality, I think they perform as well as any other implementation of the same standard algorithms.

In terms of performance: they are implemented on Hadoop. This makes it much easier to scale up to very large data sets, but it also means you incur a lot of Hadoop overhead. For small data sets, you could probably find a faster implementation that runs entirely on one machine, maybe something written in R. For very large data sets, where non-distributed tools can't be applied, I imagine it's about as good as anything else freely available. Honestly, I'm not aware of another distributed clustering package to compare it to.
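
To make the "standard algorithm on one machine" point concrete, here is a rough sketch of plain in-memory k-means (Lloyd's algorithm) in Java. This is not Mahout code and does not use any Mahout API; the class name KMeansSketch, the cluster method, and the choice to seed the centroids with the first k points are all assumptions made purely for illustration. It only shows the kind of small, non-distributed implementation that avoids the Hadoop overhead on data that fits in memory.

```java
import java.util.Arrays;

/** Minimal in-memory k-means (Lloyd's algorithm). Illustrative only; not Mahout code. */
final class KMeansSketch {

    /** Returns the cluster index assigned to each point. Assumes 1 <= k <= points.length. */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        int n = points.length;
        int dim = points[0].length;

        // Simplifying assumption: seed the centroids with the first k points.
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }

        int[] assignment = new int[n];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;

            // Assignment step: attach each point to its nearest centroid.
            for (int i = 0; i < n; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double d = squaredDistance(points[i], centroids[c]);
                    if (d < bestDist) {
                        bestDist = d;
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }

            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // empty cluster: keep the previous centroid
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] points = {{1.0, 1.0}, {1.2, 0.8}, {9.0, 9.0}, {8.8, 9.2}};
        System.out.println(Arrays.toString(cluster(points, 2, 100)));
    }
}
```

Everything here lives in arrays on a single JVM, so there is no job startup or data shuffling cost, but it also stops working once the data no longer fits in memory, which is the trade-off described above.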
 
Alok Bhandari
Greenhorn
Posts: 15
Hello

Thanks for your reply. Yes, I was asking about both accuracy and performance.

Thanks
 