Which of these is the more commonly accepted approach for a Hadoop cluster in the cloud?

I have seen the following two approaches to running a Hadoop cluster in the cloud:

Approach 1: Create virtual machines in the cloud, one for each node you want. Install Hadoop on those nodes and create a Hadoop cluster. Keep these virtual machines up and running, and so keep paying for them continuously.

Approach 2: Write a Unix script containing all the commands, from creating the virtual machines to creating the Hadoop cluster. Run this script to create the virtual machines and then the Hadoop cluster. Use the cluster to do your processing. After your work is done, shut down the virtual machines and delete them. The next time you have work to do, run the script again to recreate the virtual machines and the cluster, then do your processing, and so on.
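To make the create, use, delete cycle of Approach 2 concrete, here is a minimal sketch in Python rather than shell. All function names (`create_vms`, `install_hadoop`, `delete_vms`, `run_workload`) are hypothetical placeholders that only record what they would do; a real script would call the cloud provider's own CLI or SDK at those points. The one design point it tries to show is that the delete step sits in a `finally` block, so the virtual machines are removed even if the job fails.

```python
from contextlib import contextmanager

# Hypothetical placeholders: a real script would call the cloud provider's
# CLI or SDK here. None of these names belong to a real provider API.
def create_vms(count, log):
    log.append(f"create {count} VMs")
    return [f"node-{i}" for i in range(count)]

def install_hadoop(nodes, log):
    log.append(f"install Hadoop on {len(nodes)} nodes")

def delete_vms(nodes, log):
    log.append(f"delete {len(nodes)} VMs")

@contextmanager
def ephemeral_cluster(node_count, log):
    """Create the cluster on entry and always delete the VMs on exit."""
    nodes = create_vms(node_count, log)
    try:
        install_hadoop(nodes, log)
        yield nodes
    finally:
        # Runs on success and on failure, so no VM keeps billing by accident.
        delete_vms(nodes, log)

def run_workload(node_count, job_name):
    """Run one job on a throwaway cluster and return the recorded steps."""
    log = []
    with ephemeral_cluster(node_count, log) as nodes:
        log.append(f"run {job_name} on {len(nodes)} nodes")
    return log

if __name__ == "__main__":
    for step in run_workload(3, "wordcount"):
        print(step)
```

Each call to `run_workload` goes through the whole cycle once, which corresponds to running the script each time there is work to do.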

Approach 2 is cheaper because the cluster is up and running only when it is required.
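As a rough illustration of the cost difference, the sketch below compares the two approaches using compute hours only. The node count, hourly rate, run length, and setup time are made-up numbers, not real prices, and the calculation assumes billing is proportional to VM uptime and ignores storage and data transfer charges.

```python
def always_on_cost(nodes, hourly_rate, hours_in_period):
    """Approach 1: every node is billed for the whole period."""
    return nodes * hourly_rate * hours_in_period

def on_demand_cost(nodes, hourly_rate, hours_per_run, runs, setup_hours_per_run=0.0):
    """Approach 2: nodes are billed only while a run (plus its setup) is active."""
    return nodes * hourly_rate * (hours_per_run + setup_hours_per_run) * runs

if __name__ == "__main__":
    # Hypothetical figures: 4 nodes at 0.50 per node-hour over a 720-hour month.
    print(always_on_cost(4, 0.50, 720))          # 1440.0
    # 20 runs of 3 hours each, plus half an hour per run to rebuild the cluster.
    print(on_demand_cost(4, 0.50, 3, 20, 0.5))   # 140.0
```

With these assumed numbers the on-demand cluster costs about a tenth of the always-on one; the gap narrows as the cluster is used for more hours in the period.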

Which of the above approaches is more commonly accepted in the industry? Thanks.