
Version in HBase Column family

 
Ranch Hand
Posts: 149
Hi,

Cells in HBase have a concept of versions. By default, a cell can hold up to 3 versions. There are methods to get, put and scan the column.

My questions are

HOW does my application benefit from versions?

An RDBMS keeps only one version per cell, whereas HBase offers multiple versions. WHY does HBase keep multiple versions per cell?

Thanks,
Rajesh
 
Bartender
Posts: 1210
The version concept comes from Google's Bigtable paper, which was the basis for implementing HBase.

Google's search spider visits websites repeatedly. Since a website may change between visits, Bigtable stores multiple versions of its contents, and perhaps of the relationships between sites. That makes it easy to run a query like "get latest contents of <url>" or "get latest 2 versions of <url> and diff them".

It's like version control for data.
If a cell value can change but you need the history of changes later on - perhaps for auditing or diff'ing - use versions.

Whether it's useful to your application depends on what your application does.

For example, an editable wiki can store multiple versions of a wiki article in the same row and column. In an RDBMS, that would require multiple rows, each with a different entry in the timestamp column.
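To make the wiki example concrete, here is a minimal sketch of a single versioned cell in plain Java. It is only an in-memory model, not the HBase client API: the class name `VersionedCell`, its methods and the sample timestamps are all made up for illustration. It also drops old versions as soon as the cap is exceeded, which is a simplification of what a real store does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory model of one versioned cell; NOT the HBase client API.
class VersionedCell {
    // Keys are timestamps, sorted newest first.
    private final TreeMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());
    private final int maxVersions;

    VersionedCell(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be >= 1");
        }
        this.maxVersions = maxVersions;
    }

    // Store a value under a timestamp, then drop the oldest versions beyond the cap.
    // (Assumption: eager eviction, chosen here only to keep the model short.)
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry = oldest timestamp in reverse order
        }
    }

    // Return up to n values, newest first.
    List<String> latest(int n) {
        List<String> out = new ArrayList<>();
        for (String value : versions.values()) {
            if (out.size() == n) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCell article = new VersionedCell(3);
        article.put(100L, "draft");
        article.put(200L, "first edit");
        article.put(300L, "second edit");
        article.put(400L, "third edit"); // exceeds the cap, so "draft" is dropped
        System.out.println(article.latest(1)); // the current article text
        System.out.println(article.latest(2)); // the two newest texts, ready to diff
    }
}
```

The same row and column hold every revision, so "latest contents" and "latest 2 versions" are both reads of one cell rather than queries across several rows.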
 
Greenhorn
Posts: 8
Hi Rajesh,

You are right, HBase does support versions at the column family level, although I am not sure how many versions it supports.

In my opinion, version support is one of the key benefits of HBase.
In an RDBMS, you would keep backups of the database for cases like failure or rollback. Backups consume a lot of space, and you have to load a whole backup in order to check a single change in a column value.
With HBase, a single call does it.
For example:
- to return more than one version, see Get.setMaxVersions()

You can also check the values at a given time:
- to return versions other than the latest, see Get.setTimeRange()
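As a rough illustration of what those two settings mean on the read side, the sketch below filters one cell's versions by a timestamp window and a version limit. It is plain Java over a `NavigableMap`, not the HBase client: `VersionReader`, `read` and the sample data are hypothetical, and the half-open window (minimum inclusive, maximum exclusive) is my understanding of how HBase time ranges behave, so check the documentation for your release.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for reading one cell's versions; NOT the HBase client API.
class VersionReader {
    // Return values whose timestamp satisfies minStamp <= timestamp < maxStamp,
    // newest first, and at most maxVersions of them.
    static List<String> read(NavigableMap<Long, String> versionsByTimestamp,
                             long minStamp, long maxStamp, int maxVersions) {
        List<String> out = new ArrayList<>();
        // Half-open window: minStamp inclusive, maxStamp exclusive.
        NavigableMap<Long, String> inRange =
                versionsByTimestamp.subMap(minStamp, true, maxStamp, false).descendingMap();
        for (String value : inRange.values()) {
            if (out.size() == maxVersions) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> cell = new TreeMap<>();
        cell.put(100L, "v1");
        cell.put(200L, "v2");
        cell.put(300L, "v3");
        // Newest version only, over all time.
        System.out.println(read(cell, 0L, Long.MAX_VALUE, 1));
        // Every version written before timestamp 300.
        System.out.println(read(cell, 0L, 300L, 10));
    }
}
```

The version limit plays the role of Get.setMaxVersions() and the timestamp window plays the role of Get.setTimeRange(); combining them answers "what did this cell hold before time T".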

You can find the HBase versions documentation, with examples, here:
http://hbase.apache.org/0.94/book/versions.html

HBase is typically used for analytics nowadays. Checking how the value of a single field changes over time is an important part of analytics, and HBase makes that easy.
If you search on Google, you will find many more scenarios for version support.
 