Scan returns only one cell version. How can I scan all versions?

Joseph Hwang
Greenhorn

Joined: Aug 17, 2013
Posts: 16
I use HBase 0.96 with Hadoop 2.2, and I am trying to write a MapReduce job with the TableMapReduceUtil.initTableMapperJob method.
My HBase column family contains almost 200 cells. My data format, driver code, and map function code are below.

Data =========
Brazil column=INTLCTRY_DATA:age, timestamp=1396002150554, value=Aged 15-24
Brazil column=INTLCTRY_DATA:average, timestamp=1396002150554, value=3831000.0
Brazil column=INTLCTRY_DATA:data, timestamp=1377961200000, value=3831000.0 <=(This cell contains 200 timestamps)
Brazil column=INTLCTRY_DATA:freq, timestamp=1396002150554, value=\x00
Brazil column=INTLCTRY_DATA:sex, timestamp=1396002150554, value=All Persons
Brazil column=INTLCTRY_DATA:title, timestamp=1396002150554, value=Active Population

Driver class=====
public static void main(String[] args) throws Exception {
    // TODO Auto-generated method stub
    Configuration config = HBaseConfiguration.create();
    Job job = Job.getInstance(config, "HBase MapReduce Test");
    job.setJarByClass(MyDriver.class);

    Scan scan = new Scan();
    scan.setMaxResultSize(200);
    scan.setCaching(1000);
    scan.setCacheBlocks(false);
    scan.setMaxVersions();

    TableMapReduceUtil.initTableMapperJob("INTLCTRY_TABLE", scan, MyMapper.class, Text.class, FloatWritable.class, job);
    ….

Map class======
public class MyMapper extends TableMapper<Text, FloatWritable> {

    private final byte[] COLUMN_FAMILY = "INTLCTRY_DATA".getBytes();
    private Text key = new Text();
    private FloatWritable output = new FloatWritable();

    @Override
    public void map(ImmutableBytesWritable row, Result value, Context context) throws InterruptedException, IOException {
        String bCntyName = new String(value.getRow());
        String bTitle = new String(value.getValue(COLUMN_FAMILY, Bytes.toBytes("title")));
        String bAgeRange = new String(value.getValue(COLUMN_FAMILY, Bytes.toBytes("age")));
        String bSex = new String(value.getValue(COLUMN_FAMILY, Bytes.toBytes("sex")));
        char bFreq = new String(value.getValue(COLUMN_FAMILY, Bytes.toBytes("freq"))).charAt(0);

        key.set(bCntyName + "," + bTitle + "," + bAgeRange + "," + bSex + "," + bUnit + "," + bFreq + "," + bSeasonalAdj + "," + bUpdateDate);
        System.out.println("LENGTH : " + value.listCells().size()); // Length is NOT 200, only 9

        // Emit one output record per "data" cell found in this row's Result.
        for (Cell c : value.rawCells()) {
            String qualifier = new String(CellUtil.cloneQualifier(c));
            if (qualifier.equals("data")) {
                Float f = Float.parseFloat(new String(CellUtil.cloneValue(c))); // parsing only 1 value, not all values
                output.set(f);
                context.write(key, output);
            }
        }
    }
}

It seems the Result value contains only the newest version of each cell, not all versions. How can I scan all cell versions in the HBase map function?
Please give me your advice. Thanks in advance.
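
One possibility worth checking, offered as an assumption rather than a confirmed diagnosis for this table: a scan cannot return more versions than the column family is configured to retain, so setMaxVersions() on the Scan has no visible effect if the family's VERSIONS setting is 1. The family's setting can be inspected with describe 'INTLCTRY_TABLE' in the HBase shell. The sketch below is a toy model in plain Java, not the HBase API; the class VersionedColumn and its methods are hypothetical names used only to illustrate how a retention limit caps what a "read all versions" request can see.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Toy model only: this is NOT the HBase API. It mimics how a column that
// retains at most maxVersions values answers a read asking for N versions.
public class VersionRetentionSketch {

    // Hypothetical stand-in for one column (row + family + qualifier).
    static final class VersionedColumn {
        private final int maxVersions;
        // Newest timestamp first.
        private final TreeMap<Long, String> versions =
                new TreeMap<Long, String>(Collections.<Long>reverseOrder());

        VersionedColumn(int maxVersions) {
            this.maxVersions = maxVersions;
        }

        void put(long timestamp, String value) {
            versions.put(timestamp, value);
            // Drop the oldest entries beyond the retention limit.
            while (versions.size() > maxVersions) {
                versions.pollLastEntry();
            }
        }

        // Returns up to `requested` values, newest first.
        List<String> read(int requested) {
            List<String> out = new ArrayList<String>();
            for (String v : versions.values()) {
                if (out.size() == requested) {
                    break;
                }
                out.add(v);
            }
            return out;
        }
    }

    public static void main(String[] args) {
        VersionedColumn keepsOne = new VersionedColumn(1);
        VersionedColumn keepsMany = new VersionedColumn(200);
        for (long ts = 1; ts <= 5; ts++) {
            keepsOne.put(ts, "v" + ts);
            keepsMany.put(ts, "v" + ts);
        }
        // Both reads ask for "all" versions; only retention differs.
        System.out.println(keepsOne.read(Integer.MAX_VALUE).size());  // prints 1
        System.out.println(keepsMany.read(Integer.MAX_VALUE).size()); // prints 5
    }
}
```

In this model the column that keeps one version returns a single value no matter how many versions the read requests, which matches the symptom described above. Whether that is the actual cause here depends on how the INTLCTRY_DATA family was created.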
 