Pig - When will it support UTF-16?

 
Greenhorn
Posts: 12
Hi Alan,

Congrats on the release of the book!!!

I am currently working on information retrieval and processing for Sanskrit documents and plan to use Hadoop in the near future.

I went through the documentation and found that chararray supports only UTF-8 strings.
When will Pig support UTF-16 characters?
Does it help in information retrieval and processing?

(As of now, I have very limited knowledge of Hadoop.)

Thanks,
Abhinandan
 
author
Posts: 7
Do you just need to read strings that are UTF-16 encoded? You should be able to write a loader (details are in the book and the online documentation) that assumes the input strings are UTF-16 encoded. Once they've been converted to a Java String, Pig will handle them just fine, even though internally it will convert them to UTF-8 when it stores them.
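
As a rough illustration of the conversion step such a loader would perform, here is a minimal sketch using only the standard Java library. It is not a Pig load function: the class name `Utf16Decode` and its helper methods are hypothetical, and a real loader would wrap this kind of decoding inside Pig's load function interface as described in the documentation. It only shows UTF-16 bytes being decoded into a Java String and then re-encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of Pig.
public class Utf16Decode {

    // Decode raw UTF-16 bytes into a Java String.
    // StandardCharsets.UTF_16 honors a byte-order mark if one is present
    // and assumes big-endian when there is none.
    static String decodeUtf16(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16);
    }

    // Re-encode a Java String as UTF-8 bytes, the encoding Pig is
    // described above as using when it stores strings.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "संस्कृतम्";

        // Simulate input that arrives as UTF-16 encoded bytes.
        byte[] utf16Bytes = original.getBytes(StandardCharsets.UTF_16);

        String decoded = decodeUtf16(utf16Bytes);
        byte[] utf8Bytes = toUtf8(decoded);

        // The text survives the UTF-16 -> String -> UTF-8 round trip.
        String roundTrip = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println(original.equals(roundTrip)); // prints "true"
    }
}
```

If the input files are known to be little-endian without a byte-order mark, `StandardCharsets.UTF_16LE` would be the charset to use instead.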

In your second question, "does it help in information retrieval and processing", is "it" Pig or UTF-16 support? I'm guessing Pig. Pig can be used to load data into Hadoop (again via custom load functions) but its main focus is processing data once the data is on Hadoop. Many people use other tools to get their data into Hadoop first, and then process it with Pig.

Alan.