How good is Mondrian at scaling up to more and more data, and where does it fit in the Big Data platform?

Anujit Chatterjee
Greenhorn

Joined: Apr 13, 2009
Posts: 25
Hi,

Which tools and techniques, such as caching, does Mondrian use to scale up?
Where does it fit into a Big Data platform, especially with respect to Hadoop, Hive, etc.?
How easy is it to plug Mondrian into other Big Data tools?

Regards,
Anujit


SCJP 5.0
Complexity is Easy, Simplicity is Hard !
Bill Back
Author
Greenhorn

Joined: Aug 09, 2013
Posts: 7
I'll answer in two parts. First, the scaling. Mondrian has two general approaches to scaling (chapter 7). The first is using aggregate tables. These are tables that pre-aggregate the data. For example, suppose you are storing facts about sales at the hourly level, but you usually just do analysis at the daily or weekly level. You can create an aggregate table that is used at those levels. This reduces the data being returned.
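To make the hourly-versus-daily example concrete, here is a minimal Java sketch of the pre-aggregation idea. It is not Mondrian's API: the class `DailyAggregateSketch`, the `HourlySale` row type, and the method `rollUpToDaily` are hypothetical names chosen for illustration, and in practice the aggregate table would be built in the database rather than in application code.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of pre-aggregation; not part of Mondrian.
class DailyAggregateSketch {

    // One row of an hourly fact table (field names are assumptions).
    static final class HourlySale {
        final LocalDateTime hour;
        final double amount;

        HourlySale(LocalDateTime hour, double amount) {
            this.hour = hour;
            this.amount = amount;
        }
    }

    // Collapse hourly rows into one total per calendar day,
    // which is the shape a daily aggregate table would have.
    static Map<LocalDate, Double> rollUpToDaily(List<HourlySale> facts) {
        Map<LocalDate, Double> daily = new TreeMap<>();
        for (HourlySale sale : facts) {
            daily.merge(sale.hour.toLocalDate(), sale.amount, Double::sum);
        }
        return daily;
    }

    public static void main(String[] args) {
        List<HourlySale> facts = new ArrayList<>();
        facts.add(new HourlySale(LocalDateTime.of(2013, 10, 1, 9, 0), 10.0));
        facts.add(new HourlySale(LocalDateTime.of(2013, 10, 1, 17, 0), 5.0));
        facts.add(new HourlySale(LocalDateTime.of(2013, 10, 2, 11, 0), 7.5));
        // Three hourly rows collapse into two daily rows.
        System.out.println(rollUpToDaily(facts));
    }
}
```

A daily or weekly query can then read the smaller rolled-up result instead of scanning every hourly row.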

The second technique is caching. Mondrian caches schema, members, and segments (the things that make up an aggregate). This means that once the data has been queried, it is stored in memory. Additionally, Mondrian supports external caches, such as Infinispan, that allow very large amounts of data to be stored in memory with persistence and failover.
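The "query once, then serve from memory" behaviour can be sketched in a few lines of Java. This is a generic load-on-miss cache, not Mondrian's segment cache: the class `SegmentCacheSketch`, the string keys, and the loader function are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration of load-on-miss caching; not Mondrian's implementation.
class SegmentCacheSketch {
    private final Map<String, Double> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();
    private final Function<String, Double> loader;

    // The loader stands in for the expensive database query (an assumption).
    SegmentCacheSketch(Function<String, Double> loader) {
        this.loader = loader;
    }

    // Return the cached value, running the loader only on the first request for a key.
    double get(String segmentKey) {
        return cache.computeIfAbsent(segmentKey, key -> {
            loads.incrementAndGet();
            return loader.apply(key);
        });
    }

    // Number of times the loader actually ran.
    int loadCount() {
        return loads.get();
    }
}
```

Requesting the same key twice runs the loader only once. An external cache such as Infinispan would take the place of the in-process map here.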
Nicholas Goodman
Greenhorn

Joined: Oct 03, 2013
Posts: 4
I'll tack on the response to Hadoop/Hive. We cover how Mondrian fits in with Big Data systems in Chapter 11. In that chapter we note that Mondrian has experimental Hive support. However, given the latency of even the most basic Hive queries (for example, generating the list of values for the "year" column), the overall performance will always be lackluster for direct access with an engine like Mondrian. The work of Impala, Drill, etc. will improve this (making simple queries fast, and longer queries longer) over time.
Anujit Chatterjee
Greenhorn

Joined: Apr 13, 2009
Posts: 25

Thanks, Bill. I am now interested to know more about how the level-based, on-demand structure works. I ask because I have faced situations in BI reporting where this structure was required but was not available.

And Nicholas, thanks for touching on the latency issue. I am not familiar with Impala, but I am eager to see how Mondrian plugs in with Drill.

Thanks a lot.

Regards,
Anujit
 