Pentaho compared to other (open source) ETL products

Orjan Petersson

Joined: Oct 13, 2003
Posts: 12
As the title says: What are the strengths of Pentaho Data Integration compared to other ETL platforms (Open Source or not), for example Apatar or Talend?
I have not yet started to look into this but probably will in a couple of months so any pointers will be appreciated.
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 17410

I've been working with Pentaho for around 2 years now. The actual ETL engine is both performant and capable. I've struggled occasionally with the UI designer (Spoon), but this is, after all, open source. So I made some enhancements to the Excel Input component and got them committed. They've been part of the system for the last year or so.

An IDE is no substitute for an Intelligent Developer.
darren hartford

Joined: May 17, 2010
Posts: 23
I've used Kettle, then PDI :-), for many years as well.

The strengths compared to the ETL solutions most companies already have in place (DTS/SSIS is the most common I've seen, if you have MS SQL installed anywhere) include:

*Strong declarative approach to ETL design.

*Database agnostic approach - don't have to worry about a particular ETL solution working 'great' for one database, and poorly for others.

*JDBC driver access - This is gonna sound odd to the non-JDBC users, but with ODBC and ADO.NET providers I keep running into sporadic/unusual issues in driver configuration/server configuration/something else unknown on many different database setups. JDBC has been consistent and reliable, which for ETL is very important. Yes, someone will inevitably say you lose some performance. Well, losing 1%-5% performance for rock-solid reliability is an easy sell for me.

*Built-in warehousing support (dimensions), included, free, in the open source version.

*Customize/create your own transformation step using Java (with SSIS you can do this in .NET as well). I created an X12-style EDI parser in about two weeks to solve a particular business need (it was very specific and, unfortunately, not contributed).
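
To make the database-agnostic JDBC point above concrete, here is a minimal sketch of a table copy that uses nothing but the standard java.sql API. It is not PDI code and not how PDI does it internally; the class name JdbcCopySketch, the helper names, and the command-line arguments are all made up for illustration, and it assumes the source and target tables have the same column layout and that suitable JDBC drivers are on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Collections;

// Hypothetical illustration only: plain JDBC, no vendor-specific code.
class JdbcCopySketch {

    // Builds "INSERT INTO <table> VALUES (?, ?, ...)" with one placeholder per column.
    static String insertSql(String table, int columnCount) {
        String placeholders = String.join(", ", Collections.nCopies(columnCount, "?"));
        return "INSERT INTO " + table + " VALUES (" + placeholders + ")";
    }

    // Copies every row of sourceTable into targetTable and returns the row count.
    // Assumes both tables have the same columns in the same order.
    static int copyTable(Connection source, Connection target,
                         String sourceTable, String targetTable) throws SQLException {
        try (Statement read = source.createStatement();
             ResultSet rows = read.executeQuery("SELECT * FROM " + sourceTable)) {
            int columns = rows.getMetaData().getColumnCount();
            int copied = 0;
            try (PreparedStatement write =
                     target.prepareStatement(insertSql(targetTable, columns))) {
                while (rows.next()) {
                    for (int i = 1; i <= columns; i++) {
                        write.setObject(i, rows.getObject(i));
                    }
                    write.addBatch();
                    copied++;
                }
                write.executeBatch();
            }
            return copied;
        }
    }

    // Arguments (all assumptions): source JDBC URL, target JDBC URL, source table, target table.
    public static void main(String[] args) throws SQLException {
        if (args.length < 4) {
            System.out.println("usage: JdbcCopySketch <sourceUrl> <targetUrl> <sourceTable> <targetTable>");
            return;
        }
        try (Connection source = DriverManager.getConnection(args[0]);
             Connection target = DriverManager.getConnection(args[1])) {
            System.out.println(copyTable(source, target, args[2], args[3]) + " rows copied");
        }
    }
}
```

Only the two JDBC URLs change when the databases change, which is the kind of consistency described above. A real ETL tool adds type mapping, error handling and commit sizing on top of this.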

Compared to other open source ETL solutions (Talend, Clover, and several others I've reviewed in the past):

*LGPL license. You can use it in your business without worry.

*Commercial support. You can use it in your business without worry (Talend has this as well).

*Full-featured (Talend has this as well).

*It's got Matt Casters! :-)

Maria Carina Roldan

Joined: Oct 14, 2009
Posts: 19
I'm not very experienced in the use of other ETL tools. What I can assure you is that PDI (aka Kettle) will meet all your expectations. I've been using it for more than 3 years in different kinds of projects (both DW and non-DW related) and the tool was always capable of doing what I needed to do.

Maria Carina Roldan
Author of Pentaho 3.2 Data Integration: Beginner's Guide