Hi
I'm looking at some professionally written code. Pipe is a class, and it looks to me like importPipe is re-instantiated three times (marked 1, 2 and 3 below).
Don't the first two assignments essentially have no effect?
Thanks
Brian
public ImportCrawlDataAssembly( String name )
{
// split the text line into "url" and "raw" with the default delimiter of tab
RegexSplitter regexSplitter = new RegexSplitter( new Fields( "url", "raw" ) );
1--> Pipe importPipe = new Each( name, new Fields( "line" ), regexSplitter );
// remove all pdf documents from the stream
2--> importPipe = new Each( importPipe, new Fields( "url" ), new RegexFilter( ".*\\.pdf$", true ) );
// replace ":nl:" with a new line, return the fields "url" and "page" to the stream.
// discard the other fields in the stream
RegexReplace regexReplace = new RegexReplace( new Fields( "page" ), ":nl:", "\n" );
3--> importPipe = new Each( importPipe, new Fields( "raw" ), regexReplace, new Fields( "url", "page" ) );
setTails( importPipe );
}