how to test parsers of text files and insertion into db

Dominique Von Hofcheirken
Greenhorn

Joined: Oct 02, 2009
Posts: 6
Hi there,

I'm working on a typical Big Ball of Mud software component, part of an extensive software system that has no tests and no clear method for validating releases.

This component has a Parser*.java class for each kind of text file/TCP message to parse.

Parsing each line of a text file, or each received TCP message, produces a domain object, which is inserted into the database via the raw JDBC API.

I'm adding new features, fixing bugs and doing a huge refactoring. Since there aren't any tests (unit, nrt, acceptance etc.), I'm looking for a smart way to check that I haven't broken anything and that the messages are still being inserted into the db in the expected way.

Could you suggest some tips, tricks, tools, frameworks etc. for this task?
I was looking at dbUnit to check that the data were loaded correctly, but this solves only one part of my problem.

I'd like to ask whether you have had any similar experience with an application that parses text files and inserts the data into a db. I think this is one of the most common types of software application, so I hope you have run into a similar problem and solved it!


Many thanks in advance,

--
dn
Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 5264
    

I just want to say right now: I feel for you, brother. Good luck with that.

That said, the best piece of advice I can give you right now without thinking through the problem with you (and risking a spike in my stress level in the process) is to see if Michael Feathers' book "Working Effectively with Legacy Code" has anything that you can leverage. This book is full of gems and there might just be a few techniques/approaches in there that you can use.

One other thing: you said "huge refactoring" -- big red flag right there, buddy. You have to start small and you have to curb your desire to make it better all at once. You have to be able to accept that it might get a little worse before it gets better. To fix up a house, you have to tear down some walls and make a mess. The trick is to find a way to not make such a big mess while refactoring that you have to make everybody move out while you're redoing things.


Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 5264
    

And Welcome to the Ranch!
Dominique Von Hofcheirken
Greenhorn

Joined: Oct 02, 2009
Posts: 6
Junilu Lacar wrote:I just want to say right now: I feel for you, brother. Good luck with that.

eheh Junilu...I really appreciate that! Thanks!


Junilu Lacar wrote:That said, the best piece of advice I can give you right now without thinking through the problem with you (and risking a spike in my stress level in the process) is to see if Michael Feathers' book "Working Effectively with Legacy Code" has anything that you can leverage.

I know the book. I took a look at some draft versions on the Internet and it has been on my Amazon wishlist for a few months...I will purchase it for sure since, as the first chapter says, a consultant like me will spend 90% of his time working on legacy code...in my case it was 99%, so I really should be specialized in refactoring by now...

Having read some pages, I'm afraid it's too optimistic about the quality of legacy code...for example, I've read about the Inflection Points technique in the draft version you can find online...but in the software I'm working on, it's not that easy to find those inflection points...Have you ever met a piece of enterprise software where the main executable class has got one line: ??? Behind that instantiation, an entire world is hidden....


Junilu Lacar wrote:One other thing: you said "huge refactoring" -- big red flag right there, buddy. You have to start small and you have to curb your desire to make it better all at once. You have to be able to accept that it might get a little worse before it gets better. To fix up a house, you have to tear down some walls and make a mess. The trick is to find a way to not make such a big mess while refactoring that you have to make everybody move out while you're redoing things.


Thanks for this suggestion, Junilu. Yes, I admit to being tempted to make it better in big steps...since I feel the usual world-dominating confidence that a software engineer gets when he wants to improve the design of something...I will be careful about that!

I will buy Feathers' book right now. I hope it will be useful.

Cheers!
Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 5264
    

Dominique Von Hofcheirken wrote:Have you ever met a piece of enterprise software where the main executable class has got one line: ??? Behind that instantiation, an entire world is hidden....


(Taking out pipe) Heh! That's nothing compared to what I've seen. There's this one time, I was looking for a bug in PL/SQL stored procedure code when I ran into a mess of hard-coded HTML tags... the darned hired help--who had by that time moved on, God help whoever hired them--had the brilliant idea that they would just code up the presentation layer concerns in the bowels of the database layer. A JSP in Stored Procedure code!!! What kind of diabolical mind(s) can think up such evil things?!
Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 5264
    

Dominique Von Hofcheirken wrote:
This component has a Parser*.java class for each kind of text file/TCP message to parse.

Parsing each line of a text file, or each received TCP message, produces a domain object, which is inserted into the database via the raw JDBC API.


These Parser* classes, is there any possibility of isolating them and writing unit tests around them? Are they instantiating--or perhaps worse, doing the work of--the classes that access the database? Ideally, you would have an interface that will abstract away the Repository and you would use Dependency Injection to provide the Parser class with an implementation of the Repository Service. This makes it straightforward to use a mocking framework like Mockito to mock out the Repository. With that you can verify whether or not the Parser is creating and trying to save the right domain objects when it reads its input.

I would start by trying to refactor towards that level of testing and testability. And start with the simplest Parser class that you have.
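To make that a bit more concrete, here is a minimal sketch of the shape I mean. Every name in it (Reading, ReadingRepository, ReadingLineParser, the "sensorId;value" line format) is made up for illustration and is not from your code base. To keep it self-contained I used a hand-written recording fake instead of Mockito; with Mockito you would mock the same interface and verify the save call instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain object produced from one parsed line.
final class Reading {
    final String sensorId;
    final double value;

    Reading(String sensorId, double value) {
        this.sensorId = sensorId;
        this.value = value;
    }
}

// Hypothetical abstraction that hides the raw JDBC insertion code.
interface ReadingRepository {
    void save(Reading reading);
}

// Hypothetical parser: the repository is injected, never created in here.
final class ReadingLineParser {
    private final ReadingRepository repository;

    ReadingLineParser(ReadingRepository repository) {
        this.repository = repository;
    }

    // Parses one "sensorId;value" line (an assumed format) and saves the result.
    void parseLine(String line) {
        String[] fields = line.trim().split(";");
        if (fields.length != 2) {
            throw new IllegalArgumentException("Expected 2 fields: " + line);
        }
        double value = Double.parseDouble(fields[1].trim());
        repository.save(new Reading(fields[0].trim(), value));
    }
}

// Hand-written test double: records what the parser tried to save.
final class RecordingRepository implements ReadingRepository {
    final List<Reading> saved = new ArrayList<>();

    @Override
    public void save(Reading reading) {
        saved.add(reading);
    }
}

final class ReadingLineParserCheck {
    public static void main(String[] args) {
        RecordingRepository repository = new RecordingRepository();
        new ReadingLineParser(repository).parseLine("S1;21.5");

        // No database involved: we only inspect what reached the repository.
        if (repository.saved.size() != 1) {
            throw new AssertionError("expected exactly one saved object");
        }
        Reading reading = repository.saved.get(0);
        if (!reading.sensorId.equals("S1") || reading.value != 21.5) {
            throw new AssertionError("unexpected domain object");
        }
    }
}
```

The point of the design is that the parser test never touches the database. Whether the rows end up correct in the db is then a separate, smaller check around the real repository implementation.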

Good luck.
Dominique Von Hofcheirken
Greenhorn

Joined: Oct 02, 2009
Posts: 6
Junilu Lacar wrote:had the brilliant idea that they would just code up the presentation layer concerns in the bowels of the database layer. A JSP in Stored Procedure code!!! What kind of diabolical mind(s) can think up such evil things?!


I would have paid to see such a "thing"...
Dominique Von Hofcheirken
Greenhorn

Joined: Oct 02, 2009
Posts: 6
Junilu Lacar wrote:These Parser* classes, is there any possibility of isolating them and writing unit tests around them? Are they instantiating--or perhaps worse, doing the work of--the classes that access the database?


The current design is based on event listeners...each parser fires an event to the listener after completing its job. The listener is an object in charge of calling a kind of DAO class to insert the object. Each parser is associated with a configured channel, which is responsible for retrieving the raw data (channel types are ftp, sftp, tcp, filesystem etc.).

Junilu Lacar wrote:Ideally, you would have an interface that will abstract away the Repository and you would use Dependency Injection to provide the Parser class with an implementation of the Repository Service. This makes it straightforward to use a mocking framework like Mockito to mock out the Repository. With that you can verify whether or not the Parser is creating and trying to save the right domain objects when it reads its input.
I would start by trying to refactor towards that level of testing and testability. And start with the simplest Parser class that you have.


Good advice. Thanks!
I will see how to implement DI there...I think it's possible (injection of a mock event listener???) but at this time I still have difficulty understanding some parts of the code...it's hardly readable and I suspect that a large amount of it is simply unused. We are lucky that nowadays we have modern IDEs to support us in refactoring (I mean the "find usages" functions, simple automated refactorings and PMD plugins)...
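About the mock event listener idea, this is roughly what I have in mind. It is only a sketch with invented names (ParsedMessageListener, KeyValueMessageParser, and the "key=value|key=value" message format are assumptions, not the real classes), and it uses a hand-written recording listener rather than a mocking framework:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical listener interface; the real one in the component differs.
interface ParsedMessageListener {
    void onParsed(Map<String, String> fields);
}

// Hypothetical parser that fires an event when it has finished its job.
final class KeyValueMessageParser {
    private final ParsedMessageListener listener;

    // The listener is injected, so a test can pass in a recording one.
    KeyValueMessageParser(ParsedMessageListener listener) {
        this.listener = listener;
    }

    // Parses an assumed "key=value|key=value" message and notifies the listener.
    void parse(String message) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : message.split("\\|")) {
            String[] keyValue = pair.split("=", 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("Malformed pair: " + pair);
            }
            fields.put(keyValue[0], keyValue[1]);
        }
        listener.onParsed(fields);
    }
}

// Test double standing in for the listener that would call the DAO.
final class RecordingListener implements ParsedMessageListener {
    final List<Map<String, String>> events = new ArrayList<>();

    @Override
    public void onParsed(Map<String, String> fields) {
        events.add(fields);
    }
}

final class KeyValueMessageParserCheck {
    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        new KeyValueMessageParser(listener).parse("id=42|status=OK");

        // The DAO and the channel are not involved at all in this check.
        if (listener.events.size() != 1) {
            throw new AssertionError("expected exactly one event");
        }
        if (!"42".equals(listener.events.get(0).get("id"))) {
            throw new AssertionError("unexpected id field");
        }
    }
}
```

If the real parsers can take their listener from outside like this, the channel and the DAO can stay untouched while the parsing logic gets its first tests.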

I will update this post as soon as I find a good solution.

Thanks for your (even moral) support!
 