
Merging two Word documents using POI

Saifuddin Merchant
Ranch Hand

Joined: Feb 08, 2009
Posts: 606

I need to append one Word document to the end of another Word document. I am trying to use POI's HWPF to read and write the Word documents.

Here is the code I tried out (this code does not merge files; it just writes some simple text to the Word document):


After I execute this code (no exceptions or warnings), the Word document D:\results.doc is created, but it is blank, with nothing written in it.

I have also tried the code at http://faq.javaranch.com/java/CreateWordDocument with the same result: a blank Word document is created. (I commented out the custom document property section.)

I'm using POI 3.2.

Any pointers on what I'm doing wrong?
Any pointers to an approach I could use to append one Word document to another in Java?

Cheers - Sam.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42264
    
That's odd - the code works fine for me as it is (as does the one in the wiki, but that's no surprise since I wrote it :-)

The high-end solution would be to use OpenOffice in server mode (in which it can be accessed using a Java API), and have it insert one document at the end of the other. That API has a non-trivial learning curve, though.


Saifuddin Merchant
Ranch Hand

Joined: Feb 08, 2009
Posts: 606

Ulf Dittmer wrote:That's odd - the code works fine for me as it is (as does the one in the wiki, but that's no surprise since I wrote it :-)


It worked for me too. It does not work if you create a new Word file, i.e. the temp file (right-click, New Word Document), and run the program. However, if you open the newly created Word file and just save it, it works.


Ulf Dittmer wrote:
The high-end solution would be to use OpenOffice in server mode (in which it can be accessed using a Java API), and have it insert one document at the end of the other. That API has a non-trivial learning curve, though.


I don't think I'll be able to use OpenOffice for this particular activity, so that's out for now. (Plus I have absolutely no idea how it works and could not find anything on the net either, though I have not done an extensive search!)
Saifuddin Merchant
Ranch Hand

Joined: Feb 08, 2009
Posts: 606

I am able to copy plain text from one Word document to another using the code below. I am able to merge the contents of Doc1 with Doc2 and get the result in Result.doc. However, I lose all formatting information when I do so. Is there any way I can keep the formatting when I copy from one Word document to another?

I am able to copy one Word document (with formatting) using the code below, but I cannot merge two documents. Only the first document shows up in the result. In this case the size of the result document is almost the sum of the two merged documents, and if I open the result in Textpad I can make out some characters from the second document.

I also noticed a couple of classes, org.apache.poi.hwpf.model.io.HWPFFileSystem and org.apache.poi.hwpf.model.io.HWPFOutputStream, but I have absolutely no idea how to use them or what they do. There is just no documentation on how they work.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42264
    
The second approach is not going to work. DOC files are structured in a way so that they can't be appended like that.

As to the first approach, I'm sceptical whether it can be made to work, but it may be worth a try. "Range" objects are not the only information that needs to be transferred. If you look at org.apache.poi.hwpf.HWPFDocument you'll notice numerous other object types - images, styles, fonts, document properties etc., many of which may need to be copied as well.
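
On the point that appending raw bytes cannot work: a DOC file is an OLE2 compound document, and the header at the start of the file points to the allocation tables and directory that locate every stream. Bytes appended after the first document are not referenced by that header, which would be consistent with the result being roughly the size of both files while only the first one is displayed. Below is a minimal sketch using only the JDK and made-up byte arrays; `fakeDoc` is a hypothetical stand-in for illustration, not real Word content, and none of this uses POI.

```java
import java.util.Arrays;

class Ole2ConcatDemo {
    // Signature at the start of every OLE2 compound file (the container used by .doc).
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, (byte) 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, (byte) 0x1A, (byte) 0xE1
    };

    // The OLE2 header occupies the first 512 bytes of the file.
    static final int HEADER_SIZE = 512;

    // True if the data begins with the OLE2 signature.
    static boolean startsWithOle2Magic(byte[] data) {
        return data.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(data, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Returns the header region (or the whole array if it is shorter).
    static byte[] header(byte[] data) {
        return Arrays.copyOf(data, Math.min(HEADER_SIZE, data.length));
    }

    // Naive "merge": the bytes of the second file placed after the first.
    static byte[] concat(byte[] first, byte[] second) {
        byte[] out = Arrays.copyOf(first, first.length + second.length);
        System.arraycopy(second, 0, out, first.length, second.length);
        return out;
    }

    // Hypothetical stand-in for a real .doc: the signature followed by filler bytes.
    static byte[] fakeDoc(int size, byte filler) {
        byte[] doc = new byte[size];
        Arrays.fill(doc, filler);
        System.arraycopy(OLE2_MAGIC, 0, doc, 0, OLE2_MAGIC.length);
        return doc;
    }

    public static void main(String[] args) {
        byte[] doc1 = fakeDoc(1024, (byte) 1);
        byte[] doc2 = fakeDoc(1536, (byte) 2);
        byte[] merged = concat(doc1, doc2);

        // The merged file is as large as both inputs together...
        System.out.println("merged size = " + merged.length);
        // ...but its header is still byte-for-byte the header of the first file.
        System.out.println("header unchanged = "
            + Arrays.equals(header(merged), header(doc1)));
    }
}
```

A reader of the merged file starts from that unchanged header, so it has no reason to look at the second document's bytes at all.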
Saifuddin Merchant
Ranch Hand

Joined: Feb 08, 2009
Posts: 606

Ulf Dittmer wrote:As to the first approach, I'm sceptical whether it can be made to work, ...


I did have a look at what you suggested. Yes, I almost forgot that a Word document contains more objects that need to be transferred. I'll give it a try, though I'm skeptical that it will work.

Will update once I try it out.

Thanks, Ulf, for all your help so far.
kiran venkat m
Greenhorn

Joined: May 28, 2009
Posts: 4
Hey Sam,

Did you try copying fonts and formats? Did it work? Can you please post the updated code if you have made any changes?

Thanks
Kiran Venkat
Saifuddin Merchant
Ranch Hand

Joined: Feb 08, 2009
Posts: 606

Well, I won't say I worked on it 100%, but I did figure out a few things.

First of all, as Ulf suspected, the first approach I was thinking of (copying all the other objects in addition to the Range object) does not work.
The simple reason is that while POI has APIs to retrieve this data, it has no corresponding write APIs. Nor can you identify the type of content, so if you end up reading all paragraphs, the result also contains images, tables etc., which show up as garbage ASCII characters.

Basically my conclusion is that you cannot 'really append' one Word document to another using POI.
The best you could do is extract the text and text formatting and append it to another document. This will work only for really simple Word documents.

On the other hand, I am still working on this, and if I manage any breakthrough I'll post it here.
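
If text-only extraction is good enough, the garbage characters could be filtered out before the text is appended. The sketch below assumes (this is a guess, not something confirmed in this thread) that the garbage consists of the control characters Word uses as in-text marks for things like table cells and picture placeholders. It uses only the JDK; `WordTextCleaner` and `stripControlChars` are made-up names, and the sample string is invented.

```java
class WordTextCleaner {
    // Keeps tab, line feed and carriage return; drops every other character below U+0020.
    static String stripControlChars(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0x20 || c == '\t' || c == '\n' || c == '\r') {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented sample: BEL (U+0007) stands in for a table cell mark,
        // SOH (U+0001) for a picture placeholder.
        String raw = "Name\u0007Value\u0007\u0001 Figure caption\r";
        System.out.println(stripControlChars(raw));
    }
}
```

Note that this throws the table structure and the images away entirely; it only keeps the readable text, so it fits the "really simple documents" case and nothing more.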
kiran venkat m
Greenhorn

Joined: May 28, 2009
Posts: 4
Hi Sam,

Thanks for the quick response.
My basic requirement is to "create a Word document from a predefined template file".

That is, we have an old doc and we need to convert it to a new one with a specific format, keeping the text of the old one.

Thanks
Kiran Venkat

 