Concatenate 2 RTF files into one

Paulo Carvalho
Ranch Hand

Joined: Nov 12, 2008
Posts: 56
Hello, I wrote a Java program that concatenates 2 RTF files into a single RTF file.
It reads the first line of the first RTF file and writes it to the generated file.
Then it takes the content of both files and writes it to the generated file.
Finally, it takes the last line of the 2nd file and writes it to the generated file.

The RTF file is generated and it opens, but the page break that I inserted to separate the two documents is ignored.

Here is my code:


Each of my 2 RTF files contains a table and some text, and each is only one page long.

What am I doing wrong?

Is there a better way to do this?

Thanks

regards
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24166

This program goes to rather elaborate lengths -- perhaps inadvertently -- to remove all the newlines from the original files, thereby breaking most of the formatting and probably mangling the text as well. Instead of reading using a RandomAccessFile, read using BufferedReader, which is designed to work with text. Instead of using FileOutputStream, use a PrintWriter, which again is designed to work with text, not binary data as FileOutputStream is. Instead of using write(string.getBytes()), use println(string), which will put the newline back at the end of the line.
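
A minimal sketch of the BufferedReader/PrintWriter approach described above. The class name `LineCopy`, the method `copyLines` and the choice of ISO-8859-1 are assumptions made for illustration, not code from this thread; ISO-8859-1 is used only because it maps every byte to exactly one char, so nothing is lost on the round trip.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class LineCopy {
    // Assumption: a single-byte charset, so each byte round-trips unchanged.
    static final Charset CHARSET = StandardCharsets.ISO_8859_1;

    /** Copies source to target line by line, writing a newline after each line. */
    static void copyLines(Path source, Path target) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(source, CHARSET);
             PrintWriter out = new PrintWriter(Files.newBufferedWriter(target, CHARSET))) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line); // println puts the line separator back
            }
            // PrintWriter swallows IOExceptions, so check its error flag explicitly.
            if (out.checkError()) {
                throw new IOException("failed writing " + target);
            }
        }
    }
}
```

Note that println writes the platform line separator, so line endings may change from the original file even though no line breaks are lost.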

Alternatively, you could avoid reading and writing lines altogether, and simply read the bytes of one file into the new file, and then the other -- i.e.,



That will of course preserve all the newlines as well.
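
One way the byte-for-byte copy could look, as a sketch only; the original snippet is not reproduced here. The class name `ByteConcat` and the method `concatenate` are hypothetical, and the sketch assumes Java 7 or later for java.nio.file.Files.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class ByteConcat {
    /** Writes the bytes of first, then the bytes of second, into target. */
    static void concatenate(Path first, Path second, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            Files.copy(first, out);  // all bytes of the first file, unchanged
            Files.copy(second, out); // then all bytes of the second file
        }
    }
}
```

This only guarantees that the bytes are preserved. Whether an RTF reader treats the result as one document is a separate question.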


Paulo Carvalho
Ranch Hand

Joined: Nov 12, 2008
Posts: 56
Thanks for your answer

You are right about reading bytes instead of lines, but I must read the first line of the first file to get and keep the RTF header. If I don't do that, then during the process the 2nd file will replace the 1st one instead of being appended (if I am not wrong...).
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24166

That makes sense -- I actually don't know anything about RTF files per se. So I would just go with reading and writing using BufferedReader and PrintWriter, as described above.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39547
Note that RTF is a structured document format that contains headers, so the resulting file will have multiple headers. While the actual content (text, images, tables etc.) may be OK in the combined document, the headers (which contain style information, amongst other data) could clash. Just something to be aware of.


Paulo Carvalho
Ranch Hand

Joined: Nov 12, 2008
Posts: 56
I tried your recommendation of using the PrintWriter and BufferedReader classes, but the result is the same.

Both RTFs are merged, but I don't know why the page break between them is ignored.

Here is the code now:



Any help is welcome!

Thanks
regards
Paulo Carvalho
Ranch Hand

Joined: Nov 12, 2008
Posts: 56
And the solution must also take into account that one RTF may have been created with Notepad and the other with Word (so the headers of the two RTF documents will be different).
So I don't even know if it is possible to do such a thing...
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24166

I was curious too, so I changed the filenames to files on my system and ran your code. The resulting file worked for me in Microsoft Word -- the page break showed up. On the other hand, in TextEdit on Mac OS X, there was no page break.

In any case, I had just assumed that you knew enough about RTF to know that this was possible, so I was just helping with the Java bits. But if this is not something that generally works with RTF files, then you may need to go another way, perhaps by using a library that reads and understands RTF and then lets you modify it. iText (http://www.itextpdf.com/) used to have some RTF capability, but it's apparently been removed. You may need to do some googling.
James Sabre
Ranch Hand

Joined: Sep 07, 2004
Posts: 781

Paulo Carvalho wrote: And the solution must also take into account that one RTF may have been created with Notepad and the other with Word (so the headers of the two RTF documents will be different).
So I don't even know if it is possible to do such a thing...


How an RTF file was created matters only as far as the character encoding used. The RTF specification seems to allow only one character set attribute per document (\ansi, \mac, \pc or \pca), so it is going to be difficult to concatenate two documents without converting them to the same character encoding.
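
To check whether two files even declare the same character set before trying to merge them, something like the following sketch could help. The class `RtfCharset` and the method `declaredCharset` are hypothetical names, and the pattern assumes the character set keyword comes directly after the \rtf control word at the start of the header.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RtfCharset {
    // "{\rtf" plus optional version digits, then one character set keyword.
    // The lookahead rejects longer keywords such as \ansicpg.
    private static final Pattern HEADER =
            Pattern.compile("\\{\\\\rtf\\d*\\s*\\\\(ansi|mac|pca|pc)(?![a-zA-Z])");

    /** Returns the declared character set keyword, or empty if none is found. */
    static Optional<String> declaredCharset(String rtf) {
        Matcher m = HEADER.matcher(rtf);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

If declaredCharset returns different values for the two inputs, plain concatenation is unlikely to give correct text for both halves.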

Concatenation of the documents is not as simple as just concatenating the files. One needs to create a new document with valid prefix and suffix sections, convert the body of each document to a common format then insert each into the new document.
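
A deliberately naive sketch of just the structural part of that idea: strip the outer braces from each document and wrap both bodies in one new group, separated by the \page control word. `NaiveRtfMerge`, `stripOuterGroup` and `merge` are hypothetical names. The sketch does not convert anything to a common format, so the second document's header (font table, character set and so on) is left in the middle of the output, and readers may handle that differently.

```java
final class NaiveRtfMerge {
    /** Returns the text between the outermost '{' and '}' of an RTF document. */
    static String stripOuterGroup(String rtf) {
        String t = rtf.trim(); // also drops trailing newlines after the final brace
        if (t.length() < 2 || t.charAt(0) != '{' || t.charAt(t.length() - 1) != '}') {
            throw new IllegalArgumentException("input is not a single RTF group");
        }
        return t.substring(1, t.length() - 1);
    }

    /** Wraps both bodies in one group with a page break between them. */
    static String merge(String first, String second) {
        return "{" + stripOuterGroup(first) + "\n\\page\n" + stripOuterGroup(second) + "}";
    }
}
```

A real solution would need the conversion step as well, probably through a library that actually parses RTF.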

A few years ago I worked with RTF documents in Java. I had the simple task of replacing macro-type elements, e.g. ${abcd}, with values computed by the program. This simple task turned out to be much more difficult than I expected, since it was possible for the '${a' to be in one format and the 'bcd}' to be in another, with the two bits separated by the code that changes the format. Not fun.
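
To illustrate the problem, here is a small sketch of the naive string replacement and how it misses a macro that is split by a formatting change. The class `MacroSplitDemo`, the method `replaceMacro` and the sample strings are made up for illustration. Literal braces are escaped as \{ and \} in RTF text, which the sketch takes into account.

```java
final class MacroSplitDemo {
    /** Replaces the RTF-escaped form of ${name} with value, using plain text search. */
    static String replaceMacro(String rtfText, String name, String value) {
        // In RTF, a literal "${name}" is stored as "$\{name\}".
        String target = "$\\{" + name + "\\}";
        return rtfText.replace(target, value);
    }

    // Hypothetical inputs: the macro in one run, and the same macro with
    // bold switched on (\b) after the first letter and off (\b0) at the end.
    static final String CONTIGUOUS = "Dear $\\{abcd\\},";
    static final String SPLIT = "Dear $\\{a\\b bcd\\b0 \\},";
}
```

With these inputs, replaceMacro(CONTIGUOUS, "abcd", "Paulo") finds the macro, while replaceMacro(SPLIT, "abcd", "Paulo") returns the text unchanged, because the control words sit inside the macro.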

