save url as mht or pdf

Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 257
Hello. If anybody has a minute, I've been analyzing several alternatives for saving a Lotus Notes record as PDF, MHT, or some other self-contained storage format. I make a little headway with each approach and then a major barrier appears. All of them seem to come down to having to reconstruct the source Lotus Notes record in the target PDF (or MHT) from scratch.

1) iText: I really like it for field-to-text translation, but rich text is an issue. Placing the images in the target PDF where they belong relative to the rest of the rich text could be a showstopper.

2) JavaMail API (MIME): similar to the above. I can't get a handle on the embedded images. When I create an input stream directly from the Lotus Notes rich text field, all I can stream out is the formatted text.

3) XSL-FO with FOP: rich text is problematic here too, because I don't see how to write an XSL stylesheet for rich text that can contain anything (embedded JPEGs, bitmaps, etc.) within the text and be formatted in countless different ways.

I also looked at some other alternatives that I forget now, with similar frustrations. What I'd really like to do is access the Notes record via HttpURLConnection, stream the URL, and launch some kind of shell that mimics the IE "Save as MHT" option. I know it can be done in ASP with C#. Thank you very much for reading this.
[ March 19, 2007: Message edited by: Tom Griffith ]
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8867
    

OpenOffice does HTML to PDF and has a Java API.
There seem to be several Java MHT libraries available.
[ March 19, 2007: Message edited by: Joe Ess ]

Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 257
Hi Joe. Thank you very much for the information. Yeah, I had noticed that all of the Java MHTML libraries were third party and required licensing, so I spent several frustrating days trying to make the tools and APIs I listed earlier work with rich text, but it all comes back to painful, if not impossible, rich text translations. I'll check out OpenOffice. I still think (or hope) there must be some way to leverage the IE DLL that handles "Save as MHTML". Even Netscape has a "Save as Web page (complete)" option that does much the same thing: it saves all the content and images in the right context as a local .htm (not .mht) file.
[ March 19, 2007: Message edited by: Tom Griffith ]
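On the MHT side, the format itself is a MIME multipart/related message: the HTML is the root part, and each image is a further part identified by its Content-Location. That means it can be written without a third-party library. Below is a minimal sketch using only the JDK (Java 8 or later), assuming the page HTML and the image bytes have already been fetched. The class name MhtSketch, the method buildMht, and the boundary string are made up for illustration, and the output has not been checked against IE.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

class MhtSketch {
    // Hypothetical boundary; any token that cannot occur inside the encoded parts will do.
    static final String BOUNDARY = "----=_MhtSketch_Boundary";

    // Builds the text of an MHT file from the page HTML and a map of
    // absolute image URL -> image bytes. Using absolute URLs for both the
    // img src values and Content-Location keeps the matching simple.
    static String buildMht(String pageUrl, String html, Map<String, byte[]> images) {
        StringBuilder sb = new StringBuilder();
        sb.append("MIME-Version: 1.0\r\n");
        sb.append("Content-Type: multipart/related; type=\"text/html\"; boundary=\"")
          .append(BOUNDARY).append("\"\r\n\r\n");
        // Root part: the HTML page itself.
        appendPart(sb, "text/html; charset=\"utf-8\"", pageUrl,
                html.getBytes(StandardCharsets.UTF_8));
        // One further part per embedded image.
        for (Map.Entry<String, byte[]> e : images.entrySet()) {
            appendPart(sb, guessType(e.getKey()), e.getKey(), e.getValue());
        }
        sb.append("--").append(BOUNDARY).append("--\r\n");
        return sb.toString();
    }

    private static void appendPart(StringBuilder sb, String type, String location, byte[] body) {
        sb.append("--").append(BOUNDARY).append("\r\n");
        sb.append("Content-Type: ").append(type).append("\r\n");
        sb.append("Content-Transfer-Encoding: base64\r\n");
        sb.append("Content-Location: ").append(location).append("\r\n\r\n");
        // The MIME encoder wraps its output at 76 characters per line.
        sb.append(Base64.getMimeEncoder().encodeToString(body)).append("\r\n");
    }

    // Very rough content-type guess from the file extension.
    private static String guessType(String url) {
        String u = url.toLowerCase();
        if (u.endsWith(".gif")) return "image/gif";
        if (u.endsWith(".png")) return "image/png";
        if (u.endsWith(".jpg") || u.endsWith(".jpeg")) return "image/jpeg";
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        Map<String, byte[]> images = new LinkedHashMap<>();
        images.put("http://example.com/a.gif", new byte[] {1, 2, 3}); // placeholder bytes
        String html = "<html><body><img src=\"http://example.com/a.gif\"></body></html>";
        System.out.print(buildMht("http://example.com/page.html", html, images));
    }
}
```

Writing the returned string to a file with an .mht extension is the remaining step. The hard part Tom describes, getting the image bytes in the first place, is outside this sketch.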
Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 257
Hmm, as best I can tell so far (the documentation I can find via Google is scant), OpenOffice is a utility for converting office documents to PDF, sort of like a POI-iText combo. I don't see anything that lets me feed in a URL and use the API to come out with a PDF. If anybody has done this, I'd really appreciate a confirmation. Thank you again for reading this. I'm kind of surprised that converting rich content such as HTML pages and mail files to a comprehensive storage format like PDF or MHTML hasn't been a more prevalent issue. It looks really hard to do in Java at this point.
[ March 19, 2007: Message edited by: Tom Griffith ]
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8867
    

OpenOffice itself is a fully-featured office suite like Microsoft Office. It can be controlled programmatically via an API. One of the features of OpenOffice is that it can export to PDF. Another feature is that it can open a document from a URL. Put those two together and you have the functionality you want.
Download the API I pointed to earlier. It has numerous examples of how to perform conversions.
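As a rough illustration of the "open it, then export to PDF" idea without touching the API, the conversion can also be driven from Java by launching the office suite as an external process. This sketch is built on assumptions: it assumes a LibreOffice-style soffice launcher on the PATH that accepts --headless, --convert-to and --outdir (a command-line converter that postdates this thread and is not part of the API Joe refers to), and it assumes the page has already been saved to a local HTML file. The class and method names are made up.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

class SofficePdfSketch {
    // Builds the command line; assumes a LibreOffice-style "soffice" is on the PATH.
    static List<String> pdfCommand(Path htmlFile, Path outDir) {
        return Arrays.asList("soffice", "--headless", "--convert-to", "pdf",
                "--outdir", outDir.toString(), htmlFile.toString());
    }

    // Runs the conversion and returns the process exit code.
    static int convert(Path htmlFile, Path outDir) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(pdfCommand(htmlFile, outDir)).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call convert(...) to actually run it.
        System.out.println(pdfCommand(Paths.get("record.html"), Paths.get("out")));
    }
}
```

Whether images embedded in the page survive the conversion depends on the office suite being able to resolve them from the saved HTML, which is the same concern raised elsewhere in this thread.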
Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 257
Joe, so the document that is converted does not have to be an office suite document, right, even via URL? I'll download it and check it out. Thank you again for the lead.

Oh, if you have a minute, do I download an executable from the mirror site? It looks that way (Windows 2000).
[ March 19, 2007: Message edited by: Tom Griffith ]
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8867
    

Originally posted by Tom Griffith:
Joe, so the document that is converted does not have to be an office suite document, right, even via URL?


I'm not sure what you are getting at here. Are you asking if OpenOffice can open an HTML document? Yes, it can.
Tom Griffith
Ranch Hand

Joined: Aug 06, 2004
Posts: 257
Yeah, that sums up what I was asking. Thank you again. I'm going to experiment with it today. I suspect it's going to have the same problem with embedded images in a formatted web page as the other alternatives, because I think the true source of my problem is the HttpURLConnection stream. This approach, like the others, relies on that stream to read the web page content, and the embedded images aren't included in that stream. Maybe I can do it through the Domino back end, but I don't think the Domino API digs deep enough to preserve its rich text format. Round and round it goes. Thank you again.
[ March 20, 2007: Message edited by: Tom Griffith ]
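On the point about the stream: the response for a page URL carries only the HTML markup, and each embedded image is a separate resource that the page refers to through an img src attribute, so a browser's "save complete" feature has to request each one in turn. A minimal sketch of that step follows, assuming the HTML has already been read into a string. The class and method names are made up, the regular expression only handles quoted src attributes (a real HTML parser would be more robust), and fetch needs Java 9 or later for readAllBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImageLinkSketch {
    // Simplistic pattern: quoted src attributes on <img> tags only.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns the absolute URL of every <img> found in the page, in document order.
    // Throws IllegalArgumentException if a src value is not a valid URI (e.g. contains spaces).
    static List<URI> imageUrls(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        List<URI> result = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            // Relative links are resolved against the page's own URL.
            result.add(base.resolve(m.group(1)));
        }
        return result;
    }

    // Downloads one resource with a separate request; no authentication handling.
    static byte[] fetch(URI uri) throws IOException {
        try (InputStream in = uri.toURL().openStream()) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) {
        String html = "<p>text <img src=\"images/logo.gif\"> more</p>";
        System.out.println(imageUrls("http://example.com/app/page.html", html));
    }
}
```

The bytes returned by fetch for each URL are what a PDF or MHT writer would then need to place alongside the text. A server that requires a login would also need the session credentials sent with each image request, which this sketch leaves out.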
 
 