Save a webpage as PDF or Excel file

Mike Yu
Ranch Hand

Joined: Nov 17, 2001
Posts: 175
Hi,

I have a web page whose content is generated dynamically from a database. If users want to save the page as a PDF or Excel file, how can I do that?


Thanks,
Mike
Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

There is nothing in JSP for doing that.
You'll have to read up on 3rd party products like iText and POI if you want to generate documents in other formats.

See:
http://faq.javaranch.com/java/JspAndExcel


vishalraju shah
Greenhorn

Joined: Oct 06, 2007
Posts: 19
If you are populating a collection from the DB, passing it to a servlet, and forwarding to a JSP, then the following is one way to achieve your requirement.

<%@ page contentType="text/csv" %>
<%
    // Ask the browser to offer the response as a download named Data.csv.
    // ("inline" and "attachment" are alternatives, so only one is sent.)
    response.setHeader("Content-Disposition", "attachment; filename=Data.csv");
    response.setHeader("Pragma", "public");
    // setHeader replaces any earlier value for the same header, so Cache-Control is set once.
    response.setHeader("Cache-Control", "must-revalidate");
    String separator = System.getProperty("line.separator");
    pageContext.setAttribute("separator", separator);
%>

// Iterate over your collection and print the values separated by commas.
// This will prompt the user to open or save a CSV file.
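
A minimal sketch of the iteration step described in the comments above, written as plain Java so it could be called from a servlet or JSP. The class and method names (CsvSketch, escape, toLine, toCsv) are hypothetical and not from the original post. The quoting follows the usual CSV convention: fields containing a comma, a quote, or a line break are wrapped in quotes, and embedded quotes are doubled.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of any post in this thread.
final class CsvSketch {

    // Quote a field only when it contains a comma, a quote, or a line break,
    // doubling any embedded quotes.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join one row of values with commas.
    static String toLine(List<String> row) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < row.size(); i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(escape(row.get(i)));
        }
        return line.toString();
    }

    // Build the whole CSV body, one CRLF-terminated line per row.
    static String toCsv(List<List<String>> rows) {
        StringBuilder csv = new StringBuilder();
        for (List<String> row : rows) {
            csv.append(toLine(row)).append("\r\n");
        }
        return csv.toString();
    }
}
```

In the JSP, something like out.print(CsvSketch.toCsv(rows)) would then write the body, assuming rows is the collection the servlet placed in scope. The result is still plain CSV text that Excel can open, not a native Excel file.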


SCJP1.4 (92%), SCWCD (85%), SCBCD (81%), SCEA-I (In Progress)
Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

As the link I posted mentions, that is not really a full answer to the original poster's question. Your code creates a CSV file (which Excel happens to be able to open). This doesn't allow the user to save a page in the binary format that Excel uses. It also doesn't convert anything to a PDF.

The link that I gave (as well as showing how to create a CSV file) mentions some 3rd party libraries for working with proprietary Office formats.
Abhinav Srivastava
Ranch Hand

Joined: Nov 19, 2002
Posts: 349

This is one way of Exporting to Excel:
Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

Originally posted by Abhinav Srivastava:
This is one way of Exporting to Excel:


No it isn't.

That does nothing more than tell the browser that you're sending it an Excel file in the response. It does absolutely nothing to the data to transform (or export) it into a Microsoft Office format.
Abhinav Srivastava
Ranch Hand

Joined: Nov 19, 2002
Posts: 349

Excel recognizes HTML markup, so the HTML content is opened as an Excel document. I have done it myself many times. The key is that you are only changing the headers, not the content.

p.s. I'm not sure if it would work in any browser other than IE.
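
A rough sketch of what this header-only approach amounts to, assuming the page body is reduced to a bare HTML table. HtmlTableSketch, escape, and toTable are hypothetical names introduced for illustration; application/vnd.ms-excel is the MIME type commonly sent for Excel content. The point to notice is that the body stays HTML, and only the label on the response changes.

```java
import java.util.List;

// Hypothetical helper for illustration; not code from this thread.
final class HtmlTableSketch {

    // Content type a servlet or JSP would set on the response, for example
    // via response.setContentType(EXCEL_CONTENT_TYPE).
    static final String EXCEL_CONTENT_TYPE = "application/vnd.ms-excel";

    // Escape the characters that would otherwise be read as markup.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Render the rows as a bare <table>, with no <html> or <body> wrapper.
    static String toTable(List<List<String>> rows) {
        StringBuilder html = new StringBuilder("<table>\n");
        for (List<String> row : rows) {
            html.append("<tr>");
            for (String cell : row) {
                html.append("<td>").append(escape(cell)).append("</td>");
            }
            html.append("</tr>\n");
        }
        html.append("</table>\n");
        return html.toString();
    }
}
```

Whatever content type is sent with it, the string returned by toTable is HTML text from the first byte to the last.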
[ January 14, 2008: Message edited by: Abhinav Srivastava ]
Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

I understand that and that is mentioned in the FAQ entry that I posted.

Setting the content header does not, however, 'export' the document to another format. To do that, you need to use a third party product, such as Apache POI, or iText (for PDF).

Just because Excel can open a CSV or HTML table doesn't make these things Excel documents. A true Excel document is a binary formatted file (newer versions are based on XML) that has features that can't be duplicated with simple csv or html files.

Similarly, you can open a plain text file with Word.
Doing so doesn't convert the plain text document into a true Word-formatted document.
vishalraju shah
Greenhorn

Joined: Oct 06, 2007
Posts: 19

"If users want to save the page as a PDF file or Excel file"


Ben ,
POI and iText are the more elegant solutions, and I agree with that. However, if the user here just wants to save the file in Excel/PDF format, in my opinion that can be done the way shown above.
Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

The Content-Type header does nothing more than tell the browser what type of document is being returned in the response body.

If I handed you a dead seagull and told you it was a chicken, would my words make it so?
No.
If I could find a chef who was willing to cook it and serve it to customers of his who ordered chicken, would that prove my words made it so?
No.

It would still be seagull, not chicken.

Setting the content type does nothing to change the document being returned from the server.
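
One way to check this for yourself is to look at the first few bytes of the response body, which do not depend on any header. The sketch below is only an illustration: FormatSniffer and its method names are hypothetical, and it recognizes just three well-known signatures (the OLE2 container used by binary Office files such as .xls, the ZIP container used by the newer XML-based Office files, and the %PDF- marker at the start of a PDF).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not code from this thread.
final class FormatSniffer {

    // First eight bytes of an OLE2 compound file (binary .xls, .doc, ...).
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Local file header of a ZIP archive (the container for .xlsx).
    private static final byte[] ZIP = {'P', 'K', 3, 4};
    // Every PDF starts with this marker.
    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Classify a body by its leading bytes, ignoring any declared content type.
    static String sniff(byte[] body) {
        if (startsWith(body, OLE2)) {
            return "ole2";
        }
        if (startsWith(body, ZIP)) {
            return "zip";
        }
        if (startsWith(body, PDF)) {
            return "pdf";
        }
        return "other";
    }
}
```

CSV text or an HTML table comes back as "other" here no matter which Content-Type header accompanied it.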
Abhinav Srivastava
Ranch Hand

Joined: Nov 19, 2002
Posts: 349

"POI and iText are the more elegant solutions, and I agree with that. However, if the user here just wants to save the file in Excel/PDF format, in my opinion that can be done the way shown above."


How would you create a pdf this way? You can at best provide the filename having "pdf" extension. For true PDF, you need a binary transformation.

Even with CSV (which Excel can normally render), you have to check whether the requirement is actually for a ready-to-go native XLS file.

Other than iText and the like, there may be software that can do the PDF transformation on the fly, e.g. Adobe's online HTML-to-PDF converter or a browser plugin that offers a "Save as...", but that is beyond the scope of this discussion.
Glenn Graham
Greenhorn

Joined: Feb 16, 2008
Posts: 1
Ben, nice analogy (the dead seagull, I like it!) but not quite correct. What happens when you change the header is more like changing the atomic bonding of all the molecules of the dead seagull, so in fact it becomes a chicken. If you put more than a <Table> or <Pre> tag as your outside container, not only will you have a chicken but some excess carbon residue as the browser's Excel/CSV interpreter doesn't understand any other outer tags. Don't use <HTML> or <Body> or anything else.

The reason this does work is that the browser now interprets all the tag and data content as Excel info. You can also do Excel formatting by including style info in the <td>.

Usually what will happen is the browser will display the spreadsheet in the browser. If you want to download it instead, you can either save it once it's displayed or you can right click on the prior link and choose "Save As".

Sometimes one will have a problem that after the first instance of saving a document, the browser doesn't return to prior state and keeps giving an error message on subsequent file downloads. Most often this is due to extraneous tags. Remember, only a table!

Another way to do it is to create an Excel file on the server, or to stream the above data into a binary object on the server. Once again, setting the response header will always pop up the "Save As" dialog box, with .XLS or .CSV as the selected file extension.

All the above does not require any third-party software and can be done in ASP or JSP.

To save as a PDF would require third-party software to create the PDF document. Anything from Ghostscript to many available utilities. I use abcPDF, which can construct a simple or complex PDF from any component elements.

Mike Yu, hope this helps you.

- Glenn
S Reddy
Ranch Hand

Joined: May 17, 2007
Posts: 45
Hi Mike, check whether the following link is of any use to you.
http://labnol.blogspot.com/2007/06/add-as-pdf-button-to-your-websites-and.html

@Glenn Graham: Even though this works for simple files, it is not actually an Excel file, and you can't have advanced Excel features (formulas, charts, etc.) in your file without using external libraries like POI.
[ February 17, 2008: Message edited by: Srikanth Reddy Lankireddy ]
Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

Originally posted by Glenn Graham:
The reason this does work is that the browser now interprets all the tag and data content as Excel info.


This is incorrect as well.
The browser will not interpret anything as Excel.
When it sees a content type header for which it has an associated application, it will spawn an instance of that application and pass the body of the response to it. Excel will then parse the HTML table or CSV data and present it to the user. As I've mentioned several times in this thread, the example code in the FAQ entry to which I've linked does just that.

Again, doing this will not create a valid Excel document.
Excel documents (with all the bells and whistles) are binary.
To create one, you need a third party library like Apache POI.

[Srikanth Reddy Lankireddy beat me to it and said it better]
[ February 17, 2008: Message edited by: Ben Souther ]
Jason Ferguson
Ranch Hand

Joined: Sep 16, 2007
Posts: 47
Okay, gotta go with Ben on this one. I need to remember the dead seagull analogy.

However, if you want a simple way to do this, check out the DisplayTag library at http://displaytag.sourceforge.net. While its primary purpose is to lay out information into HTML tables, read further into the docs for exporting to other formats, including Excel and PDF.
Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

Thanks Jason.
I've added a link to the DisplayTag page in our JspAndExcel FAQ.
To reinforce what I said earlier, DisplayTag uses Jakarta POI in order to create Excel binary files.
vishalraju shah
Greenhorn

Joined: Oct 06, 2007
Posts: 19
Ben ,

Thanks.

Things are pretty clear now. It seems I was incorrectly using response.setContentType... and assuming that it exports to Excel, when in reality the file is not a true Excel file. (I did explore POI, and it is clear to me now.)

Thanks once again.
[ February 18, 2008: Message edited by: vishalraju shah ]
Ben Souther
Sheriff

Joined: Dec 11, 2004
Posts: 13410

vishalraju shah,
It's not 'incorrect' to return CSV data with a content type header suggesting that the browser open the file in Excel. What's important is that you understand the difference between this and the generation of a true Excel file.
vishalraju shah
Greenhorn

Joined: Oct 06, 2007
Posts: 19
Yup. True Excel files contain binary data, and calling setContentType does not produce that data. That is the difference between generating a real Excel file and asking the browser to open CSV data in Excel.

Thanks again.

[ February 18, 2008: Message edited by: vishalraju shah ]
 