Here is what I am trying to implement. My search returns a list of result files (DOCs and PDFs). I need to open each PDF with Adobe Reader or whatever PDF software is registered, and each DOC with MS Word or OpenOffice.
I found this on Google: Runtime.getRuntime().exec("rundll32 url.dll,FileProtocolHandler " + filePath); I have yet to try it.
But I need to open the document or PDF with the search words highlighted. Is there a way to achieve this? Right now I do it in a crude way: I read the text, highlight the words, and display the result as HTML.
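For reference, the rundll32 call mentioned above can be sketched as follows. This is only a sketch under assumptions: the class and method names (DocumentOpener, rundll32Command, open) are made up for illustration, and java.awt.Desktop (available since Java 6) is tried first as an alternative that was not part of the original post. The rundll32 fallback is Windows-only.

```java
import java.awt.Desktop;
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not from the thread: opens a file with the
// program that the operating system has registered for its type.
final class DocumentOpener {

    // Builds the rundll32 command as separate arguments, so the file path
    // stays one argument instead of being glued onto the command string.
    static List<String> rundll32Command(String filePath) {
        return Arrays.asList("rundll32", "url.dll,FileProtocolHandler", filePath);
    }

    // Tries java.awt.Desktop first; falls back to rundll32 (Windows only).
    static void open(File file) throws IOException {
        if (Desktop.isDesktopSupported()
                && Desktop.getDesktop().isSupported(Desktop.Action.OPEN)) {
            Desktop.getDesktop().open(file);
        } else {
            new ProcessBuilder(rundll32Command(file.getAbsolutePath())).start();
        }
    }
}
```

Calling DocumentOpener.open(new File(filePath)) would then hand the file to the registered viewer. Neither route offers any way to request highlighting; that part depends on the viewer itself.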
Originally posted by Ananth Chellathurai: But I need to open the document or pdf with some highlighted search words. Is there a way to achieve this?
If there is, it will be a feature of the Word/PDF viewer. So before trying it from Java you first need to find out if it is actually possible. To do that you need to read the Word/PDF viewer documentation or ask on a suitable forum.
Alternatively you could write your own word/pdf document viewer in Java and then you can do whatever you want, but I doubt if this is a simple thing to do. Might be worth googling to see if there is any such code/program already available. Apache Commons is always a good place to check. [ August 14, 2008: Message edited by: Joanne Neal ]
Thanks for your comments. Acrobat Viewer JavaBean is Adobe's PDF viewer for Java. But I still don't know whether it can help me highlight search words.
Ananth Chellathurai
Joanne Neal
Rancher
Joined: Aug 05, 2005
Posts: 3011
Download the viewer. The zip file contains Javadocs. Look at the ViewerCommand interface - it provides options to do a search.
Be aware that the Acrobat Java Bean hasn't been updated in ages, and has lots of problems with recent PDF versions. I think the PDFRenderer library (on dev.java.net) is much more promising. [ August 14, 2008: Message edited by: Ulf Dittmer ]
I am struggling to find a way to open a document with highlighted search words.
Even if there are APIs to open it with search highlights, my requirement is to highlight multiple words. So I am trying a workaround: using Apache POI to extract the text and displaying it as HTML with the matched words highlighted. Is this a good approach?
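The HTML workaround described above could look roughly like this once the text has been extracted. It is a minimal sketch, assuming the text is already available as a plain String (extraction with POI is not shown); the class name HtmlHighlighter and the yellow span markup are illustrative choices, not something from the thread. It matches whole words, ignores case, and handles several words in one pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: wraps every occurrence of the given words in a
// highlighted <span> and HTML-escapes everything else.
final class HtmlHighlighter {

    static String highlight(String text, List<String> words) {
        List<String> quoted = new ArrayList<String>();
        for (String w : words) {
            if (w != null && w.trim().length() > 0) {
                quoted.add(Pattern.quote(w.trim())); // treat words literally
            }
        }
        if (quoted.isEmpty()) {
            return escape(text);
        }
        // One alternation for all words, whole-word and case-insensitive.
        Pattern p = Pattern.compile("\\b(?:" + String.join("|", quoted) + ")\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher m = p.matcher(text);
        StringBuilder html = new StringBuilder();
        int last = 0;
        while (m.find()) {
            html.append(escape(text.substring(last, m.start())));
            html.append("<span style=\"background-color: yellow\">")
                .append(escape(m.group()))
                .append("</span>");
            last = m.end();
        }
        html.append(escape(text.substring(last)));
        return html.toString();
    }

    // Minimal HTML escaping; '&' must be replaced first.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        System.out.println(highlight("Security & risk management",
                Arrays.asList("security", "management")));
    }
}
```

The trade-off is the one raised later in the thread: the output is readable and can highlight any number of words, but the original layout of the document is not preserved.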
Ananth Chellathurai
narendra darlanka
Ranch Hand
Joined: Jun 17, 2005
Posts: 62
Hi, we used Apache Lucene for search and text highlighting with PDFs. I am not sure about MS Word.
Can you help me get started? Here is my requirement: given the path of a PDF document, can it highlight the words provided? I have downloaded Lucene 2.3.2.
Ananth Chellathurai
Ulf Dittmer
Marshal
Joined: Mar 22, 2005
Posts: 35254
Lucene can help with showing highlighted search terms, but only in text that is in the index, and which is then displayed by the search code in some fashion. It has no provisions for displaying Word or PDF documents, or for highlighting search terms in them.
I was going around in circles trying to find out whether Lucene has such provisions. So Ulf, would the solution I mentioned two posts ago be the best option, or should I look for more APIs? What would you suggest?
Ananth Chellathurai
Ulf Dittmer
Marshal
Joined: Mar 22, 2005
Posts: 35254
Do you mean using POI to extract text and then displaying it? Not sure if it's the best approach; it certainly sounds like a feasible one.
You'll still need to use a different library for PDFs.
If this was my problem, I'd probably create a full text index using Lucene, and use that to help with displaying the text with the search terms highlighted. You can always provide a button that uses the rundll32 thing you mentioned to display the original document.
Here is what I have done: I used the PDFTextStripper class from PDFBox (as it is very easy to use) to get the text from the PDF, and I display the results as HTML with highlighted words. I also have a link to open the original document, using rundll32.
I would like to find a better API for PDF text extraction, as PDFBox's last release was on 10/12/2006.
Please let me know if there is a better API for getting text from PDFs.
Hi, sorry for the confusion. Our solution was similar to the one suggested by Ulf: we indexed the PDFs and then used the text highlighter in PDFBox to generate the highlight XML. We passed this XML to Acrobat as a parameter. This works only with Acrobat. http://www.pdfbox.org/userguide/highlighting.pdf
Highlighting text in a PDF is simple and doesn't need any API. From your browser, open file:///D:/accesscontrol1.pdf#search="Security Management": the search words can be given to Acrobat as a parameter along with the PDF name.
This should open the PDF with search highlights. But when I tried it, the PDF opened without search highlights.
I do not know what I am missing here. Any help is appreciated.
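Building such a link in Java could be sketched as below. The class name PdfSearchLink and its method are hypothetical. The sketch only constructs the URI with the search fragment properly percent-encoded; whether the viewer honours the fragment depends on the Acrobat or Reader version and on how the file is opened, which may be why the highlights did not appear in the attempt above.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: builds a file: URI that carries Acrobat's
// search="..." open parameter in the fragment.
final class PdfSearchLink {

    static URI withSearch(String absolutePath, String words) {
        // Normalise a Windows path such as D:\docs\a.pdf to /D:/docs/a.pdf
        String path = absolutePath.replace('\\', '/');
        if (!path.startsWith("/")) {
            path = "/" + path;
        }
        try {
            // The multi-argument constructor percent-encodes characters
            // that are illegal in a URI, such as spaces and double quotes.
            return new URI("file", null, path, "search=\"" + words + "\"");
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build URI for " + absolutePath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withSearch("D:\\accesscontrol1.pdf", "Security Management"));
    }
}
```

Adobe's PDF open parameters document also describes a /A command-line switch for passing the same parameters when launching the viewer executable directly; that route is not shown here and would need the path to the executable.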
I tried several things but was unable to get exactly what I was looking for.
I found this URL, http://office.microsoft.com/en-us/word/HP101640101033.aspx, but it doesn't list a startup option for opening a document with search highlights. I tried writing the text extracted with poi.hwpf.WordExtractor out as HTML and opening it through winword with this code.
It opens the file in Word, but the formatting is lost in the HTML and it doesn't look like the original document.
Can anyone suggest a better way to do this? I need to open an existing DOC file with some search words highlighted.
It didn't help me, Joanne. I do not know how to pass search words as variables to macros. Even if there is a way to give dynamic data to macros, a highlight-all search can be done for only one word. I need to highlight multiple words.
There should be some better way than macros. I am trying to figure this out. Any help is appreciated.
Ananth Chellathurai
Joanne Neal
Rancher
Joined: Aug 05, 2005
Posts: 3011
Have you tried asking on an MS Word forum? Until you have found out if what you want to do is possible, this is not really a Java question.
Here is what I did: I saved all my search words in a DOC file under a temp folder, downloaded a macro that highlights the words listed in that document, and opened the Word document with the /m option from Runtime.getRuntime().exec. It works flawlessly.
Using a single file to hold the search words will not work, though, because multiple users will be using the application simultaneously. So I need to pass the path of the search-words file as an argument to the Word macro.
It's simple to pass arguments to Word macros from VB or other Microsoft technologies.
But is there a way to pass arguments to Word macros from Java?
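One possible workaround, offered purely as an untested assumption rather than a confirmed answer: a process started from Java can be given its own environment variables, and a VBA macro can read environment variables with Environ. The sketch below prepares such a launch. The class name, the macro name passed in, and the variable name SEARCH_WORDS_FILE are all hypothetical, and the macro itself would have to be changed to read Environ("SEARCH_WORDS_FILE") instead of a fixed path.

```java
// Hypothetical helper: prepares (but does not start) a Word launch that
// runs a macro via /m and passes a per-user words file through the
// child process environment.
final class WordMacroLauncher {

    static ProcessBuilder prepare(String winwordExe, String macroName,
                                  String docPath, String wordsFilePath) {
        // /m<macro> is the Word startup switch that runs the named macro.
        ProcessBuilder pb = new ProcessBuilder(winwordExe, "/m" + macroName, docPath);
        // Assumption: the macro reads this variable to locate the words file.
        pb.environment().put("SEARCH_WORDS_FILE", wordsFilePath);
        return pb;
    }
}
```

Calling prepare(...).start() would launch Word. A caveat to check before relying on this: if Word is already running, the new launch may be handed over to the existing instance, in which case the variable set here would probably not be visible to the macro.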
I am struggling to get documents from remote hosts within the network. I previously used jcifs.smb to get the document text; now I need the whole document to be opened or copied. The copyTo method of SmbFile doesn't help me. I am missing something when using this method.
I tried several things:
Copying the file to my C: drive, which ran into permission problems, so I tried the D:\temp folder instead. One of the posts on JavaRanch suggested that the destination folder should also be shared; I tried that too, but it was no use. What should the format of the destination path be?
Then I tried another way instead of copyTo: I got the InputStream from the SmbFile and wrote it to my local disk using an OutputStream. And it works.
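The stream-based copy that worked can be sketched like this. To keep the example self-contained it works on any InputStream; with jcifs the stream would presumably come from the SmbFile (for example via its getInputStream() method), which is not shown. The class name StreamCopy is made up for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copies a remote document to a local file by
// pumping bytes from an InputStream to an OutputStream.
final class StreamCopy {

    // Copies everything from in to out; returns the number of bytes copied.
    // The caller remains responsible for closing both streams.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    // Convenience wrapper: writes the stream to a local file and closes
    // both the source stream and the file when done.
    static long copyToFile(InputStream in, File target) throws IOException {
        try (InputStream source = in;
             OutputStream dest = new FileOutputStream(target)) {
            return copy(source, dest);
        }
    }
}
```

Because the bytes are written by the local JVM, the destination only needs to be writable by the Java process; it does not have to be a shared folder.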
I always get very quick responses in this forum, but not for this post. Anyhow, thank you all, Joanne, Ulf and Naren, for your support and the valuable thoughts that helped get this business requirement implemented.
All the solutions above work fine when the application runs on the same machine. They don't work for a client-server application: the server will not know the path of the Acrobat or MS Word executable on the client machine.
I have also posted a question in the Sun forum about opening a PDF with open parameters specified in the response header. Apologies for cross-posting; I am already running late to either find a solution or report to my IT head that there is no solution for this in Java.
I tried it like this,
But it treats the file name as filename.pdf_search=Security. Please help with this.
I concluded there is no solution for this in Java, and I have chosen to highlight my words in HTML built from the content read from the PDFs or DOCs.
Ananth Chellathurai
stebin john
Greenhorn
Joined: Sep 09, 2008
Posts: 1
Hi Ananth and all,
I had much the same problem. Following this thread, I was able to solve it. Just one doubt: when you highlight search terms in Acrobat (for PDF) using the #search parameter, how can I search for a group of terms, for example #search="In this country", so that I get only the matches that contain all these terms (exact match)?
Your help will be most appreciated. Thanks in advance.
Hey, I'm not sure whether an exact search can be specified in the search parameter. Did you try asking in the Adobe forums? Did you try the second option, PDFBox, i.e. generating a highlight XML document?
That could meet your needs, though a lot of customization would still be needed. So I would suggest you try the Adobe forums too.
Ananth Chellathurai
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32712
Welcome to JavaRanch
Since everybody else had no end of problems, please tell us how you sorted your problem out
subject: Opening a File with the appropriate program with highlighted text