I am building a small project with Apache Lucene that searches for words in different types of files: PDF, HTML, and TXT. I use the Highlighter library to highlight the matched words, but it marks every match across the whole content. I would like something closer to how Google displays its results:
If I search for several words, I would like to display a part of the text that contains all of them, with each one highlighted. If the words aren't close enough to each other (more than 5 words apart, for example), I would like to display only the first appearance of each word.
If I search for <the 1st word> <the 2nd word>, I would like to display something like this:
xyz abcd <the 1st word>.........<the 2nd word> abcd abc
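To make the desired output concrete, here is a rough plain-Java sketch of the behaviour I am after. It does not use Lucene at all; the class name `SnippetSketch`, the method `snippet`, the `maxGap` and `context` parameters, and the `<b>` highlight tags are all made up for illustration. It also simplifies things by looking only at the first occurrence of each word and by matching whole whitespace-separated words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, only meant to illustrate the wanted output format.
final class SnippetSketch {

    /**
     * Builds a short excerpt around the first occurrence of each term.
     *
     * @param maxGap  largest distance (in words) between two hits that is still shown in full
     * @param context number of plain words shown before the first hit and after the last hit
     */
    static String snippet(String text, List<String> terms, int maxGap, int context) {
        String[] words = text.trim().split("\\s+");

        // Index of the first occurrence of every term, sorted in text order.
        TreeSet<Integer> hits = new TreeSet<>();
        for (String term : terms) {
            for (int i = 0; i < words.length; i++) {
                if (words[i].equalsIgnoreCase(term)) {
                    hits.add(i);
                    break;
                }
            }
        }
        if (hits.isEmpty()) {
            return "";
        }

        int first = hits.first();
        int last = hits.last();
        List<String> parts = new ArrayList<>();

        // Leading context before the first hit.
        for (int i = Math.max(0, first - context); i < first; i++) {
            parts.add(words[i]);
        }

        int prev = -1;
        for (int hit : hits) {
            if (prev >= 0) {
                if (hit - prev <= maxGap) {
                    // Hits are close: keep the words between them.
                    for (int i = prev + 1; i < hit; i++) {
                        parts.add(words[i]);
                    }
                } else {
                    // Hits are far apart: replace the gap with an ellipsis.
                    parts.add("...");
                }
            }
            parts.add("<b>" + words[hit] + "</b>");
            prev = hit;
        }

        // Trailing context after the last hit.
        for (int i = last + 1; i <= Math.min(words.length - 1, last + context); i++) {
            parts.add(words[i]);
        }
        return String.join(" ", parts);
    }
}
```

For example, calling `SnippetSketch.snippet(text, List.of("foo", "bar"), 5, 2)` on a text where `foo` and `bar` are seven words apart would keep two words of context on each side and put `...` between the two highlighted words. What I don't know is how to get this kind of result from the Lucene highlighter itself.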
This is my code right now:
How could I do this? Thanks!