I have a task in which I have to scan the HTML source of web pages and extract information that matches certain patterns. The extracted information will be saved in a database for business purposes. The amount of data being extracted is not large.
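To make the extraction step concrete, here is a minimal sketch of the kind of thing I mean, using only `java.util.regex` from the standard library. The class name `PatternExtractor`, the `href` pattern, and the sample HTML are assumptions for illustration only; the real patterns would be different, and a plain regex is likely too fragile for arbitrary real-world HTML.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternExtractor {
    // Hypothetical pattern: captures the value of href attributes
    // written with either double or single quotes.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    // Returns every captured attribute value found in the given HTML source.
    static List<String> extract(String html) {
        List<String> results = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            results.add(m.group(1));
        }
        return results;
    }

    public static void main(String[] args) {
        // Sample input, made up for this example.
        String html = "<a href=\"http://example.com/a\">A</a> <a href='b.html'>B</a>";
        for (String link : extract(html)) {
            System.out.println(link);
        }
    }
}
```

In the real task, each extracted value would then be written to the database instead of printed.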
I am looking for a suitable web crawler written in Java.
Does anyone have suggestions or any other input to share?