I keep a list of fewer than 100 web sites that I visit a few times a year.
While there I look around, generally viewing 2 to 10 pages,
looking for statements about dates in the future.
Occasionally, I make notes about what I find.
Most of the time, I find that nothing has changed since my last visit.
It eventually dawned on me that much of this could be automated, if I kept some extra data
with the URL, such as the date I last visited the site, the farthest-out date mentioned at the site,
the date when the site was last updated, and the date I last found anything worth noting at the site.
This suggests a flat file, a spreadsheet, or a relational database.
The information I gather fits a database nicely: about six relations and a few hundred to a few thousand
records in total, so performance is not an issue.
I get information from other sources as well, so I need a way to update the database independently.
While at one of the web sites I occasionally get two or more facts that should change the database.
None change the set of sites I am visiting.
Occasionally, I want some summary of the information in the database. Examples:
Who says anything about July or August, and what do they say?
Who says anything about X next year, and when?
What topics are mentioned by most sites with dates next year?
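To make the first example concrete, here is a minimal sketch in Java that works on an in-memory list rather than a real database. The names Note, NoteQueries, and mentioning are hypothetical, made up only for illustration, and a real version would more likely be a database query.

```java
import java.time.LocalDate;
import java.time.Month;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical record of one statement found at a site.
final class Note {
    final String site;     // name or URL of the site (assumed field)
    final LocalDate date;  // the future date the statement refers to
    final String text;     // what the site says about that date

    Note(String site, LocalDate date, String text) {
        this.site = site;
        this.date = date;
        this.text = text;
    }
}

final class NoteQueries {
    // "Who says anything about July or August, and what do they say?"
    // Returns every note whose date falls in one of the given months.
    static List<Note> mentioning(List<Note> notes, Month... months) {
        List<Month> wanted = Arrays.asList(months);
        return notes.stream()
                .filter(n -> wanted.contains(n.date.getMonth()))
                .collect(Collectors.toList());
    }
}
```

Calling NoteQueries.mentioning(notes, Month.JULY, Month.AUGUST) would return the matching notes, and the site and text fields of each answer the "who" and "what".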
The query to visit web sites will be flexible, but a common one might be to
skip sites whose farthest-out date is more than two months in the future.
The usual behavior of these sites is to look further into the future when they make a change.
Very rarely, they will change an earlier date. My approach might not detect such changes immediately.
I accept the slight risk.
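As a rough sketch of that common visit query, again in Java over an in-memory list: the names Site, VisitPlanner, and sitesToVisit are hypothetical, and I am assuming that a site with no recorded farthest-out date should always be visited.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical per-site data kept alongside the URL.
final class Site {
    final String url;
    final LocalDate lastVisited;   // date I last visited the site
    final LocalDate farthestDate;  // farthest-out date mentioned, or null if unknown

    Site(String url, LocalDate lastVisited, LocalDate farthestDate) {
        this.url = url;
        this.lastVisited = lastVisited;
        this.farthestDate = farthestDate;
    }
}

final class VisitPlanner {
    // Skip sites whose farthest-out date is more than two months after today.
    // Sites with no recorded date are kept (an assumption, not a requirement).
    static List<Site> sitesToVisit(List<Site> sites, LocalDate today) {
        LocalDate cutoff = today.plusMonths(2);
        return sites.stream()
                .filter(s -> s.farthestDate == null || !s.farthestDate.isAfter(cutoff))
                .collect(Collectors.toList());
    }
}
```

Passing the current date as a parameter, instead of calling LocalDate.now() inside the method, keeps the rule easy to test and makes it simple to try other cutoffs later.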
All this leads me to consider a web-based application using a browser with tabs and Java to do
the date manipulation and database operations. I think frame IDs can limit the proliferation of
tabs during maintenance and web scanning.
Is this a reasonable approach? If I am heading for a disaster, please warn me.
I'm happy to get some experience with Java and HTML, so I don't mind if this is not the
perfect way. There is not much Java in this request. If it is too little, just say so.
Also, the application is not secret. I figured almost nobody would be interested in it, so I
tried to make the description abstract and general. If you want/need the details to reply,
just ask. There will only be one user at a time, me, so the atomicity of the DB is not needed.
Actually there will probably be only one user ever, me, the developer, so error handling,
security, recovery procedures, and other factors will be less important than for most systems.
Thanks.