Html to database

 
S Raman
Greenhorn
Posts: 16
Hi,

I have HTML files containing financial data, one file per company, for around 2000 companies.
I want to parse each HTML file and insert its data into a database (MySQL).
The data fields are evident from the file itself; two additional fields will hold the company name and the quarter.

Please suggest an approach.


The file below shows only a few of the rows (types of expenses, income, etc.):



<HTML>
<HEAD>
<TITLE>Raw Materials</TITLE>
<link rel="stylesheet" href="style.css">
</HEAD>
<body bgcolor="#FFFFFF" topmargin="0" leftmargin="0">
<table border=0 cellspacing="0" cellpadding="0" width="610">
<tr valign="top">
<td width="2%"> </td>
<td width="96%">
<table border=0 cellspacing="1" cellpadding="0" width="100%" >
<tr valign="top">

<td align=center><font face=arial size=5 color="#014bae">Quarterly Results</font></td>

</tr>
<tr><td> </td></tr>
<tr>
<td align="center"width="40%" bgcolor="#F3CC00" colspan = 2><strong><font face="Arial" size="2">3i Infotech Ltd.</font></strong></td>
</tr>
<tr><td> </td></tr>

<tr><td align=right><font size=2>(Rs in Cr.)</font></td></tr>

</table>
<table width='100%' border='0' cellspacing='1' cellpadding='1' bgcolor = #1863ad>
<tr bgcolor = #73b5ce>
<td class='fnt5' ><b> </b></td>
<td class='fnt5' valign='center' align='center' ><b>Dec '06 </b></td>
<td class='fnt5' valign='center' align='center' ><b>Sep '06 </b></td>
<td class='fnt5' valign='center' align='center' ><b>Jun '06 </b></td>
<td class='fnt5' valign='center' align='center' ><b>Mar '06 </b></td>
<td class='fnt5' valign='center' align='center' ><b>Dec '05 </b></td>
</tr>
<tr>
<td class='fnt6' bgcolor=#FFFFFF > </td>
<td class='fnt6'bgcolor=#FFFFFF > </td>
<td class='fnt6'bgcolor=#FFFFFF > </td>
<td class='fnt6'bgcolor=#FFFFFF > </td>
<td class='fnt6'bgcolor=#FFFFFF > </td>
<td class='fnt6'bgcolor=#FFFFFF > </td>
</tr><td class='fnt6' bgcolor=#FFFFFF>Sales </td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 78.55</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 80.12</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 78.23</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 69.37</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 69.61</td>
</tr><td class='fnt6' bgcolor=#FFFFFF>Other Income </td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 6.93</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 4.05</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 4.08</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 1.44</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 2.09</td>
</tr>
<td class='fnt6' bgcolor=#FFFFFF>Stock Adjustment </td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
</tr>
<td class='fnt6' bgcolor=#FFFFFF>Raw Material </td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td></tr>
<td class='fnt6' bgcolor=#FFFFFF>Power And Fuel </td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
</tr>
<td class='fnt6' bgcolor=#FFFFFF>Employee Expenses </td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 22.50</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
</tr><td class='fnt6' bgcolor=#FFFFFF>Excise </td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
<td class='fnt6'bgcolor=#FFFFFF align='right'> 0.00</td>
</tr>
</table>
</td>
<td width="2%"> </td>
</tr>
</table>
</body>
</html>
<HTML>
<HEAD>
<TITLE>Ashika</TITLE>
</HEAD>
<link rel="stylesheet" href="style.css">
<body>
<table width="778" border="0" cellspacing="0" cellpadding="0">

</table>
</body>
</HTML>
 
Bear Bibeault
Author and ninkuma
Marshal
Posts: 64822
What does this have to do with XML?
 
S Raman
Greenhorn
Posts: 16
Hi,

I thought we could use XML and XSLT technology to convert the HTML to XML and then load it into the database.
Besides, I would like to keep the data in XML format for future reference.
 
Paul Sturrock
Bartender
Posts: 10336
HTML is not necessarily well-formed XML, so you can't apply XSLT to it directly. Also, XSLT is not a technology you can use to access a database; you need JDBC for that.
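To make the JDBC part concrete, here is a minimal sketch of how one table row could be turned into one record per quarter and written with a `PreparedStatement`. The table name `quarterly_result`, its column names, and the classes `QuarterlyFigure` and `QuarterlyLoader` are assumptions made up for illustration; nothing in the original post defines a schema.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** One figure from a results table, keyed by company, quarter and line item. */
final class QuarterlyFigure {
    final String company;
    final String quarter;
    final String item;
    final BigDecimal amount;

    QuarterlyFigure(String company, String quarter, String item, BigDecimal amount) {
        this.company = company;
        this.quarter = quarter;
        this.item = item;
        this.amount = amount;
    }
}

final class QuarterlyLoader {
    // Hypothetical table and column names; adjust to the real schema.
    static final String INSERT_SQL =
        "INSERT INTO quarterly_result (company, quarter, item, amount) VALUES (?, ?, ?, ?)";

    /** Strips ordinary and non-breaking spaces around a cell value. */
    static String clean(String s) {
        return s.replace('\u00A0', ' ').trim();
    }

    /** Turns one row (label followed by one cell per quarter) into one record per quarter. */
    static List<QuarterlyFigure> flatten(String company, List<String> quarters, List<String> row) {
        List<QuarterlyFigure> out = new ArrayList<>();
        String item = clean(row.get(0));
        for (int i = 0; i < quarters.size() && i + 1 < row.size(); i++) {
            BigDecimal amount = new BigDecimal(clean(row.get(i + 1)));
            out.add(new QuarterlyFigure(company, clean(quarters.get(i)), item, amount));
        }
        return out;
    }

    /** Inserts all figures in a single JDBC batch; the caller supplies an open Connection. */
    static void insertAll(Connection conn, List<QuarterlyFigure> figures) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (QuarterlyFigure f : figures) {
                ps.setString(1, f.company);
                ps.setString(2, f.quarter);
                ps.setString(3, f.item);
                ps.setBigDecimal(4, f.amount);
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```

The `Connection` would typically come from `DriverManager.getConnection(...)` with a MySQL JDBC URL, which requires the MySQL JDBC driver on the classpath. Storing one record per company, quarter and line item keeps the two extra fields from the question as ordinary columns.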
 
Ulf Dittmer
Rancher
Posts: 42967
You can use a library like TagSoup to convert HTML to something that an XML parser can work with.
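TagSoup exposes a SAX interface, so the extraction logic can be written as an ordinary SAX handler. The sketch below is only that handler: the class name `TableRowHandler` is hypothetical, and the `parse` helper feeds it from the JDK's own SAX parser with already well-formed input, as a stand-in for whatever the cleanup library would deliver. Wiring the handler to TagSoup itself is not shown.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Collects the text of every td cell, grouped by the tr row that contains it. */
class TableRowHandler extends DefaultHandler {
    final List<List<String>> rows = new ArrayList<>();
    private List<String> currentRow;
    private StringBuilder cell;

    // Works whether or not the parser is namespace-aware.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String n = name(localName, qName);
        if (n.equalsIgnoreCase("tr")) {
            currentRow = new ArrayList<>();   // a nested row replaces its enclosing row
        } else if (n.equalsIgnoreCase("td")) {
            cell = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (cell != null) {
            cell.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String n = name(localName, qName);
        if (n.equalsIgnoreCase("td") && cell != null) {
            if (currentRow != null) {
                // Normalize non-breaking spaces before trimming.
                currentRow.add(cell.toString().replace('\u00A0', ' ').trim());
            }
            cell = null;
        } else if (n.equalsIgnoreCase("tr") && currentRow != null) {
            if (!currentRow.isEmpty()) {
                rows.add(currentRow);
            }
            currentRow = null;
        }
    }

    /** Parses well-formed markup with the JDK SAX parser and returns the collected rows. */
    static List<List<String>> parse(String xml) {
        TableRowHandler handler = new TableRowHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("input could not be parsed as XML", e);
        }
        return handler.rows;
    }
}
```

Because the page nests tables inside layout tables, this handler keeps only the innermost rows; the outer layout rows are dropped when a nested row starts.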
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13061
The JTidy toolkit will create a sort of DOM from ill-formed HTML. You might be able to use that.

Bill
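If the cleanup step yields an `org.w3c.dom.Document`, the table can be read by walking the DOM. This is a rough sketch under several assumptions: the class `ResultsTable` and its methods are hypothetical, tag names are assumed to arrive in lower case, and the header row is assumed to be the first row with a blank label cell followed by quarter names, as in the sample page. It sticks to basic DOM Level 1 calls in case the DOM implementation is limited, and `parseXml` uses the JDK parser on well-formed input purely as a stand-in for the cleanup library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ResultsTable {
    /** Concatenates all text beneath a node, then trims it (non-breaking spaces included). */
    static String text(Node node) {
        StringBuilder sb = new StringBuilder();
        collect(node, sb);
        return sb.toString().replace('\u00A0', ' ').trim();
    }

    private static void collect(Node node, StringBuilder sb) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.TEXT_NODE) {
                sb.append(c.getNodeValue());
            } else {
                collect(c, sb);
            }
        }
    }

    /** Returns the text of the td cells that are direct children of a tr. */
    static List<String> cells(Element tr) {
        List<String> out = new ArrayList<>();
        for (Node c = tr.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE && c.getNodeName().equalsIgnoreCase("td")) {
                out.add(text(c));
            }
        }
        return out;
    }

    /** Maps line item -> (quarter -> raw cell value), in document order. */
    static Map<String, Map<String, String>> read(Document doc) {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        List<String> quarters = null;
        NodeList trs = doc.getElementsByTagName("tr");
        for (int i = 0; i < trs.getLength(); i++) {
            Element tr = (Element) trs.item(i);
            if (tr.getElementsByTagName("table").getLength() > 0) {
                continue;                       // layout row wrapping a nested table
            }
            List<String> row = cells(tr);
            if (row.size() < 2) {
                continue;                       // titles and spacer rows
            }
            if (quarters == null) {
                // Assumed header shape: blank label cell, then quarter names.
                if (row.get(0).isEmpty() && !row.get(1).isEmpty()) {
                    quarters = new ArrayList<>(row.subList(1, row.size()));
                }
                continue;
            }
            if (row.get(0).isEmpty() || row.size() != quarters.size() + 1) {
                continue;                       // blank or malformed data row
            }
            Map<String, String> byQuarter = new LinkedHashMap<>();
            for (int q = 0; q < quarters.size(); q++) {
                byQuarter.put(quarters.get(q), row.get(q + 1));
            }
            table.put(row.get(0), byQuarter);
        }
        return table;
    }

    /** Stand-in for the cleanup step: parses already well-formed markup with the JDK. */
    static Document parseXml(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("input could not be parsed as XML", e);
        }
    }
}
```

The company name would still have to be picked out separately, for example from the highlighted cell above the table, since it is not part of the data rows.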
 