Using the java.net package, I am trying to read an HTML page whose Content-Type is declared as:
<meta content="text/html;
charset=euc-kr" http-equiv="Content-Type" />
It is critical for me to be able to read the charset declared in the tag above.
Both urlConnection.getContentType() and urlConnection.getHeaderField("Content-Type") return just "text/html", which I believe is because these methods take their value from the HTTP response header rather than from the <meta> tag shown above.
Is there a way of getting the values of the <meta> tags beforehand, so that I can determine which charset to use while reading?
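To illustrate the kind of pre-scan being asked about, here is a minimal sketch, not a definitive solution. It assumes the <meta> declaration appears within the first few kilobytes of the page and that the bytes up to that point are ASCII-compatible (true for euc-kr). The class name MetaCharsetSniffer, the method sniff, and the 4096-byte limit are hypothetical choices made for illustration only.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: peeks at the start of a stream to find a <meta> charset.
final class MetaCharsetSniffer {
    // Assumption: the <meta> tag sits within the first 4096 bytes of the page.
    private static final int PEEK_LIMIT = 4096;

    // Finds charset=... inside a <meta ...> tag; [^>] also matches line breaks,
    // so a tag split across lines is still recognized.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta\\b[^>]*?charset\\s*=\\s*[\"']?([A-Za-z0-9._:-]+)",
            Pattern.CASE_INSENSITIVE);

    // Returns the declared charset name, or null if none is found.
    // The stream is rewound afterwards, so the caller can read it from the start.
    static String sniff(BufferedInputStream in) throws IOException {
        in.mark(PEEK_LIMIT);
        byte[] head = new byte[PEEK_LIMIT];
        int total = 0;
        int n;
        while (total < PEEK_LIMIT
                && (n = in.read(head, total, PEEK_LIMIT - total)) != -1) {
            total += n;
        }
        in.reset();

        // ISO-8859-1 maps every byte to one char, so decoding cannot fail and
        // the ASCII markup stays readable whatever the real encoding is.
        String text = new String(head, 0, total, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(text);
        return m.find() ? m.group(1) : null;
    }
}
```

With a URLConnection, the call might look like MetaCharsetSniffer.sniff(new BufferedInputStream(urlConnection.getInputStream())), after which the same BufferedInputStream can be wrapped in an InputStreamReader using the returned name. A regular expression is only an approximation of real HTML parsing, so a page with unusual markup could defeat it.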
I need to read an HTML page and write it to an already initialized response object, so it is critical for me to determine the encoding of the HTML file.
Transferring bytes directly from the InputStream to the response OutputStream, where I would not need to care about encoding, does not work: response.getWriter() has already been called, so response.getOutputStream() throws an IllegalStateException.
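For reference, the character-based transfer that this constraint forces could be sketched roughly as below, assuming the charset name of the page is already known (for example "euc-kr"). The class name CharsetCopy and the method copy are hypothetical; the Writer parameter stands in for the object returned by response.getWriter(), so the sketch does not depend on the servlet API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper: decodes bytes with a known charset and writes chars out.
final class CharsetCopy {
    // Decodes 'in' using 'charsetName' and writes the characters to 'out'.
    // Returns the number of chars written. Neither stream is closed here;
    // that is left to the caller.
    static long copy(InputStream in, String charsetName, Writer out)
            throws IOException {
        // Charset.forName throws an unchecked exception for unknown names.
        Reader reader = new InputStreamReader(in, Charset.forName(charsetName));
        char[] buf = new char[8192];
        long total = 0;
        int n;
        while ((n = reader.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

In a servlet this might be called as CharsetCopy.copy(urlConnection.getInputStream(), "euc-kr", response.getWriter()). The writer then re-encodes the characters in whatever encoding the response was given before getWriter() was called, so the page is transcoded rather than copied byte for byte.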
Could someone please advise how to resolve this problem?