I am trying to convert a proprietary format file into XML.
I have created a schema for the XML. However, the proprietary
format file contains special characters like 'τ' etc. which, if
left as they are, cause an error during validation.
You must distinguish between the character encoding and characters that are not allowed in XML. The disallowed characters are essentially the non-printing control characters (except for tab, CR, and NL). Those characters have to be removed or converted to allowable characters.
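As a rough sketch of the removal step, assuming Python and XML 1.0: the pattern below follows the `Char` production of the XML 1.0 Recommendation, and the function name `strip_illegal_xml_chars` is only illustrative. It works on text that has already been decoded to Unicode.

```python
import re

# Anything NOT matched by the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text, replacement=""):
    """Remove (or replace) characters that XML 1.0 does not allow.

    `text` must be a decoded str, not raw bytes.
    """
    return _ILLEGAL_XML_CHARS.sub(replacement, text)


# Control characters are dropped; tab and ordinary letters such as
# U+03C4 (Greek small tau) pass through untouched.
cleaned = strip_illegal_xml_chars("a\x00b\x0bc\td\u03c4")
```

Passing a visible `replacement` (for example `"?"`) instead of the empty string makes it easier to see where data was dropped.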
I'm not sure what your example character is supposed to be, but it isn't a non-printing control character. You need to find out what character encoding is in use, and insert a corresponding encoding declaration into the XML declaration at the start of your generated XML file.
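One way to get a matching encoding declaration, sketched here with Python's standard `xml.etree.ElementTree` (Python 3.8 or later for the `xml_declaration` argument). The encoding name `iso-8859-7` and the element name `symbol` are assumptions chosen purely for illustration; substitute whatever encoding your source data really uses.

```python
import xml.etree.ElementTree as ET


def serialize_with_declaration(root, encoding):
    """Serialize `root` to bytes in `encoding`, with an XML declaration
    that names the same encoding."""
    return ET.tostring(root, encoding=encoding, xml_declaration=True)


# Hypothetical document containing U+03C4 (Greek small tau).
root = ET.Element("symbol")
root.text = "\u03c4"

# Assumed encoding, for illustration only.
xml_bytes = serialize_with_declaration(root, "iso-8859-7")
```

The point is that the bytes in the file and the name in the declaration must agree; letting the serializer do both avoids a mismatch.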
If your parser does not support that encoding, you need to convert the file to an encoding that is supported. Alternatively, if there are only a few characters like that in each file, you could convert them to character references (like &#160;). Just remember that character references are Unicode values, not encoded characters. Thus, an encoded value of 160 in your encoding may not be the same as Unicode #160. You have to translate to the right character.
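A minimal Python sketch of that translation, assuming you already know the source encoding. The helper name `to_ascii_with_char_refs` is hypothetical, and the two code pages are examples only. The bytes are decoded to Unicode first, so the numbers in the character references are Unicode code points rather than raw byte values.

```python
def to_ascii_with_char_refs(raw, source_encoding):
    """Decode `raw` bytes from `source_encoding`, then re-encode as ASCII,
    turning every non-ASCII character into a numeric character reference."""
    text = raw.decode(source_encoding)
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Byte 160 in code page 437 is U+00E1, so the reference is &#225;, not &#160;.
ref_cp437 = to_ascii_with_char_refs(b"\xa0", "cp437")

# Byte 0xF4 in ISO-8859-7 is U+03C4 (Greek small tau), i.e. &#964;.
ref_greek = to_ascii_with_char_refs(b"\xf4", "iso-8859-7")
```

Note that this only handles non-ASCII characters; markup characters such as `&` and `<` in the data still need escaping separately.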
If I were using a DTD I could have declared these characters
using the <!ENTITY.... tag, but how do I declare these entities
in a schema? Or in the converted XML?
There is no problem with putting a DTD into a schema document. In fact, the schema for schemas given in the XML Schema Recommendation does that very thing. But it is unlikely to solve your problem in this case, because your problem is either incorrect encoding or illegal characters. No tricks with the DTD can solve either of these problems.