What I did was use Pattern and Matcher and call find() on the matcher object.
That worked for the <AAA> and <BBB> tags, as each of them contains only a single-line string value.
But since <CCC> contains a string with new lines, the match is not able to retrieve it. The compiler also complains
with the error message "String literal is not properly closed by a double-quote".
I want to retrieve the <CCC> tag data exactly as it looks above.
This will only get you characters that are on the same line as both <CCC> and </CCC>.
What you need to pick up is any word character, colon, or whitespace (space, tab, new line, ...) between those two.
I would recommend you do it this way any time you work with regular expressions: first say in English (or whatever your native language is) what you want to match, and only then move on to writing the regex. Here it's pretty straightforward: any word character ( \w ), colon ( : ) and whitespace ( \s ). You just need to add an asterisk ( * ) to denote that you're looking for those zero or more times (see the Pattern API). This will give you the desired result, including the opening and closing "CCC" tags, which you can easily remove in your next step.
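As a rough sketch of the character-class approach described above (the class name, method name and sample input are made up for illustration, since the original data isn't shown here), it could look something like this:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassExample {
    // <CCC>, then zero or more word characters, colons or whitespace
    // (whitespace includes line breaks), then </CCC>.
    static final Pattern CCC = Pattern.compile("<CCC>[\\w:\\s]*</CCC>");

    // Returns the whole match, tags included, or null if nothing matched.
    static String findCcc(String input) {
        Matcher m = CCC.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not the data from the question.
        String input = "<AAA>one</AAA>\n<CCC>key: value\nsecond line</CCC>";
        System.out.println(findCcc(input));
    }
}
```

Keep in mind that this only matches content made up of exactly those characters. If the text between the tags can contain anything else (a hyphen or a full stop, say), the character class would have to be extended, or the match fails.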
The original pattern is correct, but the Pattern object needs to be compiled with at least the Pattern.DOTALL flag. That makes the ".*" part include the line breaks, which isn't the case by default.
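To illustrate the difference the flag makes, here is a minimal sketch. The original pattern isn't reproduced here, so this assumes it was something along the lines of <CCC>(.*?)</CCC>; the class name, helper method and sample input are likewise just made up for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotAllExample {
    // Assumed pattern: capture everything between the tags.
    // Pass 0 for "no flags" or Pattern.DOTALL to let '.' match line breaks.
    static String extract(String input, int flags) {
        Matcher m = Pattern.compile("<CCC>(.*?)</CCC>", flags).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical multi-line input.
        String input = "<CCC>line one\nline two</CCC>";
        // Without the flag '.' stops at the line break, so nothing is found.
        System.out.println(extract(input, 0));
        // With DOTALL the content is captured, line break included.
        System.out.println(extract(input, Pattern.DOTALL));
    }
}
```

The capturing group is only there so that group(1) returns the content without the surrounding tags; it is not required for DOTALL to work.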
As for the "String literal is not properly closed by a double-quote" error message: a String literal cannot contain raw line breaks in the source like that. You need to replace each line break with \n (or \r\n).
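For instance, a multi-line test string could be written roughly like this (the content and names below are placeholders, not the data from the question):

```java
class MultiLineLiteralExample {
    // Each line break in the data becomes an explicit \n escape.
    // Splitting the literal with + is only for readability in the source.
    static final String INPUT =
            "<CCC>first line\n"
          + "second line\n"
          + "third line</CCC>";

    // Counts the lines in INPUT by splitting on the line break character.
    static int lineCount() {
        return INPUT.split("\n").length;
    }

    public static void main(String[] args) {
        System.out.println(INPUT);
        System.out.println(lineCount());
    }
}
```

The concatenation happens at compile time, so the result is a single String constant containing two line breaks.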