I am new to regex and need some help.
I have a file path and file name in a String. The path is accessed from both Windows and Mac machines. When a file is created on a Mac, the path or name can contain special characters such as * @ $ ?, and these cause trouble for my code running on Windows. So I would like to write code that checks for special characters (pretty much anything other than a-z, A-Z, 0-9, _, -) in the file path or file name, and then carries out the rest of the processing accordingly.
The part where I need your help is writing the regex:
Point to note: The filePath is the path of the file plus the file name and extension. Example: \\prod1\customer1\title1\myFile.jpg
And it can be any level deeper, so I have to validate the whole path and the file name.
According to my understanding, below will be the code to perform the above:
As I mentioned, I am new to regex, so if there is any mistake in my regex code, please point that out too.
There's a shortcut for that Pattern / Matcher code - String.matches can be used in exactly the same way:
There's another shortcut, the static Pattern.matches method, but that is what String.matches calls internally.
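For anyone following along, here is a minimal sketch of the three equivalent forms. The class name, the method names, and the pattern itself are made up for illustration; they are not taken from the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesShortcuts {
    // Hypothetical pattern: one or more letters, digits, underscores, or hyphens.
    static final String REGEX = "[A-Za-z0-9_-]+";

    // Long form: compile a Pattern, create a Matcher, test the whole input.
    static boolean viaMatcher(String s) {
        Pattern p = Pattern.compile(REGEX);
        Matcher m = p.matcher(s);
        return m.matches();
    }

    // Shortcut on String: the whole string must match the regex.
    static boolean viaString(String s) {
        return s.matches(REGEX);
    }

    // Static shortcut on Pattern: same result as the two methods above.
    static boolean viaPattern(String s) {
        return Pattern.matches(REGEX, s);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"myFile", "my File", "my-file_01"}) {
            System.out.println(s + " -> " + viaMatcher(s) + ", "
                    + viaString(s) + ", " + viaPattern(s));
        }
    }
}
```

All three test the entire string, not a substring. If the same pattern is checked many times, compiling it once into a Pattern and reusing it avoids recompiling on every call.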
If you want to do it on the basis of allowed characters, you could do this:
Or, if you just want to exclude certain characters:
"matches" is a method on String.
In the regex, "." is any character and "*" means "0 or more times". So .*[ ].* would match any string that contains a space, with any characters (or none) on either side of it.
Square brackets match any one of the alternative characters within them
^ (as first character inside the square brackets) means "NOT these characters"
\w means word-characters, i.e. A-Za-z0-9 and _
Certain characters like [ and ] need to be escaped with a \
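To tie those rules together, here is a rough sketch of both approaches for a single file or folder name. The class and method names are hypothetical, and allowing "." (so that an extension like .jpg passes) is my assumption rather than something stated in the question.

```java
class NameCheck {
    // Allow-list: the whole name must consist of word characters (\w),
    // dots, and hyphens. The dot is an assumption, to permit extensions.
    static boolean hasOnlyAllowed(String name) {
        return name.matches("[\\w.-]+");
    }

    // Exclude-list: true if the name contains any of * @ $ ? anywhere.
    // Inside square brackets these characters need no escaping.
    static boolean containsExcluded(String name) {
        return name.matches(".*[*@$?].*");
    }

    public static void main(String[] args) {
        for (String name : new String[] {"myFile.jpg", "my?File.jpg", "my File.jpg"}) {
            System.out.println(name + " -> allowed only: " + hasOnlyAllowed(name)
                    + ", has excluded: " + containsExcluded(name));
        }
    }
}
```

Note the difference between the two: "my File.jpg" fails the allow-list because of the space, but it also passes the exclude-list check, since a space is not among the listed characters. The allow-list is usually the safer choice when the set of bad characters is open-ended.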
Thanks so much, Luigi, for such a detailed explanation.
Well, I still have one doubt. With the code snippet that you have given, I can perform the check on an individual file name or folder name. But I want to validate the whole file path, which contains '\' as the file separator. For that, do you suggest that I split my filePath String on the '\' character and perform the check on each part, or that I include the '\' character in my regex?
I would just include the "\" character in the regular expression. You will need to escape it by putting another "\" before it.
Which reminds me... you will need to escape all the "\" characters in the above examples so that Java correctly incorporates them into the String. In the case of including the "\" character in the regular expression, you will need to escape both of them, so you have "\\\\".
It's a bit confusing, because "\" is required both to escape characters in the Java String, and also within the expression itself.
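Here is a small sketch of how the double escaping plays out for the whole path. The class and method names are hypothetical, and again I am assuming "." should be allowed so that the extension passes.

```java
class PathCheck {
    // Regex as the engine sees it:  [\w.\\-]+
    // In Java source every backslash is doubled, so \w becomes "\\w"
    // and the literal backslash \\ becomes "\\\\".
    static final String ALLOWED_PATH = "[\\w.\\\\-]+";

    // True if the whole path contains only word characters, dots,
    // backslashes, and hyphens.
    static boolean isValidPath(String filePath) {
        return filePath.matches(ALLOWED_PATH);
    }

    public static void main(String[] args) {
        // The path \\prod1\customer1\title1\myFile.jpg written as a Java literal.
        String good = "\\\\prod1\\customer1\\title1\\myFile.jpg";
        String bad = "\\\\prod1\\customer1\\title1\\my*File.jpg";
        System.out.println(good + " -> " + isValidPath(good));
        System.out.println(bad + " -> " + isValidPath(bad));
    }
}
```

The extra escaping is only needed for literals typed into source code. A path read from a file, a database, or user input already contains single backslashes and can be passed to isValidPath as-is.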