Hi, I'm new to Java regexes. I need to build a regex that detects all words in a paragraph that look like a sequence of amino acids.
For example:
Ala-Cys-Ala, A-C-A, and ACA all represent the same possible amino acid sequence: alanine, cysteine, alanine. Is there a way to build a regex in Java that matches all of these forms? Currently I'm doing it with nested for loops.

I've tried [A|Ala|V|Val|L|Lys|M|Met|W|Trp|P|S|T|Thr|C|Y|Tyr|N|Asn|-|Q|D|E|K|R|H|X]++ but it returns false positives. For example, GAVs is returned as group(0) by the Java Matcher, even though a lone 's' is not one of the codes I listed in the expression. Why does that happen?
The codes I'm working with (three-letter, one-letter):
Ala A
Arg R
Asn N
Asp D
Cys C
His H
Ile I
Leu L
Lys K
Met M
Phe F
Pro P
Ser S
Thr T
Trp W
Tyr Y
Val V
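A note on the attempt above: in Java, square brackets define a character class, so everything inside is treated as a set of single characters rather than a list of alternatives. The 's' in Lys therefore becomes an allowed character on its own, which would explain a stray 's' being matched. Below is a minimal sketch of one possible alternative that uses grouped alternation instead. The class and method names are hypothetical, only the codes from the list above are included, and it assumes a sequence needs at least two residues and must not be glued to other word characters or hyphens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class AminoAcidMatcher {
    // Three-letter codes as real alternatives (a non-capturing group, not a character class).
    private static final String THREE =
        "(?:Ala|Arg|Asn|Asp|Cys|His|Ile|Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val)";
    // One-letter codes: here a character class is correct, since each code is one character.
    private static final String ONE = "[ARNDCHILKMFPSTWYV]";

    private static final Pattern SEQUENCE = Pattern.compile(
        "(?<![\\w-])(?:"                      // not preceded by a word character or hyphen
        + THREE + "(?:-" + THREE + ")+"       // Ala-Cys-Ala
        + "|" + ONE + "(?:-" + ONE + ")+"     // A-C-A
        + "|" + ONE + "{2,}"                  // ACA (assumption: at least two residues)
        + ")(?![\\w-])");                     // not followed by a word character or hyphen

    // Returns every substring of text that looks like an amino acid sequence.
    static List<String> findSequences(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = SEQUENCE.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Ala-Cys-Ala, A-C-A, and ACA are sequences, but GAVs is not.";
        System.out.println(findSequences(text)); // prints [Ala-Cys-Ala, A-C-A, ACA]
    }
}
```

Two caveats with this sketch: ordinary upper-case words made only of one-letter codes (for example MAP) would still match the third alternative, and mixed forms such as Ala-C-A are deliberately not matched. Both would need extra rules if they matter.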