Regex: Infinite recurssion when extracting columns from SQL
ver sd
Greenhorn
Joined: Mar 24, 2008
Posts: 4
posted
0
I am trying to get all the columns within a SQL query (including the sub selects). When the code hits matcher.find(). i get the following exception:
Exception in thread "main" java.lang.StackOverflowError at java.util.regex.Pattern$Branch.match(Pattern.java: 4530) at java.util.regex.Pattern$GroupHead.match(Pattern.ja va:4570)
I am not sure where in the regex is causing the inifinite recursion. Can anyone shed light on this?
String regex = "^select(([^\\(]|\\([^\\(]+\\))*)from"; Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.DOTALL | Pattern.MULTILINE); Matcher matcher = pattern.matcher(sql); if (matcher.find()) { // Get all groups for this match for (int i = 0; i <= matcher.groupCount(); i++) { String groupStr = matcher.group(i); } }
Roger Chung-Wee
Ranch Hand
Joined: Sep 29, 2002
Posts: 1683
posted
0
What does your sql variable look like?
SCJP 1.4, SCWCD 1.3, SCBCD 1.3
ver sd
Greenhorn
Joined: Mar 24, 2008
Posts: 4
posted
0
I was googling for this problem, I came accross few posting where the string that i matching is large and if there is an alternation in the regex, it would cause stack over flow exception.
Here is the SQL i am using:
SELECT P4.NAME REGION, P3.NAME AREA, P2.NAME AS MARKET, P1.NAME AS CHANNEL, SR.NO AS SR_NUMBER, SR.NAME AS SR_NAME, SR.ID SRID, P.NO SITE_NUMBER, P.NAME SITE_NAME, P.ID SITEID, S.NAME AS STATUS_NAME, T.NAME AS TYPE_NAME, AD.STREET AS ADDRESS, AD.CITY_NAME AS CITY_NAME, AD.STATE_NAME AS STATE_NAME, AD.ZIP_CODE AS ZIP_CODE, AD.LATITUDE_COORDINATE AS LATITUDE, AD.LONGITUDE_COORDINATE AS LONGITUDE, AD.COUNTRY_NAME AS COUNTRY_NAME, AD.COUNTY AS COUNTY_NAME, TEXT17 AS POINT_TO_POINT_TYPE, A_STR.TEXT1 AS POINT_TO_POINT_DATA, (select blah.id from blah where name = P4.NAME) Test, TEXT18 AS FACILITY_TYPE, BOOLEAN1 AS LAND_LEASED, BOOLEAN2 AS TOWER_LEASED, BOOLEAN3 AS BUILDING_LEASED, BOOLEAN4 AS SPECTRUM_LEASED, TEXT2 AS LEGACY_SITE_ID FROM GEMINI.PRP_PROPERTIES P, GEMINI.PRP_PROPERTIES_EXT_STRING PES, GEMINI.PRP_PROPERTIES_EXT_DATE PED, GEMINI.PRP_PROPERTIES_EXT_BOOLEAN PEB, GEMINI.PRP_PROPERTIES_EXT_NUMBER PEN, GEMINI.PRP_PROPERTY_STATUSES S, GEMINI.PRP_PROPERTY_TYPES T, GEMINI.GEN_ADDRESSES AD, GEMINI.PRP_PROPERTIES_ORG_UNITS SP, GEMINI.HUR_ORGANIZATION_UNITS P1, GEMINI.HUR_ORGANIZATION_UNITS P2, GEMINI.HUR_ORGANIZATION_UNITS P3, GEMINI.SCR_SEARCH_RINGS SR WHERE A_STR.ID = P.ID AND A_BOOL.ID = P.ID AND S.ID = P.STATUS_ID AND T.ID = P.TYPE_ID AND AD.ID = P.PRIMARY_ADDRESS_ID AND SR.ID = P.SEARCH_RING_ID AND SP.PROPERTY_ID (+) = P.ID AND P1.ID (+) = SP.ORGANIZATION_UNIT_ID AND P2.ID (+) = P1.PARENT_ORGANIZATION_UNIT_ID AND P3.ID (+) = P2.PARENT_ORGANIZATION_UNIT_ID AND P4.ID (+) = P3.PARENT_ORGANIZATION_UNIT_ID AND P.IS_DELETED = 0 AND P.DOMAIN_ID = <%DOMAIN_ID%>
Ilja Preuss
author
Sheriff
Joined: Jul 11, 2001
Posts: 14112
posted
0
Yeah, it's probably not an infinite recursion, but just a finite, very deep one - too deep for your stack size. Have you tried enlarging your stack?
The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
ver sd
Greenhorn
Joined: Mar 24, 2008
Posts: 4
posted
0
Thanks for the reply, i am not sure IT guys will allow me to increase the stack size. The other bad thing is some times the sql i am parsing exceeds 2000 lines. Can you please let me know how to increase the stack size.
Originally posted by ver sd: Thanks for the reply, i am not sure IT guys will allow me to increase the stack size. The other bad thing is some times the sql i am parsing exceeds 2000 lines. Can you please let me know how to increase the stack size.
Thanks.
The stack is controlled by the "-Xss" switch on the command line. For example, to have a megabyte stack for each thread, add "-Xss1024k" to the command line.
I don't think increasing the stack size will help. You need to rewrite the regex so as to minimize the amount of backtracking it has to do. Try this one:
I agree. Here's the link: http://ej-technologies/jprofiler - if it wasn't for jprofiler, we would need to
run our stuff on 16 servers instead of 3.
subject: Regex: Infinite recurssion when extracting columns from SQL