GeeCON Prague 2014*
The moose likes Developer Certification (SCJD/OCMJD) and the fly likes FBN: criteriaFind(String criteria) method; is this ok ? Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


JavaRanch » Java Forums » Certification » Developer Certification (SCJD/OCMJD)
Bookmark "FBN: criteriaFind(String criteria) method; is this ok ? " Watch "FBN: criteriaFind(String criteria) method; is this ok ? " New topic
Author

FBN: criteriaFind(String criteria) method; is this ok ?

Akash Singh
Ranch Hand

Joined: Aug 21, 2003
Posts: 80
/**
* Returns array of DataInfo object for the criteria given in the String
* as "Carrier='SpeedyAir',Origin='SFO'".
* Criteria will be in format of "FieldName1=<i>FieldValue1</i>,
* FieldName2=</i>FieldValue2</i>".
* Returns zero length array , if no matching record could be found.
* This method does extact search. FieldName and field value are
* case sensitive.
*
* @param criteria finding criteria in form of
* "FieldName1=<i>FieldValue1</i>, FieldName2=<i>FieldVale2</i>".
*
* @return DataInfo[] - array of DataInfo object.
*
* @exception DatabaseException if an I/O error occurs while accessing
* database.
*/
public DataInfo[] criteriaFind(String criteria)
throws DatabaseException {
final DataInfo[] NO_RECORD_FOUND = new DataInfo[0];
StringTokenizer tokenizer = new StringTokenizer(criteria, ",='");
int noOfTokens = tokenizer.countTokens(); /* no of tokens in criteria */
/* if no of tokens are odd , or if tokens are zero */
if ((noOfTokens%2 != 0) || (noOfTokens == 0)){
return NO_RECORD_FOUND; /* No record found */
}
/*
* Get field names and field values from the criteria
* using tokens extracted by StringTokenizer.
* Odd token nos will be field name and even token nos will be
* value of the field in the search criteria.
* for ex. "Carrier='SpeedyAir',Origin='SFO'"
*/
/* fieldName holds field names given in criteria */
String[] fieldName = new String[noOfTokens/2];
/* fieldValue holds field values given in criteria */
String[] fieldValue = new String[noOfTokens/2];
for (int i = 0, j = noOfTokens/2; i < j; i++){
fieldName[i] = tokenizer.nextToken().trim();
fieldValue[i] = tokenizer.nextToken();
}
/*
* Check if field names given in criteria matches with the field names
* in database. If any field given in the criteria does not match
* with the field name in database return NO_RECORD_FOUND .
*/
for (int i = 0; i < fieldName.length; i++){
boolean found = false;
for (int j = 0; j < description.length; j++){
if (fieldName[i].equals(description[j].getName())){
found = true;
}
}
if (!(found)) return NO_RECORD_FOUND;
}
/*
* If it did not return as NO_RECORD_FOUND, means criteria given
* is valid, now search for matching records starts below.
*/
DataInfo rv = null;
String[] values = null;
FieldInfo[] fields = null;
/* create ArrayList object to hold DataInfo object for return */
List dataInfoObjects = new ArrayList();
/* create ArrayList object to hold all current records from db. */
List valuesContainer = new ArrayList(recordCount);
synchronized(this){ // lock this object to block other thread
for (int i = 1; i <= recordCount; i++){
rv = getRecord(i); /* get the DataInfo object */
if (fields == null) /* field names are same for all recs */
fields = rv.getFields(); /* get FieldInfo object. */
values = rv.getValues(); /* get Values */
valuesContainer.add(values);
}
}
/*
* check with each field name and values given
* in the criteria with the field name and
* values for the DataInfo info object.
*/
for (int i = 0; i < recordCount; i++) {
boolean matched = false;
for (int j = 0; j < fieldName.length; j++){
for (int k = 0; k < fields.length; k++){
/*
* if field name given in criteria
* matches with the field name of the DataInfo object.
*/
if (fieldName[j].equals(fields[k].getName())) {
/*
* If field value for the field name
* matches with the field value of the
* DataInfo object for that field.,
* or "any" (if "any" has been given
* in the criteria for the value of that
* field.
*/
if (fieldValue[j].equals(
((String[]) valuesContainer.get(i))[k])){
matched = true;
break;
}
else {
matched = false;
break;
}
}
}
}
if (matched) dataInfoObjects.add(rv);
}
return (DataInfo[])
dataInfoObjects.toArray((DataInfo[]) new DataInfo[0]);
}
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17250
    
    6

Hi Akash.
In our posts there is a CODE tag that makes posting code look nice. It keeps the formatting and tabs that you have in your code. The tag is simple [CODE] and its closing tag with a / in front of the word code.
There is a button below when posting that also automatically puts these tags in your post, then you just copy your code into the tags and it makes the code look readable.
Mark


Perfect World Programming, LLC - Two Laptop Bag - Tube Organizer
How to Ask Questions the Smart Way FAQ
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17250
    
    6

Well, the best way to determine if your code is ok, is to see if you get the correct results set from your queries.
Now I would suggest breaking your criteriaFind method into multiple methods to make it easier to read. Maybe one method for Tokenizing your criteria string. Another to make an Arraylist of all the values in the database, and another to check to see if each record is a match or not.
Mark
Akash Singh
Ranch Hand

Joined: Aug 21, 2003
Posts: 80
Mark Spritzler, Thank you so much for your suggestions. This forum is excellent. I will do appropriate changes as suggested by you. Can i implment binary search to improve performance, although records are not enough to see the significant difference between linear search and binary search ?
Here is my old code
Ian Roberts
Ranch Hand

Joined: Aug 20, 2003
Posts: 46
I agree with Mark. At present, the method is very long and requires quite alot of attention to understand exactly what is going on. By breaking the method into a number of logical methods the code would appear much more simple and easier to read, which is one of the assignments requirements.

Good luck.
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17250
    
    6

"Can i implment binary search to improve performance"
Akash. As we know in the real world, things like performance can and should be a requirements basis. Meaning, if I am creating a new application, I want to know up front what are the speed/performance requirements and make sure that my algorithms meet those requirements.
However, in the SCJD assignments, there are no such requirements. So by changing the search to be a binary search might speed things up a little, it isn't necessary for the assignment.
If for your own benefit, you want to see if you can implement it, then do so, but I wouldn't include that bit of code in the submission.
Good luck.
Mark
Philippe Maquet
Bartender

Joined: Jun 02, 2003
Posts: 1872
Hi Akash,
As I made the effort to read your code, I'll make the effort to comment it. My comments are in uppercase so you can find them easily.

Now I disagree with Mark and Ian about the length of your method.
Ian
I agree with Mark. At present, the method is very long and requires quite alot of attention to understand exactly what is going on. By breaking the method into a number of logical methods the code would appear much more simple and easier to read, which is one of the assignments requirements.

As a Java newbie (but Delphi and PL/SQL expert), I got this problem with Java : it does not support nested methods. It means that when you have some complex method like this (in pseudo code) :

you have no other choice in Java between :

or :

Well, if do_this, do_that, this_happens and do_something_special processes are never called in another context than complex_stuff, I do prefer the second way of coding, even if complexStuff() is bigger. Simply because do_this, do_that (...) code chunks are far more difficult to read and understand if they are decoupled of their running context. In Delphi or PL/SQL I would have used nested procedures in such a case, but while waiting for Java 3 or 4 , I would code this as you do.
I am pretty sure (and hope so) that Andrew will have a comment to do here.
Best,
Phil.
[ August 26, 2003: Message edited by: Philippe Maquet ]
Miguel Roque
Ranch Hand

Joined: Oct 24, 2002
Posts: 126
Hi Akash.
I agree with what Mark as writen here. The method is very long and you should split it in at least two methods. One that parses the criteria string and the second one that compares the criteria with the readed record (similar to equals), and of course the main method.
You must remember one thing from the requirements that is that the code must be clear so that it could be understood by a novice programmer.
Miguel
Philippe Maquet
Bartender

Joined: Jun 02, 2003
Posts: 1872
Hi Akash,
A few more words to tell you that I don't think you need any binarySearch in that method. I guess you were thinking of speeding the three nested "for" loops at the end of your method, but as they are useless it isn't worth while to optimize them.
Now I disagree with Mark about the performance issue.
Akash. As we know in the real world, things like performance can and should be a requirements basis. Meaning, if I am creating a new application, I want to know up front what are the speed/performance requirements and make sure that my algorithms meet those requirements.
However, in the SCJD assignments, there are no such requirements. So by changing the search to be a binary search might speed things up a little, it isn't necessary for the assignment.
If for your own benefit, you want to see if you can implement it, then do so, but I wouldn't include that bit of code in the submission.

Yes, there are no speed/performance requirements in our instructions, as so often in the real world BTW. Does it mean that worries about performance must be put aside ? I don't think so. If, in every class you design and every method you write you think about a possible performance issue and make a reasonable tradeoff between simplicity and performance, you should be OK.
There is a limitation in complexity stated in our instructions, under the title "Clarity and Maintability" :
A clear design, such as will be readily understood by junior programmers, will be preferred to a complex one, even if the complex one is a little more efficient. Code complexity, including nesting depth, argument passing, and the number of classes and interfaces, should be reasonable.

What does it mean ? IMO that you must avoid (as in real world) to reinvent the wheel. For example, don't try to write MyOwnHashMap class, even you think that you might gain some CPU cycles... . But feel free to use the whole Java API (as far as it doesn't go against a specific "must-be" requirement) : complex stuffs may be written in one line of code, very understandable by a junior programmer. For example, if you think that sorting a given array to be able to binary-search it a couple of time would offer you better performance, don't hesitate to do it. People at Sun wrote the many Arrays.sort methods and the many Arrays.binarySearch methods just for that purpose. Using them in an appropriate context just shows that you're mastering the Java API and ... it's the goal of SJCD, isn't it ?
Regards,
Phil.
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11465
    
  94

Hi Akash,
[Philippe]: Well, if do_this, do_that, this_happens and do_something_special processes are never called in another context than complex_stuff, I do prefer the second way of coding, even if complexStuff() is bigger. Simply because do_this, do_that (...) code chunks are far more difficult to read and understand if they are decoupled of their running context. In Delphi or PL/SQL I would have used nested procedures in such a case, but while waiting for Java 3 or 4 , I would code this as you do.
I am pretty sure (and hope so) that Andrew will have a comment to do here.

Personally I find that I prefer it if code like this is broken into logical blocks, as suggested by others here. If I can start looking at the criteriaFind() method and see that the first thing it does is call some method that splits the provided string into a format that is easier to work with, then I can just concentrate on whether the criteriaFind does find data correctly - I do not have to read through 'x' lines of parsing, before getting to the actual "find" stuff.
Or, to put it another way, what is the one thing that this method does? Is your answer: "parses the input AND validates field names AND finds matching records AND retrieves them into a DataInfo array"? Does this sound like one thing? If not, are you trying to get your method to do too much? And get the examiner (junior programmer) to understand too much in one go?
As to performance issues: I think that you (not all the people doing the new assignments) do have to be at least a little concerned about performance issues here. From the marking guidelines to Fly By Night Services:
search algorithm: clarity and efficiency (15)

Philippe has already made some suggestions about efficiency. I would just add that stepping through every field in every record and checking if it is a field that you are interested in is not very efficient - perhaps you could work out which fields you are interested in before looping through all the records?
Regards, Andrew


The Sun Certified Java Developer Exam with J2SE 5: paper version from Amazon, PDF from Apress, Online reference: Books 24x7 Personal blog
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17250
    
    6

I think the point that some are missing is the intent of the certification assignment. You will be wasting time if you want the most efficient and fastest code. It will not get you any more points, and the complexity could have you lose points.
I have seen this so many times over the past 2 years. You want to go overboard and it costs you. I only make suggestions like this for everyone to get the highest possible points and passing the assignment.
Mark
James Hook
Greenhorn

Joined: Jan 20, 2003
Posts: 18
I see that you have used the String Tokenizer to split the String up..
but what if the criteria string takes the form?
FieldName='sgjhgdsj,gjgjhg=dfgjgjhfdg'
has anybody else used regular expressions to check that the criteria is valid?
James Hook
Greenhorn

Joined: Jan 20, 2003
Posts: 18
Sorry, only me again..
Phillipe, I see in your commentary of the code
that you are worried that a large data set would flood
the available memory of the JVM. Given that the
return type of of criteriaFind method is DataInfo[], is
there any way of streaming data back to the caller?
Think I may investigate throwing an exception if this happens. Or is this overkill because the db.db file is so small anyway?
Thanks
James
Ian Roberts
Ranch Hand

Joined: Aug 20, 2003
Posts: 46
One point to remember when creating methods is that, from my experience on several Sun courses, the instructors frown upon long drawn out methods that can be separated into logical methods and I have a feeling that some of the people checking the assignments will either be Sun trained or the occasional instructor. From a commercial perspective, someone who has been called out of bed at 2am will not be too pleased if the code he is attempting to understand is unnecessary complicated.
OOP encourages short logical methods!
Svetlana Koshkina
Ranch Hand

Joined: Jul 08, 2003
Posts: 108
Originally posted by James Hook:
I see that you have used the String Tokenizer to split the String up..
but what if the criteria string takes the form?
FieldName='sgjhgdsj,gjgjhg=dfgjgjhfdg'
has anybody else used regular expressions to check that the criteria is valid?

I thought it would be reasonable if the client took care of proper syntax. Customer's typing in the search string - one thing, your program's generating the string from input values - is another. Server can simply reject bad syntax in numerous ways without any implications. Enforcement of the syntax, i think, is off topic because it does not relate to search algorithm itself.
The requirements are very strange - for one they will 'punish' for obvious decisions as well as overly complex ones. So, it's matter of taste, probably, but who knows?
I'm still waiting for the answer for my submission and am very cautious at
suggesting anything. I did not used any declared algorithm but utilized some of the clollections framework capabilities in my application that allowed me to justify some of my design decisions.
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11465
    
  94

Hi everyone,
[Mark]: You will be wasting time if you want the most efficient and fastest code.

I agree that you should not spend too much time or effort into getting a super efficient/fast design. After all, one of the marking criteria for this section is "clarity", and clarity can be lost in the search for efficiency / speed.
But at the same time, I think that it cannot hurt to try and get some simple improvements. I think the suggestions made by Philippe would improve both efficiency and clarity.
[James]: has anybody else used regular expressions to check that the criteria is valid?

Quite a few people.
Regards, Andrew
Philippe Maquet
Bartender

Joined: Jun 02, 2003
Posts: 1872
Hi James,
Philipe, I see in your commentary of the code
that you are worried that a large data set would flood
the available memory of the JVM.

My worry was worst than that. In Akash's code, the whole database was read in memory within its find method, meaning that if a few clients concurrently performed a search, the system could have as many copies of the whole database in memory. That was the non-sense I pointed out.
IMO, you have no other choices than those :
  • No cache at all. Why not ? Many people here (including Mark) tell us that performance is not an issue and that the least you do the best is. Mmh, BTW, this could be a nice introductory sentence in a choice.txt file : "The least I do the best it is".
  • Cache the whole database in memory. Why not ? It's simple, and we shouldn't expect so many records to deal with and it's fast.
  • Cache the database with an optional maximum size. Defendable too : we don't know how our database system will be used in the future (additional tables ?) so we may worry about a possible available memory issue. But it's more work...


  • Given that the return type of of criteriaFind method is DataInfo[], is there any way of streaming data back to the caller?

    No. But if you encounter a memory issue when your application is in production, it will be easy to impose that at least one search criteria is given to perform the search. BTW, from the CSR's point of view, if your database contains 100000 records, it does not help the CSR a lot to have the possibility of choosing on screen an available room to sell among all of them !
    Think I may investigate throwing an exception if this happens. Or is this overkill because the db.db file is so small anyway?

    It's not overkill, but if you run out of memory the JVM will throw the exception for you.

    Best,
    Phil.
    [ August 27, 2003: Message edited by: Philippe Maquet ]
    [ August 27, 2003: Message edited by: Philippe Maquet ]
    Mark Spritzler
    ranger
    Sheriff

    Joined: Feb 05, 2001
    Posts: 17250
        
        6

    Let me state what I mean in a different way.
    So you want to improve the speed? From what to what? From 1/2 second to 1/4 of a second? When I run my assignment, it takes less than a second to do a search. So is the effort to change your code to save 1/4 of a second, which the user will never really notice? Is this worth the possiblity of losing points?
    I can't and won't ever say that your ideas for speeding things up are bad. They are actually very good. It's just not necessary for this assignment.
    Mark
     
    GeeCON Prague 2014
     
    subject: FBN: criteriaFind(String criteria) method; is this ok ?