• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Ron McLeod
  • Paul Clapham
  • Devaka Cooray
  • Tim Cooke
Sheriffs:
  • Rob Spoor
  • Liutauras Vilda
  • paul wheaton
Saloon Keepers:
  • Tim Holloway
  • Tim Moores
  • Mikalai Zaikin
  • Carey Brown
  • Piet Souris
Bartenders:
  • Stephan van Hulst

need some help with recursion (Connect 4 program)

 
Greenhorn
Posts: 6
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hello all-

In my programming class at school, I'm in a group and our assignment is to create a Connect 4 program. I'm in charge of handling computer AI. I did some research and found out that a MINMAX search would be best for figuring out the next move. (For the rules of Connect 4, try a Google search on it- basically you win by having four of your pieces in a row) The following is the pseudocode for the Connect 4 AI search, with credit to http://personal.vsnl.com/erwin/connect4.htm.

int Goodness(player, depth)
{
if CheckWin(-player)
return -128
if depth=0
return 0

max = -200
for i = 1 to 7
if column i is not full
{
drop coin into column i
value = -Goodness(-player,depth-1) / 2
remove coin from column i
if value > max
max = value
}
return max
}

First, I'll try to explain it as I see it. This way, if I'm wrong, then please let me know so I can understand this search algorithm better.
The parameters are the current player (computer) and the depth of the search (so that it doesn't run for 2 weeks). For each column, it drops (temporarily) a coin into it. Then it calls itself with the parameters being the other player (the human) and depth-1. Now, the program is acting as the human and recognizes that the computer dropped a coin into every slot so it will now act on that and drop a coin into every slot (as the human), then the method recurses. The sign of "value" is flipped every time because a good move for the human is a bad move for the computer, and vice versa.

Now, here are my questions:
1. Why is "value" divided by two each time?

2. I'm thinking that this is a helper method to the real AI method. This returns the value of the best move available. But how do I know where to move? I could have a
int bestValue=Goodness(computer, 3);
but that would return some high number representing the value of the best move (I think) whereas I need a number between 0-7 so I can put a coin there (in my [][]array). In other words, how can I use this method to figure out where the computer needs to move?
 
Ranch Hand
Posts: 3061
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
It looks like the algorithm keeps track of the column number as a counter in the for loop. You need to figure out some way to keep just the column number that corresponds to max. You also need need to figure out some way to return both max and this column number. Perhaps a class that holds these two values will help.

I think I'll just leave it at that and let you think on it. Let us know what you come up with.

Good Luck and Keep Coding! (TM)

Layne
[ November 07, 2004: Message edited by: Layne Lund ]
 
Ranch Hand
Posts: 1646
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I did the same thing for a checkers AI, and it sounds like you understand the min-max algorithm well enough. The psuedocode, however, is trying to merge two separate concerns. Keep in mind, this is how I was taught to do it, so YMMV.

Your main goal is to choose the move for the computer:Using the min-max algorithm will require two methods: one to recursively make pretend moves and another to evaluate the board position. The code you pasted above merges those together, making it more difficult.

Try to write a dead-simple board evaluation method, perhaps counting runs of 2 and 3 pieces, giving slightly higher weights if both ends are open rather than only one. Clearly, a closed-off run of any length is worthless. Think of the value as "the likelihood that the player in that position will win."

Once you get the full algorithm working with the computer maknig (dumb) moves, then you can look to improve your evaluation method. Here's its signature.With this concern stripped from min-max itself, the algorithm becomesHopefully I've helped enough to get you going without doing too much. In the end you'll find that it's the board evaluation method that can be really tricky. Oh, and during the recursion you'll need to detect moves that result in a win and not recurse. This can be done by choosing something like Integer.MAX_INT for the board value.
 
Jason Robers
Greenhorn
Posts: 6
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
David-

I looked at your code and it makes a little bit more sense than what I posted. I rewrote the pseudocode below using the requirements of the teacher that were handed out today (he says that everybody's AI algorithm must be the same so he can test fairly). This is what I got.



I can understand your code, but something about his doesn't quite click with me. Can you help me with that? For example, in your code, you send player to an evaluation method whereas in his code, the evaluation process is somewhat fuzzy to me.
[ November 08, 2004: Message edited by: Jason Robers ]
 
David Harkness
Ranch Hand
Posts: 1646
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
There's no reason you cannot put the evaluation and move choosing code into one method as he's done, but it's certainly less clear as you've found.

If you recall, in min-max you follow all possible moves to a certain depth and evaluate the board position at the leaves (the final positions) only [1]. There is one exception: if you detect a win somewhere along the way [2], you don't pursue any deeper but return some "really really good" board position value.

He's done exactly that:The first if test handles the winning position case [2], and the second one handles the position evaluation at the leaves [1]. It's just that his board position evaluation function is trivial: any non-winning position is valued at 0 while winning positions are -128 (negative since it's testing whether the other player has won).

This gets to your previous question: why is he dividing the result by 2 each time. The reason is that the board position value now is 128 if I've just won. If I can win in two moves, it's value now is 32. Thus the value is lessened based on how far into the future the projected win is. The value is negated at the same time because my evaluated position value is always the negative of your evaluation.

The thing that is still missing is remembering which move had the best value. It returns the value, but it also needs to return the move that resulted in that valuation -- the whole point of the AI.
 
Jason Robers
Greenhorn
Posts: 6
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
David-

Ah, it all makes sense now. I did some println statements with value and max changing and I'm able to understand it better now.

As for returning the move, I kinda took the easy way out and added a global int initialized to -1. Whenever value>max, max=value and said int changes to x (the counter of columns). It works fine, I just found one problem with it. The teacher showed me his grading algorithm and it goes like this:

Start 1 player game.
Set recursion depth to 4.
Human drops coin into column 4. (columns 0-6)
Computer drops into column 0. //column 0's value is greater than max so the global int is changed to 0, no other columns' value can beat it so it stays 0
Human drops into column 3.
This should register with the AI that I can win in two more moves so it should begin to block me. The problem is, his algorithm has the computer drop into column 2 (block), whereas mine drops into column 5 (same block). I looked again at my code and noticed that column 2 and column 5 should have the same value. Do you know what could be wrong here?

edit: should be (columns 0-6)
[ November 10, 2004: Message edited by: Jason Robers ]
 
David Harkness
Ranch Hand
Posts: 1646
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
If I understand the situation, you have two moves with the same value. Given how your code works (if value > max, not if value >= max), the first move encountered should be taken. Of course, what you're saying is reverse of what would happen. Is you check greater-or-equal or simply greater-than?

Basically, you're dealing with the issue of two or more "best moves" for a single move. You can pick the first, pick one at random or follow any other scheme you like. It seems he's picking the first.
 
money grubbing section goes here:
New web page for Paul's Rocket Mass Heaters movies
https://coderanch.com/t/785239/web-page-Paul-Rocket-Mass
reply
    Bookmark Topic Watch Topic
  • New Topic