Search Problem

Ekrem Altintas
Greenhorn

Joined: Dec 20, 2005
Posts: 14
I'm trying to find http links in a web page, and I'm writing code to do it. I search the page line by line, but if the page source has more than one http link, my code finds only one of them. What should I do?
This is my code:

import java.io.*;
import java.net.*;

public class Linkler2 {

    public static void main(String args[]) throws IOException {
        String sayi; // current line of the page source

        URL local = new URL("http://www.google.com.tr");
        URLConnection baglanti = local.openConnection();

        BufferedReader oku = new BufferedReader(new InputStreamReader(baglanti.getInputStream()));
        PrintWriter dataOut = new PrintWriter(new FileWriter("abc2.txt"), true);

        try {
            int deger = 0;  // running count of links found
            int kelime = 0; // index of "http://" in the current line, or -1

            while ((sayi = oku.readLine()) != null) {
                System.out.println(sayi);
                kelime = sayi.indexOf("http://");
                System.out.println(kelime);
                if (kelime != -1) {
                    deger = deger + 1;
                }

                dataOut.println(sayi); // copy the line to abc2.txt
            }

            System.out.println(deger);
        } finally {
            // release the network stream and the output file
            oku.close();
            dataOut.close();
        }
    }
}
marc weber
Sheriff

Joined: Aug 31, 2004
Posts: 11343

You're using String's indexOf(str) method, which finds the first occurrence of the str argument. You will probably want to use String's indexOf(str, int) method, which begins searching at the indicated index.

So, for example, if you find your first occurrence at index 108, then you will want to start searching for your next occurrence at index 109.
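
As a rough sketch of that idea, the loop below keeps calling indexOf(str, int) and restarts each search one position past the previous match. The class name LinkCounter, the helper countOccurrences, and the sample line are made up for illustration; they are not from the original code.

```java
public class LinkCounter {

    // Counts every occurrence of target in line by restarting the search
    // one position past the previous match (hypothetical helper).
    static int countOccurrences(String line, String target) {
        if (target.isEmpty()) {
            return 0; // an empty target would match at every index
        }
        int count = 0;
        int index = line.indexOf(target);
        while (index != -1) {
            count++;
            // e.g. first match at 108, so search again from 109
            index = line.indexOf(target, index + 1);
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample line with two links (illustrative data only)
        String line = "<a href=\"http://a.example\">A</a> <a href=\"http://b.example\">B</a>";
        System.out.println(countOccurrences(line, "http://")); // prints 2
    }
}
```

In the posted program, something like deger = deger + countOccurrences(sayi, "http://") could replace the if block inside the while loop.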


"We're kind of on the level of crossword puzzle writers... And no one ever goes to them and gives them an award." ~Joe Strummer
sscce.org
Shankar Narayana
Ranch Hand

Joined: Jan 08, 2003
Posts: 134
You are probably getting the whole page source on one line, so indexOf("http://") finds the first occurrence and stops there.
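
To illustrate, here is a small sketch that runs the same per-line check over an in-memory string instead of a real page. The class name SingleLineDemo, the method countLinesWithLink, and the sample markup are assumptions for the demo only. When three links sit on one line the count is 1; when each link is on its own line the count is 3.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class SingleLineDemo {

    // Mirrors the loop in the question: adds at most one per line,
    // no matter how many links that line contains.
    static int countLinesWithLink(String page) throws IOException {
        int deger = 0;
        BufferedReader oku = new BufferedReader(new StringReader(page));
        String sayi;
        while ((sayi = oku.readLine()) != null) {
            if (sayi.indexOf("http://") != -1) {
                deger = deger + 1;
            }
        }
        return deger;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative markup only, not taken from a real page
        String a = "<a href=\"http://a.example\">A</a>";
        String b = "<a href=\"http://b.example\">B</a>";
        String c = "<a href=\"http://c.example\">C</a>";

        System.out.println(countLinesWithLink(a + b + c));               // prints 1
        System.out.println(countLinesWithLink(a + "\n" + b + "\n" + c)); // prints 3
    }
}
```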

cheers,
shankar.


"Failure is not when you fall down; its only when you fail to get up again."
Stuart Ash
Ranch Hand

Joined: Oct 07, 2005
Posts: 637
Use HTMLUnit. It's amazing how OOPy your code can get with HTMLUnit, instead of all this raw String manipulation.


ASCII silly question, Get a silly ANSI.
 