I would like to use Java to get the source of a (secure) website and then parse that source for the links it contains. I have found how to connect to the URL, but how do I then extract the links?
You could probably get better results from Pete's or sktrdie's options. Here's an additional way, if you would like to know how to do it "by hand".
I'm not very good at regex, so in this case it only returns the last link on each line. Well, it's a start.
import java.io.*;
import java.net.*;
import java.util.regex.*;

public class Links {
    public static void main( String [] args ) throws IOException {
        URL url = new URL( args[0] );
        // openConnection() works for https:// addresses too; the JDK handles the TLS handshake.
        InputStream is = url.openConnection().getInputStream();
        BufferedReader reader = new BufferedReader( new InputStreamReader( is ) );

        String line = null;
        // Captures the href value of an anchor tag. Because of the greedy leading .*,
        // if a line contains several links only the last one is reported.
        String regExp = ".*<a\\s+href\\s*=\\s*\"([^\"]*)\".*";
        Pattern p = Pattern.compile( regExp, Pattern.CASE_INSENSITIVE );

        while( ( line = reader.readLine() ) != null ) {
            Matcher m = p.matcher( line );
            if( m.matches() ) {
                System.out.println( m.group(1) );
            }
        }
        reader.close();
    }
}
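To try it, compile the class and pass the page address as the argument, e.g. java Links https://example.com (any address will do); it prints the captured href of the last link found on each line of the source.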
EDIT
Ooops I totally missed the "secure" part. Anyway I couldn't help it, I had to write this sample :P
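On the secure part: the same approach works over HTTPS without changes, because for an https:// URL openConnection() gives you back an HttpsURLConnection and the JDK negotiates TLS by itself (unless the site uses a certificate the JVM doesn't trust, which is a different headache). Here's a minimal sketch; the class name and the fallback address are just placeholders:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

public class SecureSource {
    public static void main( String [] args ) throws IOException {
        // Placeholder address; pass any https:// page you want to read.
        URL url = new URL( args.length > 0 ? args[0] : "https://example.com/" );

        // For an https URL, openConnection() actually returns an HttpsURLConnection;
        // the JDK sets up TLS on its own, so reading the stream is no different.
        HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();

        BufferedReader reader = new BufferedReader(
                new InputStreamReader( conn.getInputStream() ) );
        String line;
        while( ( line = reader.readLine() ) != null ) {
            System.out.println( line );   // dump the page source; feed this into the regex above
        }
        reader.close();
    }
}

You could just as well keep the plain URLConnection and skip the cast; it's only there to make it obvious what you actually get back for an https URL.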