Question
I'm trying to scrape the anime list from this site: https://ww1.gogoanime.io
This is the code:
org.jsoup.Connection.Response usage = Jsoup.connect("https://ww1.gogoanime.io/anime-list-A")
.header("accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
.header("accept-encoding", "gzip, deflate, sdch, br")
.header("accept-language", "en-US,en;q=0.8")
.header("cache-control", "max-age=0")
.header("user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36")
.header("upgrade-insecure-requests", "1")
.ignoreHttpErrors(true)
.followRedirects(true)
.method(Connection.Method.GET)
.timeout(30000)
.execute();
System.out.println(usage.parse());
This code works for other websites, but for this site the result is the Cloudflare DDoS protection page. I have added all the headers a browser would send, yet the request is still blocked, while Chrome can access this URL without any problem.
By the way, if I don't set
ignoreHttpErrors(true)
the request throws an exception with HTTP status 503, and nothing I do makes it go away except setting this to true. So I'm stuck at the DDoS protection page, which says it will redirect to the website in 5 seconds.
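One way to make the 503 case explicit, rather than only silencing it with ignoreHttpErrors(true), is to check whether the response is the challenge page before parsing it. This is a minimal sketch, assuming the interstitial arrives with status 503 and contains a hidden form field named jschl_vc. ChallengeDetector and looksLikeChallenge are hypothetical names for illustration, not part of jsoup.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not jsoup API): decides whether a response looks like
// the "checking your browser" interstitial instead of the real page.
final class ChallengeDetector {
    // Assumption: the interstitial contains a hidden input named "jschl_vc".
    private static final Pattern MARKER = Pattern.compile("name=\"jschl_vc\"");

    static boolean looksLikeChallenge(int statusCode, String body) {
        // Assumption: the interstitial is served with HTTP 503.
        return statusCode == 503 && body != null && MARKER.matcher(body).find();
    }

    public static void main(String[] args) {
        String challenge = "<form><input type=\"hidden\" name=\"jschl_vc\" value=\"abc\"/></form>";
        System.out.println(looksLikeChallenge(503, challenge));                 // prints true
        System.out.println(looksLikeChallenge(200, "<html>anime list</html>")); // prints false
    }
}
```

With the jsoup response from the code above, this would be called as looksLikeChallenge(usage.statusCode(), usage.body()).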
I also tried the code below:
org.jsoup.Connection.Response usage = Jsoup.connect("https://ww1.gogoanime.io/anime-list-A")
.header("accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
.header("accept-encoding", "gzip, deflate, sdch, br")
.header("accept-language", "en-US,en;q=0.8")
.header("cache-control", "max-age=0")
.header("user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36")
.header("upgrade-insecure-requests", "1")
.ignoreHttpErrors(true)
.followRedirects(true)
.method(Connection.Method.GET)
.timeout(30000)
.execute();
Thread.sleep(5000);
org.jsoup.Connection.Response usg = Jsoup.connect("https://ww1.gogoanime.io/anime-list-A")
.header("accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
.header("accept-encoding", "gzip, deflate, sdch, br")
.header("accept-language", "en-US,en;q=0.8")
.header("cache-control", "max-age=0")
.header("user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36")
.header("upgrade-insecure-requests", "1")
.ignoreHttpErrors(true)
.followRedirects(true)
.cookies(usage.cookies())
.method(Connection.Method.GET)
.timeout(30000)
.execute();
This didn't work either. My browser accesses this URL without any problem, so I think the issue is related to jsoup?
By the way, I thought it might be something about certificates, so I tried this too, but it didn't work either.
TrustManager[] trustAllCerts = new TrustManager[] { new X509TrustManager() {
public java.security.cert.X509Certificate[] getAcceptedIssuers() {
return null;
}
public void checkClientTrusted(java.security.cert.X509Certificate[] certs, String authType) {
}
public void checkServerTrusted(java.security.cert.X509Certificate[] certs, String authType) {
}
} };
// Install the all-trusting trust manager
try {
SSLContext sc = SSLContext.getInstance("SSL");
sc.init(null, trustAllCerts, new java.security.SecureRandom());
HttpsURLConnection.setDefaultSSLSocketFactory(sc.getSocketFactory());
} catch (Exception e) {
throw new RuntimeException(e);
}
Answer 1:
Okay, so I have finally found out how to do this successfully. Here is how I did it, and it should help others too.
In your project, create a Cloudflare.java class like this:
public class Cloudflare {
private String mUrl;
private String mUser_agent;
private cfCallback mCallback;
private int mRetry_count;
private URL ConnUrl;
private List<HttpCookie> mCookieList;
private CookieManager mCookieManager;
private HttpURLConnection mCheckConn;
private HttpURLConnection mGetMainConn;
private HttpURLConnection mGetRedirectionConn;
private static final int MAX_COUNT = 5;
private static final int CONN_TIMEOUT = 60000;
private static final String ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;";
private boolean canVisit = false;
public Cloudflare(String url) {
mUrl = url;
}
public Cloudflare(String url, String user_agent) {
mUrl = url;
mUser_agent = user_agent;
}
public String getUser_agent() {
return mUser_agent;
}
public void setUser_agent(String user_agent) {
mUser_agent = user_agent;
}
public void getCookies(final cfCallback callback){
new Thread(new Runnable() {
@Override
public void run() {
urlThread(callback);
}
}).start();
}
private void urlThread(cfCallback callback){
mCookieManager = new CookieManager();
mCookieManager.setCookiePolicy(CookiePolicy.ACCEPT_ALL); // accept all cookies
CookieHandler.setDefault(mCookieManager);
HttpURLConnection.setFollowRedirects(false);
while (!canVisit){
if (mRetry_count>MAX_COUNT){
break;
}
try {
int responseCode = checkUrl();
if (responseCode==200){
canVisit=true;
break;
}else {
getVisiteCookie();
}
} catch (IOException | InterruptedException e) {
if (mCookieList!=null){
mCookieList.clear();
}
e.printStackTrace();
} finally {
closeAllConn();
}
mRetry_count++;
}
if (callback!=null){
Looper.prepare();
if (canVisit){
callback.onSuccess(mCookieList);
}else {
e("Get Cookie Failed");
callback.onFail();
}
}
}
private void getVisiteCookie() throws IOException, InterruptedException {
ConnUrl = new URL(mUrl);
mGetMainConn = (HttpURLConnection) ConnUrl.openConnection();
mGetMainConn.setRequestMethod("GET");
mGetMainConn.setConnectTimeout(CONN_TIMEOUT);
mGetMainConn.setReadTimeout(CONN_TIMEOUT);
if (!TextUtils.isEmpty(mUser_agent)){
mGetMainConn.setRequestProperty("user-agent",mUser_agent);
}
mGetMainConn.setRequestProperty("accept",ACCEPT);
mGetMainConn.setRequestProperty("referer", mUrl);
if (mCookieList!=null&&mCookieList.size()>0){
mGetMainConn.setRequestProperty("cookie",listToString(mCookieList));
}
mGetMainConn.setUseCaches(false);
mGetMainConn.connect();
switch (mGetMainConn.getResponseCode()){
case HttpURLConnection.HTTP_OK:
e("MainUrl","visit website success");
return;
case HttpURLConnection.HTTP_FORBIDDEN:
e("MainUrl","IP block or cookie err");
return;
case HttpURLConnection.HTTP_UNAVAILABLE:
InputStream mInputStream = mCheckConn.getErrorStream();
BufferedReader mBufferedReader = new BufferedReader(new InputStreamReader(mInputStream));
StringBuilder sb = new StringBuilder();
String str;
while ((str = mBufferedReader.readLine()) != null){
sb.append(str);
}
mInputStream.close();
mBufferedReader.close();
mCookieList = mCookieManager.getCookieStore().getCookies();
str = sb.toString();
getCheckAnswer(str);
break;
default:
break;
}
}
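/**
* Extract the challenge values, then follow the redirect to obtain the cookies
* @param str HTML of the challenge page
*/
private void getCheckAnswer(String str) throws InterruptedException, IOException {
String jschl_vc = regex(str,"name=\"jschl_vc\" value=\"(.+?)\"").get(0); // extract the hidden field value with a regex
String pass = regex(str,"name=\"pass\" value=\"(.+?)\"").get(0); // same for the "pass" field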
double jschl_answer = get_answer(str);
e(String.valueOf(jschl_answer));
Thread.sleep(3000);
String req = String.valueOf("https://"+ConnUrl.getHost())+"/cdn-cgi/l/chk_jschl?"
+"jschl_vc="+jschl_vc+"&pass="+pass+"&jschl_answer="+jschl_answer;
e("RedirectUrl",req);
getRedirectResponse(req);
}
private void getRedirectResponse(String req) throws IOException {
HttpURLConnection.setFollowRedirects(false);
mGetRedirectionConn = (HttpURLConnection) new URL(req).openConnection();
mGetRedirectionConn.setRequestMethod("GET");
mGetRedirectionConn.setConnectTimeout(CONN_TIMEOUT);
mGetRedirectionConn.setReadTimeout(CONN_TIMEOUT);
if (!TextUtils.isEmpty(mUser_agent)){
mGetRedirectionConn.setRequestProperty("user-agent",mUser_agent);
}
mGetRedirectionConn.setRequestProperty("accept",ACCEPT);
mGetRedirectionConn.setRequestProperty("referer", req);
if (mCookieList!=null&&mCookieList.size()>0){
mGetRedirectionConn.setRequestProperty("cookie",listToString(mCookieList));
}
mGetRedirectionConn.setUseCaches(false);
mGetRedirectionConn.connect();
switch (mGetRedirectionConn.getResponseCode()){
case HttpURLConnection.HTTP_OK:
mCookieList = mCookieManager.getCookieStore().getCookies();
break;
case HttpURLConnection.HTTP_MOVED_TEMP:
mCookieList = mCookieManager.getCookieStore().getCookies();
break;
default:throw new IOException("getOtherResponse Code: "+
mGetRedirectionConn.getResponseCode());
}
}
private int checkUrl()throws IOException {
URL ConnUrl = new URL(mUrl);
mCheckConn = (HttpURLConnection) ConnUrl.openConnection();
mCheckConn.setRequestMethod("GET");
mCheckConn.setConnectTimeout(CONN_TIMEOUT);
mCheckConn.setReadTimeout(CONN_TIMEOUT);
if (!TextUtils.isEmpty(mUser_agent)){
mCheckConn.setRequestProperty("user-agent",mUser_agent);
}
mCheckConn.setRequestProperty("accept",ACCEPT);
mCheckConn.setRequestProperty("referer",mUrl);
if (mCookieList!=null&&mCookieList.size()>0){
mCheckConn.setRequestProperty("cookie",listToString(mCookieList));
}
mCheckConn.setUseCaches(false);
mCheckConn.connect();
return mCheckConn.getResponseCode();
}
private void closeAllConn(){
if (mCheckConn!=null){
mCheckConn.disconnect();
}
if (mGetMainConn!=null){
mGetMainConn.disconnect();
}
if (mGetRedirectionConn!=null){
mGetRedirectionConn.disconnect();
}
}
public interface cfCallback{
void onSuccess(List<HttpCookie> cookieList);
void onFail();
}
private double get_answer(String str) { // compute the challenge answer
double a = 0;
try {
List<String> s = regex(str,"var s,t,o,p,b,r,e,a,k,i,n,g,f, " +
"(.+?)=\\{\"(.+?)\"");
String varA = s.get(0);
String varB = s.get(1);
StringBuilder sb = new StringBuilder();
sb.append("var a=");
sb.append(regex(str,varA+"=\\{\""+varB+"\":(.+?)\\}").get(0));
sb.append(";");
List<String> b = regex(str,varA+"\\."+varB+"(.+?)\\;");
for (int i =0;i<b.size()-1;i++){
sb.append("a");
sb.append(b.get(i));
sb.append(";");
}
e("add",sb.toString());
V8 v8 = V8.createV8Runtime();
a = v8.executeDoubleScript(sb.toString());
List<String> fixNum = regex(str,"toFixed\\((.+?)\\)");
if (fixNum!=null && !fixNum.isEmpty()){ // regex() returns an empty list when toFixed is absent
a = Double.parseDouble(v8.executeStringScript("String("+String.valueOf(a)+".toFixed("+fixNum.get(0)+"));"));
}
a += new URL(mUrl).getHost().length();
v8.release();
}catch (IndexOutOfBoundsException e){
e("answerErr","get answer error");
e.printStackTrace();
}
catch (MalformedURLException e) {
e.printStackTrace();
}
return a;
}
/**
* Regex helper
* @param text text to search
* @param pattern regular expression
* @return List<String> of captured groups
*/
private List<String> regex(String text, String pattern){
try {
Pattern pt = Pattern.compile(pattern);
Matcher mt = pt.matcher(text);
List<String> group = new ArrayList<>();
while (mt.find()) {
if (mt.groupCount() >= 1) {
if (mt.groupCount()>1){
group.add(mt.group(1));
group.add(mt.group(2));
}else group.add(mt.group(1));
}
}
return group;
}catch (NullPointerException e){
Log.i("MATCH","null");
}
return null;
}
/**
* Convert a list into a string joined with ';'
* @param list
* @return
*/
public static String listToString(List list ) {
char separator = ";".charAt(0);
StringBuilder sb = new StringBuilder();
for (int i = 0; i < list.size(); i++) {
sb.append(list.get(i)).append(separator);
}
return sb.toString().substring(0, sb.toString().length() - 1);
}
/**
* Convert to a HashMap usable by jsoup
* @param list list of HttpCookie
* @return HashMap
*/
public static Map<String,String> List2Map(List<HttpCookie> list){
Map<String, String> map = new HashMap<>();
try {
if (list != null) {
for (int i = 0; i < list.size(); i++) {
HttpCookie cookie = list.get(i);
map.put(cookie.getName(), cookie.getValue()); // getters keep values that contain '=' intact
}
Log.i("List2Map", map.toString());
} else {
return map;
}
} catch (IndexOutOfBoundsException e) {
e.printStackTrace();
}
return map;
}
private void e(String tag,String content){
Log.e(tag,content);
}
private void e(String content){
Log.e("cloudflare",content);
}
}
Now, to use this, simply call the above class like this and convert the cookies to a Map for use with jsoup:
Cloudflare cf = new Cloudflare("YOUR URL HERE");
cf.setUser_agent("YOUR USER AGENT HERE");
cf.getCookies(new Cloudflare.cfCallback() {
@Override
public void onSuccess(List< HttpCookie > cookieList) {
//convert the cookielist to a map
Map<String, String> cookies = Cloudflare.List2Map(cookieList);
Log.d("COOKIES : ", cookies.toString());
}
@Override
public void onFail() {
Log.d("Cloudflare", "OMG IT FAILED!!!");
}
});
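The cookie-to-map step can also be tried on a plain JVM, without Android. This is a minimal sketch that reads each cookie's name and value through the HttpCookie getters, so values containing '=' are kept whole. CookieMaps and toMap are hypothetical names, and the cookie names and values are made-up sample data.

```java
import java.net.HttpCookie;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns a list of HttpCookie into the Map<String, String>
// shape that jsoup's Connection.cookies(...) accepts.
final class CookieMaps {
    static Map<String, String> toMap(List<HttpCookie> cookies) {
        Map<String, String> map = new LinkedHashMap<>();
        if (cookies == null) {
            return map; // nothing to convert
        }
        for (HttpCookie cookie : cookies) {
            // getName()/getValue() avoid re-parsing the "name=value" string form
            map.put(cookie.getName(), cookie.getValue());
        }
        return map;
    }

    public static void main(String[] args) {
        // Illustrative sample data only
        List<HttpCookie> cookies = Arrays.asList(
                new HttpCookie("session", "abc=def"),
                new HttpCookie("token", "123"));
        System.out.println(toMap(cookies));
    }
}
```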
Now, in onSuccess, start your AsyncTask and use the cookies and the same user agent in your jsoup request in doInBackground, something like this:
try {
Connection.Response response = Jsoup.connect("YOUR URL HERE").userAgent("YOUR USER AGENT HERE").cookies(cookies).execute();
Document doc = response.parse();
Log.d("THE DOCUMENT : ", doc.toString());
} catch (Exception M){
M.printStackTrace();
}
I really hope I am not too late, as I know this is an old thread, but this has been and still is working for me.
EDIT: My bad, I forgot to say that the Cloudflare class needs these imports:
import android.os.Looper;
import android.text.TextUtils;
import android.util.Log;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import com.eclipsesource.v8.V8;
Then, in build.gradle (app), add this to dependencies:
implementation 'com.eclipsesource.j2v8:j2v8_android:3.0.5@aar'
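For anyone adapting the class, the field-extraction and URL-building step of getCheckAnswer can be exercised without Android or a network connection. This is a rough sketch, assuming the challenge page still has the layout the class expects (hidden inputs jschl_vc and pass, answer submitted to /cdn-cgi/l/chk_jschl). ChallengeFields, hiddenValue and challengeUrl are hypothetical names, and the HTML and answer are made-up sample data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper mirroring the regex and URL logic of getCheckAnswer.
final class ChallengeFields {
    // Returns the value attribute of the hidden input with the given name, or null if absent.
    static String hiddenValue(String html, String fieldName) {
        Pattern p = Pattern.compile("name=\"" + Pattern.quote(fieldName) + "\" value=\"(.+?)\"");
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // Builds the verification URL in the same shape the class uses.
    static String challengeUrl(String host, String html, String answer) {
        String vc = hiddenValue(html, "jschl_vc");
        String pass = hiddenValue(html, "pass");
        if (vc == null || pass == null) {
            throw new IllegalArgumentException("challenge fields not found");
        }
        return "https://" + host + "/cdn-cgi/l/chk_jschl?jschl_vc=" + vc
                + "&pass=" + pass + "&jschl_answer=" + answer;
    }

    public static void main(String[] args) {
        // Illustrative sample page only
        String html = "<input type=\"hidden\" name=\"jschl_vc\" value=\"abc123\"/>"
                + "<input type=\"hidden\" name=\"pass\" value=\"1492.5-xyz\"/>";
        System.out.println(challengeUrl("example.com", html, "42.5"));
    }
}
```

Failing fast when a field is missing gives a clearer error than the IndexOutOfBoundsException that get(0) raises on an empty list.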
Source: https://stackoverflow.com/questions/43453491/how-to-bypass-cloudflare-ddos-or-redirect-after-5-seconds-using-jsoup