999 Error Code on HEAD request to LinkedIn

非 Y 不嫁゛ 提交于 2019-11-26 11:06:45

问题


We\'re using a curl HEAD request in a PHP application to verify the validity of generic links. We check the status code just to make sure that the link the user has entered is valid. Links to all websites have succeeded, except LinkedIn.

While it seems to work locally (Mac), when we attempt the request from any of our Ubuntu servers, LinkedIn returns a 999 status code. Not an API request, just a simple curl like we do for every other link. We\'ve tried on a few different machines and tried altering the user agent, but no dice. How do I modify our curl so that working links return a 200?

A sample HEAD request:

curl -I --url https://www.linkedin.com/company/linkedin

Sample Response on Ubuntu machine:

HTTP/1.1 999 Request denied
Date: Tue, 18 Nov 2014 23:20:48 GMT
Server: ATS
X-Li-Pop: prod-lva1
Content-Length: 956
Content-Type: text/html

To respond to @alexandru-guzinschi a little better. We\'ve tried masking the User Agents. To sum up our trials:

  • Mac machine + Mac UA => works
  • Mac machine + Windows UA => works
  • Ubuntu remote machine + (no UA change) => fails
  • Ubuntu remote machine + Mac UA => fails
  • Ubuntu remote machine + Windows UA => fails
  • Ubuntu local virtual machine (on Mac) + (no UA change) => fails
  • Ubuntu local virtual machine (on Mac) + Windows UA => works
  • Ubuntu local virtual machine (on Mac) + Mac UA => works

So now I\'m thinking they block any curl requests that dont provide an alternate UA and also block hosting providers?

Is there any other way I can check if a link to linkedin is valid or if it will lead to their 404 page, from an Ubuntu machine using PHP?


回答1:


It looks like they filter requests based on the user-agent:

$ curl -I --url https://www.linkedin.com/company/linkedin | grep HTTP
HTTP/1.1 999 Request denied

$ curl -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3" -I --url https://www.linkedin.com/company/linkedin | grep HTTP
HTTP/1.1 200 OK



回答2:


I found the workaround, important to set accept-encoding header:

curl --url "https://www.linkedin.com/in/izman" \
--header "user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36" \
--header "accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" \
--header "accept-encoding:gzip, deflate, sdch, br" \
| gunzip



回答3:


Seems like LinkedIn filter both user agent AND ip address. I tried this both at home and from an Digital Ocean node:

curl -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3" -I --url https://www.linkedin.com/company/linkedin

From home I got a 200 OK, from DO I got 999 Denied...

So you need a proxy service like HideMyAss or other (haven't tested it so I couldn't say if it's valid or not). Here is a good comparison of proxy services.

Or you could setup a proxy on your home network, for example use a Raspberry PI to proxy your requests. Here is a guide on that.




回答4:


Proxy would work, but I think there's another way around it. I see that from AWS and other clouds that it's blocked by IP. I can issue the request from my machine and it works just fine.

I did notice that in the response from the cloud service that it returns some JS that the browser has to execute to take you to a login page. Once there, you can login and access the page. The login page is only for those accessing via a blocked IP.

If you use a headless client that executes JS, or maybe go straight to the subsequent link and provide the credentials of a linkedin user, you may be able to bypass it.



来源:https://stackoverflow.com/questions/27231113/999-error-code-on-head-request-to-linkedin

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!