Retrieve OpenID bearer token using a headless browser setup

Submitted by 混江龙づ霸主 on 2020-01-25 08:34:18

Question


I have been happily scraping a website with OkHttp3 for quite some time. However, some components of the website have been upgraded and now use additional OpenID bearer authentication.

I am 99.9% positive my requests are failing because of this bearer token: when I check with Chrome dev tools, I see the bearer token only on requests for these upgraded parts. Moreover, a couple of requests go to URLs that end with ".well-known/openid-configuration". In addition, when I hardcode the bearer token from my browser in my OkHttp3 code, everything works. Without the token, I get a 401 Unauthorized response.
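For reference, here is a minimal sketch of the hardcoding described above. To stay dependency-free it uses the JDK's `java.net.http` (Java 11+) rather than OkHttp3; with OkHttp3 the equivalent step would be `new Request.Builder().url(url).header("Authorization", "Bearer " + token)`. The class name, method names, URLs and token string are all placeholders, not values from the real site, and the requests are only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class BearerRequestSketch {

    // Builds the discovery URL for a given issuer base URL (assumed, not from the real site).
    // A trailing slash on the issuer is dropped so the path is not doubled.
    static URI discoveryUri(String issuer) {
        String base = issuer.endsWith("/")
                ? issuer.substring(0, issuer.length() - 1)
                : issuer;
        return URI.create(base + "/.well-known/openid-configuration");
    }

    // Builds (but does not send) a GET request that carries the token
    // in the Authorization header, as the browser does for the protected parts.
    static HttpRequest withBearer(String url, String token) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + token)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder URL and token; the real token was copied by hand from Chrome dev tools.
        HttpRequest request = withBearer("https://example.com/api/items", "token-copied-from-browser");
        System.out.println(request.headers().firstValue("Authorization").orElse("<none>"));
        System.out.println(discoveryUri("https://example.com/auth/"));
    }
}
```

A token pasted in like this only works until it expires, which is why I am looking for a way to obtain it programmatically.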

I figured that my browser emulation was not close enough to the real situation, so I decided to use a headless browser setup that executes JavaScript. Since I am using Java, I chose HtmlUnit. With this tool I quickly got to the point where I could scrape parts of the website (just as with OkHttp3), but it again fails on the newly updated parts. I checked but couldn't find the bearer token in any of the responses (neither in the headers nor in the cookies).

Is there any chance this approach (using a headless browser) could work? Or are there alternative approaches I could try?

Source: https://stackoverflow.com/questions/58973021/retrieve-openid-bearer-token-using-headless-browser-setup
