How to check programmatically if the URL of a page is redirecting?

刺人心 2021-01-19 08:12

I am trying to extract the content of a webpage A. Using Groovy, I've tried the following:

......
String urlStr = "url-of-webpage-A"
String pageText = urlSt         


        
2 Answers
  •  一个人的身影
    2021-01-19 08:39

    In Groovy, you could do what Joachim suggests like this:

    String location = "url-of-webpage-A"
    boolean wasRedirected = false
    String pageContent = null
    
    while( location ) {
      new URL( location ).openConnection().with { con ->
        // We'll do redirects ourselves
        con.instanceFollowRedirects = false
    
        // Get the location to jump to (null when the response is not a redirect)
        location = con.getHeaderField( "Location" )
        if( !wasRedirected && location ) {
          wasRedirected = true
        }
    
        // Read the HTML and close the inputstream
        pageContent = con.inputStream.withReader { it.text }
      }
    }
    
    println "wasRedirected:$wasRedirected contentLength:${pageContent.length()}"
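
    The loop above treats any Location header as a redirect and passes it straight to new URL(...), which fails if the server sends a relative Location. A minimal sketch of a more defensive variant follows, assuming plain HTTP(S) URLs. The helper names isRedirect, resolveLocation and fetchFollowing, and the default limit of 5 hops, are hypothetical choices for illustration and not part of the original answer.

    ```groovy
    // Sketch only: helper names and the hop limit are illustrative assumptions.

    // True for the 3xx status codes that ask the client to follow a Location header.
    boolean isRedirect(int code) {
        code in [301, 302, 303, 307, 308]
    }

    // A Location header may be relative, so resolve it against the URL that returned it.
    String resolveLocation(String base, String location) {
        new URI(base).resolve(location).toString()
    }

    // Follows redirects by hand, giving up after maxHops of them.
    // Returns the final URL, whether any redirect happened, and the last response body.
    Map fetchFollowing(String start, int maxHops = 5) {
        String current = start
        boolean wasRedirected = false
        for (int hop = 0; hop <= maxHops; hop++) {
            HttpURLConnection con = (HttpURLConnection) new URL(current).openConnection()
            con.instanceFollowRedirects = false
            String location = con.getHeaderField('Location')
            if (isRedirect(con.responseCode) && location) {
                wasRedirected = true
                current = resolveLocation(current, location)
                con.disconnect()
                continue
            }
            // Not a redirect: read the body and close the stream
            return [url: current, wasRedirected: wasRedirected,
                    content: con.inputStream.withReader { it.text }]
        }
        throw new IOException("More than $maxHops redirects starting at $start")
    }
    ```

    Calling fetchFollowing("url-of-webpage-A") would then give a map whose wasRedirected entry answers the original question. The hop limit is there so that two pages redirecting to each other cannot loop forever.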
    

    If you don't want to follow the redirect and only want the contents of the first page, you can simply do:

    String location = "url-of-webpage-A"
    String pageContent = new URL( location ).openConnection().with { con ->
      // Don't follow redirects automatically; we only want this page
      con.instanceFollowRedirects = false
    
      // Get the location to jump to (in case of a redirect)
      location = con.getHeaderField( "Location" )
    
      // Read the HTML and close the inputstream
      con.inputStream.withReader { it.text }
    }
    
    if( location ) { 
      println "Page wanted to redirect to $location"
    }
    println "Content was:"
    println pageContent    
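
    If the only question is whether the URL redirects, and the page content is not needed, a HEAD request avoids downloading the body. This is a rough sketch under the assumption that the server answers HEAD the same way it answers GET, which not every server does. The helper name redirectTarget is hypothetical.

    ```groovy
    // Sketch only: redirectTarget is an illustrative helper, not an existing API.
    // Returns the raw Location header if the URL redirects, otherwise null.
    String redirectTarget(String url) {
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection()
        con.instanceFollowRedirects = false   // we want to see the 3xx response itself
        con.requestMethod = 'HEAD'            // ask for headers only
        try {
            int code = con.responseCode
            return (code in [301, 302, 303, 307, 308]) ? con.getHeaderField('Location') : null
        } finally {
            con.disconnect()
        }
    }
    ```

    A call such as redirectTarget("url-of-webpage-A") is then non-null exactly when the page wants to redirect. Note that the returned value may be a relative URL.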
    
