InputStream from a URL

死守一世寂寞 2020-12-02 14:41

How do I get an InputStream from a URL?

For example, I want to take the file at the URL wwww.somewebsite.com/a.txt and read it as an InputStream in Java.

6 answers
  •  臣服心动
    2020-12-02 15:21

    Here is a full example that reads the contents of a given web page. The address of the page is submitted through an HTML form. We use the standard InputStream classes, but the same task could be done more easily with the JSoup library.

    
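    <dependencies>
        <dependency>
            <groupId>javax.servlet</groupId>
            <artifactId>javax.servlet-api</artifactId>
            <version>3.1.0</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>commons-validator</groupId>
            <artifactId>commons-validator</artifactId>
            <version>1.6</version>
        </dependency>
    </dependencies>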
      
    

    These are the Maven dependencies. We use the Apache Commons Validator library to validate URL strings.
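    If adding a dependency only for validation is not desirable, a rough check can be sketched with the standard library alone. This is not what the answer uses and it is not equivalent to UrlValidator; the class and method names (`UrlCheckSketch`, `looksLikeHttpUrl`) are hypothetical, and the rule applied (an http or https scheme plus a host) is an assumption about what counts as acceptable input.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of the original answer: a standard-library
// stand-in for a URL check. It is looser than Commons Validator's UrlValidator.
class UrlCheckSketch {

    // Returns true when the string parses as a URI with an http(s) scheme and a host.
    static boolean looksLikeHttpUrl(String candidate) {
        if (candidate == null) {
            return false;
        }
        try {
            URI uri = new URI(candidate);
            String scheme = uri.getScheme();
            boolean http = "http".equalsIgnoreCase(scheme)
                    || "https".equalsIgnoreCase(scheme);
            return http && uri.getHost() != null;
        } catch (URISyntaxException ex) {
            // Strings that are not syntactically valid URIs are rejected.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikeHttpUrl("https://www.example.com/a.txt"));
        System.out.println(looksLikeHttpUrl("www.example.com"));
    }
}
```

    A string without a scheme, such as the one in the question, is rejected by this check, which matches the "use http(s)://www.example.com format" message in the answer's code.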

    package com.zetcode.web;
    
    import com.zetcode.service.WebPageReader;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import javax.servlet.ServletException;
    import javax.servlet.ServletOutputStream;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    
    @WebServlet(name = "ReadWebPage", urlPatterns = {"/ReadWebPage"})
    public class ReadWebpage extends HttpServlet {
    
        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
    
            response.setContentType("text/plain;charset=UTF-8");
    
            String page = request.getParameter("webpage");
    
            String content = new WebPageReader().setWebPageName(page).getWebPageContent();
    
            ServletOutputStream os = response.getOutputStream();
            os.write(content.getBytes(StandardCharsets.UTF_8));
        }
    }
    

    The ReadWebPage servlet reads the contents of the given web page and sends it back to the client in plain text format. The task of reading the page is delegated to WebPageReader.

    package com.zetcode.service;
    
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    import java.util.stream.Collectors;
    import org.apache.commons.validator.routines.UrlValidator;
    
    public class WebPageReader {
    
        private String webpage;
        private String content;
    
        public WebPageReader setWebPageName(String name) {
    
            webpage = name;
            return this;
        }
    
        public String getWebPageContent() {
    
            try {
    
                boolean valid = validateUrl(webpage);
    
                if (!valid) {
    
                    content = "Invalid URL; use http(s)://www.example.com format";
                    return content;
                }
    
                URL url = new URL(webpage);
    
                try (InputStream is = url.openStream();
                        BufferedReader br = new BufferedReader(
                                new InputStreamReader(is, StandardCharsets.UTF_8))) {
    
                    content = br.lines().collect(
                          Collectors.joining(System.lineSeparator()));
                }
    
            } catch (IOException ex) {
    
                content = String.format("Cannot read webpage %s", ex);
                Logger.getLogger(WebPageReader.class.getName()).log(Level.SEVERE, null, ex);
            }
    
            return content;
        }
    
        private boolean validateUrl(String webpage) {
    
            UrlValidator urlValidator = new UrlValidator();
    
            return urlValidator.isValid(webpage);
        }
    }
    

    WebPageReader validates the URL and reads the contents of the web page. It returns a string containing the HTML code of the page.
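    The part that actually answers the question is the call to `URL.openStream()`. Stripped of the servlet and the validation, it can be sketched as below. The class and method names (`UrlStreamSketch`, `readText`, `sampleUrl`) are made up for illustration, and the sketch reads from a temporary `file:` URL so that it runs without network access; an `http(s)` URL is opened the same way.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;

// Hypothetical standalone sketch, not part of the original answer.
class UrlStreamSketch {

    // Opens the URL as an InputStream and returns its text, decoded as UTF-8.
    static String readText(URL url) {
        try (InputStream is = url.openStream();
             BufferedReader br = new BufferedReader(
                     new InputStreamReader(is, StandardCharsets.UTF_8))) {
            return br.lines().collect(Collectors.joining("\n"));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Writes the text to a temporary file and returns a file: URL pointing at it.
    // This stands in for a remote resource so the example needs no network.
    static URL sampleUrl(String text) {
        try {
            Path tmp = Files.createTempFile("sample", ".txt");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
            return tmp.toUri().toURL();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(readText(sampleUrl("first line\nsecond line")));
    }
}
```

    The try-with-resources block closes the stream even when reading fails, which is the same pattern WebPageReader uses. Wrapping IOException in UncheckedIOException is only a choice made to keep the sketch short.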

    
    
        
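    <!DOCTYPE html>
    <html>
        <head>
            <title>Home page</title>
            <meta charset="UTF-8">
        </head>
        <body>
            <form action="ReadWebPage">
                <input type="text" name="webpage">
                <button type="submit">Submit</button>
            </form>
        </body>
    </html>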

    Finally, this is the home page containing the HTML form. This is taken from my tutorial about this topic.
