How to read the text from an image (captcha) using Selenium WebDriver with Java

囚心锁ツ 2020-12-15 02:52

I have a registration web page, and a captcha is displayed at the end of it.

I am not able to read the text from the captcha image. I will include my code and its output.

7 answers
  •  忘掉有多难
    2020-12-15 03:32

    I have a solution that works for a specific website. Take a screenshot of the whole page and crop the captcha image out of it. Then divide the width of the captcha image by the number of characters (for a given captcha this is usually constant) to get the width of one character, and cut the captcha into individual character images. Reload the page repeatedly to collect an image of every possible character.

    Once you have all the possible characters, you can take any captcha image, compare each of its characters with the collected images, and decide which letter or number it is.
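
    The equal-width split described above could be sketched as follows, using only java.awt.image.BufferedImage. The names CaptchaSlicer and sliceEvenly are made up for illustration, and the sketch assumes the characters are evenly spaced with no padding around them. Leftover pixels on the right edge are ignored.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original answer.
class CaptchaSlicer {

    // Splits a captcha image into charCount equal-width slices, left to right.
    static List<BufferedImage> sliceEvenly(BufferedImage captcha, int charCount) {
        if (charCount <= 0 || charCount > captcha.getWidth()) {
            throw new IllegalArgumentException(
                    "charCount must be between 1 and the image width");
        }
        // Integer division: pixels that do not fill a whole slice are dropped.
        int sliceWidth = captcha.getWidth() / charCount;
        List<BufferedImage> slices = new ArrayList<>();
        for (int i = 0; i < charCount; i++) {
            slices.add(captcha.getSubimage(
                    i * sliceWidth, 0, sliceWidth, captcha.getHeight()));
        }
        return slices;
    }

    public static void main(String[] args) {
        // A blank synthetic image stands in for a real captcha screenshot.
        BufferedImage captcha = new BufferedImage(120, 40, BufferedImage.TYPE_INT_RGB);
        List<BufferedImage> slices = sliceEvenly(captcha, 5);
        System.out.println(slices.size() + " slices, each "
                + slices.get(0).getWidth() + " px wide");
    }
}
```

    With a real page you would pass in the cropped captcha image instead of the blank one; the character count has to be known in advance for the site in question.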

    Steps to follow:

    1. Collect the captcha image and divide it into individual characters.

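      // Needs java.awt.image.BufferedImage, java.io.File, java.io.IOException, javax.imageio.ImageIO.
      // Crops the w-by-h rectangle whose top-left corner is (x, y) out of the image file.
      private static BufferedImage cropImage(File imageFile, int x, int y, int w, int h) {
          try {
              BufferedImage originalImage = ImageIO.read(imageFile);
              return originalImage.getSubimage(x, y, w, h);
          } catch (IOException e) {
              // Returns null if the file cannot be read.
              e.printStackTrace();
              return null;
          }
      }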
      
    2. Keep the images of all possible characters in a folder.

    3. Read each character image of the captcha and compare it with every image in that folder. You can compare two images using their pixel values:

          // Needs java.awt.image.BufferedImage, java.io.File, java.io.IOException, javax.imageio.ImageIO.
          // Sums the absolute R, G and B differences over the top-left width x height area of both images.
          public static float getDiff(File f1, File f2, int width, int height) throws IOException {
              BufferedImage bi1 = ImageIO.read(f1);
              BufferedImage bi2 = ImageIO.read(f2);
              float diff = 0;
              for (int i = 0; i < width; i++) {
                  for (int j = 0; j < height; j++) {
                      int rgb1 = bi1.getRGB(i, j);
                      int rgb2 = bi2.getRGB(i, j);

                      int b1 = rgb1 & 0xff;
                      int g1 = (rgb1 & 0xff00) >> 8;
                      int r1 = (rgb1 & 0xff0000) >> 16;

                      int b2 = rgb2 & 0xff;
                      int g2 = (rgb2 & 0xff00) >> 8;
                      int r2 = (rgb2 & 0xff0000) >> 16;

                      diff += Math.abs(b1 - b2);
                      diff += Math.abs(g1 - g2);
                      diff += Math.abs(r1 - r2);
                  }
              }
              return diff;
          }

    4. The image with the smallest diff value is the match. Append its name to a string.

    5. After reading all the character images of the captcha, return the string.

    The name of each reference image specifies the digit or character it shows.

    This works only for simple captchas like this one: https://i.stack.imgur.com/FYPhd.png
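
    A rough sketch of the matching steps, under some assumptions: the reference images are held in an in-memory map keyed by the character they show (instead of being read from a folder), and the names CaptchaMatcher, pixelDiff, bestMatch, decode and solid are invented for this example. Solid-colour squares stand in for real character images so the sketch runs without any files.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original answer.
class CaptchaMatcher {

    // Sum of absolute R, G and B differences over the area both images share.
    static long pixelDiff(BufferedImage a, BufferedImage b) {
        int w = Math.min(a.getWidth(), b.getWidth());
        int h = Math.min(a.getHeight(), b.getHeight());
        long diff = 0;
        for (int x = 0; x < w; x++) {
            for (int y = 0; y < h; y++) {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                diff += Math.abs(((p >> 16) & 0xff) - ((q >> 16) & 0xff));
                diff += Math.abs(((p >> 8) & 0xff) - ((q >> 8) & 0xff));
                diff += Math.abs((p & 0xff) - (q & 0xff));
            }
        }
        return diff;
    }

    // Returns the label of the reference image with the smallest diff.
    static String bestMatch(BufferedImage slice, Map<String, BufferedImage> templates) {
        if (templates.isEmpty()) {
            throw new IllegalArgumentException("no reference images given");
        }
        String best = null;
        long bestDiff = Long.MAX_VALUE;
        for (Map.Entry<String, BufferedImage> e : templates.entrySet()) {
            long d = pixelDiff(slice, e.getValue());
            if (d < bestDiff) {
                bestDiff = d;
                best = e.getKey();
            }
        }
        return best;
    }

    // Matches every character slice in order and joins the labels.
    static String decode(List<BufferedImage> slices, Map<String, BufferedImage> templates) {
        StringBuilder text = new StringBuilder();
        for (BufferedImage slice : slices) {
            text.append(bestMatch(slice, templates));
        }
        return text.toString();
    }

    // Builds an 8x8 single-colour image; a stand-in for a real character image.
    static BufferedImage solid(int rgb) {
        BufferedImage img = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < 8; x++) {
            for (int y = 0; y < 8; y++) {
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }

    public static void main(String[] args) {
        Map<String, BufferedImage> templates = new LinkedHashMap<>();
        templates.put("0", solid(0x000000));
        templates.put("1", solid(0xFFFFFF));
        List<BufferedImage> slices =
                Arrays.asList(solid(0xFFFFFF), solid(0x000000), solid(0xFFFFFF));
        System.out.println(decode(slices, templates)); // prints 101
    }
}
```

    Comparing only the shared area avoids an out-of-bounds error when a slice and a reference image differ slightly in size, but it is a design choice of this sketch, not something the steps above prescribe.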
