Google OCR working on specific area

Submitted by 大憨熊 on 2019-12-10 13:49:45

Question


I am currently using a SurfaceView and a CameraSource from com.google.android.gms.vision to capture detected text from the camera image, but since detection runs on the whole SurfaceView area, I need to discard some of the recognized results.

The goal is to make the SurfaceView work as in the image below: ignore all text detected in the red crossed-out area and return only the text inside the blue square.

Is this even possible?

Here is the layout (nothing special):

<?xml version="1.0" encoding="utf-8"?>
<android.support.constraint.ConstraintLayout
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <SurfaceView
        android:id="@+id/fragment_surface"
        android:layout_width="0dp"
        android:layout_height="0dp"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        app:layout_constraintTop_toTopOf="parent" />
</android.support.constraint.ConstraintLayout>

And here you have the OCR related code on the class:

public class CameraActivity extends AppCompatActivity {

    private SurfaceView surfaceView;
    private CameraSource cameraSource;
    private StringBuilder builder;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_camera);

        surfaceView = (SurfaceView) findViewById(R.id.fragment_surface);

        TextRecognizer recognizer = new TextRecognizer.Builder(getApplicationContext()).build();
        if (recognizer.isOperational()) {

            cameraSource = new CameraSource.Builder(getApplicationContext(), recognizer)
                    .setFacing(CameraSource.CAMERA_FACING_BACK)
                    .setRequestedPreviewSize(1280, 1024)
                    .setRequestedFps(15.0f)
                    .setAutoFocusEnabled(true)
                    .build();

            surfaceView.getHolder().addCallback(new SurfaceHolder.Callback() {
                @Override
                public void surfaceCreated(SurfaceHolder holder) {
                    if (ActivityCompat.checkSelfPermission(getApplicationContext(), Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
                        ActivityCompat.requestPermissions(CameraActivity.this, new String[]{Manifest.permission.CAMERA}, 100);
                        return;
                    }
                    try {
                        cameraSource.start(surfaceView.getHolder());
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }

                @Override
                public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
                    //
                }

                @Override
                public void surfaceDestroyed(SurfaceHolder holder) {
                    cameraSource.stop();
                }
            });
            recognizer.setProcessor(new Detector.Processor<TextBlock>() {
                @Override
                public void release() {
                    //
                }

                @Override
                public void receiveDetections(Detector.Detections<TextBlock> detections) {
                    final SparseArray<TextBlock> items = detections.getDetectedItems();
                    if (items.size() != 0) {
                        builder = new StringBuilder();
                        for (int i = 0; i < items.size(); i++) {
                            TextBlock it = items.valueAt(i);
                            builder.append(it.getValue());
                        }
                        String read = builder.toString().trim().replace(" ", "").replace("\n", "");

                        //It continues doing other things here
                    }
                }
            });
        }
    }

    @Override
    public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
        switch (requestCode) {
            case 100:
                if (grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                    try {
                        if (ActivityCompat.checkSelfPermission(getApplicationContext(), Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
                            return;
                        }
                        cameraSource.start(surfaceView.getHolder());
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
                break;
        }
    }
}

Answer 1:


While the device is in portrait mode (as in your picture), the non-red area is a cropped piece of the camera preview (which fills the whole screen), so:

  • if you want to show the whole camera preview and run OCR only on the cropped area: take a "screenshot" of the SurfaceView (the full frame) and then crop that region on the fly to get the desired pixels
  • instead, if you want to show ONLY the cropped area (because the red area is filled with other UI, buttons, TextViews, etc.): make the SurfaceView render only the desired piece of the camera preview in that specific place by manipulating its Matrix
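For the first option, the on-screen crop rectangle has to be mapped into the camera frame's coordinate space before cropping, because the view and the frame usually have different resolutions. Assuming the preview simply stretches the frame to fill the view (no letterboxing), the mapping is just a pair of scale ratios; here is a plain-Java sketch using java.awt.Rectangle as a stand-in for android.graphics.Rect (all names and numbers are illustrative):

```java
import java.awt.Rectangle;

public class CropMapping {
    // Map a rectangle given in view (screen) coordinates into frame (camera image)
    // coordinates, assuming the preview stretches the frame to fill the view.
    static Rectangle viewToFrame(Rectangle viewRect, int viewW, int viewH,
                                 int frameW, int frameH) {
        double sx = (double) frameW / viewW;
        double sy = (double) frameH / viewH;
        return new Rectangle(
                (int) Math.round(viewRect.x * sx),
                (int) Math.round(viewRect.y * sy),
                (int) Math.round(viewRect.width * sx),
                (int) Math.round(viewRect.height * sy));
    }

    public static void main(String[] args) {
        // A 1080x1920 view showing a 1280x1024 camera frame
        // (the preview size requested in the question's code).
        Rectangle blueSquare = new Rectangle(90, 480, 900, 480); // crop area in view pixels
        System.out.println(viewToFrame(blueSquare, 1080, 1920, 1280, 1024));
    }
}
```

Once mapped, the frame-space rectangle can be used to crop the captured pixels, e.g. with Bitmap.createBitmap(src, x, y, w, h) on Android.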

I suggest you "upgrade" to a TextureView. It is a bit harder to manage, but it lets you crop, zoom, and scale the preview as you want by applying a Matrix to its internal texture.




Answer 2:


Intro

I'm trying to make minimal changes to your existing code, so keep scanning the whole image as you do now, and filter out the resulting words (or blocks) whose bounding boxes fall outside your target area.

Code

Find the bounding box of a word and test it against your target rectangle (yourRect). Rect.intersect modifies the receiver and returns true if the two rectangles overlap; use Rect.contains if the word must lie entirely inside the area:

Rect yourRect = new Rect(10, 20, 30, 40); // left, top, right, bottom
boolean overlaps = rect.intersect(yourRect); // true if the word's box overlaps the area
boolean inside = yourRect.contains(rect);    // true only if the box is fully inside
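To make the difference between "overlaps" and "fully inside" concrete, here is a plain-Java illustration using java.awt.Rectangle as a stand-in for android.graphics.Rect (the coordinates are made up; note that java.awt.Rectangle takes x, y, width, height rather than left, top, right, bottom):

```java
import java.awt.Rectangle;

public class RectDemo {
    public static void main(String[] args) {
        // Target area (the blue square): x, y, width, height
        Rectangle area = new Rectangle(100, 200, 300, 150);

        Rectangle inside   = new Rectangle(120, 220, 50, 20); // fully inside the area
        Rectangle touching = new Rectangle(380, 300, 60, 20); // overlaps the right edge
        Rectangle outside  = new Rectangle(500, 500, 40, 15); // completely outside

        System.out.println(area.contains(inside));     // true
        System.out.println(area.contains(touching));   // false: only partial overlap
        System.out.println(area.intersects(touching)); // true
        System.out.println(area.intersects(outside));  // false
    }
}
```

Whether you want contains or intersects depends on whether a word straddling the blue border should be kept or dropped.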

Try adding this code to your receiveDetections() method:

        // Loop through each `Block`
        for (TextBlock textBlock : blocks) {
            List<? extends Text> textLines = textBlock.getComponents();

            // Loop through each `Line`
            for (Text currentLine : textLines) {
                List<? extends Text> words = currentLine.getComponents();

                // Loop through each `Word`
                for (Text currentWord : words) {
                    // Get the bounding box of the word
                    Rect rect = currentWord.getBoundingBox();

                    // Check whether the word's bounding box is inside the
                    // required area, e.g. using yourRect.contains(rect)
                    // ...
                }
            }
        }

so it looks like this:

        recognizer.setProcessor(new Detector.Processor<TextBlock>() {
            @Override
            public void release() {
                //
            }

            @Override
            public void receiveDetections(Detector.Detections<TextBlock> detections) {
                final SparseArray<TextBlock> items = detections.getDetectedItems();
                if (items.size() != 0) {
                    builder = new StringBuilder();
                    for (int i = 0; i < items.size(); i++) {
                        TextBlock it = items.valueAt(i);
                        builder.append(it.getValue());
                    }
                    String read = builder.toString().trim().replace(" ", "").replace("\n", "");

                    List<TextBlock> blocks = new ArrayList<>();

                    // Add all TextBlocks to the `blocks` list
                    for (int i = 0; i < items.size(); i++) {
                        blocks.add(items.valueAt(i));
                    }

                    // Loop through each `Block`
                    for (TextBlock textBlock : blocks) {
                        // Loop through each `Line`
                        for (Text currentLine : textBlock.getComponents()) {
                            // Loop through each `Word`
                            for (Text currentWord : currentLine.getComponents()) {
                                // Get the bounding box of the word
                                Rect rect = currentWord.getBoundingBox();

                                // Check whether the word's bounding box is inside
                                // the required area, e.g. yourRect.contains(rect),
                                // and put the word in a filtered list
                            }
                        }
                    }

                    //It continues doing other things here
                }
            }
        });

The filtering itself is only a handful of lines!
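The filtering logic above can be checked in isolation with a plain-Java sketch, using java.awt.Rectangle in place of android.graphics.Rect and a small Word class standing in for the vision API's Text objects (all names here are illustrative, not part of the vision API):

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

public class OcrFilterDemo {
    // Minimal stand-in for a detected word: its text plus its bounding box.
    static class Word {
        final String value;
        final Rectangle boundingBox;
        Word(String value, Rectangle boundingBox) {
            this.value = value;
            this.boundingBox = boundingBox;
        }
    }

    // Keep only the words whose bounding box lies entirely inside the target area.
    static List<String> filterWords(List<Word> words, Rectangle area) {
        List<String> kept = new ArrayList<>();
        for (Word w : words) {
            if (area.contains(w.boundingBox)) {
                kept.add(w.value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Rectangle blueSquare = new Rectangle(100, 200, 300, 150);
        List<Word> detected = new ArrayList<>();
        detected.add(new Word("KEEP",   new Rectangle(120, 220, 60, 20))); // inside
        detected.add(new Word("DROP",   new Rectangle(10, 10, 60, 20)));   // in the red area
        detected.add(new Word("BORDER", new Rectangle(380, 300, 60, 20))); // partly outside

        System.out.println(filterWords(detected, blueSquare)); // [KEEP]
    }
}
```

In the real receiveDetections() you would do the same containment test on each word's getBoundingBox() and only append the surviving words to your StringBuilder.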



Source: https://stackoverflow.com/questions/44918705/google-ocr-working-on-specific-area
