text-recognition

Text mining pdf files/issues with word frequencies

三世轮回 submitted on 2020-01-04 03:59:04
Question: I am trying to mine a PDF of an article with rich PDF encodings and graphs. I noticed that when I mine some PDF documents, the high-frequency words turn out to be phi, taeoe, toe, sigma, gamma, etc. It works well with some PDF documents, but with others I get these random Greek letters. Is this a character-encoding problem? (By the way, all the documents are in English.) Any suggestions? # Here is the link to the PDF file for testing # www.sciencedirect.com/science/article/pii/S0164121212000532 library(tm

MLKit Text detection on iOS working for photos taken from Assets.xcassets, but not the same photo taken on camera/uploaded from camera roll

£可爱£侵袭症+ submitted on 2019-12-31 03:33:14
Question: I'm using Google's Text detection API from MLKit to detect text in images. It seems to work perfectly on screenshots, but when I try to use it on images taken in the app (using AVFoundation) or on photos uploaded from the camera roll, it returns a small number of seemingly random characters. This is my code for running the actual text detection: func runTextRecognition(with image: UIImage) { let visionImage = VisionImage(image: image) textRecognizer.process(visionImage) { features, error in

android - recognized text from tess-two library is wrong

醉酒当歌 submitted on 2019-12-23 01:28:11
Question: I am trying to use the tess-two library to recognize text from an image. Here is my code: load.setOnClickListener(new View.OnClickListener() { @Override public void onClick(View v) { // recognize text Bitmap temp = loadJustTakenImage(); //loads taken image from sdcard Bitmap rotatedImage = rotateIfNeeded(temp); // rotate method I found in some tutorial String text1 = recognizeText(rotatedImage); } }); Recognize text method: (tessdata folder is in Download with the eng.traineddata and other

Advice on filters to improve text visibility on a photo

廉价感情. submitted on 2019-12-19 10:55:10
Question: I need filters to improve text visibility in a photo, since it has some noise. Which filters (algorithms) do you know for this purpose? Currently I use a monochrome filter, but it doesn't improve image quality. I need a filter that can determine the average background of a small area and make the image monochrome depending on that local background. For example, almost all of the picture has a white background with grey characters, but some areas have a darker (grey) background with black characters. I need an algorithm that can understand that
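The idea described in the question (binarize each pixel against the average of its own small neighborhood rather than against one global cutoff) is usually called adaptive or local-mean thresholding. Below is a minimal, dependency-free sketch of it. The function name, the `radius` and `offset` parameters, and the tiny synthetic image are all assumptions made for illustration; a real photo would normally be processed with an image library instead.

```python
def adaptive_threshold(gray, radius=1, offset=15):
    """Binarize a grayscale image against each pixel's local mean.

    gray: list of rows, each a list of ints in 0..255 (hypothetical input format).
    A pixel becomes 0 (ink) if it is darker than the mean of its
    (2*radius+1) x (2*radius+1) neighborhood minus `offset`, else 255.
    """
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clip the window at the image borders.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            row.append(0 if gray[y][x] < local_mean - offset else 255)
        out.append(row)
    return out


# Synthetic example: the background fades from white (250) to grey (110),
# with a grey "character" pixel on the bright side and a black one on the dark side.
background = [250 - 20 * x for x in range(8)]
image = [list(background) for _ in range(5)]
image[2][1] = 150  # grey character on a light background
image[2][6] = 20   # black character on a darker background

binary = adaptive_threshold(image, radius=1, offset=15)
ink = [(y, x) for y in range(5) for x in range(8) if binary[y][x] == 0]
print(ink)
```

No single global cutoff separates both characters here: a cutoff of 128 loses the grey character (150), while a cutoff of 160 turns the darker background (130, 110) black. Comparing against the local mean keeps both. Sharp background edges can still produce false ink pixels, so `radius` and `offset` need tuning per image.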

How to use Google Cloud Vision along with Unity for recognising text using a mobile camera?

ぐ巨炮叔叔 submitted on 2019-12-13 03:26:31
Question: I am testing a project on how to read text from objects and pictures using Google Cloud Vision. Using a mobile camera (preferably an iPhone or iPad, or Android phones), I would like to get the required text. The Samsung Bixby application is an example. After some reading I found out about OpenCV for Unity and Google Cloud Vision. OpenCV for Unity costs around $95, so I cannot use it for testing, and I took the other option. I downloaded this project: GitHub project. I created a Google Cloud Vision API key and added to

How to improve Google Vision results while detecting a text on an image if we know the language of

二次信任 submitted on 2019-12-13 03:19:21
Question: How can I modify the following Python code to return results in German? Is it possible? Thank you. def detect_text_uri(uri): """Detects text in the file located in Google Cloud Storage or on the Web. """ client = vision.ImageAnnotatorClient() image = types.Image() image.source.image_uri = uri response = client.text_detection(image=image) texts = response.text_annotations print('Texts:') for text in texts: print('\n"{}"'.format(text.description)) vertices = (['({},{})'.format(vertex.x, vertex.y)
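One option that may be relevant here is the Vision API's language hints, which are passed in the image context of a request. The sketch below only builds the JSON body of a REST `images:annotate` request with a German hint, so it runs without credentials. The helper name and the `gs://` URI are hypothetical, and whether the hint changes the result for a given image is not guaranteed.

```python
import json


def build_text_detection_request(image_uri, language_hints):
    """Build the JSON body for a Vision API images:annotate call.

    image_uri: location of the image (hypothetical value in the example below).
    language_hints: list of language codes, e.g. ["de"] for German.
    """
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "TEXT_DETECTION"}],
                # The hint is advisory; the service may still detect other languages.
                "imageContext": {"languageHints": list(language_hints)},
            }
        ]
    }


body = build_text_detection_request("gs://example-bucket/sign.jpg", ["de"])
print(json.dumps(body, indent=2))
```

To my understanding, the Python client exposes the same field as an `image_context` argument with `language_hints`, but that should be checked against the client version in use.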

Problem with CountVectorizer from scikit-learn package

人走茶凉 submitted on 2019-12-12 14:11:41
Question: I have a dataset of movie reviews. It has two columns: 'class' and 'reviews'. I have done most of the routine preprocessing, such as lowercasing the text, removing stop words, and removing punctuation marks. At the end of preprocessing, each original review is a sequence of words separated by spaces. I want to use CountVectorizer and then TF-IDF to create features from my dataset so I can do classification/text recognition with Random Forest. I looked into websites and I
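To make the "counts, then TF-IDF" pipeline concrete, here is a dependency-free sketch of what the two steps compute on already-preprocessed, space-separated reviews. It is not scikit-learn itself: the function names and the three sample reviews are made up, and tokenization is a plain whitespace split (scikit-learn's default tokenizer differs, for example by dropping single-character tokens). The weighting follows scikit-learn's documented defaults: smoothed idf `ln((1 + n) / (1 + df)) + 1`, followed by L2 normalization of each row.

```python
import math
from collections import Counter


def count_matrix(docs):
    """Return (vocabulary, rows) where rows[i][j] counts vocabulary[j] in docs[i]."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    rows = []
    for doc in docs:
        counts = Counter(doc.split())
        rows.append([counts.get(word, 0) for word in vocab])
    return vocab, rows


def tfidf(rows):
    """Reweight a count matrix with smoothed idf and L2-normalize each row."""
    n_docs, n_terms = len(rows), len(rows[0])
    # Document frequency: in how many documents each term occurs.
    df = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    weighted_rows = []
    for row in rows:
        weighted = [count * weight for count, weight in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in weighted)) or 1.0
        weighted_rows.append([v / norm for v in weighted])
    return weighted_rows


# Hypothetical preprocessed reviews (lowercased, no stop words or punctuation).
reviews = ["great movie great cast", "boring movie", "great plot"]
vocab, counts = count_matrix(reviews)
features = tfidf(counts)
print(vocab)
print(counts[0])
```

In scikit-learn the equivalent would be `CountVectorizer` followed by `TfidfTransformer`, or `TfidfVectorizer` in a single step; the resulting feature rows can then be passed to a classifier such as a random forest.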

How can I detect all the text that is inside a block with Google Vision API

狂风中的少年 submitted on 2019-12-11 15:37:07
Question: I'm trying to extract text from an image with the Google Vision API, and it works. But I only want to detect part of the image to get certain text. This is the image I used. I just want to extract all the text from maybank2u.com until From Account: I know there are some tutorials that do this using blocks, but those tutorials are in different programming languages. My code: <div class="row"> <div class="col-12"> <ol> <?php foreach ($text as $key => $texts): ?> <li><h6> <?php echo ucfirst($texts-

How to read one column texts with Google Cloud Vision API

蹲街弑〆低调 submitted on 2019-12-11 06:59:19
Question: I have the following document image. When I try to convert the image to text, the result is the following: Top Text Ref: Rad: Dte: Ddo: Ejecutivo 76520400300 Banco de Bogotá Luz Adriana Botton Text The problem is that the Google API recognizes it as two columns. How can I configure the Google API to obtain one-column text? My goal is to obtain: Top Text Ref:Ejecutivo Rad: 76520400300 Dte: Banco de Bogotá Ddo:Luz Adriana Botton Text Answer 1: Cloud Vision API doesn't have a specific request property to specify
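One possible workaround, which is not taken from the truncated answer above, is to ignore the reading order the API returns and rebuild the lines from word positions: group words whose vertical coordinates are close, then sort each group from left to right. The sketch below assumes each word has already been reduced to a `(text, x, y)` tuple, for example from the top-left vertex of its bounding polygon. The function name, the tolerance, and all coordinates are made up for illustration.

```python
def words_to_lines(words, y_tolerance=10):
    """Group (text, x, y) words into text lines by vertical position.

    Words whose y is within `y_tolerance` of the first word of the current
    line are treated as belonging to that line; each line is then read
    left to right by x.
    """
    lines = []  # each entry: {"y": anchor y of the line, "words": [(x, text), ...]}
    for text, x, y in sorted(words, key=lambda w: w[2]):
        if lines and abs(lines[-1]["y"] - y) <= y_tolerance:
            lines[-1]["words"].append((x, text))
        else:
            lines.append({"y": y, "words": [(x, text)]})
    return [" ".join(t for _, t in sorted(line["words"])) for line in lines]


# Hypothetical word positions for a two-column layout like the one described.
words = [
    ("Top", 10, 10), ("Text", 60, 10),
    ("Ref:", 10, 50), ("Rad:", 10, 90), ("Dte:", 10, 130), ("Ddo:", 10, 170),
    ("Ejecutivo", 200, 52), ("76520400300", 200, 91),
    ("Banco", 200, 131), ("de", 260, 131), ("Bogotá", 290, 131),
    ("Luz", 200, 169), ("Adriana", 240, 169),
    ("Bottom", 10, 210), ("Text", 80, 210),
]

lines = words_to_lines(words)
for line in lines:
    print(line)
```

This works only when the page is not skewed and rows in the two columns line up; a rotated scan would need deskewing first, and `y_tolerance` has to be smaller than the line spacing.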

TextRecognizer isOperational API always returns false

天大地大妈咪最大 submitted on 2019-12-11 05:09:21
Question: I need to capture the characters in an image, so I am using TextRecognizer. My code is given below: TextRecognizer textRecognizer = new TextRecognizer.Builder(mActivity.getGalleryApplication().getAndroidContext()).build(); if (!textRecognizer.isOperational()) { new AlertDialog.Builder(mActivity.getAndroidContext()) .setMessage("Text recognizer could not be set up :(").show(); return; } textRecognizer.release(); I have added dependencies in build.gradle as below: dependencies { compile 'com.google