ocr

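A Comprehensive Collection of OCR Datasets

拥有回忆 submitted on 2020-02-02 23:57:15
Author: Tom Hardy Date: 2020-02-02 Source: A Comprehensive Collection of OCR Datasets
1. SynthText in the Wild dataset
Download link: http://www.robots.ox.ac.uk/~vgg/data/scenetext/
Description: A synthetically generated dataset in which word instances are placed in natural scene images while taking the scene layout into account. It contains roughly 800,000 images with about 8 million synthetic word instances. Each text instance is annotated with its text string and with word-level and character-level bounding boxes.
2. Google FSNS
Download link: http://rrc.cvc.uab.es/?ch=6&com=downloads
Description: The Google FSNS dataset contains more than 1 million images of street name signs cropped from Google Street View imagery of France. Each image contains four views of the same street name sign. The text on a sign can span up to three lines. Every sign has a canonical transcription.
3. COCO-Text
Download link: https://vision.cornell.edu/se3/coco-text-2/
Description: 63,686 images, 145,859 text instances, and 3 fine-grained text attributes. The dataset is based on the MSCOCO dataset.
Main contents: Source: CSDN Author: 3D视觉工坊 Link: https://blog.csdn.net/Yong_Qi2015/article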

How to generate a tiff/box file from an image to train Tesseract in Windows

可紊 submitted on 2020-02-01 19:57:27
Question: I'm trying to train Tesseract on Windows, and for that I need a tiff/box file pair. I tried to create it with jTessBoxEditor, but it doesn't accept images as input. I also tried boxFactory, but it doesn't run properly. Does anyone know the best tool for creating the pair from images? Thanks.
Answer 1: If you have jTessBoxEditor, then you have the Tesseract binaries. Go to the tesseract-ocr subfolder of jTessBoxEditor and run the following command: tesseract.exe D:\testocr\TestImage.tif D:
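For context, Tesseract's legacy training workflow can produce a starting box file from a TIFF by passing the `batch.nochop` and `makebox` configs. The following is a minimal Python sketch of that invocation, not a reconstruction of the truncated command in the answer. It assumes `tesseract` is on the PATH; the helper names `makebox_command` and `make_box_file` and any file names are hypothetical.

```python
import subprocess
from pathlib import Path


def makebox_command(image_path, exe="tesseract"):
    """Build the command line that asks Tesseract to write a box file."""
    image = Path(image_path)
    # Tesseract appends ".box" to the output base name on its own.
    output_base = image.with_suffix("")
    return [exe, str(image), str(output_base), "batch.nochop", "makebox"]


def make_box_file(image_path, exe="tesseract"):
    """Run Tesseract and return the expected path of the generated box file."""
    # Assumption: the executable named by `exe` is installed and on the PATH.
    subprocess.run(makebox_command(image_path, exe), check=True)
    return Path(image_path).with_suffix(".box")
```

The generated box file usually still needs manual correction in a box editor such as jTessBoxEditor before it is useful for training.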

Extracting particular text associated value from an image

孤街浪徒 submitted on 2020-01-31 18:21:27
Question: I have an image from which I want to extract key/value pairs. For example, I want to extract the value of "MASTER-AIRWAYBILL NO:". I have written code that extracts the entire text from the image using Python, OpenCV, and OCR, but I have no idea how to pull out only the value for "MASTER-AIRWAYBILL NO:" from the full text of the image. Here is the code:
    import cv2
    import numpy as np
    import pytesseract
    from PIL import Image
    print ("Hello")
    src_path = "C:\\Users
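One common way to approach this, sketched here under assumptions, is to run a regular expression over the OCR output and capture whatever follows the key on the same line. The function name `extract_value` and the sample text are hypothetical; real OCR output may put the value on a different line, in which case this sketch returns `None`.

```python
import re


def extract_value(ocr_text, key):
    """Return the text that follows `key` on the same line, or None."""
    # Allow flexible whitespace inside the key, since OCR often alters spacing.
    key_pattern = r"\s*".join(re.escape(part) for part in key.split())
    match = re.search(key_pattern + r"[ \t]*(.+)", ocr_text, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None


# Hypothetical OCR output, standing in for pytesseract.image_to_string(...).
sample_text = "SHIPPER: ACME LTD\nMASTER-AIRWAYBILL NO: 123-45678901\nPIECES: 4"
print(extract_value(sample_text, "MASTER-AIRWAYBILL NO:"))
```

Matching the key case-insensitively and tolerating extra spaces makes the lookup a little more robust against typical OCR noise, though misrecognized characters inside the key itself would still defeat it.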

Fuzzy Text Search: Regex Wildcard Search Generator?

冷暖自知 submitted on 2020-01-31 18:13:07
Question: I'm wondering whether there is a way to do fuzzy string matching in PHP: looking for a word in a long string and finding a potential match even if it is misspelled, for example off by one character due to an OCR error. I was thinking a regex generator might be able to do it. Given an input of "crazy", it would generate this regex: .*((crazy)|(.+razy)|(c.+azy)|(cr.+zy)|(cra.+y)|(craz.+)).* It would then return all matches for that word or variations of that word.
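The generator idea from the question can be sketched as follows. The sketch is in Python rather than PHP, and it deviates from the question's regex on purpose: it uses `\w` (exactly one word character) instead of `.+`, so each variant differs from the input by a single substituted character. The function names `fuzzy_pattern` and `find_fuzzy` are hypothetical.

```python
import re


def fuzzy_pattern(word):
    """Build a regex matching `word` with at most one character substituted."""
    alternatives = [re.escape(word)]
    for i in range(len(word)):
        # Replace the character at position i with a single word character.
        alternatives.append(re.escape(word[:i]) + r"\w" + re.escape(word[i + 1:]))
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def find_fuzzy(word, text):
    """Return every word in `text` that matches the fuzzy pattern."""
    return fuzzy_pattern(word).findall(text)


print(find_fuzzy("crazy", "A crazy idea, a craay typo, and a lazy dog"))
```

This only covers substitutions. Inserted or dropped characters, which OCR also produces, would need extra alternatives or an edit-distance measure such as Levenshtein distance.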

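Notes on Using Baidu OCR

☆樱花仙子☆ submitted on 2020-01-30 16:58:43
title: Notes on Using Baidu OCR copyright: true tags: python abbrlink: 8d4a5af0 date: 2018-11-12 11:04:27 ---
I happened to need batch OCR. For accuracy reasons I did not use the TensorFlow-OCR model trained locally and chose Baidu OCR instead. There are many alternatives, such as Google text recognition and Tencent OCR, which I won't list one by one.
This is a very simple demo that follows the developer documentation: http://ai.baidu.com/docs#/OCR-Python-SDK/80d64770
First register a developer account in the console and create a text recognition application. The AppID and related information are shown under application management.
Install the SDK: pip install baidu-aip
Create a new Python file:
    from aip import AipOcr
    from glob import glob
    from docx import Document
    import os
    import json
    """ Your APPID, AK, SK """
    APP_ID = 'your App ID'
    API_KEY = 'your Api Key'
    SECRET_KEY = 'your Secret Key'
    client = AipOcr(APP_ID, API_KEY, SECRET_KEY)
    root_path = os.getcwd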

Google OCR language hints

青春壹個敷衍的年華 submitted on 2020-01-30 09:09:10
Question: The documentation page at https://cloud.google.com/vision/docs/ocr says that you can specify language hints to help OCR detect text in the image more accurately. Does anyone know where I would specify a language hint in my code? I am writing a .NET console application.
    using Google.Cloud.Vision.V1;
    using System;
    namespace GoogleCloudSamples
    {
        public class QuickStart
        {
            public static void Main(string[] args)
            {
                // Instantiates a client
                var client = ImageAnnotatorClient

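OCR Application Case: Indoor Navigation in a Shopping Mall

泪湿孤枕 submitted on 2020-01-30 01:54:45
Among the better text detection techniques for natural scenes today are CTPN, FTSN, EAST, and STN. As these OCR techniques have developed, text in many forms (including vertical, slanted, and curved text) can now be located fairly accurately. OCR is also applied very widely. This article describes an example that combines text detection, text recognition, and a path planning algorithm for indoor navigation, using a shopping mall as the main indoor setting. The idea and feasibility of the application are explained in three steps.
1 Design approach
GPS cannot be used for indoor navigation, and the Wi-Fi, Bluetooth, and infrared techniques currently used for indoor navigation are unstable because they are subject to signal interference. This article introduces a "photo-based" navigation technique that combines text detection and recognition with path planning, which avoids those signal problems entirely.
In a mall, the shop sign is the best indicator of a shop's location: it is the most distinctive positioning cue, and recognizing it is feasible. The project therefore proposes the design shown in Figure 1-1. The system receives images of the start and end points taken by the user (the parking space image is taken before entering the mall). Because shop owners design their signs with artistic and customer-attracting elements in mind, the text in the captured images varies in size, orientation (horizontal or vertical), and content (Chinese, English, special symbols). We therefore combine the CTPN and EAST networks to keep shop sign detection accurate, and then use a DenseNet network for character recognition. When a few characters are wrong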
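The recognition-to-route step can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the excerpt does not say which path planning algorithm the project uses or how recognition errors are corrected, so breadth-first search and `difflib` string matching are stand-ins, and the mall layout, shop names, and function names are hypothetical.

```python
from collections import deque
from difflib import get_close_matches

# Hypothetical mall layout: each location maps to the locations adjacent to it.
MALL_GRAPH = {
    "Parking P1": ["Bookstore"],
    "Bookstore": ["Parking P1", "Cafe", "Pharmacy"],
    "Cafe": ["Bookstore", "Cinema"],
    "Pharmacy": ["Bookstore", "Cinema"],
    "Cinema": ["Cafe", "Pharmacy"],
}


def match_shop(ocr_text, graph=MALL_GRAPH, cutoff=0.6):
    """Map noisy OCR output to the closest known location name, or None."""
    matches = get_close_matches(ocr_text, list(graph), n=1, cutoff=cutoff)
    return matches[0] if matches else None


def shortest_route(start, goal, graph=MALL_GRAPH):
    """Breadth-first search for a route with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph[path[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# "Cinerna" stands in for a sign whose recognized text has a few wrong characters.
destination = match_shop("Cinerna")
print(shortest_route("Parking P1", destination))
```

Matching recognized text against a fixed list of known shop names is one way to tolerate a few misrecognized characters. A real system would likely weight edges by walking distance and use a weighted shortest-path algorithm such as Dijkstra's instead of hop counts.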

Unable to extract scanned pdf using TesseractOCRConfig Apache Tika

余生长醉 submitted on 2020-01-29 18:00:03
Question: My PDF contains scanned images, and I want to extract text from it. What I tried: I tried AutoDetectParser but got no output. I followed the solution in "Apache Tika extract scanned PDF files" and also the Apache Tika Jira issue at https://issues.apache.org/jira/browse/TIKA-1729, but I get an empty string without any error. My configuration: Windows 7 64-bit, JDK 1.8.0_45. Any help is welcome.
Answer 1: Steps to follow to solve this: Install Tesseract on your system using 'tesseract-ocr