OCR images from a list of URLs and store the results in a spreadsheet

帅比萌擦擦* submitted on 2020-05-18 05:15:13

Question


Hello, I have a list of image URLs, and the images contain numbers. I want to OCR them and store the results in a Google spreadsheet. I've found these Google Apps Script examples for OCRing images:

1- https://gist.github.com/tagplus5/07dde5ca61fe8f42045d

2- https://ctrlq.org/code/20128-extract-text-from-image-ocr

But I didn't know how to create the request variable, so I replaced it with a url variable like this:

function doGet(url) {
  if (url != undefined && url != "") {
    var imageBlob = UrlFetchApp.fetch(url).getBlob();
    var resource = {
      title: imageBlob.getName(),
      mimeType: imageBlob.getContentType()
    };
    var options = {
      ocr: true
    };

    var docFile = Drive.Files.insert(resource, imageBlob, options);
    var doc = DocumentApp.openById(docFile.id);
    var text = doc.getBody().getText().replace("\n", "");
    Drive.Files.remove(docFile.id);
    return ContentService.createTextOutput(text);
  }
  else {
    return ContentService.createTextOutput("request error");
  }
}

The problem is that when I call the function as doGet(B1), where B1 is the spreadsheet cell containing the image URL, expecting the OCR text to appear in cell C1, it says the Drive variable is undefined.

I hope to get an answer soon.


Answer 1:


Okay, I modified your script and made a sheet to show an example. The sheet is here (anyone can edit), and its script is below. The Drive API (v2) of the Advanced Drive Service must be enabled to run this script.

function onOpen() {
  var ss = SpreadsheetApp.getActive();
  var menuItems = [
    {name: 'RUN', functionName: 'doGet2'}
  ];

  ss.addMenu('OCR', menuItems);
}


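function doGet2() {
  var ROW_START = 3; // first row that contains a URL
  var URL_COL = 1;   // column A holds the image URLs
  var TEXT_COL = 2;  // column B receives the OCR text

  var ss = SpreadsheetApp.getActive();
  var sheet = ss.getActiveSheet();

  var numRows = sheet.getLastRow() - ROW_START + 1;
  if (numRows < 1) {
    return; // no URL rows to process
  }

  var urls = sheet.getRange(ROW_START, URL_COL, numRows, 1).getValues();
  var texts = [];
  for (var i = 0; i < urls.length; i++) {
    var url = urls[i][0]; // getValues() returns one array per row
    if (url != undefined && url != "") {
      var imageBlob = UrlFetchApp.fetch(url).getBlob();
      var resource = {
        title: imageBlob.getName(),
        mimeType: imageBlob.getContentType()
      };
      var options = {
        ocr: true
      };

      // With ocr: true, Drive converts the uploaded image into a Google Doc.
      var docFile = Drive.Files.insert(resource, imageBlob, options);
      var doc = DocumentApp.openById(docFile.id);
      // A global regex is needed: replace("\n", "") strips only the first line break.
      var text = doc.getBody().getText().replace(/\n/g, "");

      texts.push([text]);
      Drive.Files.remove(docFile.id); // delete the temporary Doc
    }
    else {
      texts.push(["request error"]); // setValues needs one array per row
    }
  }
  sheet.getRange(ROW_START, TEXT_COL, urls.length, 1).setValues(texts);
}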
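
Two details of a script like this can be checked outside Apps Script, because they are plain JavaScript. The following is only a minimal sketch, not part of the original answer; the helper names stripNewlines and toColumn are hypothetical. It relies on two facts: String.prototype.replace with a string pattern replaces only the first match, and Range.setValues expects a two-dimensional array with one inner array per row.

```javascript
// Hypothetical helper: remove every line break from OCR output.
// A string pattern ("\n") would replace only the first occurrence,
// so a global regular expression is used instead.
function stripNewlines(text) {
  return String(text).replace(/\r?\n/g, "");
}

// Hypothetical helper: turn a flat list into a single-column 2D array,
// the shape that Range.setValues expects (one inner array per row).
function toColumn(values) {
  return values.map(function (value) {
    return [value];
  });
}

var raw = "12\n34\n56";
var firstOnly = raw.replace("\n", ""); // only the first "\n" is removed
var cleaned = stripNewlines(raw);      // all line breaks are removed
var column = toColumn([cleaned, "request error"]);
```

Here firstOnly still contains a line break, while cleaned does not, and column has the row-per-array shape that can be written to a one-column range.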



Answer 2:


The code is okay. The v2 API is still alive; see this documentation. All you need to do is enable the Advanced Drive Service: in the script editor, select Resources > Advanced Google services and turn on Drive API (only v2 is selectable). After that, your code works.
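
If the service has not been enabled, the Drive identifier simply does not exist in the script, which is what the "Drive variable is undefined" message reflects. As a hedged sketch (the helper name assertDriveServiceEnabled is hypothetical, not part of any Google API), a script could check for the identifier up front and fail with a clearer message:

```javascript
// Hypothetical helper: fail early with a readable message when the
// Advanced Drive Service has not been enabled for this script project.
// typeof is used because referencing an undeclared identifier directly
// would itself throw a ReferenceError.
function assertDriveServiceEnabled() {
  if (typeof Drive === "undefined") {
    throw new Error(
      "Drive is undefined: enable the Advanced Drive Service (Drive API v2) for this script."
    );
  }
}
```

One possible use is to call assertDriveServiceEnabled() as the first statement of doGet2, so a missing service is reported before any image is fetched.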



Source: https://stackoverflow.com/questions/42152604/ocr-images-from-list-of-urls-and-store-the-results-in-spreadsheet
