How to get the selected text from an embedded pdf in a web page?

南楼画角 submitted on 2020-08-25 04:25:06

Question


Here's an example of a PDF document from which I need to extract the user's selection: http://www.ada.gov/hospcombrprt.pdf. If we look at the page source, we see something like:

<html>
  <body marginwidth="0" marginheight="0" style="background-color: rgb(38,38,38)">  
     <embed width="100%" height="100%" name="plugin"        
     src="http://www.ada.gov/hospcombrprt.pdf" type="application/pdf">
  </body>
</html>

How can we get a user's selection from this embedded PDF?

I found a post about extracting the whole text from a PDF document here, and a post similar to mine here, which says that this is not possible.

But there should be some way out. Probably it's possible to extract the whole text and then somehow determine what's been selected? Or determine the selection through the mouse cursor position on the mouse-down and up events? Would appreciate any ideas.


Answer 1:


I doubt this is possible - and if it is, there will be no generic solution, since every PDF viewer is different.

Not everyone uses Adobe's own Acrobat plugin. Foxit is popular. Both of these are plugins that most likely do not provide an interface to access this information.

And some browsers, such as Chrome and Firefox, now provide a built-in PDF viewer, which works completely differently from the plugins.

Also, are you accessing a PDF on a different domain? In that case same-origin policy would prevent accessing such information anyway.
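To make the same-origin point concrete, here is a minimal sketch of how a page could check whether a PDF URL shares its origin. The helper name `isSameOrigin` and the `example.com` page URL are assumptions made for illustration; only the standard `URL` API is used. A matching origin does not by itself make a plugin's selection readable; it is only one precondition.

```javascript
// Hypothetical helper: true when `resourceUrl` has the same origin
// (scheme + host + port) as `pageUrl`. Relative URLs are resolved
// against the page URL first.
function isSameOrigin(resourceUrl, pageUrl) {
  const resource = new URL(resourceUrl, pageUrl);
  const page = new URL(pageUrl);
  return resource.origin === page.origin;
}

// The PDF from the question, viewed from a page on another (made-up) domain.
const crossOrigin = isSameOrigin(
  "http://www.ada.gov/hospcombrprt.pdf",
  "http://example.com/viewer.html"
);
console.log(crossOrigin); // false
```

In a browser, `window.location.href` could be passed as the second argument.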

And finally you need to consider that not every user likes using (or even is allowed to use) a PDF browser plugin, so your "solution" won't work in those cases.

One more point: the fact that you are using the vastly outdated embed element instead of object suggests you are working with very old knowledge.

You may need to take a step back and really reconsider what you are trying to do here. What is the bigger picture? What are you trying to achieve?




Answer 2:


I too wanted a way to get the selected text from a PDF on a web page, and I came across PDFTron, which is of course not a native method. You can get the selected text from a PDF using PDFTron's WebViewer with the following call:

var selectedText = myWebViewer.getInstance().docViewer.getSelectedText();
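More generally, if a JavaScript-based viewer renders the PDF text as ordinary DOM nodes inside the page (an assumption; this does not hold for an `<embed>` handled by a plugin), the standard Selection API may be enough. The sketch below is not part of PDFTron; `getSelectedTextIn` and its `container` argument are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: returns the text currently selected inside
// `container`, or "" when nothing is selected or the selection lies
// outside the container. `win` defaults to the global object
// (window in a browser); a stub can be passed for testing.
function getSelectedTextIn(container, win = globalThis) {
  const selection = win.getSelection ? win.getSelection() : null;
  if (!selection || selection.rangeCount === 0 || selection.isCollapsed) {
    return "";
  }
  const range = selection.getRangeAt(0);
  // Ignore selections made elsewhere on the page.
  if (!container.contains(range.commonAncestorContainer)) {
    return "";
  }
  return selection.toString();
}
```

In a browser this might be called from a `mouseup` listener on the element that wraps the viewer's rendered pages, which matches the mouse-event idea raised in the question.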


Source: https://stackoverflow.com/questions/19765844/how-to-get-the-selected-text-from-an-embedded-pdf-in-a-web-page
