Running a JavaScript command from MATLAB to fetch a PDF file

予麋鹿 2020-12-19 13:40

I'm currently writing some MATLAB code to interact with my company's internal reports database. So far I can access the HTML abstract page using code which looks like this

3 Answers
  • 2020-12-19 13:48

    I think you should take a look at the JavaScript that is being called and see what the final request to the webserver looks like.

    You can do this quite easily in Firefox using the Firebug add-on.

    https://addons.mozilla.org/en-US/firefox/addon/1843

    Once you have found the real server request then you can just request this URL or post to this URL instead of trying to run the JavaScript.
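
    As a rough sketch of that idea, the request could be issued directly from MATLAB with URLREAD. The endpoint and the parameter names below are hypothetical placeholders, not taken from the question; substitute whatever the browser's network log actually shows.

    ```matlab
    % Hypothetical endpoint and parameters (assumptions for illustration only).
    reportUrl = 'http://reports.example.com/getReport';
    params    = {'reportId', '12345', 'format', 'pdf'};

    % GET: the name/value pairs are appended to the URL as a query string.
    % Asking for the second output returns a status flag instead of throwing an error.
    [respGet, okGet] = urlread(reportUrl, 'get', params);

    % POST: the same name/value pairs are sent in the request body.
    [respPost, okPost] = urlread(reportUrl, 'post', params);

    if ~okGet && ~okPost
        disp('Request failed; check the URL and parameters.');
    end
    ```

    Which of the two variants applies depends on what the JavaScript actually sends, so only one of the calls would normally be needed.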

  • 2020-12-19 14:04
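    % Note: the com.mathworks.mde.* classes are undocumented MATLAB internals and may change between releases.
    wb = com.mathworks.mde.webbrowser.WebBrowser.createBrowser;      % create a MATLAB web browser client
    wb.executeScript('javascript:alert(''Some code from a link'')');  % run the JavaScript in that browser
    desk = com.mathworks.mde.desk.MLDesktop.getInstance;             % get a handle to the MATLAB desktop
    desk.removeClient(wb);                                            % remove the browser client from the desktop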
    
  • 2020-12-19 14:11

    Once you have gotten the correct URL (a la the answer from pjp), your next problem is to "get the contents of the PDF file into a MATLAB variable". Whether or not this is possible may depend on what you mean by "contents"...


    If you want to get the raw data in the PDF file, I don't think there is a way currently to do this in MATLAB. The URLREAD function was the first thing I thought of to read content from a URL into a string, but it has this note in the documentation:

    s = urlread('url') reads the content at a URL into the string s. If the server returns binary data, s will be unreadable.

    Indeed, if you try to read a PDF as in the following example, s contains some text intermingled with mostly garbage:

    s = urlread('http://samplepdf.com/sample.pdf');
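
    One possible workaround for getting at the raw bytes, offered here as a sketch rather than as part of the original answer: save the URL to a temporary file with URLWRITE and read that file back as uint8 with FREAD. The variable names are made up for illustration, and the sample URL is the one used above.

    ```matlab
    % Sketch: download to a temporary file, then read the file back as raw bytes.
    url     = 'http://samplepdf.com/sample.pdf';   % sample URL from the example above
    tmpFile = [tempname '.pdf'];                   % temporary file name (illustrative)

    bytes = uint8([]);
    isPdf = false;

    % The second output is a status flag; requesting it suppresses download errors.
    [~, ok] = urlwrite(url, tmpFile);
    if ok
        fid   = fopen(tmpFile, 'r');
        bytes = fread(fid, Inf, '*uint8')';        % row vector of uint8 values
        fclose(fid);
        delete(tmpFile);
        % A PDF file normally begins with the characters '%PDF'.
        isPdf = numel(bytes) >= 4 && strcmp(char(bytes(1:4)), '%PDF');
    end
    ```

    The bytes are kept as uint8 rather than char, so no character-set conversion is applied to the binary data.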
    

    If you want to get the text from the PDF file, you have some options. First, you can use URLWRITE to save the contents of the URL to a file:

    urlwrite('http://samplepdf.com/sample.pdf','temp.pdf');
    

    Then you should be able to use one of two submissions on The MathWorks File Exchange to extract the text from the PDF:

    • Extract text from a PDF document by Dimitri Shvorob
    • PDF Reader by Tom Gaudette

    If you simply want to view the PDF, you can just open it in Adobe Acrobat with the OPEN function:

    open('temp.pdf');
    