Programmatically get a screenshot of a page

别等时光非礼了梦想. 提交于 2019-11-26 05:00:38

问题


I\'m writing a specialized crawler and parser for internal use, and I require the ability to take a screenshot of a web page in order to check what colours are being used throughout. The program will take in around ten web addresses and will save them as a bitmap image.

From there I plan to use LockBits in order to create a list of the five most used colours within the image. To my knowledge, it\'s the easiest way to get the colours used within a web page, but if there is an easier way to do it please chime in with your suggestions.

Anyway, I was going to use ACA WebThumb ActiveX Control until I saw the price tag. I\'m also fairly new to C#, having only used it for a few months. Is there a solution to my problem of taking a screenshot of a web page in order to extract the colour scheme?


回答1:


https://screenshotlayer.com/documentation is the only free service I can find lately...

You'll need to use HttpWebRequest to download the binary of the image. See the provided url above for details.

HttpWebRequest request = HttpWebRequest.Create("https://[url]") as HttpWebRequest;
Bitmap bitmap;
using (Stream stream = request.GetResponse().GetResponseStream())
{
    bitmap = new Bitmap(stream);
}
// now that you have a bitmap, you can do what you need to do...



回答2:


A quick and dirty way would be to use the WinForms WebBrowser control and draw it to a bitmap. Doing this in a standalone console app is slightly tricky because you have to be aware of the implications of hosting a STAThread control while using a fundamentally asynchronous programming pattern. But here is a working proof of concept which captures a web page to an 800x600 BMP file:

namespace WebBrowserScreenshotSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.Threading;
    using System.Windows.Forms;

    class Program
    {
        [STAThread]
        static void Main()
        {
            int width = 800;
            int height = 600;

            using (WebBrowser browser = new WebBrowser())
            {
                browser.Width = width;
                browser.Height = height;
                browser.ScrollBarsEnabled = true;

                // This will be called when the page finishes loading
                browser.DocumentCompleted += Program.OnDocumentCompleted;

                browser.Navigate("https://stackoverflow.com/");

                // This prevents the application from exiting until
                // Application.Exit is called
                Application.Run();
            }
        }

        static void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // Now that the page is loaded, save it to a bitmap
            WebBrowser browser = (WebBrowser)sender;

            using (Graphics graphics = browser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(browser.Width, browser.Height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
                browser.DrawToBitmap(bitmap, bounds);
                bitmap.Save("screenshot.bmp", ImageFormat.Bmp);
            }

            // Instruct the application to exit
            Application.Exit();
        }
    }
}

To compile this, create a new console application and make sure to add assembly references for System.Drawing and System.Windows.Forms.

UPDATE: I rewrote the code to avoid having to using the hacky polling WaitOne/DoEvents pattern. This code should be closer to following best practices.

UPDATE 2: You indicate that you want to use this in a Windows Forms application. In that case, forget about dynamically creating the WebBrowser control. What you want is to create a hidden (Visible=false) instance of a WebBrowser on your form and use it the same way I show above. Here is another sample which shows the user code portion of a form with a text box (webAddressTextBox), a button (generateScreenshotButton), and a hidden browser (webBrowser). While I was working on this, I discovered a peculiarity which I didn't handle before -- the DocumentCompleted event can actually be raised multiple times depending on the nature of the page. This sample should work in general, and you can extend it to do whatever you want:

namespace WebBrowserScreenshotFormsSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Windows.Forms;

    public partial class MainForm : Form
    {
        public MainForm()
        {
            this.InitializeComponent();

            // Register for this event; we'll save the screenshot when it fires
            this.webBrowser.DocumentCompleted += 
                new WebBrowserDocumentCompletedEventHandler(this.OnDocumentCompleted);
        }

        private void OnClickGenerateScreenshot(object sender, EventArgs e)
        {
            // Disable button to prevent multiple concurrent operations
            this.generateScreenshotButton.Enabled = false;

            string webAddressString = this.webAddressTextBox.Text;

            Uri webAddress;
            if (Uri.TryCreate(webAddressString, UriKind.Absolute, out webAddress))
            {
                this.webBrowser.Navigate(webAddress);
            }
            else
            {
                MessageBox.Show(
                    "Please enter a valid URI.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Exclamation);

                // Re-enable button on error before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // This event can be raised multiple times depending on how much of the
            // document has loaded, if there are multiple frames, etc.
            // We only want the final page result, so we do the following check:
            if (this.webBrowser.ReadyState == WebBrowserReadyState.Complete &&
                e.Url == this.webBrowser.Url)
            {
                // Generate the file name here
                string screenshotFileName = Path.GetFullPath(
                    "screenshot_" + DateTime.Now.Ticks + ".png");

                this.SaveScreenshot(screenshotFileName);
                MessageBox.Show(
                    "Screenshot saved to '" + screenshotFileName + "'.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Information);

                // Re-enable button before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void SaveScreenshot(string fileName)
        {
            int width = this.webBrowser.Width;
            int height = this.webBrowser.Height;
            using (Graphics graphics = this.webBrowser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(width, height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, width, height);
                this.webBrowser.DrawToBitmap(bitmap, bounds);
                bitmap.Save(fileName, ImageFormat.Png);
            }
        }
    }
}



回答3:


This question is old but, alternatively, you can use nuget package Freezer. It's free, uses a recent Gecko webbrowser (supports HTML5 and CSS3) and stands only in one dll.

var screenshotJob = ScreenshotJobBuilder.Create("https://google.com")
              .SetBrowserSize(1366, 768)
              .SetCaptureZone(CaptureZone.FullPage) 
              .SetTrigger(new WindowLoadTrigger()); 

 System.Drawing.Image screenshot = screenshotJob.Freeze();



回答4:


There is a great Webkit based browser PhantomJS which allows to execute any JavaScript from command line.

Install it from http://phantomjs.org/download.html and execute the following sample script from command line:

./phantomjs ../examples/rasterize.js http://www.panoramio.com/photo/76188108 test.jpg

It will create a screenshot of given page in JPEG file. The upside of that approach is that you don't rely on any external provider and can easily automate screenshot taking in large quantities.




回答5:


I used WebBrowser and it doesn't work perfect for me, specially when needs to waiting for JavaScript rendering complete. I tried some Api(s) and found Selenium, the most important thing about Selenium is, it does not require STAThread and could run in simple console app as well as Services.

give it a try :

class Program
{
    static void Main()
    {
        var driver = new FirefoxDriver();

        driver.Navigate()
            .GoToUrl("http://stackoverflow.com/");

        driver.GetScreenshot()
            .SaveAsFile("stackoverflow.jpg", ImageFormat.Jpeg);

        driver.Quit();
    }
}



回答6:


Check this out. This seems to do what you wanted and technically it approaches the problem in very similar way through web browser control. It seems to have catered for a range of parameters to be passed in and also good error handling built into it. The only downside is that it is an external process (exe) that you spawn and it create a physical file that you will read later. From your description, you even consider webservices, so I dont think that is a problem.

In solving your latest comment about how to process multiple of them simultaneously, this will be perfect. You can spawn say a parallel of 3, 4, 5 or more processes at any one time or have the analysis of the color bit running as thread while another capturing process is happening.

For image processing, I recently come across Emgu, havent used it myself but it seems fascinating. It claims to be fast and have a lot of support for graphic analysis including reading of pixel color. If I have any graphic processing project on hand right now I will give this a try.




回答7:


you may also have a look at QT jambi http://qt.nokia.com/doc/qtjambi-4.4/html/com/trolltech/qt/qtjambi-index.html

they have a nice webkit based java implementation for a browser where you can do a screenshot simply by doing sth like:

    QPixmap pixmap;
    pixmap = QPixmap.grabWidget(browser);

    pixmap.save(writeTo, "png");

Have a look at the samples - they have a nice webbrowser demo.



来源:https://stackoverflow.com/questions/1981670/programmatically-get-a-screenshot-of-a-page

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!