I'm trying to capture an image of a web page, but some pages come out as a blank white page.
In Registry Editor, browse to HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\
Try setting the User-Agent header like this:
browser.Navigate(url, null, null, "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0");
In order to print the Html content of a WebBrowser Control, there are a few points that need to be considered:
A single Document may (will) contain more than one sub-Document, usually contained inside Frames/IFrames. Each IFrame contains its own Document: when a Document contained in an IFrame is loaded, the DocumentCompleted event is raised. This means that the event can and will be raised multiple times when the WebBrowser navigates to a URL.
The notes here explain more: How to get an HtmlElement value inside Frames/IFrames?
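As a rough illustration of the multiple-events point above, one common heuristic is to compare the URL reported by the event with the URL of the browser itself, and ignore completions that belong to a frame. The helper below is hypothetical (`FrameFilter` and `IsTopLevelDocument` are names invented for this sketch, not part of the WebBrowser API), and the comparison is not a guarantee: a frame that loads the same URL as the main page would still pass.

```csharp
using System;

// Hypothetical helper, not part of the original answer or of the WebBrowser API.
public static class FrameFilter
{
    // Returns true when the completed document appears to be the top-level one,
    // i.e. the URL reported by the event equals the URL the browser is showing.
    public static bool IsTopLevelDocument(Uri completedUrl, Uri browserUrl)
    {
        if (completedUrl == null || browserUrl == null) return false;
        return completedUrl == browserUrl;
    }
}
```

Inside a DocumentCompleted handler this could be called as `if (!FrameFilter.IsTopLevelDocument(e.Url, browser.Url)) return;`. Checking `ReadyState` for `WebBrowserReadyState.Complete` is another way to filter the intermediate events.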
The managed properties of the WebBrowser don't always reflect the DOM's real values. For example, the actual dimensions of the Html Document, when the rendering is completed, are not reflected anywhere, so we need to get those measures from the DOM ourselves. The current DOM rendered dimensions are referenced by:
[WebBrowser].Document.DomDocument.documentElement.scrollHeight;
[WebBrowser].Document.DomDocument.documentElement.scrollWidth;
See: Measuring Element Dimension and Location with CSSOM in Windows Internet Explorer
The WebBrowser Control DrawToBitmap() method is inherited from Control, but it's not actually implemented as we might expect. The same applies to other Controls: the RichTextBox is known to print blank content when this method is used.
To proceed, first subscribe to the DocumentCompleted event of the WebBrowser Control.
A Dictionary<Uri, Bitmap> is used here to store the Bitmap representing the Html content of URLs visited in a session.
When the DocumentCompleted event is raised, we add a new element to the Dictionary if the current URL has never been visited before.
If the Uri is already stored, we dispose of the old Bitmap and replace it, so only the most recent snapshot of an Html Document is present in the collection.
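The add-or-replace logic described above can be isolated in a small container, so the disposal of the previous snapshot cannot be forgotten. This is only a sketch: `SnapshotStore<T>` is a hypothetical class introduced for illustration, written against any `IDisposable` so it does not depend on System.Drawing, and it assumes C# 7 or later for the `out` variable declaration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
// Keeps at most one disposable snapshot per Uri.
public sealed class SnapshotStore<T> where T : class, IDisposable
{
    private readonly Dictionary<Uri, T> items = new Dictionary<Uri, T>();

    public int Count => items.Count;

    // Adds the snapshot, or replaces (and disposes) the one stored for the same Uri.
    public void Set(Uri url, T snapshot)
    {
        if (url == null) throw new ArgumentNullException(nameof(url));

        if (items.TryGetValue(url, out T previous) && !ReferenceEquals(previous, snapshot)) {
            // Release the native resources held by the older snapshot
            previous?.Dispose();
        }
        items[url] = snapshot;
    }

    public bool TryGet(Uri url, out T snapshot) => items.TryGetValue(url, out snapshot);
}
```

With `SnapshotStore<Bitmap>`, the handler would reduce to a single `Set(browser.Url, bitmap)` call. If the replaced Bitmap is still assigned to a PictureBox, the new image should be assigned right away, since the old one is disposed at that point.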
I'm using a support class to handle the Bitmap creation and to declare the native COM Interface used to generate the Bitmap from the current ISurfacePresenter.
Since the WebBrowser control is forced to use VIEW_OBJECT_COMPOSITION_MODE_LEGACY as the CompositionMode for all sites, the internal GetPrintBitmap method calls the Draw() method of the IViewObject Interface in this situation, and so do we.
To print the content (all the content) of the current Html Document, call the DrawContent(WebBrowser browser) static method of the WebBrowserExtender class:
Dictionary<Uri, Bitmap> browserShots = new Dictionary<Uri, Bitmap>();

private void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    var browser = sender as WebBrowser;
    // The event is raised once per frame: wait until the whole page is ready
    if (browser.ReadyState != WebBrowserReadyState.Complete) return;

    var bitmap = WebBrowserExtender.DrawContent(browser);
    if (bitmap != null) {
        if (!browserShots.ContainsKey(browser.Url)) {
            browserShots.Add(browser.Url, bitmap);
        }
        else {
            // Dispose of the older snapshot before replacing it
            browserShots[browser.Url]?.Dispose();
            browserShots[browser.Url] = bitmap;
        }
        // Optionally, show the Bitmap in a PictureBox control
        [PictureBox].Image = browserShots[browser.Url];
    }
}
The WebBrowserExtender support class:
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
using System.Windows.Forms;
public class WebBrowserExtender
{
    public static Bitmap DrawContent(WebBrowser browser)
    {
        if (browser.Document == null) return null;

        Size docSize = Size.Empty;
        Bitmap bitmap = null;
        Graphics g = null;
        var hDc = IntPtr.Zero;
        var succeeded = false;
        var previousSize = browser.ClientSize;

        try {
            // Read the rendered size from the DOM: the managed properties don't expose it
            dynamic domDocument = browser.Document.DomDocument;
            docSize.Height = (int)domDocument.documentElement.scrollHeight;
            docSize.Width = (int)domDocument.documentElement.scrollWidth;

            // Limit the width to the screen width and the height to 32750 pixels
            var screenWidth = Screen.FromHandle(browser.Handle).Bounds.Width;
            docSize.Width = Math.Max(Math.Min(docSize.Width, screenWidth), 1);
            docSize.Height = Math.Max(Math.Min(docSize.Height, 32750), 1);

            // Temporarily resize the control to the size of the Document
            browser.ClientSize = new Size(docSize.Width, docSize.Height);

            bitmap = new Bitmap(docSize.Width, docSize.Height, PixelFormat.Format32bppArgb);
            g = Graphics.FromImage(bitmap);
            var rect = new RECT(0, 0, bitmap.Width, bitmap.Height);
            hDc = g.GetHdc();

            // 1 = DVASPECT_CONTENT, -1 = draw the whole object
            var view = browser.ActiveXInstance as IViewObject;
            view.Draw(1, -1, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero, hDc, ref rect, IntPtr.Zero, IntPtr.Zero, 0);
            succeeded = true;
            return bitmap;
        }
        catch {
            // This catch block is like this on purpose: nothing to do here
            return null;
        }
        finally {
            // IntPtr is a value type: compare it with IntPtr.Zero, not with null
            if (hDc != IntPtr.Zero) g?.ReleaseHdc(hDc);
            g?.Dispose();
            // Don't leak the Bitmap when the capture failed
            if (!succeeded) bitmap?.Dispose();
            // Restore the original size even when an exception was thrown
            browser.ClientSize = previousSize;
        }
    }

    // Only Draw() is declared: it's the first method of IViewObject and the only one used here
    [ComImport]
    [Guid("0000010D-0000-0000-C000-000000000046")]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    interface IViewObject
    {
        void Draw(uint dwAspect, int lindex, IntPtr pvAspect, [In] IntPtr ptd,
            IntPtr hdcTargetDev, IntPtr hdcDraw, ref RECT lprcBounds,
            [In] IntPtr lprcWBounds, IntPtr pfnContinue, uint dwContinue);
    }

    [StructLayout(LayoutKind.Sequential, Pack = 4)]
    struct RECT
    {
        public int Left;
        public int Top;
        public int Right;
        public int Bottom;

        // Takes edge coordinates (right, bottom), not a width and a height
        public RECT(int left, int top, int right, int bottom)
        {
            Left = left; Top = top; Right = right; Bottom = bottom;
        }
    }
}
This is how it works:
The full Document is captured. Of course, the Bitmap can also be limited to a specific maximum/minimum size, to capture just a section of the Html Document.
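To limit the capture to a maximum size, the clamping that DrawContent() already applies to the Document dimensions can be parameterized. A minimal sketch, assuming C# 7 tuples are available; `CaptureSize` and `Clamp` are hypothetical names used only for illustration:

```csharp
using System;

// Hypothetical helper, not part of the original answer.
public static class CaptureSize
{
    // Limits the Document dimensions to the given maximums, never going below 1 pixel
    // (a Bitmap cannot be created with a zero or negative dimension).
    public static (int Width, int Height) Clamp(int docWidth, int docHeight, int maxWidth, int maxHeight)
    {
        int width = Math.Max(Math.Min(docWidth, maxWidth), 1);
        int height = Math.Max(Math.Min(docHeight, maxHeight), 1);
        return (width, height);
    }
}
```

For example, `CaptureSize.Clamp(5000, 100000, 1920, 32750)` yields a width of 1920 and a height of 32750, the same limits the support class uses when the screen is 1920 pixels wide.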
Sample WinForms Project on Google Drive.