HTML - How do I know when all frames have loaded?

Time: 2022-06-15 05:46:32

I'm using .NET WebBrowser control. How do I know when a web page is fully loaded?

I want to know when the browser is not fetching any more data. (The moment when IE writes 'Done' in its status bar...).

Notes:

  • The DocumentComplete/NavigateComplete events might occur multiple times for a web site containing multiple frames.
  • The browser ready state doesn't solve the problem either.
  • I have tried checking the number of frames in the frame collection and then counting the number of times I get the DocumentComplete event, but this doesn't work either.
  • this.WebBrowser.IsBusy doesn't work either. It is always 'false' when I check it in the DocumentComplete handler.

12 solutions

#1


2  

My approach to doing something when the page is completely loaded (including frames) is something like this:

using System.Windows.Forms;

    protected delegate void Procedure();

    // 'ie' is the WebBrowser control; doNext runs once the page has settled.
    private void executeAfterLoadingComplete(Procedure doNext) {
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = delegate(object o, WebBrowserDocumentCompletedEventArgs e)
        {
            // One-shot: detach after the first DocumentCompleted event.
            ie.DocumentCompleted -= handler;
            // Poll ReadyState every 200 ms with a UI-thread timer.
            Timer timer = new Timer();
            EventHandler checker = delegate(object o1, EventArgs e1)
            {
                if (WebBrowserReadyState.Complete == ie.ReadyState)
                {
                    timer.Dispose(); // stops further ticks
                    doNext();
                }
            };
            timer.Tick += checker;
            timer.Interval = 200;
            timer.Start();
        };
        ie.DocumentCompleted += handler;
    }

From my other attempts I learned a few "don'ts":

  • don't try to bend the spoon ... ;-)
  • don't try to build an elaborate construct using DocumentComplete, Frames, and HtmlWindow.Load events. Your solution will be fragile, if it works at all.
  • don't use System.Timers.Timer instead of Windows.Forms.Timer; strange errors will begin to occur in strange places if you do, because the timer runs on a different thread than the rest of your app.
  • don't use just a Timer without DocumentComplete, because it may fire before your page even begins to load and will execute your code prematurely.

#2


2  

Here's how I solved the problem in my application:

private void wbPost_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (e.Url != wbPost.Url)
        return;
    /* Document now loaded */
}

#3


2  

Here's my tested version. Just make this your DocumentCompleted event handler and place the code that you want to be called only once into the method OnWebpageReallyLoaded(). Effectively, this approach waits until the page has been stable for 200 ms and then does its thing.

// event handler for when a document (or frame) has completed its download
Timer m_pageHasntChangedTimer = null;
private void webBrowser_DocumentCompleted( object sender, WebBrowserDocumentCompletedEventArgs e ) {
    // dynamic pages will often be loaded in parts e.g. multiple frames
    // need to check the page has remained static for a while before safely saying it is 'loaded'
    // use a timer to do this

    // destroy the old timer if it exists
    if ( m_pageHasntChangedTimer != null ) {
        m_pageHasntChangedTimer.Dispose();
    }

    // create a new timer which calls the 'OnWebpageReallyLoaded' method after 200ms
    // if an additional frame or other content is downloaded in the meantime, this timer
    // will be destroyed and the process repeated
    m_pageHasntChangedTimer = new Timer();
    EventHandler checker = delegate( object o1, EventArgs e1 ) {
        // only if the page has been stable for 200ms already
        // check the official browser state flag, (euphemistically called) 'Ready'
        // and call our 'OnWebpageReallyLoaded' method
        if ( WebBrowserReadyState.Complete == webBrowser.ReadyState ) {
            m_pageHasntChangedTimer.Dispose();
            OnWebpageReallyLoaded();
        }
    };
    m_pageHasntChangedTimer.Tick += checker;
    m_pageHasntChangedTimer.Interval = 200;
    m_pageHasntChangedTimer.Start();
}

private void OnWebpageReallyLoaded() {
    /* place your harvester code here */
}

#4


1  

Here's what finally worked for me:

    public bool WebPageLoaded
    {
        get
        {
            if (this.WebBrowser.ReadyState != System.Windows.Forms.WebBrowserReadyState.Complete)
                return false;

            if (this.HtmlDomDocument == null)
                return false;

            // iterate over all the Html elements. Find all frame elements and check their ready state
            foreach (IHTMLDOMNode node in this.HtmlDomDocument.all)
            {
                IHTMLFrameBase2 frame = node as IHTMLFrameBase2;
                if (frame != null)
                {
                    if (!frame.readyState.Equals("complete", StringComparison.OrdinalIgnoreCase))
                        return false;

                }
            }

            Debug.Print(this.Name + " - I think it's loaded");
            return true;
        }
    }

On each DocumentComplete event I iterate over all the HTML elements and check every frame I find (I know this can be optimized). For each frame I check its ready state. It's pretty reliable, but as jeffamaphone said, I have already seen sites that trigger internal refreshes. Still, the above code satisfies my needs.

Edit: every frame can contain frames within it so I think this code should be updated to recursively check the state of every frame.
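
The recursive check suggested in this edit could look roughly like the following. This is only a minimal sketch, not the answerer's code: FrameNode is a hypothetical stand-in for whatever wrapper you build around IHTMLFrameBase2 and the frames inside its document, so that the recursion can be shown without COM interop.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a frame element: its readyState string plus
// the frames found inside its own document (an assumption, not an mshtml type).
public sealed class FrameNode
{
    public string ReadyState { get; }
    public IReadOnlyList<FrameNode> Children { get; }

    public FrameNode(string readyState, params FrameNode[] children)
    {
        ReadyState = readyState;
        Children = children;
    }
}

public static class FrameTree
{
    // True only if this frame and every nested frame report "complete".
    public static bool IsFullyLoaded(FrameNode frame)
    {
        if (!string.Equals(frame.ReadyState, "complete", StringComparison.OrdinalIgnoreCase))
            return false;

        foreach (FrameNode child in frame.Children)
        {
            if (!IsFullyLoaded(child))
                return false;
        }
        return true;
    }
}
```

In a real application the tree would have to be built from the live DOM on each check, since frames can be added or replaced while the page loads.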

#5


0  

How about using JavaScript in each frame to set a flag when the frame is complete, and then having C# look at the flags?

#6


0  

I don't have an alternative for you, but I wonder if the IsBusy property being true during the Document Complete handler is because the handler is still running and therefore the WebBrowser control is technically still 'busy'.

The simplest solution would be to have a loop that executes every 100 ms or so until the IsBusy flag is reset (with a max execution time in case of errors). That of course assumes that IsBusy will not be set to false at any point during page loading.
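
A minimal sketch of such a polling loop with an upper time limit, written with async/await so that it does not block the thread between checks. PollingHelper.WaitUntilAsync is a hypothetical helper name, and the condition passed in would be something like () => !webBrowser.IsBusy; whether IsBusy is a trustworthy signal is exactly the assumption stated above.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class PollingHelper
{
    // Re-checks 'condition' every 'interval' until it returns true or
    // 'timeout' elapses. Returns true if the condition was met in time.
    public static async Task<bool> WaitUntilAsync(
        Func<bool> condition, TimeSpan interval, TimeSpan timeout)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        while (!condition())
        {
            if (elapsed.Elapsed >= timeout)
                return false;           // gave up: max execution time reached
            await Task.Delay(interval); // yields instead of blocking the thread
        }
        return true;
    }
}
```

Awaited from a Windows Forms event handler, for example with a 100 ms interval and an arbitrarily chosen 30 s timeout, the continuation should resume on the UI thread, so the control is only touched from the thread that owns it.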

If the Document Complete handler executes on another thread, you could use a lock to put your main thread to sleep and wake it up from the Document Complete thread. Then check the IsBusy flag, re-locking the main thread if it is still true.

#7


0  

I'm not sure it'll work, but try adding a JavaScript "onload" event handler to your frameset like this:

function everythingIsLoaded() { alert("everything is loaded"); }
var frameset = document.getElementById("idOfYourFrameset");
if (frameset.addEventListener)
    frameset.addEventListener('load',everythingIsLoaded,false); 
else
    frameset.attachEvent('onload',everythingIsLoaded); 

#8


0  

Can you use jQuery? Then you could easily bind frame ready events on the target frames. See this answer for directions. This blog post also has a discussion about it. Finally there is a plug-in that you could use.

The idea is that you count the number of frames in the web page using:

$("iframe").length   // .size() was deprecated in jQuery 1.8 and removed in 3.0

and then you count how many times the iframe ready event has been fired.

#9


0  

You will get a BeforeNavigate and a DocumentComplete event for the outer web page, as well as for each frame. You know you're done when you get the DocumentComplete event for the outer web page. You should be able to use the managed equivalent of IWebBrowser2::TopLevelContainer() to determine this.

Beware, however, the website itself can trigger more frame navigations anytime it wants, so you never know if a page is truly done forever. The best you can do is keep a count of all the BeforeNavigates you see and decrement the count when you get a DocumentComplete.
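
The counting idea could be sketched like this. PendingNavigationCounter is a hypothetical helper, not part of the WebBrowser API; in Windows Forms you would presumably call NavigationStarted() from the control's Navigating event and NavigationCompleted() from DocumentCompleted, on the assumption that every frame navigation raises both.

```csharp
using System;

// Hypothetical helper: tracks navigations that have started but not finished.
public sealed class PendingNavigationCounter
{
    private int pending;

    // Raised when the count drops back to zero.
    public event EventHandler AllCompleted;

    public int Pending { get { return pending; } }

    // Call when a BeforeNavigate-style event is seen.
    public void NavigationStarted()
    {
        pending++;
    }

    // Call when a DocumentComplete-style event is seen.
    public void NavigationCompleted()
    {
        if (pending == 0)
            return; // unmatched completion; ignore rather than go negative

        pending--;
        if (pending == 0 && AllCompleted != null)
            AllCompleted(this, EventArgs.Empty);
    }
}
```

Because a page can start new frame navigations at any time, reaching zero only means that nothing is in flight right now; combining it with a short quiet period, as in the timer-based answers, may be more robust.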

Edit: Here are the managed docs: TopLevelContainer.

#10


0  

I just use the webBrowser.StatusText property. When it says "Done", everything is loaded! Or am I missing something?

#11


0  

Checking for IE.readyState = READYSTATE_COMPLETE should work, but if that's not proving reliable for you and you literally want to know "the moment when IE writes 'Done' in its status bar", then you can do a loop until IE.StatusText contains "Done".

#12


0  

Have you tried the WebBrowser.IsBusy property?
