Google Answers: for theta-ga Visual C# Programming Question

Question

Subject: for theta-ga Visual C# Programming Question
Category: Computers > Programming
Asked by: futures_trade-ga
List Price: $40.00

Posted: 09 Sep 2006 23:30 PDT
Expires: 09 Oct 2006 23:30 PDT
Question ID: 763843

Please provide a method to read the contents out of a window whose
only child is of the class "Internet Explorer_Server" as identified by
WinSpy++.

What would I need to do to get this code to read these contents?

To clarify, I want to read the contents out of a plain window that
contains no controls, only text. The method is probably the same as
just reading text out of an Internet Explorer window.

Answer

Subject: Re: for theta-ga Visual C# Programming Question
Answered By: theta-ga on 25 Sep 2006 14:39 PDT

Hi futures_trade-ga,
    I was out of town when you posted this question and I'm afraid I
missed its posting on Google Answers. I just noticed it yesterday
while I was browsing through the older questions. My apologies for the
delay.
    The solution for this question builds upon the answer to your
previous question. I have updated the sample application I built for
the previous question so that it can now read from Internet Explorer
windows as well. You can download the updated sample app from: [
http://rapidshare.de/files/34445128/GA-ScreenScraper.zip.html ]

    The sample app contains two new buttons: one retrieves the text
from an IE window, and the other retrieves the underlying HTML code.
When you press either of these buttons, the sample app launches
'news.google.com' in a new IE window, waits 3 seconds for the website
to load, and then displays the text or the HTML for the page in a
textbox.

    The functionality to read IE text is provided by three new methods
in the ScreenScraper class. They are listed below:
===================================================================
// Gets the Internet Explorer IHTMLDocument2 object for the given
// IE Server control window handle
public IHTMLDocument2 GetIEDocumentFromWindowHandle(IntPtr hWnd)
{
    UIntPtr lResult;
    uint lMsg;
    IHTMLDocument2 htmlDocument=null;
    
    if (hWnd != IntPtr.Zero)
    {
        // Register the WM_HTML_GETOBJECT message so it can be used
        // to communicate with the Internet Explorer instance
        lMsg = Win32.RegisterWindowMessage("WM_HTML_GETOBJECT");
        // Sends the above registered message to the IE window and
        // waits for it to process it
        Win32.SendMessageTimeout(hWnd, lMsg, UIntPtr.Zero, UIntPtr.Zero,
            Win32.SendMessageTimeoutFlags.SMTO_ABORTIFHUNG, 1000, out lResult);
        if (lResult != UIntPtr.Zero)
        {
            // Casts the value returned by the IE window into
            // an IHTMLDocument2 interface
            htmlDocument = Win32.ObjectFromLresult(lResult,
                typeof(IHTMLDocument).GUID, IntPtr.Zero) as IHTMLDocument2;
            if (htmlDocument == null)
            {
                throw new COMException(
                    "Unable to cast to an object of type IHTMLDocument2");
            }
        }
    }
    return htmlDocument;
}


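// Returns the HTML of the page body, or an empty string if no
// document could be retrieved from the window
private string ScrapeIEHtmlContent(IntPtr handle)
{
    IHTMLDocument2 htmlDoc = GetIEDocumentFromWindowHandle(handle);
    if (htmlDoc == null || htmlDoc.body == null) return string.Empty;
    return htmlDoc.body.innerHTML;
}


// Returns the text of the page body, or an empty string if no
// document could be retrieved from the window
private string ScrapeIETextContent(IntPtr handle)
{
    IHTMLDocument2 htmlDoc = GetIEDocumentFromWindowHandle(handle);
    if (htmlDoc == null || htmlDoc.body == null) return string.Empty;
    return htmlDoc.body.innerText;
}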
==============================================================

The most important of the three new methods is
GetIEDocumentFromWindowHandle. This method, given the handle to the
'Internet Explorer_Server' window, retrieves an object implementing
the IHTMLDocument2 interface from it. We can then use this object to
retrieve the text or the html from the page body, which is what the
other two new methods 'ScrapeIEHtmlContent' and 'ScrapeIETextContent'
do.
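
The paragraph above assumes you already hold the handle of the
'Internet Explorer_Server' window. As a rough sketch of one way to get
it from the handle of the containing window, the class below walks the
child windows and compares class names. The class name
IEServerWindowFinder and its Find method are hypothetical and are not
part of the sample app; only the EnumChildWindows and GetClassName
calls are real Win32 functions.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper, not part of the downloadable sample app.
public static class IEServerWindowFinder
{
    private const string IEServerClassName = "Internet Explorer_Server";

    private delegate bool EnumWindowsProc(IntPtr hWnd, IntPtr lParam);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool EnumChildWindows(
        IntPtr hWndParent, EnumWindowsProc lpEnumFunc, IntPtr lParam);

    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    private static extern int GetClassName(
        IntPtr hWnd, StringBuilder lpClassName, int nMaxCount);

    // Returns the first descendant of 'parent' whose window class is
    // "Internet Explorer_Server", or IntPtr.Zero if there is none.
    public static IntPtr Find(IntPtr parent)
    {
        // A zero parent would make EnumChildWindows walk every
        // top-level window, which is not what we want here.
        if (parent == IntPtr.Zero) return IntPtr.Zero;

        IntPtr found = IntPtr.Zero;
        EnumWindowsProc callback = delegate(IntPtr hWnd, IntPtr lParam)
        {
            StringBuilder className = new StringBuilder(256);
            GetClassName(hWnd, className, className.Capacity);
            if (className.ToString() == IEServerClassName)
            {
                found = hWnd;
                return false;   // stop enumerating
            }
            return true;        // keep enumerating
        };

        EnumChildWindows(parent, callback, IntPtr.Zero);
        GC.KeepAlive(callback); // keep the delegate alive during the call
        return found;
    }
}
```

The returned handle is what you would then pass to
GetIEDocumentFromWindowHandle. EnumChildWindows also visits
grandchildren, so this should work both for a plain window whose only
child is the IE control and for a full IE window where the control is
nested more deeply.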

Note that since we now use the IHTMLDocument2 interface, you will have
to add a reference to the 'Microsoft.Mshtml' library to the project.
This library will already be available on your system.

The code requires the use of three new Win32 methods. They are:
   1. RegisterWindowMessage
[http://windowssdk.msdn.microsoft.com/en-us/library/ms644947.aspx]
   2. SendMessageTimeout
[http://windowssdk.msdn.microsoft.com/en-us/library/ms644952.aspx]
   3. ObjectFromLresult
[http://windowssdk.msdn.microsoft.com/en-us/library/ms697301.aspx]
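
For reference, here is a sketch of how these three functions might be
declared for P/Invoke so that they match the calls made in
GetIEDocumentFromWindowHandle. This is my reconstruction from the call
sites, so treat the parameter types as assumptions; the Win32 class in
the downloadable sample may declare them differently.

```csharp
using System;
using System.Runtime.InteropServices;

// Assumed shape of the Win32 helper class, inferred from how the
// ScreenScraper code calls it.
public static class Win32
{
    [Flags]
    public enum SendMessageTimeoutFlags : uint
    {
        SMTO_NORMAL = 0x0000,
        SMTO_BLOCK = 0x0001,
        SMTO_ABORTIFHUNG = 0x0002,
        SMTO_NOTIMEOUTIFNOTHUNG = 0x0008
    }

    // Returns a message ID that is unique system-wide for the string
    [DllImport("user32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    public static extern uint RegisterWindowMessage(string lpString);

    // Sends a message and gives up after uTimeout milliseconds
    [DllImport("user32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    public static extern IntPtr SendMessageTimeout(
        IntPtr hWnd, uint Msg, UIntPtr wParam, UIntPtr lParam,
        SendMessageTimeoutFlags fuFlags, uint uTimeout,
        out UIntPtr lpdwResult);

    // PreserveSig = false turns a failing HRESULT into an exception
    // and exposes the final out parameter as the return value
    [DllImport("oleacc.dll", PreserveSig = false)]
    [return: MarshalAs(UnmanagedType.Interface)]
    public static extern object ObjectFromLresult(
        UIntPtr lResult,
        [MarshalAs(UnmanagedType.LPStruct)] Guid refiid,
        IntPtr wParam);
}
```

Because ObjectFromLresult is declared with PreserveSig = false, a
failure inside it surfaces as a COMException rather than a null
return, so callers may want to wrap the call in a try/catch.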

Related Articles:
=================
   - Protect your IM (Instant Messenger) conversations by encrypting them
     [http://www.codeproject.com/csharp/imencryptor.asp]

   - Retrieving Conversations from Yahoo Messenger
     [http://www.codeproject.com/cpp/yahoochattext.asp] 


---------------------------------------------------------------------------

Hope this helps!
If you need any clarifications, just ask.

Regards,
Theta-ga
:)




======================================================================
Google Search Terms Used:

WM_HTML_GETOBJECT c#
WM_HTML_GETOBJECT ObjectFromLresult c#
