NetTalk Central

Topic: Problem fetching a Web Page with WebClient

Alberto

  • Hero Member
  • *****
  • Posts: 1849
    • MSN Messenger - alberto-michelis@hotmail.com
    • View Profile
    • ARMi software solutions
    • Email
Problem fetching a Web Page with WebClient
« on: November 25, 2009, 11:42:44 AM »
Hi,
When you fetch a page with the Return Text Only option checked, in some cases the
columns of a table are joined onto a single line.
Example of how to reproduce it:

Run the Webdemo, choose Web Client, and enter:
http://www.ravaonline.com/v2/precios/panel.php?m=DOW30
Check Return Text Only and click on Go Fetch.

Search the text for
Intel Corporation
The price, which should be on the line below the name like all the others, is
instead to the right of the name on the same line.

This happens in only one page/row; maybe it could be fixed.

I'm using the Text Only mode to parse the table columns into a queue.

Thanks
-----------
Regards
Alberto

Bruce

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 11194
    • View Profile
Re: Problem fetching a Web Page with WebClient
« Reply #1 on: November 25, 2009, 10:38:37 PM »
I'm not sure it can be changed. At least not by me. You could override the method there if you wanted to.
The method involved is _StripOutWhiteSpace in the netwww.clw file. In your procedure you could add your own code to this method, before the parent call, and do a RETURN before the parent call.
You can start by Cut & Pasting the code from netwww into your procedure, and then editing it from there.

Actually you could start by just adding a RETURN before the parent call. It's possible you don't need this step at all with the text you have.

First a short explanation of what's happening. If you look at the web page you're fetching (with html markup) you'll see the line
<td align="left">Intel Corporation  </td>
Notice the spaces after the name. (This appears to be the only company with trailing spaces).

The current _StripOutWhiteSpace method has code which removes the CRLF if it follows a space. In other words, lines that have an "artificial" line ending (where the artificiality is determined by the trailing space) are concatenated.

So by deriving your own flavor of the method (as described above) you can change this "rule".

If I change it then it changes for everyone, and there may be some who _want_ the rule.
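
To make the "rule" concrete, here is a rough sketch in Python (not Clarion, and not the actual netwww.clw code). The function names and the sample prices are made up for illustration, and the sketch assumes the rule amounts to "drop a CRLF when the character before it is a space"; the real method may do more than that.

```python
def join_soft_breaks(text: str) -> str:
    """Assumed current rule: a CRLF that follows a space is removed,
    so the next line is concatenated onto the current one."""
    return text.replace(" \r\n", " ")


def keep_all_breaks(text: str) -> str:
    """One possible alternative rule: trim trailing spaces from each
    line but keep every CRLF, so each table cell stays on its own line."""
    return "\r\n".join(line.rstrip(" ") for line in text.split("\r\n"))


# Hypothetical text-only output: name on one line, price on the next.
# The prices are placeholders, not real data from the page.
sample = "Intel Corporation  \r\n11.11\r\nJohnson & Johnson\r\n22.22\r\n"

joined = join_soft_breaks(sample)   # Intel's price ends up beside the name
separate = keep_all_breaks(sample)  # Intel's price stays on its own line
```

With the first function only the Intel row is affected, because it is the only name followed by spaces. The second function is the kind of change a derived method could make if the concatenation is not wanted.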

cheers
Bruce


Alberto

Re: Problem fetching a Web Page with WebClient
« Reply #2 on: November 26, 2009, 02:37:36 AM »
Bruce,
Thank you very much, a crystal clear explanation, but...
After adding a RETURN to the method, before the parent call, the procedure freezes.
Any other idea?
Thanks
-----------
Regards
Alberto

Bruce

Re: Problem fetching a Web Page with WebClient
« Reply #3 on: November 26, 2009, 06:00:04 AM »
You'll need to debug. I have no idea why it's freezing.

cheers
Bruce