NetTalk Central

NetTalk Web Server => Web Server - Ask For Help => Topic started by: stephen j ryan on September 27, 2013, 11:41:02 AM

Title: best approach to extracting data from a page
Post by: stephen j ryan on September 27, 2013, 11:41:02 AM
Hi everyone,

I am using NetTalk to download web pages.

Those pages contain data laid out in tables (rows and columns).

Is there a way to access these rows and columns using StringTheory, or any other approach?

We have a big, bulky parser, but it's not ideal for this type of job.

Many thanks,
Steve
Title: Re: best approach to extracting data from a page
Post by: Rene Simons on October 01, 2013, 09:13:08 AM
Hi Stephen,

StringTheory is probably the best option here.

Cheers,
Rene
Title: Re: best approach to extracting data from a page
Post by: Bruce on October 02, 2013, 07:13:03 AM
Yeah, I've done this with StringTheory on some pages, splitting first on <tr> and then on <td> and so on,
then removing all the remaining tags using the Replace method.

cheers
Bruce
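
For readers following along, the split-on-<tr>, then split-on-<td>, then strip-tags idea Bruce describes can be sketched roughly as below. This is Python, not Clarion, and it does not use the StringTheory API; the function name extract_rows and the sample page are made up for illustration. It assumes simple, non-nested tables and would need adjusting for messier markup.

```python
import html
import re


def extract_rows(page):
    """Return a list of rows, each a list of cell strings (hypothetical helper)."""
    rows = []
    # Step 1: split on opening <tr ...> tags. The first chunk is whatever
    # precedes the first row, so it is skipped.
    for row_chunk in re.split(r"<tr\b[^>]*>", page, flags=re.IGNORECASE)[1:]:
        # Keep only the part before the row's closing tag.
        row_chunk = re.split(r"</tr\s*>", row_chunk, flags=re.IGNORECASE)[0]
        # Step 2: split the row on opening <td ...> or <th ...> tags.
        cell_chunks = re.split(r"<t[dh]\b[^>]*>", row_chunk, flags=re.IGNORECASE)[1:]
        cells = []
        for cell in cell_chunks:
            # Step 3: remove any remaining tags, then decode entities like &amp;.
            text = re.sub(r"<[^>]+>", "", cell)
            cells.append(html.unescape(text).strip())
        if cells:
            rows.append(cells)
    return rows


if __name__ == "__main__":
    sample = (
        "<table><tr><th>Name</th><th>Qty</th></tr>"
        "<tr><td class='x'>Apples &amp; pears</td><td><b>3</b></td></tr></table>"
    )
    for row in extract_rows(sample):
        print(row)
```

The same three steps should map onto whatever split and replace calls your string library offers; the regular expressions here are only there to cope with tag attributes and mixed case.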
Title: Re: best approach to extracting data from a page
Post by: rjolda on October 13, 2013, 07:02:38 AM
Hi Steve,
We use XFiles to manage table data from some web pages with a known structure.
Works like a charm! It is then easy to manipulate the XFiles data!
FWIW,
Ron Jolda