Parsing Data

Post by JoeKoomen2011 » Mon Mar 13, 2017 9:20 am

I am extracting data from thousands of HTML documents.

Unfortunately the data was not entered consistently. I have extracted the line with the data I need and turned breaks <BR> into tabs so I can step through the data item by item.

The last 6 items in each line follow consistent rules so I can extract them.
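For illustration, the preparation described so far might look like the sketch below. It is written in Python rather than SuperTalk, and the name `split_line` and the sample line are hypothetical; the only things taken from the post are the <BR>-to-tab conversion and the fact that the last 6 items are predictable.

```python
import re

# Matches <BR>, <br>, <br/> and <br /> (assumed variants; the post only mentions <BR>).
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)

def split_line(line, tail_count=6):
    """Turn <BR> breaks into tabs, then separate the predictable tail items
    from the unpredictable leading items."""
    tabbed = BR_TAG.sub("\t", line)
    items = [item.strip() for item in tabbed.split("\t")]
    leading = items[:-tail_count]   # unknown number of items (may be empty)
    tail = items[-tail_count:]      # the last 6 items that follow consistent rules
    return leading, tail

# Hypothetical sample line: three leading items followed by six tail items.
leading, tail = split_line(
    "Tower A<BR>Suite 5<br />12 Main St<BR>a<BR>b<BR>c<BR>d<BR>e<BR>f"
)
print(leading)
print(tail)
```

Empty items are kept on purpose, so a blank field among the last 6 does not shift the split point.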

After that I have no idea how many items are left in the line (see sample below).

    building name — suite no. — street address
    building name — street address — suite no.
    building name — building name — street address
    suite no. — street address
    street address
    building name — building name — building name — street address — room no.

I can have from 1 to 6 items left that I need to sort into three more cells. Ideally I would extract the Suite/Unit/Room number into one column and the street address into a second, and combine whatever items are left into a single item/column, like below.

    building name — street address — suite no.
    building name — street address — suite no.
    building name, building name — street address —
    — street address — suite no.
    — street address —
    building name, building name, building name — street address — room no.

I've been having a hard time figuring out which item has the Suite info (compounded by three different terms being used) and how to extract it.

Any ideas?

Joe
Joe Koomen
>> Random! ...Damn near killed 'em! <<
JoeKoomen2011
 
Posts: 454
Joined: Thu Mar 12, 2009 1:38 pm

Re: Parsing Data

Post by JoeKoomen2011 » Mon Mar 13, 2017 10:20 am

A possibly better way to explain this is that I have an undetermined number of items, anywhere from 1 to 5.

I need to find the item that contains either Suite, Room or Unit.
Then find the item that begins with numbers (the street address).
Then combine the remaining items into a third item.
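The three rules above can be sketched as a single pass over the items. This is only an illustration in Python, not SuperTalk and not the solution that was eventually used; the function name `sort_items` and the sample values are hypothetical. It assumes the unit term appears as a whole word and that only the street address starts with a digit, so a building name such as "3M Center" would be misclassified.

```python
import re

# Whole-word match, so "Unity Plaza" or "Suites Hotel" is not mistaken for a unit.
UNIT_WORD = re.compile(r"\b(suite|room|unit)\b", re.IGNORECASE)
STARTS_WITH_DIGIT = re.compile(r"^\d")

def sort_items(items):
    """Sort 1..n leftover items into (building names, street address, unit)."""
    unit = ""
    street = ""
    rest = []
    for item in items:
        text = item.strip()
        if not text:
            continue                      # ignore empty items
        if not unit and UNIT_WORD.search(text):
            unit = text                   # first item naming Suite, Room or Unit
        elif not street and STARTS_WITH_DIGIT.match(text):
            street = text                 # first item that begins with a number
        else:
            rest.append(text)             # everything else is a building name
    return ", ".join(rest), street, unit

# Hypothetical sample values in the orders shown in the first post.
print(sort_items(["Tower A", "Suite 5", "12 Main St"]))
print(sort_items(["12 Main St"]))
print(sort_items(["Med Center", "West Wing", "Annex", "400 Oak Ave", "Room 12"]))
```

The unit test runs before the digit test so that an item like "Suite 200" can never be taken for the street address, whatever position it appears in.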

This is obviously too complicated for an if-then structure, and I haven't been able to work it out in a switch structure, so now I'm trying out less familiar options.

Anyone?

Joe

Re: Parsing Data

Post by JoeKoomen2011 » Mon Mar 13, 2017 4:28 pm

Problem solved.
