So I’ve been handed a sample of data to see if EasyMorph will handle it. If it will, then it may result in a licence being bought by my client.
Here is a small sample of the 2 million or so rows in the file:
My task is to boil it all down into individual rows, so for example:
Will become
Finance Department,a,b,c,d,e
Aberdeen City Council,a,b,c,d,e
Town House,a,b,c,d,e
Only the first column is to be translated; the rest of the columns are just to be filled down.
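To pin down what I mean by "filled down", here is a rough Python sketch (not EasyMorph, just a way of stating the logic). The function name `fill_down` is made up for illustration, and it assumes the first column has already been expanded into a list of values:

```python
def fill_down(first_column_values, other_columns):
    """Emit one row per expanded first-column value, repeating the other columns."""
    return [[value, *other_columns] for value in first_column_values]


# The three output rows shown above, rebuilt from their parts.
rows = fill_down(
    ["Finance Department", "Aberdeen City Council", "Town House"],
    ["a", "b", "c", "d", "e"],
)
for row in rows:
    print(",".join(row))
```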
As a general rule of thumb, any semicolons can be split without consideration; I’ve already worked out how to do that part.
Taking this line as an example:
Splitting on the semicolon gives me the following:
Next potentially comes splitting on the ‘/’ symbol, but this is where it gets complicated.
We can split on a ‘/’ UNLESS it is the last one on the line AND it is followed by 1 or more digits (possibly with other chars after the digits). In other words:
name1/name2/name3
Should result in
name1
name2
name3
but
name1/name2/name3/55
should result in
name1
name2
name3/55
and
name1/55
should remain unchanged
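In case it helps to see the rule written out, here is a minimal Python sketch of how I read it. The name `split_on_slash` is hypothetical, and it assumes there is no whitespace between the last ‘/’ and the digits that follow it:

```python
import re

# A final segment is treated as "numeric" when it starts with one or more
# digits, optionally followed by other characters (e.g. "55" or "55a").
TRAILING_NUMBER = re.compile(r"\d+.*")


def split_on_slash(text):
    """Split on '/', but keep a numeric final segment attached to the one before it."""
    parts = text.split("/")
    if len(parts) > 1 and TRAILING_NUMBER.fullmatch(parts[-1]):
        # Re-join the last two segments so that "name3/55" stays together.
        parts[-2:] = [parts[-2] + "/" + parts[-1]]
    return parts


for sample in ["name1/name2/name3", "name1/name2/name3/55", "name1/55"]:
    print(split_on_slash(sample))
```

The idea is to split everything first and then undo the last split when the tail looks like a number, rather than trying to express the whole rule in one pattern.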
The next to consider are number ranges.
1-10
on a line on its own should result in
1
2
3
4
5
6
7
8
9
10
Whereas multiple groups, e.g.:
1-5,2-9,8-12
Should result in
1
2
3
4
5
2
3
4
5
6
7
8
9
8
9
10
11
12
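The range expansion, as a rough Python sketch of the behaviour I'm after. `expand_ranges` is a made-up name; treating a bare number such as `7` as a one-item range is my own assumption, and overlapping groups are deliberately not de-duplicated, to match the output above:

```python
def expand_ranges(text):
    """Expand comma-separated groups such as "1-5,2-9" into individual numbers."""
    numbers = []
    for group in text.split(","):
        start, _, end = group.strip().partition("-")
        first = int(start)
        # Assumption: a group with no "-" is a single number.
        last = int(end) if end else first
        numbers.extend(range(first, last + 1))
    return numbers


print(expand_ranges("1-10"))
print(expand_ranges("1-5,2-9,8-12"))
```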
Meanwhile, if there is a prefix and trailing ‘/’ then they shall be split and recombined, so for example:
Flat 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 /16
Should result in:
16 : Flat 1
16 : Flat 2
16 : Flat 3
16 : Flat 4
16 : Flat 5
16 : Flat 6
16 : Flat 7
16 : Flat 8
16 : Flat 9
16 : Flat 10
16 : Flat 11
16 : Flat 12
16 : Flat 13
16 : Flat 14
16 : Flat 15
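Again as a Python sketch only, this is roughly the split-and-recombine step. The name `recombine` and the pattern are my own guesses at a workable rule: it assumes the prefix is separated from the first number by whitespace (which keeps something like `name1/55` untouched) and that the items are separated by commas:

```python
import re

# prefix: text with no digits or slashes; items: comma-separated list starting
# with a digit; tail: whatever follows the single '/'.
PREFIXED_LIST = re.compile(
    r"^(?P<prefix>[^\d/]+?)\s+(?P<items>\d[^/]*?)\s*/\s*(?P<tail>\S.*)$"
)


def recombine(text):
    """Turn "Flat 1, 2 /16" into ["16 : Flat 1", "16 : Flat 2"]; otherwise return the text as-is."""
    match = PREFIXED_LIST.match(text)
    if match is None:
        return [text]
    prefix = match["prefix"]
    tail = match["tail"]
    items = [item.strip() for item in match["items"].split(",")]
    return [f"{tail} : {prefix} {item}" for item in items]


line = "Flat 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 /16"
for entry in recombine(line):
    print(entry)
```

Lines that do not fit the pattern come back unchanged, so this step can be run over every row without first checking whether it applies.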
Do you think that EasyMorph is up to the challenge for this file? The aim at the end of the day is to split the compact address range rows back up into individual entities, so that my client can then move the X and Y co-ordinates against each line to line up more closely with specific buildings on a map, rather than everything just sitting within a radius of the postcode centre.