B31 vs. B32


Postby sctell » Mon Apr 23, 2018 8:06 am

I have for years been using a wrapper of FMDB (SQLite) along with ListMaster to organise and retrieve data.

With B32, loading and displaying the data takes 923ms.

With B31 it takes 375ms.

Has something changed between these two releases to account for this?

Using High Sierra 10.13.4


Regards

Terry
sctell
 
Posts: 1150
Joined: Sun Jul 06, 2008 10:41 am

Re: B31 vs. B32

Postby sctell » Mon Apr 23, 2018 9:19 am

I have done a little investigation and the culprit is neither ListMaster nor the FMDB external. It seems to be this code:


Code:
function formatArrayData tArray
    -- tArray holds tab-delimited records, one per line
    put the milliseconds into tStart

    set the itemDelimiter to tab
    put "0.00" into tBalance
    set the numberFormat to "0.00"
    repeat for each line tRec in tArray
        -- keep a running balance in item 5
        add item 4 of tRec to tBalance
        put tBalance into item 5 of tRec
        convert item 1 of tRec to short date
        convert item 4 of tRec to currency
        convert item 5 of tRec to currency
        put tRec & return after tCellData
    end repeat
    delete last char of tCellData -- drop the trailing return

    put the milliseconds - tStart -- elapsed time in ms

    return tCellData

end formatArrayData


In B32 this takes 850ms
In B31 it takes 250ms

What could be the reason for this?

All the best

Terry

Re: B31 vs. B32

Postby codegreen » Mon Apr 23, 2018 12:26 pm

If you send me a sample project and data I can profile it and tell you in more detail, but long story short the convert command has been largely rewritten in Cocoa in b32 to accommodate a wider range of localizations and to sidestep some nasty old and new bugs in the related Carbon frameworks.

So basically (as with all things Unicode) it's doing a lot more work under the hood to accomplish almost the same thing as before. Also as with much object-encapsulated data we lose locality of reference (and thus spend a lot more CPU cycles waiting on cache lines and less doing useful work).

All that said I'm surprised your script works at all, and would be even more so if it turned out to actually be doing what you think it does. THOU SHALT NOT MODIFY FOR-EACH PROXY VARIABLES!!! They are just pointers into the parent container (NOT copies) so any operation that modifies their value (ESPECIALLY if it changes their nominal length or internal data type) will likely have unintended (and possibly fatal) side-effects.

Just sayin'...

;)
-Mark
codegreen
 
Posts: 1556
Joined: Mon Jul 14, 2008 11:03 pm

Re: B31 vs. B32

Postby sctell » Mon Apr 23, 2018 1:01 pm

codegreen wrote:All that said I'm surprised your script works at all, and would be even more so if it turned out to actually be doing what you think it does. THOU SHALT NOT MODIFY FOR-EACH PROXY VARIABLES!!! They are just pointers into the parent container (NOT copies) so any operation that modifies their value (ESPECIALLY if it changes their nominal length or internal data type) will likely have unintended (and possibly fatal) side-effects.


I am a little surprised by this, as it has never failed for me. The help guide does not really emphasise this; all it says is:

The difference is that with the for each loop SuperCard keeps one master copy of container and walks a pointer through it to extract the chunks (rather than carving the chunk from a fresh copy of whatever is currently in container for every iteration) which for large data sets makes the operation dramatically faster.


"Extract the chunks" suggests to me that each chunk is extracted into tRec in my script.

Anyway, as the author I am sure you know best, so should I copy tRec into another local variable first?

Perhaps a huge note in the help guide could emphasise this with an example.

codegreen wrote:If you send me a sample project and data I can profile it and tell you in more detail, but long story short the convert command has been largely rewritten in Cocoa in b32 to accommodate a wider range of localizations and to sidestep some nasty old and new bugs in the related Carbon frameworks.


Probably not necessary: as soon as I omitted the convert command the speed increased.

It seems the way forward is to write one's own conversion functions wherever there is a large repeat loop (in this case 5000 lines).

I had my own originally before the Convert command came along so I will revert to those.

Thanks

Terry

Re: B31 vs. B32

Postby codegreen » Mon Apr 23, 2018 2:34 pm

LOL! Looking back over the code I see you're right about how for each proxies currently work.

IIRC they didn't start out that way! Eventually though I reluctantly admitted the performance gains from the non-buffered approach weren't worth leaving an open manhole here for unwary scripters to fall into.

Never mind...

-Mark

Re: B31 vs. B32

Postby sctell » Tue Apr 24, 2018 12:28 am

codegreen wrote:If you send me a sample project and data I can profile it and tell you in more detail, but long story short the convert command has been largely rewritten in Cocoa in b32 to accommodate a wider range of localizations and to sidestep some nasty old and new bugs in the related Carbon frameworks.


Mark,

I will send you a sample project if you don't mind, as I have tried my routine and "set the numberFormat" also seems to be afflicted. I suppose you have integrated Convert and NumberFormat.

I also tried the numberFormat external but had some display issues with it. Perhaps it needs updating, though I suppose that will also be integrated.

All the best

Terry

Re: B31 vs. B32

Postby sctell » Tue Apr 24, 2018 1:57 am

Mark,

I have sent the project directly to you but have also posted it here for others to look at.


http://forums.supercard.us/viewtopic.php?f=17&t=2986


All the best

Terry

Re: B31 vs. B32

Postby codegreen » Tue Apr 24, 2018 12:35 pm

As I suspected...

Thanks to data encapsulation (and concomitant reduced locality of reference) convert now spends 75+% of its time just waiting around for out-of-cache memory accesses.

Science!

:(
-Mark

Re: B31 vs. B32

Postby sctell » Tue Apr 24, 2018 10:46 pm

codegreen wrote:As I suspected...

Thanks to data encapsulation (and concomitant reduced locality of reference) convert now spends 75+% of its time just waiting around for out-of-cache memory accesses.



Is that the end then, are we stuck with underperformance?


All the best

Terry

Re: B31 vs. B32

Postby codegreen » Wed Apr 25, 2018 12:51 am

sctell wrote:Is that the end then, are we stuck with underperformance?

Yup, I'm afraid that's all she wrote.

I'm sure you already know that today's personal computers have a multi-tiered memory architecture, and you can perform a truly astonishing amount of work on data that's in a modern CPU's top-level cache in the same time it takes just to fetch other data from main memory.

The Carbon conversion APIs mostly passed around pointers to simple cache-friendly flat C structs. The Cocoa version instead uses more modern (and thus opaque) types which don't play nice with caches thanks to the layer(s) of indirection/unpredictability added by encapsulation.

So both implementations spend the vast majority of their time in Apple's code, and basically perform the same logical operations AFAIK as efficiently as possible via their respective frameworks. But where the Carbon conversion APIs pinned the thread's CPU usage, thanks to impaired locality of reference the Cocoa version's usage rarely rises above about 20% (and this is after weeks of careful hand-tuning). Xcode's normally helpful feedback-directed optimization actually makes things even worse...
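
To make the locality-of-reference point concrete, here is a minimal sketch in plain C. It is only a toy model: `FlatRec`, `BoxedRec`, and the helper names are invented for illustration and say nothing about how Carbon or Cocoa actually lay out their data. Both functions compute the same sum; the flat version reads one contiguous block, while the boxed version has to follow pointers to separately allocated values, which is the kind of access pattern that tends to miss the cache.

```c
#include <stdlib.h>

/* Flat, cache-friendly record: the fields sit side by side in one array. */
typedef struct { double amount; long date; } FlatRec;

/* "Encapsulated" record (hypothetical): each value lives behind its own
   heap pointer, so reading it costs an extra dependent memory access. */
typedef struct { double *amount; long *date; } BoxedRec;

/* Sum the amounts of n flat records with sequential reads. */
double sum_flat(const FlatRec *recs, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += recs[i].amount;
    return total;
}

/* Sum the amounts of n boxed records: record pointer, then value pointer. */
double sum_boxed(BoxedRec *const *recs, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += *recs[i]->amount;
    return total;
}

/* Release a boxed record and the values it points to. */
void box_free(BoxedRec *b)
{
    if (b != NULL) {
        free(b->amount);
        free(b->date);
        free(b);
    }
}

/* Allocate a boxed record; returns NULL if any allocation fails. */
BoxedRec *box_new(double amount, long date)
{
    BoxedRec *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->amount = malloc(sizeof *b->amount);
    b->date = malloc(sizeof *b->date);
    if (b->amount == NULL || b->date == NULL) {
        box_free(b);
        return NULL;
    }
    *b->amount = amount;
    *b->date = date;
    return b;
}
```

Timing the two functions over a few hundred thousand records is one way to see how much the indirection costs on a given machine; the sketch makes no claim about the size of the gap.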

FWIW there's tons of stuff Cocoa does (often MUCH) faster than Carbon (usually tasks that can be parallelized, then ideally farmed out to the video card) but it mostly doesn't play nice with Carbon and so can't be used in SC4.x.

Between extra Unicode overhead and the locality of reference issue though, single-threaded iterative tasks like chunk parsing and string<->date/number conversion are going to end up being (often MUCH) slower than when based on their (admittedly sometimes less capable/reliable) Carbon equivalents.

And then of course everything now has to be round-tripped MacRoman->UTF8->MacRoman too (at least THAT will eventually go away, but normally it's not the expensive part of the operation).

So basically no free lunch...

-Mark

Re: B31 vs. B32

Postby sctell » Wed Apr 25, 2018 3:18 am

Just thinking (and perhaps not too clearly).

1. Would a Cocoa-based external just to format a supplied text string be quicker than going through SC?

2. Would a format function (built into SC or not) that just works on a supplied SC string rather than using Cocoa number frameworks be quicker?

I was thinking along the lines of something like the existing

get formatNum(<number>[, <formatString>[, <roundDirection>]])

that just works with a supplied SC string.

I did try the existing external but it seemed to not work with 4.8 b32?


All the best

Terry

Re: B31 vs. B32

Postby sctell » Thu Apr 26, 2018 4:38 am

It would also speed things up (not sure by how much) if you could pass an array into the convert command.

I just tried something similar in my FMDB external.

First I did it the straightforward way, passing a single string in, with this code:

Code:
-(void)formatAsCurrency:(SCParamBlock*)pBlock
{
    // The string to format is read from index 2 of the param block.
    NSString *tString = [pBlock parameterAtIndex: 2];

    // A fresh formatter is allocated on every call.
    NSNumberFormatter *formatter = [[NSNumberFormatter alloc] init];

    // Parse the incoming text as a plain decimal number...
    [formatter setNumberStyle:NSNumberFormatterDecimalStyle];
    NSNumber *myNumber = [formatter numberFromString:tString];

    // ...then switch the same formatter to currency style for output.
    [formatter setNumberStyle:NSNumberFormatterCurrencyStyle];
    NSString *tStrValue = [formatter stringFromNumber:myNumber];

    [pBlock setReturnValue:tStrValue];

    [formatter release];
}


This took ages to parse 10000 numbers: approx. 2 seconds.

Then I parsed all the numbers within the external; this took approx. 500ms.

Clearly your code in SC is more efficient than anything I could do, so I thought Convert could be given the option to parse a whole range of numbers separated by a delimiter of some kind.

Is this feasible?

All the best

Terry

Re: B31 vs. B32

Postby codegreen » Thu Apr 26, 2018 7:50 pm

I'm sure that if you tried you could slap together even a Cocoa external that would run circles around SC's internal implementation of this feature, especially if you pulled the entire loop inside.

SC is forced to assume that things like the numberFormat and locale may change between invocations of convert, so it has to mindlessly reset various properties of even its cached formatters every time (which is surprisingly expensive in Cocoa). So if you just assumed they wouldn't change within your loop and reused a single unmodified formatter across it, you could probably shave off roughly 50% right there. It's also likely that those classes in turn cache state info that they'll run faster with if it's left undisturbed.
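
As a rough illustration of pulling the whole loop inside an external, the sketch below walks a complete tab-delimited buffer in a single call instead of being invoked once per value. It is plain C with no Cocoa formatter involved, and everything in it is an assumption made for the example: the names `field_start` and `append_running_balance`, the layout (a plain decimal amount in field 4), and the choice to append the running balance as an extra field rather than overwrite an existing one.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the start of the n-th (1-based) tab-delimited field within
   [line, end), or NULL if the line has fewer than n fields. */
static const char *field_start(const char *line, const char *end, int n)
{
    const char *p = line;
    while (--n > 0) {
        p = memchr(p, '\t', (size_t)(end - p));
        if (p == NULL) return NULL;
        p++;                                  /* step past the tab */
    }
    return p;
}

/* Process every line of data in one call. Each output line repeats the
   input line and appends the running total of field 4 as a new last field.
   eol is the line delimiter. The caller frees the returned string. */
char *append_running_balance(const char *data, char eol)
{
    size_t len = strlen(data), lines = 1;
    for (size_t i = 0; i < len; i++)
        if (data[i] == eol) lines++;

    /* Input size plus up to 32 bytes of balance text per line. */
    char *out = malloc(len + lines * 32 + 1);
    if (out == NULL) return NULL;

    char *w = out;
    double balance = 0.0;
    const char *line = data, *stop = data + len;
    for (;;) {
        const char *end = memchr(line, eol, (size_t)(stop - line));
        if (end == NULL) end = stop;

        const char *amt = field_start(line, end, 4);
        if (amt != NULL) {
            /* Copy the field so strtod cannot read past its end. */
            const char *fend = memchr(amt, '\t', (size_t)(end - amt));
            size_t flen = (size_t)((fend ? fend : end) - amt);
            char tmp[32];
            if (flen >= sizeof tmp) flen = sizeof tmp - 1;
            memcpy(tmp, amt, flen);
            tmp[flen] = '\0';
            balance += strtod(tmp, NULL);
        }

        memcpy(w, line, (size_t)(end - line));
        w += end - line;
        int k = snprintf(w, 32, "\t%.2f", balance);
        if (k > 31) k = 31;                   /* output was truncated */
        w += k;

        if (end == stop) break;
        *w++ = eol;
        line = end + 1;
    }
    *w = '\0';
    return out;
}
```

The point of the design is that any per-call setup cost is paid once for the whole data set rather than once per record; a real external would replace the `snprintf` call with whatever formatter it reuses across the loop.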

FWIW I picked over all this code again and was able to steal time from elsewhere in convert and put, so now your example is only about 40-50% slower than b31. But the underlying framework code still seems to be spending only a relatively small percentage of its time actually doing any useful work, and there's not much I can do about that...

-Mark

Re: B31 vs. B32

Postby sctell » Sun Apr 29, 2018 10:42 am

codegreen wrote:SC is forced to assume that things like the numberFormat and locale may change between invocations of convert, so it has to mindlessly reset various properties of even its cached formatters every time (which is surprisingly expensive in Cocoa). So if you just assumed they wouldn't change within your loop and reused a single unmodified formatter across it, you could probably shave off roughly 50% right there. It's also likely that those classes in turn cache state info that they'll run faster with if it's left undisturbed.


Mark,

I have created an external.

I load in the full data array, tab and return delimited.

I create a global NSNumberFormatter, apply it to the data array, then return the whole data array.

So far this has reduced the time to 420ms (you were not far off with 50%).

I thought about NSString stringWithFormat, as I am sure it is quicker than NSNumberFormatter, but thousands separators seem to be a problem when formatting the number.
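
One possible way around the thousands-separator problem is to insert the separators by hand. The sketch below is plain C and rests on several assumptions: the amount has already been converted to integer cents, the separator is a hard-coded comma with a full stop as the decimal point (so it is not locale-aware), and no currency symbol is added. The function name `format_cents` is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Format an amount held as integer cents with comma grouping and two
   decimals. Returns 0 on success, -1 if buf is too small. */
int format_cents(long long cents, char *buf, size_t bufsz)
{
    char digits[32];
    char out[48];
    /* Work on the magnitude so the sign can be written separately. */
    unsigned long long mag = cents < 0 ? 0ULL - (unsigned long long)cents
                                       : (unsigned long long)cents;

    /* Whole units as plain digits, no separators yet. */
    int n = snprintf(digits, sizeof digits, "%llu", mag / 100);

    size_t pos = 0;
    if (cents < 0) out[pos++] = '-';
    for (int i = 0; i < n; i++) {
        /* A comma goes before each group of three digits, counted
           from the right-hand end of the whole-unit part. */
        if (i > 0 && (n - i) % 3 == 0) out[pos++] = ',';
        out[pos++] = digits[i];
    }

    /* Two-digit fraction. */
    pos += (size_t)snprintf(out + pos, sizeof out - pos, ".%02llu", mag % 100);

    if (pos + 1 > bufsz) return -1;
    memcpy(buf, out, pos + 1);
    return 0;
}
```

Working in integer cents avoids floating-point rounding in the grouping step; whether the extra conversion pays off against a reused NSNumberFormatter would need measuring.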

However that leaves one more area to address. DATES.

I currently store the date as SC seconds. How would you recommend I deal with these?

I have thought about changing the date format stored in the SQLite DB to YYYYMMDD. This is a simple script to change all my records.

Would this be more efficient than messing about with SC/Cocoa dates?
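
For comparison, converting a stored seconds value to YYYYMMDD can be done with integer arithmetic alone, without any date framework. The sketch below is plain C and assumes that SC seconds count from midnight on 1 January 1904 (the classic Mac OS epoch); if SuperCard uses a different origin, the constant has to change. It ignores time zones and uses a well-known days-to-civil-date calculation for the proleptic Gregorian calendar. The function name `sc_seconds_to_yyyymmdd` is hypothetical.

```c
#include <stdio.h>

#define SECS_PER_DAY 86400ULL
/* Days from 0000-03-01 (the algorithm's internal origin) to 1904-01-01.
   ASSUMPTION: the stored seconds start at 1904-01-01 00:00:00. */
#define DAYS_TO_1904 695361LL

/* Write the date part of sc_secs into out as "YYYYMMDD" plus a NUL. */
void sc_seconds_to_yyyymmdd(unsigned long long sc_secs, char out[9])
{
    long long z   = (long long)(sc_secs / SECS_PER_DAY) + DAYS_TO_1904;
    long long era = z / 146097;                 /* whole 400-year cycles */
    long long doe = z - era * 146097;           /* day of era, 0..146096 */
    long long yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    long long doy = doe - (365 * yoe + yoe / 4 - yoe / 100); /* from 1 March */
    long long mp  = (5 * doy + 2) / 153;        /* month index, March = 0 */
    int day   = (int)(doy - (153 * mp + 2) / 5 + 1);
    int month = (int)(mp < 10 ? mp + 3 : mp - 9);
    long long year = yoe + era * 400 + (month <= 2 ? 1 : 0);

    snprintf(out, 9, "%04lld%02d%02d", year, month, day);
}
```

Storing YYYYMMDD directly in the database would remove even this step at display time, at the cost of a one-off migration of the existing records.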

All the best

Terry

Re: B31 vs. B32

Postby sctell » Mon Apr 30, 2018 10:35 pm

sctell wrote:Would this be more efficient than messing about with SC/Cocoa dates?


I have now completed the external using NSDate. The final speed test when compared to the first post in this thread:

B31 using Convert 375ms

B32 using Convert 923ms

B32 using External 414ms (importing all data with loop in external)

B32 using External on individual numbers/dates > 1000ms (must be the overhead of repeated calls to the external).

Conclusion: work with the external until performance improvements to "Convert/Put" are seen. Perhaps the external will be quicker still once it sees the "Put" improvements.

All the best

Terry

