Tutorials & Examples Web-help Blog

Percentile function


#1

It would be useful to be able to have a percentile function: (https://support.office.com/en-us/article/PERCENTILE-function-91b43a53-543c-4708-93de-d626debdddca) and be able to call percentile([Column], 0.25), etc. It would be even more useful to be able to calculate across different groups of data.

I have a large SQL routine that is getting the 1st, median, 3rd percentiles as an aggregate. I can almost replace the entire routine with EasyMorph if it had percentile support

I can try to recreate the formula using pure EasyMorph but it would be preferable to have a function for it or support calling external programs with STDIN/STDOUT without requiring an intermediate CSV file – Run Program - option to supply STDIN and STDOUT format

Sidebar: is any form of scripting on the roadmap?


#2

A percentile transformation is long due – it first appeared on our roadmap in 2015 :flushed:. Will bump it up into release 3.8. Grouping is a good suggestion – duly noted.

I keep thinking about this suggestion. The Input transformation would be a natural fit for reading from STDIN when no calling project present. Or maybe it should be a separate transformation. Writing to STDOUT could be a destination in the Export Text transformation (instead of a file). No clear vision on this yet.

I suppose at some point we’ll reach a point where Python or R integration would be needed but that’s not on the roadmap yet.

There are ideas about a Powershell transformation for PS commands/scripts. Will probably get implemented before the end of this year – Powershell turned out to be a good companion for EM.


#3

Attached is a EM project that can be used to calculate percentiles. Sharing it here for others in the interim. Glad to hear it’s making the 3.8 roadmap

Percentile.morph (8.9 KB)

Regarding the STDIN/STDOUT, I think the “Call Program” transformation could be used with a checkbox to supply the table a STDIN

Something like this:

It’s possible to work with the output already in EM (albeit not that smooth). This would seem to be the smallest change to support the need (eliminate intermediate files)


#4

Is the percentile transformation available? Maybe, I just don’t see it.

Thanks.


#5

Not exactly as the Excel percentile but the “Keep min/max” action in the percentile mode filters values that are above or below the specified percentile.

UPDATE
That’s not correct. I misunderstood the meaning of percentile.


#6

Thanks for this very fast reply. I try to analyze data and calculate every values difference to the median of its group. As of now, I don’t see how to do it… but I will give it another try tomorrow with a fresh mind.


#7

Here is a sample project that generates sample data in groups and then calculates the difference to the median of each group:
diff-from-median.morph (7.8 KB)


#8

Thank you very much. Your fast and profound reply is highly appreciated.