st's People

Contributors

andreum, gabeguz, nelson-ferraz, nferraz

st's Issues

perl: bad interpreter: No such file or directory

I got the following error when testing on a Mac:

$ st --sum numbers.txt
-bash: /usr/local/bin/st: perl: bad interpreter: No such file or directory

I changed the first line of st to:

#!/opt/local/bin/perl -T

which worked. Is there a universal way to reference perl (i.e. irrespective of perl's path)?
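
A common portable fix, assuming a perl binary is somewhere on $PATH, is to let env locate the interpreter rather than hard-coding its path:

#!/usr/bin/env perl

Note, though, that combining env with the -T switch fails on some systems; see the "shebang line does not work on ubuntu" issue below.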

Slow calculations

Hi,

thanks for a great tool - st is exactly what I need, except for the speed.

Up to 1000 lines of numbers in a file works ok:

$ time head -n 10 jmeter_saso.log | cut -d, -f2 | st
N min max sum mean stddev
9.00 7578.00 19843.00 132073.00 14674.78 3632.56

real 0m0.207s
user 0m0.170s
sys 0m0.010s

$ time head -n 100 jmeter_saso.log | cut -d, -f2 | st
N min max sum mean stddev
99.00 7578.00 35999.00 2372769.00 23967.36 5713.40

real 0m0.339s
user 0m0.300s
sys 0m0.020s

$ time head -n 1000 jmeter_saso.log | cut -d, -f2 | st
N min max sum mean stddev
999.00 80.00 38075.00 7644960.00 7652.61 10007.16

real 0m2.375s
user 0m2.280s
sys 0m0.030s

But at 10,000 lines it starts getting really slow:

$ time head -n 10000 jmeter_saso.log | cut -d, -f2 | st
N min max sum mean stddev
9999.00 40.00 38075.00 11624304.00 1162.55 3934.22

real 0m26.478s
user 0m24.600s
sys 0m0.070s

I don't know why it takes so long; perl itself can do the arithmetic pretty quickly:

$ time head -n 10000 jmeter_saso.log | cut -d, -f2 | perl -lne '$x += $_; END { print $x; }'
11624304

real 0m0.022s
user 0m0.010s
sys 0m0.000s

Even just the --sum takes 1000 times as long as perl:

$ time head -n 10000 jmeter_saso.log | cut -d, -f2 | st --sum
Invalid value 'elapsed' on input line 1
11624304.00

real 0m22.732s
user 0m22.520s
sys 0m0.020s

My files are 1,000,000 lines long, and the run has now used 22 CPU minutes without completing.
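
For reference, a single pass that keeps a running count, sum, min, max, mean and variance (Welford's update) does constant work per value. Below is a minimal sketch of that approach in plain Perl, shown only for comparison; it is not App::St's actual implementation:

#!/usr/bin/env perl
# Single-pass (Welford) statistics sketch -- for comparison only.
use strict;
use warnings;

my ($n, $sum, $mean, $m2) = (0, 0, 0, 0);
my ($min, $max);

while (my $line = <STDIN>) {
    chomp $line;
    next unless $line =~ /^-?\d+(?:\.\d+)?$/;    # skip non-numeric lines
    $n++;
    $sum += $line;
    $min = $line if !defined $min || $line < $min;
    $max = $line if !defined $max || $line > $max;
    my $delta = $line - $mean;
    $mean += $delta / $n;                        # running mean
    $m2   += $delta * ($line - $mean);           # running sum of squared deviations
}

die "no numeric input\n" unless $n;
my $stddev = $n > 1 ? sqrt($m2 / ($n - 1)) : 0;  # sample standard deviation
printf "%d %.2f %.2f %.2f %.2f %.2f\n", $n, $min, $max, $sum, $mean, $stddev;

Piping the same cut -d, -f2 output through a script like this (file name hypothetical, e.g. welford.pl) should stay close to the one-liner's runtime, which suggests the slowdown is not inherent to the statistics themselves.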

Same command as "simple terminal"

Hi there,

first thanks for your program.

But here the problem:

The command st is already used by the suckless project for their "simple terminal".

You may want to rename it to something longer, because if you don't, the maintainers of the Linux distros will do it for you.

It happened with chromium (the browser): there was already a game called chromium, so nowadays we have chromium-browser and chromium-bsu.

Maybe something like stt (for "statistics tool") or stati. I don't know, but I wanted to inform you about this problem. :)

An option to show just the numbers

I think it would be beneficial for everyone who uses st in one-liners or bash scripts to have an option that omits the header line describing the values, which is currently printed first.

wc, for instance, does this by default; I believe something like --n could work for switching to numbers-only output.

What do you think about that? My Perl knowledge is rather limited, but I'd like to try to implement such an easy feature.
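
A minimal sketch of what the output side could look like with such a switch; the option name (--no-header, aliased -n) and the assumption that the statistics are already collected in a hash are mine, and this is not App::St's actual code:

use strict;
use warnings;
use Getopt::Long;

# Hypothetical flag: suppress the descriptive header line.
GetOptions('no-header|n' => \my $no_header) or die "usage: ... [--no-header]\n";

# Pretend these were already computed by the stats code.
my %stats = (N => 3, min => 1, max => 100, sum => 111, mean => 37, stddev => 54.74);
my @cols  = qw(N min max sum mean stddev);

print join("\t", @cols), "\n" unless $no_header;
print join("\t", map { sprintf '%.2f', $stats{$_} } @cols), "\n";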

Thanks

"quantiles" option should be "quartiles", and only the 1st, 2nd and 3rd quartiles are meaningful

This may be a bit pedantic, but you have confused "quartiles" with "quantiles", and there are only 3 meaningful quartiles (the help text lists 0-4 as possible quartile values).

The 0th quartile is not really meaningful empirically (it would just be the lowest observed value), and the same can be said of the 4th quartile (the highest observed value).

In general, there are N-1 meaningful N-quantiles: the percentiles are the 100-quantiles, the quartiles are the 4-quantiles, and the median is the only 2-quantile.

How to install on centos?

Prepare by installing the prerequisites:
$ yum install perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-Test-Simple

then follow the steps in INSTALL.

Variance option

The variance command line option (-var) is not working, so I made this patch:

diff --git a/lib/App/St.pm b/lib/App/St.pm
index 622338d..cc1b57e 100644
--- a/lib/App/St.pm
+++ b/lib/App/St.pm
@@ -192,6 +192,7 @@ sub result {
         stderr     => $self->stderr(),
         min        => $self->min(),
         max        => $self->max(),
+        variance   => $self->variance(),
     );

     if ($self->{keep_data}) {

Group by column

If I have a multi-column file, I might want to calculate metrics separately for the "groups" of rows in the file. Imagine I have:

a 1
a 2
b 2
b 3

Ideally, I'd like statistics for the a rows and the b rows separately. This could be done either by indicating which column holds the value and using the remainder of the row (prefix and suffix) as the "group by" key, or by providing a list of columns to use as the key plus a column for the value. I'd prefer the former because it is easier to specify, but for large files where only some of the data is relevant, the latter approach is more flexible.
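
A minimal sketch of the first variant (a known value column, with the remainder of the row as the group key); the column index and the output format are assumptions, not a proposed st interface:

use strict;
use warnings;

my $value_col = 1;                        # assumed 0-based index of the value column
my (%count, %sum);

while (my $line = <STDIN>) {
    chomp $line;
    my @fields = split ' ', $line;
    my $value  = splice @fields, $value_col, 1;   # pull out the value...
    next unless defined $value && $value =~ /^-?\d+(?:\.\d+)?$/;
    my $key = join ' ', @fields;                  # ...and group by what is left
    $count{$key}++;
    $sum{$key} += $value;
}

for my $key (sort keys %sum) {
    printf "%s\tN=%d\tsum=%g\tmean=%g\n",
        $key, $count{$key}, $sum{$key}, $sum{$key} / $count{$key};
}

On the four-line example above this prints one row for a (N=2, sum=3, mean=1.5) and one for b (N=2, sum=5, mean=2.5).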

INSTALL, Makefile.PL

$ perl Makefile.PL

Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at Makefile.PL line 4.
BEGIN failed--compilation aborted at Makefile.PL line 4.

$ perl --version

This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi

Provide ability to specify percentile groups beyond quartiles

I think st is great and I use it all the time. One thing that would make it more useful for me is the ability to be more selective about the percentile groups of interest. For example, instead of the 25th, 50th and 75th percentile values of a dataset, I would like to get the 90th and 99th percentiles.

One possible way of writing this:

st --N --percentile=90 --percentile 99 --percentile 99.9  mydata.log
N	p90	p99	p99.9
3968275	470	800	3500

Another more compact way might be:

st --N --p90 --p99 --p99.9  mydata.log
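
An arbitrary percentile only needs the sorted values; below is a minimal sketch using linear interpolation between closest ranks, which is one common definition and not necessarily the one App::St uses:

use strict;
use warnings;

# Percentile (0-100) by linear interpolation between closest ranks.
sub percentile {
    my ($p, @data) = @_;
    my @sorted = sort { $a <=> $b } @data;
    my $rank = ($p / 100) * $#sorted;             # 0-based fractional rank
    my ($lo, $frac) = (int $rank, $rank - int $rank);
    return $sorted[$lo] if $lo >= $#sorted;
    return $sorted[$lo] * (1 - $frac) + $sorted[$lo + 1] * $frac;
}

printf "p90=%.2f p99=%.2f p99.9=%.2f\n",
    map { percentile($_, 1 .. 1000) } 90, 99, 99.9;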

shebang line does not work on ubuntu

Apparently it's the flag to perl combined with env that causes the problem:

#!/usr/bin/env perl -T

$ st numbers.txt
/usr/bin/env: perl -T: No such file or directory

#!/usr/bin/perl -T
$ st numbers.txt
N min max sum mean sd
3.00 1.00 100.00 111.00 37.00 54.74

#!/usr/bin/env perl
$ st numbers.txt
N min max sum mean sd
3.00 1.00 100.00 111.00 37.00 54.74

Make the data file parser more flexible..

Why not write a few lines of code to make the data-file parser more robust? Whether there is one number per line or several, and whether the numbers are separated by \n, \s, \t, \r or commas, st should be able to parse them.

The data file is shown below:
qq20130917111204

For this file, st should be able to parse out five numbers: 1, 2, 3, 4, 5.
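
A minimal sketch of a more tolerant reader, splitting on any mix of whitespace (spaces, tabs, carriage returns, newlines) and commas; the exact behaviour is an assumption about what is wanted, not st's current parser:

use strict;
use warnings;

my @numbers;
while (my $line = <STDIN>) {
    # Split on runs of whitespace and/or commas, keep only numeric tokens.
    push @numbers, grep { /^-?\d+(?:\.\d+)?$/ } split /[\s,]+/, $line;
}
print scalar(@numbers), " numbers: ", join(', ', @numbers), "\n";

For an input mixing separators, such as "1, 2<TAB>3" on one line and "4 5" on the next, this prints the five numbers 1 through 5.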

Problem producing statistics

Hi,

I have a file with 8933 rows. When I call st on it, stdout contains many lines like the following:

' on input line 8908

Where is the problem?
Thanks

current version appears to be broken

Thank you for making st available.
v1.1.4 installs and tests fine on CentOS 7.
Current version appears to be broken:

% make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/01-use.t .......... ok
t/02-new.t .......... ok
t/03-validate.t ..... ok
t/04-process.t ...... ok
t/05-basic-stats.t .. 1/6
  Failed test at t/05-basic-stats.t line 17.
    got:      '1'
    expected: '10'
  Failed test at t/05-basic-stats.t line 19.
    got:      '1'
    expected: '10'
  Failed test at t/05-basic-stats.t line 20.
    got:      '1'
    expected: '55'
  Failed test at t/05-basic-stats.t line 23.
    got:      '1'
    expected: '3.02'
  Looks like you failed 4 tests of 6.
t/05-basic-stats.t .. Dubious, test returned 4 (wstat 1024, 0x400)
Failed 4/6 subtests
t/05-format.t ....... 1/6
  Failed test at t/05-format.t line 20.
    got:      '1'
    expected: '10'
  Failed test at t/05-format.t line 22.
    got:      '1'
    expected: '10'
  Failed test at t/05-format.t line 23.
    got:      '1'
    expected: '55'
  Failed test at t/05-format.t line 27.
    got:      '1'
    expected: '3.02765'
  Looks like you failed 4 tests of 6.
t/05-format.t ....... Dubious, test returned 4 (wstat 1024, 0x400)
Failed 4/6 subtests
t/06-percentile.t ... 1/4
  Failed test at t/06-percentile.t line 26.
    got:      '1'
    expected: '5.5'
  Failed test at t/06-percentile.t line 26.
    got:      '1'
    expected: '9.5'
  Failed test at t/06-percentile.t line 26.
    got:      '1'
    expected: '10'
  Looks like you failed 3 tests of 4.
t/06-percentile.t ... Dubious, test returned 3 (wstat 768, 0x300)
Failed 3/4 subtests
t/06-quantiles.t .... 1/5
  Failed test at t/06-quantiles.t line 27.
    got:      '1'
    expected: '10'
  Failed test at t/06-quantiles.t line 27.
    got:      '1'
    expected: '3.5'
  Failed test at t/06-quantiles.t line 27.
    got:      '1'
    expected: '7.5'
  Failed test at t/06-quantiles.t line 27.
    got:      '1'
    expected: '5.5'
  Looks like you failed 4 tests of 5.
t/06-quantiles.t .... Dubious, test returned 4 (wstat 1024, 0x400)
Failed 4/5 subtests
t/07-result.t ....... Can't call method "quartile" on unblessed reference at .../st/blib/lib/App/St.pm line 219.
t/07-result.t ....... Dubious, test returned 255 (wstat 65280, 0xff00)
No subtests run

Test Summary Report

t/05-basic-stats.t (Wstat: 1024 Tests: 6 Failed: 4)
Failed tests: 1, 3-4, 6
Non-zero exit status: 4
t/05-format.t (Wstat: 1024 Tests: 6 Failed: 4)
Failed tests: 1, 3-4, 6
Non-zero exit status: 4
t/06-percentile.t (Wstat: 768 Tests: 4 Failed: 3)
Failed tests: 1-2, 4
Non-zero exit status: 3
t/06-quantiles.t (Wstat: 1024 Tests: 5 Failed: 4)
Failed tests: 1-3, 5
Non-zero exit status: 4
t/07-result.t (Wstat: 65280 Tests: 0 Failed: 0)
Non-zero exit status: 255
Parse errors: No plan found in TAP output
Files=9, Tests=55, 1 wallclock secs ( 0.04 usr 0.01 sys + 0.45 cusr 0.09 csys = 0.59 CPU)
Result: FAIL
Failed 5/9 test programs. 15/55 subtests failed.
make: *** [test_dynamic] Error 255

Use of taint mode makes cluster install tricky

Hi there, thanks for this neat tool.

I had a few issues with a non-root user install.

Usually I use the $PERL5LIB environment variable to append to @INC; this means I can install modules in my home directory rather than pestering a sysadmin.

However, taint mode ignores PERL5LIB (and PERLLIB), so this approach does not work.

Is there a clear need for taint mode here?
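
One workaround, assuming a local::lib-style install under the home directory (the path below is an assumption), is to add the private module path in code, since taint mode ignores PERL5LIB/PERLLIB from the environment but honours use lib:

#!/usr/bin/perl -T
# Hypothetical wrapper for a non-root install: the private lib path is
# compiled in rather than taken from PERL5LIB, which -T ignores.
use strict;
use warnings;
use lib '/home/me/perl5/lib/perl5';   # assumed local install prefix
use App::St;                          # now resolvable even under -T

# ... the rest of the st front-end would follow here.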
