zenlc2000 / pyp3
Pyed Piper tool by Toby Rosen at Sony Imageworks converted to Python 3
My fix for #3 broke the case of p.replace(p.re()). While that case isn't always predictable, it was usable within limits.
The manual notes:
all lines that test False ('', {}, [], False, 0) are eliminated from the output completely. You can instead print out a blank line if something tests false using --keep_false.
So the re() function should go back to returning an empty line on no match. The correct fix for #3 is filtering out '' lines at the output stage.
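A minimal sketch of what filtering at the output stage could look like. This is not the actual pyp3 code: the function name `emit` and its `keep_false` parameter are assumptions chosen to mirror the `--keep_false` option quoted from the manual.

```python
def emit(results, keep_false=False):
    """Yield printable lines, dropping falsy results unless keep_false is set.

    Hypothetical helper: `results` stands in for the per-line values that the
    user's expression produced ('' on a failed re() match, for example).
    """
    for result in results:
        if result:
            yield str(result)
        elif keep_false:
            # --keep_false behaviour: print a blank line instead of dropping it
            yield ""


lines = ["This is some input", "", None, [], 0, "many"]
print(list(emit(lines)))
print(list(emit(lines, keep_false=True)))
```

With the filter living here, re() can return '' on no match and p.replace(p.re(), ...) keeps receiving a string.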
Observed
$ echo -e "This is some input\nignore this line" > file
$ pyp3 -t file -b 2 "'some' in fp"
$ cat file | pyp3 "'some' in p"
This is some input
$ pyp3 -b 2 "'some' in sp" "This is some input" "ignore this line"
$ pyp3 -b 2 "sp" "This is some input" "ignore this line"
This is some input
ignore this line
Expected
pyp3 should behave the same regardless of where the input stream comes from.
In the original pyp, unmatched lines are removed. Ex:
$ echo -e "This\nis\nmany\nlines" | pyp "p.re('many')"
many
But in pyp3, all the unmatched lines appear.
$ echo -e "This\nis\nmany\nlines" | pyp3 "p.re('many')"
many
While it's not documented in the "manpage" from pyp -m, --dummy_input is mentioned in the pyp -h help text:
-n, --no_input use with command that generates output with no input;
same as --dummy_input 1
Some might see this as a simple documentation bug, but I think --dummy_input could be rather useful for debugging. pyp supports file and "secondary" input, but only to supplement what comes in on STDIN. If one wants to read 10 lines from a file, then one also needs to send 10 lines in on STDIN. Using -n provides 1 dummy line and allows one to read 1 line from a file. Implementing --dummy_input n would allow reading n lines from a file without reading in any lines from STDIN.
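A rough sketch of the idea, assuming the option only has to stand in for STDIN. The function `input_lines` and its parameters are hypothetical and do not come from the pyp source.

```python
import io
import sys


def input_lines(dummy_input=0, stream=None):
    """Return `dummy_input` empty lines, or read real lines when it is 0.

    Hypothetical helper: with dummy_input > 0 the stream is never touched,
    so STDIN stays free for other uses.
    """
    if dummy_input > 0:
        return [""] * dummy_input
    stream = sys.stdin if stream is None else stream
    return [line.rstrip("\n") for line in stream]


# Pair three dummy lines with three lines that came from a file.
file_lines = ["first", "second", "third"]
for p, fp in zip(input_lines(dummy_input=3), file_lines):
    print(repr(p), fp)

# With dummy_input left at 0 the lines come from the stream instead.
print(input_lines(stream=io.StringIO("a\nb\n")))
```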
Mostly I'm interested in this because reading content from STDIN conflicts with pdb. When I'm working with python, I tend to toss a lot of import pdb; pdb.set_trace() statements about, especially when I'm learning a new codebase and getting a feel for how it works. But if I'm piping data into pyp, I can't access the pdb prompt when it appears. And then I make this face.
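Independently of any pyp change, one possible workaround is to point pdb at the controlling terminal instead of STDIN. The sketch below assumes a Unix-like system where /dev/tty exists; the helper name `tty_debugger` is made up for illustration.

```python
import pdb


def tty_debugger(tty_path="/dev/tty"):
    """Build a Pdb instance that talks to the terminal, not sys.stdin.

    Hypothetical helper: pdb.Pdb accepts stdin/stdout arguments, so the
    debugger can read commands from the tty while the pipe feeds the program.
    """
    tty_in = open(tty_path, "r")
    tty_out = open(tty_path, "w")
    return pdb.Pdb(stdin=tty_in, stdout=tty_out)


# Usage inside the code under investigation:
#     tty_debugger().set_trace()
```

Line editing and history may be more limited than at a normal pdb prompt, since pdb is no longer reading from the default input.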
Issue 23 from the original repository asks for sed -i like behavior. Something like
$ echo -e "123\n457" > file
$ sed -i 's/7/6/' file
$ cat file
123
456
This seems like a good addition, but working with files is currently rather second-class in pyp. I think this would take some rethinking. Maybe if pyp runs without any STDIN and with the -t option, it could treat the file input as pp instead of fpp.
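For reference, the sed -i behavior itself is small when written in plain Python. This is only a sketch of the general technique (write a temporary file, then swap it in), not a proposal for how pyp should structure it; `edit_in_place` and `transform` are hypothetical names.

```python
import os
import tempfile


def edit_in_place(path, transform):
    """Rewrite `path` line by line, in the spirit of `sed -i`.

    `transform` receives each line without its newline and returns the new
    line, or None to drop the line entirely.
    """
    directory = os.path.dirname(os.path.abspath(path))
    with open(path) as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for line in src:
            new_line = transform(line.rstrip("\n"))
            if new_line is not None:
                dst.write(new_line + "\n")
    # Swap the finished copy into place; atomic when on the same filesystem.
    os.replace(dst.name, path)


# Equivalent of: sed -i 's/7/6/' file
# edit_in_place("file", lambda p: p.replace("7", "6", 1))
```

Note that this simple version does not preserve the original file's permissions or ownership.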
Maybe this can already be done and I'm just not getting it, but here's a contrived example to illustrate.
Let's say I have some output of ps aux which looks like this:
$ ps aux
message+ 792 0.0 0.0 42892 3672 ? Ss 11:33 0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root 839 0.0 0.1 274488 5924 ? Ssl 11:33 0:00 /usr/lib/accountsservice/accounts-daemon
daemon 846 0.0 0.0 26044 2064 ? Ss 11:33 0:00 /usr/sbin/atd -f
root 1003 0.0 0.0 13376 168 ? Ss 11:33 0:00 /sbin/mdadm --monitor --pid-file /run/mdadm/monitor.pid --daemonise --scan --syslog
bobpaul 1318 0.0 0.1 21516 5224 pts/0 Ss 11:37 0:00 -bash
bobpaul 1339 0.0 4.5 676188 183092 ? Ssl 11:38 0:18 emacs --daemon
bobpaul 1499 0.0 0.1 21568 5504 pts/1 Ss+ 11:48 0:00 -bash
bobpaul 1512 0.0 0.1 21480 5420 pts/2 Ss+ 11:48 0:00 -bash
bobpaul 2635 0.0 0.0 12944 936 pts/0 R+ 19:03 0:00 grep --color=auto -e daemon -e bash
bobpaul 2636 0.0 0.0 21516 2104 pts/0 D+ 19:03 0:00 -bash
$
Now, for all lines that contain bash I want to print the 5th column. For all lines that contain daemon I want to print the 2nd column. This can be done in awk like:
$ ps aux | awk '/daemon/ { print $2 } /bash/ { print $5 }'
792
839
846
1003
21516
1339
21568
21480
2635
12944
21516
$
So I try to incrementally build the command with pyp... I start by matching both conditions, which, after a bit of messing around, I figured out I could do with 'or'. (Maybe this is already abusive.)
$ ps aux | pyp "p.re('.*daemon.*').split() or p.re('.*bash.*').split()"
[[0]message+[1]792[2]0.0[3]0.0[4]42892[5]3672[6]?[7]Ss[8]11:33[9]0:00[10]/usr/bin/dbus-daemon[11]--system[12]--address=systemd:[13]--nofork[14]--nopidfile[15]--systemd-activation]
[[0]root[1]839[2]0.0[3]0.1[4]274488[5]5924[6]?[7]Ssl[8]11:33[9]0:00[10]/usr/lib/accountsservice/accounts-daemon]
[[0]daemon[1]846[2]0.0[3]0.0[4]26044[5]2064[6]?[7]Ss[8]11:33[9]0:00[10]/usr/sbin/atd[11]-f]
[[0]root[1]1003[2]0.0[3]0.0[4]13376[5]168[6]?[7]Ss[8]11:33[9]0:00[10]/sbin/mdadm[11]--monitor[12]--pid-file[13]/run/mdadm/monitor.pid[14]--daemonise[15]--scan[16]--syslog]
[[0]bobpaul[1]1318[2]0.0[3]0.1[4]21516[5]5224[6]pts/0[7]Ss[8]11:37[9]0:00[10]-bash]
[[0]bobpaul[1]1339[2]0.0[3]4.5[4]676188[5]183092[6]?[7]Ssl[8]11:38[9]0:18[10]emacs[11]--daemon]
[[0]bobpaul[1]1499[2]0.0[3]0.1[4]21568[5]5504[6]pts/1[7]Ss+[8]11:48[9]0:00[10]-bash]
[[0]bobpaul[1]1512[2]0.0[3]0.1[4]21480[5]5420[6]pts/2[7]Ss+[8]11:48[9]0:00[10]-bash]
[[0]bobpaul[1]2635[2]0.0[3]0.0[4]12944[5]936[6]pts/0[7]R+[8]19:03[9]0:00[10]grep[11]--color=auto[12]-e[13]daemon[14]-e[15]bash]
[[0]bobpaul[1]2636[2]0.0[3]0.0[4]21516[5]2104[6]pts/0[7]D+[8]19:03[9]0:00[10]-bash]
$
Good so far. And grab the columns (remember awk is 1 indexed, python is 0):
$ ps aux | pyp "p.re('.*daemon.*').split()[1] or p.re('.*bash.*').split()[4]"
792
839
846
1003
1339
2635
$
Wait, that's not enough results. It only shows the columns for daemon matches. I think what's happening is that the [1] selector must cause the first part to evaluate to True in cases where the regex didn't match (returned None). (None[1] would cause an exception, so part of the exception handling routine must make it always return True.)
This becomes apparent if we remove the column selector from the daemon regex:
$ ps aux | pyp "p.re('.*daemon.*').split() or p.re('.*bash.*').split()[4]"
[[0]message+[1]792[2]0.0[3]0.0[4]42892[5]3672[6]?[7]Ss[8]11:33[9]0:00[10]/usr/bin/dbus-daemon[11]--system[12]--address=systemd:[13]--nofork[14]--nopidfile[15]--systemd-activation]
[[0]root[1]839[2]0.0[3]0.1[4]274488[5]5924[6]?[7]Ssl[8]11:33[9]0:00[10]/usr/lib/accountsservice/accounts-daemon]
[[0]daemon[1]846[2]0.0[3]0.0[4]26044[5]2064[6]?[7]Ss[8]11:33[9]0:00[10]/usr/sbin/atd[11]-f]
[[0]root[1]1003[2]0.0[3]0.0[4]13376[5]168[6]?[7]Ss[8]11:33[9]0:00[10]/sbin/mdadm[11]--monitor[12]--pid-file[13]/run/mdadm/monitor.pid[14]--daemonise[15]--scan[16]--syslog]
21516
[[0]bobpaul[1]1339[2]0.0[3]4.5[4]676188[5]183092[6]?[7]Ssl[8]11:38[9]0:18[10]emacs[11]--daemon]
21568
21480
[[0]bobpaul[1]2635[2]0.0[3]0.0[4]12944[5]936[6]pts/0[7]R+[8]19:03[9]0:00[10]grep[11]--color=auto[12]-e[13]daemon[14]-e[15]bash]
21516
$
Now it's returning both matches again, but only selecting columns on the second match.
Am I just approaching this problem the wrong way, or is it not currently possible to replicate the awk code that outputs a different column depending on what within the line matched?
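One thing that is certain regardless of pyp internals: a Python `or` expression yields only its first truthy operand, so a single expression of that shape can produce at most one value per line, while awk runs every matching pattern/action pair (the grep line above prints both 2635 and 12944). A plain-Python sketch of the awk semantics, outside of pyp, with hypothetical names `RULES` and `select_columns`:

```python
import re

# (pattern, zero-based column) pairs mirroring the awk program
RULES = [
    (re.compile(r"daemon"), 1),  # awk: /daemon/ { print $2 }
    (re.compile(r"bash"), 4),    # awk: /bash/   { print $5 }
]


def select_columns(lines, rules=RULES):
    """Apply every rule to every line, like awk's pattern/action pairs."""
    out = []
    for line in lines:
        fields = line.split()
        for pattern, index in rules:
            if pattern.search(line) and index < len(fields):
                out.append(fields[index])
    return out


sample = [
    "daemon 846 0.0 0.0 26044 2064 ? Ss 11:33 0:00 /usr/sbin/atd -f",
    "bobpaul 1318 0.0 0.1 21516 5224 pts/0 Ss 11:37 0:00 -bash",
    "bobpaul 2635 0.0 0.0 12944 936 pts/0 R+ 19:03 0:00 grep --color=auto -e daemon -e bash",
]
print(select_columns(sample))  # ['846', '21516', '2635', '12944']
```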
https://code.google.com/archive/p/pyp/issues/29 has a patch that speeds up flatten_list() dramatically.
Is this codebase originally based on the pyp_dev code (I found it posted in 2 different bug reports)? Regardless, the flatten_list() in this repo looks different from the proposed patch, so I think it would be good to take a look.
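For comparison while reviewing, here is a generic non-recursive flatten. It is neither the version in this repo nor the patch from the linked issue, just a sketch of one common way to flatten nested lists in a single pass; it reuses the name `flatten_list` only for readability.

```python
def flatten_list(nested):
    """Flatten arbitrarily nested lists/tuples into one flat list.

    Uses an explicit stack of iterators instead of recursion, so each
    element is visited once and deep nesting cannot hit the recursion limit.
    Strings are treated as leaves, not as sequences.
    """
    flat = []
    stack = [iter(nested)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, (list, tuple)):
                # Descend into the sub-list; resume the parent afterwards.
                stack.append(iter(item))
                break
            flat.append(item)
        else:
            # Current iterator is exhausted; go back up one level.
            stack.pop()
    return flat


print(flatten_list([1, [2, [3, 4]], "five", (6,)]))  # [1, 2, 3, 4, 'five', 6]
```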
Per issue #22 on the original repository, the doc-recommended p.replace(p.re(REGEX),BAR) doesn't always work as expected. A simple patch was submitted and accepted but never published.
$ echo -e "This is some input\nignore this line" | python3 pyp3 "keep('some')"
error: can only concatenate list (not "PypStr") to list : keep('some')
error: can only concatenate list (not "PypStr") to list : keep('some')
$ echo -e "This is some input\nignore this line" | python2 pyp3 "keep('some')"
This is some input