zenlc2000 / pyp3


Pyed Piper tool by Toby Rosen at Sony Imageworks converted to Python 3

Python 100.00%

pyp3's People

datnamer mennis mesprague lmatt-bit qrkourier scaradim bobpaul singe lasergit econtal zaxebo1 dinesy westurner

pyp3's Issues

p.replace(p.re()) regression

My fix for #3 broke the case of p.replace(p.re()). While that case isn't always predictable, it was usable within limits.

The manual notes:

"all lines that test False ('', {}, [], False, 0) are eliminated from the output completely. You can instead print out a blank line if something tests false using --keep_false."

So the re() function should go back to returning an empty line on no match. The correct fix for #3 is to filter out '' lines at the output stage.
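
A minimal sketch of what filtering at the output stage could look like, in plain Python rather than pyp3's actual code. The function name emit and its keep_false parameter are hypothetical; only the behavior (drop results that test False, or print a blank line with --keep_false) comes from the manual text quoted above.

```python
def emit(results, keep_false=False):
    """Yield output lines, dropping falsy results unless keep_false is set."""
    for result in results:
        if result:
            yield str(result)
        elif keep_false:
            # Mirrors --keep_false: a blank line stands in for a falsy result.
            yield ""


# What re('many') might produce per input line if it returns '' on no match.
per_line = ["", "", "many", ""]
print(list(emit(per_line)))
print(list(emit(per_line, keep_false=True)))
```

With this split, re() can keep returning '' (so p.replace(p.re()) still has a string to work with) and the empty lines only disappear when results are printed.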

logic filters don't work with sp, spp, fp, or fpp

Observed

$ echo -e "This is some input\nignore this line" > file
$ pyp3 -t file -b 2 "'some' in fp"

$ cat file | pyp3 "'some' in p"
This is some input
$ pyp3 -b 2 "'some' in sp" "This is some input" "ignore this line"
$ pyp3 -b 2 "sp" "This is some input" "ignore this line"
This is some input
ignore this line

Expected
Logic filters should behave the same regardless of where the input stream comes from.

Unmatched lines return empty lines

In the original pyp, unmatched lines are removed. Ex:

$ echo -e "This\nis\nmany\nlines" | pyp "p.re('many')"
many

But in pyp3, all the unmatched lines appear.

$ echo -e "This\nis\nmany\nlines" | pyp3 "p.re('many')"


many

--dummy-input is not implemented should read --blank_inputs

While it's not documented in the "manpage" from pyp -m, --dummy_input is mentioned in the pyp -h help text:

  -n, --no_input        use with command that generates output with no input;
                        same as --dummy_input 1

Some might see this as a simple documentation bug, but I think --dummy_input could be rather useful for debugging. pyp supports file and "secondary" input, but only to supplement what comes in on STDIN. If one wants to read 10 lines from a file, then one also needs to send 10 lines in on STDIN. Using -n provides 1 dummy line and allows one to read 1 line from a file. Implementing --dummy_input n would allow reading n lines from a file without reading any lines from STDIN.
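
To make the proposal concrete, here is a rough sketch of the idea in plain Python. It is not pyp3's code: input_lines and its dummy_input parameter are hypothetical names, and pairing dummy lines with file lines via zip is only an assumption about how the file input is lined up with STDIN.

```python
import io
import itertools


def input_lines(stream, dummy_input=0):
    """Return an iterator over input lines.

    With dummy_input > 0 the stream is never read; that many blank
    lines are produced instead, so STDIN stays free (for pdb, say).
    """
    if dummy_input > 0:
        return itertools.repeat("", dummy_input)
    return (line.rstrip("\n") for line in stream)


# Normal case: lines come from the stream (StringIO stands in for STDIN).
print(list(input_lines(io.StringIO("a\nb\n"))))

# Proposed case: 2 dummy lines let us pull 2 lines from a file.
file_lines = ["one", "two", "three"]
print(list(zip(input_lines(None, dummy_input=2), file_lines)))
```

Under this reading, -n is just the special case dummy_input=1.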

Mostly I'm interested in this because reading content from STDIN conflicts with pdb. When I'm working with python, I tend to toss a lot of import pdb; pdb.set_trace() statements about, especially when I'm learning a new codebase and getting a feel for how it works. But if I'm piping data into pyp, I can't access the pdb prompt when it appears. And then I make this face ☹️

Edit files inplace like sed -i

Issue 23 from the original repository asks for behavior like sed -i. Something like:

$ echo -e "123\n457" > file
$ sed -i 's/7/6/' file
$ cat file
123
456

This seems like a good addition, but working with files is currently rather second-class in pyp. I think this would take some rethinking. Maybe if pyp runs with the -t option and without any STDIN, it could treat the file input as pp instead of fpp.
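
For reference, the core of sed -i style editing is small. The sketch below is plain Python, not a pyp feature: edit_in_place is a hypothetical helper that writes the transformed lines to a temporary file in the same directory and then swaps it over the original, so a failure midway leaves the original file untouched.

```python
import os
import shutil
import tempfile


def edit_in_place(path, transform):
    """Replace each line of path with transform(line); None drops the line."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp, open(path) as src:
            for line in src:
                result = transform(line.rstrip("\n"))
                if result is not None:
                    tmp.write(f"{result}\n")
        shutil.copymode(path, tmp_path)  # keep the original permission bits
        os.replace(tmp_path, path)       # atomic swap on the same filesystem
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Calling edit_in_place("file", lambda line: line.replace("7", "6")) on the file from the sed example would leave it containing 123 and 456. How the per-line transform would be wired to a pyp expression is left open here.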

Selecting columns for a list breaks multiple matches

Maybe this can already be done and I'm just not getting it, but here's a contrived example to illustrate.

Let's say I have some output of ps aux which looks like this:

$ ps aux 
message+   792  0.0  0.0  42892  3672 ?        Ss   11:33   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       839  0.0  0.1 274488  5924 ?        Ssl  11:33   0:00 /usr/lib/accountsservice/accounts-daemon
daemon     846  0.0  0.0  26044  2064 ?        Ss   11:33   0:00 /usr/sbin/atd -f
root      1003  0.0  0.0  13376   168 ?        Ss   11:33   0:00 /sbin/mdadm --monitor --pid-file /run/mdadm/monitor.pid --daemonise --scan --syslog
bobpaul     1318  0.0  0.1  21516  5224 pts/0    Ss   11:37   0:00 -bash
bobpaul     1339  0.0  4.5 676188 183092 ?       Ssl  11:38   0:18 emacs --daemon
bobpaul     1499  0.0  0.1  21568  5504 pts/1    Ss+  11:48   0:00 -bash
bobpaul     1512  0.0  0.1  21480  5420 pts/2    Ss+  11:48   0:00 -bash
bobpaul     2635  0.0  0.0  12944   936 pts/0    R+   19:03   0:00 grep --color=auto -e daemon -e bash
bobpaul     2636  0.0  0.0  21516  2104 pts/0    D+   19:03   0:00 -bash
$

Now, for all lines that contain bash I want to print the 5th column. For all lines that contain daemon I want to print the 2nd column. This can be done in awk like:

$ ps aux | awk '/daemon/ { print $2 } /bash/ { print $5 }'
792
839
846
1003
21516
1339
21568
21480
2635
12944
21516
$

So I try to build the command incrementally with pyp. I start by matching both conditions, which, after a bit of messing around, I figured out I could do with 'or'. (Maybe this is already abusive.)

$ ps aux | pyp "p.re('.*daemon.*').split() or p.re('.*bash.*').split()"
[[0]message+[1]792[2]0.0[3]0.0[4]42892[5]3672[6]?[7]Ss[8]11:33[9]0:00[10]/usr/bin/dbus-daemon[11]--system[12]--address=systemd:[13]--nofork[14]--nopidfile[15]--systemd-activation]
[[0]root[1]839[2]0.0[3]0.1[4]274488[5]5924[6]?[7]Ssl[8]11:33[9]0:00[10]/usr/lib/accountsservice/accounts-daemon]
[[0]daemon[1]846[2]0.0[3]0.0[4]26044[5]2064[6]?[7]Ss[8]11:33[9]0:00[10]/usr/sbin/atd[11]-f]
[[0]root[1]1003[2]0.0[3]0.0[4]13376[5]168[6]?[7]Ss[8]11:33[9]0:00[10]/sbin/mdadm[11]--monitor[12]--pid-file[13]/run/mdadm/monitor.pid[14]--daemonise[15]--scan[16]--syslog]
[[0]bobpaul[1]1318[2]0.0[3]0.1[4]21516[5]5224[6]pts/0[7]Ss[8]11:37[9]0:00[10]-bash]
[[0]bobpaul[1]1339[2]0.0[3]4.5[4]676188[5]183092[6]?[7]Ssl[8]11:38[9]0:18[10]emacs[11]--daemon]
[[0]bobpaul[1]1499[2]0.0[3]0.1[4]21568[5]5504[6]pts/1[7]Ss+[8]11:48[9]0:00[10]-bash]
[[0]bobpaul[1]1512[2]0.0[3]0.1[4]21480[5]5420[6]pts/2[7]Ss+[8]11:48[9]0:00[10]-bash]
[[0]bobpaul[1]2635[2]0.0[3]0.0[4]12944[5]936[6]pts/0[7]R+[8]19:03[9]0:00[10]grep[11]--color=auto[12]-e[13]daemon[14]-e[15]bash]
[[0]bobpaul[1]2636[2]0.0[3]0.0[4]21516[5]2104[6]pts/0[7]D+[8]19:03[9]0:00[10]-bash]
$

Good so far. Now grab the columns (remember, awk is 1-indexed and Python is 0-indexed):

$ ps aux | pyp "p.re('.*daemon.*').split()[1] or p.re('.*bash.*').split()[4]"
792
839
846
1003
1339
2635
$

Wait, that's not enough results. It only shows the columns for daemon matches. I think what's happening is that the [1] selector causes the first part to evaluate to True in cases where the regex didn't match (returned None). (None[1] would raise an exception, so part of the exception handling routine must make it always return True.)

This becomes apparent if we remove the column selector from the daemon regex:

$ ps aux | pyp "p.re('.*daemon.*').split() or p.re('.*bash.*').split()[4]"
[[0]message+[1]792[2]0.0[3]0.0[4]42892[5]3672[6]?[7]Ss[8]11:33[9]0:00[10]/usr/bin/dbus-daemon[11]--system[12]--address=systemd:[13]--nofork[14]--nopidfile[15]--systemd-activation]
[[0]root[1]839[2]0.0[3]0.1[4]274488[5]5924[6]?[7]Ssl[8]11:33[9]0:00[10]/usr/lib/accountsservice/accounts-daemon]
[[0]daemon[1]846[2]0.0[3]0.0[4]26044[5]2064[6]?[7]Ss[8]11:33[9]0:00[10]/usr/sbin/atd[11]-f]
[[0]root[1]1003[2]0.0[3]0.0[4]13376[5]168[6]?[7]Ss[8]11:33[9]0:00[10]/sbin/mdadm[11]--monitor[12]--pid-file[13]/run/mdadm/monitor.pid[14]--daemonise[15]--scan[16]--syslog]
21516
[[0]bobpaul[1]1339[2]0.0[3]4.5[4]676188[5]183092[6]?[7]Ssl[8]11:38[9]0:18[10]emacs[11]--daemon]
21568
21480
[[0]bobpaul[1]2635[2]0.0[3]0.0[4]12944[5]936[6]pts/0[7]R+[8]19:03[9]0:00[10]grep[11]--color=auto[12]-e[13]daemon[14]-e[15]bash]
21516
$

Now it's returning both matches again, but only selecting columns on the second match.
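
One possible explanation, sketched in plain Python and not verified against pyp's source: if re() returns '' on no match, then ''.split() is an empty list, and indexing it raises before 'or' ever gets to evaluate the right-hand side. The helper re_or_empty and the "drop the line when the expression raises" behavior in run are both assumptions made for illustration.

```python
import re


def re_or_empty(pattern, line):
    """Stand-in for p.re(): the whole line on a match, '' otherwise (assumed)."""
    return line if re.search(pattern, line) else ""


def pick(line):
    # Same shape as: p.re('.*daemon.*').split()[1] or p.re('.*bash.*').split()[4]
    return re_or_empty("daemon", line).split()[1] or re_or_empty("bash", line).split()[4]


def run(lines):
    """Apply pick() per line, dropping lines whose expression raises."""
    out = []
    for line in lines:
        try:
            result = pick(line)
        except IndexError:
            continue  # assumed: a failed expression prints nothing for that line
        if result:
            out.append(result)
    return out


sample = [
    "root 839 0.0 0.1 274488 5924 ? Ssl 11:33 0:00 /usr/lib/accountsservice/accounts-daemon",
    "bobpaul 1318 0.0 0.1 21516 5224 pts/0 Ss 11:37 0:00 -bash",
]
print(run(sample))
```

This reproduces both observations: with [1] on the left, bash-only lines raise and vanish; without it, the left side is just an empty (false) list, so 'or' falls through to the bash column.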

Am I just approaching this problem the wrong way, or is it not currently possible to replicate the awk code that outputs a different column depending on what within the line matched?
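
For comparison, this is what the awk program does, written as plain Python. It is only a reference for the target semantics (each rule is tested independently, so a line can print twice), not a claim about what a single pyp expression can do; select_columns is a hypothetical name.

```python
def select_columns(line):
    """Return the fields awk would print for one line of `ps aux` output."""
    fields = line.split()
    selected = []
    if "daemon" in line:
        selected.append(fields[1])  # awk $2
    if "bash" in line:
        selected.append(fields[4])  # awk $5
    return selected


lines = [
    "daemon 846 0.0 0.0 26044 2064 ? Ss 11:33 0:00 /usr/sbin/atd -f",
    "bobpaul 1318 0.0 0.1 21516 5224 pts/0 Ss 11:37 0:00 -bash",
    "bobpaul 2635 0.0 0.0 12944 936 pts/0 R+ 19:03 0:00 grep --color=auto -e daemon -e bash",
]
for line in lines:
    for value in select_columns(line):
        print(value)
```

Note that the grep line matches both rules and so yields two values, which an 'or' expression cannot reproduce because it stops at the first truthy operand.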

keep() and lose() filters only work if run with Python2

$ echo -e "This is some input\nignore this line" | python3 pyp3 "keep('some')"
error: can only concatenate list (not "PypStr") to list : keep('some')
error: can only concatenate list (not "PypStr") to list : keep('some')
$ echo -e "This is some input\nignore this line" | python2 pyp3 "keep('some')"
This is some input

pyp3's Issues

p.replace(p.re()) regression

logic filters don't work with sp, spp, fp, or fpp

Unmatched lines return empty lines

--dummy-input is not implemented should read --blank_inputs

Edit files inplace like sed -i

Selecting columns for a list breaks multiple matches

Evaluate pyp_dev_1.3 and flatten speed patch

Add rereplace() function

keep() and lose() filters only work if run with Python2
