
pipe's Introduction

Pipe — Infix programming toolkit

PyPI Monthly downloads Supported Python Version GitHub Workflow Status

Module enabling a sh-like infix syntax (using pipes).

Introduction

As an example, here is the solution for the second Project Euler problem:

Find the sum of all the even-valued terms in Fibonacci which do not exceed four million.

Given fib, a generator of Fibonacci numbers:

sum(fib() | where(lambda x: x % 2 == 0) | take_while(lambda x: x < 4000000))

Each pipe is lazily evaluated, can be aliased, and can be partially initialized, so it could be rewritten as:

is_even = where(lambda x: x % 2 == 0)
sum(fib() | is_even | take_while(lambda x: x < 4000000))
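
Note that fib itself is not part of pipe; a minimal sketch of such a generator, assuming the usual definition of the Fibonacci sequence, could be:

def fib():
    """Yield the Fibonacci numbers: 1, 1, 2, 3, 5, 8, ..."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b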

Installing

To install the library, you can just run the following command:

# Linux/macOS
python3 -m pip install pipe

# Windows
py -3 -m pip install pipe

Using

The basic syntax is to use a | like in a shell:

>>> from itertools import count
>>> from pipe import select, take
>>> sum(count() | select(lambda x: x ** 2) | take(10))
285
>>>

Some pipes take an argument:

>>> from pipe import where
>>> sum([1, 2, 3, 4] | where(lambda x: x % 2 == 0))
6
>>>

Some do not need one:

>>> from pipe import traverse
>>> for i in [1, [2, 3], 4] | traverse:
...     print(i)
1
2
3
4
>>>

In that case the calling parentheses are still allowed:

>>> from pipe import traverse
>>> for i in [1, [2, 3], 4] | traverse():
...     print(i)
1
2
3
4
>>>

Existing Pipes in this module

Alphabetical list of available pipes; when several names are listed for a given pipe, these are aliases.

batched

Like Python 3.12's itertools.batched:

>>> from pipe import batched
>>> list("ABCDEFG" | batched(3))
[('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
>>>

chain

Chain a sequence of iterables:

>>> from pipe import chain
>>> list([[1, 2], [3, 4], [5]] | chain)
[1, 2, 3, 4, 5]
>>>

Warning: chain only unfolds an iterable containing ONLY iterables:

list([1, 2, [3]] | chain)

Gives a TypeError: 'int' object is not iterable. Consider using traverse instead.
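
For comparison, traverse (documented below) handles the mixed case by recursively unfolding nested iterables:

>>> from pipe import traverse
>>> list([1, 2, [3]] | traverse)
[1, 2, 3]
>>>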

chain_with(other)

Like itertools.chain, yields elements of the given iterable, then yields the elements of its parameters:

>>> from pipe import chain_with
>>> list((1, 2, 3) | chain_with([4, 5], [6]))
[1, 2, 3, 4, 5, 6]
>>>

dedup(key=None)

Deduplicate values, using the given key function if provided.

>>> from pipe import dedup
>>> list([-1, 0, 0, 0, 1, 2, 3] | dedup)
[-1, 0, 1, 2, 3]
>>> list([-1, 0, 0, 0, 1, 2, 3] | dedup(key=abs))
[-1, 0, 2, 3]
>>>

enumerate(start=0)

The builtin enumerate() as a Pipe:

>>> from pipe import enumerate
>>> list(['apple', 'banana', 'citron'] | enumerate)
[(0, 'apple'), (1, 'banana'), (2, 'citron')]
>>> list(['car', 'truck', 'motorcycle', 'bus', 'train'] | enumerate(start=6))
[(6, 'car'), (7, 'truck'), (8, 'motorcycle'), (9, 'bus'), (10, 'train')]
>>>

filter(predicate)

Alias for where(predicate), see where(predicate).
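
For example, filter behaves exactly like where:

>>> from pipe import filter
>>> list([1, 2, 3, 4] | filter(lambda x: x % 2 == 0))
[2, 4]
>>>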

groupby(key=None)

Like itertools.groupby(sorted(iterable, key = keyfunc), keyfunc)

>>> from pipe import groupby, map, select
>>> items = range(10)
>>> ' / '.join(items | groupby(lambda x: "Odd" if x % 2 else "Even")
...                  | select(lambda x: "{}: {}".format(x[0], ', '.join(x[1] | map(str)))))
'Even: 0, 2, 4, 6, 8 / Odd: 1, 3, 5, 7, 9'
>>>

islice()

Just the itertools.islice function as a Pipe:

>>> from pipe import islice
>>> list((1, 2, 3, 4, 5, 6, 7, 8, 9) | islice(2, 8, 2))
[3, 5, 7]
>>>

izip()

Like the built-in zip (itertools.izip in Python 2), as a Pipe:

>>> from pipe import izip
>>> list(range(0, 10) | izip(range(1, 11)))
[(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 10)]
>>>

map(), select()

Apply a conversion expression, given as a parameter, to each element of the given iterable:

>>> from pipe import map, select
>>> list([1, 2, 3] | map(lambda x: x * x))
[1, 4, 9]

>>> list([1, 2, 3] | select(lambda x: x * x))
[1, 4, 9]
>>>

netcat

The netcat Pipe sends and receives bytes over TCP:

data = [
    b"HEAD / HTTP/1.0\r\n",
    b"Host: python.org\r\n",
    b"\r\n",
]
for packet in data | netcat("python.org", 80):
    print(packet.decode("UTF-8"))

Gives:

HTTP/1.1 301 Moved Permanently
Content-length: 0
Location: https://python.org/
Connection: close

permutations(r=None)

Returns all possible permutations:

>>> from pipe import permutations
>>> for item in 'ABC' | permutations(2):
...     print(item)
('A', 'B')
('A', 'C')
('B', 'A')
('B', 'C')
('C', 'A')
('C', 'B')
>>>
>>> for item in range(3) | permutations:
...     print(item)
(0, 1, 2)
(0, 2, 1)
(1, 0, 2)
(1, 2, 0)
(2, 0, 1)
(2, 1, 0)
>>>

reverse

Like Python's built-in reversed function.

>>> from pipe import reverse
>>> list([1, 2, 3] | reverse)
[3, 2, 1]
>>>

select(fct)

Alias for map(fct), see map(fct).

skip()

Skips the given quantity of elements from the given iterable, then yields the remaining ones:

>>> from pipe import skip
>>> list((1, 2, 3, 4, 5) | skip(2))
[3, 4, 5]
>>>

skip_while(predicate)

Like itertools.dropwhile, skips elements of the given iterable while the predicate is true, then yields others:

>>> from pipe import skip_while
>>> list([1, 2, 3, 4] | skip_while(lambda x: x < 3))
[3, 4]
>>>

sort(key=None, reverse=False)

Like Python's built-in sorted function.

>>> from pipe import sort
>>> ''.join("python" | sort)
'hnopty'
>>> [5, -4, 3, -2, 1] | sort(key=abs)
[1, -2, 3, -4, 5]
>>>

t

Like Haskell's operator ":":

>>> from pipe import t
>>> for i in 0 | t(1) | t(2):
...     print(i)
0
1
2
>>>

tail(n)

Yields the given quantity of the last elements of the given iterable.

>>> from pipe import tail
>>> for i in (1, 2, 3, 4, 5) | tail(3):
...     print(i)
3
4
5
>>>

take(n)

Yields the given quantity of elements from the given iterable, like head in a shell script:

>>> from pipe import take
>>> for i in count() | take(5):
...     print(i)
0
1
2
3
4
>>>

take_while(predicate)

Like itertools.takewhile, yields elements of the given iterable while the predicate is true:

>>> from pipe import take_while
>>> for i in count() | take_while(lambda x: x ** 2 < 100):
...     print(i)
0
1
2
3
4
5
6
7
8
9
>>>

tee

tee prints items to standard output and yields them unchanged; useful for debugging a pipe stage by stage:

>>> from pipe import tee
>>> sum(["1", "2", "3", "4", "5"] | tee | map(int) | tee)
'1'
1
'2'
2
'3'
3
'4'
4
'5'
5
15
>>>

The 15 at the end is the value returned by sum().

transpose()

Transposes the rows and columns of a matrix.

>>> from pipe import transpose
>>> [[1, 2, 3], [4, 5, 6], [7, 8, 9]] | transpose
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]
>>>

traverse

Recursively unfolds nested iterables:

>>> list([[1, 2], [[[3], [[4]]], [5]]] | traverse)
[1, 2, 3, 4, 5]
>>> squares = (i * i for i in range(3))
>>> list([[0, 1, 2], squares] | traverse)
[0, 1, 2, 0, 1, 4]
>>>

uniq(key=None)

Like dedup(), but only deduplicates consecutive values, using the given key function if provided (the identity otherwise).

>>> from pipe import uniq
>>> list([1, 1, 2, 2, 3, 3, 1, 2, 3] | uniq)
[1, 2, 3, 1, 2, 3]
>>> list([1, -1, 1, 2, -2, 2, 3, 3, 1, 2, 3] | uniq(key=abs))
[1, 2, 3, 1, 2, 3]
>>>

where(predicate), filter(predicate)

Only yields the matching items of the given iterable:

>>> list([1, 2, 3] | where(lambda x: x % 2 == 0))
[2]
>>>

Don't forget they can be aliased:

>>> positive = where(lambda x: x > 0)
>>> negative = where(lambda x: x < 0)
>>> sum([-10, -5, 0, 5, 10] | positive)
15
>>> sum([-10, -5, 0, 5, 10] | negative)
-15
>>>

Constructing your own

You can construct your own pipes using the Pipe class, like:

import builtins
from pipe import Pipe

square = Pipe(lambda iterable: (x ** 2 for x in iterable))
map = Pipe(lambda iterable, fct: builtins.map(fct, iterable))
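
Assuming the two definitions above, a quick check:

list(range(5) | square)   # -> [0, 1, 4, 9, 16]
list(range(3) | map(str)) # -> ['0', '1', '2']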

As you can see, it's often very short to write, and with a bit of luck the function you're wrapping already takes an iterable as its first argument, making the wrapping straightforward:

>>> from collections import deque
>>> from pipe import Pipe
>>> end = Pipe(deque)
>>>

and that's it: iterable | end(3) is deque(iterable, 3):

>>> list(range(100) | end(3))
[97, 98, 99]
>>>

In case it gets more complicated, one can use Pipe as a decorator on a function taking an iterable as its first argument, with any other optional arguments after it:

>>> from statistics import mean

>>> @Pipe
... def running_average(iterable, width):
...     items = deque(maxlen=width)
...     for item in iterable:
...         items.append(item)
...         yield mean(items)

>>> list(range(20) | running_average(width=2))
[0, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5]
>>> list(range(20) | running_average(width=10))
[0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5]
>>>

One-off pipes

Sometimes you just want a one-liner. When creating a pipe, you can specify the function's positional and keyword arguments directly:

>>> from itertools import combinations

>>> list(range(5) | Pipe(combinations, 2))
[(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
>>>

A simple running sum with an initial value:

>>> from itertools import accumulate

>>> list(range(10) | Pipe(accumulate, initial=1))
[1, 1, 2, 4, 7, 11, 16, 22, 29, 37, 46]
>>>

Or filter your data based on some criteria:

>>> from itertools import compress

>>> list(range(20) | Pipe(compress, selectors=[1, 0] * 10))
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
>>> list(range(20) | Pipe(compress, selectors=[0, 1] * 10))
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
>>>

Euler project samples

Find the sum of all the multiples of 3 or 5 below 1000.

>>> sum(count() | where(lambda x: x % 3 == 0 or x % 5 == 0) | take_while(lambda x: x < 1000))
233168
>>>

Find the sum of all the even-valued terms in Fibonacci which do not exceed four million.

sum(fib() | where(lambda x: x % 2 == 0) | take_while(lambda x: x < 4000000))

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

>>> square = map(lambda x: x ** 2)
>>> sum(range(101)) ** 2 - sum(range(101) | square)
25164150
>>>

Going deeper

Partial Pipes

A pipe can be parametrized without being evaluated:

>>> running_average_of_two = running_average(2)
>>> list(range(20) | running_average_of_two)
[0, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5]
>>>

Multi-argument pipes can be partially initialized; think of it as currying:

some_iterable | some_pipe(1, 2, 3)
some_iterable | Pipe(some_func, 1, 2, 3)

are strictly equivalent to:

some_iterable | some_pipe(1)(2)(3)

So it can be used to specialize pipes. First, a dummy example:

>>> @Pipe
... def addmul(iterable, to_add, to_mul):
...     """Computes (x + to_add) * to_mul to every items of the input."""
...     for i in iterable:
...         yield (i + to_add) * to_mul

>>> mul = addmul(0)  # This partially initializes addmul with to_add=0
>>> list(range(10) | mul(10))
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]

Which also works with keyword arguments:

>>> add = addmul(to_mul=1)  # This partially initializes addmul with `to_mul=1`
>>> list(range(10) | add(10))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>>

But now for something interesting:

>>> import re
>>> @Pipe
... def grep(iterable, pattern, flags=0):
...     for line in iterable:
...         if re.match(pattern, line, flags=flags):
...             yield line
...
>>> lines = ["Hello", "hello", "World", "world"]
>>> for line in lines | grep("H"):
...     print(line)
Hello
>>>

Now let's reuse it in two ways, first with a pattern:

>>> lowercase_only = grep("[a-z]+$")
>>> for line in lines | lowercase_only:
...     print(line)
hello
world
>>>

Or now with a flag:

>>> igrep = grep(flags=re.IGNORECASE)
>>> for line in lines | igrep("hello"):
...    print(line)
...
Hello
hello
>>>

Lazy evaluation

Pipe uses generators all the way down, so it is naturally lazy.

In the following examples we'll use itertools.count: an infinite generator of integers.

We'll also make use of the tee pipe, which prints every value that passes through it.

The following example does nothing: nothing is printed by tee, so no value passed through it. That's nice, because generating an infinite sequence of squares would be "slow".

>>> result = count() | tee | select(lambda x: x ** 2)
>>>

Chaining more pipes still won't make the previous ones start generating values; in the following example not a single value is pulled out of count:

>>> result = count() | tee | select(lambda x: x ** 2)
>>> first_results = result | take(10)
>>> only_odd_ones = first_results | where(lambda x: x % 2)
>>>

Same without variables:

>>> result = (count() | tee
...                   | select(lambda x: x ** 2)
...                   | take(10)
...                   | where(lambda x: x % 2))
>>>

Only when values are actually needed do the generators start to work.

In the following example only two values will be extracted out of count:

  • 0, which is squared (to 0), passes the take(10) easily, but is dropped by where;
  • 1, which is squared (to 1), also easily passes the take(10), passes the where, and passes the take(1).

At this point take(1) is satisfied, so no further computation is needed. Notice tee printing 0 and 1 as they pass through it:

>>> result = (count() | tee
...                   | select(lambda x: x ** 2)
...                   | take(10)
...                   | where(lambda x: x % 2))
>>> print(list(result | take(1)))
0
1
[1]
>>>

Deprecations

In pipe 1.x a lot of functions returned iterables while a lot of other functions returned non-iterables, which caused confusion. The ones returning non-iterables could only be used as the last function of a pipe expression, so they were in fact redundant:

range(100) | where(lambda x: x % 2 == 0) | add

can be rewritten, with no loss of readability, as:

sum(range(100) | where(lambda x: x % 2 == 0))

so all pipes returning non-iterables were deprecated (raising warnings), and finally removed in pipe 2.0.

What should I do?

Oh, you just upgraded pipe, got an exception, and landed here? You have three solutions:

  1. Stop using closing pipes: replace ...|...|...|...|as_list with list(...|...|...|); that's it, it's even shorter.

  2. If "closing pipes" are not an issue for you, and you really like them, just reimplement the few you really need; it often takes only a few lines of code (see the sketch below), or copy them from here.

  3. If you still rely on a lot of them and are in a hurry, just pip install "pipe<2".

And start testing your project using the Python Development Mode so you catch those warnings before they bite you.
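
For instance, a minimal sketch of a reimplemented closing pipe such as as_list (no longer shipped with pipe):

from pipe import Pipe, where

# A "closing" pipe: consumes the iterable and returns a plain list.
as_list = Pipe(list)

print(range(10) | where(lambda x: x % 2 == 0) | as_list)  # [0, 2, 4, 6, 8]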

But I like them, pleassssse, reintroduce them!

This has already been discussed in #74.

An @Pipe is often easily implemented in a one-to-three-line function, and the pipe module does not aim to provide every possibility; it aims to provide the Pipe decorator.

So if you need more pipes, closing pipes, weird pipes, you-name-it, feel free to implement them in your project, and consider the already-implemented ones as examples of how to do it.

See the Constructing your own section above.

pipe's People

Contributors

abdur-rahmaanj, amper, babakness, brentp, briaoeuidhtns, dalexander, devrma, dit-zy, fidget-spinner, gliptak, hros, isidroas, j450h1, javadba, jbvsmo, jerabaul29, julienpalard, kianmeng, mrjbq7, nykh, righthandabacus, safwank, sergiors, sobolevn, sugatoray, vstoykov, yorailevi


pipe's Issues

Conda package information

Please provide a link to the conda package in the readme if it exists; if not, please create a conda package, thank you!

Very cool project

Thanks for sharing this amazing project!
The coding style is easy to read and Pipe is a powerful tool.

Collector as a side effect ?

How would you write a function that acts as a collector? I normally have the source as a list or numpy array, but want to collect the result as a 2D bitarray, e.g.:

np.random.randint(0,100,10)  | ....... | before_last |  list2Dbitary()

In this case before_last generates a numpy array of bitarrays and then in the last step I convert it to a 2D bitarray, i.e. the pipe generates two data structures which can be large and will take too much memory.

If I have some context/state, then as a side effect of the functions I can update this 2D bitarray created before the pipe starts.

How would you do that ?

Is there any possibility to add more functions?

I've been working on a library called pyf that implements other functions that don't exist in JulienPalard/Pipe.

Are there plans to add more functions? If not, I'd like to know the reason.

Recommendations for additional operators and make it more Pipe

Wow, what an incredible project and a fantastic idea! When I first came across this repository, it immediately reminded me of the Rust iterator and its ability to chain multiple methods on an iterator in a lazy manner. It got me thinking about how great it would be to implement some of those operations, just like in Rust iterators.
For example:

  1. ... | <Pipe function> | collect(factory) instead of factory(... | <Pipe function>) to PIPE ALL!
  2. Creating more official operators that are both practical and captivating for users. Here are some potential examples:
@Pipe
def step_by(iterable, step):
    "Yield one item out of 'step' in the given iterable."
    for i, item in enumerate(iterable):
        if i % step == 0:
            yield item


@Pipe
def reduce(iterable, predicate):
    "Reduce the given iterable to one element using the given criterion."
    return functools.reduce(predicate, iterable)


@Pipe
def position(iterable, predicate):
    "Get the position of the element in the iterable."
    for i, item in enumerate(iterable):
        if predicate(item):
            return i


@Pipe
def next_chunk(iterable, n):
    ...

# something more

If you consider it to be practical, I would be delighted to contribute.

new version?

I found that I can pip install pipe but it is kind of old (v1.4.2 from 2010); surprisingly, the pipe.py here is also marked as v1.4.2! Would you consider bumping up the version and updating the pypi.org repository as well?

Add type hints

Use a static type checker called mypy. In libraries, I think it's very important to have type hints, so that when users use the lib, code linters can help, knowing exactly what kind of parameters to pass and what the return type is.

The ideal solution would be to add type hints as specified by PEP 484, PEP 526, PEP 544, PEP 586, PEP 589, and PEP 591 directly on your code.

Concat on unicode objects

Sorry if this has already been patched in one of the forks.

In [189]: "héllo" | pipe.concat
Out[189]: 'h, \xc3, \xa9, l, l, o'

In [190]: u"héllo" | pipe.concat
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)

/home/chaouche/CODE/<ipython console> in <module>()

/usr/lib/python2.7/site-packages/pipe.py in __ror__(self, other)
350
351     def __ror__(self, other):
--> 352         return self.function(other)
353
354     def __call__(self, *args, **kwargs):

/usr/lib/python2.7/site-packages/pipe.py in concat(iterable, separator)
488 @Pipe
489 def concat(iterable, separator=", "):
--> 490     return separator.join(map(str,iterable))
491
492 @Pipe

UnicodeEncodeError: 'ascii' codec can't encode character u'\xc3' in position 0: ordinal not in range(128)

Shadowing built-in functions

These four functions in the package

  • any()
  • all()
  • max()
  • min()

shadow built-in functions. Since Python doesn't have an easy way to hide certain names during from pipe import *, may I suggest renaming them (perhaps simply prefixing them with p, like pmax)? Since we already took the stance of renaming map to select and filter to where, I think this is in keeping with the whole strategy and will allow users to from pipe import * without worrying about shadowing.

Since this is definitely a breaking change, I respect your opinion. The exact naming is also up to you.
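
A possible user-side workaround in the meantime, sketched here with the where pipe, is to skip the star import and use qualified names:

import pipe

# Qualified access never shadows the built-in any(), all(), max(), min().
print(sum([1, 2, 3, 4] | pipe.where(lambda x: x % 2 == 0)))  # 6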

@Pipe decorator for class methods

The Pipe decorator does not work on methods defined on a class.

This would be useful to make a pipe out of an existing method. As an example, I can do something like this:

class Job(requests.Session):
      def __init__(self, *args, **kwargs):
          super().__init__(*args, **kwargs)
  
      def get(self, items, url, *args, **kwargs):
          for item in items:
              yield super().get(url.format(item), *args, **kwargs).json()
  
      def jq(self, items, stmt):
          compiled = jq.compile(stmt)
          for item in items:
              yield from iter(compiled.input_value(item))

  job = Job()
  
  x = (
      ["hansthen", "JulienPalard"] |
      Pipe(job.get)("https://api.github.com/users/{}/repos") |
      Pipe(job.jq)(".[].license // empty | .name")
  )

But I would rather create the Pipe at class level, like this:

 class Job(requests.Session):
      def __init__(self, *args, **kwargs):
          super().__init__(*args, **kwargs)
  
      @Pipe
      def get(self, items, url, *args, **kwargs):
          for item in items:
              yield self.session.get(url.format(item), *args, **kwargs).json()
  
      @Pipe
      def jq(self, items, stmt):
          compiled = jq.compile(stmt)
          for item in items:
              yield from iter(compiled.input_value(item))
  
  
  job = Job()
  
  x = (
      ["hansthen", "JulienPalard"] |
      job.get("https://api.github.com/users/{}/repos") |
      job.jq(".[].license // empty | .name")
  )

Could you add a recipe to make the Pipe class work with class methods?

input Type dependent preprocessing function ?

How would you integrate a type-dependent preprocessing function like this one:

  def pre(fun, data):
            if hasattr(data, '__iter__') : return [ fun(d) for d in data ]
            else : return fun(data)

I tried several different ways, but can't get it to work!

IDEA: fan out operator

You could implement the & (or + ?) operator as a fan out operator.
It could be used if you want to send the output of one Pipe to several others at the same time, something like:

[1,2,3,4] | where(...) | ( stdout & sum & max )

Of course the result is a bit questionable... a pipeable tuple ?

Pipes with Context

How would I have, in addition to the iterator, a context:


@Pipe
def foo(it, ctx): ....

still use it like :

... | foo | bar | ...

Lambda replacements

Hi!

I like this package! I wrote something similar: https://github.com/gamis/flo, but I think I like yours better.

The one thing mine has that yours doesn't is a concise replacement for lambda functions. I find lambda x: x**2 kind of annoying. With my library, instead of having code that looks like

mylist = ['pretty','cool','items', 'kiddo']
myindex = mylist | map(lambda x: x.upper()) | where(lambda x: 'E' in x) | groupby(key=lambda x: x[0]) | select(lambda x: x[0])

it could look like

myindex = mylist | map( _.upper() ) | where( _.has('E') ) | groupby( _[0] ) | select( _[0] )

or potentially even terser:

myindex = mylist | map_.upper() | where_.has('E') | groupby_[0] | select_[0]

If you're interested in such an addition, I can work on a PR in the next month or two. In any case, I'd love any feedback you have.

Thanks!

Greg

pylint issue

The code snippet from the readme produces a pylint error:

sum(range(100) | where(lambda x: x % 2 == 0))

No value for argument 'predicate' in function call pylint(no-value-for-parameter)

Pipe does not work if left side object defines __or__

Hi,

I just found your project reading my github RSS feed and it is really nice! Congratz.

I have a project, Should-DSL, and I use the same __ror__ approach in some places.

But there is a problem with this approach. If the left side object defines its own __or__, the right side object's __ror__ (in your case, a Pipe instance's) is never called.

Let me show you an example:

from pipe import count

class MyList(list):
    def __or__(self, other):
        return "FOO"

def test_builtins():
   result = [1, 2, 3] | count
   assert result == 3, result

def test_custom_objects():
    result = MyList([1, 2, 3]) | count
    assert result == 3, result


if __name__ == '__main__':
    test_builtins()
    test_custom_objects()

The output follows:

Traceback (most recent call last):
  File "failing_example.py", line 18, in <module>
    test_custom_objects()
  File "failing_example.py", line 13, in test_custom_objects
    assert result == 3, result
AssertionError: FOO

Unfortunately I have no solution for this. The approach a friend of mine was trying to use in Should-DSL is to add a new operator for special cases - but that's not a good solution.

Please, let me know if you find out a solution!

Cheers,
Hugo.
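
One possible user-side workaround, sketched here with the where pipe: convert the left operand to a plain iterator first, so its own __or__ is never consulted and Python falls back to the Pipe's __ror__:

from pipe import where

class MyList(list):
    def __or__(self, other):
        return "FOO"

# iter() returns a plain list_iterator, which defines no __or__ of its own.
print(sum(iter(MyList([1, 2, 3])) | where(lambda x: x > 1)))  # 5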

passing parameters

hello,
what if I want to pass parameters to the functions?
like

[1,2,3] | func1(param1) |func2(param2)

The code should inject [1,2,3] into func1, adding param1.

Is this possible?
regards
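
Yes: extra arguments are what @Pipe-decorated functions receive after the iterable (see "Constructing your own" above). A minimal sketch with hypothetical add_n and mul_n pipes:

from pipe import Pipe

@Pipe
def add_n(iterable, n):
    # n is the extra parameter passed at the call site.
    return (x + n for x in iterable)

@Pipe
def mul_n(iterable, n):
    return (x * n for x in iterable)

print(list([1, 2, 3] | add_n(10) | mul_n(2)))  # [22, 24, 26]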

two loops

when i try to list() the result :

I'm trying to do nested iterators ...


In [108]: list( file('text/622_lines.txt') | twofor((sents,words))  )                                                                                                        
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-108-6a273f2a1594> in <module>
----> 1 list( file('text/622_lines.txt') | twofor((sents,words))  )

/......./lib/tools/text.py in twofor(it, its)
     28 @Pipe
     29 def twofor(it,its):
---> 30         for it1 in it | its[0]:
     31                 for item in it1 | its[1]:
     32                         yield item

TypeError: 'Pipe' object is not iterable


Please tag the commits corresponding to releases

Just noticed this really elegant library from a mention on StackOverflow, and I wanted to browse around and see what the changes are between versions -- and noticed that the versions are not tagged!

Could you tag them, or in case you did but the tags didn't get pushed to GitHub, could you push them? Thanks!

Configurable Pipeline with List of transformations

[Feature Request]
Hi @JulienPalard ,
I have an idea I want to run by you before I raise a pull request.
I want to add a capability where the user can define various transformations with @Pipe and then have these transformations as part of a configurable pipeline. This pipeline can be reused multiple times rather than defining the same set of transformations to be applied to the input every time.

Idea is:

class Pipeline:
    def __init__(self, methods):
        self.methods = methods

    def execute(self, value):
        return self.__recursive(value, self.methods)

    def __recursive(self, value, methods):
        if methods:
            return self.__recursive(value | methods[0], methods[1:])
        return value


if __name__ == '__main__':
    methods = [
        select(lambda x: x * x),
        where(lambda x: x % 2 == 1),
        collect
    ]
    pipeline = Pipeline(methods=methods)
    assert [1, 9, 25, 49, 81] == pipeline.execute(range(10))

Let me know what you think.
Thanks

about lazy iteration

Just a small question to clarify: I looked at the code, and this library is not lazily executed, right? May be a good idea to add a small note in the readme that this library is providing "syntactic sugar" but no "expected" / advanced functional programming goodies like lazy evaluation (which some people coming from other languages may expect quite strongly when they see some piping like, functional programming like syntax :) ).

Just to be clear, I love this library, and this is not a criticism, just I think it would be great to make it clear to the user - and also, that may be inspiration for future authors / future improvements etc :) .
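
For the record, the README now documents pipe as generator-based and lazy (see the "Lazy evaluation" section above); a quick check with an infinite generator:

>>> from itertools import count
>>> from pipe import select, take
>>> squares = count() | select(lambda x: x ** 2)  # nothing is computed yet
>>> list(squares | take(3))
[0, 1, 4]
>>>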

two iterators as args

Is this the correct way of doing it?


@Pipe
def prod(it): 
	for x in itertools.product(it[0],it[1]): yield x

i use it like this :

(it,it) | prod

Could the built-in functions always output an iterator?

When the input is an iterator, it's quite tempting to run next() on the output. But a number of the functions don't return an iterator, resulting in a TypeError. e.g.:

>>> next(range(10) | tail(2))
[...]
TypeError: 'collections.deque' object is not an iterator

It would be sweet if all/most of the functions would return an actual iterator. Exceptions would of course be things like as_list where you're explicitly asking for an output type.

My use-case, in case it matters, is leisurely throwing python scripts at a csv file that doesn't fit in memory. Pandas tries to load the whole thing in memory and fails, unless I use the chunksize argument, which makes it choke every so often -- and the syntax is god awful. Line by line seems to work fine, however slow. I ran into your library while looking for one that could basically support some kind of read_csv | some_stuff | more_stuff | write_csv type of workflow and do the entire thing row by row without me needing to reinvent the wheel. (Suggestions welcome if you know of a better option.)
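
A minimal workaround in the meantime, assuming tail still returns a deque as in the traceback above, is to wrap the result in iter():

>>> from pipe import tail
>>> next(iter(range(10) | tail(2)))
8
>>>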

allowing Pipe constructor with more arguments acting as a partial

The examples in the readme

sum(count() | select(lambda x: x ** 2) | take(10))

can also be written as:

count() | select(lambda x: x ** 2) | take(10) | Pipe(sum)

calling the pipe allows using it similarly to partial

count() | select(lambda x: x ** 2) | take(10) | Pipe(sum)(10)

The extra brackets are a little odd, I think.

What about this?

count() | select(lambda x: x ** 2) | take(10) | Pipe(sum,10)

chain with PLUS

I tried several things but couldn't make it work:

itertools.chain(it1, it2) | ...

instead I want to do it this way :

it1 + it2 | ....

As a reverse of that, I want to pass an iterator to multiple functions and join the results, something like:

it | (fun1 + fun2 + ..) | ...

which is :

fun1(it) + fun2(it) + ... | ...
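
For the first half, the documented chain_with pipe (see its entry above) already covers concatenating a second iterable; a sketch:

>>> from pipe import chain_with
>>> it1, it2 = (1, 2), (3, 4)
>>> list(it1 | chain_with(it2))
[1, 2, 3, 4]
>>>

The fan-out direction has no built-in equivalent (see the "IDEA: fan out operator" issue above).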

How do you handle types that implement | ?

If variable X of type Y implements the |-operator, the pipe does not work because the left variable has precedence in interpretation... how do you overcome that case? Ex.:

  X | ZZ

TypeError: bitarray object expected for bitwise operation

`groupby` Update and Use Case

groupby seems to still produce an itertools._grouper object, which appears to be a type in the process of being deprecated by 2.0.

I also wonder how the keyfunc parameter works and if it's scoped for pipe end users. I tried to pass something to it, and I got a multiple values error. Does it, by chance, allow for type recasting of the object returned? If not, a feature like that would be wonderful so as to quickly generate outputs of pipe operations (maybe pipes have something like that already?)

Finally, in the documentation x%2 and "Even" will produce unexpected results :)

unexpected indent

hello this
euler2 = fib() | where(lambda x: x % 2 == 0)
| take_while(lambda x: x < 4000000)
| add

gives me an error
| take_while(lambda x: x < 4000000)
^
IndentationError: unexpected indent

can you help?
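
Two things are going on here: Python needs parentheses to continue an expression across lines, and add is one of the closing pipes removed in pipe 2.0. The README's own formulation avoids both problems:

euler2 = sum(fib() | where(lambda x: x % 2 == 0)
                   | take_while(lambda x: x < 4000000))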

providing aliases corresponding to the functional programming "lingua franca"

Many thanks again for an awesome library.

As far as I know, there is no "hard" standardization of functional programming terminology across languages yet, but some conventions are still found across languages. Some of them are already present in this library (like map, take), some others are not present yet (like filter, reduce, head). The situation is a bit confusing for people switching between languages, because as said this is not standardized, there are aliases, and conflicting uses of the same terminology here and there.

Still, wondering if this library could be a nice occasion to try to stick to / follow / establish some lingua franca, by using aliases to map to other programming languages.

Some aliases I specially miss and that are more or less lingua franca are, for example, i) filter as an alias for where, ii) reduce (seems not implemented yet?).

  • do you think it would be reasonable to go the extra mile in this library, and think carefully / fit to / try to establish some lingua franca?
  • in addition (quite related, so putting it here, but could be moved to another issue), regarding the documentation in the readme (section "Existing Pipes in this module"), do you think it may be a good idea / possible to order the entries by alphabetical order, and maybe name all aliases in the line defining the "main" function?

PEP

Have you considered making a PEP for this? Is there one and I missed it?

closing pipes, like ```reduce```, are currently not supported

Moved from the initial discussion in #67 ; from the author:

For reduce I don't think pipe can handle it, I deprecated "closing pipes" a year or so ago:

What I mean by a "closing pipe" is a pipe that does not return an iterable, but a value, so itself cannot be on the left hand side of a pipe, "breaking" the pipe, example:

>>> range(100) | filter(lambda x: x % 2 == 0) | sum 

would be readable, OK, but sum does not return an iterable, so no further | can be used. I deprecated this in favor of the even shorter and standard:

>>> sum(range(100) | filter(lambda x: x % 2 == 0)) 

I think reduce enters this category of finalizing pipes so it can't be added, as it would be better written as:

>>> from functools import reduce
>>> reduce(lambda x, y: x + y, range(100) | filter(lambda x: x % 2 == 0))

I understand the point of the author, but I would like to disagree on this point :) . This is maybe mostly aesthetics, but to my eyes this:

res = ( range(100) | filter(lambda x: x % 2 == 0)
                   | reduce(lambda x, y: x + y) )

looks nicer and more readable than this:

>>> from functools import reduce
>>> reduce(lambda x, y: x + y, range(100) | filter(lambda x: x % 2 == 0))

Because in the first case, I can just "follow the logics" as it flows and as my brain expects it, but in the second case, I have to force my brain to remember that while most of the expression flows from left to right, the final step is actually an exception to this rule as it is at the far left... Also, this breaks my habits from other places where I see similarly formatted "functional programming" expressions, like rust et. co., that would look much more like the first way of formatting it.

Would there be a way to enable a syntax that looks like the first case, but without the worries about closing pipes that are raised by the author? For example, using a separate, special "closing pipe" class that would allow to perform checks and to issue meaningful error messages if closing pipes are used at the wrong place in the expression?

Using Pipes in the functions

I'm stuck on something that seems like it should be simple.

@Pipe
def flat(it): return sum(it,[])

syns = map(lambda z : z.synsets())

what I want instead is :

syns = map(lambda z : z.synsets()) | flat

so I can say :

.... | syns

instead of :

... | syns | flat

Starmap

Hi,
I was thinking of submitting a PR with a starmap function, something like this:

@Pipe
def starmap(iterable, selector):
    def starfunc(args):
        return selector(*args)
    return builtins.map(starfunc, iterable)

This is really useful in situations where you have an iterable of args that you want to pass to a function and have them unpacked positionally. An example might be parsing a row of values into a data model, e.g.:

PersonAges = namedtuple("PersonAges", ("name", "age"))

rows = [["john", 32], ["paul", 31], ["ringo", 33], ["george", 34]]

people = list(
    rows
    | starmap(PersonAges)
)

This is what you get:

[PersonAges(name='john', age=32),
 PersonAges(name='paul', age=31),
 PersonAges(name='ringo', age=33),
 PersonAges(name='george', age=34)]

Makes it much simpler than doing something like this (especially when you have a large number of values to map):

map(lambda row: PersonAges(name=row[0], age=row[1]))

I couldn't find anything resembling a starmap but if there is another way to achieve this let me know.

How about "apply" instead of "select"

Absolutely brilliant library. Thank you for it. Now that I have it as my shiny new hammer I'm looking for a rusty nail to use it on. One minor suggestion — did you consider other names for "select". As it is, "select" sounds like it should filter, not operate like map (which I see is also supported). Might "apply" be a better term?

Update package on pip

Hi,

Can you update the pipe package on pip?

Just installed the package and these are the functions available:
['Pipe', '__all__', '__author__', '__builtins__', '__credits__', '__date__', '__doc__', '__file__', '__name__', '__package__', '__version__', 'add', 'aggregate', 'all', 'any', 'as_dict', 'as_list', 'as_tuple', 'average', 'builtins', 'chain', 'chain_with', 'closing', 'concat', 'count', 'first', 'groupby', 'islice', 'itertools', 'izip', 'lineout', 'max', 'min', 'netcat', 'netwrite', 'permutations', 'reduce', 'reverse', 'select', 'skip', 'skip_while', 'socket', 'sort', 'stdout', 'sys', 'tail', 'take', 'take_while', 'tee', 'traverse', 'where']

There are no strip, lstrip, rstrip

Regards

add "how to install" to readmy

Add a how to install in the introduction of the readme, e.g.:
pip install pipe

Minor issue, but one can't assume it's simply called pipe on PyPI.

documentation correction

the docs state that

[1, 2, [3]] | chain

Gives a TypeError: chain argument #1 must support iteration Consider using traverse.

However, the command does not give an error but rather returns a <itertools.chain object>
Only when casting to a list:

list([1, 2, [3]] | chain)

there is an error TypeError: 'int' object is not iterable

Drawbacks of using this library

I love pipes in R, so this is a very enticing option for me. However, what are the main drawbacks of using this library? I imagine:

  • It's not "pythonic"
  • No type hinting support
  • Is performance affected?

Great work and thank you!
