Lambda Functions

Introduction

Lambda functions are often a convenient way to write throw-away functions on the fly. A more complicated function still calls for a full def statement, but for short, single-expression logic a lambda function provides a quick and concise alternative.
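
As a minimal sketch of that equivalence (the names `word_count` and `word_count_lambda` are hypothetical, chosen only for illustration), the same one-expression function can be written both ways:

```python
# A named function written with def
def word_count(text):
    return len(text.split())

# The same logic as a lambda; a lambda body is limited to a single expression
word_count_lambda = lambda text: len(text.split())

sample = "Lambda functions are concise"
print(word_count(sample), word_count_lambda(sample))
```

Binding a lambda to a name is done here only for side-by-side comparison; lambdas are normally passed directly as an argument to another function.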

Objectives

You will be able to:

  • Describe the purpose of lambda functions, when they should be employed, and their limitations
  • Create lambda functions to use as arguments of other functions
  • Use the .map() or .apply() method to apply a function to a pandas Series or DataFrame
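
The difference between the two methods named in the last objective can be sketched on a tiny, made-up DataFrame (the `toy` data below is hypothetical and is not the Yelp dataset used later). This assumes pandas is installed:

```python
import pandas as pd

# Hypothetical toy data, not the Yelp dataset
toy = pd.DataFrame({'text': ['good food', 'slow service today'],
                    'stars': [5, 2]})

# Series.map applies the function to each element of one column
word_counts = toy['text'].map(lambda x: len(x.split()))

# DataFrame.apply with axis=1 passes each row to the function
summary = toy.apply(
    lambda row: '{} stars: {}'.format(row['stars'], row['text']), axis=1)

print(word_counts.tolist())
print(summary.tolist())
```

Roughly speaking, .map() is the natural choice when the function needs one value at a time, and .apply() with axis=1 when it needs several columns of the same row.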

Example

Let's say you want to count the number of words in each Yelp review.

import pandas as pd
df = pd.read_csv('Yelp_Reviews.csv', index_col=0)
df.head(2)
   business_id             cool  date        funny  review_id               stars  text                                               useful  user_id
1  pomGBqfbxcqPv14c3XH-ZQ  0     2012-11-13  0      dDl8zu1vWPdKGihJrwQbpw  5      I love this place! My fiance And I go here atl...  0       msQe1u7Z_XuqjGoqhB0J5g
2  jtQARsP6P-LbkyjbO1qNGg  1     2014-10-23  1      LZp4UX5zK3e-c5ZGSeo3kA  1      Terrible. Dry corn bread. Rib tips were all fa...  3       msQe1u7Z_XuqjGoqhB0J5g
df['text'].map(lambda x: len(x.split())).head()
1     58
2     30
4     30
5     82
10    32
Name: text, dtype: int64

As with parameter names in ordinary functions or the loop variable in a for loop, the name you give the parameter after the lambda keyword does not matter:

df['text'].map(lambda review_text: len(review_text.split())).head()
1     58
2     30
4     30
5     82
10    32
Name: text, dtype: int64

Lambda functions with conditionals

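Lambda functions can also include conditional logic by using a conditional expression (value_if_true if condition else value_if_false). Here the condition is built from a list comprehension passed to any():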

df['text'].map(lambda x: 'Good' if any([word in x.lower() for word in ['awesome', 'love', 'good', 'great']]) else 'Bad').head()
1     Good
2      Bad
4     Good
5      Bad
10     Bad
Name: text, dtype: object

Note

The above is poor style and in no way represents PEP 8 or Pythonic style. (For example, PEP 8 limits lines of code to 79 characters, and comments and docstrings to 72; the previous line was 127 characters.) That said, it is an interesting demonstration of chaining a conditional expression, the built-in any() function, and a list comprehension all inside a lambda function!
Whew!
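
For comparison, here is one way the same logic might be written more readably with def. The names `POSITIVE_WORDS` and `label_review` are hypothetical; the keyword list is the one used in the lambda above.

```python
# Hypothetical names; the keyword list is taken from the lambda above
POSITIVE_WORDS = ['awesome', 'love', 'good', 'great']

def label_review(text):
    """Return 'Good' if any positive keyword appears in the text, else 'Bad'."""
    lowered = text.lower()
    if any(word in lowered for word in POSITIVE_WORDS):
        return 'Good'
    return 'Bad'

print(label_review('I love this place!'))
print(label_review('Terrible. Dry corn bread.'))
```

A named function like this can be passed to .map() in place of the lambda, for example df['text'].map(label_review). Note that both versions match substrings, so 'good' would also match 'goodbye'.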

Returning to a more manageable example...

Perhaps we want to naively select the year from the date string rather than convert it to a datetime object.

df.date.map(lambda x: x[:4]).head()
1     2012
2     2014
4     2014
5     2011
10    2016
Name: date, dtype: object
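
For contrast, the non-naive route through datetime objects could look like the sketch below. It assumes pandas is installed, and the `dates` Series is a hypothetical stand-in for the date column, in the same YYYY-MM-DD format.

```python
import pandas as pd

# Hypothetical sample of date strings in the same YYYY-MM-DD format
dates = pd.Series(['2012-11-13', '2014-10-23'])

# Parse the strings into datetimes, then read the year through the .dt accessor
years = pd.to_datetime(dates).dt.year

print(years.tolist())
```

The string slice returns the year as text ('2012'), whereas this version returns integers, which matters if you later sort or do arithmetic on the result.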

Lambda functions are also useful as the key argument of the sorted() function (and of the list.sort() method)

# Without a key
names = ['Miriam Marks','Sidney Baird','Elaine Barrera','Eddie Reeves','Marley Beard',
         'Jaiden Liu','Bethany Martin','Stephen Rios','Audrey Mayer','Kameron Davidson',
         'Carter Wong','Teagan Bennett']
sorted(names)
['Audrey Mayer',
 'Bethany Martin',
 'Carter Wong',
 'Eddie Reeves',
 'Elaine Barrera',
 'Jaiden Liu',
 'Kameron Davidson',
 'Marley Beard',
 'Miriam Marks',
 'Sidney Baird',
 'Stephen Rios',
 'Teagan Bennett']
# Sorting by last name
names = ['Miriam Marks','Sidney Baird','Elaine Barrera','Eddie Reeves','Marley Beard',
         'Jaiden Liu','Bethany Martin','Stephen Rios','Audrey Mayer','Kameron Davidson',
         'Teagan Bennett']
sorted(names, key=lambda x: x.split()[1])
['Sidney Baird',
 'Elaine Barrera',
 'Marley Beard',
 'Teagan Bennett',
 'Kameron Davidson',
 'Jaiden Liu',
 'Miriam Marks',
 'Bethany Martin',
 'Audrey Mayer',
 'Eddie Reeves',
 'Stephen Rios']
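
The key function can also return a tuple to break ties, and list.sort() accepts the same key argument. A small sketch, using a hypothetical list in which two people share a last name:

```python
# Hypothetical short list with two people sharing a last name
people = ['Stephen Rios', 'Audrey Mayer', 'Alma Rios']

# A tuple key sorts by last name first, then by first name to break ties
by_last_then_first = sorted(people, key=lambda x: (x.split()[1], x.split()[0]))

# list.sort takes the same key argument but sorts the list in place
people.sort(key=lambda x: x.split()[1], reverse=True)

print(by_last_then_first)
print(people)
```

sorted() returns a new list and leaves the original untouched, while list.sort() modifies the list and returns None.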

A general approach to writing [Data Transformation] Functions

Above, we've covered a lot of the syntax of lambda functions, but the thought process for writing these complex transformations was not transparent. Let's take a minute to discuss some approaches to tackling these problems.

Experiment and solve for individual cases first

Before trying to write a function to apply to an entire series, it's typically easier to attempt to solve for an individual case. For example, if we're trying to determine the number of words in a review, we can try and do this for a single review first.

First, choose an example field that you'll be applying the function to.

example = df['text'].iloc[0]
example
'I love this place! My fiance And I go here atleast once a week. The portions are huge! Food is amazing. I love their carne asada. They have great lunch specials... Leticia is super nice and cares about what you think of her restaurant. You have to try their cheese enchiladas too the sauce is different And amazing!!!'

Then start writing the function for that example. To count the number of words, a natural first step is to divide the review into words, which the str.split() method does.

example.split()
['I',
 'love',
 'this',
 'place!',
 'My',
 'fiance',
 'And',
 'I',
 'go',
 'here',
 'atleast',
 'once',
 'a',
 'week.',
 'The',
 'portions',
 'are',
 'huge!',
 'Food',
 'is',
 'amazing.',
 'I',
 'love',
 'their',
 'carne',
 'asada.',
 'They',
 'have',
 'great',
 'lunch',
 'specials...',
 'Leticia',
 'is',
 'super',
 'nice',
 'and',
 'cares',
 'about',
 'what',
 'you',
 'think',
 'of',
 'her',
 'restaurant.',
 'You',
 'have',
 'to',
 'try',
 'their',
 'cheese',
 'enchiladas',
 'too',
 'the',
 'sauce',
 'is',
 'different',
 'And',
 'amazing!!!']

Then we just need to count this!

len(example.split())
58

Then return to solving for all!

df['text'].map(lambda x: len(x.split())).head()
1     58
2     30
4     30
5     82
10    32
Name: text, dtype: int64

Watch for edge cases and exceptions

When generalizing from a single case to all cases, it's important to consider exceptions or edge cases. For example, in the above example, you might wonder whether extra spaces or punctuation affect the output.

'this is a      weird test!!!Can we break it??'.split()
['this', 'is', 'a', 'weird', 'test!!!Can', 'we', 'break', 'it??']

As you can see, extra spaces won't affect the count, but a missing space after punctuation merges two words into one ('test!!!Can'). Perhaps this is a rare enough event that we don't worry further, but exceptions are always something to consider when writing functions.
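
If that edge case did matter, one possible alternative (a sketch, not the approach used in this lesson) is to count runs of word characters with a regular expression instead of splitting on whitespace. The helper name `count_words` is hypothetical.

```python
import re

def count_words(text):
    """Count runs of letters, digits, underscores, or apostrophes as words."""
    return len(re.findall(r"[\w']+", text))

tricky = 'this is a      weird test!!!Can we break it??'
print(len(tricky.split()))   # whitespace splitting
print(count_words(tricky))   # regex-based counting
```

This has trade-offs of its own: a hyphenated word such as 'well-known' would be counted as two words.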

Other Common Patterns: the % and // operators

Another common pattern involves the modulus, or remainder, operator (%) and the floor division operator (//). Both are useful when you want behavior such as 'every fourth element' or 'groups of three consecutive elements'. Let's investigate a couple of examples.

The modulus operator (%)

Useful for queries such as 'every other element' or 'every fifth element' etc.

# Try a single example
3%2
1
2%2
0
# Generalize the pattern: every other
for i in range(10):
    print('i: {}, i%2: {}'.format(i, i%2))
i: 0, i%2: 0
i: 1, i%2: 1
i: 2, i%2: 0
i: 3, i%2: 1
i: 4, i%2: 0
i: 5, i%2: 1
i: 6, i%2: 0
i: 7, i%2: 1
i: 8, i%2: 0
i: 9, i%2: 1
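
Applied to an actual list, the pattern might look like this minimal sketch (the `letters` list is a made-up example):

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Keep elements whose position is even: 0, 2, 4, ...
every_other = [x for i, x in enumerate(letters) if i % 2 == 0]

# Keep every third element, starting from position 0
every_third = [x for i, x in enumerate(letters) if i % 3 == 0]

print(every_other)
print(every_third)
```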

The floor division (//) operator

Useful for creating groups of a set size. For example: groups of ten, groups of seven, etc.

# Try a single example
9//3
3
5//3
1
# Generalize the pattern: groups of three
for i in range(10):
    print('i: {}, i//3: {}'.format(i, i//3))
i: 0, i//3: 0
i: 1, i//3: 0
i: 2, i//3: 0
i: 3, i//3: 1
i: 4, i//3: 1
i: 5, i//3: 1
i: 6, i//3: 2
i: 7, i//3: 2
i: 8, i//3: 2
i: 9, i//3: 3
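
To turn those group numbers into actual groups, one option is to use the floor-division result as a dictionary key. This is a sketch with made-up data; `items` and `group_size` are hypothetical names.

```python
items = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
group_size = 3

groups = {}
for i, item in enumerate(items):
    # Positions 0-2 map to group 0, positions 3-5 to group 1, and so on
    groups.setdefault(i // group_size, []).append(item)

print(groups)
```

The last group is simply shorter when the number of items is not a multiple of the group size.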

Combining % and //

Combining the two can be very useful, such as when creating subplots! Below we iterate through 12 elements arranging them into 3 rows and 4 columns.

for i in range(12):
    print('i: {}, Row: {} Column: {}'.format(i, i//4, i%4))
i: 0, Row: 0 Column: 0
i: 1, Row: 0 Column: 1
i: 2, Row: 0 Column: 2
i: 3, Row: 0 Column: 3
i: 4, Row: 1 Column: 0
i: 5, Row: 1 Column: 1
i: 6, Row: 1 Column: 2
i: 7, Row: 1 Column: 3
i: 8, Row: 2 Column: 0
i: 9, Row: 2 Column: 1
i: 10, Row: 2 Column: 2
i: 11, Row: 2 Column: 3
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
fig, axes = plt.subplots(nrows=3, ncols=4, figsize=(10,10))
x = np.linspace(start=-10, stop=10, num=10*83)
for i in range(12):
    row = i//4
    col = i%4
    ax = axes[row, col]
    ax.scatter(x, x**i)
    ax.set_title('Plot of x^{}'.format(i))
plt.show()

[Figure: 3 x 4 grid of scatter plots, one per panel, titled 'Plot of x^0' through 'Plot of x^11']
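
Python's built-in divmod() returns the floor-division and modulus results together, so the row and column can be computed in one step. A small sketch, with the hypothetical name `n_cols` standing in for the number of columns:

```python
n_cols = 4

# divmod(i, n_cols) returns the pair (i // n_cols, i % n_cols)
positions = [divmod(i, n_cols) for i in range(12)]

print(positions[:5])
```

Inside the plotting loop, this would let you write row, col = divmod(i, 4) instead of computing the two values separately.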

Summary

Lambda functions can be a convenient way to write "throw away" functions that you want to declare inline. In the next lesson we'll give you some practice with creating them!
