Git Product home page Git Product logo

dsc-1-05-10-join-statements's Introduction

Join Statements

Introduction

In this section, you will learn about several types of Join statements.

Objectives

You will be able to:

  • Compare and contrast the various types of joins
  • Understand the structure of Join statements, and the role of foreign and primary keys in them

CRM Schema

In almost all cases, rather then just working with a single table we will typically need data from multiple tables. Doing this requires the use of **joins ** using shared columns from the two tables. For example, here's a diagram of a mock customer relation management database.

Connecting to the Database

import sqlite3
import pandas as pd
conn = sqlite3.connect('data.sqlite', detect_types=sqlite3.PARSE_COLNAMES)
cur = conn.cursor()

Displaying product details along with order details

Let's say we need to generate some report that includes details about products from orders. To do that, we would need to take data from multiple tables in a single statement.

cur.execute("""select * from orderdetails
                        join products
                        on orderdetails.productCode = products.productCode
                        limit 10;
                       """)
df = pd.DataFrame(cur.fetchall()) #Take results and create dataframe
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
orderNumber productCode quantityOrdered priceEach orderLineNumber productCode productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP
0 10100 S18_1749 30 136.00 3 S18_1749 1917 Grand Touring Sedan Vintage Cars 1:18 Welly Diecast Productions This 1:18 scale replica of the 1917 Grand Tour... 2724 86.70 170.00
1 10100 S18_2248 50 55.09 2 S18_2248 1911 Ford Town Car Vintage Cars 1:18 Motor City Art Classics Features opening hood, opening doors, opening ... 540 33.30 60.54
2 10100 S18_4409 22 75.46 4 S18_4409 1932 Alfa Romeo 8C2300 Spider Sport Vintage Cars 1:18 Exoto Designs This 1:18 scale precision die cast replica fea... 6553 43.26 92.03
3 10100 S24_3969 49 35.29 1 S24_3969 1936 Mercedes Benz 500k Roadster Vintage Cars 1:24 Red Start Diecast This model features grille-mounted chrome horn... 2081 21.75 41.03
4 10101 S18_2325 25 108.06 4 S18_2325 1932 Model A Ford J-Coupe Vintage Cars 1:18 Autoart Studio Design This model features grille-mounted chrome horn... 9354 58.48 127.13

Compared to the individual tables:

orderdetails

cur.execute("""select * from orderdetails limit 10;""")
df = pd.DataFrame(cur.fetchall()) #Take results and create dataframe
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
orderNumber productCode quantityOrdered priceEach orderLineNumber
0 10100 S18_1749 30 136.00 3
1 10100 S18_2248 50 55.09 2
2 10100 S18_4409 22 75.46 4
3 10100 S24_3969 49 35.29 1
4 10101 S18_2325 25 108.06 4

products

cur.execute("""select * from products limit 10;""")
df = pd.DataFrame(cur.fetchall()) #Take results and create dataframe
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
productCode productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP
0 S10_1678 1969 Harley Davidson Ultimate Chopper Motorcycles 1:10 Min Lin Diecast This replica features working kickstand, front... 7933 48.81 95.70
1 S10_1949 1952 Alpine Renault 1300 Classic Cars 1:10 Classic Metal Creations Turnable front wheels; steering function; deta... 7305 98.58 214.30
2 S10_2016 1996 Moto Guzzi 1100i Motorcycles 1:10 Highway 66 Mini Classics Official Moto Guzzi logos and insignias, saddl... 6625 68.99 118.94
3 S10_4698 2003 Harley-Davidson Eagle Drag Bike Motorcycles 1:10 Red Start Diecast Model features, official Harley Davidson logos... 5582 91.02 193.66
4 S10_4757 1972 Alfa Romeo GTA Classic Cars 1:10 Motor City Art Classics Features include: Turnable front wheels; steer... 3252 85.68 136.00

the using clause

A more concise way to join the tables if the column name is identical is the using clause.

cur.execute("""select * from orderdetails
                        join products
                        using(productCode)
                        limit 10;
                       """)
df = pd.DataFrame(cur.fetchall()) #Take results and create dataframe
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
orderNumber productCode quantityOrdered priceEach orderLineNumber productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP
0 10100 S18_1749 30 136.00 3 1917 Grand Touring Sedan Vintage Cars 1:18 Welly Diecast Productions This 1:18 scale replica of the 1917 Grand Tour... 2724 86.70 170.00
1 10100 S18_2248 50 55.09 2 1911 Ford Town Car Vintage Cars 1:18 Motor City Art Classics Features opening hood, opening doors, opening ... 540 33.30 60.54
2 10100 S18_4409 22 75.46 4 1932 Alfa Romeo 8C2300 Spider Sport Vintage Cars 1:18 Exoto Designs This 1:18 scale precision die cast replica fea... 6553 43.26 92.03
3 10100 S24_3969 49 35.29 1 1936 Mercedes Benz 500k Roadster Vintage Cars 1:24 Red Start Diecast This model features grille-mounted chrome horn... 2081 21.75 41.03
4 10101 S18_2325 25 108.06 4 1932 Model A Ford J-Coupe Vintage Cars 1:18 Autoart Studio Design This model features grille-mounted chrome horn... 9354 58.48 127.13

Aliasing

Alternatively, you can also alias tables by giving them an alternative shorthand name directly after them. Here we use the aliases 'o' and 'p' for orderdetails and products respectively.

cur.execute("""select * from orderdetails o
                        join products p
                        on o.productCode = p.productCode
                        limit 10;
                       """)
df = pd.DataFrame(cur.fetchall()) #Take results and create dataframe
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
orderNumber productCode quantityOrdered priceEach orderLineNumber productCode productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP
0 10100 S18_1749 30 136.00 3 S18_1749 1917 Grand Touring Sedan Vintage Cars 1:18 Welly Diecast Productions This 1:18 scale replica of the 1917 Grand Tour... 2724 86.70 170.00
1 10100 S18_2248 50 55.09 2 S18_2248 1911 Ford Town Car Vintage Cars 1:18 Motor City Art Classics Features opening hood, opening doors, opening ... 540 33.30 60.54
2 10100 S18_4409 22 75.46 4 S18_4409 1932 Alfa Romeo 8C2300 Spider Sport Vintage Cars 1:18 Exoto Designs This 1:18 scale precision die cast replica fea... 6553 43.26 92.03
3 10100 S24_3969 49 35.29 1 S24_3969 1936 Mercedes Benz 500k Roadster Vintage Cars 1:24 Red Start Diecast This model features grille-mounted chrome horn... 2081 21.75 41.03
4 10101 S18_2325 25 108.06 4 S18_2325 1932 Model A Ford J-Coupe Vintage Cars 1:18 Autoart Studio Design This model features grille-mounted chrome horn... 9354 58.48 127.13

Left Joins

Above, we have only been doing inner joins which is the intersection of the two tables. There are many other types of joins, displayed below. Of these, sqlite does not support outer joins, but it is good to be aware of as more powerful versions of sql such as postgresql support these additional functions.

For example, the statement

select * from products left join orderdetails;

would return all products, even those that hadn't been ordered. We can imagine that all products in inventory should have a description in the product table, but perhaps not every product is represented in the orderdetails table.

cur.execute("""select * from products
                        left join orderdetails
                        using(productCode);
                       """)
df = pd.DataFrame(cur.fetchall()) #Take results and create dataframe
df.columns = [i[0] for i in cur.description]
print(len(df))
print(len(df[df.orderNumber.isnull()]))
df[df.orderNumber.isnull()].head()
2997
1
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
productCode productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP orderNumber quantityOrdered priceEach orderLineNumber
1122 S18_3233 1985 Toyota Supra Classic Cars 1:18 Highway 66 Mini Classics This model features soft rubber tires, working... 7733 57.01 107.57 None None None None

As you can see, its rare, but there is one product that has yet to be ordered

Primary Versus Foreign Keys

Another important consideration when performing joins is to think more about the key or column you are joining on. As we'll see in upcoming lessons, this can lead to interesting behavior if the join value is not unique in one or both of the tables. In all of the above examples, we joined two tables using the primary key. The primary key(s) of a table are those column(s) which uniquely identify a row. You'll also see this designated in our schema diagram with the asterisk (*).

We can also join tables using foreign keys which are not the primary key for that particular table, but rather another table. For example, employeeNumber is the primary key for the employees table and corresponds to the salesRepEmployeeNumber of the customers table. In the customers table, salesRepEmployeeNumber is only a foreign key, and is unlikely to be a unique identifier, as it is likely that an employee serves multiple customers. As such, in the resulting view, employeeNumber would no longer be a unique field.

cur.execute("""select * from customers c
                        join employees e
                        on c.salesRepEmployeeNumber = e.employeeNumber
                        order by employeeNumber;
                       """)
df = pd.DataFrame(cur.fetchall()) #Take results and create dataframe
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
customerNumber customerName contactLastName contactFirstName phone addressLine1 addressLine2 city state postalCode ... salesRepEmployeeNumber creditLimit employeeNumber lastName firstName extension email officeCode reportsTo jobTitle
0 124 Mini Gifts Distributors Ltd. Nelson Susan 4155551450 5677 Strong St. San Rafael CA 97562 ... 1165 210500.00 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep
1 129 Mini Wheels Co. Murphy Julie 6505555787 5557 North Pendale Street San Francisco CA 94217 ... 1165 64600.00 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep
2 161 Technics Stores Inc. Hashimoto Juri 6505556809 9408 Furth Circle Burlingame CA 94217 ... 1165 84600.00 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep
3 321 Corporate Gift Ideas Co. Brown Julie 6505551386 7734 Strong St. San Francisco CA 94217 ... 1165 105000.00 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep
4 450 The Sharp Gifts Warehouse Frick Sue 4085553659 3086 Ingle Ln. San Jose CA 94217 ... 1165 77600.00 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep

5 rows ร— 21 columns

Notice that this also returned both columns: salesRepEmployeeNumber and employeeNumber.

Summary

In this lesson we investigated joins including the on and using clause, aliasing table names, left joins and primary and foreign keys.

dsc-1-05-10-join-statements's People

Contributors

loredirick avatar mathymitchell avatar tkoar avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

dsc-1-05-10-join-statements's Issues

extra space

In the code snippets
cur.execute("""select * from products limit 10;""") df = pd.DataFrame(cur.fetchall()) #Take results and create dataframe df. columns = [i[0] for i in cur.description] df.head()
You have an extra space at line 3 (df. columns).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.