
sql-training's Introduction

⚠️ This repository has been archived. ⚠️


Apache Flink® SQL Training

This repository provides a training for Flink's SQL API.

In this training you will learn to:

  • run SQL queries on streams.
  • use Flink's SQL CLI client.
  • perform window aggregations, stream joins, and pattern matching with SQL queries.
  • specify a continuous SQL query that maintains a dynamic result table.
  • write the result of streaming SQL queries to Kafka and MySQL.

Please find the training instructions in the Wiki of this repository.

Requirements

The training is based on Flink's SQL CLI client and uses Docker Compose to set up the training environment.

You only need Docker to run this training.
You don't need Java, Scala, or an IDE.

What is Apache Flink?

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.

What is SQL on Apache Flink?

Flink features multiple APIs with different levels of abstraction. SQL is supported by Flink as a unified API for batch and stream processing, i.e., queries are executed with the same semantics on unbounded, real-time streams or bounded, recorded streams and produce the same results. SQL on Flink is commonly used to ease the definition of data analytics, data pipelining, and ETL applications.

The following example shows a SQL query that computes the number of departing taxi rides per hour.

SELECT
  TUMBLE_START(rowTime, INTERVAL '1' HOUR) AS t,
  COUNT(*) AS cnt
FROM Rides
WHERE
  isStart
GROUP BY 
  TUMBLE(rowTime, INTERVAL '1' HOUR)
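To make the windowing semantics concrete, here is a minimal plain-Python sketch of what the query above computes. It is not Flink code and does not reflect how Flink executes the query. The `(row_time, is_start)` tuples and the sample timestamps are hypothetical stand-ins for the `rowTime` and `isStart` columns of the `Rides` table.

```python
from collections import Counter
from datetime import datetime


def tumble_start(ts: datetime) -> datetime:
    """Return the start of the one-hour tumbling window that contains ts."""
    return ts.replace(minute=0, second=0, microsecond=0)


def count_departures_per_hour(rides):
    """Count ride-start events per one-hour window.

    `rides` is an iterable of (row_time, is_start) tuples, a hypothetical
    stand-in for the rowTime and isStart columns of the Rides table.
    """
    counts = Counter()
    for row_time, is_start in rides:
        if is_start:  # corresponds to: WHERE isStart
            counts[tumble_start(row_time)] += 1  # GROUP BY TUMBLE(...)
    return dict(sorted(counts.items()))


# Made-up sample events: three departures and one ride end.
sample_rides = [
    (datetime(2013, 1, 1, 0, 5), True),
    (datetime(2013, 1, 1, 0, 40), True),
    (datetime(2013, 1, 1, 0, 55), False),
    (datetime(2013, 1, 1, 1, 10), True),
]

result = count_departures_per_hour(sample_rides)
for window_start, cnt in result.items():
    print(window_start.isoformat(), cnt)
```

Each event falls into exactly one window, which is what distinguishes a tumbling window from a sliding one. On a stream, Flink emits the count for a window once that window is complete, whereas this sketch simply processes a finite list.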

Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.

sql-training's People

Contributors

alpinegizmo, dependabot[bot], fhueske, twalthr, windber


sql-training's Issues

Jackson issue when selecting data from table

Hello,

There is an issue with the docker image as it's not working out of the box.
When running the SQL CLI and trying to 'select * from Rides' I get an error:
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassCastException: cannot assign instance of java.util.concurrent.ConcurrentHashMap to field org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.DeserializerCache._cachedDeserializers of type org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.util.LRUMap in instance of org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.DeserializerCache

failed to wget driverChanges.txt.gz when build sql-client

--2021-02-01 13:11:16-- https://drive.google.com/uc?export=download&id=1pf4tfv-YpoVQ9_O0948M8oXeCfVH-0MH
Resolving drive.google.com (drive.google.com)... 203.208.46.146, 74.125.71.94
Connecting to drive.google.com (drive.google.com)|203.208.46.146|:443... failed: Connection refused.
Connecting to drive.google.com (drive.google.com)|74.125.71.94|:443... failed: Connection refused.
--2021-02-01 13:11:58-- https://drive.google.com/uc?export=download&id=1SriiwcIdMvY7uJsWSY4Hhh32iO3F4ND2
Resolving drive.google.com (drive.google.com)... 203.208.46.146, 74.125.71.94
Connecting to drive.google.com (drive.google.com)|203.208.46.146|:443... failed: Connection refused.
Connecting to drive.google.com (drive.google.com)|74.125.71.94|:443... failed: Connection refused.
--2021-02-01 13:12:42-- https://drive.google.com/uc?export=download&id=1gY8W07OFvB7_4lHlAyingM4WQzs0_8lT
Resolving drive.google.com (drive.google.com)... 203.208.46.146, 74.125.71.94
Connecting to drive.google.com (drive.google.com)|203.208.46.146|:443... failed: Connection refused.
Connecting to drive.google.com (drive.google.com)|74.125.71.94|:443... failed: Connection refused.
ERROR: Service 'sql-client' failed to build : The command '/bin/sh -c wget -P /opt/sql-client/lib/ https://repo.maven.apache.org/maven2/org/apache/flink/flink-json/${FLINK_VERSION}/flink-json-${FLINK_VERSION}.jar; wget -P /opt/sql-client/lib/ https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-kafka_2.11/${FLINK_VERSION}/flink-sql-connector-kafka_2.11-${FLINK_VERSION}.jar; wget -P /opt/sql-client/lib/ https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-filesystem_2.11/${FLINK_VERSION}/flink-connector-filesystem_2.11-${FLINK_VERSION}.jar; wget -P /opt/flink/lib https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.7.5-8.0/flink-shaded-hadoop-2-uber-2.7.5-8.0.jar; wget -P /opt/sql-client/lib/ https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-jdbc_2.11/${FLINK_VERSION}/flink-connector-jdbc_2.11-${FLINK_VERSION}.jar; wget -P /opt/sql-client/lib/ https://repo.maven.apache.org/maven2/mysql/mysql-connector-java/8.0.19/mysql-connector-java-8.0.19.jar; mkdir -p /opt/data; mkdir -p /opt/data/stream; wget -O /opt/data/driverChanges.txt.gz 'https://drive.google.com/uc?export=download&id=1pf4tfv-YpoVQ9_O0948M8oXeCfVH-0MH'; wget -O /opt/data/fares.txt.gz 'https://drive.google.com/uc?export=download&id=1SriiwcIdMvY7uJsWSY4Hhh32iO3F4ND2'; wget -O /opt/data/rides.txt.gz 'https://drive.google.com/uc?export=download&id=1gY8W07OFvB7_4lHlAyingM4WQzs0_8lT';' returned a non-zero code: 4

how to use my own config.yaml

(base) localhost:~ sunfu$ docker-compose exec sql-client ./sql-client.sh
Reading default environment from: file:/opt/sql-client/conf/config.yaml
No session environment specified.
Validating current environment...done.

The client prints "Reading default environment from: file:/opt/sql-client/conf/config.yaml". How can I override this config.yaml with my own?

Cannot create container for service jobmanager: b'Invalid container name (_jobmanager_1), only [a-zA-Z0Creating _zookeeper_1

Status: Downloaded newer image for fhueske/flink-sql-client-training-1.7.2:latest
Creating _elasticsearch_1 ... error
Creating _jobmanager_1 ...
Creating _zookeeper_1 ...

Creating jobmanager_1 ... error
[a-zA-Z0-9][a-zA-Z0-9.-] are allowed'

ERROR: for _jobmanager_1 Cannot create container for service jobmanager: b'Invalid container name (_jobmanager_1), only [a-zA-Z0Creating _zookeeper_1 ... error

ERROR: for _zookeeper_1 Cannot create container for service zookeeper: b'Invalid container name (zookeeper_1), only [a-zA-Z0-9][a-zA-Z0-9.-] are allowed'

ERROR: for elasticsearch Cannot create container for service elasticsearch: b'Invalid container name (elasticsearch_1), only [a-zA-Z0-9][a-zA-Z0-9.-] are allowed'

ERROR: for jobmanager Cannot create container for service jobmanager: b'Invalid container name (jobmanager_1), only [a-zA-Z0-9][a-zA-Z0-9.-] are allowed'

ERROR: for zookeeper Cannot create container for service zookeeper: b'Invalid container name (zookeeper_1), only [a-zA-Z0-9][a-zA-Z0-9.-] are allowed'
ERROR: Encountered errors while bringing up the project.

Is there something wrong with the Solution of Average Number of Persons Leaving an Area Per Hour i

Average Number of Persons Leaving an Area Per Hour

SELECT
  area,
  SUM(psgSum)/24.0 AS avgPsgLeaving
FROM
  (SELECT 
     toAreaId(lon, lat) AS area,
     TUMBLE_END(rideTime ,INTERVAL '1' HOUR) AS t,
     SUM(psgCnt) AS psgSum
   FROM 
     Rides
   WHERE 
     isStart AND isInNYC(lon, lat)
   GROUP BY
     toAreaId(lon, lat),
     TUMBLE(rideTime, INTERVAL '1' HOUR))
GROUP BY
  area;

Why is SUM(psgSum) divided by 24.0?
What if the table holds only five hours of data, or more than 24 hours?
I think the divisor should be the number of hours between the start time and the end time of all the data read from the table.
If the goal is a daily average, the query needs additional restrictions to make sure that the data comes from a single, complete day.
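The arithmetic behind this question can be illustrated with a small plain-Python sketch. It is not Flink code and is not the training's official solution. It compares the fixed divisor of 24 with a divisor equal to the number of one-hour windows spanned by the data, as proposed above. The `(area, ride_time, psg_cnt)` tuples and all sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def avg_passengers_leaving(rides):
    """Return {area: (avg_with_fixed_24, avg_with_spanned_hours)}.

    `rides` is an iterable of (area, ride_time, psg_cnt) tuples for
    ride-start events, a hypothetical stand-in for the filtered Rides rows.
    """
    totals = defaultdict(int)  # area -> total passengers
    windows = set()            # start of every one-hour window seen
    for area, ride_time, psg_cnt in rides:
        totals[area] += psg_cnt
        windows.add(ride_time.replace(minute=0, second=0, microsecond=0))
    if not windows:
        return {}
    # Hours between the first and the last window, both inclusive.
    span_hours = int((max(windows) - min(windows)).total_seconds() // 3600) + 1
    return {
        area: (total / 24.0, total / span_hours)
        for area, total in totals.items()
    }


# Made-up data covering only five hours (00:00 to 04:59).
sample_rides = []
for hour in range(5):
    sample_rides.append((1, datetime(2013, 1, 1, hour, 10), 3))
    sample_rides.append((1, datetime(2013, 1, 1, hour, 40), 3))
sample_rides.append((2, datetime(2013, 1, 1, 0, 20), 2))

averages = avg_passengers_leaving(sample_rides)
for area, (fixed, spanned) in sorted(averages.items()):
    print(area, fixed, spanned)
```

With only five hours of data, area 1 has 30 departing passengers: dividing by 24 gives 1.25 per hour, while dividing by the five hours actually covered gives 6.0. The two divisors agree only when the input covers exactly 24 hours.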

Error on install

When starting sql-client.sh, the following error occurred. OS: Windows
C:\sql-training>docker-compose exec sql-client ./sql-client.sh
OCI runtime exec failed: exec failed: container_linux.go:370: starting container process caused: no such file or directory: unknown

Solution for can't run container in Docker19.x

Environment

Deepin15.11, Docker19.03

Problem

After running docker-compose up -d, the containers fail to start and their State is Exit 1.
docker-compose logs shows that Zookeeper hits a permission denied exception on a socket.
Using sudo or usermod -aG docker ${USER} does not help.

Solution

Finally I switched to Docker 18.0.9, and the containers ran normally.

Service 'sql-client' failed to build: The command '/bin/sh -c cd /opt/sql-udfs; mvn clean install' returned a non-zero code: 1

When I ran the command
docker-compose up
for the first time, this error occurred:
ERROR: Service 'sql-client' failed to build: The command '/bin/sh -c cd /opt/sql-udfs; mvn clean install' returned a non-zero code: 1

[ERROR] Failed to execute goal on project sql-training-udfs: Could not resolve dependencies for project com.ververica.sql-training:sql-training-udfs:jar:2-FLINK-1.11_2.11: Could not transfer artifact org.apache.flink:flink-shaded-zookeeper-3:jar:3.4.14-11.0 from/to central (https://repo.maven.apache.org/maven2): GET request of: org/apache/flink/flink-shaded-zookeeper-3/3.4.14-11.0/flink-shaded-zookeeper-3-3.4.14-11.0.jar from central failed: Premature end of Content-Length delimited message body (expected: 7,712,156; received: 2,974,083) -> [Help 1]
