ddf-project / ddf-flink
DDF with Flink Implementation
License: Apache License 2.0
It is a new method in the latest DDF master.
Infer the schema of a sql2ddf command MRQL Top Level query type.
We need to add representation handlers to convert the DDF to relevant formats as accepted by the ML Algorithms.
This is a prerequisite for the ML RepresentationHandlers, namely a correct implementation of these methods -
Depends on #33
Sorry about that, this was just a silly package naming error in io.spark.ddf, which itself should have been io.ddf.spark. See ddf-project/DDF#122
New method in latest DDF master
The following query throws an IllegalArgumentException:
select * from airlineNA where ( (case when Year is null then 1 else 0 end) + (case when Month is null then 1 else 0 end) + (case when DayofMonth is null then 1 else 0 end) + (case when DayOfWeek is null then 1 else 0 end) + (case when DepTime is null then 1 else 0 end) + (case when CRSDepTime is null then 1 else 0 end) + (case when ArrTime is null then 1 else 0 end) + (case when CRSArrTime is null then 1 else 0 end) + (case when UniqueCarrier is null then 1 else 0 end) + (case when FlightNum is null then 1 else 0 end) + (case when TailNum is null then 1 else 0 end) + (case when ActualElapsedTime is null then 1 else 0 end) + (case when CRSElapsedTime is null then 1 else 0 end) + (case when AirTime is null then 1 else 0 end) + (case when ArrDelay is null then 1 else 0 end) + (case when DepDelay is null then 1 else 0 end) + (case when Origin is null then 1 else 0 end) + (case when Dest is null then 1 else 0 end) + (case when Distance is null then 1 else 0 end) + (case when TaxiIn is null then 1 else 0 end) + (case when TaxiOut is null then 1 else 0 end) + (case when Cancelled is null then 1 else 0 end) + (case when CancellationCode is null then 1 else 0 end) + (case when Diverted is null then 1 else 0 end) + (case when CarrierDelay is null then 1 else 0 end) + (case when WeatherDelay is null then 1 else 0 end) + (case when NASDelay is null then 1 else 0 end) + (case when SecurityDelay is null then 1 else 0 end) + (case when LateAircraftDelay is null then 1 else 0 end) )< 1
The error message is
java.lang.IllegalArgumentException: Cannot parse [select * from airlineNA where ( (case when Year is null then 1 else 0 end) + (case when Month is null then 1 else 0 end) + (case when DayofMonth is null then 1 else 0 end) + (case when DayOfWeek is null then 1 else 0 end) + (case when DepTime is null then 1 else 0 end) + (case when CRSDepTime is null then 1 else 0 end) + (case when ArrTime is null then 1 else 0 end) + (case when CRSArrTime is null then 1 else 0 end) + (case when UniqueCarrier is null then 1 else 0 end) + (case when FlightNum is null then 1 else 0 end) + (case when TailNum is null then 1 else 0 end) + (case when ActualElapsedTime is null then 1 else 0 end) + (case when CRSElapsedTime is null then 1 else 0 end) + (case when AirTime is null then 1 else 0 end) + (case when ArrDelay is null then 1 else 0 end) + (case when DepDelay is null then 1 else 0 end) + (case when Origin is null then 1 else 0 end) + (case when Dest is null then 1 else 0 end) + (case when Distance is null then 1 else 0 end) + (case when TaxiIn is null then 1 else 0 end) + (case when TaxiOut is null then 1 else 0 end) + (case when Cancelled is null then 1 else 0 end) + (case when CancellationCode is null then 1 else 0 end) + (case when Diverted is null then 1 else 0 end) + (case when CarrierDelay is null then 1 else 0 end) + (case when WeatherDelay is null then 1 else 0 end) + (case when NASDelay is null then 1 else 0 end) + (case when SecurityDelay is null then 1 else 0 end) + (case when LateAircraftDelay is null then 1 else 0 end) )< 1] because `)' expected but `w' found
[info] at io.ddf.flink.content.SqlSupport$TableDdlParser.parse(SqlSupport.scala:302)
[info] at io.ddf.flink.etl.SqlHandler.parse(SqlHandler.scala:26)
[info] at io.ddf.flink.etl.SqlHandler.sql2ddf(SqlHandler.scala:29)
[info] at io.ddf.flink.etl.SqlHandler.sql2ddf(SqlHandler.scala:154)
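To narrow down which part of the statement the parser rejects, it may help to regenerate the same query with fewer columns. The sketch below rebuilds the null-count predicate from a list of column names, in the same textual form as the query above. The names `NullCountQuery`, `nullFlag`, and `build` are hypothetical helpers for illustration and are not part of ddf-flink.

```scala
// Hypothetical helper for producing smaller reproductions of the failing query.
// It only builds SQL text; it does not call ddf-flink or Flink.
object NullCountQuery {

  // One term of the sum: 1 when the column is null, 0 otherwise.
  def nullFlag(column: String): String =
    s"(case when $column is null then 1 else 0 end)"

  // Full statement: select the rows whose count of null columns is below maxNulls.
  def build(table: String, columns: Seq[String], maxNulls: Int): String = {
    require(columns.nonEmpty, "at least one column is required")
    val sum = columns.map(nullFlag).mkString(" + ")
    s"select * from $table where ( $sum )< $maxNulls"
  }

  def main(args: Array[String]): Unit = {
    // A three-column variant of the query from the report.
    val columns = Seq("Year", "Month", "DayofMonth")
    println(build("airlineNA", columns, 1))
  }
}
```

Passing the generated string to sql2ddf with one column, then two, and so on, should show whether the failure depends on the number of terms or on the `case when` expression itself.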
We should review the test cases of the Spark implementation, then document and write test specs for our Flink implementation.
Come up with approach for Flink integration.
Upgrade the Flink version to 0.9.1 and implement FlinkDDF using a "Table" based representation
Is it possible to delete the specified DDF when a drop table command is executed on it?
We need to extract the underlying DataSet from the DDF so that it can be passed to Flink's ML algorithms.
Hi @tuplejump/owners, apparently GitHub defaults to open access via OAuth unless you specifically reconfigure it. Please go to https://github.com/organizations/tuplejump/settings/oauth_application_policy and click on "Set up application access restrictions", then "Restrict third-party application access".
I ran into this when authorizing spark-packages to connect to ddf-project, then noticed that it proceeds to propose to grant access to all the repos I have access to, which is not a good idea.
Cheers,
https://www.dropbox.com/s/pk6xe0vxfhs06ni/Screenshot%202015-06-17%2000.51.56.png?dl=0