Comments (7)
Here's the code...
package main

import (
	"flag"
	"fmt"

	"bitbucket.org/myapp/db"
	"bitbucket.org/myapp/model"
	"bitbucket.org/myapp/ptmath"
	_ "github.com/chrislusf/glow/driver"
	"github.com/chrislusf/glow/flow" // not a blank import: flow.New() is called below
)

// GroupResult - grouping result
type GroupResult struct {
	LeaderCandidate *model.PtSet
	GroupSize       int
}

// Group2Test - input for grouping test
type Group2Test struct {
	LeaderCandidate *model.PtSet
	PtSet           []model.PtSet
}

func main() {
	flag.Parse()

	// get PointSets from DB
	ptSets := db.GetAvailablePointSets()

	// Convert points to a different coordinate system before running analysis:
	for idx, ptSet := range ptSets {
		conversionMatrix := ptmath.GetConversionMatrix(ptSet)
		var xyzSlice []model.XyzPt
		for _, pt := range ptSet.PtSourceSlice {
			xyz := ptmath.CalculateXYZ(*conversionMatrix, pt)
			xyzSlice = append(xyzSlice, *xyz)
		}
		ptSets[idx].xyzSlice = xyzSlice
	}

	// map reduce method to find biggest group:
	bestGroupLeader := loadCadidates(ptSets)
	fmt.Println("\nBest candidate ID:", bestGroupLeader.LeaderCandidate.ID, ", GroupSize: ", bestGroupLeader.GroupSize)
}

var (
	f       = flow.New()
	flowOut = make(chan GroupResult)
)

func loadCadidates(ptSets []model.PtSet) *GroupResult {
	var bestCandidate *GroupResult
	f.Source(func(out chan Group2Test) {
		// emit one task per leader candidate; each task carries every point set
		for ptSetIdx := range ptSets {
			out <- Group2Test{&ptSets[ptSetIdx], ptSets}
		}
	}, /*len(ptSets)*/ 10).Map(func(g2Test Group2Test) GroupResult {
		return loadCandidateGroupSize(g2Test.LeaderCandidate, g2Test.PtSet)
	}).Reduce(func(x GroupResult, y GroupResult) GroupResult {
		// find the largest group:
		if x.GroupSize > y.GroupSize {
			return x
		}
		return y
	}).Map(func(winner GroupResult) {
		fmt.Println("\nBest ID:", winner.LeaderCandidate.ID, ", GroupSize: ", winner.GroupSize)
		bestCandidate = &winner
	}).Run()
	return bestCandidate
}

func loadCandidateGroupSize(leaderCandidate *model.PtSet, ptSets []model.PtSet) GroupResult {
	// count how many point sets have all points within some distance of a leaderCandidate
	setSize := len(leaderCandidate.xyzSlice)
	limit := 5
	groupSize := 0
	for _, ptSet := range ptSets {
		// largest point-to-point distance between this set and the leader candidate
		maxDist := 0
		for idx := 0; idx < setSize; idx++ {
			ptDist := ptmath.Distance(ptSet.xyzSlice[idx], leaderCandidate.xyzSlice[idx])
			if ptDist > maxDist {
				maxDist = ptDist
			}
		}
		if maxDist < limit {
			groupSize++
		}
	}
	return GroupResult{LeaderCandidate: leaderCandidate, GroupSize: groupSize}
}
from glow.
Agent1 appears to be creating dozens and dozens of 1 GB dat files.
Agent2 and Agent3 (if using the provided /etc shell script for my own code) created a few small files and then finished.
How can I avoid this file blowup on agent1?
[edit]: looks like the agent bearing the brunt of the work (and thus the dat files) changes run to run.
from glow.
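One possible contributor to the file growth, not confirmed anywhere in this thread: every Group2Test sent from the source carries the full ptSets slice, so if glow serializes each emitted element, the amount of data written grows with the square of the number of point sets. The sketch below only illustrates that size difference using encoding/gob from the standard library. PtSet, FullTask, IndexTask, encodedSize and makeSets are hypothetical stand-ins, and whether glow encodes elements this way is an assumption.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// PtSet is a hypothetical stand-in for model.PtSet.
type PtSet struct {
	ID  int
	Xyz [][3]float64
}

// FullTask mirrors Group2Test: each task carries every point set.
type FullTask struct {
	Leader PtSet
	All    []PtSet
}

// IndexTask carries only the position of the leader candidate.
// Workers would need another way to obtain the point sets (assumption).
type IndexTask struct {
	LeaderIdx int
}

// encodedSize returns the number of bytes gob needs to encode v.
func encodedSize(v interface{}) int {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		panic(err)
	}
	return buf.Len()
}

// makeSets builds n point sets with pts points each.
func makeSets(n, pts int) []PtSet {
	sets := make([]PtSet, n)
	for i := range sets {
		sets[i].ID = i + 1
		for j := 0; j < pts; j++ {
			sets[i].Xyz = append(sets[i].Xyz, [3]float64{float64(i) + 0.5, float64(j) + 0.25, 1.5})
		}
	}
	return sets
}

func main() {
	sets := makeSets(100, 50)
	full := encodedSize(FullTask{Leader: sets[0], All: sets})
	idx := encodedSize(IndexTask{LeaderIdx: 1})
	fmt.Println("bytes per task with full slice:", full)
	fmt.Println("bytes per task with index only:", idx)
}
```

If this is what is happening, sending only an index (or ID) per task and letting each worker load the point sets itself would keep the per-task payload small, at the cost of every worker needing access to the data.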
I don't remember the details of glow now. Please use gleam. It also has pure Go support.
from glow.
Thanks for the response, Chris. I had previously attempted to port the above to gleam, but was having some difficulty loading f.Source and f.Map. How might one do this with the above (i.e. preprocessed data rather than a file/db)?
Thanks again!
from glow.
Hi liuluheng,
thanks for the link!
In there, an io.Writer is still used in the source (as opposed to channel loading, as you can do in glow).
Does this mean you can basically trick the writer into acting like a channel?
from glow.
@andrewrt yes
from glow.
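To illustrate the idea of a writer standing in for a channel, here is a minimal sketch in plain Go using only the standard library. It does not use gleam's API: the source and consume functions are hypothetical, and the one-record-per-line encoding is an assumption made for the example. Each write to the io.Writer plays the role of a channel send, so preprocessed in-memory data can be pushed through a writer-based source.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// source is a hypothetical writer-based source: it emits one record per line
// from data that is already in memory.
func source(w io.Writer, ids []int) error {
	for _, id := range ids {
		if _, err := fmt.Fprintln(w, id); err != nil {
			return err
		}
	}
	return nil
}

// consume runs source against an in-memory pipe and collects the records,
// much like ranging over a channel until it is closed.
func consume(ids []int) ([]string, error) {
	pr, pw := io.Pipe()
	go func() {
		// CloseWithError(nil) behaves like Close and signals EOF to the reader.
		pw.CloseWithError(source(pw, ids))
	}()

	var got []string
	sc := bufio.NewScanner(pr)
	for sc.Scan() {
		got = append(got, sc.Text())
	}
	return got, sc.Err()
}

func main() {
	got, err := consume([]int{3, 1, 4})
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // prints [3 1 4]
}
```

The pipe blocks the writer until the reader is ready, which is the same back-pressure behavior an unbuffered channel gives.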