shilad / cartograph
Creates interactive maps from datasets
Home Page: http://cartograph.info
Dear Cartograph Team,
Recently I found your paper and this project while working on visualising large embedded email corpora (http://bigvis2018.imis.athena-innovation.gr/papers/BigVis_2018_paper_5.pdf). After a simple proof of concept that plots everything onto an SVG, the limitations are obvious ;)
So I tried getting Cartograph to run, but I'm running into trouble.
The setup documentation in the readme seems a bit out of sync with the actual code. Maybe some files are even missing from the repository? Either way, it would be great if you could add some updated stubs of all data and config files to the repository. Sure, you can't add the entire dataset, but maybe head -n 100 <file> will do.
(sorry, it's a bit messy...)
config
[DEFAULT]
dataset: enron
baseDir: ./data/%(dataset)s
generatedDir: %(baseDir)s/tsv
mapDir: %(baseDir)s/maps
geojsonDir: %(baseDir)s/geojson
externalDir: %(baseDir)s/input
[ExternalFiles]
vecs_with_id: %(externalDir)s/vectors
external_ids: %(externalDir)s/index.txt
popularity: %(externalDir)s/weights
[Metrics]
path: %(externalDir)s/weights
[PreprocessingConstants]
sample_size: 50000
num_clusters: 9
water_level: .05
sample_borders: True
label_weight = 0.1
clust_weight = 0.15
[Server]
compress_png: false
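To see how the interpolated paths in a config like the one above resolve, here is a minimal sketch using Python 3's standard `configparser`. It is only an approximation: the traceback further down shows Cartograph running on Python 2.7's `ConfigParser`, and Cartograph's own loader may merge additional defaults. The `resolve_paths` helper and the shortened `CONFIG_TEXT` are hypothetical, written for illustration.

```python
import configparser

# Shortened copy of the config above; only the path-related options are kept.
CONFIG_TEXT = """
[DEFAULT]
dataset: enron
baseDir: ./data/%(dataset)s
generatedDir: %(baseDir)s/tsv
externalDir: %(baseDir)s/input

[ExternalFiles]
vecs_with_id: %(externalDir)s/vectors
external_ids: %(externalDir)s/index.txt
popularity: %(externalDir)s/weights

[Metrics]
path: %(externalDir)s/weights
"""


def resolve_paths(text):
    """Return {(section, option): interpolated value}, skipping DEFAULT options."""
    conf = configparser.ConfigParser()
    conf.read_string(text)
    defaults = conf.defaults()
    resolved = {}
    for section in conf.sections():
        for option in conf.options(section):
            if option in defaults:
                continue  # inherited from [DEFAULT], not defined in this section
            resolved[(section, option)] = conf.get(section, option)
    return resolved


paths = resolve_paths(CONFIG_TEXT)
for (section, option), value in sorted(paths.items()):
    print(section, option, value)
```

The resolved values for `external_ids` and `popularity` can then be compared with the `ExternalFile(path=...)` entries that Luigi prints in the console log.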
vectors
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
-0.028411094 -0.43398476 -0.23126745 0.6316483 -0.78059083 -0.67641264 0.6261415 1.0011581 -0.014821127 0.73825896 1.8444881 1.4733132 -1.3384722 -0.303069 0.67351747 -0.87371784 -0.0046889554 0.21009193 0.43050244 0.62093276 1.0013609 -0.11702167 -0.3403003 0.58391756 0.5502574 0.33163577 -0.25996274 0.56732345 1.028726 1.1469501 -0.22684239 0.76774 -1.2419224 0.14434749 1.5129268 -1.0209415 1.4649246 0.10025918 0.87900776 -0.3622763 0.7307377 -1.1676275 -0.12992088 1.1006008 -0.6213685 0.43943706 0.15711449 1.025458 0.8961261 0.99320453 0.19466212 -0.08929603 0.19527042 -0.0073949094 0.59422106 -1.1147947 0.58938104 -1.1505778 0.7899511 0.7613447 0.47599074 -0.078840725 0.039218135 0.7289678 1.31691 0.16610615 1.1411622 0.0028057138 -0.6451362 0.51711106 -0.102931455 -1.2138388 -2.3541882 -0.29484618 -0.28184843 0.24393603 -0.55432034 -0.18717165 -0.45226297 0.22155763 -0.55328757 -0.005244005 -0.55566764 -0.55934155 -1.7321936 0.8062975 -3.0936897 -0.9909483 -0.37095368 1.4672005 -0.6553549 -1.315908 1.5223165 0.4280989 -0.95227355 -0.29297256 -0.6689468 0.47462982 0.19303092 -0.37659514
...
index.txt
index name
0 /fossum-d/favorites/1.
1 /fossum-d/sent/33.
2 /fossum-d/sent/1145.
3 /fossum-d/sent/561.
4 /fossum-d/sent/1214.
5 /fossum-d/sent/1057.
6 /fossum-d/sent/1246.
7 /fossum-d/sent/366.
8 /fossum-d/sent/1252.
...
weights
/fossum-d/favorites/1. 0.1
/fossum-d/sent/33. 0.1
/fossum-d/sent/1145. 0.1
/fossum-d/sent/561. 0.1
/fossum-d/sent/1214. 0.1
/fossum-d/sent/1057. 0.1
/fossum-d/sent/1246. 0.1
/fossum-d/sent/366. 0.1
/fossum-d/sent/1252. 0.1
/fossum-d/sent/16. 0.1
...
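A quick way to rule out a mismatch between the index and weights files is to check that every name in index.txt also has a weight. The sketch below assumes whitespace-separated columns and names without spaces, as in the excerpts above; whether Cartograph actually joins these files on the name column is not stated here, so treat this as a sanity check only. All function names are hypothetical.

```python
def read_index_names(lines):
    """Names from index.txt-style lines: a header row, then '<index> <name>'."""
    names = []
    for line in lines[1:]:  # skip the 'index name' header row
        parts = line.split()
        if len(parts) == 2:
            names.append(parts[1])
    return names


def read_weight_names(lines):
    """Names from weights-style lines of the form '<name> <weight>'."""
    names = set()
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            names.add(parts[0])
    return names


def missing_weights(index_lines, weight_lines):
    """Names listed in the index that have no entry in the weights file."""
    weighted = read_weight_names(weight_lines)
    return [name for name in read_index_names(index_lines) if name not in weighted]


index_lines = [
    "index name",
    "0 /fossum-d/favorites/1.",
    "1 /fossum-d/sent/33.",
    "2 /fossum-d/sent/1145.",
]
weight_lines = [
    "/fossum-d/favorites/1. 0.1",
    "/fossum-d/sent/33. 0.1",
]
print(missing_weights(index_lines, weight_lines))
```

For real files, the two line lists would come from reading index.txt and weights with `open(...).read().splitlines()`.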
console log
(the readme references a run.sh, which I can't find, but this should be it, I guess...)
root@057fd321e753:/testvoldir# ./bin/luigi.sh --conf ./data/conf/enron.txt
Collecting annoy
Installing collected packages: annoy
Successfully installed annoy-1.11.4
18:05:21 DEBUG luigi-interface - Checking if ParentTask() is complete
18:05:21 INFO cartograph.config - using configuration file './data/conf/enron.txt'
18:05:21 INFO cartograph.config - using configuration file './data/conf/enron.txt'
18:05:21 DEBUG luigi-interface - Checking if CreateContours() is complete
18:05:21 DEBUG luigi-interface - Checking if ZPopTask() is complete
18:05:21 DEBUG luigi-interface - Checking if CreateFullCoordinates() is complete
18:05:21 DEBUG luigi-interface - Checking if CreateContinents() is complete
18:05:21 DEBUG luigi-interface - Checking if ColorsCode() is complete
18:05:21 DEBUG luigi-interface - Checking if AllMetrics() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task ParentTask__99914b932b has status PENDING
18:05:21 DEBUG luigi-interface - Checking if ExternalFile(path=./data/enron/input/weights) is complete
18:05:21 DEBUG luigi-interface - Checking if ExternalFile(path=./data/enron/input/index.txt) is complete
18:05:21 DEBUG luigi-interface - Checking if MakeRegions() is complete
18:05:21 DEBUG luigi-interface - Checking if MetricsCode() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task AllMetrics__99914b932b has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task MetricsCode__99914b932b has status DONE
18:05:21 DEBUG luigi-interface - Checking if MakeSampleRegions() is complete
18:05:21 DEBUG luigi-interface - Checking if WikiBrainNumbering() is complete
18:05:21 DEBUG luigi-interface - Checking if EnsureDirectoriesExist() is complete
18:05:21 DEBUG luigi-interface - Checking if CreateSampleRegionIndex() is complete
18:05:21 DEBUG luigi-interface - Checking if AugmentLabel() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task MakeRegions__99914b932b has status PENDING
18:05:21 DEBUG luigi-interface - Checking if CreateCategories() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task AugmentLabel__99914b932b has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task CreateCategories__99914b932b has status DONE
/usr/local/lib/python2.7/dist-packages/luigi/parameter.py:261: UserWarning: Parameter "prereqs" with value "()" is not of type string.
warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
18:05:21 DEBUG luigi-interface - Checking if SampleCreator(path=./data/enron/tsv/vectors_labels.tsv) is complete
18:05:21 INFO luigi-interface - Informed scheduler that task CreateSampleRegionIndex__99914b932b has status PENDING
18:05:21 DEBUG luigi-interface - Checking if ArticlePopularity() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task SampleCreator___data_enron_tsv_7e5ccc99dc has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task ArticlePopularity__99914b932b has status DONE
18:05:21 INFO luigi-interface - Informed scheduler that task EnsureDirectoriesExist__99914b932b has status DONE
18:05:21 WARNING luigi-interface - Data for WikiBrainNumbering() does not exist (yet?). The task is an external data dependency, so it cannot be run from this luigi process.
18:05:21 INFO luigi-interface - Informed scheduler that task WikiBrainNumbering__99914b932b has status PENDING
/usr/local/lib/python2.7/dist-packages/luigi/parameter.py:261: UserWarning: Parameter "prereqs" with value "(AugmentLabel(),)" is not of type string.
warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
18:05:21 DEBUG luigi-interface - Checking if RegionCode() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task MakeSampleRegions__99914b932b has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task RegionCode__99914b932b has status DONE
18:05:21 INFO luigi-interface - Informed scheduler that task ExternalFile___data_enron_inp_5f3518507e has status DONE
18:05:21 INFO luigi-interface - Informed scheduler that task ExternalFile___data_enron_inp_e2703ae741 has status DONE
18:05:21 INFO luigi-interface - Informed scheduler that task ColorsCode__99914b932b has status DONE
18:05:21 DEBUG luigi-interface - Checking if RegionLabel() is complete
18:05:21 DEBUG luigi-interface - Checking if CreateSampleCoordinates() is complete
18:05:21 DEBUG luigi-interface - Checking if BorderGeoJSONWriterCode() is complete
18:05:21 DEBUG luigi-interface - Checking if BorderFactoryCode() is complete
18:05:21 DEBUG luigi-interface - Checking if Denoise() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task CreateContinents__99914b932b has status PENDING
18:05:21 DEBUG luigi-interface - Checking if DenoiserCode() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task Denoise__99914b932b has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task DenoiserCode__99914b932b has status DONE
18:05:21 INFO luigi-interface - Informed scheduler that task BorderFactoryCode__99914b932b has status DONE
18:05:21 INFO luigi-interface - Informed scheduler that task BorderGeoJSONWriterCode__99914b932b has status DONE
18:05:21 DEBUG luigi-interface - Checking if CreateEmbedding() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task CreateSampleCoordinates__99914b932b has status PENDING
/usr/local/lib/python2.7/dist-packages/luigi/parameter.py:261: UserWarning: Parameter "prereqs" with value "(AugmentCluster(),)" is not of type string.
warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
18:05:21 DEBUG luigi-interface - Checking if AugmentCluster() is complete
18:05:21 DEBUG luigi-interface - Checking if SampleCreator(path=./data/enron/tsv/vectors_labels_clusters.tsv) is complete
18:05:21 INFO luigi-interface - Informed scheduler that task CreateEmbedding__99914b932b has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task SampleCreator___data_enron_tsv_eb8d565c4a has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task AugmentCluster__99914b932b has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task RegionLabel__99914b932b has status PENDING
18:05:21 DEBUG luigi-interface - Checking if CreateSampleAnnoyIndex() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task CreateFullCoordinates__99914b932b has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task CreateSampleAnnoyIndex__99914b932b has status PENDING
18:05:21 DEBUG luigi-interface - Checking if CalculateZPopCode() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task ZPopTask__99914b932b has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task CalculateZPopCode__99914b932b has status DONE
18:05:21 DEBUG luigi-interface - Checking if ContourCode() is complete
18:05:21 INFO luigi-interface - Informed scheduler that task CreateContours__99914b932b has status PENDING
18:05:21 INFO luigi-interface - Informed scheduler that task ContourCode__99914b932b has status DONE
18:05:21 INFO luigi-interface - Done scheduling tasks
18:05:21 INFO luigi-interface - Running Worker with 1 processes
18:05:21 DEBUG luigi-interface - Asking scheduler for work...
18:05:21 DEBUG luigi.scheduler - Starting pruning of task graph
18:05:21 DEBUG luigi.scheduler - Done pruning task graph
18:05:21 DEBUG luigi-interface - Done
18:05:21 DEBUG luigi-interface - There are no more tasks to run at this time
18:05:21 DEBUG luigi-interface - There are 18 pending tasks possibly being run by other workers
18:05:21 DEBUG luigi-interface - There are 18 pending tasks unique to this worker
18:05:21 DEBUG luigi-interface - There are 18 pending tasks last scheduled by this worker
18:05:21 INFO luigi-interface - Worker Worker(salt=693047335, workers=1, host=057fd321e753, username=root, pid=274) was stopped. Shutting down Keep-Alive thread
18:05:21 INFO luigi-interface -
===== Luigi Execution Summary =====
Scheduled 32 tasks of which:
* 13 present dependencies were encountered:
- 1 ArticlePopularity()
- 1 BorderFactoryCode()
- 1 BorderGeoJSONWriterCode()
- 1 CalculateZPopCode()
- 1 ColorsCode()
...
* 19 were left pending, among these:
* 1 were missing external dependencies:
- 1 WikiBrainNumbering()
* 18 had missing external dependencies:
- 1 AllMetrics()
- 1 AugmentCluster()
- 1 AugmentLabel()
- 1 CreateContinents()
- 1 CreateContours()
...
Did not run any tasks
This progress looks :| because there were missing external dependencies
===== Luigi Execution Summary =====
LUIGI BUILD SUCCEEDED
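According to the summary, nothing ran because one external dependency, WikiBrainNumbering(), has no data, and the other pending tasks are waiting behind it. In a long log it can help to pull out just those warnings. A minimal sketch, assuming the warning wording shown in the log above; the function name is hypothetical:

```python
import re

# Matches the Luigi warning seen above:
# "Data for WikiBrainNumbering() does not exist (yet?). ..."
MISSING = re.compile(r"Data for (.+?) does not exist")


def missing_external_tasks(log_lines):
    """Return task names reported as missing external data, in order, without repeats."""
    found = []
    for line in log_lines:
        match = MISSING.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


sample_log = [
    "18:05:21 DEBUG luigi-interface - Checking if EnsureDirectoriesExist() is complete",
    "18:05:21 WARNING luigi-interface - Data for WikiBrainNumbering() does not exist (yet?). "
    "The task is an external data dependency, so it cannot be run from this luigi process.",
    "18:05:21 INFO luigi-interface - Informed scheduler that task WikiBrainNumbering__99914b932b has status PENDING",
]
print(missing_external_tasks(sample_log))
```

Each task listed this way points at an input file that has to exist before the pipeline can start.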
When I remove the [Metrics] section from the config, this error is added near the beginning of the log:
18:36:43 WARNING luigi-interface - Will not run AllMetrics() or any dependencies due to error in deps() method:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/luigi/worker.py", line 742, in _add
deps = task.deps()
File "/usr/local/lib/python2.7/dist-packages/luigi/task.py", line 634, in deps
return flatten(self._requires())
File "/usr/local/lib/python2.7/dist-packages/luigi/task.py", line 606, in _requires
return flatten(self.requires()) # base impl
File "/testvoldir/cartograph/MetricTasks.py", line 31, in requires
return (ExternalFile(conf.get('Metrics', 'path')),
File "/usr/lib/python2.7/ConfigParser.py", line 618, in get
raise NoOptionError(option, section)
NoOptionError: No option 'path' in section: 'Metrics'
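The traceback shows `conf.get('Metrics', 'path')` raising `NoOptionError`, which is what a config parser raises when the section exists but the option does not. One possible reading, and only a guess, is that a [Metrics] section is still defined somewhere else, for example in a defaults file. The sketch below reproduces the error with Python 3's `configparser` (Cartograph itself uses Python 2.7's `ConfigParser`) and shows a guarded lookup; `get_or_none` is a hypothetical helper, not part of Cartograph.

```python
import configparser


def get_or_none(conf, section, option):
    """Return the option value, or None if the section or option is absent."""
    if conf.has_section(section) and conf.has_option(section, option):
        return conf.get(section, option)
    return None


conf = configparser.ConfigParser()
conf.add_section("Metrics")  # section exists, but 'path' is not set

try:
    conf.get("Metrics", "path")
    message = None
except configparser.NoOptionError as err:
    message = str(err)

print(message)
print(get_or_none(conf, "Metrics", "path"))

conf.set("Metrics", "path", "./data/enron/input/weights")
print(get_or_none(conf, "Metrics", "path"))
```

A guard like this would let a task skip metrics when none are configured instead of failing in `requires()`, though whether that is the intended behaviour is for the maintainers to decide.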
I figured adding a dummy metric with everything set to 0.1 should work as well?
Either way, I would be glad if you could point out what I'm doing wrong. Ideally, update the readme and add some example files so others don't run into the same problems.
Thanks!