
cartograph's Issues

Extend Project Documentation

Dear Cartograph Team,

I recently found your paper and this project while working on visualising large embedded email corpora (http://bigvis2018.imis.athena-innovation.gr/papers/BigVis_2018_paper_5.pdf). After a simple proof of concept that plots the data onto an SVG, the limitations are obvious ;)
So I tried getting Cartograph to run, but I'm running into trouble.

The setup documentation in the readme seems a bit out of sync with the actual code. Maybe some files are even missing from the repository? Either way, it would be great if you could add updated stubs of all data and config files to the repository. Of course you can't add the entire dataset, but maybe head -n 100 <file> will do.
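A minimal sketch of how such stubs could be produced, as a Python equivalent of running head -n 100 on every file in a directory. The function name `write_stubs` and the directory arguments are hypothetical and not part of Cartograph.

```python
from itertools import islice
from pathlib import Path


def write_stubs(src_dir, dst_dir, n_lines=100):
    """Copy the first n_lines of every regular file in src_dir into dst_dir.

    Hypothetical helper, not part of Cartograph. Returns the names of the
    files that were written, in sorted order.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        with path.open("r", encoding="utf-8", errors="replace") as fin, \
                (dst / path.name).open("w", encoding="utf-8") as fout:
            # islice stops after n_lines without reading the rest of the file
            fout.writelines(islice(fin, n_lines))
        written.append(path.name)
    return written
```

For example, `write_stubs("./data/enron/input", "./data/enron-stub/input")` would leave a truncated copy of each input file in the second directory (both paths are only illustrative).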

Here's my setup:

(sorry, it's a bit messy...)
config

[DEFAULT]
dataset: enron
baseDir: ./data/%(dataset)s
generatedDir: %(baseDir)s/tsv
mapDir: %(baseDir)s/maps
geojsonDir: %(baseDir)s/geojson
externalDir: %(baseDir)s/input

[ExternalFiles]
vecs_with_id: %(externalDir)s/vectors
external_ids: %(externalDir)s/index.txt
popularity: %(externalDir)s/weights

[Metrics]
path: %(externalDir)s/weights

[PreprocessingConstants]
sample_size: 50000
num_clusters: 9
water_level: .05
sample_borders: True
label_weight = 0.1
clust_weight = 0.15

[Server]
compress_png: false
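
To double-check which paths the %(...)s references in a config like this resolve to, a small sketch can help. It uses Python 3's standard `configparser`, whereas the traceback further down shows Cartograph running on Python 2.7's `ConfigParser`; the assumption here is that both resolve these references the same way. The config text is a shortened copy of the one above.

```python
import configparser

# Shortened copy of the config above; both "key: value" and "key = value"
# are accepted by configparser.
CONF_TEXT = """
[DEFAULT]
dataset: enron
baseDir: ./data/%(dataset)s
externalDir: %(baseDir)s/input

[ExternalFiles]
vecs_with_id: %(externalDir)s/vectors

[PreprocessingConstants]
num_clusters: 9
label_weight = 0.1
"""

conf = configparser.ConfigParser()  # basic %(name)s interpolation is the default
conf.read_string(CONF_TEXT)

# Values from [DEFAULT] are visible in every section and are expanded on access.
print(conf.get("ExternalFiles", "vecs_with_id"))  # ./data/enron/input/vectors
print(conf.getfloat("PreprocessingConstants", "label_weight"))  # 0.1
```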

vectors

1       2       3       4       5       6       7       8       9       10      11      12      13      14      15      16      17      18      19      20      21      22      23      24      25      26      27      28      29      30      31      32      33      34 35      36      37      38      39      40      41      42      43      44      45      46      47      48      49      50      51      52      53      54      55      56      57      58      59      60      61      62      63      64      65      66      67      68 69      70      71      72      73      74      75      76      77      78      79      80      81      82      83      84      85      86      87      88      89      90      91      92      93      94      95      96      97      98      99      100
-0.028411094    -0.43398476     -0.23126745     0.6316483       -0.78059083     -0.67641264     0.6261415       1.0011581       -0.014821127    0.73825896      1.8444881       1.4733132       -1.3384722      -0.303069       0.67351747      -0.87371784     -0.0046889554      0.21009193      0.43050244      0.62093276      1.0013609       -0.11702167     -0.3403003      0.58391756      0.5502574       0.33163577      -0.25996274     0.56732345      1.028726        1.1469501       -0.22684239     0.76774 -1.2419224      0.14434749      1.5129268      -1.0209415      1.4649246       0.10025918      0.87900776      -0.3622763      0.7307377       -1.1676275      -0.12992088     1.1006008       -0.6213685      0.43943706      0.15711449      1.025458        0.8961261       0.99320453      0.19466212      -0.08929603    0.19527042      -0.0073949094   0.59422106      -1.1147947      0.58938104      -1.1505778      0.7899511       0.7613447       0.47599074      -0.078840725    0.039218135     0.7289678       1.31691 0.16610615      1.1411622       0.0028057138    -0.6451362     0.51711106      -0.102931455    -1.2138388      -2.3541882      -0.29484618     -0.28184843     0.24393603      -0.55432034     -0.18717165     -0.45226297     0.22155763      -0.55328757     -0.005244005      -0.55566764     -0.55934155     -1.7321936      0.8062975      -3.0936897      -0.9909483      -0.37095368     1.4672005       -0.6553549      -1.315908       1.5223165       0.4280989       -0.95227355     -0.29297256     -0.6689468      0.47462982      0.19303092      -0.37659514
...

index.txt

index   name
0       /fossum-d/favorites/1.
1       /fossum-d/sent/33.
2       /fossum-d/sent/1145.
3       /fossum-d/sent/561.
4       /fossum-d/sent/1214.
5       /fossum-d/sent/1057.
6       /fossum-d/sent/1246.
7       /fossum-d/sent/366.
8       /fossum-d/sent/1252.
...

weights

/fossum-d/favorites/1.  0.1
/fossum-d/sent/33.      0.1
/fossum-d/sent/1145.    0.1
/fossum-d/sent/561.     0.1
/fossum-d/sent/1214.    0.1
/fossum-d/sent/1057.    0.1
/fossum-d/sent/1246.    0.1
/fossum-d/sent/366.     0.1
/fossum-d/sent/1252.    0.1
/fossum-d/sent/16.      0.1
...
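
A rough sketch for checking that the three input files at least agree with each other. It only assumes the layouts shown above (a header line in vectors and index.txt, none in weights, whitespace-separated fields, names without spaces). Whether Cartograph itself expects these layouts is exactly the open question, so this does not validate against the tool. `check_inputs` is a hypothetical helper.

```python
def check_inputs(index_text, weights_text, vectors_text):
    """Return a list of problems found; an empty list means the files agree."""
    problems = []

    # index.txt: one header line, then "<row number> <name>" per line.
    names = []
    for line in index_text.splitlines()[1:]:
        parts = line.split(None, 1)
        if len(parts) == 2:
            names.append(parts[1].strip())
        elif parts:
            problems.append("index line without a name: %r" % line)

    # weights: "<name> <float>" per line, no header.
    weighted = set()
    for line in weights_text.splitlines():
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            if parts:
                problems.append("weights line without a value: %r" % line)
            continue
        try:
            float(parts[1])
        except ValueError:
            problems.append("non-numeric weight: %r" % line)
        weighted.add(parts[0].strip())

    missing = [name for name in names if name not in weighted]
    if missing:
        problems.append("%d indexed names have no weight, e.g. %s"
                        % (len(missing), missing[0]))

    # vectors: one header line of column numbers, then one row per document.
    rows = [line.split() for line in vectors_text.splitlines() if line.strip()]
    if rows:
        width = len(rows[0])
        ragged = [i for i, row in enumerate(rows[1:], start=2) if len(row) != width]
        if ragged:
            problems.append("vector rows with unexpected width at non-blank lines: %s"
                            % ragged[:5])
        if len(rows) - 1 != len(names):
            problems.append("%d vector rows but %d index entries"
                            % (len(rows) - 1, len(names)))
    return problems
```

Passing the contents of index.txt, weights and vectors (read with `open(...).read()`) returns an empty list when every indexed name has a weight and the vector rows line up with the index.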

console log
(the readme references a run.sh, which I can't find, but I guess this should be the equivalent...)

root@057fd321e753:/testvoldir# ./bin/luigi.sh --conf ./data/conf/enron.txt
Collecting annoy
Installing collected packages: annoy
Successfully installed annoy-1.11.4
18:05:21  DEBUG    luigi-interface  -  Checking if ParentTask() is complete
18:05:21  INFO     cartograph.config  -  using configuration file './data/conf/enron.txt'
18:05:21  INFO     cartograph.config  -  using configuration file './data/conf/enron.txt'
18:05:21  DEBUG    luigi-interface  -  Checking if CreateContours() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if ZPopTask() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if CreateFullCoordinates() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if CreateContinents() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if ColorsCode() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if AllMetrics() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   ParentTask__99914b932b   has status   PENDING
18:05:21  DEBUG    luigi-interface  -  Checking if ExternalFile(path=./data/enron/input/weights) is complete
18:05:21  DEBUG    luigi-interface  -  Checking if ExternalFile(path=./data/enron/input/index.txt) is complete
18:05:21  DEBUG    luigi-interface  -  Checking if MakeRegions() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if MetricsCode() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   AllMetrics__99914b932b   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   MetricsCode__99914b932b   has status   DONE
18:05:21  DEBUG    luigi-interface  -  Checking if MakeSampleRegions() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if WikiBrainNumbering() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if EnsureDirectoriesExist() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if CreateSampleRegionIndex() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if AugmentLabel() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   MakeRegions__99914b932b   has status   PENDING
18:05:21  DEBUG    luigi-interface  -  Checking if CreateCategories() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   AugmentLabel__99914b932b   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   CreateCategories__99914b932b   has status   DONE
/usr/local/lib/python2.7/dist-packages/luigi/parameter.py:261: UserWarning: Parameter "prereqs" with value "()" is not of type string.
  warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
18:05:21  DEBUG    luigi-interface  -  Checking if SampleCreator(path=./data/enron/tsv/vectors_labels.tsv) is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   CreateSampleRegionIndex__99914b932b   has status   PENDING
18:05:21  DEBUG    luigi-interface  -  Checking if ArticlePopularity() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   SampleCreator___data_enron_tsv_7e5ccc99dc   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   ArticlePopularity__99914b932b   has status   DONE
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   EnsureDirectoriesExist__99914b932b   has status   DONE
18:05:21  WARNING  luigi-interface  -  Data for WikiBrainNumbering() does not exist (yet?). The task is an external data dependency, so it cannot be run from this luigi process.
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   WikiBrainNumbering__99914b932b   has status   PENDING
/usr/local/lib/python2.7/dist-packages/luigi/parameter.py:261: UserWarning: Parameter "prereqs" with value "(AugmentLabel(),)" is not of type string.
  warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
18:05:21  DEBUG    luigi-interface  -  Checking if RegionCode() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   MakeSampleRegions__99914b932b   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   RegionCode__99914b932b   has status   DONE
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   ExternalFile___data_enron_inp_5f3518507e   has status   DONE
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   ExternalFile___data_enron_inp_e2703ae741   has status   DONE
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   ColorsCode__99914b932b   has status   DONE
18:05:21  DEBUG    luigi-interface  -  Checking if RegionLabel() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if CreateSampleCoordinates() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if BorderGeoJSONWriterCode() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if BorderFactoryCode() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if Denoise() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   CreateContinents__99914b932b   has status   PENDING
18:05:21  DEBUG    luigi-interface  -  Checking if DenoiserCode() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   Denoise__99914b932b   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   DenoiserCode__99914b932b   has status   DONE
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   BorderFactoryCode__99914b932b   has status   DONE
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   BorderGeoJSONWriterCode__99914b932b   has status   DONE
18:05:21  DEBUG    luigi-interface  -  Checking if CreateEmbedding() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   CreateSampleCoordinates__99914b932b   has status   PENDING
/usr/local/lib/python2.7/dist-packages/luigi/parameter.py:261: UserWarning: Parameter "prereqs" with value "(AugmentCluster(),)" is not of type string.
  warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
18:05:21  DEBUG    luigi-interface  -  Checking if AugmentCluster() is complete
18:05:21  DEBUG    luigi-interface  -  Checking if SampleCreator(path=./data/enron/tsv/vectors_labels_clusters.tsv) is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   CreateEmbedding__99914b932b   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   SampleCreator___data_enron_tsv_eb8d565c4a   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   AugmentCluster__99914b932b   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   RegionLabel__99914b932b   has status   PENDING
18:05:21  DEBUG    luigi-interface  -  Checking if CreateSampleAnnoyIndex() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   CreateFullCoordinates__99914b932b   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   CreateSampleAnnoyIndex__99914b932b   has status   PENDING
18:05:21  DEBUG    luigi-interface  -  Checking if CalculateZPopCode() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   ZPopTask__99914b932b   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   CalculateZPopCode__99914b932b   has status   DONE
18:05:21  DEBUG    luigi-interface  -  Checking if ContourCode() is complete
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   CreateContours__99914b932b   has status   PENDING
18:05:21  INFO     luigi-interface  -  Informed scheduler that task   ContourCode__99914b932b   has status   DONE
18:05:21  INFO     luigi-interface  -  Done scheduling tasks
18:05:21  INFO     luigi-interface  -  Running Worker with 1 processes
18:05:21  DEBUG    luigi-interface  -  Asking scheduler for work...
18:05:21  DEBUG    luigi.scheduler  -  Starting pruning of task graph
18:05:21  DEBUG    luigi.scheduler  -  Done pruning task graph
18:05:21  DEBUG    luigi-interface  -  Done
18:05:21  DEBUG    luigi-interface  -  There are no more tasks to run at this time
18:05:21  DEBUG    luigi-interface  -  There are 18 pending tasks possibly being run by other workers
18:05:21  DEBUG    luigi-interface  -  There are 18 pending tasks unique to this worker
18:05:21  DEBUG    luigi-interface  -  There are 18 pending tasks last scheduled by this worker
18:05:21  INFO     luigi-interface  -  Worker Worker(salt=693047335, workers=1, host=057fd321e753, username=root, pid=274) was stopped. Shutting down Keep-Alive thread
18:05:21  INFO     luigi-interface  -
===== Luigi Execution Summary =====

Scheduled 32 tasks of which:
* 13 present dependencies were encountered:
    - 1 ArticlePopularity()
    - 1 BorderFactoryCode()
    - 1 BorderGeoJSONWriterCode()
    - 1 CalculateZPopCode()
    - 1 ColorsCode()
    ...
* 19 were left pending, among these:
    * 1 were missing external dependencies:
        - 1 WikiBrainNumbering()
    * 18 had missing external dependencies:
        - 1 AllMetrics()
        - 1 AugmentCluster()
        - 1 AugmentLabel()
        - 1 CreateContinents()
        - 1 CreateContours()
        ...

Did not run any tasks
This progress looks :| because there were missing external dependencies

===== Luigi Execution Summary =====

LUIGI BUILD SUCCEEDED
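
The execution summary abbreviates its lists with "...", so the concrete hint is the WARNING line about WikiBrainNumbering() earlier in the log. A small sketch for pulling such lines out of a saved log follows; the message wording is taken from the log above and may differ in other luigi versions, and `missing_external_tasks` is a hypothetical helper.

```python
import re

# Wording copied from the WARNING line in the log above.
_MISSING = re.compile(r"Data for (.+?) does not exist")


def missing_external_tasks(log_text):
    """Return the task names that luigi reported as missing external data."""
    found = []
    for line in log_text.splitlines():
        match = _MISSING.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))  # keep first-seen order, no duplicates
    return found
```

Applied to the log above, this would return only `WikiBrainNumbering()`, which suggests that the 18 other pending tasks are waiting on that single missing input.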

When I remove the [Metrics] section from the config, this error is added near the beginning of the log:

18:36:43  WARNING  luigi-interface  -  Will not run AllMetrics() or any dependencies due to error in deps() method:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/luigi/worker.py", line 742, in _add
    deps = task.deps()
  File "/usr/local/lib/python2.7/dist-packages/luigi/task.py", line 634, in deps
    return flatten(self._requires())
  File "/usr/local/lib/python2.7/dist-packages/luigi/task.py", line 606, in _requires
    return flatten(self.requires())  # base impl
  File "/testvoldir/cartograph/MetricTasks.py", line 31, in requires
    return (ExternalFile(conf.get('Metrics', 'path')),
  File "/usr/lib/python2.7/ConfigParser.py", line 618, in get
    raise NoOptionError(option, section)
NoOptionError: No option 'path' in section: 'Metrics'
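
A pre-flight check along these lines would report missing options before luigi starts scheduling. This is a sketch using Python 3's `configparser`; only `('Metrics', 'path')` is confirmed as required by the traceback above, while the other entries in `REQUIRED` are assumptions copied from the config in this issue.

```python
import configparser

# Only ("Metrics", "path") is confirmed by the traceback; the remaining
# entries are assumptions taken from the config shown in this issue.
REQUIRED = [
    ("Metrics", "path"),
    ("ExternalFiles", "vecs_with_id"),
    ("ExternalFiles", "external_ids"),
    ("ExternalFiles", "popularity"),
]


def missing_options(conf, required=REQUIRED):
    """Return the (section, option) pairs that conf does not define."""
    # has_option() returns False for a missing section instead of raising.
    return [(s, o) for s, o in required if not conf.has_option(s, o)]


conf = configparser.ConfigParser()
conf.read_string("""
[DEFAULT]
externalDir: ./data/enron/input

[ExternalFiles]
vecs_with_id: %(externalDir)s/vectors
external_ids: %(externalDir)s/index.txt
popularity: %(externalDir)s/weights
""")

print(missing_options(conf))  # [('Metrics', 'path')]
```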

I figured that adding a dummy metric with everything set to 0.1 should work as well?

Either way, I would be glad if you could point out what I'm doing wrong. Ideally, please update the readme and add some example files so others don't run into the same problems.
Thanks!
