geneagrapher's People

Contributors

d-torrance, davidalber, llimeht, schoeps

geneagrapher's Issues

DOT writer

Create a writer class that generates DOT output for the graph.


Blocked by #5.

Empty advisor traversal

There seems to be a bug that causes some IDs to have an empty advisor traversal. I see this with ID 51506: the command ggrapher 51506:a produces a graph with only one node, but there should be many more. Perhaps it is caused by the Co-Promotor entry in this record?

Unpleasing error message for bad ID 2

This was reported by @ChaoJenWong.

Similar to #16, but with a twist. For IDs that are very long the error response is not helpful. Doing ggrapher -v 999999999999999999999 returns:

Grabbing record #999999999999999999999...Traceback (most recent call last):
  File "/Users/David/tmp/ggrapher/bin/ggrapher", line 9, in <module>
    load_entry_point('geneagrapher==1.0c2', 'console_scripts', 'ggrapher')()
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/geneagrapher.py", line 148, in ggrapher
    ggrapher.build_graph()
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/geneagrapher.py", line 129, in build_graph
    self.build_graph_complete(record_grabber, filename=self.cache_file)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/geneagrapher.py", line 110, in build_graph_complete
    descendant_queue=descendant_queue)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/geneagrapher.py", line 84, in build_graph_portion
    record = grabber.get_record(id)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/cache_grabber.py", line 37, in get_record
    record = self.grabber.get_record(id)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/grabber.py", line 31, in get_record
    return get_record_from_tree(soup, id)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/grabber.py", line 37, in get_record_from_tree
    if not has_record(soup):
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/grabber.py", line 54, in has_record
    return not soup.firstText().text == u"You have specified an ID that does \
AttributeError: 'NoneType' object has no attribute 'text'

Test method docstrings

Most of the test methods have useful comments after the method declaration. Change these comments to docstrings, since that is what they should be.
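A minimal sketch of the conversion, using a hypothetical test class (the class and method names are illustrative, not taken from Geneagrapher's test suite). One practical benefit is that unittest's verbose output shows the first line of the docstring next to each result:

```python
import unittest


class TestExample(unittest.TestCase):
    """Hypothetical test case used only to illustrate the conversion."""

    def test_before(self):
        # Test that an empty set has length zero.
        self.assertEqual(len(set()), 0)

    def test_after(self):
        """Test that an empty set has length zero."""
        self.assertEqual(len(set()), 0)
```

Only test_after carries a description that unittest can report; the comment in test_before is invisible to the test runner.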

Python 3 support

Is anyone working on porting the code to Python 3? I tried my luck with 2to3, but there are still a few remaining issues that require a careful look. Before I dig deep into the code, I wanted to check whether this has already been done. Thanks!

Graph objects should only contain information for a genealogy graph

Graph objects currently store all edge information that is extracted from the Math Genealogy Project. This should be changed so that all information in a Graph object is relevant to the associated genealogy graph. That is, Graph objects should not contain edge information for nodes that are not part of the genealogy graph represented.

Unpleasing error message for bad ID

This was reported by @ChaoJenWong.

ID 999999999 does not exist in the Mathematics Genealogy Project records. When a Geneagrapher request is made for the ID, it is preferable to have an informative and concise error message. When doing ggrapher -v 999999999, however, the response is:

Grabbing record #999999999...Traceback (most recent call last):
  File "/Users/David/tmp/ggrapher/bin/ggrapher", line 9, in <module>
    load_entry_point('geneagrapher==1.0c2', 'console_scripts', 'ggrapher')()
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/geneagrapher.py", line 148, in ggrapher
    ggrapher.build_graph()
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/geneagrapher.py", line 129, in build_graph
    self.build_graph_complete(record_grabber, filename=self.cache_file)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/geneagrapher.py", line 110, in build_graph_complete
    descendant_queue=descendant_queue)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/geneagrapher.py", line 84, in build_graph_portion
    record = grabber.get_record(id)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/cache_grabber.py", line 37, in get_record
    record = self.grabber.get_record(id)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/grabber.py", line 31, in get_record
    return get_record_from_tree(soup, id)
  File "/Users/David/fun/Geneagrapher/src/geneagrapher/grabber.py", line 40, in get_record_from_tree
    raise ValueError(msg)
ValueError: Invalid id 999999999
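One possible shape for a fix, as a sketch only: catch the ValueError at the command-line entry point and print a one-line message instead of the traceback. The names fetch_record and main are hypothetical stand-ins, not Geneagrapher's actual functions:

```python
import sys


def fetch_record(record_id):
    """Hypothetical stand-in for the record grabber; it always fails here."""
    raise ValueError("Invalid id {}".format(record_id))


def main(record_id, fetch=fetch_record):
    """Return an exit status instead of letting the traceback escape."""
    try:
        fetch(record_id)
    except ValueError as e:
        # Report only the exception message, not the full traceback.
        sys.stderr.write("Error: {}\n".format(e))
        return 1
    return 0
```

With this structure, main(999999999) writes "Error: Invalid id 999999999" to standard error and returns 1.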

Create needed infrastructure in Graph to enable writer classes

Geneagrapher currently generates only DOT output, and the DOT content is generated within the Graph class. We want to introduce "writer" classes to Geneagrapher that will generalize this conversion and make it more flexible. Writing the output will be done outside of the Graph class.

A writer class takes a Graph object when initialized and uses the Graph interface to get the information needed for writing the graph.


Prerequisite of #6.
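A rough sketch of how a writer could sit outside the graph class. Both classes below are hypothetical and much simpler than Geneagrapher's real Graph; the point is only that the writer receives a graph at initialization and reads everything it needs through the graph's interface:

```python
class Graph:
    """Hypothetical minimal graph: maps node ID to (label, advisor IDs)."""

    def __init__(self):
        self.nodes = {}

    def add_node(self, node_id, label, advisors=()):
        self.nodes[node_id] = (label, tuple(advisors))


class DotWriter:
    """Hypothetical writer that renders a Graph as DOT text."""

    def __init__(self, graph):
        self.graph = graph

    def write(self):
        lines = ["digraph genealogy {"]
        for node_id in sorted(self.graph.nodes):
            label, _ = self.graph.nodes[node_id]
            # Escape double quotes so the label stays a valid DOT string.
            label = label.replace('"', '\\"')
            lines.append('    {} [label="{}"];'.format(node_id, label))
        for node_id in sorted(self.graph.nodes):
            _, advisors = self.graph.nodes[node_id]
            for advisor_id in advisors:
                # Only draw edges whose endpoints are both in the graph.
                if advisor_id in self.graph.nodes:
                    lines.append("    {} -> {};".format(advisor_id, node_id))
        lines.append("}")
        return "\n".join(lines)
```

Another output format would then be a second class with the same constructor and write method, with no change to Graph.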

Graph.seeds member should be set of IDs

The Graph.seeds member is currently a set of Node objects. It should be changed to be a set of IDs that can be used to obtain the correct Node from a Graph object.
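A small sketch of the proposed arrangement, with hypothetical Node and Graph classes (the get_node method name is an assumption, not necessarily the real interface):

```python
class Node:
    """Hypothetical node holding just an ID and a name."""

    def __init__(self, node_id, name):
        self.id = node_id
        self.name = name


class Graph:
    """Hypothetical graph whose seeds member is a set of IDs, not Nodes."""

    def __init__(self):
        self.nodes = {}     # node ID -> Node
        self.seeds = set()  # IDs only

    def add_node(self, node_id, name, is_seed=False):
        self.nodes[node_id] = Node(node_id, name)
        if is_seed:
            self.seeds.add(node_id)

    def get_node(self, node_id):
        """Look up the Node for an ID, e.g. one taken from seeds."""
        return self.nodes[node_id]
```

Each seed ID can then be resolved on demand with get_node, so the Node objects are stored in only one place.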

Use assertRaisesRegexp

A number of tests are written to verify an expected exception is raised and that its message matches what is expected. These tests are currently using assertRaises and a subsequent try-except block. Example:

self.assertRaises(DuplicateNodeError, self.graph1.add_node,
                  "Leonhard Euler", "Universitaet Basel",
                  1726, 38586, set(), set())

try:
    self.graph1.add_node("Leonhard Euler", "Universitaet Basel",
                         1726, 38586, set(), set())
except DuplicateNodeError as e:
    self.assertEqual(str(e),
                     "node with id {} already exists".format(38586))
else:
    self.fail()

This can be improved by using assertRaisesRegexp.
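For illustration, the example above could collapse to a single with block. This sketch uses stand-in Graph and DuplicateNodeError classes so that it runs on its own, and it uses assertRaisesRegex, the Python 3 spelling (assertRaisesRegexp is the Python 2.7 name and is no longer available in recent Python 3 releases):

```python
import unittest


class DuplicateNodeError(Exception):
    """Hypothetical stand-in for Geneagrapher's exception of the same name."""


class Graph:
    """Hypothetical minimal graph that rejects duplicate node IDs."""

    def __init__(self):
        self.ids = set()

    def add_node(self, name, institution, year, node_id, advisors,
                 descendants):
        if node_id in self.ids:
            raise DuplicateNodeError(
                "node with id {} already exists".format(node_id))
        self.ids.add(node_id)


class TestGraph(unittest.TestCase):
    def setUp(self):
        self.graph1 = Graph()
        self.graph1.add_node("Leonhard Euler", "Universitaet Basel",
                             1726, 38586, set(), set())

    def test_add_duplicate_node(self):
        """Adding a node whose ID is already present raises."""
        with self.assertRaisesRegex(DuplicateNodeError,
                                    "node with id 38586 already exists"):
            self.graph1.add_node("Leonhard Euler", "Universitaet Basel",
                                 1726, 38586, set(), set())
```

Note that the pattern is applied as a regular-expression search against the message, so it is slightly looser than the assertEqual check it replaces.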

Required lxml version is restricted

Hi, I was trying to install geneagrapher using pip, but pip returned an error (attached below). I am not certain what the error was about, but I found that the required lxml version was old and pinned, so I tried changing the requirement from lxml==4.2.5 to lxml>=4.2.5, and then everything went well. Perhaps it would be good to relax the requirement (or use a newer version of lxml)?

Detailed error log:

Building wheels for collected packages: geneagrapher, lxml
  Building wheel for geneagrapher (setup.py) ... done
  Created wheel for geneagrapher: filename=geneagrapher-1.0-py3-none-any.whl size=206184 sha256=d7b0843d34383129e085198f2b45b1d2b7f6d8e68c1fb39e58f137828aaf0137
  Stored in directory: /private/var/folders/fc/ckq7rt157jvftnycqt444j6c0000gn/T/pip-ephem-wheel-cache-cig50idw/wheels/74/c1/f0/2567cf6194cd803e82d361d64f1fc877fb17172be8761b77cc
  Building wheel for lxml (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [146 lines of output]
      Building lxml version 4.2.5.
      Building without Cython.
      Using build configuration of libxslt 1.1.29
      Building against libxml2/libxslt in the following directory: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-12-x86_64-cpython-39
      creating build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/_elementpath.py -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/sax.py -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/pyclasslookup.py -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/__init__.py -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/builder.py -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/doctestcompare.py -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/usedoctest.py -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/cssselect.py -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/ElementInclude.py -> build/lib.macosx-12-x86_64-cpython-39/lxml
      creating build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/__init__.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      creating build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/soupparser.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/defs.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/_setmixin.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/clean.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/_diffcommand.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/html5parser.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/__init__.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/formfill.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/builder.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/ElementSoup.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/_html5builder.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/usedoctest.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      copying src/lxml/html/diff.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/html
      creating build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron
      copying src/lxml/isoschematron/__init__.py -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron
      copying src/lxml/etree.h -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/etree_api.h -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/lxml.etree.h -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/lxml.etree_api.h -> build/lib.macosx-12-x86_64-cpython-39/lxml
      copying src/lxml/includes/xmlerror.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/c14n.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/xmlschema.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/__init__.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/schematron.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/tree.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/uri.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/etreepublic.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/xpath.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/htmlparser.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/xslt.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/config.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/xmlparser.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/xinclude.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/dtdvalid.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/relaxng.pxd -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/lxml-version.h -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      copying src/lxml/includes/etree_defs.h -> build/lib.macosx-12-x86_64-cpython-39/lxml/includes
      creating build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources
      creating build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/rng
      copying src/lxml/isoschematron/resources/rng/iso-schematron.rng -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/rng
      creating build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl
      copying src/lxml/isoschematron/resources/xsl/XSD2Schtrn.xsl -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl
      copying src/lxml/isoschematron/resources/xsl/RNG2Schtrn.xsl -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl
      creating build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_abstract_expand.xsl -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_dsdl_include.xsl -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_skeleton_for_xslt1.xsl -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_svrl_for_xslt1.xsl -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_message.xsl -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/readme.txt -> build/lib.macosx-12-x86_64-cpython-39/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      running build_ext
      building 'lxml.etree' extension
      creating build/temp.macosx-12-x86_64-cpython-39
      creating build/temp.macosx-12-x86_64-cpython-39/src
      creating build/temp.macosx-12-x86_64-cpython-39/src/lxml
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk -DCYTHON_CLINE_IN_TRACEBACK=0 -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include -Isrc -Isrc/lxml/includes -I/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/lxml/etree.c -o build/temp.macosx-12-x86_64-cpython-39/src/lxml/etree.o -w -flat_namespace
      src/lxml/etree.c:247851:33: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree_Error.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247859:37: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree_LxmlError.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247867:37: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree_C14NError.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247877:38: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__TempStore.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247890:45: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__ExceptionContext.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247900:37: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__LogEntry.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247915:41: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__BaseErrorLog.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247927:41: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__ListErrorLog.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247938:44: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__ErrorLogContext.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247954:37: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__ErrorLog.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247966:43: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__DomainErrorLog.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247978:45: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__RotatingErrorLog.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:247991:38: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree_PyErrorLog.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:248008:20: error: no member named 'tp_print' in 'struct _typeobject'
        LxmlDocumentType.tp_print = 0;
        ~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:248018:35: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree_DocInfo.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:248026:19: error: no member named 'tp_print' in 'struct _typeobject'
        LxmlElementType.tp_print = 0;
        ~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:248106:48: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree___ContentOnlyElement.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:248146:36: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__Comment.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      src/lxml/etree.c:248157:50: error: no member named 'tp_print' in 'struct _typeobject'
        __pyx_type_4lxml_5etree__ProcessingInstruction.tp_print = 0;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
      fatal error: too many errors emitted, stopping now [-ferror-limit=]
      20 errors generated.
      Compile failed: command '/usr/bin/clang' failed with exit code 1
      creating var
      creating var/folders
      creating var/folders/fc
      creating var/folders/fc/ckq7rt157jvftnycqt444j6c0000gn
      creating var/folders/fc/ckq7rt157jvftnycqt444j6c0000gn/T
      cc -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include -I/usr/include/libxml2 -c /var/folders/fc/ckq7rt157jvftnycqt444j6c0000gn/T/xmlXPathInitsrk5ia83.c -o var/folders/fc/ckq7rt157jvftnycqt444j6c0000gn/T/xmlXPathInitsrk5ia83.o
      cc var/folders/fc/ckq7rt157jvftnycqt444j6c0000gn/T/xmlXPathInitsrk5ia83.o -L/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib -lxml2 -o a.out
      error: command '/usr/bin/clang' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for lxml
  Running setup.py clean for lxml
Successfully built geneagrapher
Failed to build lxml

Syntax Error in => record["message"]

I installed with pip and then ran ggrapher -help.
This is the message I get:

Traceback (most recent call last):
  File "/anaconda3/bin/ggrapher", line 7, in <module>
    from geneagrapher.geneagrapher import ggrapher
  File "/anaconda3/lib/python3.7/site-packages/geneagrapher/geneagrapher.py", line 119
    print record["message"]
               ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(record["message"])?

At first I thought it might be because I used Python 3.7, but it did not work even when I switched to 2.7. I am not sure whether this is an issue in your code or a mistake in how I am using it. Any help would be very much appreciated. Thank you!

Doing 'python setup.py test' runs all tests twice

The changes made for #11 have resulted in python setup.py test running all tests twice:

$ python setup.py test
running test
running egg_info
.
.
.
Test __unicode__() method for record without year. ... ok

----------------------------------------------------------------------
Ran 146 tests in 23.961s

OK

There are currently 73 tests, so that is how many should run.

Don't assume that database files have .db extension in tests

I'm getting the following test error:

ERROR: test_init2 (tests.geneagrapher.test_cache_grabber.TestCacheGrabberMethods)
Test constructor with non-default filename.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/profzoom/src/geneagrapher/geneagrapher/tests/geneagrapher/test_cache_grabber.py", line 42, in test_init2
    os.remove("mycachename.db")
FileNotFoundError: [Errno 2] No such file or directory: 'mycachename.db'

This is because the file that was created by CacheGrabber's initialization method doesn't have the .db extension:

$ ls mycachename*
mycachename

According to the shelve docs, there's no guarantee as to the presence of an extension (or even the number of files created!):

The filename specified is the base filename for the underlying database. As a side-effect, an extension may be added to the filename and more than one file may be created.
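One way a test could clean up without assuming an extension, sketched under the assumption that every file the backend creates is named either exactly the base filename or the base filename plus a dot-separated suffix. The helper name remove_shelf_files is hypothetical:

```python
import glob
import os
import shelve
import shutil
import tempfile


def remove_shelf_files(base):
    """Delete `base` and any `base.*` files; return the removed paths."""
    removed = []
    for path in [base] + glob.glob(glob.escape(base) + ".*"):
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed


# Usage: create a shelf in a scratch directory, then clean it up.
workdir = tempfile.mkdtemp()
base = os.path.join(workdir, "mycachename")
with shelve.open(base) as shelf:
    shelf["38586"] = "Leonhard Euler"
removed = remove_shelf_files(base)
shutil.rmtree(workdir, ignore_errors=True)
```

Which files end up in removed depends on the dbm backend available on the machine, which is exactly why the test should not hard-code mycachename.db.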

Wrong formatted json output

I experimented with how the latest geneagrapher version works. I'm interested in the JSON output, which could be used with Python's networkx and igraph and with Julia's Graphs.jl.
The json output option creates a file that isn't a JSON-formatted dict and cannot be read by Python's json module or Julia's JSON package. (Here is such a file: https://github.com/empet/Datasets/blob/master/strogatz.json).
With the current format, an attempt to read it with Python:

import json
with open('strogatz.json', 'r') as fp:
    jdata = json.load( fp)

ends with this error:

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

A file structure like the one below would be fine, and it wouldn't require many additions to the current geneagrapher code:

{
 "nodes":[{"id": 1, "label": "Mathematician1\nUniversity of Texas at Austin"},
          {"id": 2, "label": "Mathematician2\nCornell University"},
          {"id": 3, "label": "Mathematician3\nUCLA"}],
 "edges":[[1,2], [1,3]]
}
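A short sketch of how output in that shape could be produced with the standard json module, so that the result is guaranteed to parse. The function graph_to_json and its arguments are hypothetical and are not part of geneagrapher:

```python
import json


def graph_to_json(nodes, edges):
    """Serialize nodes ({id: label}) and edges (pairs of node IDs)."""
    payload = {
        "nodes": [{"id": node_id, "label": label}
                  for node_id, label in sorted(nodes.items())],
        "edges": [list(edge) for edge in edges],
    }
    return json.dumps(payload, indent=1)


nodes = {1: "Mathematician1\nUniversity of Texas at Austin",
         2: "Mathematician2\nCornell University",
         3: "Mathematician3\nUCLA"}
text = graph_to_json(nodes, [(1, 2), (1, 3)])
data = json.loads(text)  # the output round-trips through the parser
```

Building a plain dict first and leaving the serialization to json.dumps also takes care of escaping the newlines inside the labels.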

To build a tree structure from the JSON file generated by geneagrapher, I copied the nodes and labels into one CSV file and the edges into another, which took a lot of find-and-replace to get the data into columns. Here is the resulting tree, processed and plotted with a few Julia packages and custom functions: https://nbviewer.org/gist/empet/e4887c789c5d276688752db7d218de31. It is an interactive plot.

Port to Beautiful Soup 4

From Debian bug #891107:

beautifulsoup version 4 was replaced as a new package, bs4, which has
been in Debian for over 5 years now. beautifulsoup (3.x) hasn't seen any
maintenance since then. It's high time to remove it from the archive.

Most code written against Beautiful Soup 3 will work against Beautiful
Soup 4 with one simple change. All you should have to do is change the
package name from BeautifulSoup to bs4, and depend on python-bs4 instead
of python-beautifulsoup.

There is some documentation on the migration here:
https://www.crummy.com/software/BeautifulSoup/bs4/doc/#porting-code-to-bs4

Remove enumeration from test method names

At one time it seemed like a good idea to force the test running order. This was done by making the test method names' lexicographic order the same as the order in which they appear in the test classes. The desired lexicographic order is achieved by adding test numbers to the method names (e.g., test001_init_empty).

It now seems too clumsy and cumbersome. Remove these numbers. If having the tests run in a different order proves problematic, the issue can be revisited.

Redesign grabbing interface for extensibility

The next version of Geneagrapher will introduce a simple caching mechanism. A prerequisite of this is to redesign the current interface and calls to allow more extensibility (e.g., modifying the code to pass an argument defining the grab/cache used when initializing the Geneagrapher object).


Prerequisite of #8.

Simple cache using shelve

Add a simple cache using shelve from the Python Standard Library.

This cache will not support concurrency (i.e., running multiple instances of Geneagrapher simultaneously), and this limitation should be clearly explained to the user.


Blocked by #7.
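A minimal sketch of what such a cache might look like, assuming a grabber object that exposes get_record(id). StubGrabber is hypothetical, and this CacheGrabber is an illustration of the idea rather than the project's implementation:

```python
import os
import shelve
import shutil
import tempfile


class StubGrabber:
    """Hypothetical grabber; a real one would query the remote database."""

    def __init__(self):
        self.calls = 0

    def get_record(self, record_id):
        self.calls += 1
        return {"id": record_id, "name": "Record {}".format(record_id)}


class CacheGrabber:
    """Hypothetical caching wrapper with the same get_record interface."""

    def __init__(self, grabber, filename):
        self.grabber = grabber
        self.filename = filename

    def get_record(self, record_id):
        key = str(record_id)  # shelve keys must be strings
        with shelve.open(self.filename) as cache:
            if key not in cache:
                # Cache miss: ask the wrapped grabber and store the result.
                cache[key] = self.grabber.get_record(record_id)
            return cache[key]


# Usage: the second lookup is served from the shelf, not the grabber.
workdir = tempfile.mkdtemp()
stub = StubGrabber()
cached = CacheGrabber(stub, os.path.join(workdir, "cache"))
first = cached.get_record(38586)
second = cached.get_record(38586)
shutil.rmtree(workdir, ignore_errors=True)
```

Because the wrapper and the wrapped grabber share one interface, the caller can be handed either object. Nothing here guards against two processes opening the same shelf at once, which matches the concurrency limitation noted above.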

cannot install

When I run python setup.py install, I get:

Traceback (most recent call last):
  File "setup.py", line 1, in <module>
    from setuptools import setup, find_packages
ImportError: No module named setuptools
