pycproject's People

Contributors

chreman, jameshiew

Forkers

davanstrien

pycproject's Issues

feature wish: look for adjacent terms

A function for finding neighbouring terms would be great, for example to find the term "patient" within three words of the term "anxiety" => "patient adj3 anxiety". A second function could check whether two terms occur in the same sentence or paragraph.
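
The request above could be sketched roughly as follows. This is only an illustration in plain Python: `adjacent` and `same_sentence` are hypothetical names and not part of pycproject, and the sketch assumes that "adj3" means the two terms are at most three word positions apart, in either order, with a very simple word and sentence tokenisation.

```python
import re


def adjacent(text, term_a, term_b, max_distance=3):
    """Return True if term_a occurs within max_distance word positions of term_b.

    Assumption: distance is the difference between word indices, in either order.
    """
    words = re.findall(r"\w+", text.lower())
    positions_a = [i for i, word in enumerate(words) if word == term_a.lower()]
    positions_b = [i for i, word in enumerate(words) if word == term_b.lower()]
    return any(abs(i - j) <= max_distance for i in positions_a for j in positions_b)


def same_sentence(text, term_a, term_b):
    """Return True if both terms occur in at least one common sentence.

    Assumption: sentences end with '.', '!' or '?' followed by whitespace.
    """
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = set(re.findall(r"\w+", sentence.lower()))
        if term_a.lower() in words and term_b.lower() in words:
            return True
    return False
```

A paragraph-level check could reuse `same_sentence` with a split on blank lines instead of sentence punctuation.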

can not upgrade via pip

After the update (6a4a8a4), I cannot upgrade pycproject via pip.

This is what happened:

First I ran pip install pycproject --upgrade
This did not work because of the following error:

*********************************************************************************
Could not find function xmlCheckVersion in library libxml2. Is libxml2 installed?
*********************************************************************************
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    
----------------------------------------
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-i4z7rgdq/lxml/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-0bdxv72b-record/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-build-i4z7rgdq/lxml/

So I tried to upgrade libxml2, first with pip (which did not work), then with apt-get (Ubuntu 16.04).
When I then tried to upgrade pycproject again, a different error was thrown.

x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -Isrc/lxml/includes -I/usr/include/python3.5m -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-3.5/src/lxml/lxml.etree.o -w
  In file included from src/lxml/lxml.etree.c:320:0:
  src/lxml/includes/etree_defs.h:14:31: fatal error: libxml/xmlversion.h: No such file or directory
  compilation terminated.
  Compile failed: command 'x86_64-linux-gnu-gcc' failed with exit status 1
  creating tmp
  cc -I/usr/include/libxml2 -c /tmp/xmlXPathInit40vcw29a.c -o tmp/xmlXPathInit40vcw29a.o
  /tmp/xmlXPathInit40vcw29a.c:1:26: fatal error: libxml/xpath.h: No such file or directory
  compilation terminated.
  *********************************************************************************
  Could not find function xmlCheckVersion in library libxml2. Is libxml2 installed?
  *********************************************************************************
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
  
  ----------------------------------------
  Failed building wheel for lxml
  Running setup.py clean for lxml
Failed to build lxml
Installing collected packages: lxml, pycproject
  Found existing installation: lxml 3.5.0
    Uninstalling lxml-3.5.0:
Exception:
Traceback (most recent call last):
  File "/usr/lib/python3.5/shutil.py", line 538, in move
    os.rename(src, real_dst)
PermissionError: [Errno 13] Permission denied: '/usr/lib/python3/dist-packages/lxml' -> '/tmp/pip-19mus2bz-uninstall/usr/lib/python3/dist-packages/lxml'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cheeseman/.local/lib/python3.5/site-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/home/cheeseman/.local/lib/python3.5/site-packages/pip/commands/install.py", line 342, in run
    prefix=options.prefix_path,
  File "/home/cheeseman/.local/lib/python3.5/site-packages/pip/req/req_set.py", line 778, in install
    requirement.uninstall(auto_confirm=True)
  File "/home/cheeseman/.local/lib/python3.5/site-packages/pip/req/req_install.py", line 754, in uninstall
    paths_to_remove.remove(auto_confirm)
  File "/home/cheeseman/.local/lib/python3.5/site-packages/pip/req/req_uninstall.py", line 115, in remove
    renames(path, new_path)
  File "/home/cheeseman/.local/lib/python3.5/site-packages/pip/utils/__init__.py", line 267, in renames
    shutil.move(old, new)
  File "/usr/lib/python3.5/shutil.py", line 550, in move
    rmtree(src)
  File "/usr/lib/python3.5/shutil.py", line 474, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/usr/lib/python3.5/shutil.py", line 432, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/usr/lib/python3.5/shutil.py", line 430, in _rmtree_safe_fd
    os.unlink(name, dir_fd=topfd)
PermissionError: [Errno 13] Permission denied: 'cssselect.py'
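
The compiler output above reports that `libxml/xmlversion.h` and `libxml/xpath.h` cannot be found, which usually points to missing development headers, not to a missing libxml2 runtime library. On Debian and Ubuntu these headers are normally provided by the `libxml2-dev` and `libxslt1-dev` packages. A minimal diagnostic sketch follows, assuming the include directory `/usr/include/libxml2` seen in the log; `find_header` is a hypothetical helper, not part of pip, lxml or pycproject.

```python
import os


def find_header(relative_path, include_dirs):
    """Return the first existing path to relative_path under include_dirs, or None."""
    for directory in include_dirs:
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Assumption: these are the directories the compiler searches on this system.
    search_dirs = ["/usr/include/libxml2", "/usr/include"]
    for header in ("libxml/xmlversion.h", "libxml/xpath.h"):
        print(header, "->", find_header(header, search_dirs) or "missing")
```

The later `PermissionError` appears to be a separate problem: pip tries to uninstall the system-wide lxml 3.5.0 under `/usr/lib/python3/dist-packages`, which an unprivileged user cannot modify.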

Use metadata in get_ functions if scholarly.html not available

Just an idea as I'm starting to play with pycproject...

I have a mix of open-access and closed articles, all of which have metadata, but only some of which have scholarly.html. This could arise in other situations as well, for example if some issue prevented scholarly.html from being generated for some files, or if I haven't downloaded the articles yet.

In this situation, if I have code that uses get_title, get_abstract etc., I would expect it to get data from scholarly.html if available, and otherwise to get what it can from the metadata.

This way I don't have to write two different code paths for open and closed articles, and code that only uses information available in the metadata works before downloading the articles.
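
The fallback behaviour described above could look roughly like this. It is only a sketch: `first_available` is a hypothetical helper, and the two sources are modelled as plain dictionaries with made-up field names, since the actual layout of scholarly.html and of the metadata in pycproject is not shown here.

```python
def first_available(field, *sources):
    """Return the first non-empty value for field from the given dict-like sources.

    Sources are tried in order; a source that is None or empty is skipped.
    """
    for source in sources:
        if source:
            value = source.get(field)
            if value:
                return value
    return None


# Assumption: scholarly.html was not generated for this article, so only
# the metadata is available.
scholarly = None
metadata = {"title": "An example title", "abstract": "An example abstract."}

title = first_available("title", scholarly, metadata)
print(title)
```

With this ordering, calling code stays the same for open and closed articles, and the richer source wins whenever it exists.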

Does this make sense? Or is the metadata structure so repository-specific that it makes no sense to try to get information reliably from it?

Cheers

html5lib error

I have had this problem for a few weeks when using the Python wrapper. It happens in Jupyter as well as in plain python3. Maybe we should pin the versions of the packages we depend on, so that pycproject cannot break when the packages we install via pip change.

`import numpy as np
from pandas import Series, DataFrame
import matplotlib.pyplot as plt
from pycproject.readctree import CProject
from pycproject.factnet import *
import os
from collections import Counter

%matplotlib inline


AttributeError Traceback (most recent call last)
in ()
2 from pandas import Series, DataFrame
3 import matplotlib.pyplot as plt
----> 4 from pycproject.readctree import CProject
5 from pycproject.factnet import *
6 import os

/home/cheeseman/.local/lib/python3.5/site-packages/pycproject/__init__.py in ()
----> 1 from . import readctree

/home/cheeseman/.local/lib/python3.5/site-packages/pycproject/readctree.py in ()
17
18 # import data handling
---> 19 from bs4 import BeautifulSoup
20
21

/usr/lib/python3/dist-packages/bs4/__init__.py in ()
28 import warnings
29
---> 30 from .builder import builder_registry, ParserRejectedMarkup
31 from .dammit import UnicodeDammit
32 from .element import (

/usr/lib/python3/dist-packages/bs4/builder/__init__.py in ()
312 register_treebuilders_from(_htmlparser)
313 try:
--> 314 from . import _html5lib
315 register_treebuilders_from(_html5lib)
316 except ImportError:

/usr/lib/python3/dist-packages/bs4/builder/_html5lib.py in ()
68
69
---> 70 class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder):
71
72 def __init__(self, soup, namespaceHTMLElements):

AttributeError: module 'html5lib.treebuilders' has no attribute '_base'
`
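
The traceback suggests that the installed html5lib no longer exposes `treebuilders._base`, while the system-wide bs4 still expects it, so the two installed versions do not match. The version-pinning idea from the report could be checked with a small sketch like the one below. `parse_pin` and `check_pins` are hypothetical helpers and not part of pycproject, pins are assumed to use the simple `name==version` form, and `importlib.metadata` requires Python 3.8 or newer (the reporter's environment is Python 3.5).

```python
from importlib import metadata


def parse_pin(line):
    """Split a 'name==version' pin into (name, version)."""
    name, _, version = line.strip().partition("==")
    return name.strip(), version.strip()


def check_pins(pins, installed_version=metadata.version):
    """Return {name: (wanted, found)} for each pin whose installed version differs.

    found is None when the package is not installed at all.
    """
    mismatches = {}
    for line in pins:
        name, wanted = parse_pin(line)
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `check_pins` with the project's pinned requirements at import time or in a test would report a mismatch before the import of bs4 fails. Which versions to pin is left open here, since the report does not say which combination last worked.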
