
phyllo's People

Contributors

cegme, jord9nn, katyfelkner, ramcharran

phyllo's Issues

Segmentation fault when indexing

There is a segmentation fault when indexing the database.

The error can be replicated as follows:

docker build -t cegme/phyllo .
docker run -it cegme/phyllo /bin/bash

$ python3 app.py

The commands above build the image and then open a shell inside the container, where the app code is run. This should run the tokenize() function inside app.py. Within tokenize(), the register_tokenizer function is causing the segmentation fault.

My initial guess is that my CPU is running out of space allocated for the image.

@YanLiang1102 can you take a look at this issue?
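
One way to narrow down where the crash happens is Python's standard faulthandler module, which dumps the Python stack when the interpreter receives a fatal signal such as SIGSEGV. The following is a minimal sketch, not code from this repository; the helper name enable_crash_traces and the file name crash.log are assumptions made for illustration.

```python
import faulthandler


def enable_crash_traces(log_path):
    """Write the Python stack to log_path if the interpreter crashes.

    Hypothetical helper: the name and the log path are illustrative only.
    """
    log_file = open(log_path, "w")
    # all_threads=True dumps every thread, not only the one that crashed.
    faulthandler.enable(file=log_file, all_threads=True)
    # Return the file object so the caller keeps it open for the
    # lifetime of the process; faulthandler writes to its descriptor.
    return log_file


if __name__ == "__main__":
    crash_log = enable_crash_traces("crash.log")
    # ... call the code that crashes here, e.g. the tokenizer registration ...
```

Running the script with `python3 -X faulthandler app.py` enables the same handler without code changes and prints the trace to stderr.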

Correct imports and modules

We need to create a module for phyllo so that it can be installed easily in a Docker image. We need to correct the imports to make them aware of the full package.

We also need to create functions to hide some of the automatically executed code. Code that is not in functions will cause errors when imported because of the order in which data is downloaded.
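
As a rough illustration of hiding automatically executed code, the sketch below moves the work into functions and runs it only when the file is executed as a script. The function names download_data and build_index are hypothetical placeholders, not the project's actual functions.

```python
def download_data():
    """Hypothetical stand-in for the data download step."""
    return ["placeholder record"]


def build_index(records):
    """Hypothetical stand-in for the indexing step."""
    return {position: record for position, record in enumerate(records)}


def main():
    # Nothing above runs at import time; the order of the steps is
    # controlled here, in one place.
    records = download_data()
    return build_index(records)


if __name__ == "__main__":
    # Executed only for `python3 thisfile.py`, not on `import thisfile`.
    main()
```

With this layout, importing the module has no side effects, so other modules in the package can import it in any order.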

Program exits before executing the insert statement in app.py

When I tried to execute app.py, I got the following output:

Connected to pydev debugger (build 171.4249.47)
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
 * Restarting with stat
connection to cursor
registering tokenizer
virtual table created
Process finished with exit code 245

It does not throw any error, neither a segmentation fault nor anything else.
It just does not execute the insert statement.

Flask is restarting whenever the application is run

This issue wasn't there before. This is the issue that was causing Flask to restart.

This is the output when I try to execute it now:
bash-4.3# python3 app.py
connection to cursor
registering tokenizer
virtual table created
Segmentation fault (core dumped)

When I remove the registration statement, everything works fine and the program executes completely.

GIF of loading and searches

Hello @ramcharran,

Can you let me know when you upload the recorded gif of the installation and search showing all the advanced search operations?

A screen capture video should be good enough. Then, we will forward that over to the DLL/Dr. Huskey.

Add a Docker service to Phyllo

We will create a Docker service to access all the scraped data.
Below is the list of tasks to get the service working.

  • Import docker file from the sqlite-fts-python repository.
  • Change the docker base image from ubuntu to alpine so it is smaller
  • Add a skeleton flask service to alpine
  • Add scripts to download all data and create a single data store
  • Add scripts to perform full-text indexing to the downloaded data
  • Hook the flask service into the downloaded data
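
The indexing and lookup steps in the list above could be sketched roughly as follows. This is an assumption-laden illustration, not the project's script: it uses SQLite's built-in FTS4 module with the default tokenizer instead of the custom tokenizer from sqlite-fts-python, it assumes the SQLite build includes FTS4, and the helper names, sample rows, and example.invalid links are made up. Only the table name text_idx and the column names come from the queries quoted elsewhere on this page.

```python
import sqlite3


def create_text_index(rows):
    """Build an in-memory full-text index from (title, book, author, link, text) rows."""
    conn = sqlite3.connect(":memory:")
    # Assumption: the SQLite library was compiled with FTS4 support.
    conn.execute(
        "CREATE VIRTUAL TABLE text_idx USING fts4(title, book, author, link, text)"
    )
    conn.executemany(
        "INSERT INTO text_idx (title, book, author, link, text) VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def search(conn, query):
    """Return (title, author) pairs for rows matching a full-text query."""
    cursor = conn.execute(
        "SELECT title, author FROM text_idx WHERE text_idx MATCH ?", (query,)
    )
    return cursor.fetchall()


# Made-up sample rows, for illustration only.
SAMPLE_ROWS = [
    ("Sample A", "Liber I", "Author A", "http://example.invalid/a",
     "Gallia est omnis divisa in partes tres"),
    ("Sample B", "Liber I", "Author B", "http://example.invalid/b",
     "arma virumque cano"),
]

if __name__ == "__main__":
    connection = create_text_index(SAMPLE_ROWS)
    print(search(connection, "divisa OR cano"))
```

A Flask route could then call search() with the user's query string and return the rows, which is the "hook" step in the list.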

Not able to pass jsonify dictionary to jQuery

The results obtained in app.py when we run the query are of type defaultdict(set).

The snippets of the query are stored in the set part of the result dict.

When I pass the result to jQuery, it retrieves everything but the snippets. I do not understand how to rectify this.

The screenshot below shows the query search on the web application
capture1

The screenshot below shows that result returned after running the query does contain the snippets:
capture2
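
A possible cause is that JSON has no set type, and Python's standard json module raises a TypeError when asked to serialize a set. The sketch below converts each set to a sorted list before serializing. It assumes the result maps a string key to a set of snippet strings; the actual key structure in app.py is not shown in this issue, and the helper name to_jsonable is hypothetical.

```python
import json
from collections import defaultdict


def to_jsonable(results):
    """Convert a mapping of key -> set into key -> sorted list.

    Hypothetical helper: assumes keys are strings and values are sets
    of strings, which may differ from the real structure in app.py.
    """
    return {key: sorted(values) for key, values in results.items()}


if __name__ == "__main__":
    results = defaultdict(set)
    results["Sample title"].add("first snippet")
    results["Sample title"].add("second snippet")
    # The converted dict serializes cleanly; the original would not.
    print(json.dumps(to_jsonable(results)))
```

The converted dict can be passed to Flask's jsonify in the same way, and the snippets then arrive on the client as a plain array.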

UnicodeDecodeError while searching with single term

app.py works for all the advanced searches except single-word searches such as

SELECT title, book, author, link, snippet(text_idx) FROM text_idx WHERE text_idx MATCH 'possumus';

and OR searches such as

SELECT title, book, author, link, snippet(text_idx) FROM text_idx WHERE text_idx MATCH 'quam OR Galliae';

The application exits at line:

r1 = c.fetchall()

with the following error when queries similar to the ones above are run:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 105: invalid continuation byte
screenshot 22
screenshot 23
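
As a workaround while the root cause is investigated, the sqlite3 connection's text_factory can be set to decode text leniently, so that a malformed byte sequence becomes a replacement character instead of aborting fetchall(). This is a sketch under assumptions, not a confirmed fix: the helper name tolerant_connection is hypothetical, and the reason the snippet contains invalid UTF-8 (for example, a snippet boundary falling inside a multi-byte character) is only a guess.

```python
import sqlite3


def tolerant_connection(path):
    """Open a SQLite connection that never fails on malformed UTF-8 text.

    Hypothetical helper name; text_factory itself is a standard
    attribute of sqlite3 connections.
    """
    conn = sqlite3.connect(path)
    # sqlite3 passes the raw bytes of each TEXT value to text_factory.
    # errors="replace" substitutes U+FFFD for bytes that cannot be decoded.
    conn.text_factory = lambda raw: raw.decode("utf-8", errors="replace")
    return conn


if __name__ == "__main__":
    conn = tolerant_connection(":memory:")
    # X'E228A1' is deliberately invalid UTF-8: 0xE2 starts a multi-byte
    # sequence but is followed by a byte that cannot continue it.
    row = conn.execute("SELECT CAST(X'E228A1' AS TEXT)").fetchone()
    print(repr(row[0]))
```

This hides the symptom rather than repairing the data, so the affected snippets will show replacement characters where the broken bytes were.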
