amphi's Issues
Authentication
Use phx.gen.auth to refactor authentication.
Profile page
A page that shows the user along with their liked posts/comments, written posts/comments, publications, and maybe collaborators.
PDF Reader
It should be super comfy to read a PDF, read the comments alongside and write new ones.
- Load/render the PDF in a memory-efficient way (release pages that are not being displayed, for example)
- Display comment threads in some way
- Make it possible to highlight text and comment on that section
- Display cited papers
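One way to think about releasing pages that are not being displayed is a small least-recently-used cache. The sketch below is only an illustration in Python, not the reader itself: `PageCache` and the `render` callback are hypothetical names, and a real reader would hold rendered canvases rather than strings.

```python
from __future__ import annotations

from collections import OrderedDict
from typing import Callable


class PageCache:
    """Keeps at most `capacity` rendered pages and evicts the least recently used."""

    def __init__(self, render: Callable[[int], str], capacity: int = 3) -> None:
        self._render = render      # hypothetical renderer: page number -> rendered page
        self._capacity = capacity
        self._pages: OrderedDict[int, str] = OrderedDict()

    def get(self, page_number: int) -> str:
        if page_number in self._pages:
            # Already rendered: mark it as the most recently used page.
            self._pages.move_to_end(page_number)
            return self._pages[page_number]
        rendered = self._render(page_number)
        self._pages[page_number] = rendered
        if len(self._pages) > self._capacity:
            # Release the page that has gone unused the longest.
            self._pages.popitem(last=False)
        return rendered

    def cached_pages(self) -> list[int]:
        """Page numbers currently held, least recently used first."""
        return list(self._pages)
```

With a capacity of a few pages around the visible one, scrolling through a long paper keeps memory roughly constant, at the cost of re-rendering a page when the user jumps back to it.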
Infrastructure
Once we have a basic web crawler and website running, we should deploy this to some service. I don't know yet which service suits our needs best. Maybe we're fine with just getting a database service first and running the website locally only. I also heard of fly.io, which might be cool.
Web Crawler
The crawler should have the following functionality:
- Fetch new articles and crawl through the arXiv database (later other providers like PubMed)
- Extract text, author (name, email, affiliation), publication date, citations, keywords, ccs
- Save all related DOIs in order to avoid duplicates in the db (providers use different DOIs for the "same" paper)
- Save the entries in the database
I implemented a basic web crawler in JavaScript so that we can use pdf.js. This seemed to make reading the PDF easier than with pdfplumber, for example. Getting a clean copy of the text content is quite difficult, but might not be necessary.
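The DOI deduplication step from the list above could look roughly like the following. This is a minimal in-memory sketch in Python, assuming every DOI seen for a paper is stored as an alias of one paper ID; `PaperIndex`, `normalize_doi`, and the DOIs in the usage example are made up for illustration, and the real version would live in the database.

```python
from __future__ import annotations


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common prefixes (DOIs are case-insensitive)."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


class PaperIndex:
    """Maps every known DOI alias to a single paper ID."""

    def __init__(self) -> None:
        self._paper_by_doi: dict[str, int] = {}
        self._next_id = 1

    def add(self, dois: list[str]) -> tuple[int, bool]:
        """Register a paper by its DOIs and return (paper_id, created)."""
        normalized = [normalize_doi(d) for d in dois]
        for doi in normalized:
            if doi in self._paper_by_doi:
                # One of the DOIs is already known, so this is a duplicate.
                paper_id = self._paper_by_doi[doi]
                created = False
                break
        else:
            paper_id = self._next_id
            self._next_id += 1
            created = True
        # Remember every alias so later providers map to the same paper.
        for doi in normalized:
            self._paper_by_doi.setdefault(doi, paper_id)
        return paper_id, created
```

Usage, with placeholder DOIs:

```python
index = PaperIndex()
index.add(["10.48550/arXiv.2101.00001"])
index.add(["https://doi.org/10.48550/ARXIV.2101.00001", "10.1000/example.123"])
index.add(["doi:10.1000/EXAMPLE.123"])  # resolves to the same paper via the alias
```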
Feed
There should be a feed with relevant/new papers that could be interesting to the user. No idea how to implement this, though. The post order for the feed should probably take the following data points into account:
- publication date
- number of likes
- topic (keywords, ccs)
- number of reads
This means that first we have to implement things like:
- post likes
- mechanism to see different feeds (e.g. all, new, new in AI)
- user history to count the number of reads of a paper
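As a starting point, the data points listed above could be folded into a single time-decayed score. The Python sketch below is one possible approach, not a settled design: the `Post` fields, the weights, and the `gravity` exponent are all arbitrary assumptions that would need tuning.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Post:
    title: str
    published_at: datetime
    likes: int = 0
    reads: int = 0
    keywords: set[str] = field(default_factory=set)


def feed_score(post: Post, now: datetime, interests: set[str], gravity: float = 1.5) -> float:
    """Higher is better: engagement and topic match, divided by a power of the age."""
    age_hours = max((now - post.published_at).total_seconds() / 3600.0, 0.0)
    engagement = 1.0 + 2.0 * post.likes + 0.1 * post.reads   # assumed weights
    topic_boost = 1.0 + len(post.keywords & interests)       # one step per matching keyword
    return engagement * topic_boost / (age_hours + 2.0) ** gravity


def build_feed(posts: list[Post], now: datetime, interests: set[str]) -> list[Post]:
    """Return posts ordered from most to least relevant for this user."""
    return sorted(posts, key=lambda p: feed_score(p, now, interests), reverse=True)
```

Different feeds (all, new, new in AI) could then be the same ranking applied with a different `interests` set or a filter on `keywords` before sorting.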
Database Setup
Eventually, it should be possible to do a fuzzy text search on the contents of all the papers in the db. As far as I understand, this is a good use case for NoSQL. However, NoSQL databases are relatively slow at relating data to one another, so loading all the publications of one specific author might be slow.
I'm not sure how to tackle this. Maybe it's possible to use PostgreSQL for data like comments/users and MongoDB for the papers?
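To make "fuzzy text search" concrete, here is a small Python sketch of trigram similarity, one common technique for matching text despite typos. It is a toy illustration, not what any particular database does internally; `trigrams`, `similarity`, `fuzzy_search`, and the 0.3 threshold are assumptions. PostgreSQL also ships a `pg_trgm` extension for trigram matching, so it may be worth checking whether a single database is enough before splitting the data.

```python
from __future__ import annotations


def trigrams(text: str) -> set[str]:
    """Split lowercased, padded text into overlapping three-character chunks."""
    padded = f"  {text.lower().strip()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def fuzzy_search(query: str, titles: list[str], threshold: float = 0.3) -> list[str]:
    """Return titles similar enough to the query, best match first."""
    scored = [(similarity(query, title), title) for title in titles]
    return [title for score, title in sorted(scored, reverse=True) if score >= threshold]
```

A misspelled query such as "nueral networks" still shares most of its trigrams with "Neural networks", which is why this kind of search tolerates typos.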