Git Product home page Git Product logo

arxiv-dataset-download's Introduction

ArXiv Metadata Dataset

This tool generates a dataset containing ArXiv papers' metadata. You can specify subjects classification of your interest, combining the following tags:

Tag Extended name
astro-ph Astrophysics
astro-ph.CO Cosmology and Nongalactic Astrophysics
astro-ph.EP Earth and Planetary Astrophysics
astro-ph.GA Astrophysics of Galaxies
astro-ph.HE High Energy Astrophysical Phenomena
astro-ph.IM Instrumentation and Methods for Astrophysics
astro-ph.SR Solar and Stellar Astrophysics
cond-mat.dis-nn Disordered Systems and Neural Networks
cond-mat.mes-hall Mesoscale and Nanoscale Physics
cond-mat.mtrl-sci Materials Science
cond-mat.other Other Condensed Matter
cond-mat.quant-gas Quantum Gases
cond-mat.soft Soft Condensed Matter
cond-mat.stat-mech Statistical Mechanics
cond-mat.str-el Strongly Correlated Electrons
cond-mat.supr-con Superconductivity
cs.AI Artificial Intelligence
cs.AR Hardware Architecture
cs.CC Computational Complexity
cs.CE Computational Engineering, Finance, and Science
cs.CG Computational Geometry
cs.CL Computation and Language
cs.CR Cryptography and Security
cs.CV Computer Vision and Pattern Recognition
cs.CY Computers and Society
cs.DB Databases
cs.DC Distributed, Parallel, and Cluster Computing
cs.DL Digital Libraries
cs.DM Discrete Mathematics
cs.DS Data Structures and Algorithms
cs.ET Emerging Technologies
cs.FL Formal Languages and Automata Theory
cs.GL General Literature
cs.GR Graphics
cs.GT Computer Science and Game Theory
cs.HC Human-Computer Interaction
cs.IR Information Retrieval
cs.IT Information Theory
cs.LG Machine Learning
cs.LO Logic in Computer Science
cs.MA Multiagent Systems
cs.MM Multimedia
cs.MS Mathematical Software
cs.NA Numerical Analysis
cs.NE Neural and Evolutionary Computing
cs.NI Networking and Internet Architecture
cs.OH Other Computer Science
cs.OS Operating Systems
cs.PF Performance
cs.PL Programming Languages
cs.RO Robotics
cs.SC Symbolic Computation
cs.SD Sound
cs.SE Software Engineering
cs.SI Social and Information Networks
cs.SY Systems and Control
econ.EM Econometrics
eess.AS Audio and Speech Processing
eess.IV Image and Video Processing
eess.SP Signal Processing
gr-qc General Relativity and Quantum Cosmology
hep-ex High Energy Physics - Experiment
hep-lat High Energy Physics - Lattice
hep-ph High Energy Physics - Phenomenology
hep-th High Energy Physics - Theory
math.AC Commutative Algebra
math.AG Algebraic Geometry
math.AP Analysis of PDEs
math.AT Algebraic Topology
math.CA Classical Analysis and ODEs
math.CO Combinatorics
math.CT Category Theory
math.CV Complex Variables
math.DG Differential Geometry
math.DS Dynamical Systems
math.FA Functional Analysis
math.GM General Mathematics
math.GN General Topology
math.GR Group Theory
math.GT Geometric Topology
math.HO History and Overview
math.IT Information Theory
math.KT K-Theory and Homology
math.LO Logic
math.MG Metric Geometry
math.MP Mathematical Physics
math.NA Numerical Analysis
math.NT Number Theory
math.OA Operator Algebras
math.OC Optimization and Control
math.PR Probability
math.QA Quantum Algebra
math.RA Rings and Algebras
math.RT Representation Theory
math.SG Symplectic Geometry
math.SP Spectral Theory
math.ST Statistics Theory
math-ph Mathematical Physics
nlin.AO Adaptation and Self-Organizing Systems
nlin.CD Chaotic Dynamics
nlin.CG Cellular Automata and Lattice Gases
nlin.PS Pattern Formation and Solitons
nlin.SI Exactly Solvable and Integrable Systems
nucl-ex Nuclear Experiment
nucl-th Nuclear Theory
physics.acc-ph Accelerator Physics
physics.ao-ph Atmospheric and Oceanic Physics
physics.app-ph Applied Physics
physics.atm-clus Atomic and Molecular Clusters
physics.atom-ph Atomic Physics
physics.bio-ph Biological Physics
physics.chem-ph Chemical Physics
physics.class-ph Classical Physics
physics.comp-ph Computational Physics
physics.data-an Data Analysis, Statistics and Probability
physics.ed-ph Physics Education
physics.flu-dyn Fluid Dynamics
physics.gen-ph General Physics
physics.geo-ph Geophysics
physics.hist-ph History and Philosophy of Physics
physics.ins-det Instrumentation and Detectors
physics.med-ph Medical Physics
physics.optics Optics
physics.plasm-ph Plasma Physics
physics.pop-ph Popular Physics
physics.soc-ph Physics and Society
physics.space-ph Space Physics
q-bio.BM Biomolecules
q-bio.CB Cell Behavior
q-bio.GN Genomics
q-bio.MN Molecular Networks
q-bio.NC Neurons and Cognition
q-bio.OT Other Quantitative Biology
q-bio.PE Populations and Evolution
q-bio.QM Quantitative Methods
q-bio.SC Subcellular Processes
q-bio.TO Tissues and Organs
q-fin.CP Computational Finance
q-fin.EC Economics
q-fin.GN General Finance
q-fin.MF Mathematical Finance
q-fin.PM Portfolio Management
q-fin.PR Pricing of Securities
q-fin.RM Risk Management
q-fin.ST Statistical Finance
q-fin.TR Trading and Market Microstructure
quant-ph Quantum Physics
stat.AP Applications
stat.CO Computation
stat.ME Methodology
stat.ML Machine Learning
stat.OT Other Statistics
stat.TH Statistics Theory

Data is retrived in parallel using a thread for each core. Entries are saved saved in JSON files as arrays of objects having this structure:

{
    "id" : "xxx.xxxxxx",
    "date" : "2016-12-08T17:45:16Z"
    "title" : "XXXX...",
    "authors" : ["X", "Y"],
    "link" : [{"rel": "related", "href": "http://arxiv.org/pdf/xx.xx", "type": "application/pdf", "title": "pdf"}],
    "tag" : ["cs.AI", "cs.CL"],
    "abstract" : "etc....."
}

This project is a fork.

arxiv-dataset-download's People

Contributors

gabrielepisciotta avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.