citation-network

RCitation - A quick way to create a citation network based on titles in R

A.R. Siders, [email protected]

Intro

  • This code identifies cross-citations within a set of texts.
  • It works by searching for full titles. The rationale is that if Article 1 contains the full title of Article 2, it is because Article 1 cites Article 2.
  • This approach does not work if a field's citation format omits full titles, or if paper titles are very short or consist of common phrases (e.g., a paper titled "Vulnerability" yields many false positives in a network of climate vulnerability studies).
  • Errors arise from typos in the cited titles, formatting problems introduced when converting texts from PDF to txt, and shortened citations (e.g., when an author cites only the part of a title before a colon). The approach is therefore a quick but inexact way to create a citation network, useful for exploring patterns and clusters.
  • The code uses the R packages tm and plyr to load and search texts.
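
The matching idea described above can be sketched in a few lines. This is an illustrative Python sketch, not the repository's R code: the function names `normalize` and `find_citations`, the sample titles and texts, the whitespace normalization, and the choice to skip a paper's match against its own title are all assumptions made for the example.

```python
import re

def normalize(text):
    """Lowercase and collapse whitespace, so that line breaks left over
    from PDF conversion do not block a match (an assumption of this sketch)."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def find_citations(texts, titles):
    """Return (citing, cited) index pairs where text i contains title j.

    texts[i] and titles[i] are assumed to describe the same paper.
    """
    norm_titles = [normalize(t) for t in titles]
    edges = []
    for i, text in enumerate(texts):
        body = normalize(text)
        for j, title in enumerate(norm_titles):
            # Skip self-matches: a paper normally contains its own title.
            if i != j and title and title in body:
                edges.append((i, j))
    return edges

# Made-up sample data, for illustration only.
titles = [
    "Adaptive capacity in coastal cities",
    "Vulnerability",
    "Measuring flood risk perception",
]
texts = [
    "Coastal vulnerability is rising. References: Lee (2015) Measuring flood risk\nperception.",
    "References: Kim (2012) Adaptive capacity in coastal cities. Journal Y.",
    "This study reports survey results only.",
]

edges = find_citations(texts, titles)
print(edges)  # [(0, 1), (0, 2), (1, 0)]
```

The pair (0, 1) is a false positive of the kind described above: the first text uses the word "vulnerability" in a sentence without citing the paper titled "Vulnerability".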

Getting Started

You will need:

  1. Papers: A set of texts in txt format. Batch converters from PDF and Word are available online. Store the texts in a folder that contains nothing else. The texts must be in the same order as the titles in the csv file. One approach is to name each file in the Papers folder by author last name, so that the files sort alphabetically, and then to list the titles in the same order. Texts may contain the full text of each study or, to reduce error rates, just its reference section.

  2. Titles: A csv file listing the titles of the texts in the first column, with a header row at the top. (Alternatively, edit the code to reference the appropriate column when reading in the title csv.) Web of Science provides titles in this column format when bibliographic information is downloaded.

  3. Network Visualizer: Results are formatted for upload as an edges table into the Gephi network visualization and analysis software, but any network visualization software or other R code could be used instead.
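
The three inputs and outputs listed above fit together roughly as follows. This is a hedged Python sketch rather than the repository's R code: the helper names `load_texts`, `load_titles`, and `write_edges` are hypothetical, and the sketch assumes UTF-8 files and that edges are labeled by paper title. The `Source` and `Target` column headers are the ones Gephi looks for when importing an edges table.

```python
import csv
from pathlib import Path

def load_texts(folder):
    """Read every file in the folder, sorted by file name (case-insensitive),
    so the order is predictable and can be matched to the title list."""
    paths = sorted(
        (p for p in Path(folder).iterdir() if p.is_file()),
        key=lambda p: p.name.lower(),
    )
    return [p.read_text(encoding="utf-8", errors="ignore") for p in paths]

def load_titles(csv_path, column=0):
    """Read titles from one column of a csv file, skipping the header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    return [row[column] for row in rows[1:] if row]

def write_edges(edges, titles, out_path):
    """Write (citing, cited) index pairs as an edges table with
    Source and Target columns."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Source", "Target"])
        for citing, cited in edges:
            writer.writerow([titles[citing], titles[cited]])

# Example usage with hypothetical paths:
#   texts = load_texts("Papers")
#   titles = load_titles("titles.csv")
#   write_edges([(0, 1)], titles, "edges.csv")
```

Changing the `column` argument plays the same role as editing the code to point at a different title column.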

To Test

In the "test files" folder are sample texts (reference lists from academic studies in txt format) and titles (a list of paper titles in a csv document with header) that can be used to test the citation network code.

  1. Download testtitles.csv to your working directory.
  2. Download all 20 test papers into a folder in your working directory.
  3. Compare your output with the provided results.

Common errors include (a) papers not being in alphabetical order and (b) reading in the wrong number of titles (there should be 20).
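
Both errors can be caught before running the search. The following Python sketch is one possible pre-flight check and is not part of the repository: `check_inputs` and its parameters are hypothetical names, and the alphabetical test only confirms that the file names are sorted, not that each file matches its title.

```python
def check_inputs(file_names, titles, expected_count=None):
    """Return a list of human-readable problems; an empty list means
    the basic checks passed."""
    problems = []
    if len(file_names) != len(titles):
        problems.append(
            f"{len(file_names)} texts but {len(titles)} titles; "
            "the counts must match"
        )
    if expected_count is not None and len(titles) != expected_count:
        problems.append(
            f"read {len(titles)} titles, expected {expected_count}"
        )
    if list(file_names) != sorted(file_names, key=str.lower):
        problems.append("text files are not in alphabetical order")
    return problems

# For the test files, expected_count would be 20.
print(check_inputs(["adger.txt", "brooks.txt"], ["Title A", "Title B"], expected_count=2))
```

A title count that is off by one often means the header row was read as a title or the first title was consumed as the header.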
