Name: Shawn M. Jones
Type: User
Company: Los Alamos National Laboratory, Old Dominion University
Bio: Research Assistant at Los Alamos National Laboratory and PhD Student at Old Dominion University, studying web science, web archiving, and more...
Twitter: shawnmjones
Location: Santa Fe, NM
Blog: https://www.shawnmjones.org
Shawn M. Jones's Projects
A Tool To Push Web Resources Into Web Archives
brozzler - distributed browser-based web crawler
This repository exists to share stories generated from the Dark and Stormy Archives project.
Shared repository for ODU CS 495 / 595 Fall 2013
ODU CS 795/895 Web Archiving Forensics, Fall 2020.
This project implements the visualization components of the Dark and Stormy Archives project.
This repository contains work done to determine how much of www.guideline.gov and qualitymeasures.ahrq.gov was archived.
CS 825 Project Showing the Geography of federal contracting in Hampton Roads
This repository contains work done on the IIPC Dark and Stormy Archives grant.
The source of the JCDL 2023 website.
A Memento Plugin for MediaWiki
This system evaluates a series of mementos (archived web pages) to determine which are off topic. The series can be part of an Archive-It collection, a single TimeMap, or stored in a WARC file.
A Memento Client Library in Python
Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages
Core Python Web Archiving Toolkit for replay and recording of web archives
Links on the web break all the time, robustify them!
Shawn's GitHub Web Site
https://www.ap.org/en
A threadsafe sqlite worker for Python
sumgram is a tool that summarizes a collection of text documents by generating the most frequent sumgrams (multiple ngrams)
Python class to parse and simplify access to Memento TimeMaps.
These are the generic login scripts I use.
Visual Hash for matching copies of visually similar images.
Experiments in testable, scalable crawler architectures
ODU WS-DL Thesis/Dissertation LaTeX Template