cleanzr's Projects

bdd

Duplicate detection in R using a Bayesian partitioning approach

blink

This is the main code for Steorts (2015); the package is also on CRAN. Please cite the paper and the code if you find this useful.

cd

CD dataset for Entity Resolution

clevr

Clustering and Link Prediction Evaluation in R

cora

Cora data set for Entity Resolution

dblink

Distributed Bayesian Entity Resolution in Apache Spark

dblinkr

An R interface for the dblink Spark application

exchanger

Bayesian Entity Resolution with Exchangeable Random Partition Priors

exchanger-experiments

Scripts for reproducing the experiments in our JSSAM article on Bayesian Graphical Entity Resolution

fasthash

Performs unique entity estimation, following Chen, Shrivastava, and Steorts (2018).

italy

Data from a sample survey conducted by the Bank of Italy every two years, containing duplicated records.

klsh

Blocking for record linkage

representr

Create representative records post-record linkage
