Git Product home page Git Product logo

haven's Introduction

Haven

Travis-CI Build Status

Haven allows you to load foreign data formats (SAS, Spss and Stata) in to R by wrapping the fantastic ReadStat C library written by Evan Miller. Haven offers similar functionality to the base foreign package but:

  • Can read SAS's proprietary binary format (SAS7BDAT). The one other package on CRAN that does that, sas7bdat, was created to document the reverse-engineering effort. Thus its implementation is designed for experimentation, rather than efficiency. Haven is significantly faster and should also support a wider range of SAS files, and works with SAS7BCAT files.

  • It can be faster. Some spss files seem to load about 4x faster, but others load slower. If you have a lot of SPSS files to import, you might want to benchmark both and pick the fastest.

  • Works with Stata 13 files (foreign only works up to Stata 12).

  • Can also write SPSS and Stata files (This is hard to test so if you run into any problems, please let me know).

  • Can only read the data from the most common statistical packages (SAS, Stata and SPSS).

  • You always get a data frame, date times are converted to corresponding R classes and labelled vectors are returned as new labelled class. You can easily coerce to factors or replace labelled values with missings as appropriate. If you also use dplyr, you'll notice that large data frames are printed in a convenient way.

  • Uses underscores instead of dots ;)

Haven is still a work in progress so please file an issue if it fails to correctly load a file that you're interested in.

Installation

# Install the released version from CRAN:
install.packages("haven")

# Install the cutting edge development version from GitHub:
# install.packages("devtools")
devtools::install_github("hadley/haven")

Usage

  • SAS: read_sas("path/to/file")
  • SPSS: read_por("path/to/file"), read_sav("path/to/file")
  • Stata: read_dta("path/to/file")

Updating readstat

If you're working on the development version of haven, and you'd like to update the embedded ReadStat library, you can run the following code. It is not necessary if you're just using the package.

tmp <- tempfile()
download.file("https://github.com/WizardMac/ReadStat/archive/master.zip", tmp, 
  method = "wget")
unzip(tmp, exdir = tempdir())

src <- dir(file.path(tempdir(), "ReadStat-master", "src"), "\\.[ch]$", full.name = TRUE)
file.copy(src, "src/", overwrite = TRUE)
unlink(c("src/readstat_rdata.c", "src/readstat_rdata.h"))

haven's People

Contributors

ajdamico avatar biostatmatt avatar bquast avatar edwindj avatar ehrlinger avatar gorcha avatar hadley avatar jeroen avatar jjallaire avatar kwenzig avatar tklebel avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.