
stringr.plus's Issues

Can't install your package

Thank you for your excellent work, but I ran into a problem installing the package.

R version 4.3.0 (2023-04-21 ucrt) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

remotes::install_github("johncassil/stringr.plus")
Downloading GitHub repo johncassil/stringr.plus@HEAD
── R CMD build ─────────────────────────────────────────────
✔ checking for file 'C:\Users\yongfengshi\AppData\Local\Temp\Rtmp8gU6Pt\remotesc879e13c09\johncassil-stringr.plus-111503d/DESCRIPTION' (742ms)
─ preparing 'stringr.plus':
✔ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
  Omitted 'LazyData' from DESCRIPTION
─ building 'stringr.plus_0.1.2.tar.gz'

Installing package into 'C:/Users/yongfengshi/AppData/Local/R/win-library/4.3'
(as 'lib' is unspecified)
'\Mac\Home\Documents'
CMD.EXE was started with the above path as the current directory.
UNC paths are not supported. Defaulting to Windows directory.
* installing source package 'stringr.plus' ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
ERROR: lazy loading failed for package 'stringr.plus'
* removing 'C:/Users/yongfengshi/AppData/Local/R/win-library/4.3/stringr.plus'
Warning message:
In i.p(...) :
  installation of package 'C:/Users/YONGFE~1/AppData/Local/Temp/Rtmp8gU6Pt/filec84c7c4650/stringr.plus_0.1.2.tar.gz' had non-zero exit status

 


New function idea: str_context

Hi there,

A while back I wrote a function to return a window of a given size around a pattern. This is helpful for understanding the context of a detected string (e.g. how is this string used in my data, especially for very long blocks of text) as well as detecting false positives.

For example:
str_context(string = "In a hole in the ground there lived a hobbit.", pattern = "ground", window_size = 6)
would return "...n the ground there..."

There would also be a parameter for how many matches to return.

Does this sound like an interesting/useful function for your package? If so I'd be happy to submit as a Hacktoberfest PR. Thanks!
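
To make the proposal concrete, here is a minimal sketch of what such a function might look like. It is not part of stringr.plus: the argument name n_matches is an assumption (the post only says there would be a parameter for the number of matches), it handles a single input string only, and it always adds the leading and trailing "..." even when the window reaches the edge of the string.

```r
# Hypothetical sketch of str_context(); not an existing stringr.plus function.
# `n_matches` is an assumed name for the "how many matches" parameter.
str_context <- function(string, pattern, window_size = 10, n_matches = 1) {
  stopifnot(length(string) == 1)

  # str_locate_all() returns a list with one start/end matrix per input string.
  locs <- stringr::str_locate_all(string, pattern)[[1]]
  if (nrow(locs) == 0) {
    return(character(0))
  }

  # Keep at most n_matches matches, in order of appearance.
  locs <- locs[seq_len(min(n_matches, nrow(locs))), , drop = FALSE]

  # Widen each match by window_size characters, clamped to the string bounds.
  starts <- pmax(locs[, "start"] - window_size, 1)
  ends <- pmin(locs[, "end"] + window_size, nchar(string))

  paste0("...", stringr::str_sub(string, starts, ends), "...")
}

str_context(
  string = "In a hole in the ground there lived a hobbit.",
  pattern = "ground",
  window_size = 6
)
#> [1] "...n the ground there..."
```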

Installation information in README

Hi,

Thanks for your great package! It has spared me a lot of googling about regex.

I had a little trouble installing the package at first, because I was using a slightly older version of devtools which expected the default branch to be called 'master' rather than 'main':

devtools::install_github("johncassil/stringr.plus")
Error: Failed to install 'unknown package' from GitHub:
  HTTP error 404.
  No commit found for the ref master

  Did you spell the repo owner (`johncassil`) and repo name (`stringr.plus`) correctly?
  - If spelling is correct, check that you have the required permissions to access the repo.

It's not a problem in the latest release of devtools, but for someone unknowingly using an older version, it's a very minor headache.

Would it be possible to update the installation information in the README to

devtools::install_github("johncassil/stringr.plus", ref = "main")

to prevent anyone else with a similarly old version from having the same issue?

Thanks!

Alternative implementation: lookahead/lookbehind

Have you considered leveraging the power of lookahead/lookbehind? It looks like it might be easier to maintain.

str_before <- function(string, pattern, n = NULL) {
  n_str <- ifelse(is.null(n), "*?", glue::glue("{<n>}", .open = "<", .close = ">"))
  new_pattern <- glue::glue(".{n_str}(?={pattern})")
  stringr::str_extract(string, new_pattern)
}
str_before("www.carfax.com/vehicle/3GCPKTE77DG348900", "/")
#> [1] "www.carfax.com"
str_before("www.carfax.com/vehicle/3GCPKTE77DG348900", ".com")
#> [1] "www.carfax"
str_before("www.carfax.com/vehicle/3GCPKTE77DG348900", ".com", 6)
#> [1] "carfax"

str_after <- function(string, pattern, n = NULL) {
  n_str <- ifelse(is.null(n), "*", glue::glue("{<n>}", .open = "<", .close = ">"))
  new_pattern <- glue::glue("(?<={pattern}).{n_str}")
  stringr::str_extract(string, new_pattern)
}
str_after("www.carfax.com/vehicle/3GCPKTE77DG348900", "/")
#> [1] "vehicle/3GCPKTE77DG348900"
str_after("www.carfax.com/vehicle/3GCPKTE77DG348900", "vehicle/")
#> [1] "3GCPKTE77DG348900"
str_after("www.carfax.com/vehicle/3GCPKTE77DG348900", "vehicle/", n = 6)
#> [1] "3GCPKT"

Created on 2020-08-14 by the reprex package (v0.3.0)

Also, have you considered submitting such a proposal to {stringr} itself? I see it's not something {stringr} is interested in (at least in 2018): tidyverse/stringr#222

I use this pattern all the time, so I'm interested in it.

Vectorization weirdness

I would like someone to check all functions in the package and make sure that they work when applied to vectors, per @abidawson

I did a quick test and was unsure why I got these results:

> test_vector <- c('www.carfax.com/vehicle/3GCPKTE77DG348900', 'www.carfax.com/vehicle/3GCPKTE77DG348900', 'www.carfax.com/vehicle/3GCPKTE77DG348900', 'www.carfax.com/vehicle/3GCPKTE77DG348900')
> stringr.plus::str_extract_before('www.carfax.com/vehicle/3GCPKTE77DG348900', "vehicle")
[1] "www.carfax.com/"
> stringr.plus::str_extract_after('www.carfax.com/vehicle/3GCPKTE77DG348900', "vehicle")
[1] "/3GCPKTE77DG348900"
> stringr.plus::str_extract_before(test_vector, "vehicle")
[1] "www.carfax.com/" "www.carfax.com/" "www.carfax.com/" "www.carfax.com/"
> stringr.plus::str_extract_after(test_vector, "vehicle")
[1] "ehicle/3GCPKTE77DG348900" "ehicle/3GCPKTE77DG348900" "ehicle/3GCPKTE77DG348900"
[4] "ehicle/3GCPKTE77DG348900"
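
One way to run this check systematically is to compare each function's result on a whole vector with the result of calling it on every element separately. The sketch below uses base R only. The helper is_vectorized() and the two stand-in functions are hypothetical and are not stringr.plus code, and the broken variant only illustrates one kind of mistake such a check catches. It is not a diagnosis of the output above.

```r
# Hypothetical helper: TRUE if f(x, ...) equals calling f on each element of x.
is_vectorized <- function(f, x, ...) {
  whole <- f(x, ...)
  piecewise <- vapply(x, function(el) f(el, ...), character(1), USE.NAMES = FALSE)
  identical(as.character(whole), piecewise)
}

# Inputs deliberately differ, so position-dependent bugs become visible.
urls <- c("www.carfax.com/vehicle/3GCPKTE77DG348900", "example.org/vehicle/123")

# Stand-in: everything after the first match, computed per element.
after_first <- function(string, pattern) {
  m <- regexpr(pattern, string)
  start <- as.integer(m) + attr(m, "match.length")
  substring(string, start)
}

# Deliberately broken stand-in: reuses the first element's match position.
after_first_broken <- function(string, pattern) {
  m <- regexpr(pattern, string[1])
  start <- as.integer(m) + attr(m, "match.length")
  substring(string, start)
}

is_vectorized(after_first, urls, "vehicle")
#> [1] TRUE
is_vectorized(after_first_broken, urls, "vehicle")
#> [1] FALSE
```

Note that a test vector of four identical strings, as above, would not expose a bug like the one in after_first_broken, so it helps to use inputs of different lengths.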

Add which = "first"/"last" arg to before/after functions

The way these functions are set up might lead to some confusing behaviour:

library(stringr.plus)
url <- 'www.carfax.com/vehicle/3GCPKTE77DG348900'

str_extract_before(string = url, pattern = '/')
#> [1] "www.carfax.com"

What if we wanted everything before the last "/"?
Likewise:

str_extract_after(string = url, pattern = '/')
#> [1] "vehicle/3GCPKTE77DG348900"

What if we wanted everything after the last "/"?

By default, these functions use the first location of the pattern. We could add a "which" argument that accepts "first" or "last" (default "first") for finer-grained selection. It could also take a number and extract the text before/after the nth occurrence of the pattern (for cases when you know there are seven slashes/underscores and you want the data after the 5th).
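
A rough sketch of how a "which" argument might behave is given below. The function names pick_match(), before_which() and after_which() are hypothetical and chosen to avoid clashing with the package's own functions. The sketch handles one string at a time and returns NA when the requested occurrence does not exist, which is an assumption rather than agreed behaviour.

```r
# Hypothetical helper: location of the requested occurrence, or NULL if absent.
# `which` may be "first", "last", or a positive integer n (the nth occurrence).
pick_match <- function(string, pattern, which = "first") {
  locs <- stringr::str_locate_all(string, pattern)[[1]]
  n <- nrow(locs)
  idx <- if (identical(which, "first")) {
    1L
  } else if (identical(which, "last")) {
    n
  } else {
    as.integer(which)
  }
  if (n == 0 || is.na(idx) || idx < 1 || idx > n) {
    return(NULL)
  }
  locs[idx, ]  # named vector with elements "start" and "end"
}

# Text before the chosen occurrence of the pattern.
before_which <- function(string, pattern, which = "first") {
  loc <- pick_match(string, pattern, which)
  if (is.null(loc)) return(NA_character_)
  stringr::str_sub(string, 1, loc[["start"]] - 1)
}

# Text after the chosen occurrence of the pattern.
after_which <- function(string, pattern, which = "first") {
  loc <- pick_match(string, pattern, which)
  if (is.null(loc)) return(NA_character_)
  stringr::str_sub(string, loc[["end"]] + 1)
}

url <- "www.carfax.com/vehicle/3GCPKTE77DG348900"
before_which(url, "/", which = "last")
#> [1] "www.carfax.com/vehicle"
after_which(url, "/", which = "last")
#> [1] "3GCPKTE77DG348900"
after_which(url, "/", which = 1)
#> [1] "vehicle/3GCPKTE77DG348900"
```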
