Comments (13)

gulmira19 commented on July 19, 2024

@kbenoit,
Thank you for your quick response. I checked .libPaths() in both RStudio and the R console, and the two are now consistent: in both cases, packages are installed in my versioned library. Although the issue still persists in the R console, I am happy that quanteda works seamlessly in RStudio. With more technical expertise I could have investigated further, but I am closing this thread. Your support is greatly appreciated, and I am grateful for the prompt assistance! Wishing you the best in your endeavors.

from quanteda.

koheiw commented on July 19, 2024

In the upcoming version, RcppParallel::defaultNumThreads(), which is causing the error, is no longer used. Can you install from GitHub and test?

gulmira19 commented on July 19, 2024

Hello, koheiw!
Thank you for your suggestions! I installed "RcppCore/RcppParallel" from GitHub. After that, I was able to install and load quanteda successfully. However, when I tried to tokenize, I hit the same seg fault:

> library(RcppParallel)

library(quanteda)
Package version: 3.3.1
Unicode version: 14.0
ICU version: 71.1
Parallel computing: 8 of 8 threads used.
See https://quanteda.io for tutorials and examples.

texts <- c("I love programming in R.", "Text analysis is interesting.", "R is a powerful language.")
corpus <- corpus(texts)
tokens <- tokens(corpus)

*** caught segfault ***
address 0x9ffffffe7, cause 'invalid permissions'

Traceback:
1: qatd_cpp_tokens_select(x, type, ids, 2, padding, window[1], window[2], startpos, endpos)
2: tokens_select.tokens(x, ..., selection = "remove")
3: tokens_select(x, ..., selection = "remove")
4: tokens_remove(x, removals[["separators"]], valuetype = "regex", verbose = FALSE)
5: tokens.tokens(result, remove_punct = remove_punct, remove_symbols = remove_symbols, remove_numbers = remove_numbers, remove_url = remove_url, remove_separators = remove_separators, split_hyphens = FALSE, split_tags = FALSE, include_docvars = TRUE, padding = padding, verbose = verbose)
6: tokens.corpus(corpus)
7: tokens(corpus)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:

I then tried to reduce the number of threads to 1. However, the reported number stayed at 8, and the seg fault persisted:

> library(RcppParallel)

num_threads <- RcppParallel::defaultNumThreads()
cat("Default number of threads:", num_threads, "\n")
Default number of threads: 8
RcppParallel::setThreadOptions(numThreads = 1)
actual_num_threads <- RcppParallel::defaultNumThreads()
num_threads_after_setting = RcppParallel::defaultNumThreads()
cat("Number of threads after setting:", num_threads_after_setting, "\n")
Number of threads after setting: 8

Do you have any further ideas to resolve this issue? Thank you.
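
A note on the thread check above, as a hedged sketch rather than a diagnosis. It assumes that RcppParallel::setThreadOptions() records the requested count in the RCPP_PARALLEL_NUM_THREADS environment variable, while defaultNumThreads() keeps reporting the hardware default; under that assumption, seeing 8 after the call does not by itself show that the setting was ignored. The sketch also assumes quanteda_options(threads = ...) is the quanteda-level setting. Neither step is claimed to fix the seg fault.

```r
# Request a single thread. If RcppParallel is not installed, set the
# environment variable directly (assumption: this is what setThreadOptions()
# does internally).
if (requireNamespace("RcppParallel", quietly = TRUE)) {
  RcppParallel::setThreadOptions(numThreads = 1)
} else {
  Sys.setenv(RCPP_PARALLEL_NUM_THREADS = "1")
}

# Inspect the request itself instead of defaultNumThreads().
requested <- Sys.getenv("RCPP_PARALLEL_NUM_THREADS")
cat("Requested threads:", requested, "\n")

# quanteda keeps its own thread option; only touch it if quanteda is installed.
if (requireNamespace("quanteda", quietly = TRUE)) {
  quanteda::quanteda_options(threads = 1)
  cat("quanteda threads:", quanteda::quanteda_options("threads"), "\n")
}
```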

koheiw commented on July 19, 2024

Your quanteda is still v3.x; please install it from GitHub too.

# remotes package required to install quanteda from GitHub
remotes::install_github("quanteda/quanteda") 

gulmira19 commented on July 19, 2024

koheiw, thank you so much for your prompt response and valuable suggestion. I tried to install quanteda from GitHub, but the installation failed with a warning about a non-zero exit status:

Downloading GitHub repo quanteda/quanteda@HEAD
── R CMD build
[...]x/l8408syn1x95x6wvrp35d9g00000gn/T/RtmpM7FzQB/remotesef067af610b/quanteda-quanteda-cb80e23/DESCRIPTION’ ...
building ‘quanteda_4.0.0.tar.gz’

[...]

Warning message:
In i.p(...) :
installation of package ‘/var/folders/bx/l8408syn1x95x6wvrp35d9g00000gn/T//RtmpWKR3pI/filedbb6b127bb6/quanteda_4.0.0.tar.gz’ had non-zero exit status

koheiw commented on July 19, 2024

Do you have all the tools needed to compile the code installed on your machine?

https://cran.r-project.org/bin/macosx/
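
One quick way to see whether a compiler toolchain is visible to R is to look the tools up on the PATH from within the session. This is only a rough sketch: the tool names below are an assumption chosen for illustration, not the official list from the page above, and a tool being found does not guarantee it is the version R expects.

```r
# Look up some common build tools on the PATH as seen by this R session.
# Sys.which() typically returns "" for a tool it cannot find.
# The tool list is illustrative, not an official requirement list.
tools <- c("make", "clang", "clang++", "gfortran")
found <- Sys.which(tools)

status <- data.frame(
  tool = tools,
  path = unname(found),
  available = unname(nzchar(found)),
  stringsAsFactors = FALSE
)
print(status)
```

Running this in both the R console and RStudio can also show whether the two see a different PATH.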

gulmira19 commented on July 19, 2024

Hello again, koheiw! I'm sorry it took me longer to follow your suggestion and respond. I appreciate your assistance!

I've installed the R 4.3.2 binary for macOS 11 and XQuartz. When I tried to install the "Big Sur" binaries for arm64-based Macs, I could not find quanteda in the "contrib" package list. I hope that is not the cause of the problem.

After that, I tried again to install quanteda from GitHub and got the same warning message as above.

Are there any specific binaries or tools from your link above that I need to install? Thanks a lot.

koheiw commented on July 19, 2024

@kbenoit do you have any idea?

kbenoit commented on July 19, 2024
  1. Can you paste the first three lines of output from starting up R? e.g.
R version 4.3.2 (2023-10-31) -- "Eye Holes"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20 (64-bit)

  2. What is the output of .libPaths()? e.g.
> .libPaths()
[1] "/Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library"

Sometimes conflicting installations can exist if you also have a local library path defined.
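
The details requested above can be gathered in one go. The following is a minimal sketch using base R only; the helper name mismatched_versions() is made up for illustration, and its pattern assumes the macOS framework layout (".../Versions/x.y/...") shown in the example path.

```r
# Collect the requested details in one list so they can be pasted into a reply.
diag <- list(
  r_version   = R.version.string,
  platform    = R.version$platform,
  lib_paths   = .libPaths(),
  r_libs      = Sys.getenv("R_LIBS", unset = NA),
  r_libs_user = Sys.getenv("R_LIBS_USER", unset = NA)
)
str(diag)

# Hypothetical helper: return the "x.y" version components found in library
# paths that differ from the running R version (macOS framework layout only).
mismatched_versions <- function(paths, running) {
  hits <- regmatches(paths, regexpr("Versions/[0-9]+\\.[0-9]+", paths))
  found <- sub("Versions/", "", hits, fixed = TRUE)
  found[found != running]
}

# Major.minor of the running R, e.g. "4.3" for R 4.3.2.
running <- paste(R.version$major, sub("\\..*$", "", R.version$minor), sep = ".")
print(mismatched_versions(diag$lib_paths, running))
```

An empty result means no versioned path disagrees with the running R; any value printed is a version directory worth a closer look.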

As for there not being an arm64 binary, that's just not the case. There is definitely that binary built and on CRAN.

gulmira19 commented on July 19, 2024

Hello, @kbenoit! Thank you for your assistance.

  1. R version 4.3.2 (2023-10-31) -- "Eye Holes"
    Copyright (C) 2023 The R Foundation for Statistical Computing
    Platform: aarch64-apple-darwin20 (64-bit)

  2. [1] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library"
    Can it cause a problem if I have an older library path and a newer version of R?

kbenoit commented on July 19, 2024

Yes, I suspect that is the problem. Check your .R, .Rprofile, and similar files to see whether that library path is hardwired. If you remove the hard path reference, packages will install to your versioned library automatically.
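
A rough way to search for such a hardwired path is to scan the usual startup files for lines that mention libPaths or R_LIBS. This is a sketch only: find_libpath_settings() is a made-up helper, and the candidate file list is an assumption about where settings commonly live, so other files may be involved on a given machine.

```r
# Hypothetical helper: report lines in the given files that mention
# .libPaths() or an R_LIBS* variable. Returns NULL when nothing is found.
find_libpath_settings <- function(files) {
  files <- files[file.exists(files) & !dir.exists(files)]
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep("libPaths|R_LIBS", lines)
    if (length(idx) == 0L) return(NULL)
    data.frame(file = f, line = idx, text = lines[idx], stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)
}

# Candidate startup files (an assumed, non-exhaustive list).
candidates <- c(
  path.expand("~/.Rprofile"),
  path.expand("~/.Renviron"),
  file.path(getwd(), ".Rprofile"),
  file.path(R.home("etc"), "Rprofile.site"),
  file.path(R.home("etc"), "Renviron.site")
)
print(find_libpath_settings(candidates))
```

Any row that points at an old versioned library is a candidate for removal or editing.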

gulmira19 commented on July 19, 2024

Hello, @kbenoit and @koheiw

Thank you so much for your guidance on this issue.

I removed the hard path reference and updated R to the latest version. I also downloaded RStudio (previously, I used only the R console). Luckily, quanteda has worked without any issue in RStudio, and I am very happy about it!

install.packages("quanteda")
trying URL 'https://cran.rstudio.com/bin/macosx/big-sur-arm64/contrib/4.3/quanteda_3.3.1.tgz'
Content type 'application/x-gzip' length 4232594 bytes (4.0 MB)
==================================================
downloaded 4.0 MB

The downloaded binary packages are in
/var/folders/bx/l8408syn1x95x6wvrp35d9g00000gn/T//RtmpZGYJkn/downloaded_packages

library(quanteda)
Package version: 3.3.1
Unicode version: 14.0
ICU version: 71.1
Parallel computing: 8 of 8 threads used.
See https://quanteda.io for tutorials and examples.
txt <- c(doc1 = "A sentence, showing how tokens() works.",
         doc2 = "@quantedainit and #textanalysis https://example.com?p=123.",
         doc3 = "Self-documenting code??",
         doc4 = "£1,000,000 for 50¢ is gr8 4ever \U0001f600")

tokens(txt)
Tokens consisting of 4 documents.
doc1 :
[1] "A" "sentence" "," "showing" "how" "tokens" "("
[8] ")" "works" "."

doc2 :
[1] "@quantedainit" "and"
[3] "#textanalysis" "https://example.com?p=123."

doc3 :
[1] "Self-documenting" "code" "?" "?"

doc4 :
[1] "£" "1,000,000" "for" "50" "¢" "is"
[7] "gr8" "4ever" "😀"

However, when I ran the same code in the R console, it produced the same seg fault I encountered before:

library(quanteda)
Package version: 3.3.1
Unicode version: 14.0
ICU version: 71.1
Parallel computing: 8 of 8 threads used.
See https://quanteda.io for tutorials and examples.
txt <- c(doc1 = "A sentence, showing how tokens() works.",
         doc2 = "@quantedainit and #textanalysis https://example.com?p=123.",
         doc3 = "Self-documenting code??",
         doc4 = "£1,000,000 for 50¢ is gr8 4ever \U0001f600")

tokens(txt)

*** caught segfault ***
address 0x9ffffffe7, cause 'invalid permissions'

Traceback:
1: qatd_cpp_tokens_select(x, type, ids, 2, padding, window[1], window[2], startpos, endpos)
2: tokens_select.tokens(x, ..., selection = "remove")
3: tokens_select(x, ..., selection = "remove")
4: tokens_remove(x, removals[["separators"]], valuetype = "regex", verbose = FALSE)
5: tokens.tokens(result, remove_punct = remove_punct, remove_symbols = remove_symbols, remove_numbers = remove_numbers, remove_url = remove_url, remove_separators = remove_separators, split_hyphens = FALSE, split_tags = FALSE, include_docvars = TRUE, padding = padding, verbose = verbose)
6: tokens.corpus(corpus(x), what = what, remove_punct = remove_punct, remove_symbols = remove_symbols, remove_numbers = remove_numbers, remove_url = remove_url, remove_separators = remove_separators, split_hyphens = split_hyphens, split_tags = split_tags, include_docvars = include_docvars, padding = padding, verbose = verbose, ...)
7: tokens.character(txt)
8: tokens(txt)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:

Still, I am glad that it works well in RStudio. Any thoughts on why it works in RStudio but not in the R console?
Thanks.

kbenoit commented on July 19, 2024

For whatever reason, the R console and RStudio are reading different environment variables that determine where your packages are installed. You can compare the two by running .libPaths() in each (console vs. RStudio). Check your various .R files to make sure there is not a different one somewhere that affects the R console but not RStudio.
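
The comparison can be made less error-prone by saving a snapshot in one front end and diffing it in the other. The sketch below uses base R only; snapshot_session(), compare_snapshots(), and the file name in the usage comments are all made up for illustration.

```r
# Hypothetical helper: capture the settings that decide which package copy loads.
snapshot_session <- function() {
  list(
    r_version = R.version.string,
    lib_paths = .libPaths(),
    # find.package() errors when the package is missing, so fall back to NA.
    quanteda_dir = tryCatch(find.package("quanteda"),
                            error = function(e) NA_character_)
  )
}

# Hypothetical helper: report where two snapshots differ.
compare_snapshots <- function(a, b) {
  list(
    same_r       = identical(a$r_version, b$r_version),
    same_package = identical(a$quanteda_dir, b$quanteda_dir),
    only_in_a    = setdiff(a$lib_paths, b$lib_paths),
    only_in_b    = setdiff(b$lib_paths, a$lib_paths)
  )
}

# Usage (file name is an arbitrary example):
#   In the R console:  saveRDS(snapshot_session(), "~/console_snapshot.rds")
#   In RStudio:        compare_snapshots(readRDS("~/console_snapshot.rds"),
#                                        snapshot_session())
```

If same_package is FALSE, the two front ends are loading quanteda from different directories, which would fit the explanation above.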
