getlattesdata's Introduction

About me

I'm an associate professor of Finance at EA/UFRGS.

You can find more details about my work at my personal site.

Feel free to reach me at [email protected].

getlattesdata's People

Contributors

hadley, msperlin, regisely

getlattesdata's Issues

Error reading the Lattes curriculum

When I try to reproduce the example from the GitHub page, 4 files are downloaded, one for each of the 4 researchers, but each .zip file is only 92 bytes in size, and the following error message then appears:

XML content does not seem to be XML: 'C:\Users\elpid\AppData\Local\Temp\Rtmp4mUPyN/curriculo.xml'
In addition: Warning message:
In utils::unzip(zip.in, exdir = my.tempdir) :
error 1 in extracting from zip file
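
A 92-byte download is far too small to be a real curriculum archive, so the zip most likely contains an error page or nothing at all. The sketch below shows one way to check a downloaded file before trying to parse it. The helper name `is_valid_zip` and the `min_bytes` threshold are assumptions made for illustration; they are not part of GetLattesData.

```r
# Hypothetical helper (not part of GetLattesData): returns TRUE only if
# `zip_path` exists, is larger than `min_bytes`, and can be listed as a zip.
# The 200-byte default is an arbitrary assumption for illustration.
is_valid_zip <- function(zip_path, min_bytes = 200) {
  if (!file.exists(zip_path)) return(FALSE)
  if (file.size(zip_path) < min_bytes) return(FALSE)

  # utils::unzip(list = TRUE) only lists the archive contents;
  # treat any error or warning as "not a usable archive".
  listing <- tryCatch(
    utils::unzip(zip_path, list = TRUE),
    error = function(e) NULL,
    warning = function(w) NULL
  )
  !is.null(listing) && nrow(listing) > 0
}
```

Running such a check on each downloaded file would let a script report which IDs failed to download instead of stopping at the XML parsing step.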

Error in zip file reading

The function gld_get_lattes_data_from_zip worked well, but for some CVs the error below appears and I could not identify the cause:
Found 11 published papers
Found 0 accepted paper(s)
Found 25 supervisions
Found 0 published books
Found 1 book chapters
Found 55 conference papers
Error in `$<-.data.frame`(`*tmp*`, "SJR", value = NA) :
replacement has 1 row, data has 0
In addition: Warning message:
In is.na(idx) : is.na() applied to non-(list or vector) of type 'NULL'
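
The message "replacement has 1 row, data has 0" is what R reports when a single value is assigned to a column of a data frame that has no rows, and the warning about `NULL` suggests that the lookup index was built from a column that does not exist. The sketch below shows a defensive way to attach an SJR column that also works for empty tables. The function `add_sjr` and the `SJR` column of the lookup table are assumptions for illustration; whether this matches the package internals is not confirmed by the report.

```r
# Hypothetical helper: attach an SJR column to `papers` using a lookup table.
# Assumes `sjr_table` has columns `Issn` and `SJR` (the latter is an assumption).
add_sjr <- function(papers, sjr_table) {
  # No ISSN column (e.g. an empty result): fill with NA of the right length.
  if (is.null(papers$ISSN)) {
    papers$SJR <- rep(NA_real_, nrow(papers))
    return(papers)
  }
  # match() returns one index (or NA) per paper, so the lengths always agree,
  # including the zero-row case.
  idx <- match(papers$ISSN, sjr_table$Issn)
  papers$SJR <- sjr_table$SJR[idx]
  papers
}
```

The key point is that the assigned value always has exactly `nrow(papers)` elements, so a CV with zero entries in a category no longer triggers the error.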

Supervisions

For supervisions, could you add the type of supervision (main supervisor or co-supervisor)? Since this field is missing, I cannot filter for main supervisions only, which is an important point when I run the analyses for our program. Thank you.

Error downloading the curriculum

Hi all,

I would like help resolving the error I get when trying to download the curricula by passing a list of ids.

Erro: XML content does not seem to be XML: '/tmp/Rtmp1TJ66h/curriculo.xml'
Além disso: Warning message:
In utils::unzip(zip.in, exdir = my.tempdir) :
erro 1 na extração a partir de arquivo zip

Thank you

Multiple ISSN

Hi Marcelo,

I'm having problems with journals that have multiple ISSNs.
I believe two changes would solve the problem (see below). I hope this helps.
Regards

  1. In the function "gld_get_SJR", remove lines 82 to 84:
    df.sjr$Issn <- paste0(stringr::str_sub(df.sjr$Issn, 1, 4),
                          '-',
                          stringr::str_sub(df.sjr$Issn, 5, 8))

  2. In the function "gld_get_lattes_data",
    lines 84 and 94:
    replace "idx <- match(tpublic.published$ISSN, df.sjr$Issn)" with:
    idx <- unlist(sapply(gsub("-", "", tpublic.published$ISSN),
                         function(i, x){
                           r <- grep(i, x)
                           if(length(r) == 0){
                             r <- NA
                           }
                           return(r)
                         }, df.sjr$Issn, USE.NAMES = FALSE))

    Lines 87 and 98:
    replace "idx <- match(tpublic.accepted$ISSN, df.sjr$Issn)" with:
    idx <- unlist(sapply(gsub("-", "", tpublic.accepted$ISSN),
                         function(i, x){
                           r <- grep(i, x)
                           if(length(r) == 0){
                             r <- NA
                           }
                           return(r)
                         }, df.sjr$Issn, USE.NAMES = FALSE))
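
One caveat with the proposal above: `grep` can return more than one position for a single ISSN, so `unlist` may yield an index vector longer than the number of papers. The sketch below is a variant that always returns exactly one index per paper. The helper name `match_issn` is hypothetical, and it assumes that each entry of `df.sjr$Issn` is a string that may hold several unhyphenated ISSNs.

```r
# Hypothetical helper: for each ISSN, return the position of the first entry
# of `sjr_issn` that contains it, or NA if none does. Assumes entries of
# `sjr_issn` may hold several unhyphenated ISSNs in one string.
match_issn <- function(issn, sjr_issn) {
  vapply(gsub("-", "", issn), function(i) {
    # Missing or empty ISSNs can never match.
    if (is.na(i) || !nzchar(i)) return(NA_integer_)
    hits <- grep(i, sjr_issn, fixed = TRUE)
    if (length(hits) == 0) NA_integer_ else hits[1]
  }, integer(1), USE.NAMES = FALSE)
}

# Example with made-up ISSNs:
# match_issn(c("0022-0082", "9999-9999"), c("15480992, 00220082", "12345678"))
```

Taking only the first hit is a design choice for this sketch; journals listed more than once in the lookup table would need a different rule.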

Some Lattes IDs have fewer than 10 characters

Hi, thanks for this library. While testing it, I found that some of the Lattes IDs I analysed don't have 10 characters, and the library doesn't work in that case. A fix for this would be a good idea.

Again, thanks for the library. Great job!

Reading Lattes files obtained from the Lattes Platform extraction service

Curricula obtained from the Lattes Platform extraction service for partner institutions have a different structure from curricula downloaded manually via the web. An example can be seen in the attached file:
curriculo.zip

Would it be possible to make GetLattesData flexible enough to cover this case?

I also take this opportunity to congratulate you on the package! :)

Problems reading data based on Qualis and from Lattes

Thanks for your package. It really helps in generating new graduate program proposals, and reports for Capes.

I had two problems when using your package:

  1. Apparently, for professors who skipped a Master's degree and went straight to a PhD, the data were not read from the zip file.
    Error in names(MESTRADO) <- paste0("MSC-", names(MESTRADO)) : attempt to set an attribute on NULL

  2. Problems matching against the Qualis dataset when, for example, the journal name is not exactly the same as in the Qualis table. I ran into this a few times with the "(PRINT)" suffix, which is not always present in Lattes. Example:

AGROECOLOGY AND SUSTAINABLE FOOD SYSTEMS (PRINT) in Qualis Table
AGROECOLOGY AND SUSTAINABLE FOOD SYSTEMS in Lattes CVs

I will make a pull request.
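
One possible way to handle the second problem is to normalize journal names on both sides before matching. The sketch below is only an illustration: the function `normalize_journal` is hypothetical, and treating "(ONLINE)" like "(PRINT)" is an assumption not taken from the report.

```r
# Hypothetical helper: normalize a journal title for matching.
# Upper-cases the name, drops a trailing "(PRINT)" or "(ONLINE)" marker
# (the latter is an assumption) and collapses repeated whitespace.
normalize_journal <- function(x) {
  x <- toupper(x)
  x <- gsub("\\s*\\((PRINT|ONLINE)\\)\\s*$", "", x)
  trimws(gsub("\\s+", " ", x))
}

qualis_name <- "AGROECOLOGY AND SUSTAINABLE FOOD SYSTEMS (PRINT)"
lattes_name <- "AGROECOLOGY AND SUSTAINABLE FOOD SYSTEMS"

# Both names map to the same key, so match() can pair them.
match(normalize_journal(lattes_name), normalize_journal(qualis_name))
```

Matching on ISSN where available would be more robust than any name-based rule; this normalization only helps when the names differ by such a suffix.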

Thanks once again

The function gets stuck at the captcha

Hi Perlin,
I tried to run your tutorial, but the function "gld_download_lattes_files" cannot download the files; in "readLines" it opens the captcha page instead. The package description says there is no need to fill in the captcha manually, so how do you get around it?

Regards,

Lattes site changed and so...

Hello again,
Just wondering whether this is a problem caused by the newly designed Lattes site. As an example, here are the IDs from my own department at UFPel and the error message.

By the way, thank you for the new separation between accepted and published functions.

my.ids <- c('K4799675Y3','K4790117H4','K4425685D9','K4248240A9', 'K4719590J1', 'K4231130J6', 'K4790338E3','K4776653U5','K4162067D9', 'K4792802A6')
field.qualis='ECONOMIA'
l.out <- gld_get_lattes_data(id.vec = my.ids, field.qualis = field.qualis)

Downloading file C:\Users\CDSHI_~1\AppData\Local\Temp\RtmpgV4i5j/K4799675Y3_2017-11-16.zip
Error in utils::download.file(url = my.link, destfile = dest.file, quiet = T, :
cannot open URL 'http://buscacv.cnpq.br/buscacv/rest/download/curriculo/K4799675Y3'
In addition: Warning message:
In utils::download.file(url = my.link, destfile = dest.file, quiet = T, :
unable to resolve 'buscacv.cnpq.br'

Example of usage is not working

I tried to use this package after CNPq removed the captcha last week, but the package's usage example is not working for me. I ran

library(GetLattesData)

# ids from EA-UFRGS
my.ids <- c('K4713546D3', 'K4440252H7', 
            'K4783858A0', 'K4723925J2')

# qualis for the field of management
field.qualis = 'ADMINISTRAÇÃO PÚBLICA E DE EMPRESAS, CIÊNCIAS CONTÁBEIS E TURISMO'

l.out <- gld_get_lattes_data(id.vec = my.ids, field.qualis = field.qualis)

and the output I get is

Downloading file  /var/folders/gc/8hj8csfj73j4l4w93r1f0_vw0000gn/T//RtmpXTuPoH/K4713546D3_2018-08-16.zip
Downloading file  /var/folders/gc/8hj8csfj73j4l4w93r1f0_vw0000gn/T//RtmpXTuPoH/K4440252H7_2018-08-16.zip
Downloading file  /var/folders/gc/8hj8csfj73j4l4w93r1f0_vw0000gn/T//RtmpXTuPoH/K4783858A0_2018-08-16.zip
Downloading file  /var/folders/gc/8hj8csfj73j4l4w93r1f0_vw0000gn/T//RtmpXTuPoH/K4723925J2_2018-08-16.zip
Reading  K4713546D3_2018-08-16.zip
Error: XML content does not seem to be XML: '/var/folders/gc/8hj8csfj73j4l4w93r1f0_vw0000gn/T//RtmpXTuPoH/curriculo.xml'
In addition: Warning message:
In utils::unzip(zip.in, exdir = my.tempdir) :
  error 1 in extracting from zip file

For what it's worth, this is my sessionInfo() output:

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] GetLattesData_1.0

loaded via a namespace (and not attached):
[1] compiler_3.5.1 magrittr_1.5   tools_3.5.1    curl_3.2      
[5] stringi_1.2.4  stringr_1.3.1  XML_3.98-1.15

I am using the development version of the package.

Still getting errors?

Hi Marcelo,
it seems that Lattes is back again without captcha, but I'm still getting an error when running the example script.

After downloading the data from the IDs, I got this:

Error: XML content does not seem to be XML: '/var/folders/kd/_k75gv855qxbyrqq0rtqk1w40000gr/T//Rtmpy6VvBp/curriculo.xml'
In addition: Warning message:
In utils::unzip(zip.in, exdir = my.tempdir) :
  error 1 in extracting from zip file

Is this the same error as back when the captcha was in place, or am I doing something wrong?

add NOME-EM-CITACOES to output

  • in the xml:

NOME-COMPLETO="Ana Claudia Bonatto" NOME-EM-CITACOES-BIBLIOGRAFICAS="BONATTO, A. C.;Bonatto, Ana C;Bonatto, Ana Claudia;Bonatto, Ana C.

  • for all publications data, add the co-authors' abbreviations for cross-matching

Update Qualis and JCR

Is it possible to update the Qualis and JCR tables to the latest version? Or is it possible to update them locally, without downloading the original code and changing the lines where the QUALIS and JCR tables are loaded?
