
# eksiR: An R package to scrape data from eksisozluk.com

## Installation

```{r}
# force = TRUE reinstalls even if the latest version is already installed
devtools::install_github("sefabey/eksiR", force = TRUE)
library(eksiR)
```

## Example usage

The goal of this package is to make scraping data from eksisozluk.com easier in R. 

You can scrape entries by providing entry IDs:

```{r}
eksiR::eksi_scrape_entry(1789751)
```

You can also export scraped entries into a .csv file:
```{r}
eksiR::eksi_scrape_entry(1789751, export_csv = TRUE)
```

To scrape multiple entries and combine them into a single tibble:

```{r}
entry_ids <- 1:20
purrr::map_df(entry_ids, eksi_scrape_entry)
```
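When looping over many IDs, an error in any single call stops `purrr::map_df()` and discards the rows collected so far. The following is a minimal sketch of one way to pause between requests and skip failed IDs. The helper `scrape_politely()` and its `delay` argument are hypothetical and not part of eksiR; the sketch also assumes every successful call returns a data frame with the same columns.

```{r}
# Hypothetical helper, not part of eksiR: calls `scrape_fn` once per ID,
# pauses before each request, and skips IDs whose call raises an error.
scrape_politely <- function(ids, scrape_fn, delay = 1) {
  results <- lapply(ids, function(id) {
    Sys.sleep(delay)                                   # wait before each request
    tryCatch(scrape_fn(id), error = function(e) NULL)  # NULL marks a failed ID
  })
  results <- Filter(Negate(is.null), results)          # drop the failed IDs
  do.call(rbind, results)                              # stack the per-entry rows
}

# Intended usage (needs network access and eksiR installed):
# entries <- scrape_politely(1:20, eksiR::eksi_scrape_entry, delay = 1)
```

Passing the scraper in as an argument keeps the helper independent of eksiR, so it can be tried out with a stand-in function first.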

By default, removed/deleted entries are retained. To drop them, use:

```{r}
library(magrittr)  # provides the %>% pipe

entry_ids <- 1:20
purrr::map_df(entry_ids, eksi_scrape_entry) %>%
  na.omit()
```
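Note that `na.omit()` drops a row when any of its columns is `NA`, not only rows for deleted entries. The toy example below shows the difference between that and filtering on a single column. The data frame and its column names (`id`, `text`, `edited`) are made up for illustration; the columns eksiR actually returns may differ.

```{r}
# Toy data with hypothetical column names; eksiR's real output may differ.
entries <- data.frame(
  id = 1:4,
  text = c("first entry", NA, "third entry", "fourth entry"),
  edited = c(NA, NA, "2019-01-02", "2019-01-03"),
  stringsAsFactors = FALSE
)

# na.omit() drops every row that contains an NA in any column.
nrow(na.omit(entries))  # 2

# Filtering on one column keeps rows that are only missing `edited`.
kept <- entries[!is.na(entries$text), ]
nrow(kept)  # 3
```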

You can also scrape topics:
```{r}
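# Scrape all entries under a topic
eksi_scrape_topic("https://eksisozluk.com/tayyip-erdogan-fatih-sarac-ses-kaydi--4224837")

# Scrape the same topic and also export the result to a .csv file
eksi_scrape_topic("https://eksisozluk.com/tayyip-erdogan-fatih-sarac-ses-kaydi--4224837", export_csv = TRUE)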
```

You can also use a utility function to parse eksisozluk-specific date/time values. `eksi_date_time_parse()` adds two extra columns to the tibble: `original_date_time` and `edited_date_time`.
```{r}
eksi_scrape_topic("https://eksisozluk.com/tayyip-erdogan-fatih-sarac-ses-kaydi--4224837") %>% 
  eksi_date_time_parse()
```
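To illustrate what the two columns could contain, here is a minimal sketch that splits a raw timestamp into a creation time and an optional edit time. It assumes timestamps look like `15.02.1999 14:32 ~ 17:45` (day.month.year, with an optional edit stamp after `~`). That format, the helper name `parse_eksi_timestamp()`, and the UTC default are assumptions made for illustration; this is not the code behind `eksi_date_time_parse()`.

```{r}
# Hypothetical helper, NOT eksiR's implementation.
# Assumed input format: "dd.mm.yyyy HH:MM", optionally followed by
# " ~ HH:MM" or " ~ dd.mm.yyyy HH:MM" when the entry was edited.
parse_eksi_timestamp <- function(x, tz = "UTC") {
  parts <- strsplit(x, " ~ ", fixed = TRUE)
  original <- vapply(parts, function(p) p[1], character(1))
  edited <- vapply(
    parts,
    function(p) if (length(p) > 1) p[2] else NA_character_,
    character(1)
  )

  # A time-only edit stamp inherits the date of the original stamp.
  time_only <- !is.na(edited) & grepl("^[0-9]{2}:[0-9]{2}$", edited)
  edited[time_only] <- paste(substr(original[time_only], 1, 10), edited[time_only])

  fmt <- "%d.%m.%Y %H:%M"
  data.frame(
    original_date_time = as.POSIXct(original, format = fmt, tz = tz),
    edited_date_time = as.POSIXct(edited, format = fmt, tz = tz)
  )
}

parsed <- parse_eksi_timestamp(c(
  "15.02.1999 14:32",                    # never edited
  "15.02.1999 14:32 ~ 17:45",            # edited the same day
  "15.02.1999 14:32 ~ 16.02.1999 10:00"  # edited on a later day
))
parsed
```

Entries that were never edited get `NA` in `edited_date_time` under this sketch.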



## But what is eksisozluk?

[Wikipedia](https://en.wikipedia.org/wiki/Ek%C5%9Fi_S%C3%B6zl%C3%BCk) defines eksisozluk as:

*Ekşi Sözlük (Turkish pronunciation: [ecˈʃi søzˈlyc]; "Sour Dictionary") is a collaborative hypertext "dictionary" based on the concept of Web sites built up on user contribution.[1] However, Ekşi Sözlük is not a dictionary in the strict sense; users are not required to write correct information. It is currently one of the largest online communities in Turkey with over 400,000 registered users.[2] The number of writers is about 54,000. As an online public sphere, Ekşi Sozluk is not only utilized by thousands for information sharing on various topics ranging from scientific subjects to everyday life issues, but also used as a virtual socio-political community to communicate disputed political contents and to share personal views.[3]*

*The website's founder is Sedat Kapanoğlu. He founded the website for communicating with his friends in 1999 as he was inspired by The Hitchhiker's Guide to the Galaxy.[4][5] Ekşi Sözlük has been successful, many other websites that use this concept has emerged, like İTÜ Sözlük in Turkish.[6]*

*Enrollment periods to the dictionary and the criteria of acceptance are changeable.[7] Ekşi Sözlük does not accept new authors regularly; there are specific times in which new authors are accepted. There is a waiting period for new members who want to become authors in which they must post at least 10 entries. All entries are inspected according to the dictionary rules and their quality, and if they are pass inspection, the new user becomes an author. However, this process might take from months to years.*


AFAIK, there is no popular platform outside Turkey that acts as eksisozluk's exact counterpart. The closest candidate would be urbandictionary.com, although there are major differences between the two platforms. Despite the lack of global attention to the platform's format, eksisozluk is among the 16 most visited websites in Turkey, according to [Alexa](https://www.alexa.com/topsites/countries/TR).
