

Text Mining with Tidy Data Principles Lessons

Strategic and Competitive Intelligence 2020

by Filippo Chiarello

Adapted from Julia Silge's workshop at rstudio::conf 2020


๐Ÿ—“๏ธ October 2020


Overview

Text data is increasingly important for strategic and competitive intelligence. There are many reasons, but the main one is that most public information about companies is now in text format. Tidy data principles and tidy tools can make text mining easier and let you focus on the most important part of a business or technology-related analysis: the questions you want to answer.

In these lessons, learn how to manipulate, summarize, and visualize the characteristics of text using these methods and R packages from the tidy tool ecosystem. These tools are highly effective for many analytical questions and allow analysts to integrate natural language processing into effective workflows already in wide use. Explore how to implement approaches such as sentiment analysis of texts, measuring tf-idf, network analysis of words, and building both supervised and unsupervised text models.
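As a rough illustration of what measuring tf-idf involves, here is a minimal sketch in plain Python. It is not the lessons' code (the lessons use R and the tidy tool ecosystem). The function name `tf_idf`, the toy `reports` data, and the whitespace tokenization are all assumptions made for this example. It assumes a common convention: term frequency is a word's share of its document, and inverse document frequency is the natural log of the number of documents divided by the number of documents containing the word.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {(doc_name, term): tf-idf score} for a dict of name -> text.

    Hypothetical helper for illustration only; tokenization is a naive
    lowercase whitespace split.
    """
    # One row per (document, term) with its count, in the spirit of tidy data.
    counts = {name: Counter(text.lower().split()) for name, text in docs.items()}
    n_docs = len(counts)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for c in counts.values() for term in c)

    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        for term, n in c.items():
            tf = n / total                             # share of the document
            idf = math.log(n_docs / doc_freq[term])    # 0 if the term is in every document
            scores[(name, term)] = tf * idf
    return scores


# Hypothetical toy corpus: two invented company descriptions.
reports = {
    "company_a": "battery patent battery supply",
    "company_b": "software patent cloud software",
}
scores = tf_idf(reports)
```

In this toy corpus, "patent" appears in both documents, so its score is zero, while "battery" and "software" stand out as characteristic of one company each. That is the intuition behind using tf-idf to find the words that distinguish one document from the rest of a collection.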

Learning objectives

At the end of the lessons, students will understand how to:

  • Perform exploratory data analyses of text datasets, including summarization and data visualization
  • Understand and implement both tf-idf and sentiment analysis
  • Build classification models for text using tidy data principles

Prework

During these lessons, we'll share code and slides via a GitHub repo and code interactively together using an RStudio Cloud project. You can log in to RStudio Cloud with Google credentials, GitHub credentials, or email. Go ahead and log in with your preferred method before we meet so you can see what the platform looks like.

Instructor

Filippo Chiarello is a data scientist and researcher at the University of Pisa. His research focuses on the use of Natural Language Processing systems for understanding technological innovations and their impact on the workforce. He is co-founder of the company Texty, research consultant for Errequadro, and part of the Research Lab B4DS.


This work is licensed under a Creative Commons Attribution 4.0 International License.

Contributors

juliasilge, mine-cetinkaya-rundel, filippochiarello
