Text Mining with Tidy Data Principles Lessons

Strategic and Competitive Intelligence 2020

by Filippo Chiarello

Adapted from Julia Silge's workshop at rstudio::conf 2020

🗓️ October 2020

Overview

Text data is increasingly important for strategic and competitive intelligence. There are many reasons for this, but the main one is that most of the public information about companies is now in text format. Tidy data principles and tidy tools can make text mining easier, and will let you focus on the most important part of a business or technology-related analysis: the questions you want to answer.

In these lessons, you will learn how to manipulate, summarize, and visualize the characteristics of text using tidy data methods and R packages from the tidy tool ecosystem. These tools are highly effective for many analytical questions and allow analysts to integrate natural language processing into effective workflows already in wide use. You will explore how to implement approaches such as sentiment analysis of texts, measuring tf-idf, network analysis of words, and building both supervised and unsupervised text models.

Learning objectives

At the end of the lessons, students will be able to:

Perform exploratory data analyses of text datasets, including summarization and data visualization
Understand and implement both tf-idf and sentiment analysis
Build classification models for text using tidy data principles

Prework

During these lessons, we'll share code and slides via a GitHub repo and code interactively together using an RStudio Cloud project. You can log in to RStudio Cloud via Google credentials, GitHub credentials, or email. Go ahead and log in with your choice of method before we meet so you can see what the platform looks like.

Instructor

Filippo Chiarello is a data scientist and researcher at the University of Pisa. His research focuses on the use of Natural Language Processing systems for understanding technological innovations and their impact on the workforce. He is co-founder of the company Texty, research consultant for Errequadro, and part of the Research Lab B4DS.

This work is licensed under a Creative Commons Attribution 4.0 International License.
