From Text Analysis to Actionable Knowledge: Data-intensive Methods for Unstructured and Text-heavy Data

Date: Tuesday, November 29
Time: 10:00-14:00
Location: 1324, 025, Tvillingeauditorium

Speakers

Hilke Reckman (HR), Morten H.J. Fenger (MHJF), Kristoffer L. Nielbo (KLN)

Program

Time	Title	Speaker
10:00-10:30	Text Analytics	KLN
10:30-11:00	Natural Language Processing	HR
11:00-11:30	Models and Algorithms #1	KLN
11:30-12:00	General Discussion	KLN/HR
12:00-12:30	Lunch
12:30-13:00	Models and Algorithms #2	KLN
13:00-13:30	Applied Text Mining	MHJF
13:30-14:00	General Discussion	KLN/MHJF

Keywords

text analytics, business intelligence, language representations, data preparation, sentiment analysis, latent variables, classification, clustering, resources and tools, division of labor, domain knowledge

Detailed Program

Text Analytics

Text analytics (roughly synonymous with text mining) is a heterogeneous research field that focuses on the extraction of meaningful patterns from unstructured and text-heavy data. These patterns are typically extracted by applying machine learning to target data sets drawn from large non-relational databases. In this presentation, we will take a look at the composition of text analytics, its generic pipeline, and its potential for social science and humanities (SSH) research.

Natural Language Processing

Natural Language Processing (NLP) offers researchers more and more opportunities to analyze large amounts of text data. To take advantage of this, it is important to understand the possibilities and limitations of the various techniques, as well as the choices that need to be made along the way. This presentation provides an overview of the most important steps and options, and discusses the division of labor between human and computer in gaining insights from text data.

Models and Algorithms #1

Natural language is fundamentally qualitative and lacks the kind of structure (or model) required by large-scale automated analysis. In order to apply methods from text analytics to natural language data, it is therefore necessary to create a quantitative language model. Such language models typically rely on word probabilities and their co-occurrence structure. In this presentation, we will go through some of the most common language models and simple algorithms for extracting meaningful patterns from word probabilities.
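
As a rough illustration of what "word probabilities and their co-occurrence structure" can mean in practice, the sketch below estimates unigram probabilities and document-level co-occurrence counts for a toy corpus. The function names, the whitespace tokenizer, and the example sentences are assumptions made for this illustration; they are not taken from the presentation.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = [tok.strip(".,;:!?\"'()") for tok in text.lower().split()]
    return [tok for tok in tokens if tok]


def unigram_probabilities(documents):
    """Relative frequency of each word across the whole corpus."""
    counts = Counter(tok for doc in documents for tok in tokenize(doc))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def cooccurrence_counts(documents):
    """Number of documents in which each unordered word pair appears together."""
    pairs = Counter()
    for doc in documents:
        vocabulary = sorted(set(tokenize(doc)))
        # Sorting makes each pair a canonical (alphabetical) tuple.
        pairs.update(combinations(vocabulary, 2))
    return pairs


# Toy corpus, invented for this example.
docs = [
    "text mining finds patterns",
    "text analytics finds meaningful patterns",
    "machine learning finds patterns in text",
]

probs = unigram_probabilities(docs)
pairs = cooccurrence_counts(docs)

print(probs["text"])                 # share of all tokens that are "text"
print(pairs[("finds", "patterns")])  # documents containing both words
```

Real language models usually add steps this sketch leaves out, such as stop-word removal, stemming or lemmatization, and smoothing of the probability estimates.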

Models and Algorithms #2

Text analytics relies heavily on techniques from machine learning for macro-level document analysis. Machine learning offers a range of techniques for organizing documents according to latent patterns or metadata. An important distinction is between unsupervised learning, which finds groupings in the data independent of previous knowledge, and supervised learning, which maps a set of documents to preexisting classes. In this presentation, we will look at unsupervised learning (clustering and topic modeling) and supervised learning (classification) and, finally, discuss how to combine these machine learning techniques in SSH research.
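
The distinction between the two settings can be sketched on a toy example. The code below is a minimal illustration, assuming bag-of-words vectors and cosine similarity: a nearest-centroid classifier learns from labeled documents (supervised), while a k-means-style clustering groups the same kind of documents without labels (unsupervised). All names, the example texts, and the deterministic initialization are assumptions for this sketch, not the methods used in the presentation.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a document as word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Average of a list of count vectors."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return Counter({w: c / len(vectors) for w, c in total.items()})


def train_nearest_centroid(texts, labels):
    """Supervised: one centroid per preexisting class."""
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(bag_of_words(text))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def classify(model, text):
    """Assign the class whose centroid is most similar to the document."""
    vec = bag_of_words(text)
    return max(model, key=lambda label: cosine(vec, model[label]))


def kmeans(texts, k, iterations=10):
    """Unsupervised: k-means-style clustering using cosine similarity.

    Assumption: centroids start at the first k documents so the result
    is reproducible; practical implementations use random restarts.
    """
    vectors = [bag_of_words(t) for t in texts]
    centroids = vectors[:k]
    assignment = []
    for _ in range(iterations):
        assignment = [
            max(range(k), key=lambda i: cosine(vec, centroids[i]))
            for vec in vectors
        ]
        for i in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == i]
            if members:
                centroids[i] = centroid(members)
    return assignment


# Toy documents, invented for this example.
docs = [
    "stock market prices fall",
    "football match ends in a draw",
    "market shares and stock trading",
    "the team won the football cup",
]
labels = ["finance", "sport", "finance", "sport"]

model = train_nearest_centroid(docs, labels)
predicted = classify(model, "stock prices rise")
clusters = kmeans(docs, k=2)
print(predicted, clusters)
```

The classifier needs the `labels` list, whereas `kmeans` sees only the texts and returns arbitrary cluster indices that a researcher still has to interpret.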

Applied Text Mining in Management

Text mining examples will be given in management contexts, including 1) text mining of 1 million community text posts to obtain proxies for unavailable psychographic measures of community goers; 2) inspection for keywords correlated with a binary dependent variable among these community goers; 3) preprocessing and automatic classification of 26,000 news articles about university scholars as "research/findings-driven", "external/commentary", and "otherwise", based on a comprehensive dictionary of keywords obtained from two readers qualitatively classifying 500 articles; and 4) topic modelling of 2,000 press releases related to the above-mentioned news articles.
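
The dictionary-based classification in example 3 can be sketched as follows. This is only an illustration of the general idea: the three category labels come from the description above, but the keyword lists, the scoring rule (count of matching tokens), and the function name are hypothetical and do not reproduce the dictionary built by the two readers.

```python
# Hypothetical keyword dictionary; the real one was derived from
# 500 manually classified articles and is not reproduced here.
KEYWORDS = {
    "research/findings-driven": {"study", "findings", "results", "published"},
    "external/commentary": {"comments", "opinion", "debate", "according"},
}


def classify_article(text, keywords=KEYWORDS, fallback="otherwise"):
    """Return the category with the most keyword hits, or the fallback."""
    tokens = [tok.strip(".,;:!?\"'()") for tok in text.lower().split()]
    scores = {
        label: sum(tok in words for tok in tokens)
        for label, words in keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else fallback


# Invented example sentences.
articles = [
    "A new study published today reports findings on sleep.",
    "The professor comments on the election debate.",
    "The university opens a new canteen.",
]
predictions = [classify_article(article) for article in articles]
print(predictions)
```

Articles that match no keyword fall into "otherwise", which mirrors the three-way split described above; ties between categories are resolved here by dictionary order, an arbitrary choice for this sketch.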

Biographies

HR is a computational semanticist working on Natural Language Processing. She is particularly interested in how people understand language, and tries to investigate this through computational modeling. Computational models of how people deal with language also inform practical applications where computers use human language more intelligently. HR has contributed to Text Mining and Sentiment Analysis tools, and currently works on Information Extraction at UNSILO.

MHJF is a PhD student from ECON, originally enrolled in the MGMT PhD programme. MHJF applies and combines social network analysis and text mining methods to study not only the structure but also the content of social networks in business and management contexts. MHJF is interested in understanding complex patterns of human behavior from data-rich environments, while acknowledging that quantitative methods alone may oversimplify human behavior. Textual data are recorded in abundance, yet traditional qualitative text-based research methods may be influenced by severe researcher biases while failing to include all available data; the choice of data (e.g. case companies and interviewees) may be biased as well. Here text mining offers, both theoretically and methodologically, valuable triangulation of scientific findings, thus not substituting but potentially complementing traditional qualitative research methods.

KLN is a humanist scholar with specialization in computational and quantitative methods for analysis, interpretation and storage of cultural data. He has participated in a range of collaborative and interdisciplinary research projects involving researchers from the humanities, social sciences, health science, and natural sciences. His research covers two broad areas: automated text analysis and modeling of cultural behavior. Both areas explore the cultural information space in new and innovative ways by combining cultural data and humanities theories with statistics, computer algorithms, and visualization.
