
tm_mgmt's Introduction

From Text Analysis to Actionable Knowledge: Data-intensive Methods for Unstructured and Text-heavy Data

Date: Tuesday, November 29
Time: 10:00-14:00
Location: 1324, 025, Tvillingeauditorium

Speakers

Hilke Reckman (HR), Morten H.J. Fenger (MHJF), Kristoffer L. Nielbo (KLN)

Program

Time         Title                        Speaker
10:00-10:30  Text Analytics               KLN
10:30-11:00  Natural Language Processing  HR
11:00-11:30  Models and Algorithms #1     KLN
11:30-12:00  General Discussion           KLN/HR
12:00-12:30  Lunch
12:30-13:00  Models and Algorithms #2     KLN
13:00-13:30  Applied Text Mining          MHJF
13:30-14:00  General Discussion           KLN/MHJF

Keywords

text analytics, business intelligence, language representations, data preparation, sentiment analysis, latent variables, classification, clustering, resources and tools, division of labor, domain knowledge

Detailed Program

Text Analytics

Text analytics (~ text mining) is a heterogeneous research field that focuses on extraction of meaningful patterns from unstructured and text-heavy data. The meaningful patterns are typically extracted by applying machine learning to target data sets from large non-relational databases. In this presentation, we will take a look at the composition of text analytics, its generic pipeline, and its potential for social science and humanities (SSH) research.

Natural Language Processing

Natural Language Processing (NLP) offers researchers more and more opportunities to analyze large amounts of text data. To take advantage of this, it is important to understand the possibilities and limitations of the various techniques, as well as the choices that need to be made along the way. This presentation provides an overview of the most important steps and options, and discusses the division of labor between human and computer in gaining insights from text data.

Models and Algorithms #1

Natural language is fundamentally qualitative and lacks the kind of structure (or model) required by large-scale automated analysis. In order to apply methods from text analytics to natural language data, it is therefore necessary to create a quantitative language model. Such language models typically rely on word probabilities and their co-occurrence structure. In this presentation, we will go through some of the most common language models and simple algorithms for extracting meaningful patterns from word probabilities.
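As a rough illustration of what "word probabilities and their co-occurrence structure" can mean in practice, the sketch below estimates maximum-likelihood unigram probabilities and counts document-level word co-occurrences. It is not taken from the presentation: the function names, the crude letter-only tokenizer, and the two example sentences are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
import re


def tokenize(text):
    """Lowercase and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def unigram_probabilities(docs):
    """Maximum-likelihood word probabilities: count(word) / total tokens."""
    counts = Counter(tok for doc in docs for tok in tokenize(doc))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def cooccurrence_counts(docs):
    """Count in how many documents each unordered word pair occurs together."""
    pairs = Counter()
    for doc in docs:
        vocab = sorted(set(tokenize(doc)))  # sorted so each pair has one canonical order
        pairs.update(combinations(vocab, 2))
    return pairs


# Toy corpus, invented for illustration only
docs = [
    "text mining finds patterns in text",
    "machine learning finds patterns",
]
probs = unigram_probabilities(docs)
pairs = cooccurrence_counts(docs)
```

Here `probs["text"]` is the share of all tokens that are "text", and `pairs[("finds", "patterns")]` is the number of documents containing both words. Real language models usually add smoothing and a proper tokenizer, both omitted here.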

Models and Algorithms #2

Text analytics relies heavily on techniques from machine learning for macro-level document analysis. Machine learning offers a range of techniques for organizing documents according to latent patterns or metadata. An important distinction is between techniques for unsupervised learning, which find groupings in the data independent of previous knowledge, and supervised learning, which maps a set of documents to preexisting classes. In this presentation, we will look at unsupervised learning (clustering and topic modeling) and supervised learning (classification) and, finally, discuss how to combine these machine learning techniques in SSH research.
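The contrast between the two learning settings can be sketched on bag-of-words vectors. The example below is an assumption-laden toy, not the method used in the presentation: the supervised part is a nearest-centroid classifier, and the unsupervised part is a simple single-pass "leader" clustering with a hypothetical similarity threshold, swapped in for the clustering and topic modeling techniques named above because it fits in a few lines.

```python
from collections import Counter
import math
import re


def bow(text):
    """Bag-of-words vector: word -> count."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def train_centroids(labeled_docs):
    """Supervised: sum the word counts of all documents that share a label."""
    centroids = {}
    for text, label in labeled_docs:
        centroids.setdefault(label, Counter()).update(bow(text))
    return centroids


def classify(text, centroids):
    """Assign the label whose centroid is most similar to the document."""
    vec = bow(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def leader_cluster(docs, threshold=0.3):
    """Unsupervised: join the first cluster whose leader is similar enough, else start a new one."""
    leaders, clusters = [], []
    for text in docs:
        vec = bow(text)
        for i, leader in enumerate(leaders):
            if cosine(vec, leader) >= threshold:
                clusters[i].append(text)
                break
        else:  # no existing leader was similar enough
            leaders.append(vec)
            clusters.append([text])
    return clusters


# Toy data, invented for illustration only
train = [
    ("stock market prices fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins football match", "sport"),
    ("player scores winning goal", "sport"),
]
centroids = train_centroids(train)
label = classify("interest rates and stock prices", centroids)

clusters = leader_cluster([
    "stock market prices fall",
    "stock prices rise",
    "football match today",
    "football match result",
])
```

The classifier needs the labels in `train`, whereas `leader_cluster` sees only raw text; its result depends on document order and on the chosen threshold, which is one reason more robust clustering methods are normally preferred.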

Applied Text Mining in Management

Text mining examples will be given in management contexts, including 1) text mining of 1 million community text posts to obtain proxies for unavailable psychographic measures of community goers; 2) inspection for keywords correlated with a binary dependent variable among these community goers; 3) preprocessing and automatic classification of 26,000 news articles about university scholars as "research/findings-driven", "external/commentary", and "otherwise", based on a comprehensive dictionary of keywords obtained from two readers qualitatively classifying 500 articles; and 4) topic modelling of 2,000 press releases related to the above-mentioned news articles.
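A dictionary-based classifier of the kind described in example 3 might look roughly like the sketch below. Only the three class labels come from the description above; the keywords, the scoring rule (count of distinct matching keywords), and the function name are hypothetical and stand in for the actual dictionary built from the 500 hand-coded articles.

```python
import re

# Hypothetical keyword dictionary; the real one is not given in this document
KEYWORDS = {
    "research/findings-driven": {"study", "findings", "results", "published"},
    "external/commentary": {"comments", "according", "expert", "says"},
}


def classify_article(text, keywords=KEYWORDS, fallback="otherwise"):
    """Pick the class with the most distinct keyword hits; use the fallback if none match."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {label: len(tokens & words) for label, words in keywords.items()}
    best = max(scores, key=scores.get)  # ties go to the class listed first
    return best if scores[best] > 0 else fallback


# Invented example headlines
a = classify_article("A new study published today reports findings on sleep")
b = classify_article("The expert says the policy will fail")
c = classify_article("The university opened a new cafeteria")
```

Under these assumptions the three headlines fall into the research, commentary, and "otherwise" classes respectively. A production version would also need the preprocessing step mentioned above, such as stemming or handling of multi-word keywords.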

Biographies

HR is a computational semanticist working on Natural Language Processing. She is particularly interested in how people understand language, and tries to investigate this through computational modeling. Computational models of how people deal with language also inform practical applications where computers use human language more intelligently. HR has contributed to Text Mining and Sentiment Analysis tools, and currently works on Information Extraction at UNSILO.

MHJF is a PhD student from ECON, originally enrolled in the MGMT PhD programme. MHJF applies and combines social network analysis and text mining methods to study not only the structure but also the content of social networks in business and management contexts. MHJF is interested in understanding complex patterns of human behavior from data-rich environments, while acknowledging that quantitative methods alone may oversimplify human behavior. Textual data are recorded in abundance, yet traditional qualitative text-based research methods may be influenced by severe researcher biases while failing to include all available data; the choice of data (e.g. case companies and interviewees) may be biased as well. Here text mining offers valuable triangulation of scientific findings, both theoretically and methodologically, potentially complementing rather than replacing traditional qualitative research methods.

KLN is a humanist scholar with specialization in computational and quantitative methods for analysis, interpretation and storage of cultural data. He has participated in a range of collaborative and interdisciplinary research projects involving researchers from the humanities, social sciences, health science, and natural sciences. His research covers two broad areas: automated text analysis and modeling of cultural behavior. Both areas explore the cultural information space in new and innovative ways by combining cultural data and humanities theories with statistics, computer algorithms, and visualization.
