
Ryan Hildebrandt, MS



At a Glance

Broadly, I'm an interdisciplinary data scientist with special interests in psycholinguistics, computational linguistics, statistics, and Japanese. More specifically, I apply statistical and natural language processing techniques to solve problems and to deepen my own and my team's understanding of whatever data is at hand. I bring a strong statistical background to my work, whether in the form of data handling, ad hoc statistical analyses, or machine learning applications.


Skills

  • Programming (R, Python, SQL, Ruby, Julia, MATLAB, SAS, Git, VBA, Regex)

  • Data analysis (Minitab, SPSS, Tableau, KNIME, Google Data Studio, Qualtrics, Excel, Access, Power BI)

  • Research (Research design, data visualization, scientific writing)

  • Statistics (Bayesian & frequentist approaches, multiple regression, ANOVA, SEM/MLM)

  • Language (Advanced Japanese, language instruction, natural language processing)


Resume & CV

Take a look at my Resumes & CV


Projects

My larger projects, more immediately relevant to my research interests and past academic work:

  • particles, contextual particle frequency in written Japanese, taking a swing at the age-old question of は vs が
  • thesis, the scripts and analyses from my graduate thesis work, "Investigating Emotion-label and Emotion-laden Words in a Semantic Satiation Paradigm"
  • aozora_corpus, a compilation of Japanese texts pulled from 青空文庫, also available on Kaggle
  • embs, a project to provide tools streamlining sentence embedding or clustering techniques
  • siftr, a Shiny app using SIF sentence embeddings to separate out unwanted text data
  • priors, an experiment combining pretrained and bag of words embeddings to incorporate prior semantic knowledge
  • bowts, an experiment combining pretrained and bag of words embedding approaches for embedding vector space manipulation
  • iterate, iterative clustering for sklearn clusterers
  • topics, experiments and utilities for text topic extraction using decision trees
  • simsort, sorting texts by semantic similarity

Some stats/NLP/dataviz side work I've done, partly out of personal interest, partly to learn different data techniques, and partly to serve as a quick reference in my day-to-day work:

  • aozora_annotator, text annotator for Aozora Bunko corpus texts
  • probs, Bayesian modeling quick reference for pymc, bambi, rstan, and rstanarm
  • dists, simple reference and tools for working with probability distributions
  • trendsim, simulating social media traffic for Japanese authors using Markov chains, MongoDB, and Kafka
  • genji, character networks in The Tale of Genji
  • shrimp, a Bayesian time series analysis of some very specific tweets
  • radicals, some experiments with embedding kanji in vector spaces based on radical composition, readings, and meanings
  • hanakotoba, a project looking at the use of 花言葉 in literature
  • yoji, applying neural networks to generate novel 四字熟語 idioms
  • ebook_tokenizer, a command to add spaces between Japanese words in eBooks to work with Kindle WordWise
  • kyoto, an exploration of restaurants around train/subway stations in Kyoto
  • manyogana, an application to translate Japanese text to a modern implementation of manyogana and to convert Arabic numerals to kansuuji
  • movies, a dataviz/exploration dashboard for the 10,000 movies dataset
  • michelin, exploration of Michelin star restaurants
  • tea_temps, a quick dataviz and reference for getting a good cup of tea

Hire Me

If you need tutoring or consultation on any of the following topics, please get in touch! I've worked with students ranging from high school to PhD level, both in person and online.

  • Statistics

  • Psychology

  • Japanese

  • Linguistics

  • Natural Language Processing

  • R, Python, Julia, SQL, Ruby

  • Machine Learning


Contact & Connect

GitHub - LinkedIn - ResearchGate
[email protected]

ryancahildebrandt's Projects

aozora_corpus

Centuries of Japanese literature, all in one convenient csv

bowts

An experiment combining pretrained and bag of words embedding approaches for embedding vector space manipulation

cheats

Community-sourced cheatsheets for navi

dists

Simple reference and tools for working with probability distributions

dotfiles

Dotfiles and configs for my commonly used programs

ebook_tokenizer

Add spaces between Japanese words in eBooks to work with Kindle WordWise

embs

A project to provide tools streamlining sentence embedding or clustering techniques

genji

Character networks in The Tale of Genji

hanakotoba

Exploring 花言葉 in Japanese and other literary corpora

iterate

Iterative clustering for sklearn clusterers

kyoto

Restaurants and stations in Kyoto

manyogana

万葉仮名 & 漢数字 transliteration functions & RShiny app

michelin

Exploration of Michelin star restaurants

movies

A dataviz/exploration dashboard with the 10,000 Movies dataset

particles

Contextual particle frequency in written Japanese, taking a swing at the age-old question of は vs が

priors

An experiment combining pretrained and bag of words embedding approaches for text classification

probs

Bayesian modeling quick reference for pymc, bambi, rstan, and rstanarm

radicals

Playing with different embedding techniques & kanji

resume

My resume without that pesky single page convention

shrimp

Tracking shrimp fried rice tweets over time

siftr

Using SIF sentence embeddings to separate out unwanted text data

simsort

Sorting texts by semantic similarity

tea_temps

Quick table & plots of recommended tea steeping temperature by type

templates

Simple document and documentation templates

thesis

Scripts from my graduate thesis, looking at emotion word processing.

topics

Experiments and utilities for text topic extraction using decision trees
