Code used for the experiments in my master's thesis project, carried out in collaboration with NRK:
Automatic Topic Generation for Broadcasters: Usable Metadata from Topic Models on Systematically Preprocessed TV Subtitles
./textPrep
Contains a copy of the textPrep library, adapted from the original textPrep project. The library has been modified to work with the Norwegian dataset used in the master's thesis project. The modified version (fork) of textPrep can be found here.
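To give a rough idea of what adapting preprocessing to Norwegian text can involve, here is a minimal standard-library sketch. It does not use or reproduce the textPrep API; the function name `preprocess_subtitle`, the token pattern, and the very short stopword list are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch);
# a real experiment would use a much fuller Norwegian list.
NORWEGIAN_STOPWORDS = {"og", "i", "er", "det", "som", "en", "på", "å", "for", "ikke"}

# Matches runs of lowercase letters, including the Norwegian characters æ, ø and å.
TOKEN_PATTERN = re.compile(r"[a-zæøå]+")


def preprocess_subtitle(line):
    """Lowercase a subtitle line, split it into word tokens, and drop stopwords."""
    tokens = TOKEN_PATTERN.findall(line.lower())
    return [token for token in tokens if token not in NORWEGIAN_STOPWORDS]


if __name__ == "__main__":
    print(preprocess_subtitle("Det er ikke lett å finne gode temaer på NRK!"))
```

The point of the sketch is that an ASCII-only token pattern would split words at æ, ø and å, so the pattern and the stopword list both need to be language-aware before the text is passed to a topic model.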