Refinery: An Open Source Topic Modeling Web Platform

Daeil Kim, Benjamin F. Swanson, Michael C. Hughes, Erik B. Sudderth.

Year: 2017, Volume: 18, Issue: 12, Pages: 1−5


Abstract

We introduce Refinery, an open source platform for exploring large text document collections with topic models. Refinery is a standalone web application driven by a graphical interface, so it is usable by those without machine learning or programming expertise. Users can interactively organize articles by topic and then refine this organization with phrase-level analysis. Under the hood, we use scalable learning algorithms to train Bayesian nonparametric topic models that adapt model complexity to the provided data. The project website contains Python code and further documentation.
