Stats 101: Where to stat?

See what I did there? If you're anything like me, you're finding your way backwards into stats after spending a lifetime thinking that all there was to the field was mean, median, mode, sometimes cumulative frequency, and those are things you left behind in high school. Newsflash: not only is stats underlying every quantitative prediction... Continue Reading →


Open tools for Natural Language Processing / NLProc / NLP

Liling Tan of Saarland University compiled a list of open-source NLP tools for anyone to get started with. Thanks Liling! We don't know each other, but your list is awesome! Here's the compiled list of NLP tools. Here are the NLP tools slides that Liling presented at FOSS Asia in 2017.

Guide to planning and analyzing experiments

I discovered this amazing tutorial on planning and analyzing experiments by Eytan Bakshy and Sean Taylor of Facebook Research, and I'm so glad I found it while I was still designing my first experiment. It is a comprehensive resource that takes you from starting to think about experiment design, all the way to what... Continue Reading →

Writing a peer review

Here's a short and sweet list of points to cover in your next peer review, courtesy of ICWSM-19: A brief summary of the main content and contribution of the manuscript. A summary of the strengths of the manuscript. A summary of the weaknesses of the manuscript. The anticipated impact of this work. Ethical considerations of the data/problem,... Continue Reading →

Guide to designing online surveys

Note that most research agencies may provide a nationally representative, probability-based, online panel for the US alone. Online panels in other countries are almost entirely opt-in (nonprobability) panels and aren’t designed for following the same respondents longitudinally. Furthermore, it is recommended to design a longitudinal study which has a fresh sample in each wave, or at... Continue Reading →

Language modeling on social media: the pros and cons

A number of studies have used social media language to (a) profile individuals, (b) profile linguistic styles, (c) profile communities, and (d) extrapolate the results to other domains, individuals, and communities. However, my work shows that pre-trained models may not scale well to other platforms, or even on the same platform to measure aggregated groups of... Continue Reading →
