Mar 9, 2023 1 min read programming

gsDesign sample sizes

I recently tried to use gsDesign to calculate sample sizes for a language model question-answering experiment. Unfortunately, it didn't seem to work. I was calculating the size for a fixed-sample design both with and without the package. The package was giving me a sample size more than twice as large as the other methods I tried. The other numbers were consistent with each other, so they were probably correct. I couldn't figure out what was going wrong.

The punchline: gsDesign reports the total sample size across both groups. That is twice the number of questions, because I am comparing two models on the same set of N questions. I eventually figured this out by trying the similar rpact library. rpact's report gives a breakdown of the sample size by group (control vs. treatment), which showed that each group was roughly the size I would have expected from the other calculations. Not exactly the expected size, but pretty close.
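To make the total vs. per-group distinction concrete, here is a minimal Python sketch of a textbook fixed-sample calculation for comparing two proportions. It is not what gsDesign or rpact compute internally, so small differences from their output are expected. The accuracies (0.60 and 0.70), the one-sided alpha of 0.025, the 90% power, and the helper name `per_group_n` are all made up for illustration.

```python
from math import ceil
from statistics import NormalDist


def per_group_n(p_control, p_treatment, alpha=0.025, power=0.9):
    """Normal-approximation sample size PER GROUP for a one-sided
    comparison of two independent proportions (unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)       # quantile for the target power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    delta = p_treatment - p_control
    return (z_alpha + z_beta) ** 2 * variance / delta ** 2


# Hypothetical accuracies for the two models being compared.
n_each = per_group_n(0.60, 0.70)

# A clinical-trial tool splits patients between two arms, so the
# "total" it reports is the sum over both arms.
n_total = 2 * n_each

print(f"per group (number of questions): {ceil(n_each)}")
print(f"total across both groups:        {ceil(n_total)}")
```

When both models answer the same questions, the per-group figure is the one that corresponds to the number of questions. The total is only the right number when each group needs its own separate subjects.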

In retrospect it makes sense. These packages are explicitly meant for planning clinical trials. There you are dividing a sample of human patients between treatment and control groups. I'm definitely not using this code for its original intended purpose.
