Heartbeat detection tasks are often used to measure cardiac interoceptive sensitivity, the ability to detect sensations from one's heart. However, there is little work to guide decisions on the optimum number of trials to use, which should balance reliability and power against task duration and participant burden. Here, 174 participants completed 100 trials of a widely used heartbeat detection task in which they judged whether presented tones occurred synchronously or asynchronously with their heartbeats. First, we quantified the measurement reliability of participants' accuracy derived from differing numbers of trials of the task using a correlation metric; we found that at least 40-60 trials were required to yield sufficient reliability. Next, we quantified power by simulating how the number of trials influenced the ability to detect a correlation between cardiac interoceptive sensitivity and other variables that differ across participants, including a variable measured in our sample (body mass index) as well as simulated variables of varying effect sizes. Using these simulations, we quantified the trade-offs between sample size, effect size, and number of trials in the heartbeat detection task, such that a researcher can easily determine any one of these variables given values of the other two. We conclude that using fewer than 40 trials is typically insufficient due to poor reliability and low power to detect an effect, although the optimal number of trials can differ by study.
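The power analysis described above can be illustrated with a minimal sketch. It is not the authors' simulation code. The function name `simulate_power`, the generative model (a normally distributed latent sensitivity mapped to a per-trial probability of a correct response with mean 0.65 and SD 0.10), and the Fisher z significance test are all assumptions chosen for illustration. Only the sample size of 174 comes from the study.

```python
import numpy as np

def simulate_power(n_trials, n_participants=174, true_r=0.3,
                   n_sims=1000, z_crit=1.96, seed=0):
    """Estimate power to detect a correlation between task accuracy and a covariate.

    All distributional choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        # Latent sensitivity and a covariate that correlates with it at true_r.
        latent = rng.standard_normal(n_participants)
        noise = rng.standard_normal(n_participants)
        covariate = true_r * latent + np.sqrt(1.0 - true_r**2) * noise
        # Hypothetical mapping from latent sensitivity to P(correct) per trial.
        p_correct = np.clip(0.65 + 0.10 * latent, 0.0, 1.0)
        # Observed accuracy is the proportion correct over n_trials binomial trials.
        accuracy = rng.binomial(n_trials, p_correct) / n_trials
        r = np.corrcoef(accuracy, covariate)[0, 1]
        # Fisher z approximation to a two-sided test of r = 0 at alpha = 0.05.
        z = np.arctanh(r) * np.sqrt(n_participants - 3)
        hits += int(abs(z) > z_crit)
    return hits / n_sims
```

Calling `simulate_power(n)` for several values of `n` (for example 10, 20, 40, 60, 100) and varying `n_participants` and `true_r` gives a rough picture of the three-way trade-off: fewer trials add binomial noise to each participant's accuracy, which attenuates the observed correlation and lowers power at a fixed sample size.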