Camera-traps are increasingly used to estimate wildlife abundance, yet few studies have focused on small carnivores or compared the efficacy of camera-trapping with that of traditional methods. We developed a camera-trap method to identify American martens (Martes americana) from their unique ventral patches. In developing the method, we sought to: (1) determine the optimal trap configuration for photographing ventral patches; (2) evaluate the use of temporally clustered photographs to determine independence and improve identification; and (3) determine the factors that influence identification probability. We tested our method by comparing camera- and live-trap density estimates using spatial capture–recapture (SCR) models. The ventral patches of radio-collared martens were most visible when traps were placed 15–20 cm above a feeding platform. Radio-collared martens (n = 14) visited camera-traps for long periods (median = 7 min) with long intervals between visits (median = 419 min), and visits by different martens to the same trap <15 min apart were infrequent (n = 3) during both years. Similarly, observers agreed completely that clustered photos of un-collared martens were always of the same individual. Pairwise agreement between observers was high; eight un-collared martens were identifiable by consensus on 90% (54 of 60) of recorded visits. Identification probability increased with the time martens spent feeding at traps (β = 0.143, P = 0.01) and decreased with the time elapsed since traps were baited (β = −0.344, P = 0.006). Density estimates were higher and more precise for camera-trapping (0.60, 0.35–1.01 martens/km²) than for live-trapping (0.45, 0.16–1.22 martens/km²), suggesting that SCR density estimates may be biased when capture heterogeneity is present but cannot be accounted for because of small sample size. Our camera-trap method provides a minimally invasive and accurate tool for monitoring marten populations.
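
The temporal clustering of photographs described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the 15-minute gap threshold is an assumption borrowed from the interval mentioned in the abstract, the function names `cluster_photos` and `visit_summary` are hypothetical, and the timestamps are invented for illustration only.

```python
from datetime import datetime, timedelta


def cluster_photos(timestamps, max_gap=timedelta(minutes=15)):
    """Group photo timestamps from one trap into visits.

    A new visit starts whenever the gap since the previous photo
    exceeds max_gap (assumed threshold, not taken from the study).
    """
    visits = []
    for ts in sorted(timestamps):
        if visits and ts - visits[-1][-1] <= max_gap:
            visits[-1].append(ts)   # same visit: gap is within the threshold
        else:
            visits.append([ts])     # gap too long: start a new visit
    return visits


def visit_summary(visits):
    """Return visit durations and between-visit intervals, in minutes."""
    durations = [(v[-1] - v[0]).total_seconds() / 60 for v in visits]
    intervals = [
        (later[0] - earlier[-1]).total_seconds() / 60
        for earlier, later in zip(visits, visits[1:])
    ]
    return durations, intervals


# Invented example timestamps for a single trap night.
photos = [
    datetime(2020, 1, 1, 20, 0),
    datetime(2020, 1, 1, 20, 4),
    datetime(2020, 1, 1, 20, 9),
    datetime(2020, 1, 1, 23, 30),
    datetime(2020, 1, 1, 23, 32),
]
visits = cluster_photos(photos)
durations, intervals = visit_summary(visits)
```

With these invented timestamps, the five photographs fall into two visits. Summaries such as the median visit duration and median interval between visits could then be computed from `durations` and `intervals`.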