Crack detection and segmentation on concrete structures are vital tasks in structural health monitoring (SHM). Although supervised machine learning techniques have achieved considerable success in this domain, data collection and annotation remain challenging: obtaining representative image datasets and manually labeling training data are tedious and laborious in the SHM domain. According to the literature, there are significant issues with the hand-annotation of image data. To address this gap, this paper proposes a two-stage weakly supervised learning framework that uses a novel “crack attention network (CrANET)” with an attention mechanism to detect and segment cracks in images without human-annotated pixel-level labels. The framework first classifies concrete surface images as crack or no-crack and then uses gradient-weighted class activation mapping (Grad-CAM) visualization to generate crack segmentations. Professionals and domain experts subsequently evaluate these segmentation results in a human expert validation study. Because the literature suggests that weakly supervised learning is rarely practiced in SHM, this research is intended to motivate SHM researchers to develop state-of-the-art weakly supervised learning approaches.
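As a rough illustration of the second stage, the sketch below shows how a class activation heatmap of the Grad-CAM kind is commonly computed from a classifier's feature maps and then thresholded into a binary crack mask. It is a minimal sketch under stated assumptions, not the CrANET implementation: the function names `grad_cam` and `to_mask`, the 0.5 threshold, and the toy feature maps and gradients are all hypothetical, and in practice the activations and gradients would come from a chosen convolutional layer of the trained classifier.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM style heatmap.

    activations: K feature maps, each an H x W nested list.
    gradients:   gradients of the "crack" class score with respect to
                 those feature maps, same shape as activations.
    """
    # Channel weights: spatial average of the gradients for each feature map.
    weights = [sum(map(sum, g)) / (len(g) * len(g[0])) for g in gradients]
    h, w = len(activations[0]), len(activations[0][0])
    # Weighted sum of feature maps, followed by ReLU to keep positive evidence.
    cam = [
        [max(0.0, sum(wk * a[i][j] for wk, a in zip(weights, activations)))
         for j in range(w)]
        for i in range(h)
    ]
    peak = max(map(max, cam))
    # Scale to [0, 1]; an all-zero map is left as zeros.
    return [[v / peak if peak > 0 else 0.0 for v in row] for row in cam]


def to_mask(cam, threshold=0.5):
    """Binarize the heatmap (the threshold value is an assumption)."""
    return [[1 if v >= threshold else 0 for v in row] for row in cam]


# Toy inputs (hypothetical): two 2 x 3 feature maps and their gradients.
activations = [
    [[0.0, 1.0, 0.0], [0.0, 2.0, 0.0]],  # responds to a vertical streak
    [[1.0, 0.0, 1.0], [1.0, 0.0, 1.0]],  # responds to the background
]
gradients = [
    [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]],        # supports the "crack" class
    [[-0.25, -0.25, -0.25], [-0.25, -0.25, -0.25]],  # opposes it
]

cam = grad_cam(activations, gradients)
mask = to_mask(cam)
print(mask)  # [[0, 1, 0], [0, 1, 0]]
```

Only image-level class scores drive the heatmap here, which is why this kind of pipeline needs no pixel-level labels; the quality of the resulting mask depends on the classifier and on the chosen threshold.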