by Francesca Lizzi (National Institute for Nuclear Physics, Scuola Normale Superiore, National Research Council, University of Pisa), Maria Evelina Fantacci (National Institute for Nuclear Physics, University of Pisa) and P. Oliva (National Institute for Nuclear Physics, University of Sassari)
Breast cancer is the most commonly diagnosed cancer among women worldwide. Survival rates strongly depend on early diagnosis, and for this reason mammographic screening is performed in developed countries. New artificial intelligence-based techniques have the potential to include and quantify fibroglandular (or dense) parenchyma in breast cancer risk models.
Breast cancer is the most commonly diagnosed cancer among women worldwide. According to the latest American cancer statistics [L1], breast cancer is the second leading cause of cancer death among women, and one in eight women will develop the disease at some point in her life.
Although the incidence of breast cancer is increasing, mortality from the disease is decreasing. This is mainly due to breast cancer screening programs, in which women aged 45-74 are invited for a mammographic exam every two years. Mammography is still the most widely used screening method, but it suffers from two inherent limitations: a low sensitivity (cancer detection rate) in women with dense breast parenchyma, and a low specificity, which causes unnecessary recalls. The low sensitivity in women with dense breasts is caused by a “masking effect” of overlying breast parenchyma. Furthermore, the summation of normal breast parenchyma on a conventional mammogram may occasionally simulate a cancer. In recent years, new imaging techniques have been developed: tomosynthesis, which can produce 3D and synthetic 2D images of the breast, new MRI techniques with contrast medium, and breast CT. Mammography nevertheless retains one major advantage: thanks to screening programs, large numbers of mammographic images can be collected from hospitals to build datasets on which AI techniques can be explored.
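To make the two limitations concrete, the following minimal Python sketch computes sensitivity and specificity from the four possible screening outcomes. The counts are invented purely for illustration and are not taken from any screening program; the function and variable names are likewise our own.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of women with cancer whose cancer is detected at screening."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of healthy women who are correctly not recalled."""
    return true_negatives / (true_negatives + false_positives)


# Made-up counts for 1,000 screened women (illustration only, not study data).
tp, fn = 6, 4      # cancers detected / cancers missed (e.g. masked by dense tissue)
tn, fp = 940, 50   # healthy and not recalled / healthy but recalled unnecessarily

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.3f}")
```

In this toy setting, every cancer hidden by dense parenchyma adds to the false negatives and lowers sensitivity, while every unnecessary recall adds to the false positives and lowers specificity.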
In recent years, new methods for image analysis have been developed. In 2012, for the first time, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), the most important image classification challenge worldwide, was won by a deep learning-based classifier named AlexNet [1]. Since then, the success of deep learning on visual perception problems has inspired much scientific work, not only on natural images but on medical images too [2]. Deep learning-based techniques offer very high accuracy and predictive power at the expense of interpretability. Furthermore, they usually need a huge amount of data and substantial computational power to be trained.
At the Italian National Institute for Nuclear Physics (INFN), within the framework of a PhD in data science [L2] of Scuola Normale Superiore of Pisa, University of Pisa and the ISTI-CNR, we are working to apply deep learning models to find new image biomarkers extracted from screening mammograms that can help with early diagnosis of breast cancer.
Figure 1: The four density classes are shown as reported in the BI-RADS Atlas. The classes are defined through textual description and examples and are named A, B, C and D in order of increasing density.
Previously [3], we trained and evaluated a deep convolutional neural network to classify breast parenchyma according to the BI-RADS standard, which defines four qualitative density classes (Figure 1), and we obtained very good results compared to other work. Our research activities are continuing with a larger dataset and more ambitious objectives. We are collecting data from Tuscany screening programs and the ever-expanding dataset currently includes:
- 2,000 mammographic exams (8,000 images, four per subject) of healthy women labelled by the amount of fibroglandular tissue. These exams have been extracted from the Hospital of Pisa database.
- 500 screen-detected cases of cancer, 90 interval cancer cases and 270 control exams along with the histologic reports and a questionnaire with the known breast cancer risk factors, such as parity, height, weight and family history. It is possible to access all the mammograms prior to diagnosis for each woman. These exams have been extracted from the North-West Tuscany screening database.
The goal of our work is multi-fold and may be summarised as follows:
- to explore the robustness of deep learning algorithms with respect to the use of different mammographic systems, which usually result in different imaging properties.
- to define a deep learning model able to recognise the kind and nature of the malignant masses depicted in mammographic data based on the related histologic reports.
- to investigate the inclusion of the fibroglandular parenchyma in breast cancer risk models in order to increase their predictive power. In this respect, changes in dense parenchyma will be monitored over time through image registration techniques, to understand how its variation may influence cancer risk. Furthermore, we will investigate the role of dense tissue in the onset of interval cancers and the correlation between fibroglandular tissue, both local and global, and other known risk factors, so as to quantify the risk of developing breast cancer.
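As a rough illustration of what monitoring dense parenchyma over time could involve, the sketch below computes a global percent density from binary masks and its change between two exams. It is not our pipeline: it assumes that a dense-tissue segmentation already exists and that the two exams have already been registered to a common breast mask. The function name `percent_density` and the toy masks are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # 1 = pixel belongs to the region, 0 = it does not


def percent_density(dense_mask: Mask, breast_mask: Mask) -> float:
    """Percentage of breast pixels that are labelled as dense tissue."""
    breast_pixels = sum(sum(row) for row in breast_mask)
    if breast_pixels == 0:
        raise ValueError("breast mask is empty")
    # Count dense pixels only where they fall inside the breast region.
    dense_pixels = sum(
        d & b
        for dense_row, breast_row in zip(dense_mask, breast_mask)
        for d, b in zip(dense_row, breast_row)
    )
    return 100.0 * dense_pixels / breast_pixels


# Toy 3x4 masks, invented for illustration (assumed already registered).
breast = [[1, 1, 1, 0],
          [1, 1, 1, 1],
          [1, 1, 1, 0]]
dense_earlier = [[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 0]]
dense_later = [[0, 1, 0, 0],
               [0, 1, 1, 0],
               [0, 0, 0, 0]]

earlier = percent_density(dense_earlier, breast)
later = percent_density(dense_later, breast)
change = later - earlier  # negative values mean density decreased
print(f"earlier = {earlier:.1f}%, later = {later:.1f}%, change = {change:+.1f} points")
```

A local variant could apply the same function to sub-regions of the breast rather than to the whole mask; which regions are clinically meaningful is an open question here, not something this sketch settles.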
Links:
[L1] https://kwz.me/hy5
[L2] https://datasciencephd.eu/
References:
[1] Alex Krizhevsky et al.: “ImageNet classification with deep convolutional neural networks”, NIPS'12 Proceedings, 2012.
[2] Geert Litjens et al.: “A survey on deep learning in medical image analysis”, Medical Image Analysis, 2017.
[3] F. Lizzi et al.: “Residual Convolutional Neural Network for breast density classification”, BIOINFORMATICS Proceedings, 2019, ISBN: 978-989-758-353-7.
Please contact:
Francesca Lizzi, National Institute for Nuclear Physics, Scuola Normale Superiore, National Research Council, University of Pisa, Italy