Academic Articles Collection
Permanent URI for this collection: https://repositorio.insper.edu.br/handle/11224/3227
16 results
Search Results

Scientific Article
Neonatal mortality prediction with routinely collected data: a machine learning approach (2021)
ANDRE FILIPE DE MORAES BATISTA; Diniz, Carmen S. G.; Bonilha, Eliana A.; Kawachi, Ichiro; Chiavegatto Filho, Alexandre D. P.
Background: Recent decreases in neonatal mortality have been slower than expected for most countries. This study aims to predict the risk of neonatal mortality using only data routinely available from birth records in the largest city of the Americas.
Methods: A probabilistic linkage of every birth record occurring in the municipality of São Paulo, Brazil, between 2012 and 2017 was performed with the death records from 2012 to 2018 (1,202,843 births and 447,687 deaths), and a total of 7282 neonatal deaths were identified (a neonatal mortality rate of 6.46 per 1000 live births). Births from 2012 to 2016 (N = 941,308; or 83.44% of the total) were used to train five different machine learning algorithms, while births occurring in 2017 (N = 186,854; or 16.56% of the total) were used to test their predictive performance on new unseen data.
Results: The best performance was obtained by the extreme gradient boosting trees (XGBoost) algorithm, with a very high AUC of 0.97 and an F1-score of 0.55. The 5% of births with the highest predicted risk of neonatal death included more than 90% of the actual neonatal deaths. On the other hand, there were no deaths among the 5% of births with the lowest predicted risk. There were no significant differences in predictive performance for vulnerable subgroups. Using a smaller number of variables (WHO’s five minimum perinatal indicators) decreased overall performance, but the results remained high (AUC of 0.91). With the addition of only three more variables, we achieved the same predictive performance (AUC of 0.97) as when using all 23 variables originally available from the Brazilian birth records.
Conclusion: Machine learning algorithms were able to identify with very high predictive performance the neonatal mortality risk of newborns using only routinely collected data.

Scientific Article
Predictors of tooth loss: A machine learning approach (2021)
Elani, Hawazin W.; ANDRE FILIPE DE MORAES BATISTA; W. Murray Thomson; Kawachi, Ichiro; Chiavegatto Filho, Alexandre D. P.
Introduction: Little is understood about the socioeconomic predictors of tooth loss, a condition that can negatively impact an individual’s quality of life. The goal of this study is to develop a machine-learning algorithm to predict complete and incremental tooth loss among adults and to compare the predictive performance of these models.
Methods: We used data from the National Health and Nutrition Examination Survey from 2011 to 2014. We developed multiple machine-learning algorithms and assessed their predictive performances by examining the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and positive and negative predictive values.
Results: The extreme gradient boosting trees presented the highest performance in the prediction of edentulism (AUC = 88.7%; 95% CI: 87.1, 90.2), the absence of a functional dentition (AUC = 88.3%; 95% CI: 87.3, 89.3), and missing any tooth (AUC = 83.2%; 95% CI: 82.0, 84.4). Although, as expected, age and routine dental care emerged as strong predictors of tooth loss, the machine learning approach identified additional predictors, including socioeconomic conditions. Indeed, models incorporating socioeconomic characteristics were better at predicting tooth loss than those relying on clinical dental indicators alone.
Conclusions: Future application of machine-learning algorithms, with longitudinal cohorts, for identification of individuals at risk for tooth loss could help clinicians prioritize interventions directed toward the prevention of tooth loss.

Scientific Article
Cause-specific mortality prediction in older residents of São Paulo, Brazil: a machine learning approach (2021)
Nascimento, Carla Ferreira do; Hellen Geremias dos Santos; ANDRE FILIPE DE MORAES BATISTA; Lay, Alejandra Andrea Roman; Duarte, Yeda Aparecida Oliveira
Background: Population ageing has been increasing at a remarkable rate in developing countries. In this scenario, preventive strategies could help to decrease the burden of higher demands for healthcare services. Machine learning algorithms have been increasingly applied to identify priority candidates for preventive actions, presenting better predictive performance than traditional parsimonious models.
Methods: Data were collected from the Health, Well Being and Aging (SABE) Study, a representative sample of older residents of São Paulo, Brazil. Machine learning algorithms were applied to predict death from diseases of the respiratory system (DRS), diseases of the circulatory system (DCS), neoplasms and other specific causes within 5 years, using socioeconomic, demographic and health features. The algorithms were trained on a random sample of 70% of subjects and then tested on the remaining 30% of unseen data.
Results: The outcome with the highest predictive performance was death from DRS (AUC-ROC = 0.89), followed by the other specific causes (AUC-ROC = 0.87), DCS (AUC-ROC = 0.67) and neoplasms (AUC-ROC = 0.52). The 25% of individuals with the highest predicted risk of mortality from DRS included 100% of the actual cases. The machine learning algorithms with the highest predictive performance were light gradient boosted machine and extreme gradient boosting.
Conclusion: The algorithms had a high predictive performance for DRS, but lower performance for DCS and neoplasms. Mortality prediction with machine learning can improve clinical decisions, especially regarding targeted preventive measures for older individuals.

Scientific Article
Data Leakage in Health Outcomes Prediction With Machine Learning. Comment on “Prediction of Incident Hypertension Within the Next Year: Prospective Study Using Statewide Electronic Health Records and Machine Learning” (2021)
Chiavegatto Filho, Alexandre; ANDRE FILIPE DE MORAES BATISTA; Santos, Hellen Geremias dos

Scientific Article
A multipurpose machine learning approach to predict COVID-19 negative prognosis in São Paulo, Brazil (2021)
Fernandes, Fernando Timoteo; Oliveira, Tiago Almeida de; Teixeira, Cristiane Esteves; ANDRE FILIPE DE MORAES BATISTA; Costa, Gabriel Dalla; Chiavegatto Filho, Alexandre Dias Porto
The new coronavirus disease (COVID-19) is a challenge for clinical decision-making and the effective allocation of healthcare resources. An accurate prognostic assessment is necessary to improve survival of patients, especially in developing countries. This study proposes to predict the risk of developing critical conditions in COVID-19 patients by training multipurpose algorithms. We followed a total of 1040 patients with a positive RT-PCR diagnosis for COVID-19 at a large hospital in São Paulo, Brazil, from March to June 2020, of whom 288 (28%) presented a severe prognosis, i.e. Intensive Care Unit (ICU) admission, use of mechanical ventilation or death. We used routinely-collected laboratory, clinical and demographic data to train five machine learning algorithms (artificial neural networks, extra trees, random forests, catboost, and extreme gradient boosting). We used a random sample of 70% of patients to train the algorithms, and the remaining 30% were left for performance assessment, simulating new unseen data.
To assess whether the algorithms could capture general severe prognostic patterns, each model was trained by combining two out of three outcomes to predict the other. All algorithms presented very high predictive performance (average AUROC of 0.92, sensitivity of 0.92, and specificity of 0.82). The three most important variables for the multipurpose algorithms were the ratio of lymphocytes to C-reactive protein, C-reactive protein and the Braden Scale. The results highlight the possibility that machine learning algorithms are able to predict nonspecific negative COVID-19 outcomes from routinely-collected data.

Scientific Article
An investigation of the distribution of gaze estimation errors in head mounted gaze trackers using polynomial functions (2018)
Mardanbegi, Diako; ANDREW TOSHIAKI NAKAYAMA KURAUCHI; Morimoto, Carlos H.
Second order polynomials are commonly used for estimating the point-of-gaze in head-mounted eye trackers. Studies in remote (desktop) eye trackers show that although some non-standard 3rd order polynomial models could provide better accuracy, high-order polynomials do not necessarily provide better results. Unlike remote setups, however, where gaze is estimated over a relatively narrow field-of-view surface (e.g. less than 30x20 degrees on typical computer displays), head-mounted gaze trackers (HMGT) are often expected to cover a relatively wider field-of-view to make sure that the gaze is detected in the scene image even for extreme eye angles. In this paper we investigate the behavior of the gaze estimation error distribution throughout the image of the scene camera when using polynomial functions. Using simulated scenarios, we describe the effects of four different sources of error: interpolation, extrapolation, parallax, and radial distortion.
We show that the use of third order polynomials results in more accurate gaze estimates in HMGT, and that the use of wide angle lenses might be beneficial in terms of error reduction.

Scientific Article
Identification of alterations associated with age in the clustering structure of functional brain networks (2018)
Guzman, Grover E. C.; Sato, Joao R.; MACIEL CALEBE VIDAL; Fujita, Andre
Initial studies using resting-state functional magnetic resonance imaging on the trajectories of the brain network from childhood to adulthood found evidence of functional integration and segregation over time. Understanding how functional integration and segregation occur in healthy individuals is crucial to enhance our understanding of possible deviations that may lead to brain disorders. Recent approaches have focused on the framework wherein the functional brain network is organized into spatially distributed modules that have been associated with specific cognitive functions. Here, we tested the hypothesis that the clustering structure of brain networks evolves during development. To address this hypothesis, we defined a measure of how well a brain region is clustered (network fitness index), and developed a method to evaluate its association with age. Then, we applied this method to a functional magnetic resonance imaging data set composed of 397 males under 31 years of age collected as part of the Autism Brain Imaging Data Exchange Consortium. As a result, we identified two brain regions for which the clustering changes over time, namely, the left middle temporal gyrus and the left putamen.
Since the network fitness index is associated with both integration and segregation, our finding suggests that the identified brain regions play a role in the development of brain systems.

Scientific Article
Granger Causality among Graphs and Application to Functional Brain Connectivity in Autism Spectrum Disorder (2021)
Ribeiro, Adèle Helena; MACIEL CALEBE VIDAL; Sato, João Ricardo; Fujita, André
Graphs/networks have become a powerful analytical approach for data modeling. Moreover, with the advances in sensor technology, dynamic time-evolving data have become more common. In this context, one point of interest is a better understanding of the information flow within and between networks. Thus, we aim to infer Granger causality (G-causality) between networks’ time series. In this case, the straightforward application of the well-established vector autoregressive model is not feasible. Consequently, we require a theoretical framework for modeling time-varying graphs. One possibility would be to consider a mathematical graph model with time-varying parameters (assumed to be random variables) that generates the network. If we identify G-causality between the graph models’ parameters, we could use it to define G-causality between graphs. Here, we show that even if the model is unknown, the spectral radius is a reasonable estimate of some random graph model parameters. We illustrate our proposal’s application to study the relationship between brain hemispheres of controls and children diagnosed with Autism Spectrum Disorder (ASD). We show that the G-causality intensity from the brain’s right to the left hemisphere is different between ASD and controls.

Scientific Article
Estimation of the tissue composition of the tumour mass in neuroblastoma using segmented CT images (2004)
FABIO JOSE AYRES; M. K. Zuffo; Rangayyan, R. M.; Boag, G. S.; O. Filho, V.; Valente, M.
Neuroblastoma is the most common extra-cranial, solid, malignant tumour in children.
Advances in radiology have made possible the detection and staging of the disease. Nevertheless, there is no method available at present that can go beyond detection and qualitative analysis, towards quantitative assessment of the tissue composition of the primary tumour mass in neuroblastoma. Such quantitative analysis could provide important information and serve as a decision-support tool for the radiologist and the oncologist, result in better treatment and follow-up, and even lead to the avoidance of delayed surgery. The problem investigated was the improvement of the analysis of the primary tumour mass, in patients with neuroblastoma, using X-ray computed tomography (CT) images. A methodology was proposed for the estimation of the tissue content of the mass: it comprised a Gaussian mixture model for estimation, from segmented CT images, of the tissue composition of the primary tumour. To demonstrate the potential of the method, results are presented of its application to ten CT examinations of four patients. The method provides quantitative information, and it was observed that the tumour in one of the patients reduced from 523 cm³ to 81 cm³ in volume, with an increase in calcification from about 20% to about 88% of the tumour volume, in response to chemotherapy over a period of five months. Results indicate that the proposed technique may be of considerable value in assessing the response to therapy of patients with neuroblastoma.

Scientific Article
Gabor filters and phase portraits for the detection of architectural distortion in mammograms (2006)
Rangayyan, Rangaraj M.; FABIO JOSE AYRES
Architectural distortion is a subtle abnormality in mammograms, and a source of overlooking errors by radiologists.
Computer-aided diagnosis (CAD) techniques can improve the performance of radiologists in detecting masses and calcifications; however, most CAD systems have not been designed to detect architectural distortion. We present a new method to detect and localise architectural distortion by analysing the oriented texture in mammograms. A bank of Gabor filters is used to obtain the orientation field of the given mammogram. The curvilinear structures (CLS) of interest (spicules and fibrous tissue) are separated from confounding structures (pectoral muscle edge, parenchymal tissue edges, breast boundary, and noise). The selected core CLS pixels and the orientation field are filtered and downsampled, to reduce noise and also to reduce the computational effort required by the subsequent methods. The downsampled orientation field is analysed to produce three phase portrait maps: node, saddle, and spiral. The node map is further analysed in order to detect the sites of architectural distortion. The method was tested with 19 mammograms containing architectural distortion. In a preliminary experiment, a sensitivity of 84% was obtained at 7.8 false positives per image.
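
The first step in the abstract above, using a bank of Gabor filters to obtain an orientation field, can be illustrated with a minimal sketch. This is not the authors' implementation: the kernel size, sigma, wavelength, number of angles, and the function names `gabor_kernel`, `filter_response` and `dominant_orientation` are all assumptions chosen for illustration, and the sketch uses only an even-symmetric (cosine) kernel evaluated at a single pixel of a synthetic striped image.

```python
import math

def gabor_kernel(theta, size=9, sigma=2.0, wavelength=4.0):
    """Real (cosine) Gabor kernel whose sinusoidal carrier varies along theta.

    All parameter defaults are illustrative assumptions, not published values.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Position measured along the carrier direction theta.
            along = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * along / wavelength))
        kernel.append(row)
    # Subtract the mean so that a flat image region gives a zero response.
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def filter_response(image, kernel, cy, cx):
    """Magnitude of the kernel's correlation with the image around (cy, cx)."""
    half = len(kernel) // 2
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += image[cy + dy][cx + dx] * kernel[dy + half][dx + half]
    return abs(total)

def dominant_orientation(image, cy, cx, n_angles=8):
    """Carrier angle (degrees, in [0, 180)) with the strongest response.

    The returned angle is the direction along which intensity oscillates,
    i.e. the normal to the stripes, not the direction of the stripes.
    """
    best_deg, best_resp = 0.0, -1.0
    for k in range(n_angles):
        kernel = gabor_kernel(math.pi * k / n_angles)
        resp = filter_response(image, kernel, cy, cx)
        if resp > best_resp:
            best_deg, best_resp = 180.0 * k / n_angles, resp
    return best_deg

# Synthetic 17x17 test images: intensity oscillates along x or along y.
vertical_stripes = [[math.cos(2.0 * math.pi * x / 4.0) for x in range(17)]
                    for _ in range(17)]
horizontal_stripes = [[math.cos(2.0 * math.pi * y / 4.0) for _ in range(17)]
                      for y in range(17)]

print(dominant_orientation(vertical_stripes, 8, 8))
print(dominant_orientation(horizontal_stripes, 8, 8))
```

Repeating `dominant_orientation` at every pixel would give a coarse orientation field. A fuller treatment would typically also handle carrier phase (for example with a complex or quadrature kernel pair) and image borders, both of which this sketch leaves out.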
