Publications

Analysis of Protein-Protein Interactions networks and cross-species transfer learning comparison for seven organisms

Abstract

Motivation: Protein-protein interactions (PPIs) can be used for many applications, such as inferring protein functions or supporting the drug discovery process. For the human species, there is a large amount of validated information and functional annotation for the proteins in its interactome. In other species, the known interactome is much smaller than the human one, and many proteins have few or no annotations by specialists. Understanding the interactomes of other species helps to trace evolutionary characteristics, compare important biological processes, and build interactomes for new organisms from more closely related organisms instead of relying only on the human interactome.
Results: In this study, we evaluate the performance of the PredPrIn workflow in predicting the interactomes of seven organisms in terms of scalability and precision, showing that PredPrIn achieves more than 70% precision and takes less than three days even on the largest datasets. We performed a transfer learning analysis, predicting each organism's interactome from each of the other organisms, and then showed an implication for their evolutionary relationship in terms of the number of ortholog proteins shared between these organisms. We also present a functional enrichment analysis showing the proportion of shared annotations between the predicted positive and false interactions, and an extraction of topological features of each organism's interactome, such as proteins acting as hubs and as bridges between modules. For each organism, one of the most frequent biological processes was selected, and the proteins and pairs present in it were compared, in terms of quantity, between the interactome available in the HINT database for that organism and the one predicted by PredPrIn. In this comparison, we showed that we covered the proteins and pairs covered in HINT and also enriched these processes for almost all organisms.
Conclusions: In this work, we have shown the efficiency of the PredPrIn workflow for protein interaction prediction in seven different organisms using scalability, performance and transfer learning analyses. We have also made cross-species interactome comparisons showing the most frequent biological processes for each organism, as well as the topological features of each organism's interactome, showing consistency with hypotheses about biological networks. Finally, we described the enrichment made by PredPrIn in selected biological processes, showing that its predictions were important to enhance information about these organisms' interactomes.

Author: Yasmmin C Martins

Date Published: 7th Jun 2023

Publication Type: Journal Article

DOI: 10.1101/2023.06.05.543725

Citation: biorxiv;2023.06.05.543725v1,[Preprint]

Created: 23rd Oct 2023 at 15:23, Last updated: 23rd Oct 2023 at 15:24

DSCrank: A Method for Selection and Ranking of Datasets

yPublish - Bioinfo tools

Abstract

Considerable efforts have been made to build the Web of Data. One of the main challenges has to do with how to identify the most related datasets to connect to. Another challenge is to publish a local …

Authors: Yasmmin Cortes Martins, Fábio Faria da Mota, Maria Cláudia Cavalcanti

Date Published: 2016

Publication Type: Journal Article

DOI: 10.1007/978-3-319-49157-8_29

Citation: Metadata and Semantics Research 672:333-344,Springer International Publishing

Created: 23rd Oct 2023 at 14:59, Last updated: 23rd Oct 2023 at 15:04

EpiCurator: an immunoinformatic workflow to predict and prioritize SARS-CoV-2 epitopes

yPublish - Bioinfo tools

Abstract

The ongoing coronavirus 2019 (COVID-19) pandemic, triggered by the emerging SARS-CoV-2 virus, represents a global public health challenge. Therefore, the development of effective vaccines is an urgent …

Authors: Cristina S. Ferreira, Yasmmin C. Martins, Rangel Celso Souza, Ana Tereza R. Vasconcelos

Date Published: 2021

Publication Type: Journal Article

DOI: 10.7717/peerj.12548

Citation: PeerJ 9:e12548

Created: 23rd Oct 2023 at 15:04, Last updated: 23rd Oct 2023 at 15:06

Large-Scale Protein Interactions Prediction by Multiple Evidence Analysis Associated With an In-Silico Curation Strategy

yPublish - Bioinfo tools

Abstract

Predicting the physical or functional associations through protein-protein interactions (PPIs) represents an integral approach for inferring novel protein functions and discovering new drug targets …

Authors: Yasmmin Côrtes Martins, Artur Ziviani, Marisa Fabiana Nicolás, Ana Tereza Ribeiro de Vasconcelos

Date Published: 6th Sep 2021

Publication Type: Journal Article

DOI: 10.3389/fbinf.2021.731345

Citation: Front. Bioinform. 1,731345

Created: 23rd Oct 2023 at 15:13, Last updated: 23rd Oct 2023 at 15:16

Multi-task analysis of gene expression data on cancer public datasets

yPublish - Bioinfo tools

Abstract

Background: Omics, and often multi-omics, cancer datasets are available in public databases such as Gene Expression Omnibus (GEO), the International Cancer Genome Consortium and The Cancer Genome Atlas Program. Most of these databases provide at least the gene expression data for the samples contained in the project. Multi-omics has been an advantageous strategy to leverage personalized medicine, but few works explore strategies to extract knowledge relying only on gene expression levels for decisions on tasks such as disease outcome prediction and drug response simulation. The models and information acquired in projects based only on expression data could provide decision-making background for future projects that have other levels of omics data, such as DNA methylation or miRNAs.
Results: We extended previous methodologies to predict disease outcome from the combination of protein interaction networks and gene expression profiling by proposing an automated pipeline that performs graph feature encoding and then classifies the outcome of patient networks derived from RNA-Seq. We integrated biological networks from protein interactions and gene expression profiling to assess patient specificity, combining the treatment/control ratio with the patient normalized counts of the differentially expressed genes. We also tackled disease outcome prediction from the gene set enrichment perspective, combining gene expression with pathway gene set information as the feature source for this task. We also explored the drug response perspective of cancer, evaluating the relationship between gene expression profiling and single sample gene set enrichment analysis (ssGSEA), and proposing a workflow to perform drug response screening according to the patient's enriched pathways.
Conclusion: We showed the importance of patient network modeling for the clinical task of disease outcome prediction using a graph kernel matrix strategy, and showed how ssGSEA improved the prediction using only transcriptomic data combined with pathway scores. We also presented a detailed screening analysis showing the impact of pathway-based gene sets and normalization types on the drug response simulation. We deployed two fully automated screening workflows following the FAIR principles for the disease outcome prediction and drug response simulation tasks.

Author: Yasmmin Martins

Date Published: 28th Sep 2023

Publication Type: Journal Article

DOI: 10.1101/2023.09.27.23296213

Citation: medrxiv;2023.09.27.23296213v1,[Preprint]

Created: 23rd Oct 2023 at 15:33, Last updated: 23rd Oct 2023 at 15:34

OntoPPI: Towards Data Formalization on the Prediction of Protein Interactions

yPublish - Bioinfo tools

Abstract

The Linking Open Data (LOD) cloud is a global data space for publishing and linking structured data on the Web. The idea is to facilitate the integration, exchange, and processing of data. The LOD cloud …

Authors: Yasmmin Cortes Martins, Maria Cláudia Cavalcanti, Luis Willian Pacheco Arge, Artur Ziviani, Ana Tereza Ribeiro de Vasconcelos

Date Published: 2019

Publication Type: Journal Article

DOI: 10.1007/978-3-030-36599-8_23

Citation: Metadata and Semantic Research 1057:260-271,Springer International Publishing

Created: 23rd Oct 2023 at 15:09, Last updated: 23rd Oct 2023 at 15:12

PPIntegrator: semantic integrative system for protein–protein interaction and application for host–pathogen datasets

yPublish - Bioinfo tools

Abstract

Semantic web standards have shown importance in the last 20 years in promoting data formalization and interlinking between the existing knowledge graphs. In this context, several ontologies and data …

Authors: Yasmmin Côrtes Martins, Artur Ziviani, Maiana de Oliveira Cerqueira e Costa, Maria Cláudia Reis Cavalcanti, Marisa Fabiana Nicolás, Ana Tereza Ribeiro de Vasconcelos

Date Published: 2023

Publication Type: Journal Article

DOI: 10.1093/bioadv/vbad067

Citation: Bioinformatics Advances 3(1),vbad067

Created: 23rd Oct 2023 at 15:18, Last updated: 23rd Oct 2023 at 15:21

survInTime - Exploring surveillance methods and data analysis on Brazilian respiratory syndrome dataset and community mobility changes

yPublish - Bioinfo tools

Abstract

Background: The COVID-19 pandemic had negative impacts in almost every country in the world. These impacts were observed mainly in the public health sphere, with a rapid rise and spread of the disease and failed attempts to restrain it while there was no treatment. However, in developing countries, the impacts were also severe in other aspects, such as the intensification of social inequality, poverty and food insecurity. Specifically in Brazil, miscommunication among the levels of government led the control measures into complete chaos in a country of continental dimensions. Brazil made an effort to register granular, informative data about the case reports and their outcomes. While this data is available and can be consumed freely, there are issues concerning its integrity and inconsistencies between the real number of cases and the number of notifications in the dataset.
Results: We designed and implemented four types of analysis to explore the Brazilian public dataset of Severe Acute Respiratory Syndrome notifications (srag dataset) and the Google dataset of community mobility change (mobility dataset). These analyses provide a diagnosis of data integration issues, strategies to integrate the data, and experiments in surveillance analysis. The first type of analysis aims at describing and exploring the data contained in both datasets, starting by assessing data quality with respect to missing data and then summarizing the patterns found in these datasets. The second type is a statistical experiment to estimate the cases from mobility patterns organized in periods of time. As the third analysis type, we developed an algorithm to help understand the disease waves by detecting them and comparing the time periods across cities. Lastly, we built time series datasets considering deaths, overall cases and residential mobility change in regular time periods, and used them as features to group cities with similar behavior.
Conclusion: The exploratory data analysis showed the underrepresentation of COVID-19 cases in many small cities in Brazil, which were either absent from the srag dataset or had a number of cases much lower than real projections. We also assessed the availability of data for Brazilian cities in the mobility dataset in each state, finding that not all states were represented and that the best coverage occurred in Rio de Janeiro state. We compared the capacity of combinations of mobility change across place categories to estimate the number of cases, measuring the errors and identifying the best mobility components that could affect the cases. In order to target specific strategies for groups of cities, we compared strategies to cluster cities with similar outcome behavior over time, highlighting the divergence in handling the disease.

Authors: Yasmmin Côrtes Martins, Ronaldo Francisco da Silva

Date Published: 27th Sep 2023

Publication Type: Journal Article

DOI: 10.1101/2023.09.26.559599

Citation: biorxiv;2023.09.26.559599v1,[Preprint]

Created: 23rd Oct 2023 at 15:30, Last updated: 23rd Oct 2023 at 15:32

The impact of non-lineage defining mutations in the structural stability for variants of concern of SARS-CoV-2

yPublish - Bioinfo tools

Abstract

Motivation: The identification of the most important mutations that lead to structural and functional changes in highly transmissible virus variants is essential to understand the impacts and the …

Authors: Yasmmin Martins, Ronaldo Francisco da Silva

Date Published: 22nd Jun 2023

Publication Type: Journal Article

DOI: 10.1101/2023.06.22.546079

Citation: biorxiv;2023.06.22.546079v1,[Preprint]

Created: 23rd Oct 2023 at 15:25, Last updated: 23rd Oct 2023 at 15:28
