Introduction
Diabetes mellitus type 2 (T2DM) is a major global health problem. It is a multifactorial disease involving both lifestyle and genetic factors.1 There were 451 million people with diabetes worldwide in 2017, and this number is expected to increase to 693 million by 2045, as projected by the International Diabetes Federation (IDF).2 The American Diabetes Association (ADA) defines T2DM as a gradual loss of insulin secretion by beta cells against a background of insulin resistance.3 T2DM accounts for 90-95% of all cases of diabetes mellitus.3 Obesity is recognized as the most significant component in the pathogenesis of T2DM and impaired glucose homeostasis.4 Taken together, high body mass index (BMI) and impaired insulin action are suggested as the etiology behind glucose intolerance, yet many uncertainties remain. One possible explanation is the elevated level of free fatty acids (FFA) observed in both insulin-resistant patients and obese individuals.5, 6, 7 FFA have been found to alter various cellular processes, including cell survival, membrane fluidity, cell signalling, enzymatic function, ion homeostasis, transcription, and translation.8
An animal study in a fat-fed dog model demonstrated acquired insulin resistance due to high FFA (approximately 1 mM) in both nocturnal and basal stages, a level high enough to cause cellular stress in vitro.9 Such findings support the view that high FFA induces insulin resistance. Similarly, Muoio et al.10 and Guilherme et al.11 demonstrated that animals and humans with increased FFA have reduced insulin action. Moreover, multiple studies have shown FFA-induced endoplasmic reticulum (ER) stress in animal models12 and tissue cultures.13 This sheds light on the association between ER stress responses, insulin resistance, and obesity.12, 14 Notably, palmitate, the most common long-chain saturated fatty acid, induces ER stress at concentrations comparable to the total FFA levels seen in individuals with obesity.15, 16
Despite the improvements achieved over the past decades in managing the chronic complications of T2DM, gaps remain, with noticeable associated morbidity and mortality. Life expectancy is reduced by 6 years, and by 12 years for those diagnosed early.17 In the RISE study,18 patients with early T2DM who were randomly assigned to metformin, metformin with insulin, or a glucagon-like peptide 1 (GLP-1) analogue plus metformin exhibited an improvement in beta-cell function. However, the beneficial effects had waned three months after stopping treatment. This suggests that, although a wide range of drugs is available for treating T2DM, none has convincingly been shown to reverse the decline in beta-cell function over time. Therefore, it is essential to investigate alternative approaches for better treatment choices.
Embryonic development involves multiple transcription factors that regulate the formation of beta cells. Strikingly, many of these transcription factors are also important for the function of pancreatic cells during adulthood (they are well expressed in both the exocrine and endocrine pancreas).19 The zinc finger GATA family of transcription factors is essential for organogenesis in both mice and humans.20, 21 For instance, heterozygous mutations of GATA6 are recognized as the most frequent cause of pancreatic agenesis.21, 22 Furthermore, GATA6 plays a redundant yet vital role in pancreas formation. Of interest, a single inactivation of GATA6 in pancreatic progenitor cells has no significant effect on organogenesis. However, a redundant and constant inactivation of GATA6 can cause agenesis of the pancreas due to deficiencies and alterations in the proliferation and differentiation of pancreatic progenitor cells.20, 21, 22, 23 This suggests that chronic exposure to stressors may increase susceptibility to GATA6 inhibition.
Additionally, evidence on palmitate builds on what is already known and emphasizes the ability of FFA to adversely alter islet cell function. Of particular interest is the capability of palmitate to inhibit the genes controlling the beta-cell phenotype.24 An example of these genes is the transcription factor GATA6, whose inhibition causes beta-cell apoptosis, the so-called lipotoxicity phenomenon.25 Based on studies by Cnop et al., Laybutt et al., and Eizirik et al., palmitate can induce ER stress in beta cells that is either deleterious or adaptive, depending on the type and degree of activation.24, 26, 27 This, in turn, provides evidence that palmitate can trigger the unfolded protein response. Notably, TXNIP (thioredoxin-interacting protein) is up-regulated in hyperglycaemia, where it is considered a significant cause of apoptosis; in contrast, it is down-regulated by palmitate.28, 29 Together, these findings build on evidence that glucotoxicity and lipotoxicity affect the survival and function of islet beta cells.
GATA6 emerges as a significant transcription factor that plays an intrinsic role in islet beta-cell function, yet this role remains largely uncharted. In this paper, we sought to investigate palmitate-mediated ER stress and lipotoxicity in islet cells by exploring the islet transcriptome on a novel molecular basis, using bioinformatics analysis in conjunction with previous experimental, evidence-based studies. Furthermore, we attempted to highlight a biomarker profile that could provide potentially reliable candidates for reaching new insights into the molecular mechanisms underlying disease pathogenesis.
Research Design and Methodologies
The current study uses experimental data published in previous studies. Here, the authors provide a detailed analysis of these data using bioinformatics tools to assess the validity of previously reported differential expression data and hence validate our gene of interest. The methods of the previous studies are described briefly.
Cell culture and islet cell isolation
Human pancreatic islets
Human islets were collected and isolated in Pisa, Italy, as previously reported.30 In brief, isolation was achieved by collagenase digestion followed by density gradient purification. Islets were obtained from heart-beating organ donors who had no medical history of T2DM or other metabolic diseases. The characteristics of the donors are shown in Table 1. For 1-5 days after isolation, the islets were kept in M199 culture medium containing 5.5 mmol/L glucose before shipment to Brussels. Upon arrival, the cells were cultured in Ham's F-10 medium containing 6.1 mmol/L glucose and supplemented with 50 units/ml penicillin, 50 mg/ml streptomycin, 50 mmol/L 3-isobutyl-1-methylxanthine, 2 mmol/L GlutaMAX, 1% charcoal-absorbed BSA, and 10% heat-inactivated FBS. The cells were also evaluated for purity by insulin immunocytochemistry, as described.31 In line with previous studies,24, 32, 33, 34 one group of islet cells was exposed to 0.5 mmol/L palmitate for 48 hours, whereas the other group was not exposed. In brief, before exposure, palmitate was dissolved in 90% ethanol and diluted 1:1000 to a final concentration of 0.5 mmol/L, with a palmitate:BSA molar ratio of 3.4 based on a previous calculation provided by Cnop et al.25 The control condition contained an identical ethanol dilution and 1% charcoal-absorbed BSA.
Rat cells
Rat insulin-producing cells (INS-1E cells) and islets were cultured and exposed to palmitate as previously described.33 Briefly, rat INS-1E cells were cultured for 2 days in RPMI 1640 medium containing 5% FBS, 50 mmol/L 2-mercaptoethanol, 1 mmol/L Na-pyruvate, and 10 mmol/L HEPES. The cells were then harvested for palmitate exposure. Islets were isolated from adult male Wistar rats (Brussels, Belgium) by collagenase digestion and then picked manually under a stereomicroscope. Beta cells were obtained by dispersing the islets and purifying the cells by autofluorescence-activated cell sorting.
RNA-Seq and data analysis
RNA from five preparations of human islets was sequenced and analyzed as previously reported.24, 31 In summary, the RNeasy Mini Kit was used to purify polyA-selected mRNA from the total isolated RNA. The purified mRNA was reverse-transcribed to cDNA. cDNA fragments of 200 bp were then amplified, and the products were run on an Agilent 2100 Bioanalyzer (Agilent Technologies, Wokingham, UK) for quality control. The RNA integrity number (RIN) was greater than 7.5 for all samples. The cDNA was sequenced on one lane of the Illumina Genome Analyzer II System. Raw data were deposited in the Gene Expression Omnibus (GEO; accession number GSE53949). GEM mapper (Genomic Multi-tool; GEM: http://gemlibrary.sourceforge.net) was used to map the paired-end reads to the human genome. The mapped reads were then used to quantify transcripts from the RefSeq reference database35 with the aid of Flux Capacitor (online: http://flux.sammeth.net).36 Gene and transcript expression was measured in reads per kilobase of exon model per million mapped reads (RPKM).37 Lists of expressed genes and transcription factors were generated from the Flux Capacitor output using scripts in Perl or R.
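For readers unfamiliar with the RPKM unit, the following is a minimal sketch of the standard definition (read count scaled by exon length in kilobases and by sequencing depth in millions of mapped reads). It is not the Flux Capacitor implementation used in the original study; the function name `rpkm` and the numerical values are hypothetical and serve only as an illustration.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads.

    RPKM = reads * 1e9 / (exon length in bp * total mapped reads)
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical values, chosen only for illustration:
# 500 reads on a 2000 bp exon model in a library of 10 million mapped reads.
example_rpkm = rpkm(read_count=500, exon_length_bp=2000, total_mapped_reads=10_000_000)
print(example_rpkm)
```

Normalising by both length and depth is what makes values comparable between genes of different size and between islet preparations sequenced to different depths.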
Genes modified by palmitate were identified by computing the log2 of the ratio between the summed RPKM values in the palmitate and control conditions. Fisher's exact test was applied with Benjamini-Hochberg correction (treating the five samples of each gene as independent tests), and a difference in the expression of a given gene was considered significant only if the corrected P value was less than 0.05. A gene was defined as palmitate-modified if its expression changed significantly in one direction in at least four of the five islet preparations and no significant change was observed in the opposite direction.
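The selection rule described above can be sketched in a few lines of Python. This is an illustrative outline only, not the code of the original study: it assumes that P values from Fisher's exact test are already available, it reads the threshold as at least four of five preparations, and all function names and example numbers are hypothetical.

```python
import math


def log2_ratio(rpkm_palmitate, rpkm_control):
    """log2 of the ratio between summed RPKM values (palmitate / control)."""
    return math.log2(sum(rpkm_palmitate) / sum(rpkm_control))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def is_palmitate_modified(log2_ratios, adjusted_p, alpha=0.05, min_preps=4):
    """Apply the one-direction rule across islet preparations."""
    up = sum(1 for r, p in zip(log2_ratios, adjusted_p) if p < alpha and r > 0)
    down = sum(1 for r, p in zip(log2_ratios, adjusted_p) if p < alpha and r < 0)
    return (up >= min_preps and down == 0) or (down >= min_preps and up == 0)


# Hypothetical per-preparation values for one gene (five islet preparations).
ratios = [-1.2, -0.8, -1.0, -0.5, 0.1]
adj_p = [0.01, 0.02, 0.03, 0.04, 0.50]
print(is_palmitate_modified(ratios, adj_p))
```

In this hypothetical case the gene is down-regulated significantly in four preparations and shows no significant change in the opposite direction, so it would be classed as palmitate-modified.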
Furthermore, GENCODE version 16 annotations were used for splicing analysis.38, 39 Differences in splicing indices were compared between the palmitate and control conditions. The GENCODE data set comprises 153008 transcripts that correspond to 25492 protein-coding genes and noncoding RNA.
Independent validation of expression data
Searching methodology
The transcriptome reported in previous studies was obtained for further evaluation and validation. An extensive search was carried out to identify multiple studies with different research methodologies. The search also aimed to compile distinct datasets derived from different subject groups under study to ensure more reliable results (e.g., cells and subject groups other than islets derived from organ donors with no background of metabolic disorders or diabetes mellitus, as reported in the study by Cnop et al.24).
To maximize the reliability of data retrieval, a systematic search was performed covering well-established databases, including PubMed, Google Scholar, Science Direct, the databases and journals available through NTU student access, and top-cited articles in medical journals. The scope of this study is GATA6 inhibition mediated by palmitate-induced ER stress and lipotoxicity. Therefore, the first step was to establish a clear search strategy, beginning with a scientific definition of the research topic. This definition narrows the targets to those that serve the aims and objectives of the study. To ensure an efficient search, keywords were chosen to reflect the scope of the study and to fit the inclusion and exclusion criteria and topic analysis, for instance, 'T2DM', 'metabolic stress', 'obesity and metabolic stress', 'FFA and metabolic stress', 'palmitate and metabolic stress', 'palmitate and beta-cell dysfunction', 'palmitate mediated metabolic stress', 'palmitate mediated apoptosis or lipotoxicity', 'pancreatic beta cells', 'saturated fatty acids', 'ER stress', 'expression data and epigenetics', 'RNA-sequencing and microarray', 'bioinformatics analysis', 'GATA6 and metabolic disorders', 'palmitate and GATA6', 'GATA6 inhibition by palmitate', and 'GATA6 association on beta-cell function and survival'. Finally, a filtering step was applied to select the studies most relevant to our aims and objectives.
Establishing a Relevance Between Human and Rat
The Basic Local Alignment Search Tool (BLAST)40 compares protein or nucleotide sequences against available sequence databases and calculates the statistical significance of the matches. Here, it was used to determine the genetic similarity between human GATA6 and rat (Rattus norvegicus) GATA6. The NCBI Gene database41 was searched for GATA6 of Rattus norvegicus, and its nucleotide sequence was obtained (NCBI reference sequence: NM_019185.2). The FASTA sequence of Rattus norvegicus GATA6 was entered into the BLASTn program. The search set was restricted to human RefSeqGene sequences (RefSeq-Gene), and the megablast program was selected to retrieve highly similar sequences. The relevancy is reported as the percentage of identity and the E value. The GATA6 nucleotide sequences in FASTA format for human (NCBI Reference Sequence: NG_032677.2) and Rattus norvegicus (NCBI Reference Sequence: NM_019185.2) are available in supplementary file 2 and supplementary file 3, respectively.
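To make the reported metric concrete, the percentage of identity of an alignment can be illustrated as the number of identical aligned positions divided by the alignment length, gap columns included. The sketch below is not BLAST itself and performs no alignment; it assumes two already-aligned strings of equal length with '-' marking gaps, and the toy sequences are hypothetical rather than taken from GATA6.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Both inputs must already be aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment (not GATA6 sequence): 10 columns, 8 identical.
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTACCTA")
print(toy_identity)
```

The E value reported alongside the identity is a separate statistic that depends on the database size and scoring scheme, and it is not reproduced by this sketch.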
Overall STRING analysis
Two inputs are required to generate an interaction network between two or more proteins in STRING: a list of proteins and an organism name. On the main page of the STRING database, the search can be set for a single protein or for multiple proteins. Protein names or identifiers are entered in the 'List of names' field, and the organism of interest, such as Homo sapiens, is selected in the 'Organisms' field. Once the results are generated, the protein-protein interaction network is displayed. The 'Analysis' option at the top of the STRING website allows the visualization and interpretation of Gene Ontology, protein-protein interactions, pathway enrichment analysis, protein domains, annotated keywords analysis, and STRING network status. At the bottom, a 'statistical background' option enables users to set the background against which enrichment is assessed: the whole genome, the druggable genome, or kinases and associates. The current study used the whole genome as the background for enrichment analysis.
Gene ontology analysis (GO)
PANTHER GO
The GATA6 gene ontology was analyzed through PANTHER (Protein Analysis Through Evolutionary Relationships), which allows high-throughput analysis. The PANTHER classification is based on complex bioinformatics algorithms and human curation.42, 43 In PANTHER, Homo sapiens was selected to obtain human GATA6 data, which were analyzed for molecular function (MF), biological process (BP), cellular component (CC), protein class, and pathway. GO results are presented in Figure 1.
String GO
The STRING tool was used for gene ontology enrichment analysis of different modules, providing molecular detail on the relationships between GATA6 and differentially expressed proteins. The enrichment results are displayed by biological process, molecular function, and cellular component. Enrichment was considered significant at a false discovery rate (FDR) of less than 0.05.
Protein to protein interactions network integration
The predictive protein-protein interaction network was obtained through the STRING database tool. All associations between GATA6 and other molecules were retrieved with a confidence interaction score. The minimum required interaction score was set in the STRING settings to medium confidence (0.400). The network type was set to 'full' to indicate both physical and functional protein associations. Furthermore, the active interaction sources were restricted to text mining, co-expression, neighbourhood, gene fusion, and co-occurrence. Interactions based on high-throughput experiments and databases were omitted in most of the protein-protein analyses in order to highlight novel interactions based more on predicted gene similarity.
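The effect of the medium-confidence cut-off can be illustrated with a short sketch that keeps only interactions scoring at least 0.400. This does not call the STRING service or reproduce its scoring; the `Interaction` type and the partner names prefixed with HYPOTHETICAL are assumptions made for illustration, and only the BMP2 score of 0.477 is taken from the results reported later in this paper.

```python
from typing import List, NamedTuple


class Interaction(NamedTuple):
    partner: str   # protein predicted to interact with GATA6
    score: float   # confidence interaction score between 0 and 1


def filter_by_confidence(interactions: List[Interaction],
                         min_score: float = 0.400) -> List[Interaction]:
    """Keep interactions at or above the minimum required interaction score."""
    return [i for i in interactions if i.score >= min_score]


candidates = [
    Interaction("BMP2", 0.477),             # score reported in this study
    Interaction("HYPOTHETICAL_A", 0.350),   # made-up entry below the cut-off
    Interaction("HYPOTHETICAL_B", 0.912),   # made-up entry above the cut-off
]

kept = filter_by_confidence(candidates)
print([i.partner for i in kept])
```

Raising `min_score` (for example to a high-confidence threshold) would shrink the network further, which is the trade-off between coverage and specificity that the threshold controls.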
Pathway enrichment analysis
A predictive pathway analysis was conducted via the STRING tool using two pathway resources: the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome (RCTM). KEGG pathway analysis supports the understanding of high-level functions and utilities of biological systems, such as cells, organisms, and ecosystems, from a molecular perspective. It draws on large-scale molecular datasets generated by genome sequencing and high-throughput experiments. The RCTM pathway database was used because of its ability to group complex biological interactions into pathways. As the name implies, the RCTM database is based on reactions. For both pathway enrichment analyses, a value of less than 0.05 was taken to indicate statistical significance.
Statistical analysis
Experimental and expression data retrieved from previous studies are presented as originally reported. The section 'RNA-Seq and data analysis' explains in detail how the data were analysed. BLAST, STRING, and PANTHER were used to conduct the transcriptomic analysis. Mean values are presented as they were obtained, and a P value of less than 0.05 indicates a statistically significant correlation. A false discovery rate of less than 0.05 was taken to indicate significance. The previous datasets expressed their statistical values using the paired two-tailed Student's t-test or ratio t-test and a nonparametric paired test (Wilcoxon).
Ethical statement
Ethical considerations were taken into account. Procedures involving human and animal subjects were all in accordance with the 1964 Helsinki Declaration. Ethical approval was granted in all previous studies. In brief, the collection and processing of human islet cells were approved by the local ethics committee in Pisa (Italy), and rat handling for animal experiments was approved by the ethics committee in Brussels, Belgium (Université Libre de Bruxelles). The ethics committee also approved the second study, from Lund (Sweden), in which informed consent was obtained from organ donors.
Results
Selection criteria of datasets
Several predetermined inclusion and exclusion criteria were applied when selecting the datasets. Studies were eligible for inclusion if they were experimental and transcriptomic-based, reported gene expression data from RNA-sequencing or microarray analysis, involved different patient cohorts, and were published in the past seven years. Studies were excluded if they used qualitative methodologies, reported no expression outcomes, were not conducted by RNA-seq or microarray analysis, or were more than seven years old. The main criteria for discriminating between inclusion and exclusion were the availability of expression data and the use of different patient groups.
Table 1
Age expressed in mean ± SD, BMI; body mass index, CA; cardiac arrest, T; trauma, CVD; cardiovascular disease, CI; cerebral ischemia, CH; cerebral hemorrhage, NA; not available, SEM; standard error of the mean, FDR; false discovery rate.
GATA6 Gene ontology analysis
PANTHER gene ontology
Human GATA6 was analyzed using PANTHER to define its gene ontology characteristics. According to PANTHER, GATA6 is involved in multiple aspects of human biology. Its molecular function was indicated by molecular function regulator and binding terms. Its biological process was reflected by metabolic, developmental, cellular, and biological regulation processes. The cellular component was presented as anatomical entity and intracellular, and the protein class was gene-specific transcription factor. The pathway category of the gene ontology analysis returned no results, indicating that PANTHER identified no relationships between GATA6 and possible interacting molecules.
STRING gene ontology
The STRING analysis45 highlighted the GO terms enriched among the interactions of GATA6 with other genes. Significantly enriched GO terms were found in biological process (BP), molecular function (MF), and cellular component (CC), with 449, 45, and 12 GO terms, respectively. For biological process, the FDR ranged from 1.45e-22 (positive regulation of gene expression) to 0.0477 (regulation of wound healing); for molecular function, from 2.68e-24 (transcription regulatory region) to 0.0448 (identical protein binding); and for cellular component, from 4.76e-10 (nucleus) to 0.0297 (sarcoplasm), as illustrated.
Protein to protein interactions (PPI) network integration
STRING protein-protein interaction analysis45 showed 29 nodes, 167 edges, an average node degree of 11.5, an average local clustering coefficient of 0.781, an expected number of edges of 22, and a protein-protein interaction enrichment P value of < 1.0e-16. Figure 2 shows the 29 proteins that form the interaction network.
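As a consistency check on these network statistics, the average node degree of an undirected network equals twice the number of edges divided by the number of nodes. The short sketch below applies this general graph identity to the reported values; the function name is hypothetical, and the observed-to-expected edge ratio is shown only as a descriptive figure, not as the enrichment test that STRING performs.

```python
def average_node_degree(n_nodes: int, n_edges: int) -> float:
    """Average degree of an undirected graph: each edge contributes to two nodes."""
    if n_nodes <= 0:
        raise ValueError("the network must contain at least one node")
    return 2 * n_edges / n_nodes


# Values reported for the GATA6 network.
nodes, edges, expected_edges = 29, 167, 22

avg_degree = average_node_degree(nodes, edges)
edge_ratio = edges / expected_edges  # observed versus expected edges

print(round(avg_degree, 1))
print(round(edge_ratio, 2))
```

The computed average degree rounds to the reported 11.5, and the network contains several times more edges than expected for a random set of proteins of the same size, in line with the very small enrichment P value.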
Table 2
Pathway enrichment, STRING local network cluster, protein domains, and annotated keywords analysis
Predictive pathway analysis45 showed 15 KEGG pathways and a single Reactome (RCTM) pathway that were significantly enriched. Additionally, local STRING network clusters were obtained by hierarchical clustering of the entire STRING network using average linkage. Redundancy was reduced by omitting all clusters that differ in size by fewer than five proteins from the nearest smaller cluster in the clustering hierarchy. Cluster names were derived from protein annotations in Pfam, InterPro, SMART, UniProt, RCTM, KEGG, and GO. There were 14 significantly enriched STRING local network clusters.45 Protein domains45 were obtained from three sources, Pfam, InterPro, and SMART, which yielded 10, 19, and 8 significantly enriched domains, respectively. In addition, 21 annotated keywords45 were significantly enriched.
Establishing valid relevancy of GATA6 of both humans and rats through BLAST analysis
The nucleotide sequence of human GATA6 (NCBI reference sequence: NG_032677.2) was compared with Rattus norvegicus GATA6 (NCBI reference sequence: NM_019185.2) in order to establish relevancy between the two species. The relevancy showed an identity of 85% and an E value of 0.0.40, 41
Discussion
Obesity is a major factor in the pathogenesis of T2DM due to the presence of FFA (e.g., palmitic acid) that induce insulin resistance, ER stress, and lipotoxicity.4, 10, 11, 12, 13, 14 GATA6 has a significant role in the survival and function of islet beta cells, although many details remain obscure. GATA6-related islet data are scarce in the current literature. The purpose of this study was to assess the significance and validity of previous expression findings by integrating the transcriptome through bioinformatic analysis.
Following a systematic search of the available literature, to the best of our knowledge, the relevancy between GATA6 of rats (Rattus norvegicus) and humans has not previously been highlighted. This outcome suggests how well Rattus norvegicus can mimic human physiology when used as a study model, owing to the close molecular similarity. The novel GATA6 data of Cnop et al.24 have been subject to considerable criticism. For example, Nolan et al.42 argued that the experimental conditions differ from those occurring in vivo. Furthermore, the Cnop et al. experiment investigated the islet transcriptome at only one time point, 48 hours after palmitate exposure, which does not capture both the acute and the late changes that can develop in vivo. Moreover, long-term lipid-induced toxicity is relatively altered inside the human body: as evidenced by Cnop et al.,25 the lipotoxic effect of saturated FFAs can be attenuated by the unsaturated FFAs present in the in vivo environment.
According to Hall et al.,45 significant global DNA methylation changes were observed in islet cells exposed to palmitate compared with control islets. However, the changes were mostly minor. Based on these results, exposure time may play an intrinsic role in the degree of alteration that occurs in islet beta cells after palmitate exposure. This allows the speculation that altered DNA methylation patterns may require a lengthy period of exposure to hyperlipidaemia, as seen in T2DM patients. However, quantitative research is neither meant to nor capable of mirroring in vivo settings, owing to their complexity. Moreover, no tool exists that can capture the totality of a human being. Nevertheless, an animal model with such a high sequence identity and an E value of 0.0 is an excellent marker for in vitro settings, as it provides more reliable results.
To assess the validity of the GATA6 expression data, two datasets were used: the study by Cnop et al. was compared and contrasted with the study by Hall et al., which had different patient cohorts, and the results are shown in Table 1.24, 44 Both studies were conducted in the same year (2014) with the same palmitate exposure time of 48 hours, yet they yielded different results, some of them novel. Hall et al. used microarray analysis to identify genes differentially expressed upon palmitate exposure and reported 1860 differentially expressed genes, of which 630 were upregulated and 1230 were downregulated. Cnop et al., in contrast, used RNA-Seq analysis and identified 1325 differentially expressed genes, of which 428 were upregulated and 897 were downregulated. A possible explanation for the difference in the number of genes is the number of donors whose islets were exposed to palmitate: Cnop et al. used islets from 5 donors, whereas Hall et al. used 13.
Of particular interest, the findings of Cnop et al. provide expression data that go beyond the results of previous microarray studies.43, 46, 47, 48, 49, 50 A major limitation of microarrays is that detection is restricted to transcripts for which probes are present on the array. This is one reason why the RNA-sequencing analysis of Cnop et al. was able to recognize the expression of transcription factors such as GATA6 that had not previously been documented.24 In transcriptomic analysis, RNA-seq has emerged as the gold standard method owing to its ability to detect novel transcripts, poorly expressed genes, and alternative splice variants.51 Consistent with the literature, a recent study by Lien et al.52 corroborates much of the previous argument. Lien and colleagues performed RNA sequencing on a cohort of fetal and neonatal islet cells and identified GATA6 among 917 differentially expressed genes (411 upregulated and 506 downregulated). These findings broadly support the work of Cnop et al.24 in linking RNA-seq to the novel detection of expression data such as GATA6, placing RNA-seq as a superior approach to microarray analysis. Notably, the RNA-seq analysis by Cnop et al.24 reported palmitate-induced inhibition of the well-expressed transcription factor GATA6.
Another observation was GATA6 silencing with two distinct siRNAs. These data were confirmed by qRT-PCR and were significant at P < 0.05. Inhibition of GATA6 induced apoptosis in palmitate-treated islets, thereby accentuating lipotoxicity. Many studies have reported that heterozygous mutations in GATA6 cause neonatal diabetes, adult-onset diabetes, and agenesis of the pancreas.20, 21, 22, 23 Contrary to expectations, Hall et al.45 found no significant change in apoptosis. It is somewhat surprising that no palmitate-induced apoptosis was reported. One proposed explanation is the different approaches used to assess apoptosis. Another possible explanation is stearoyl-CoA desaturase (SCD), which Hall et al. reported to be up-regulated in human islets upon palmitate exposure. SCD is a member of the unsaturated fatty acid biosynthesis pathway and catalyzes the conversion of deleterious saturated fatty acids to unsaturated fatty acids.53 This pathway was significantly enriched in the KEGG pathway analysis of Hall et al.44 According to previous observations, the conversion of saturated to unsaturated fatty acids protects human and rodent beta cells from palmitate-induced apoptosis and ER stress.54 This might partially explain why Hall et al. did not observe a significant effect on apoptosis.
The divergent outcomes of the two studies could best be explained by internal validation, which plays a substantial role in maintaining the integrity of findings even when the methodology is robust. Cnop et al.24 performed an internal validation to confirm the expression data, using qRT-PCR for 7 genes in the same islets used for RNA sequencing. In addition, qRT-PCR and RNA-seq results for 30 genes in human islets were compared and showed a correlation coefficient of 0.63. In contrast, Hall et al.45 reported no internal technical validation of the microarray analysis.
In PANTHER, the 'pathway' option did not return any results. A pathway indicates relationships between interacting molecules. It is worth mentioning that inhibition of GATA6 by palmitate was shown to cause lipotoxicity. Interestingly, palmitate has been shown to inhibit TXNIP, leading to glucotoxicity.28, 29 Such findings suggest lipotoxicity-glucotoxicity interactions. That PANTHER overlooked them points to one of its limitations and calls for further evidence-based investigation. Similarly, STRING did not show any association between GATA6 and TXNIP, which can be considered a limitation. Additionally, STRING is not a beta-cell-specific database: it gives a general picture of biological and metabolic interactions but is not specific enough to indicate associations at the level of the beta cell. To date, the STRING database covers 24 584 628 proteins from 5 090 organisms.45 The interactions (direct, physical; and indirect, functional) are derived from several sources, including genomic context predictions, high-throughput laboratory experiments, automated text mining, previous knowledge in databases, and co-expression.
Based on the predictive analysis of identified differentially expressed genes (DEGs), the PPI network was constructed to recognize the most relevant biological modules that could serve important functions in the pathogenesis of T2DM, or that may be associated with GATA6 and point to novel roles beyond the current notion that it is principally involved in the differentiation and function of islet beta cells. GATA6 and its interacting genes were significantly enriched in the STRING GO enrichment analysis of DEGs, with an FDR value < 0.05. Biological processes with significant enrichment included 'glucose homeostasis', 'regulation of angiogenesis', 'transforming growth factor beta receptor signalling pathway', 'WNT signalling pathway', 'regulation of cell cycle', 'aging', 'regulation of molecular function', 'organ growth', 'metabolic process', 'negative response to apoptotic process', 'endocrine pancreas development', 'response to stress', 'cell fate determination', 'cell differentiation', and 'gene expression'. Furthermore, the molecular function was significantly enriched in 'transforming growth factor beta receptor binding', 'transcription factor, protein and DNA binding', 'zinc ion binding', and 'beta-catenin binding' (Table 2). The cellular component was also significantly enriched in 'nucleus', 'protein-containing complex', 'transcription regulator complex', 'RNA polymerase II transcription factor complex', and 'cell'.
The GATA6 PPI network and its associated molecules showed significant enrichment in the KEGG pathway analysis, which identified 15 pathways, including ‘maturity onset diabetes of the young’, ‘TGF-beta signaling pathway’, ‘signaling pathways regulating pluripotency of stem cells’, ‘WNT, MAPK, and Hippo signaling pathways’, ‘adherens junction’, ‘Chagas disease’, ‘HTLV-1 infection’, ‘thyroid, gastric and colorectal cancer’, ‘proteoglycans in cancer’, ‘pathways in cancer’ and ‘transcriptional misregulation in cancer’. The genes in each significant pathway were listed on the basis of the KEGG analysis. Notably, the enrichment analysis identified significant KEGG pathways that are related to T2DM and to the development of different types of cancer. For instance, GATA6 was associated with HHEX, HNF4A, FOXA2, and PAX6 in the ‘maturity onset diabetes of the young’ pathway, which agrees with the data of Cnop et al., as GATA6 inhibition affects islet beta-cell differentiation and function. In the same way, the MYC gene was associated with GATA6 and was significantly enriched in all identified KEGG pathways with the exception of ‘maturity onset diabetes of the young’, ‘adherens junction’, and ‘Chagas disease’. These findings are broadly consistent with previous studies, as MYC is amplified at high frequency in lung cancer,55 plays substantial roles in cell differentiation and proliferation, and is associated with the occurrence, progression, and prognosis of different tumors.56 Furthermore, MYC promotes VEGFA production, which activates angiogenesis, a physiological process that supplies oxygen during tumor development.57
Another gene significantly associated with GATA6 is BMP2, which showed a confidence interaction score of 0.477, as indicated in Table 2. BMP2 stimulates TMEM119 and EIF2A, which upregulate the expression of ATF4. This suggests that GATA6-induced alterations may affect BMP2, which was significantly enriched in the ‘TGF-beta and Hippo signalling pathways’ and ‘pathways in cancer’. Of particular interest, ATF4 plays a major role in the expression of genes involved in stress-induced autophagy58 and is a principal target in cancer therapy.59 In addition, GATA6 showed a significant interaction with CALR, which promotes quality control in the endoplasmic reticulum (folding and assembly of proteins). An interesting finding was recently reported by Liu et al.,60 suggesting that inhibition of GATA6 may effectively reverse and prevent trastuzumab resistance in settings such as gastric cancer. Those findings also highlight a downregulation of the CALR gene upon GATA6 inhibition. Resistance to trastuzumab is a major cause of failure of effective tumor chemotherapy.61 Of significance, these findings support evidence from previous observations,62, 63, 64 in which GATA6 expression was related to malignancy and metastasis, functional diversity, and tissue specificity in different types of tumors. Among these, GATA6 acts as an oncogene in oesophageal, colon, and pancreatic cancer.
An unanticipated finding was that GATA6 was associated with ENSG00000258724 in the protein-protein interaction network. The protein showed a confidence interaction score of 0.767 with GATA6. More specifically, ENSG00000258724 turned out, upon analysis, to be an uncharacterized protein. It binds to and is able to stimulate the CDH5 promoter. It therefore plays an important role in regulating the transcription of genes expressed in the hemogenic endothelium and prevents the differentiation of blood precursors. It is also required for the survival of hematopoietic and endothelial precursors. In addition, it suppresses WNT/beta-catenin-stimulated transcription, proposedly by targeting CTNNB1 for proteasomal degradation, and it competes with GATA4 for binding to and activation of the FGF3 promoter.45 In accordance with the present results, a previous study65 showed that, among hematopoietic cells, GATA6 is significantly expressed in resident peritoneal macrophages. This indicates its importance in the survival, differentiation, and metabolism of these cells. That study focused only on F4/80 peritoneal macrophages. In addition, these findings agree with data obtained by examining ICAM-2+ peritoneal macrophages throughout the hematopoietic system.66 This also accords with earlier observations indicating its vital role in hematopoietic cells.67, 68 Similarly, GATA6 showed a significant association with SOX7, which is essential for hematopoietic endothelial precursors; this protein-protein interaction had a confidence score of 0.767. Likewise, GATA6 had a confidence interaction score of 0.757 in the PPI analysis with HHEX, a protein that is also involved in hematopoietic survival, as shown in Table 2.
These findings provide more detail on GATA6 and highlight possible candidate genes in the pathogenesis of several diseases besides T2DM.
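The confidence interaction scores quoted above can be read against the cutoffs that STRING offers in its web interface. The sketch below assumes the commonly cited thresholds of 0.15 (low), 0.40 (medium), 0.70 (high), and 0.90 (highest); the function name `confidence_band` is hypothetical, and the scores are the ones reported in the text.

```python
def confidence_band(score):
    """Map a STRING combined score (0 to 1) to an assumed confidence band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    # Cutoffs are assumptions based on the STRING web interface settings
    if score >= 0.90:
        return "highest"
    if score >= 0.70:
        return "high"
    if score >= 0.40:
        return "medium"
    if score >= 0.15:
        return "low"
    return "below low cutoff"


# Confidence scores of GATA6 interaction partners as reported in the text
scores = {
    "BMP2": 0.477,
    "ENSG00000258724": 0.767,
    "SOX7": 0.767,
    "HHEX": 0.757,
}

bands = {partner: confidence_band(score) for partner, score in scores.items()}
for partner, band in bands.items():
    print(f"GATA6 - {partner}: {scores[partner]:.3f} ({band})")
```

Under these assumed cutoffs, the interactions with ENSG00000258724, SOX7, and HHEX fall in the high-confidence band, whereas the interaction with BMP2 is of medium confidence.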
One of the key strengths of the current study is the internal validation of the expression data and the bioinformatics analysis tools. This allowed a thorough investigation of the transcriptome of interest to yield possible novel insights. However, the methods described above have some drawbacks. Only a limited number of data sources were available for our gene of interest (GATA6), because it is novel to the literature. Moreover, PANTHER is limited to a small number of identifier types and four organisms. In addition, PANTHER did not return proper ‘pathway’ details during the GO analysis. Although STRING is supported by a sound analytical algorithm, it failed to generate islet beta-cell-related interactions because it is not a beta-cell-specific database. Another limitation of STRING worth mentioning is that networks with a large number of proteins are challenging to visualize and interpret because of the large number of nodes in the generated network. This is attributable to the design of the web interface. However, this problem can sometimes be overcome by downloading the data via ‘download files’. Finally, because different datasets were compared as part of self-validation, the study was subject to some selection bias. However, the selection criterion was mainly to obtain at least two datasets with different patient cohorts, without first examining the differentially expressed gene data for results that met our aim of interest, which could cause bias. It is also worth mentioning that few datasets in the scientific literature carry expression data on our gene of interest, as a consequence of its novelty.
Despite these promising results suggesting potential gene candidates, questions remain, and further work is required to establish their validity. In fact, there is abundant room to extend and build on these findings. This is further exemplified by a serious limitation of the available literature: researchers have focused on palmitate and neglected other saturated fatty acids in vivo that are proposed to cause lipotoxicity; thus, the degree of their effect is yet to be determined by future studies.
Conclusion
In conclusion, the current transcriptomic study highlighted cellular pathways and several genes, allowing a more precise understanding of the molecular mechanisms underlying T2DM and associated diseases on the basis of predictive pathway analysis. This provides more insight into GATA6 and other possible candidates, perhaps novel, in the pathogenesis of complex diseases such as diabetes and cancer, and indicates potential targets for future therapeutic and diagnostic strategies. Prospective studies are urgently needed to validate these findings.