Dataset information

This report has been verified by Polly as per framework v1.0.

Dataset information Value
Dataset ID GSE137143_GPL24676_raw
Title Cell type-specific transcriptomics identifies neddylation as a novel therapeutic target in multiple sclerosis
Summary Multiple sclerosis (MS) is an autoimmune disease of the central nervous system in which both genetic and environmental factors are thought to be involved. Genome-wide association studies revealed more than 200 risk loci, most of which harbor genes primarily expressed in immune cells. However, whether genetic differences are translated into cell-specific gene expression profiles and to what extent these are altered in MS are not well understood. To assess cell-type-specific gene expression in a large cohort of MS patients, we sequenced the whole transcriptome of sorted T cells (CD4+ and CD8+) and CD14+ monocytes from treatment-naive MS patients (n=122) and healthy subjects (n=22). Next, we performed a comprehensive analysis of the RNA sequencing dataset and identified 612 differentially expressed genes (DEGs) in CD14+ monocytes, 464 in CD4+ T cells, and 93 in CD8+ T cells. Notably, about one third (36.6%) of DEGs were non-coding RNAs, the majority of which (88.2%) were down-regulated in MS. We identified large co-expressed gene modules and cis-eQTLs with key MS genes in each cell subset. Importantly, we discovered dysregulation of NAE1, a subunit of NEDD8 activating enzyme (NAE), in CD4+ T cells which activates the neddylation pathway. Finally, we demonstrated that NAE inhibition using Pevonedistat (MLN4924) dampened disease severity in murine experimental autoimmune encephalomyelitis (EAE). Our findings provide novel insights into MS-associated gene regulation unraveling neddylation as a crucial pathway in MS pathogenesis with implications for the development of tailored disease-modifying agents.
Overall Design Whole transcriptome profile of sorted CD4+ T cells, CD8+ T cells, and CD14+ monocytes in treatment naïve MS patients (n=122) and healthy controls (n=22)
Number of samples 427
Publication Link Link
Abstract Multiple sclerosis is an autoimmune disease of the CNS in which both genetic and environmental factors are involved. Genome-wide association studies revealed more than 200 risk loci, most of which harbour genes primarily expressed in immune cells. However, whether genetic differences are translated into cell-specific gene expression profiles and to what extent these are altered in patients with multiple sclerosis are still open questions in the field. To assess cell type-specific gene expression in a large cohort of patients with multiple sclerosis, we sequenced the whole transcriptome of fluorescence-activated cell sorted T cells (CD4+ and CD8+) and CD14+ monocytes from treatment-naive patients with multiple sclerosis (n = 106) and healthy subjects (n = 22). We identified 479 differentially expressed genes in CD4+ T cells, 435 in monocytes, and 54 in CD8+ T cells. Importantly, in CD4+ T cells, we discovered upregulated transcripts from the NAE1 gene, a critical subunit of the NEDD8 activating enzyme, which activates the neddylation pathway, a post-translational modification analogous to ubiquitination. Finally, we demonstrated that inhibition of NEDD8 activating enzyme using the specific inhibitor pevonedistat (MLN4924) significantly ameliorated disease severity in murine experimental autoimmune encephalomyelitis. Our findings provide novel insights into multiple sclerosis-associated gene regulation unravelling neddylation as a crucial pathway in multiple sclerosis pathogenesis with implications for the development of tailored disease-modifying agents.
Disease Demyelinating Diseases; Multiple Sclerosis; Multiple Sclerosis, Chronic Progressive; Multiple Sclerosis, Relapsing-Remitting; Normal
Tissue Blood
Drug None
Cell Lines None
Cell Type CD14-Positive Monocyte; CD4-Positive Helper T Cell; CD8-Positive, Alpha-Beta Cytotoxic T Cell
Organism Homo sapiens
Custom Curation N/A

Processing information

This section provides processing details for the data obtained from the source.

Data Processing SRA files are converted to FASTQ files using fasterq-dump and then quality-checked using FastQC with a short-read threshold of 20. A MinION adapter search with an adapter threshold of 2 is performed on the FASTQ file(s), followed by skewer quality trimming with a minimum read length of 18 and a Phred quality threshold of 10. Kallisto quantification with a fragment length of 100 and a standard deviation of 20 is used to obtain read counts. These parameters are chosen to support robust analysis and reliable interpretation of bulk RNA-seq data.
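As an illustration of the quantification step only, the sketch below assembles a `kallisto quant` command line with the fragment length (100) and standard deviation (20) quoted above. kallisto requires these two values in single-end mode, so the sketch assumes single-end reads; the report does not state the library layout. The index, output directory, and FASTQ file names are hypothetical placeholders, and the command is built but not executed.

```python
import shlex

def kallisto_quant_cmd(index, out_dir, fastq, frag_len=100, frag_sd=20):
    """Build a single-end `kallisto quant` command as a list of arguments."""
    return [
        "kallisto", "quant",
        "-i", index,            # transcriptome index (hypothetical path)
        "-o", out_dir,          # output directory (hypothetical path)
        "--single",             # assumption: single-end reads
        "-l", str(frag_len),    # estimated mean fragment length
        "-s", str(frag_sd),     # estimated fragment length standard deviation
        fastq,                  # trimmed reads (hypothetical file name)
    ]

cmd = kallisto_quant_cmd("transcripts.idx", "quant_out", "sample_trimmed.fastq")
print(shlex.join(cmd))
```

Keeping the command as an argument list makes it straightforward to pass to `subprocess.run` once real paths are known.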
1. Metadata information
Metadata information Value
Polly curated metadata fields are present at dataset level Pass
Polly curated metadata fields are present at sample level Pass
Polly curated metadata fields are present in gct file Pass
Publication Link is provided Pass
Publication Link is valid Pass
Dataset-Level vs. Sample-Level Metadata: concordance check Pass
Custom fields are present and valid N/A

2. Feature identifier
Feature Identifier Check Value
Ensembl Gene IDs present Fail
Ensembl Gene IDs are valid Fail
Gene Symbols present Pass
Gene Symbols are valid Pass
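The report does not describe how the Ensembl checks are implemented. A minimal sketch, assuming the check amounts to matching feature identifiers against the human Ensembl gene ID pattern (the prefix ENSG, 11 digits, and an optional version suffix), could look like this. The function name and the returned keys are illustrative, not part of Polly.

```python
import re

# Human Ensembl gene IDs: "ENSG" + 11 digits, optionally followed by ".<version>".
ENSG_PATTERN = re.compile(r"^ENSG\d{11}(\.\d+)?$")

def check_ensembl_ids(feature_ids):
    """Report whether any ID looks like an Ensembl gene ID, and whether all do."""
    ids = list(feature_ids)
    n_valid = sum(1 for f in ids if ENSG_PATTERN.match(f))
    return {
        "present": n_valid > 0,
        "valid": bool(ids) and n_valid == len(ids),
    }

# A matrix indexed by gene symbols only fails both checks.
symbols_only = check_ensembl_ids(["NAE1", "CD4", "CD14"])
# Identifiers that follow the Ensembl pattern pass both checks.
ensembl_like = check_ensembl_ids(["ENSG00000159593", "ENSG00000010610.10"])
```

Under this assumption, a dataset whose features are gene symbols would fail both Ensembl checks while passing the gene symbol checks, which matches the pattern in the table above.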

3. Data Matrix
Data Matrix Value
Data Matrix Values Valid Pass
Data Matrix Range 0.00 to 1931907.00
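The criteria behind "Data Matrix Values Valid" are not stated in the report. The sketch below assumes that a valid count matrix contains only finite, non-negative numbers and reports the range in the same form as the table. The toy matrix is hypothetical and is used only to exercise the function.

```python
import math

def matrix_summary(matrix):
    """Check that all values are finite, non-negative numbers and report the range."""
    values = [v for row in matrix for v in row]
    valid = bool(values) and all(
        isinstance(v, (int, float)) and math.isfinite(v) and v >= 0
        for v in values
    )
    if not valid:
        return {"valid": False, "min": None, "max": None}
    return {"valid": True, "min": min(values), "max": max(values)}

# Toy matrix (rows = genes, columns = samples); values are illustrative only.
toy = [[0.0, 12.0], [1931907.0, 3.5]]
summary = matrix_summary(toy)
print(f"{summary['min']:.2f} to {summary['max']:.2f}")
```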


4. Histogram for expression distribution

Figure 1: Histogram showing the frequency distribution of TPM-normalised expression values across all samples.

The histogram displays the data distribution from the counts matrix. The raw count values are TPM-normalised and log2(x+1)-transformed for clarity.
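The transformation described above can be sketched as follows. This is a generic TPM calculation, not necessarily the exact procedure used to produce the figure, and it assumes per-gene lengths in kilobases are available. The counts and lengths shown are toy values.

```python
import math

def tpm(counts, lengths_kb):
    """Convert one sample's raw counts to TPM given per-gene lengths in kilobases."""
    rates = [c / l for c, l in zip(counts, lengths_kb)]  # reads per kilobase
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    return [r / total * 1e6 for r in rates]  # scale so the sample sums to one million

def log2p1(values):
    """Apply the log2(x+1) transform, which maps zero expression to zero."""
    return [math.log2(v + 1) for v in values]

counts = [10, 20, 0]           # toy raw counts for three genes in one sample
lengths_kb = [1.0, 2.0, 1.5]   # hypothetical gene lengths in kb
tpm_values = tpm(counts, lengths_kb)
log_values = log2p1(tpm_values)
```

The first two genes have different raw counts but the same TPM, because the second gene is twice as long; this length correction is what distinguishes TPM from a plain library-size scaling.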


5. Sample-wise distribution of expression values using a boxplot

Figure 2: Boxplot showing TPM expression values across all samples.

The boxplot displays the sample-wise distribution of the counts matrix. The raw count values are TPM-normalised and log2(x+1)-transformed for clarity.
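The quantities a boxplot draws for each sample can be computed directly. The sketch below derives a five-number summary per sample from already transformed values; the sample names and values are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption, since plotting libraries differ in how they interpolate quartiles.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for one sample's expression values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

# Toy log2(TPM+1) values per sample; names and numbers are illustrative only.
log_tpm_by_sample = {
    "sample_A": [0.0, 1.0, 2.0, 3.0, 4.0],
    "sample_B": [0.0, 0.0, 0.0, 2.0, 8.0],
}
summaries = {s: five_number_summary(v) for s, v in log_tpm_by_sample.items()}
```

A sample whose median or interquartile range sits far from the rest is a candidate for closer inspection.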


6. Sample-wise distribution of the number of genes with zero expression using a barplot

Figure 3: Barplot showing the number of genes with an expression value equal to 0 per sample.

This barplot helps identify samples with an unusually high number of genes that have zero expression, which may indicate low mapping of reads to the genome.
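The per-sample tally behind such a barplot is a simple column-wise count. The following sketch assumes a matrix with genes as rows and samples as columns; the sample identifiers and values are toy data.

```python
def zero_genes_per_sample(matrix, sample_ids):
    """Count, for each sample (column), how many genes (rows) have a value of 0."""
    return {
        sample: sum(1 for row in matrix if row[j] == 0)
        for j, sample in enumerate(sample_ids)
    }

# Toy matrix: 3 genes x 2 samples (illustrative values only).
toy_matrix = [
    [0, 5],
    [0, 0],
    [3, 1],
]
zero_counts = zero_genes_per_sample(toy_matrix, ["S1", "S2"])
```

Comparing each sample's count against the cohort as a whole, rather than against a fixed cutoff, is usually the more robust way to spot outliers, although the report does not specify a threshold.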


1. Polly's curated metadata field distribution

Figure 1: The UMAP plot(s) represent different samples in a reduced-dimensional space, with colors indicating the Polly standard and custom curated fields.

The plot(s) aid in understanding the biological differences between samples as described by the different metadata fields. Note: A UMAP plot of the raw counts will not reflect the correct distribution, as the data require normalisation.

Figure 2: The sunburst plot(s) represent counts of different samples, with colors representing values from the Polly standard and custom curated fields.

The plot(s) aid in understanding the distribution of samples according to the categorical metadata variables of the Polly standard curated fields.
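A sunburst plot is driven by nested category counts: one count per value of the first field, then one per combination of the first and second fields, and so on. The sketch below computes those counts from sample-level metadata. The field names (`disease`, `cell_type`) and the three sample records are hypothetical and do not reproduce the dataset's actual metadata.

```python
from collections import Counter

def sunburst_counts(samples, fields):
    """Count samples at every level of the field hierarchy (inner ring first)."""
    counts = Counter()
    for sample in samples:
        path = tuple(sample[f] for f in fields)
        for depth in range(1, len(path) + 1):
            counts[path[:depth]] += 1
    return counts

# Toy sample-level metadata (illustrative only).
toy_samples = [
    {"disease": "Multiple Sclerosis", "cell_type": "CD4+ T cell"},
    {"disease": "Multiple Sclerosis", "cell_type": "CD14+ monocyte"},
    {"disease": "Normal", "cell_type": "CD4+ T cell"},
]
ring_counts = sunburst_counts(toy_samples, ["disease", "cell_type"])
```

Each key is a path from the centre of the plot outward, so the counts on an outer ring always sum to the count of their parent segment.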


2. Source metadata field distribution

Figure 3: The UMAP plot(s) represent different samples in a reduced-dimensional space, with colors indicating the source metadata fields.

The plot(s) aid in understanding the biological differences between samples as described by the different metadata fields. Note: A UMAP plot of the raw counts will not reflect the correct distribution, as the data require normalisation.


Figure 4: The sunburst plot represents counts of different samples, with colors representing values from the source metadata fields.

The plot(s) aid in understanding the distribution of samples according to the categorical metadata variables of the source fields.