Table of Contents generated with DocToc
- Introduction
- Metadatabase
- Using the MDb
- Prepared datasets
- Loading and extracting the data
- Use cases
- Session info
- References
Introduction
Metadatabase
Using the MDb
The BIOS project has generated RNA-sequencing and DNA methylation data for over 4000 individuals. Apart from these data, GoNL-imputed genotypes were generated from existing genotypes, and several phenotypes/demographic variables were collected for the same set of samples. A highly flexible, sample-oriented metadatabase (MDb) was created to manage the dynamic generation of this large-scale multi-omic data set.
The MDb is a non-relational database (CouchDB, http://couchdb.apache.org/) that uses JSON to store records and JavaScript for querying. Furthermore, it has an HTTP API suitable for programmatic access to the database from the GRID, e.g. by the alignment pipeline.
Each record, or document, is a sample (individual) within the BIOS project and has a unique identifier. Each document has a predefined structure according to our database schema (https://git.lumc.nl/rp3/bios-schema). Custom Python scripts are used to update or modify the database (https://git.lumc.nl/rp3/bios-mdb).
Access to the metadatabase (MDb) is restricted; please contact Leon Mei or Maarten van Iterson.
Brief description of MDb content
The MDb contains as much meta-information as possible for all samples and data types: location of (raw) data on srm, md5 checksum verification, quality control information, links between the different identifiers used (person_id, dna_id, etc.) and phenotype information.
Every sample's meta-information is encoded in a CouchDB document. Each document has a unique identifier (the bios_id), which is the biobank name (CODAM, LL, LLS, NTR, RS or PAN) concatenated with the person_id, separated by a "-", e.g. CODAM-2001. This bios_id is not suitable for use in the public domain, e.g. an EGA upload; therefore a unique, non-identifiable identifier, the uuid, has been created for each individual.
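The bios_id convention can be illustrated with a few lines of base R. This is only a sketch: the helper names makeBiosId and splitBiosId, and the vector BIOBANK_NAMES, are hypothetical and are not part of the MDb or of any BIOS package.

```r
# Hypothetical helpers illustrating the bios_id convention described above.
BIOBANK_NAMES <- c("CODAM", "LL", "LLS", "NTR", "RS", "PAN")

# Build a bios_id from a biobank name and a person_id.
makeBiosId <- function(biobank, person_id) {
    stopifnot(all(biobank %in% BIOBANK_NAMES))
    paste(biobank, person_id, sep = "-")
}

# Split a bios_id at the first "-" into its two components.
splitBiosId <- function(bios_id) {
    data.frame(biobank_id = sub("-.*$", "", bios_id),
               person_id = sub("^[^-]*-", "", bios_id),
               stringsAsFactors = FALSE)
}

ids <- makeBiosId(c("CODAM", "LLS"), c("2001", "1002"))
ids
splitBiosId(ids)
```

Splitting at the first "-" only keeps the sketch usable even if a person_id were to contain a "-" itself.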
Every update of a sample in the database is recorded by increasing a revision number, so it is always possible to undo wrong updates. Attached to this page is a JSON file representing one sample's information in the metadatabase (the content of the file can be pasted into a JSON viewer, e.g. http://jsonviewer.stack.hu).
Brief description of available views
Views are the way to extract information from a CouchDB database. Views are organized into designs; each design contains a number of views related to a particular kind of information that can be extracted from the MDb. For example, the design 'EGA' currently contains two views: 1) 'freeze1RNASeq', to extract those samples for which RNA-seq data has been uploaded to EGA, and 2) 'freeze1Methylation', for the DNA methylation data.
Other relevant views are:
design: | view: |
---|---|
EGA | freeze1RNASeq, freeze1Methylation |
Files | getFastq, getIdat |
Identifiers | getIds |
Phenotypes | allPhenotypes, cellCounts, minimalPhenotypes |
Runs | getGenotypes, getMethylationRuns, getRNASeqRuns |
Samplesheets | rnaseqSamplesheet, methylationSamplesheet |
Verification | md5 |
Note: We can always add views if necessary; please contact Maarten van Iterson.
Accessing the MDb
Views can be downloaded as JSON documents by making a GET request. Most programming languages have utilities for making GET requests and for transforming JSON documents. Some programming languages, e.g. Java and Python, have an API for CouchDB. There are also several online tools for converting JSON documents to CSV files.
Please note that it is usually better to download the view separately and work on the downloaded file. This way you only have to enter your password once and you're resilient to network connectivity problems.
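The URL of a view follows a fixed pattern, which can be read off the curl call that getView prints further down this page. The sketch below assembles such a URL in base R. The function viewUrl is a hypothetical helper, not part of BIOSRutils, and the default base URL is simply copied from that curl call.

```r
# Sketch: assemble the URL of a CouchDB view from its design and view name.
# viewUrl is a hypothetical helper; the base URL is taken from the curl
# call shown elsewhere on this page.
viewUrl <- function(design, view,
                    base = "https://metadatabase.bbmrirp3-lumc.vm.surfsara.nl:6984/bios",
                    reduce = FALSE) {
    sprintf("%s/_design/%s/_view/%s?reduce=%s",
            base, design, view, tolower(as.character(reduce)))
}

url <- viewUrl("Phenotypes", "allPhenotypes")
url
```

The resulting URL can be passed to any HTTP client (for example curl with the -u option) to save the view to a local file once and work on that file afterwards.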
Access the metadatabase using R
We have developed the R package BIOSRutils (https://git.lumc.nl/rp3/biosrutils) for easy access to the MDb and processed datasets. BIOSRutils is available on the VM for R version 3.2.0 (start R using the command R-3.2.0 from the command line). The current version, 0.0.1, is still a development version; several of the intended features are not yet fully implemented.
BIOSRutils uses a configuration file to read in your MDb username and password, so that you do not have to type it every time you use the MDb.
Create a file called .biosrutils, store it in your home directory on the VM (/home/username) and add as the first line:
usrpwd: 'username:password'
Note: if your password contains any characters that bash treats specially (' / ^ & # etc.), make sure to escape them appropriately using \ or \\.
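To make the expected file format concrete, the sketch below writes such a line to a temporary file and splits it into username and password using base R. This is an illustration only: readUsrPwd is a hypothetical function, and the way BIOSRutils actually parses the file may differ.

```r
# Sketch of reading a 'usrpwd' line; the real BIOSRutils parser may differ.
# A temporary file stands in for ~/.biosrutils.
cfg <- tempfile()
writeLines("usrpwd: 'username:password'", cfg)

readUsrPwd <- function(path) {
    first <- readLines(path, n = 1, warn = FALSE)
    # Drop the 'usrpwd:' key and the surrounding single quotes
    value <- sub("^usrpwd:\\s*", "", first)
    value <- gsub("^'|'$", "", value)
    # Split at the first ':' only, so that a password may itself contain ':'
    pos <- as.integer(regexpr(":", value, fixed = TRUE))
    list(username = substr(value, 1, pos - 1),
         password = substr(value, pos + 1, nchar(value)))
}

cred <- readUsrPwd(cfg)
cred$username
unlink(cfg)
```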
Start R-3.2.0 and load the library:
library(BIOSRutils)
Several predefined variables are available, such as the URLs of the current MDb and Rdb, as well as your provided username and password (USRPWD). All variable names are capitalized to minimize interference with your own code.
ls()
## [1] "BIOBANKS" "DATASETS" "MDB" "PROXY" "RDB"
## [6] "RP3DATADIR" "SRMBASE" "USRPWD" "VIEWS"
The BIOSRutils package provides the function getView to extract a particular view from the MDb. All available views are stored in the global variable VIEWS. Use the regular way to get help in R, e.g.:
`?`(getView)
For example, to extract all phenotype information for all samples, use the allPhenotypes view from the design Phenotypes.
## curl -X GET https://metadatabase.bbmrirp3-lumc.vm.surfsara.nl:6984/bios/_design/Phenotypes/_view/allPhenotypes?reduce=false -u 'username:password' -k -g
## Got4600records from database
Basic R manipulations can be used to select particular information, e.g.:
LLSMalesAbove70 <- subset(phenotypes, grepl("LLS", ids) & Sex == 0 & DNA_BloodSampling_Age >
70)
LLSMalesAbove70[1:5, 1:5]
## ids RNA_A260280ratio Lipids_BloodSampling_Date TotChol LDLchol
## 1809 LLS-1068 NA 2003-12-10 3.81 2.2725
## 1838 LLS-1195 2.13 2004-01-26 9.80 NA
## 1853 LLS-1265 2.15 2004-02-23 4.97 2.3010
## 1854 LLS-1279 2.13 2004-02-27 7.39 5.2850
## 1884 LLS-1361 2.13 2004-03-10 6.30 3.6815
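The same selection pattern can be tried on a small, made-up data frame. The values below are invented for illustration; only the column names follow the output above. The explicit as.numeric conversion is a defensive assumption: if a column such as DNA_BloodSampling_Age happens to be stored as character, a comparison like > 70 compares strings rather than numbers.

```r
# Toy data; values are invented, only the column names follow the phenotype view.
toy <- data.frame(ids = c("LLS-1", "LLS-2", "CODAM-3", "LLS-4"),
                  Sex = c("0", "1", "0", "0"),
                  DNA_BloodSampling_Age = c("71.2", "80.1", "75.0", "9.5"),
                  stringsAsFactors = FALSE)

# As strings, "9.5" sorts after "70", so an unconverted comparison
# would wrongly keep LLS-4
"9.5" > "70"

# Convert to numeric before filtering
toy$DNA_BloodSampling_Age <- as.numeric(toy$DNA_BloodSampling_Age)
sel <- subset(toy, grepl("^LLS", ids) & Sex == 0 & DNA_BloodSampling_Age > 70)
sel$ids
```

Anchoring the pattern with "^LLS" restricts the match to identifiers that start with LLS.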
Prepared datasets
The BIOS gene and exon counts are stored in an R object of type SummarizedExperiment. This is a data container that can store not only feature data such as gene counts, but also annotation on the features and on the samples. Furthermore, the feature annotation is a GRanges object, which has several advantages. Both data sets can be loaded into R using the data function. Use colData, rowData or assays to extract information from the object.
More about SummarizedExperiments: 1. package vignette 2. course material 3. Bioconductor Nature paper
The BIOSRutils global variable DATASETS lists all available data sets.
DATASETS
## [1] "metabolomics_RP3RP4_overlap"
## [2] "methData_Betas_CODAM_F2"
## [3] "methData_Betas_LL_F2"
## [4] "methData_Betas_LLS_F2"
## [5] "methData_Betas_NTR_F2"
## [6] "methData_Betas_PAN_F2"
## [7] "methData_Betas_RS_F2"
## [8] "methData_BIOS_02042015"
## [9] "methData_CODAM"
## [10] "methData_LL"
## [11] "methData_LLS"
## [12] "methData_Mvalues_CODAM_F2"
## [13] "methData_Mvalues_LL_F2"
## [14] "methData_Mvalues_LLS_F2"
## [15] "methData_Mvalues_NTR_F2"
## [16] "methData_Mvalues_PAN_F2"
## [17] "methData_Mvalues_RS_F2"
## [18] "methData_NTR"
## [19] "methData_PAN"
## [20] "methData_RS"
## [21] "rnaSeqData_freeze1_06032015BIOS"
## [22] "rnaSeqData_freeze1_exon_14042015BIOS"
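The list contains both beta-value (Betas) and M-value (Mvalues) versions of the methylation data. As a reminder of how the two scales relate, the sketch below applies the commonly used logit transform M = log2(beta / (1 - beta)) to a few made-up beta-values. Whether the BIOS M-values were derived in exactly this way (for example with or without an offset) is not stated here, so treat this as an illustration only; beta2m and m2beta are hypothetical helper names.

```r
# Commonly used conversion between beta-values and M-values (illustration only).
beta2m <- function(beta) log2(beta / (1 - beta))
m2beta <- function(m) 2^m / (2^m + 1)

beta <- c(0.1, 0.5, 0.9)   # invented beta-values
m <- beta2m(beta)
m

# Converting back recovers the original beta-values
all.equal(m2beta(m), beta)
```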
Loading and extracting the data
Load a specific data set using data, check the name of the loaded object with ls, and view its content by typing the name in the console, which automatically calls the built-in show method.
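Note that the name of the loaded object can differ from the name of the data set (for instance, data(methData_LLS) creates an object called methData). One way to see exactly which objects a call to data creates is to load into a separate environment. The sketch below demonstrates this with the iris data set from the base datasets package, which stands in for a BIOS data set here.

```r
# Load a data set into its own environment so the new object names are easy to spot.
# 'iris' from the base 'datasets' package stands in for a BIOS data set.
e <- new.env()
data("iris", package = "datasets", envir = e)

loaded <- ls(e)          # names of the objects created by data()
loaded
class(e[[loaded[1]]])    # inspect the class of the first loaded object
```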
DNA methylation data
data(methData_LLS)
ls()
## [1] "BIOBANKS" "DATASETS" "LLSMalesAbove70"
## [4] "MDB" "methData" "phenotypes"
## [7] "PROXY" "RDB" "RP3DATADIR"
## [10] "SRMBASE" "USRPWD" "VIEWS"
methData
## Warning: The SummarizedExperiment class defined in the GenomicRanges package is
## deprecated and being replaced with the RangedSummarizedExperiment class
## defined in the new SummarizedExperiment package. You can use
## updateObject() on any SummarizedExperiment object to turn it into a
## RangedSummarizedExperiment.
## class: SummarizedExperiment
## dim: 485512 784
## exptData(0):
## assays(1): beta
## rownames(485512): cg00050873 cg00212031 ... ch.22.47579720R
## ch.22.48274842R
## rowRanges metadata column names(10): addressA addressB ...
## probeEnd probeTarget
## colnames(784): 9374343010_R04C02 8691803012_R04C02 ...
## 8667045031_R01C01 8655685028_R05C02
## colData names(27): uuid dna_id ... Basename filenames
class(methData)
## [1] "SummarizedExperiment"
## attr(,"package")
## [1] "GenomicRanges"
colData(methData)
## DataFrame with 784 rows and 27 columns
## uuid dna_id biobank_id Sample_Plate
## <character> <character> <character> <character>
## 9374343010_R04C02 BIOS648CBD1C 1002 LLS 10
## 8691803012_R04C02 BIOS33DC8FBC 104 LLS 3
## 8454787132_R02C01 BIOS275BFCF8 1076 LLS 1
## 8655685041_R02C02 BIOSC7D66E13 1133 LLS 4
## 8655685197_R04C01 BIOSE7E8110D 124 LLS 2
## ... ... ... ... ...
## 8691803077_R03C01 BIOS0CA69A11 727 LLS 3
## 8691803074_R04C01 BIOS275638C1 849 LLS 3
## 8655685009_R04C02 BIOS884EDA9D 885 LLS 2
## 8667045031_R01C01 BIOSBABA99DE 924 LLS 2
## 8655685028_R05C02 BIOS7708CCB4 997 LLS 2
## Sample_Well Sentrix_Barcode Sentrix_Lotnumber
## <character> <character> <character>
## 9374343010_R04C02 B05 9374343010 9374343
## 8691803012_R04C02 B05 8691803012 8691803
## 8454787132_R02C01 F02 8454787132 8454787
## 8655685041_R02C02 H10 8655685041 8655685
## 8655685197_R04C01 H11 8655685197 8655685
## ... ... ... ...
## 8691803077_R03C01 G08 8691803077 8691803
## 8691803074_R04C01 H11 8691803074 8691803
## 8655685009_R04C02 B11 8655685009 8655685
## 8667045031_R01C01 E05 8667045031 8667045
## 8655685028_R05C02 G03 8655685028 8655685
## Sentrix_Position C1_Barcode C1_Lotnumber
## <character> <character> <character>
## 9374343010_R04C02 R04C02 wg2472386-xc1 9426762
## 8691803012_R04C02 R04C02 wg0511987-xc1 8784516
## 8454787132_R02C01 R02C01 wg0513402-xc1 8784516
## 8655685041_R02C02 R02C02 wg0514733-xc1 8784516
## 8655685197_R04C01 R04C01 wg0513724-xc1 8784516
## ... ... ... ...
## 8691803077_R03C01 R03C01 wg0511957-xc1 8784516
## 8691803074_R04C01 R04C01 wg0511957-xc1 8784516
## 8655685009_R04C02 R04C02 wg0513724-xc1 8784516
## 8667045031_R01C01 R01C01 wg0513714-xc1 8784516
## 8655685028_R05C02 R05C02 wg0513714-xc1 8784516
## C2_Barcode C2_Lotnumber TEM_Barcode TEM_Lotnumber
## <character> <character> <character> <character>
## 9374343010_R04C02 wg2473694-xc2 9430495 wg2482594-tem 9429751
## 8691803012_R04C02 wg0527887-xc2 8783644 wg0591558-tem 8615771
## 8454787132_R02C01 wg0527876-xc2 8783644 wg0591239-tem 8615771
## 8655685041_R02C02 wg0526031-xc2 8783644 wg0592298-tem 8615771
## 8655685197_R04C01 wg0527860-xc2 8783644 wg0597544-tem 8615771
## ... ... ... ... ...
## 8691803077_R03C01 wg0527888-xc2 8783644 wg0597543-tem 8615771
## 8691803074_R04C01 wg0527888-xc2 8783644 wg0597543-tem 8615771
## 8655685009_R04C02 wg0527860-xc2 8783644 wg0597544-tem 8615771
## 8667045031_R01C01 wg0527877-xc2 8783644 wg0591559-tem 8615771
## 8655685028_R05C02 wg0527877-xc2 8783644 wg0591559-tem 8615771
## STM_Barcode STM_Lotnumber ATM_Barcode ATM_Lotnumber
## <character> <character> <character> <character>
## 9374343010_R04C02 wg1611625-stm 9370714 wg2519407-atm 9419844
## 8691803012_R04C02 wg1566697-stm 9269715 wg0537309-atm 8762691
## 8454787132_R02C01 wg1577012-stm 9284859 wg0537570-atm 8762691
## 8655685041_R02C02 wg1577003-stm 9284859 wg0535309-atm 8762691
## 8655685197_R04C01 wg1567608-stm 9269715 wg0537310-atm 8762691
## ... ... ... ... ...
## 8691803077_R03C01 wg1577008-stm 9284859 wg0537569-atm 8762691
## 8691803074_R04C01 wg1577008-stm 9284859 wg0537569-atm 8762691
## 8655685009_R04C02 wg1567608-stm 9269715 wg0537310-atm 8762691
## 8667045031_R01C01 wg1566700-stm 9269715 wg0537320-atm 8762691
## 8655685028_R05C02 wg1566700-stm 9269715 wg0537320-atm 8762691
## Library_Date Hybridization_Date Stain_Date Scan_Date
## <character> <character> <character> <character>
## 9374343010_R04C02 03-02-2014 04-02-2014 05-02-2014 06-02-2014
## 8691803012_R04C02 18-06-2013 20-06-2013 21-06-2013 22-06-2013
## 8454787132_R02C01 18-06-2013 20-06-2013 21-06-2013 21-06-2013
## 8655685041_R02C02 18-06-2013 24-06-2013 25-06-2013 27-06-2013
## 8655685197_R04C01 18-06-2013 20-06-2013 21-06-2013 21-06-2013
## ... ... ... ... ...
## 8691803077_R03C01 18-06-2013 20-06-2013 21-06-2013 21-06-2013
## 8691803074_R04C01 18-06-2013 20-06-2013 21-06-2013 22-06-2013
## 8655685009_R04C02 18-06-2013 20-06-2013 21-06-2013 21-06-2013
## 8667045031_R01C01 18-06-2013 20-06-2013 21-06-2013 21-06-2013
## 8655685028_R05C02 18-06-2013 20-06-2013 21-06-2013 22-06-2013
## Scan_Time Scanner_Name bios_id
## <character> <character> <character>
## 9374343010_R04C02 09:47:41.7432+01:00 N140 LLS-1002
## 8691803012_R04C02 01:04:19.2928+02:00 N140 LLS-104
## 8454787132_R02C01 14:51:16.1618+02:00 N140 LLS-1076
## 8655685041_R02C02 22:25:06.015+02:00 N219 LLS-1133
## 8655685197_R04C01 19:01:55.1268+02:00 N140 LLS-124
## ... ... ... ...
## 8691803077_R03C01 21:52:23.8738+02:00 N140 LLS-727
## 8691803074_R04C01 03:08:03.7458+02:00 N140 LLS-849
## 8655685009_R04C02 20:25:29.2708+02:00 N140 LLS-885
## 8667045031_R01C01 19:28:52.5858+02:00 N140 LLS-924
## 8655685028_R05C02 02:52:41.1678+02:00 N140 LLS-997
## Basename
## <character>
## 9374343010_R04C02 /virdir/Scratch/RP3_data/450k//LLS/9374343010/9374343010_R04C02
## 8691803012_R04C02 /virdir/Scratch/RP3_data/450k//LLS/8691803012/8691803012_R04C02
## 8454787132_R02C01 /virdir/Scratch/RP3_data/450k//LLS/8454787132/8454787132_R02C01
## 8655685041_R02C02 /virdir/Scratch/RP3_data/450k//LLS/8655685041/8655685041_R02C02
## 8655685197_R04C01 /virdir/Scratch/RP3_data/450k//LLS/8655685197/8655685197_R04C01
## ... ...
## 8691803077_R03C01 /virdir/Scratch/RP3_data/450k//LLS/8691803077/8691803077_R03C01
## 8691803074_R04C01 /virdir/Scratch/RP3_data/450k//LLS/8691803074/8691803074_R04C01
## 8655685009_R04C02 /virdir/Scratch/RP3_data/450k//LLS/8655685009/8655685009_R04C02
## 8667045031_R01C01 /virdir/Scratch/RP3_data/450k//LLS/8667045031/8667045031_R01C01
## 8655685028_R05C02 /virdir/Scratch/RP3_data/450k//LLS/8655685028/8655685028_R05C02
## filenames
## <character>
## 9374343010_R04C02 /virdir/Scratch/RP3_data/450k//LLS/9374343010/9374343010_R04C02
## 8691803012_R04C02 /virdir/Scratch/RP3_data/450k//LLS/8691803012/8691803012_R04C02
## 8454787132_R02C01 /virdir/Scratch/RP3_data/450k//LLS/8454787132/8454787132_R02C01
## 8655685041_R02C02 /virdir/Scratch/RP3_data/450k//LLS/8655685041/8655685041_R02C02
## 8655685197_R04C01 /virdir/Scratch/RP3_data/450k//LLS/8655685197/8655685197_R04C01
## ... ...
## 8691803077_R03C01 /virdir/Scratch/RP3_data/450k//LLS/8691803077/8691803077_R03C01
## 8691803074_R04C01 /virdir/Scratch/RP3_data/450k//LLS/8691803074/8691803074_R04C01
## 8655685009_R04C02 /virdir/Scratch/RP3_data/450k//LLS/8655685009/8655685009_R04C02
## 8667045031_R01C01 /virdir/Scratch/RP3_data/450k//LLS/8667045031/8667045031_R01C01
## 8655685028_R05C02 /virdir/Scratch/RP3_data/450k//LLS/8655685028/8655685028_R05C02
rowRanges(methData)
## GRanges object with 485512 ranges and 10 metadata columns:
## seqnames ranges strand | addressA
## <Rle> <IRanges> <Rle> | <character>
## cg00050873 chrY [ 9363356, 9363357] * | 32735311
## cg00212031 chrY [21239348, 21239349] * | 29674443
## cg00213748 chrY [ 8148233, 8148234] * | 30703409
## cg00214611 chrY [15815688, 15815689] * | 69792329
## cg00455876 chrY [ 9385539, 9385540] * | 27653438
## ... ... ... ... ... ...
## ch.22.909671F chr22 [46114168, 46114168] * | 47797398
## ch.22.46830341F chr22 [48451677, 48451677] * | 29618504
## ch.22.1008279F chr22 [48731367, 48731367] * | 49664383
## ch.22.47579720R chr22 [49193714, 49193714] * | 53733426
## ch.22.48274842R chr22 [49888838, 49888838] * | 62659432
## addressB channel platform percentGC
## <character> <Rle> <Rle> <numeric>
## cg00050873 31717405 Red HM450 0.62
## cg00212031 38703326 Red HM450 0.64
## cg00213748 36767301 Red HM450 0.56
## cg00214611 46723459 Red HM450 0.72
## cg00455876 69732350 Red HM450 0.64
## ... ... ... ... ...
## ch.22.909671F Both HM450 0.34
## ch.22.46830341F Both HM450 0.46
## ch.22.1008279F Both HM450 0.56
## ch.22.47579720R Both HM450 0.60
## ch.22.48274842R Both HM450 0.58
## sourceSeq
## <DNAStringSet>
## cg00050873 CGGGGTCCACCCACTCCAAAAACCACCACAGTTGTGCGTTGCCTCCTCGC
## cg00212031 CGCACGTCTTCCCGACCGCATAACTTGCTCAGTCCCTGCGGCCAACTGGG
## cg00213748 CGCCCCCTCCTGCAGAACCTCCATCGTTAAAACGGTGCCAGGCGTTAAAA
## cg00214611 CGCCCGCGCCACACTGCAGCCCAGCACACAAAGCGCGGCCCGGAAGCTAG
## cg00455876 GACTCTGAGCTACCCGGCACAAGCTCCAAGGGCTTCTCGGAGGAGGCTCG
## ... ...
## ch.22.909671F CAGCAAATCAAAAATTCACTGAAAAGAAATGCTTTTGTGTGTAAGTGGTG
## ch.22.46830341F CAGCATCACATGTAGAAGGCATTCTGCTCAGAGAATGGCCTCCATTTTTC
## ch.22.1008279F CAAGACTCATTCAACACAGACCCAGCCTCAGGCCCAGGAAGACTGTAGGG
## ch.22.47579720R CAGGCAAGGGGCCTCAGAGATCACCAGCAAACCCCAGAAGCTGGAGAGAG
## ch.22.48274842R ACTGACTGCAGGTGCTCACCAGCAACAGGGTGCTCACCCACAACAGGAAC
## probeType probeStart probeEnd probeTarget
## <Rle> <character> <character> <numeric>
## cg00050873 cg 9363308 9363357 9363356
## cg00212031 cg 21239300 21239349 21239348
## cg00213748 cg 8148185 8148234 8148233
## cg00214611 cg 15815640 15815689 15815688
## cg00455876 cg 9385491 9385540 9385539
## ... ... ... ... ...
## ch.22.909671F ch 46114168 46114217 46114168
## ch.22.46830341F ch 48451677 48451726 48451677
## ch.22.1008279F ch 48731367 48731416 48731367
## ch.22.47579720R ch 49193714 49193763 49193714
## ch.22.48274842R ch 49888838 49888887 49888838
## -------
## seqinfo: 24 sequences from hg19 genome
assays(methData)$beta[1:5, 1:5]
## 9374343010_R04C02 8691803012_R04C02 8454787132_R02C01
## cg00050873 NA NA NA
## cg00212031 NA NA NA
## cg00213748 NA NA NA
## cg00214611 NA NA NA
## cg00455876 NA NA NA
## 8655685041_R02C02 8655685197_R04C01
## cg00050873 NA NA
## cg00212031 NA NA
## cg00213748 NA NA
## cg00214611 NA NA
## cg00455876 NA NA
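All beta-values in the corner shown above are NA. To get an impression of how widespread missing values are, the proportion of NA per probe or per sample can be computed with rowMeans and colMeans. The sketch below uses a small invented matrix in place of assays(methData)$beta.

```r
# Invented 3 x 3 matrix standing in for the beta-value assay.
beta <- matrix(c(NA,  NA,  NA,
                 0.2, NA,  0.8,
                 0.1, 0.5, 0.9),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("probe1", "probe2", "probe3"),
                               c("s1", "s2", "s3")))

naPerProbe <- rowMeans(is.na(beta))    # fraction of missing samples per probe
naPerSample <- colMeans(is.na(beta))   # fraction of missing probes per sample
naPerProbe

# Keep only probes that were measured in every sample
complete <- beta[naPerProbe == 0, , drop = FALSE]
rownames(complete)
```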
RNA seq data
Gene-level SummarizedExperiment
data(rnaSeqData_freeze1_06032015BIOS)
ls()
## [1] "BIOBANKS" "DATASETS" "LLSMalesAbove70"
## [4] "MDB" "methData" "phenotypes"
## [7] "PROXY" "RDB" "rnaSeqData"
## [10] "RP3DATADIR" "SRMBASE" "USRPWD"
## [13] "VIEWS"
rnaSeqData
## Warning: The SummarizedExperiment class defined in the GenomicRanges package is
## deprecated and being replaced with the RangedSummarizedExperiment class
## defined in the new SummarizedExperiment package. You can use
## updateObject() on any SummarizedExperiment object to turn it into a
## RangedSummarizedExperiment.
## class: SummarizedExperiment
## dim: 46628 2116
## exptData(0):
## assays(1): counts
## rownames(46628): ENSG00000000419 ENSG00000000457 ...
## ENSG00000270182 ENSG00000270184
## rowRanges metadata column names(2): gc length
## colnames(2116): BD1NYRACXX-5-1 AD10W1ACXX-4-1 ... BC1KAVACXX-1-14
## BC1KAVACXX-8-16
## colData names(140): group lib.size ...
## fastqc_clean.R2_clean_GC_std fastqc_clean.R1_clean_GC_std
colData(rnaSeqData)
## DataFrame with 2116 rows and 140 columns
## group lib.size norm.factors rnaseq_run_id
## <factor> <numeric> <numeric> <character>
## BD1NYRACXX-5-1 CODAM 1259404830 1 BD1NYRACXX-5-1
## AD10W1ACXX-4-1 CODAM 1632462474 1 AD10W1ACXX-4-1
## BD1NYRACXX-5-2 CODAM 1978420658 1 BD1NYRACXX-5-2
## AD10W1ACXX-4-2 CODAM 1334043187 1 AD10W1ACXX-4-2
## BD1NYRACXX-5-3 CODAM 1222613586 1 BD1NYRACXX-5-3
## ... ... ... ... ...
## AD1NFNACXX-1-1 RS 1709905424 1 AD1NFNACXX-1-1
## AC1JV9ACXX-5-10 RS 765091757 1 AC1JV9ACXX-5-10
## AD1NFNACXX-1-20 RS 2327049556 1 AD1NFNACXX-1-20
## BC1KAVACXX-1-14 RS 2401508849 1 BC1KAVACXX-1-14
## BC1KAVACXX-8-16 RS 1710394939 1 BC1KAVACXX-8-16
## bios_id uuid biobank_id person_id
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 CODAM-2001 BIOS6DB3BAD1 CODAM 2001
## AD10W1ACXX-4-1 CODAM-2002 BIOSCFA14234 CODAM 2002
## BD1NYRACXX-5-2 CODAM-2009 BIOSCA449668 CODAM 2009
## AD10W1ACXX-4-2 CODAM-2013 BIOS415A8BFB CODAM 2013
## BD1NYRACXX-5-3 CODAM-2016 BIOSD16ED999 CODAM 2016
## ... ... ... ... ...
## AD1NFNACXX-1-1 RS-942 BIOSCC469FF2 RS 942
## AC1JV9ACXX-5-10 RS-9420 BIOSB1058B1B RS 9420
## AD1NFNACXX-1-20 RS-969 BIOSA2EF6C80 RS 969
## BC1KAVACXX-1-14 RS-982 BIOS027136BA RS 982
## BC1KAVACXX-8-16 RS-984 BIOSC01C4781 RS 984
## nreruns rnaseq_qc methylation_run_id pheno_id
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 1 0 8667053102_R05C02 2001
## AD10W1ACXX-4-1 1 0 8667053157_R01C02 2002
## BD1NYRACXX-5-2 1 0 8667053152_R02C02 2009
## AD10W1ACXX-4-2 1 0 8655685053_R04C02 2013
## BD1NYRACXX-5-3 1 0 8655685094_R01C01 2016
## ... ... ... ... ...
## AD1NFNACXX-1-1 1 0 8691803030_R05C01 942
## AC1JV9ACXX-5-10 1 0 8691803046_R04C02 9420
## AD1NFNACXX-1-20 1 0 8691803032_R01C01 969
## BC1KAVACXX-1-14 1 0 8454787105_R02C02 982
## BC1KAVACXX-8-16 1 0 8691803032_R06C01 984
## gwas_id dna_id rna_id gonl_id
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 2001 2001 2001 NA
## AD10W1ACXX-4-1 2002 2002 2002 NA
## BD1NYRACXX-5-2 2009 2009 2009 NA
## AD10W1ACXX-4-2 2013 2013 2013 NA
## BD1NYRACXX-5-3 2016 2016 2016 NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 942 942 942 NA
## AC1JV9ACXX-5-10 9420 9420 9420 NA
## AD1NFNACXX-1-20 969 969 969 NA
## BC1KAVACXX-1-14 982 982 982 NA
## BC1KAVACXX-8-16 984 984 984 NA
## cg_id in_rp3 rnaseq_freeze methylation_freeze
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA TRUE 1 1
## AD10W1ACXX-4-1 NA TRUE 1 1
## BD1NYRACXX-5-2 NA TRUE 1 1
## AD10W1ACXX-4-2 NA TRUE 1 1
## BD1NYRACXX-5-3 NA TRUE 1 1
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA TRUE 1 1
## AC1JV9ACXX-5-10 NA TRUE 1 1
## AD1NFNACXX-1-20 NA TRUE 1 1
## BC1KAVACXX-1-14 NA TRUE 1 1
## BC1KAVACXX-8-16 NA TRUE 1 1
## gonlv5imputed
## <character>
## BD1NYRACXX-5-1 TRUE
## AD10W1ACXX-4-1 TRUE
## BD1NYRACXX-5-2 TRUE
## AD10W1ACXX-4-2 TRUE
## BD1NYRACXX-5-3 TRUE
## ... ...
## AD1NFNACXX-1-1 TRUE
## AC1JV9ACXX-5-10 TRUE
## AD1NFNACXX-1-20 TRUE
## BC1KAVACXX-1-14 TRUE
## BC1KAVACXX-8-16 TRUE
## Ascertainment_criterion
## <character>
## BD1NYRACXX-5-1 Selected for mildly increased DM2 /CVD risk factors
## AD10W1ACXX-4-1 Selected for mildly increased DM2 /CVD risk factors
## BD1NYRACXX-5-2 Selected for mildly increased DM2 /CVD risk factors
## AD10W1ACXX-4-2 Selected for mildly increased DM2 /CVD risk factors
## BD1NYRACXX-5-3 Selected for mildly increased DM2 /CVD risk factors
## ... ...
## AD1NFNACXX-1-1 NA
## AC1JV9ACXX-5-10 NA
## AD1NFNACXX-1-20 NA
## BC1KAVACXX-1-14 NA
## BC1KAVACXX-8-16 NA
## GWAS_Chip GWAS_DataGeneration_Date
## <character> <character>
## BD1NYRACXX-5-1 Illumina human omni express 2012
## AD10W1ACXX-4-1 Illumina human omni express 2012
## BD1NYRACXX-5-2 Illumina human omni express 2012
## AD10W1ACXX-4-2 Illumina human omni express 2012
## BD1NYRACXX-5-3 Illumina human omni express 2012
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## DNA_BloodSampling_Age DNA_BloodSampling_Date
## <character> <character>
## BD1NYRACXX-5-1 77.9 2006-08-08
## AD10W1ACXX-4-1 70.5 2006-08-09
## BD1NYRACXX-5-2 66.3 2006-09-14
## AD10W1ACXX-4-2 76.5 2006-09-26
## BD1NYRACXX-5-3 71.9 2006-06-07
## ... ... ...
## AD1NFNACXX-1-1 70.357 2011-10-05
## AC1JV9ACXX-5-10 51.535 2012-03-29
## AD1NFNACXX-1-20 68.233 2011-09-29
## BC1KAVACXX-1-14 66.379 2011-05-19
## BC1KAVACXX-8-16 68.783 2011-10-04
## DNA_BloodSampling_Time DNA_Source
## <character> <character>
## BD1NYRACXX-5-1 8-11 am whole blood (buffy coat)
## AD10W1ACXX-4-1 8-11 am whole blood (buffy coat)
## BD1NYRACXX-5-2 8-11 am whole blood (buffy coat)
## AD10W1ACXX-4-2 8-11 am whole blood (buffy coat)
## BD1NYRACXX-5-3 8-11 am whole blood (buffy coat)
## ... ... ...
## AD1NFNACXX-1-1 9:50:00 NA
## AC1JV9ACXX-5-10 8:20:00 NA
## AD1NFNACXX-1-20 9:30:00 NA
## BC1KAVACXX-1-14 10:25:00 NA
## BC1KAVACXX-8-16 9:00:00 NA
## DNA_Extraction_Method DNA_Extraction_Date
## <character> <character>
## BD1NYRACXX-5-1 QIAamp DNA minikit 2012-05-01
## AD10W1ACXX-4-1 QIAamp DNA minikit 2012-05-01
## BD1NYRACXX-5-2 QIAamp DNA minikit 2012-05-01
## AD10W1ACXX-4-2 QIAamp DNA minikit 2012-05-01
## BD1NYRACXX-5-3 QIAamp DNA minikit 2012-05-01
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## DNA_QuantificationMethod DNA_A260A280ratio
## <character> <character>
## BD1NYRACXX-5-1 nanodrop 1.9
## AD10W1ACXX-4-1 nanodrop 1.92
## BD1NYRACXX-5-2 nanodrop 1.89
## AD10W1ACXX-4-2 nanodrop 1.89
## BD1NYRACXX-5-3 nanodrop 1.89
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## RNA_BloodSampling_Age RNA_Sampling_Date RNA_Sampling_Time
## <character> <character> <character>
## BD1NYRACXX-5-1 77.9 2006-08-08 8-11 am
## AD10W1ACXX-4-1 70.5 2006-08-09 8-11 am
## BD1NYRACXX-5-2 66.3 2006-09-14 8-11 am
## AD10W1ACXX-4-2 76.5 2006-09-26 8-11 am
## BD1NYRACXX-5-3 71.9 2006-06-07 8-11 am
## ... ... ... ...
## AD1NFNACXX-1-1 70.357 2011-10-05 9:50:00
## AC1JV9ACXX-5-10 51.535 2012-03-29 8:20:00
## AD1NFNACXX-1-20 68.233 2011-09-29 9:30:00
## BC1KAVACXX-1-14 66.379 2011-05-19 10:25:00
## BC1KAVACXX-8-16 68.783 2011-10-04 9:00:00
## RNA_Source RNA_Extraction_Date
## <character> <character>
## BD1NYRACXX-5-1 PAX gene 2010-07-01
## AD10W1ACXX-4-1 PAX gene 2010-07-01
## BD1NYRACXX-5-2 PAX gene 2010-07-01
## AD10W1ACXX-4-2 PAX gene 2010-07-01
## BD1NYRACXX-5-3 PAX gene 2010-07-01
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## RNA_Extraction_Method RNA_RIN
## <character> <character>
## BD1NYRACXX-5-1 PAXgene blood miRNA kit (Qiacube) 9.1
## AD10W1ACXX-4-1 PAXgene blood miRNA kit (Qiacube) 9
## BD1NYRACXX-5-2 PAXgene blood miRNA kit (Qiacube) 9
## AD10W1ACXX-4-2 PAXgene blood miRNA kit (Qiacube) 8.8
## BD1NYRACXX-5-3 PAXgene blood miRNA kit (Qiacube) 9
## ... ... ...
## AD1NFNACXX-1-1 NA 8.539
## AC1JV9ACXX-5-10 NA 8.1775
## AD1NFNACXX-1-20 NA 8.1436
## BC1KAVACXX-1-14 NA 8.5
## BC1KAVACXX-8-16 NA 8.7492
## RNA_A260280ratio BirthYear Sex Smoking_Age
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 2 1928 0 77.9
## AD10W1ACXX-4-1 2 1936 1 70.5
## BD1NYRACXX-5-2 2.2 1940 0 66.3
## AD10W1ACXX-4-2 2.2 1930 0 76.5
## BD1NYRACXX-5-3 2.1 1934 0 71.9
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA 1941 0 NA
## AC1JV9ACXX-5-10 NA 1960 0 NA
## AD1NFNACXX-1-20 NA 1943 1 NA
## BC1KAVACXX-1-14 NA 1944 0 NA
## BC1KAVACXX-8-16 NA 1942 1 NA
## Smoking Lipids_BloodSampling_Age
## <character> <character>
## BD1NYRACXX-5-1 1 77.9
## AD10W1ACXX-4-1 0 70.5
## BD1NYRACXX-5-2 2 66.3
## AD10W1ACXX-4-2 1 76.5
## BD1NYRACXX-5-3 1 71.9
## ... ... ...
## AD1NFNACXX-1-1 NA 70.357
## AC1JV9ACXX-5-10 NA 51.535
## AD1NFNACXX-1-20 NA 68.233
## BC1KAVACXX-1-14 NA 66.379
## BC1KAVACXX-8-16 NA 68.783
## Lipids_BloodSampling_Date Lipids_BloodSampling_Time
## <character> <character>
## BD1NYRACXX-5-1 2006-08-08 8-11 am
## AD10W1ACXX-4-1 2006-08-09 8-11 am
## BD1NYRACXX-5-2 2006-09-14 8-11 am
## AD10W1ACXX-4-2 2006-09-26 8-11 am
## BD1NYRACXX-5-3 2006-06-07 8-11 am
## ... ... ...
## AD1NFNACXX-1-1 2011-10-05 9:50:00
## AC1JV9ACXX-5-10 2012-03-29 8:20:00
## AD1NFNACXX-1-20 2011-09-29 9:30:00
## BC1KAVACXX-1-14 2011-05-19 10:25:00
## BC1KAVACXX-8-16 2011-10-04 9:00:00
## Lipids_BloodSampling_Fasting TotChol HDLchol
## <character> <character> <character>
## BD1NYRACXX-5-1 1 5.6 1.28
## AD10W1ACXX-4-1 1 4.3 1.24
## BD1NYRACXX-5-2 1 5.4 1.4
## AD10W1ACXX-4-2 1 6 1.08
## BD1NYRACXX-5-3 1 5.7 1.22
## ... ... ... ...
## AD1NFNACXX-1-1 1 4.1 1.6
## AC1JV9ACXX-5-10 1 5.9 1.63
## AD1NFNACXX-1-20 1 6.6 2.22
## BC1KAVACXX-1-14 1 5.3 0.97
## BC1KAVACXX-8-16 1 5.9 1.7
## Triglycerides LDLchol LDLcholMethod LipidsMed_Age
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 1.5 NA NA NA
## AD10W1ACXX-4-1 1.1 NA NA NA
## BD1NYRACXX-5-2 0.7 NA NA NA
## AD10W1ACXX-4-2 2.1 NA NA NA
## BD1NYRACXX-5-3 1 NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 0.91 NA NA NA
## AC1JV9ACXX-5-10 0.92 NA NA NA
## AD1NFNACXX-1-20 1.22 NA NA NA
## BC1KAVACXX-1-14 1.4 NA NA NA
## BC1KAVACXX-8-16 0.65 NA NA NA
## LipidMed Anthropometry_Age Height Weight
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 1 77.9 175.5 76.25
## AD10W1ACXX-4-1 1 70.5 166 116.6
## BD1NYRACXX-5-2 0 66.3 170 83
## AD10W1ACXX-4-2 0 76.5 172 86.3
## BD1NYRACXX-5-3 0 71.9 174.5 74.75
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA NA 172 87.5
## AC1JV9ACXX-5-10 NA NA 180 99.9
## AD1NFNACXX-1-20 NA NA 162 66.7
## BC1KAVACXX-1-14 NA NA 183.7 84.3
## BC1KAVACXX-8-16 NA NA 162.7 73.3
## CRP_BloodSampling_Age CRP_BloodSampling_Date
## <character> <character>
## BD1NYRACXX-5-1 77.9 2006-08-08
## AD10W1ACXX-4-1 70.5 2006-08-09
## BD1NYRACXX-5-2 66.3 2006-09-14
## AD10W1ACXX-4-2 76.5 2006-09-26
## BD1NYRACXX-5-3 71.9 2006-06-07
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## CRP_BloodSampling_Time hsCRP
## <character> <character>
## BD1NYRACXX-5-1 NA 0.95
## AD10W1ACXX-4-1 NA 4.61
## BD1NYRACXX-5-2 NA 0.78
## AD10W1ACXX-4-2 NA 8.48
## BD1NYRACXX-5-3 NA 0.94
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## CellCount_BloodSampling_Age CellCount_BloodSampling_Date
## <character> <character>
## BD1NYRACXX-5-1 NA NA
## AD10W1ACXX-4-1 NA NA
## BD1NYRACXX-5-2 NA NA
## AD10W1ACXX-4-2 NA NA
## BD1NYRACXX-5-3 NA NA
## ... ... ...
## AD1NFNACXX-1-1 70.357 2011-10-05
## AC1JV9ACXX-5-10 51.535 2012-03-29
## AD1NFNACXX-1-20 68.233 2011-09-29
## BC1KAVACXX-1-14 66.379 2011-05-19
## BC1KAVACXX-8-16 68.783 2011-10-04
## CellCount_BloodSampling_Time WBC RBC
## <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA
## AD10W1ACXX-4-1 NA NA NA
## BD1NYRACXX-5-2 NA NA NA
## AD10W1ACXX-4-2 NA NA NA
## BD1NYRACXX-5-3 NA NA NA
## ... ... ... ...
## AD1NFNACXX-1-1 9:50:00 8 4.82
## AC1JV9ACXX-5-10 8:20:00 6.4 5.02
## AD1NFNACXX-1-20 9:30:00 5.5 4.32
## BC1KAVACXX-1-14 10:25:00 6.5 5.21
## BC1KAVACXX-8-16 9:00:00 5.6 4.55
## HGB HCT MCV MCH
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 8.9 0.43 90 1.85
## AC1JV9ACXX-5-10 9.8 0.46 92.5 1.94
## AD1NFNACXX-1-20 8.3 0.4 93.4 1.92
## BC1KAVACXX-1-14 9.4 0.45 87.1 1.81
## BC1KAVACXX-8-16 8.5 0.42 93.2 1.86
## MCHC CHCM CH RDW
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 20.5 NA NA NA
## AC1JV9ACXX-5-10 21 NA NA NA
## AD1NFNACXX-1-20 20.6 NA NA NA
## BC1KAVACXX-1-14 20.8 NA NA NA
## BC1KAVACXX-8-16 20 NA NA NA
## HDW PLT MPV Neut
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA 248 8 NA
## AC1JV9ACXX-5-10 NA 345 6.7 NA
## AD1NFNACXX-1-20 NA 241 7.1 NA
## BC1KAVACXX-1-14 NA 265 7.4 NA
## BC1KAVACXX-8-16 NA 225 7.6 NA
## Lymph Mono Eos Baso
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA NA NA NA
## AC1JV9ACXX-5-10 NA NA NA NA
## AD1NFNACXX-1-20 NA NA NA NA
## BC1KAVACXX-1-14 NA NA NA NA
## BC1KAVACXX-8-16 NA NA NA NA
## LUC Neut_Perc Lymph_Perc Mono_Perc
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA NA 42.6 8.5
## AC1JV9ACXX-5-10 NA NA 31.9 7.9
## AD1NFNACXX-1-20 NA NA 29.9 8.7
## BC1KAVACXX-1-14 NA NA 37.2 3.9
## BC1KAVACXX-8-16 NA NA 41.6 9.7
## Eos_Perc Baso_Perc LUC_Perc run_number
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA 125
## AD10W1ACXX-4-1 NA NA NA 234
## BD1NYRACXX-5-2 NA NA NA 125
## AD10W1ACXX-4-2 NA NA NA 234
## BD1NYRACXX-5-3 NA NA NA 125
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA NA NA 124
## AC1JV9ACXX-5-10 NA NA NA 243
## AD1NFNACXX-1-20 NA NA NA 124
## BC1KAVACXX-1-14 NA NA NA 123
## BC1KAVACXX-8-16 NA NA NA 123
## flowcell_number machine raw past_filter
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 2b SN1013 40434000 32052000
## AD10W1ACXX-4-1 3a SN505 46622000 41461000
## BD1NYRACXX-5-2 2b SN1013 61132000 48674000
## AD10W1ACXX-4-2 3a SN505 36635000 33116000
## BD1NYRACXX-5-3 2b SN1013 40919000 31762000
## ... ... ... ... ...
## AD1NFNACXX-1-1 2a SN1013 50351000 43968000
## AC1JV9ACXX-5-10 10a SN505 23,696,872 21,347,978
## AD1NFNACXX-1-20 2a SN1013 66222000 58360000
## BC1KAVACXX-1-14 1b SN1013 63089000 57536000
## BC1KAVACXX-8-16 1b SN1013 44481000 41267000
## date insert_size star.avg_deletion_length
## <character> <character> <character>
## BD1NYRACXX-5-1 2013-03-29 325 1.47
## AD10W1ACXX-4-1 2013-04-17 313 1.42
## BD1NYRACXX-5-2 2013-03-29 325 1.46
## AD10W1ACXX-4-2 2013-04-17 305 1.44
## BD1NYRACXX-5-3 2013-03-29 308 1.43
## ... ... ... ...
## AD1NFNACXX-1-1 2013-03-29 298 1.47
## AC1JV9ACXX-5-10 2013-07-09 304 1.58
## AD1NFNACXX-1-20 2013-03-29 325 1.47
## BC1KAVACXX-1-14 2013-03-19 326 1.43
## BC1KAVACXX-8-16 2013-03-19 314 1.45
## star.start_mapping_time star.pct_unique_mapped
## <character> <character>
## BD1NYRACXX-5-1 Oct 10 21:39:51 93.19
## AD10W1ACXX-4-1 Oct 07 17:40:58 92.4
## BD1NYRACXX-5-2 Oct 11 00:56:26 92.49
## AD10W1ACXX-4-2 Oct 16 20:38:57 92.63
## BD1NYRACXX-5-3 Oct 10 21:55:25 92.9
## ... ... ...
## AD1NFNACXX-1-1 Oct 08 10:43:53 92.68
## AC1JV9ACXX-5-10 Oct 06 12:24:09 89.53
## AD1NFNACXX-1-20 Oct 09 03:56:44 92.2
## BC1KAVACXX-1-14 Oct 10 05:01:36 93.06
## BC1KAVACXX-8-16 Oct 10 08:05:07 90.16
## star.num_unique_mapped star.num_splice_annotated
## <character> <character>
## BD1NYRACXX-5-1 13384007 3603631
## AD10W1ACXX-4-1 16673749 4812839
## BD1NYRACXX-5-2 20220194 5536547
## AD10W1ACXX-4-2 13579658 3876835
## BD1NYRACXX-5-3 12571418 3404880
## ... ... ...
## AD1NFNACXX-1-1 18203944 4744204
## AC1JV9ACXX-5-10 8140547 2414167
## AD1NFNACXX-1-20 24617486 6577238
## BC1KAVACXX-1-14 24967281 6457378
## BC1KAVACXX-8-16 17642273 4919424
## star.num_splice_noncanonical star.pct_unmapped_other
## <character> <character>
## BD1NYRACXX-5-1 3291 0.06
## AD10W1ACXX-4-1 4690 0.05
## BD1NYRACXX-5-2 5662 0.06
## AD10W1ACXX-4-2 4701 0.05
## BD1NYRACXX-5-3 3997 0.05
## ... ... ...
## AD1NFNACXX-1-1 5975 0.07
## AC1JV9ACXX-5-10 3276 0.05
## AD1NFNACXX-1-20 6869 0.06
## BC1KAVACXX-1-14 7585 0.05
## BC1KAVACXX-8-16 5405 0.06
## star.num_splice_total star.num_splice_atac
## <character> <character>
## BD1NYRACXX-5-1 3631134 3204
## AD10W1ACXX-4-1 4852111 4039
## BD1NYRACXX-5-2 5584443 4996
## AD10W1ACXX-4-2 3909894 3384
## BD1NYRACXX-5-3 3433262 3033
## ... ... ...
## AD1NFNACXX-1-1 4783352 4325
## AC1JV9ACXX-5-10 2438071 2838
## AD1NFNACXX-1-20 6630389 5836
## BC1KAVACXX-1-14 6511885 5826
## BC1KAVACXX-8-16 4962823 4477
## star.num_splice_gcag star.num_input
## <character> <character>
## BD1NYRACXX-5-1 24928 14362274
## AD10W1ACXX-4-1 32581 18044680
## BD1NYRACXX-5-2 37530 21861429
## AD10W1ACXX-4-2 27794 14660267
## BD1NYRACXX-5-3 23072 13532066
## ... ... ...
## AD1NFNACXX-1-1 31893 19641011
## AC1JV9ACXX-5-10 18193 9092744
## AD1NFNACXX-1-20 46290 26699100
## BC1KAVACXX-1-14 44465 26828880
## BC1KAVACXX-8-16 33675 19567453
## star.rate_deletion_per_base star.pct_mapped_multiple
## <character> <character>
## BD1NYRACXX-5-1 0 3.92
## AD10W1ACXX-4-1 0 4.25
## BD1NYRACXX-5-2 0 4.47
## AD10W1ACXX-4-2 0 4.12
## BD1NYRACXX-5-3 0 4.11
## ... ... ...
## AD1NFNACXX-1-1 0 4.19
## AC1JV9ACXX-5-10 0 3.69
## AD1NFNACXX-1-20 0 3.55
## BC1KAVACXX-1-14 0 3.87
## BC1KAVACXX-8-16 0 3.8
## star.rate_mismatch_per_base star.start_job_time
## <character> <character>
## BD1NYRACXX-5-1 0.23 Oct 10 21:38:54
## AD10W1ACXX-4-1 0.22 Oct 07 17:39:59
## BD1NYRACXX-5-2 0.25 Oct 11 00:55:27
## AD10W1ACXX-4-2 0.22 Oct 16 20:29:47
## BD1NYRACXX-5-3 0.25 Oct 10 21:53:03
## ... ... ...
## AD1NFNACXX-1-1 0.2 Oct 08 10:41:20
## AC1JV9ACXX-5-10 0.26 Oct 06 12:22:56
## AD1NFNACXX-1-20 0.2 Oct 09 03:55:17
## BC1KAVACXX-1-14 0.21 Oct 10 04:59:50
## BC1KAVACXX-8-16 0.19 Oct 10 08:02:10
## star.pct_unmapped_short star.mapping_speed
## <character> <character>
## BD1NYRACXX-5-1 2.56 544.25
## AD10W1ACXX-4-1 3.02 595.97
## BD1NYRACXX-5-2 2.75 554.23
## AD10W1ACXX-4-2 2.95 593
## BD1NYRACXX-5-3 2.74 624.56
## ... ... ...
## AD1NFNACXX-1-1 2.73 625.73
## AC1JV9ACXX-5-10 6.35 503.6
## AD1NFNACXX-1-20 3.85 279.41
## BC1KAVACXX-1-14 2.7 555.08
## BC1KAVACXX-8-16 5.71 489.19
## star.avg_insertion_length star.pct_mapped_many
## <character> <character>
## BD1NYRACXX-5-1 1.2 0.26
## AD10W1ACXX-4-1 1.19 0.28
## BD1NYRACXX-5-2 1.2 0.23
## AD10W1ACXX-4-2 1.19 0.24
## BD1NYRACXX-5-3 1.2 0.2
## ... ... ...
## AD1NFNACXX-1-1 1.2 0.32
## AC1JV9ACXX-5-10 1.21 0.38
## AD1NFNACXX-1-20 1.2 0.34
## BC1KAVACXX-1-14 1.19 0.32
## BC1KAVACXX-8-16 1.19 0.28
## star.rate_insertion_per_base star.num_splice_gtag
## <character> <character>
## BD1NYRACXX-5-1 0.01 3599711
## AD10W1ACXX-4-1 0.01 4810801
## BD1NYRACXX-5-2 0.01 5536255
## AD10W1ACXX-4-2 0.01 3874015
## BD1NYRACXX-5-3 0.01 3403160
## ... ... ...
## AD1NFNACXX-1-1 0.01 4741159
## AC1JV9ACXX-5-10 0 2413764
## AD1NFNACXX-1-20 0.01 6571394
## BC1KAVACXX-1-14 0.01 6454009
## BC1KAVACXX-8-16 0.01 4919266
## star.num_mapped_many star.num_mapped_multiple
## <character> <character>
## BD1NYRACXX-5-1 38000 563715
## AD10W1ACXX-4-1 50826 766081
## BD1NYRACXX-5-2 49664 976688
## AD10W1ACXX-4-2 35799 604307
## BD1NYRACXX-5-3 27346 555867
## ... ... ...
## AD1NFNACXX-1-1 63747 823711
## AC1JV9ACXX-5-10 34925 335743
## AD1NFNACXX-1-20 91494 947493
## BC1KAVACXX-1-14 85467 1037749
## BC1KAVACXX-8-16 54001 742645
## star.avg_mapped_length star.avg_input_length
## <character> <character>
## BD1NYRACXX-5-1 96.67 97
## AD10W1ACXX-4-1 98.17 98
## BD1NYRACXX-5-2 96.68 97
## AD10W1ACXX-4-2 98.17 98
## BD1NYRACXX-5-3 96.46 96
## ... ... ...
## AD1NFNACXX-1-1 98.17 98
## AC1JV9ACXX-5-10 97.49 98
## AD1NFNACXX-1-20 98.15 98
## BC1KAVACXX-1-14 98.33 98
## BC1KAVACXX-8-16 98.48 98
## star.end_time star.pct_unmapped_mismatch
## <character> <character>
## BD1NYRACXX-5-1 Oct 10 21:41:26 0
## AD10W1ACXX-4-1 Oct 07 17:42:47 0
## BD1NYRACXX-5-2 Oct 11 00:58:48 0
## AD10W1ACXX-4-2 Oct 16 20:40:26 0
## BD1NYRACXX-5-3 Oct 10 21:56:43 0
## ... ... ...
## AD1NFNACXX-1-1 Oct 08 10:45:46 0
## AC1JV9ACXX-5-10 Oct 06 12:25:14 0
## AD1NFNACXX-1-20 Oct 09 04:02:28 0
## BC1KAVACXX-1-14 Oct 10 05:04:30 0
## BC1KAVACXX-8-16 Oct 10 08:07:31 0
## bam.genome_insert_mean bam.genome_insert_std
## <character> <character>
## BD1NYRACXX-5-1 288.648256495878 234.108811011201
## AD10W1ACXX-4-1 294.584736018414 245.447514390202
## BD1NYRACXX-5-2 286.413248406994 230.288439069619
## AD10W1ACXX-4-2 291.046940453097 240.342521371492
## BD1NYRACXX-5-3 275.199058717648 216.256191245783
## ... ... ...
## AD1NFNACXX-1-1 243.377767271685 174.593146049674
## AC1JV9ACXX-5-10 270.98717085555 234.077479869603
## AD1NFNACXX-1-20 271.426005179576 206.029383007847
## BC1KAVACXX-1-14 262.341694194699 188.846941324494
## BC1KAVACXX-8-16 281.656613101494 222.606885771599
## bam.genome_duplicates bam.exon_duplicates bam.exon_mapped
## <character> <character> <character>
## BD1NYRACXX-5-1 2661291 0 26186566
## AD10W1ACXX-4-1 4967141 0 33376848
## BD1NYRACXX-5-2 7305149 0 41080270
## AD10W1ACXX-4-2 3560547 0 27278995
## BD1NYRACXX-5-3 3668799 0 25458963
## ... ... ... ...
## AD1NFNACXX-1-1 6078591 0 34995158
## AC1JV9ACXX-5-10 5175009 0 15785461
## AD1NFNACXX-1-20 8338397 0 47655190
## BC1KAVACXX-1-14 8125608 0 49057971
## BC1KAVACXX-8-16 5190439 0 34874136
## bam.genome_total bam.genome_mapped bam.exon_total
## <character> <character> <character>
## BD1NYRACXX-5-1 30369701 29537099 26186566
## AD10W1ACXX-4-1 38317327 37103536 33376848
## BD1NYRACXX-5-2 46554232 45220039 41080270
## AD10W1ACXX-4-2 31054670 30098674 27278995
## BD1NYRACXX-5-3 28654414 27841451 25458963
## ... ... ... ...
## AD1NFNACXX-1-1 41723982 40491606 34995158
## AC1JV9ACXX-5-10 19163800 17928954 15785461
## AD1NFNACXX-1-20 56174753 53899515 47655190
## BC1KAVACXX-1-14 56673162 55019984 49057971
## BC1KAVACXX-8-16 41274638 38906239 34874136
## fastqc_raw.R2_raw_GC_mean fastqc_raw.R2_raw_GC_std
## <character> <character>
## BD1NYRACXX-5-1 50.6134742425783 11.9436168503436
## AD10W1ACXX-4-1 52.5497727984934 11.6764709027634
## BD1NYRACXX-5-2 51.3511595055147 11.8771215708693
## AD10W1ACXX-4-2 52.9603163325893 11.7375629125041
## BD1NYRACXX-5-3 51.6023001132948 11.7728320915493
## ... ... ...
## AD1NFNACXX-1-1 51.1126529412655 12.1481063937884
## AC1JV9ACXX-5-10 56.1104199342259 10.8167704633456
## AD1NFNACXX-1-20 52.2642327963524 11.8520406039121
## BC1KAVACXX-1-14 52.1173262764685 11.9197420032885
## BC1KAVACXX-8-16 52.6479899386612 11.7668179866764
## fastqc_raw.R1_raw_GC_mean fastqc_raw.R1_raw_GC_std
## <character> <character>
## BD1NYRACXX-5-1 50.2351419723759 11.8848917281142
## AD10W1ACXX-4-1 51.8356043001923 11.2889096664749
## BD1NYRACXX-5-2 51.0789575779742 11.8219118904248
## AD10W1ACXX-4-2 52.4489872339274 11.4721063444188
## BD1NYRACXX-5-3 50.9744983067781 11.6161522540398
## ... ... ...
## AD1NFNACXX-1-1 50.6593415093687 11.84707799697
## AC1JV9ACXX-5-10 54.909296338238 10.5315750859386
## AD1NFNACXX-1-20 52.0007684863848 11.7174692643988
## BC1KAVACXX-1-14 51.9082951558929 11.7670706654902
## BC1KAVACXX-8-16 52.4404759544934 11.5887318822462
## fastqc_clean.R1_clean_GC_mean
## <character>
## BD1NYRACXX-5-1 50.164378341129
## AD10W1ACXX-4-1 51.8220336816596
## BD1NYRACXX-5-2 51.0271786120152
## AD10W1ACXX-4-2 52.4373139201977
## BD1NYRACXX-5-3 50.9028924407951
## ... ...
## AD1NFNACXX-1-1 50.5797529326033
## AC1JV9ACXX-5-10 55.634548872841
## AD1NFNACXX-1-20 51.9339059552363
## BC1KAVACXX-1-14 51.8790750218106
## BC1KAVACXX-8-16 52.3904075032816
## fastqc_clean.R2_clean_GC_mean fastqc_clean.R2_clean_GC_std
## <character> <character>
## BD1NYRACXX-5-1 50.3968946725711 12.1131258203262
## AD10W1ACXX-4-1 52.0100726108017 11.811466538215
## BD1NYRACXX-5-2 51.1828528749685 12.0540137218588
## AD10W1ACXX-4-2 52.5987985271214 11.9084620443752
## BD1NYRACXX-5-3 51.1013302404673 12.082642631825
## ... ... ...
## AD1NFNACXX-1-1 50.7719921815163 12.2156418850006
## AC1JV9ACXX-5-10 55.7497819654327 11.0483150261391
## AD1NFNACXX-1-20 52.1394176386261 11.9282296293526
## BC1KAVACXX-1-14 52.0255556861058 11.9945544323497
## BC1KAVACXX-8-16 52.5096132946266 11.7511126114672
## fastqc_clean.R1_clean_GC_std
## <character>
## BD1NYRACXX-5-1 12.2507376336796
## AD10W1ACXX-4-1 11.7524863028
## BD1NYRACXX-5-2 12.1621853153794
## AD10W1ACXX-4-2 11.8501461812698
## BD1NYRACXX-5-3 12.2171170446126
## ... ...
## AD1NFNACXX-1-1 12.1562080492483
## AC1JV9ACXX-5-10 10.9872330877667
## AD1NFNACXX-1-20 11.8857304444009
## BC1KAVACXX-1-14 11.9727270527311
## BC1KAVACXX-8-16 11.7504095784964
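Every column in the `colData` printout above is typed `<character>`, including counts and percentages, and at least one run (`AC1JV9ACXX-5-10`) stores `raw` and `past_filter` with thousands separators (`23,696,872`). The following is a minimal sketch of one way to coerce such columns before analysis. It uses a small base R `data.frame` as a stand-in for `colData(rnaSeqData)`, with values copied from the output above; the helper `to_numeric` and the derived column `pct_past_filter` are hypothetical names, not part of the BIOS packages.

```r
# Toy stand-in for three colData columns; values are copied from the printout above.
pheno <- data.frame(
  raw = c("40434000", "23,696,872"),
  past_filter = c("32052000", "21,347,978"),
  star.pct_unique_mapped = c("93.19", "89.53"),
  row.names = c("BD1NYRACXX-5-1", "AC1JV9ACXX-5-10"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop thousands separators, then convert to numeric.
to_numeric <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))

# Convert every column in place while keeping the data.frame structure.
pheno[] <- lapply(pheno, to_numeric)

# Example of a derived quantity that needs numeric input.
pheno$pct_past_filter <- 100 * pheno$past_filter / pheno$raw
```

On the real object the same helper could presumably be applied per column, for example `colData(rnaSeqData)$raw <- to_numeric(colData(rnaSeqData)$raw)`, after checking which columns are genuinely numeric.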
rowRanges(rnaSeqData)
## GRanges object with 46628 ranges and 2 metadata columns:
## seqnames ranges strand |
## <Rle> <IRanges> <Rle> |
## ENSG00000000419 chr20 [ 49551404, 49575092] - |
## ENSG00000000457 chr1 [169818772, 169863408] - |
## ENSG00000000460 chr1 [169631245, 169823221] + |
## ENSG00000000938 chr1 [ 27938575, 27961788] - |
## ENSG00000000971 chr1 [196621008, 196716634] + |
## ... ... ... ... ...
## ENSG00000270174 chr6 [ 5665218, 5695505] - |
## ENSG00000270177 chr5 [133562101, 133563518] + |
## ENSG00000270178 chr3 [179521851, 179522154] + |
## ENSG00000270182 chr7 [ 27197963, 27198595] + |
## ENSG00000270184 chr16 [ 85817988, 85821223] + |
## gc length
## <numeric> <numeric>
## ENSG00000000419 0.397680198840099 1207
## ENSG00000000457 0.466715435259693 2734
## ENSG00000000460 0.430529977491303 4887
## ENSG00000000938 0.573114565342545 3474
## ENSG00000000971 0.361493123772102 8144
## ... ... ...
## ENSG00000270174 0.501240694789082 806
## ENSG00000270177 0.539492242595205 1418
## ENSG00000270178 0.375 304
## ENSG00000270182 0.53870458135861 633
## ENSG00000270184 0.44267053701016 689
## -------
## seqinfo: 93 sequences (1 circular) from hg19 genome
assays(rnaSeqData)$counts[1:5, 1:5]
##                 BD1NYRACXX-5-1 AD10W1ACXX-4-1 BD1NYRACXX-5-2
## ENSG00000000419          18910          26042          33868
## ENSG00000000457          22340          26380          30769
## ENSG00000000460           6793           6177           7889
## ENSG00000000938        1129953        1387162        1616590
## ENSG00000000971           8526           5429           8833
##                 AD10W1ACXX-4-2 BD1NYRACXX-5-3
## ENSG00000000419          16600          16277
## ENSG00000000457          17890          19584
## ENSG00000000460           5670           5630
## ENSG00000000938        1488194        1305774
## ENSG00000000971           7549           3290
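The `rowRanges` metadata shown earlier carries a `length` column per gene, which can be combined with the count matrix. As a hedged illustration, the sketch below computes reads per kilobase for three genes and two runs, using counts and lengths copied from the output above. Whether `length` is the appropriate effective length for normalization is an assumption here, and `rpk` and `gene_length` are illustrative names only.

```r
# Counts for three genes and two runs, copied from the printed matrix above.
counts <- matrix(
  c(18910, 22340, 6793,
    26042, 26380, 6177),
  nrow = 3,
  dimnames = list(
    c("ENSG00000000419", "ENSG00000000457", "ENSG00000000460"),
    c("BD1NYRACXX-5-1", "AD10W1ACXX-4-1")
  )
)

# Gene lengths in bases, copied from the `length` column of rowRanges.
gene_length <- c(
  ENSG00000000419 = 1207,
  ENSG00000000457 = 2734,
  ENSG00000000460 = 4887
)

# Reads per kilobase: each row is divided by its gene length in kb.
# The length vector is recycled down the columns, so row i uses gene_length[i].
rpk <- counts / (gene_length[rownames(counts)] / 1000)
```

With the full object, the lengths would presumably come from `mcols(rowRanges(rnaSeqData))$length` and the counts from `assays(rnaSeqData)$counts`.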
Exon-level SummarizedExperiment
data(rnaSeqData_freeze1_exon_14042015BIOS)
colData(rnaSeqData)
## DataFrame with 2116 rows and 140 columns
## group lib.size norm.factors rnaseq_run_id
## <factor> <numeric> <numeric> <character>
## BD1NYRACXX-5-1 CODAM 1259404830 1 BD1NYRACXX-5-1
## AD10W1ACXX-4-1 CODAM 1632462474 1 AD10W1ACXX-4-1
## BD1NYRACXX-5-2 CODAM 1978420658 1 BD1NYRACXX-5-2
## AD10W1ACXX-4-2 CODAM 1334043187 1 AD10W1ACXX-4-2
## BD1NYRACXX-5-3 CODAM 1222613586 1 BD1NYRACXX-5-3
## ... ... ... ... ...
## AD1NFNACXX-1-1 RS 1709905424 1 AD1NFNACXX-1-1
## AC1JV9ACXX-5-10 RS 765091757 1 AC1JV9ACXX-5-10
## AD1NFNACXX-1-20 RS 2327049556 1 AD1NFNACXX-1-20
## BC1KAVACXX-1-14 RS 2401508849 1 BC1KAVACXX-1-14
## BC1KAVACXX-8-16 RS 1710394939 1 BC1KAVACXX-8-16
## bios_id uuid biobank_id person_id
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 CODAM-2001 BIOS6DB3BAD1 CODAM 2001
## AD10W1ACXX-4-1 CODAM-2002 BIOSCFA14234 CODAM 2002
## BD1NYRACXX-5-2 CODAM-2009 BIOSCA449668 CODAM 2009
## AD10W1ACXX-4-2 CODAM-2013 BIOS415A8BFB CODAM 2013
## BD1NYRACXX-5-3 CODAM-2016 BIOSD16ED999 CODAM 2016
## ... ... ... ... ...
## AD1NFNACXX-1-1 RS-942 BIOSCC469FF2 RS 942
## AC1JV9ACXX-5-10 RS-9420 BIOSB1058B1B RS 9420
## AD1NFNACXX-1-20 RS-969 BIOSA2EF6C80 RS 969
## BC1KAVACXX-1-14 RS-982 BIOS027136BA RS 982
## BC1KAVACXX-8-16 RS-984 BIOSC01C4781 RS 984
## nreruns rnaseq_qc methylation_run_id pheno_id
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 1 0 8667053102_R05C02 2001
## AD10W1ACXX-4-1 1 0 8667053157_R01C02 2002
## BD1NYRACXX-5-2 1 0 8667053152_R02C02 2009
## AD10W1ACXX-4-2 1 0 8655685053_R04C02 2013
## BD1NYRACXX-5-3 1 0 8655685094_R01C01 2016
## ... ... ... ... ...
## AD1NFNACXX-1-1 1 0 8691803030_R05C01 942
## AC1JV9ACXX-5-10 1 0 8691803046_R04C02 9420
## AD1NFNACXX-1-20 1 0 8691803032_R01C01 969
## BC1KAVACXX-1-14 1 0 8454787105_R02C02 982
## BC1KAVACXX-8-16 1 0 8691803032_R06C01 984
## gwas_id dna_id rna_id gonl_id
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 2001 2001 2001 NA
## AD10W1ACXX-4-1 2002 2002 2002 NA
## BD1NYRACXX-5-2 2009 2009 2009 NA
## AD10W1ACXX-4-2 2013 2013 2013 NA
## BD1NYRACXX-5-3 2016 2016 2016 NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 942 942 942 NA
## AC1JV9ACXX-5-10 9420 9420 9420 NA
## AD1NFNACXX-1-20 969 969 969 NA
## BC1KAVACXX-1-14 982 982 982 NA
## BC1KAVACXX-8-16 984 984 984 NA
## cg_id in_rp3 rnaseq_freeze methylation_freeze
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA TRUE 1 1
## AD10W1ACXX-4-1 NA TRUE 1 1
## BD1NYRACXX-5-2 NA TRUE 1 1
## AD10W1ACXX-4-2 NA TRUE 1 1
## BD1NYRACXX-5-3 NA TRUE 1 1
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA TRUE 1 1
## AC1JV9ACXX-5-10 NA TRUE 1 1
## AD1NFNACXX-1-20 NA TRUE 1 1
## BC1KAVACXX-1-14 NA TRUE 1 1
## BC1KAVACXX-8-16 NA TRUE 1 1
## gonlv5imputed
## <character>
## BD1NYRACXX-5-1 TRUE
## AD10W1ACXX-4-1 TRUE
## BD1NYRACXX-5-2 TRUE
## AD10W1ACXX-4-2 TRUE
## BD1NYRACXX-5-3 TRUE
## ... ...
## AD1NFNACXX-1-1 TRUE
## AC1JV9ACXX-5-10 TRUE
## AD1NFNACXX-1-20 TRUE
## BC1KAVACXX-1-14 TRUE
## BC1KAVACXX-8-16 TRUE
## Ascertainment_criterion
## <character>
## BD1NYRACXX-5-1 Selected for mildly increased DM2 /CVD risk factors
## AD10W1ACXX-4-1 Selected for mildly increased DM2 /CVD risk factors
## BD1NYRACXX-5-2 Selected for mildly increased DM2 /CVD risk factors
## AD10W1ACXX-4-2 Selected for mildly increased DM2 /CVD risk factors
## BD1NYRACXX-5-3 Selected for mildly increased DM2 /CVD risk factors
## ... ...
## AD1NFNACXX-1-1 NA
## AC1JV9ACXX-5-10 NA
## AD1NFNACXX-1-20 NA
## BC1KAVACXX-1-14 NA
## BC1KAVACXX-8-16 NA
## GWAS_Chip GWAS_DataGeneration_Date
## <character> <character>
## BD1NYRACXX-5-1 Illumina human omni express 2012
## AD10W1ACXX-4-1 Illumina human omni express 2012
## BD1NYRACXX-5-2 Illumina human omni express 2012
## AD10W1ACXX-4-2 Illumina human omni express 2012
## BD1NYRACXX-5-3 Illumina human omni express 2012
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## DNA_BloodSampling_Age DNA_BloodSampling_Date
## <character> <character>
## BD1NYRACXX-5-1 77.9 2006-08-08
## AD10W1ACXX-4-1 70.5 2006-08-09
## BD1NYRACXX-5-2 66.3 2006-09-14
## AD10W1ACXX-4-2 76.5 2006-09-26
## BD1NYRACXX-5-3 71.9 2006-06-07
## ... ... ...
## AD1NFNACXX-1-1 70.357 2011-10-05
## AC1JV9ACXX-5-10 51.535 2012-03-29
## AD1NFNACXX-1-20 68.233 2011-09-29
## BC1KAVACXX-1-14 66.379 2011-05-19
## BC1KAVACXX-8-16 68.783 2011-10-04
## DNA_BloodSampling_Time DNA_Source
## <character> <character>
## BD1NYRACXX-5-1 8-11 am whole blood (buffy coat)
## AD10W1ACXX-4-1 8-11 am whole blood (buffy coat)
## BD1NYRACXX-5-2 8-11 am whole blood (buffy coat)
## AD10W1ACXX-4-2 8-11 am whole blood (buffy coat)
## BD1NYRACXX-5-3 8-11 am whole blood (buffy coat)
## ... ... ...
## AD1NFNACXX-1-1 9:50:00 NA
## AC1JV9ACXX-5-10 8:20:00 NA
## AD1NFNACXX-1-20 9:30:00 NA
## BC1KAVACXX-1-14 10:25:00 NA
## BC1KAVACXX-8-16 9:00:00 NA
## DNA_Extraction_Method DNA_Extraction_Date
## <character> <character>
## BD1NYRACXX-5-1 QIAamp DNA minikit 2012-05-01
## AD10W1ACXX-4-1 QIAamp DNA minikit 2012-05-01
## BD1NYRACXX-5-2 QIAamp DNA minikit 2012-05-01
## AD10W1ACXX-4-2 QIAamp DNA minikit 2012-05-01
## BD1NYRACXX-5-3 QIAamp DNA minikit 2012-05-01
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## DNA_QuantificationMethod DNA_A260A280ratio
## <character> <character>
## BD1NYRACXX-5-1 nanodrop 1.9
## AD10W1ACXX-4-1 nanodrop 1.92
## BD1NYRACXX-5-2 nanodrop 1.89
## AD10W1ACXX-4-2 nanodrop 1.89
## BD1NYRACXX-5-3 nanodrop 1.89
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## RNA_BloodSampling_Age RNA_Sampling_Date RNA_Sampling_Time
## <character> <character> <character>
## BD1NYRACXX-5-1 77.9 2006-08-08 8-11 am
## AD10W1ACXX-4-1 70.5 2006-08-09 8-11 am
## BD1NYRACXX-5-2 66.3 2006-09-14 8-11 am
## AD10W1ACXX-4-2 76.5 2006-09-26 8-11 am
## BD1NYRACXX-5-3 71.9 2006-06-07 8-11 am
## ... ... ... ...
## AD1NFNACXX-1-1 70.357 2011-10-05 9:50:00
## AC1JV9ACXX-5-10 51.535 2012-03-29 8:20:00
## AD1NFNACXX-1-20 68.233 2011-09-29 9:30:00
## BC1KAVACXX-1-14 66.379 2011-05-19 10:25:00
## BC1KAVACXX-8-16 68.783 2011-10-04 9:00:00
## RNA_Source RNA_Extraction_Date
## <character> <character>
## BD1NYRACXX-5-1 PAX gene 2010-07-01
## AD10W1ACXX-4-1 PAX gene 2010-07-01
## BD1NYRACXX-5-2 PAX gene 2010-07-01
## AD10W1ACXX-4-2 PAX gene 2010-07-01
## BD1NYRACXX-5-3 PAX gene 2010-07-01
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## RNA_Extraction_Method RNA_RIN
## <character> <character>
## BD1NYRACXX-5-1 PAXgene blood miRNA kit (Qiacube) 9.1
## AD10W1ACXX-4-1 PAXgene blood miRNA kit (Qiacube) 9
## BD1NYRACXX-5-2 PAXgene blood miRNA kit (Qiacube) 9
## AD10W1ACXX-4-2 PAXgene blood miRNA kit (Qiacube) 8.8
## BD1NYRACXX-5-3 PAXgene blood miRNA kit (Qiacube) 9
## ... ... ...
## AD1NFNACXX-1-1 NA 8.539
## AC1JV9ACXX-5-10 NA 8.1775
## AD1NFNACXX-1-20 NA 8.1436
## BC1KAVACXX-1-14 NA 8.5
## BC1KAVACXX-8-16 NA 8.7492
## RNA_A260280ratio BirthYear Sex Smoking_Age
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 2 1928 0 77.9
## AD10W1ACXX-4-1 2 1936 1 70.5
## BD1NYRACXX-5-2 2.2 1940 0 66.3
## AD10W1ACXX-4-2 2.2 1930 0 76.5
## BD1NYRACXX-5-3 2.1 1934 0 71.9
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA 1941 0 NA
## AC1JV9ACXX-5-10 NA 1960 0 NA
## AD1NFNACXX-1-20 NA 1943 1 NA
## BC1KAVACXX-1-14 NA 1944 0 NA
## BC1KAVACXX-8-16 NA 1942 1 NA
## Smoking Lipids_BloodSampling_Age
## <character> <character>
## BD1NYRACXX-5-1 1 77.9
## AD10W1ACXX-4-1 0 70.5
## BD1NYRACXX-5-2 2 66.3
## AD10W1ACXX-4-2 1 76.5
## BD1NYRACXX-5-3 1 71.9
## ... ... ...
## AD1NFNACXX-1-1 NA 70.357
## AC1JV9ACXX-5-10 NA 51.535
## AD1NFNACXX-1-20 NA 68.233
## BC1KAVACXX-1-14 NA 66.379
## BC1KAVACXX-8-16 NA 68.783
## Lipids_BloodSampling_Date Lipids_BloodSampling_Time
## <character> <character>
## BD1NYRACXX-5-1 2006-08-08 8-11 am
## AD10W1ACXX-4-1 2006-08-09 8-11 am
## BD1NYRACXX-5-2 2006-09-14 8-11 am
## AD10W1ACXX-4-2 2006-09-26 8-11 am
## BD1NYRACXX-5-3 2006-06-07 8-11 am
## ... ... ...
## AD1NFNACXX-1-1 2011-10-05 9:50:00
## AC1JV9ACXX-5-10 2012-03-29 8:20:00
## AD1NFNACXX-1-20 2011-09-29 9:30:00
## BC1KAVACXX-1-14 2011-05-19 10:25:00
## BC1KAVACXX-8-16 2011-10-04 9:00:00
## Lipids_BloodSampling_Fasting TotChol HDLchol
## <character> <character> <character>
## BD1NYRACXX-5-1 1 5.6 1.28
## AD10W1ACXX-4-1 1 4.3 1.24
## BD1NYRACXX-5-2 1 5.4 1.4
## AD10W1ACXX-4-2 1 6 1.08
## BD1NYRACXX-5-3 1 5.7 1.22
## ... ... ... ...
## AD1NFNACXX-1-1 1 4.1 1.6
## AC1JV9ACXX-5-10 1 5.9 1.63
## AD1NFNACXX-1-20 1 6.6 2.22
## BC1KAVACXX-1-14 1 5.3 0.97
## BC1KAVACXX-8-16 1 5.9 1.7
## Triglycerides LDLchol LDLcholMethod LipidsMed_Age
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 1.5 NA NA NA
## AD10W1ACXX-4-1 1.1 NA NA NA
## BD1NYRACXX-5-2 0.7 NA NA NA
## AD10W1ACXX-4-2 2.1 NA NA NA
## BD1NYRACXX-5-3 1 NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 0.91 NA NA NA
## AC1JV9ACXX-5-10 0.92 NA NA NA
## AD1NFNACXX-1-20 1.22 NA NA NA
## BC1KAVACXX-1-14 1.4 NA NA NA
## BC1KAVACXX-8-16 0.65 NA NA NA
## LipidMed Anthropometry_Age Height Weight
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 1 77.9 175.5 76.25
## AD10W1ACXX-4-1 1 70.5 166 116.6
## BD1NYRACXX-5-2 0 66.3 170 83
## AD10W1ACXX-4-2 0 76.5 172 86.3
## BD1NYRACXX-5-3 0 71.9 174.5 74.75
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA NA 172 87.5
## AC1JV9ACXX-5-10 NA NA 180 99.9
## AD1NFNACXX-1-20 NA NA 162 66.7
## BC1KAVACXX-1-14 NA NA 183.7 84.3
## BC1KAVACXX-8-16 NA NA 162.7 73.3
## CRP_BloodSampling_Age CRP_BloodSampling_Date
## <character> <character>
## BD1NYRACXX-5-1 77.9 2006-08-08
## AD10W1ACXX-4-1 70.5 2006-08-09
## BD1NYRACXX-5-2 66.3 2006-09-14
## AD10W1ACXX-4-2 76.5 2006-09-26
## BD1NYRACXX-5-3 71.9 2006-06-07
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## CRP_BloodSampling_Time hsCRP
## <character> <character>
## BD1NYRACXX-5-1 NA 0.95
## AD10W1ACXX-4-1 NA 4.61
## BD1NYRACXX-5-2 NA 0.78
## AD10W1ACXX-4-2 NA 8.48
## BD1NYRACXX-5-3 NA 0.94
## ... ... ...
## AD1NFNACXX-1-1 NA NA
## AC1JV9ACXX-5-10 NA NA
## AD1NFNACXX-1-20 NA NA
## BC1KAVACXX-1-14 NA NA
## BC1KAVACXX-8-16 NA NA
## CellCount_BloodSampling_Age CellCount_BloodSampling_Date
## <character> <character>
## BD1NYRACXX-5-1 NA NA
## AD10W1ACXX-4-1 NA NA
## BD1NYRACXX-5-2 NA NA
## AD10W1ACXX-4-2 NA NA
## BD1NYRACXX-5-3 NA NA
## ... ... ...
## AD1NFNACXX-1-1 70.357 2011-10-05
## AC1JV9ACXX-5-10 51.535 2012-03-29
## AD1NFNACXX-1-20 68.233 2011-09-29
## BC1KAVACXX-1-14 66.379 2011-05-19
## BC1KAVACXX-8-16 68.783 2011-10-04
## CellCount_BloodSampling_Time WBC RBC
## <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA
## AD10W1ACXX-4-1 NA NA NA
## BD1NYRACXX-5-2 NA NA NA
## AD10W1ACXX-4-2 NA NA NA
## BD1NYRACXX-5-3 NA NA NA
## ... ... ... ...
## AD1NFNACXX-1-1 9:50:00 8 4.82
## AC1JV9ACXX-5-10 8:20:00 6.4 5.02
## AD1NFNACXX-1-20 9:30:00 5.5 4.32
## BC1KAVACXX-1-14 10:25:00 6.5 5.21
## BC1KAVACXX-8-16 9:00:00 5.6 4.55
## HGB HCT MCV MCH
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 8.9 0.43 90 1.85
## AC1JV9ACXX-5-10 9.8 0.46 92.5 1.94
## AD1NFNACXX-1-20 8.3 0.4 93.4 1.92
## BC1KAVACXX-1-14 9.4 0.45 87.1 1.81
## BC1KAVACXX-8-16 8.5 0.42 93.2 1.86
## MCHC CHCM CH RDW
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 20.5 NA NA NA
## AC1JV9ACXX-5-10 21 NA NA NA
## AD1NFNACXX-1-20 20.6 NA NA NA
## BC1KAVACXX-1-14 20.8 NA NA NA
## BC1KAVACXX-8-16 20 NA NA NA
## HDW PLT MPV Neut
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA 248 8 NA
## AC1JV9ACXX-5-10 NA 345 6.7 NA
## AD1NFNACXX-1-20 NA 241 7.1 NA
## BC1KAVACXX-1-14 NA 265 7.4 NA
## BC1KAVACXX-8-16 NA 225 7.6 NA
## Lymph Mono Eos Baso
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA NA NA NA
## AC1JV9ACXX-5-10 NA NA NA NA
## AD1NFNACXX-1-20 NA NA NA NA
## BC1KAVACXX-1-14 NA NA NA NA
## BC1KAVACXX-8-16 NA NA NA NA
## LUC Neut_Perc Lymph_Perc Mono_Perc
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA NA
## AD10W1ACXX-4-1 NA NA NA NA
## BD1NYRACXX-5-2 NA NA NA NA
## AD10W1ACXX-4-2 NA NA NA NA
## BD1NYRACXX-5-3 NA NA NA NA
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA NA 42.6 8.5
## AC1JV9ACXX-5-10 NA NA 31.9 7.9
## AD1NFNACXX-1-20 NA NA 29.9 8.7
## BC1KAVACXX-1-14 NA NA 37.2 3.9
## BC1KAVACXX-8-16 NA NA 41.6 9.7
## Eos_Perc Baso_Perc LUC_Perc run_number
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 NA NA NA 125
## AD10W1ACXX-4-1 NA NA NA 234
## BD1NYRACXX-5-2 NA NA NA 125
## AD10W1ACXX-4-2 NA NA NA 234
## BD1NYRACXX-5-3 NA NA NA 125
## ... ... ... ... ...
## AD1NFNACXX-1-1 NA NA NA 124
## AC1JV9ACXX-5-10 NA NA NA 243
## AD1NFNACXX-1-20 NA NA NA 124
## BC1KAVACXX-1-14 NA NA NA 123
## BC1KAVACXX-8-16 NA NA NA 123
## flowcell_number machine raw past_filter
## <character> <character> <character> <character>
## BD1NYRACXX-5-1 2b SN1013 40434000 32052000
## AD10W1ACXX-4-1 3a SN505 46622000 41461000
## BD1NYRACXX-5-2 2b SN1013 61132000 48674000
## AD10W1ACXX-4-2 3a SN505 36635000 33116000
## BD1NYRACXX-5-3 2b SN1013 40919000 31762000
## ... ... ... ... ...
## AD1NFNACXX-1-1 2a SN1013 50351000 43968000
## AC1JV9ACXX-5-10 10a SN505 23,696,872 21,347,978
## AD1NFNACXX-1-20 2a SN1013 66222000 58360000
## BC1KAVACXX-1-14 1b SN1013 63089000 57536000
## BC1KAVACXX-8-16 1b SN1013 44481000 41267000
## date insert_size star.avg_deletion_length
## <character> <character> <character>
## BD1NYRACXX-5-1 2013-03-29 325 1.47
## AD10W1ACXX-4-1 2013-04-17 313 1.42
## BD1NYRACXX-5-2 2013-03-29 325 1.46
## AD10W1ACXX-4-2 2013-04-17 305 1.44
## BD1NYRACXX-5-3 2013-03-29 308 1.43
## ... ... ... ...
## AD1NFNACXX-1-1 2013-03-29 298 1.47
## AC1JV9ACXX-5-10 2013-07-09 304 1.58
## AD1NFNACXX-1-20 2013-03-29 325 1.47
## BC1KAVACXX-1-14 2013-03-19 326 1.43
## BC1KAVACXX-8-16 2013-03-19 314 1.45
## star.start_mapping_time star.pct_unique_mapped
## <character> <character>
## BD1NYRACXX-5-1 Oct 10 21:39:51 93.19
## AD10W1ACXX-4-1 Oct 07 17:40:58 92.4
## BD1NYRACXX-5-2 Oct 11 00:56:26 92.49
## AD10W1ACXX-4-2 Oct 16 20:38:57 92.63
## BD1NYRACXX-5-3 Oct 10 21:55:25 92.9
## ... ... ...
## AD1NFNACXX-1-1 Oct 08 10:43:53 92.68
## AC1JV9ACXX-5-10 Oct 06 12:24:09 89.53
## AD1NFNACXX-1-20 Oct 09 03:56:44 92.2
## BC1KAVACXX-1-14 Oct 10 05:01:36 93.06
## BC1KAVACXX-8-16 Oct 10 08:05:07 90.16
## star.num_unique_mapped star.num_splice_annotated
## <character> <character>
## BD1NYRACXX-5-1 13384007 3603631
## AD10W1ACXX-4-1 16673749 4812839
## BD1NYRACXX-5-2 20220194 5536547
## AD10W1ACXX-4-2 13579658 3876835
## BD1NYRACXX-5-3 12571418 3404880
## ... ... ...
## AD1NFNACXX-1-1 18203944 4744204
## AC1JV9ACXX-5-10 8140547 2414167
## AD1NFNACXX-1-20 24617486 6577238
## BC1KAVACXX-1-14 24967281 6457378
## BC1KAVACXX-8-16 17642273 4919424
## star.num_splice_noncanonical star.pct_unmapped_other
## <character> <character>
## BD1NYRACXX-5-1 3291 0.06
## AD10W1ACXX-4-1 4690 0.05
## BD1NYRACXX-5-2 5662 0.06
## AD10W1ACXX-4-2 4701 0.05
## BD1NYRACXX-5-3 3997 0.05
## ... ... ...
## AD1NFNACXX-1-1 5975 0.07
## AC1JV9ACXX-5-10 3276 0.05
## AD1NFNACXX-1-20 6869 0.06
## BC1KAVACXX-1-14 7585 0.05
## BC1KAVACXX-8-16 5405 0.06
## star.num_splice_total star.num_splice_atac
## <character> <character>
## BD1NYRACXX-5-1 3631134 3204
## AD10W1ACXX-4-1 4852111 4039
## BD1NYRACXX-5-2 5584443 4996
## AD10W1ACXX-4-2 3909894 3384
## BD1NYRACXX-5-3 3433262 3033
## ... ... ...
## AD1NFNACXX-1-1 4783352 4325
## AC1JV9ACXX-5-10 2438071 2838
## AD1NFNACXX-1-20 6630389 5836
## BC1KAVACXX-1-14 6511885 5826
## BC1KAVACXX-8-16 4962823 4477
## star.num_splice_gcag star.num_input
## <character> <character>
## BD1NYRACXX-5-1 24928 14362274
## AD10W1ACXX-4-1 32581 18044680
## BD1NYRACXX-5-2 37530 21861429
## AD10W1ACXX-4-2 27794 14660267
## BD1NYRACXX-5-3 23072 13532066
## ... ... ...
## AD1NFNACXX-1-1 31893 19641011
## AC1JV9ACXX-5-10 18193 9092744
## AD1NFNACXX-1-20 46290 26699100
## BC1KAVACXX-1-14 44465 26828880
## BC1KAVACXX-8-16 33675 19567453
## star.rate_deletion_per_base star.pct_mapped_multiple
## <character> <character>
## BD1NYRACXX-5-1 0 3.92
## AD10W1ACXX-4-1 0 4.25
## BD1NYRACXX-5-2 0 4.47
## AD10W1ACXX-4-2 0 4.12
## BD1NYRACXX-5-3 0 4.11
## ... ... ...
## AD1NFNACXX-1-1 0 4.19
## AC1JV9ACXX-5-10 0 3.69
## AD1NFNACXX-1-20 0 3.55
## BC1KAVACXX-1-14 0 3.87
## BC1KAVACXX-8-16 0 3.8
## star.rate_mismatch_per_base star.start_job_time
## <character> <character>
## BD1NYRACXX-5-1 0.23 Oct 10 21:38:54
## AD10W1ACXX-4-1 0.22 Oct 07 17:39:59
## BD1NYRACXX-5-2 0.25 Oct 11 00:55:27
## AD10W1ACXX-4-2 0.22 Oct 16 20:29:47
## BD1NYRACXX-5-3 0.25 Oct 10 21:53:03
## ... ... ...
## AD1NFNACXX-1-1 0.2 Oct 08 10:41:20
## AC1JV9ACXX-5-10 0.26 Oct 06 12:22:56
## AD1NFNACXX-1-20 0.2 Oct 09 03:55:17
## BC1KAVACXX-1-14 0.21 Oct 10 04:59:50
## BC1KAVACXX-8-16 0.19 Oct 10 08:02:10
## star.pct_unmapped_short star.mapping_speed
## <character> <character>
## BD1NYRACXX-5-1 2.56 544.25
## AD10W1ACXX-4-1 3.02 595.97
## BD1NYRACXX-5-2 2.75 554.23
## AD10W1ACXX-4-2 2.95 593
## BD1NYRACXX-5-3 2.74 624.56
## ... ... ...
## AD1NFNACXX-1-1 2.73 625.73
## AC1JV9ACXX-5-10 6.35 503.6
## AD1NFNACXX-1-20 3.85 279.41
## BC1KAVACXX-1-14 2.7 555.08
## BC1KAVACXX-8-16 5.71 489.19
## star.avg_insertion_length star.pct_mapped_many
## <character> <character>
## BD1NYRACXX-5-1 1.2 0.26
## AD10W1ACXX-4-1 1.19 0.28
## BD1NYRACXX-5-2 1.2 0.23
## AD10W1ACXX-4-2 1.19 0.24
## BD1NYRACXX-5-3 1.2 0.2
## ... ... ...
## AD1NFNACXX-1-1 1.2 0.32
## AC1JV9ACXX-5-10 1.21 0.38
## AD1NFNACXX-1-20 1.2 0.34
## BC1KAVACXX-1-14 1.19 0.32
## BC1KAVACXX-8-16 1.19 0.28
## star.rate_insertion_per_base star.num_splice_gtag
## <character> <character>
## BD1NYRACXX-5-1 0.01 3599711
## AD10W1ACXX-4-1 0.01 4810801
## BD1NYRACXX-5-2 0.01 5536255
## AD10W1ACXX-4-2 0.01 3874015
## BD1NYRACXX-5-3 0.01 3403160
## ... ... ...
## AD1NFNACXX-1-1 0.01 4741159
## AC1JV9ACXX-5-10 0 2413764
## AD1NFNACXX-1-20 0.01 6571394
## BC1KAVACXX-1-14 0.01 6454009
## BC1KAVACXX-8-16 0.01 4919266
## star.num_mapped_many star.num_mapped_multiple
## <character> <character>
## BD1NYRACXX-5-1 38000 563715
## AD10W1ACXX-4-1 50826 766081
## BD1NYRACXX-5-2 49664 976688
## AD10W1ACXX-4-2 35799 604307
## BD1NYRACXX-5-3 27346 555867
## ... ... ...
## AD1NFNACXX-1-1 63747 823711
## AC1JV9ACXX-5-10 34925 335743
## AD1NFNACXX-1-20 91494 947493
## BC1KAVACXX-1-14 85467 1037749
## BC1KAVACXX-8-16 54001 742645
## star.avg_mapped_length star.avg_input_length
## <character> <character>
## BD1NYRACXX-5-1 96.67 97
## AD10W1ACXX-4-1 98.17 98
## BD1NYRACXX-5-2 96.68 97
## AD10W1ACXX-4-2 98.17 98
## BD1NYRACXX-5-3 96.46 96
## ... ... ...
## AD1NFNACXX-1-1 98.17 98
## AC1JV9ACXX-5-10 97.49 98
## AD1NFNACXX-1-20 98.15 98
## BC1KAVACXX-1-14 98.33 98
## BC1KAVACXX-8-16 98.48 98
## star.end_time star.pct_unmapped_mismatch
## <character> <character>
## BD1NYRACXX-5-1 Oct 10 21:41:26 0
## AD10W1ACXX-4-1 Oct 07 17:42:47 0
## BD1NYRACXX-5-2 Oct 11 00:58:48 0
## AD10W1ACXX-4-2 Oct 16 20:40:26 0
## BD1NYRACXX-5-3 Oct 10 21:56:43 0
## ... ... ...
## AD1NFNACXX-1-1 Oct 08 10:45:46 0
## AC1JV9ACXX-5-10 Oct 06 12:25:14 0
## AD1NFNACXX-1-20 Oct 09 04:02:28 0
## BC1KAVACXX-1-14 Oct 10 05:04:30 0
## BC1KAVACXX-8-16 Oct 10 08:07:31 0
## bam.genome_insert_mean bam.genome_insert_std
## <character> <character>
## BD1NYRACXX-5-1 288.648256495878 234.108811011201
## AD10W1ACXX-4-1 294.584736018414 245.447514390202
## BD1NYRACXX-5-2 286.413248406994 230.288439069619
## AD10W1ACXX-4-2 291.046940453097 240.342521371492
## BD1NYRACXX-5-3 275.199058717648 216.256191245783
## ... ... ...
## AD1NFNACXX-1-1 243.377767271685 174.593146049674
## AC1JV9ACXX-5-10 270.98717085555 234.077479869603
## AD1NFNACXX-1-20 271.426005179576 206.029383007847
## BC1KAVACXX-1-14 262.341694194699 188.846941324494
## BC1KAVACXX-8-16 281.656613101494 222.606885771599
## bam.genome_duplicates bam.exon_duplicates bam.exon_mapped
## <character> <character> <character>
## BD1NYRACXX-5-1 2661291 0 26186566
## AD10W1ACXX-4-1 4967141 0 33376848
## BD1NYRACXX-5-2 7305149 0 41080270
## AD10W1ACXX-4-2 3560547 0 27278995
## BD1NYRACXX-5-3 3668799 0 25458963
## ... ... ... ...
## AD1NFNACXX-1-1 6078591 0 34995158
## AC1JV9ACXX-5-10 5175009 0 15785461
## AD1NFNACXX-1-20 8338397 0 47655190
## BC1KAVACXX-1-14 8125608 0 49057971
## BC1KAVACXX-8-16 5190439 0 34874136
## bam.genome_total bam.genome_mapped bam.exon_total
## <character> <character> <character>
## BD1NYRACXX-5-1 30369701 29537099 26186566
## AD10W1ACXX-4-1 38317327 37103536 33376848
## BD1NYRACXX-5-2 46554232 45220039 41080270
## AD10W1ACXX-4-2 31054670 30098674 27278995
## BD1NYRACXX-5-3 28654414 27841451 25458963
## ... ... ... ...
## AD1NFNACXX-1-1 41723982 40491606 34995158
## AC1JV9ACXX-5-10 19163800 17928954 15785461
## AD1NFNACXX-1-20 56174753 53899515 47655190
## BC1KAVACXX-1-14 56673162 55019984 49057971
## BC1KAVACXX-8-16 41274638 38906239 34874136
## fastqc_raw.R2_raw_GC_mean fastqc_raw.R2_raw_GC_std
## <character> <character>
## BD1NYRACXX-5-1 50.6134742425783 11.9436168503436
## AD10W1ACXX-4-1 52.5497727984934 11.6764709027634
## BD1NYRACXX-5-2 51.3511595055147 11.8771215708693
## AD10W1ACXX-4-2 52.9603163325893 11.7375629125041
## BD1NYRACXX-5-3 51.6023001132948 11.7728320915493
## ... ... ...
## AD1NFNACXX-1-1 51.1126529412655 12.1481063937884
## AC1JV9ACXX-5-10 56.1104199342259 10.8167704633456
## AD1NFNACXX-1-20 52.2642327963524 11.8520406039121
## BC1KAVACXX-1-14 52.1173262764685 11.9197420032885
## BC1KAVACXX-8-16 52.6479899386612 11.7668179866764
## fastqc_raw.R1_raw_GC_mean fastqc_raw.R1_raw_GC_std
## <character> <character>
## BD1NYRACXX-5-1 50.2351419723759 11.8848917281142
## AD10W1ACXX-4-1 51.8356043001923 11.2889096664749
## BD1NYRACXX-5-2 51.0789575779742 11.8219118904248
## AD10W1ACXX-4-2 52.4489872339274 11.4721063444188
## BD1NYRACXX-5-3 50.9744983067781 11.6161522540398
## ... ... ...
## AD1NFNACXX-1-1 50.6593415093687 11.84707799697
## AC1JV9ACXX-5-10 54.909296338238 10.5315750859386
## AD1NFNACXX-1-20 52.0007684863848 11.7174692643988
## BC1KAVACXX-1-14 51.9082951558929 11.7670706654902
## BC1KAVACXX-8-16 52.4404759544934 11.5887318822462
## fastqc_clean.R1_clean_GC_mean
## <character>
## BD1NYRACXX-5-1 50.164378341129
## AD10W1ACXX-4-1 51.8220336816596
## BD1NYRACXX-5-2 51.0271786120152
## AD10W1ACXX-4-2 52.4373139201977
## BD1NYRACXX-5-3 50.9028924407951
## ... ...
## AD1NFNACXX-1-1 50.5797529326033
## AC1JV9ACXX-5-10 55.634548872841
## AD1NFNACXX-1-20 51.9339059552363
## BC1KAVACXX-1-14 51.8790750218106
## BC1KAVACXX-8-16 52.3904075032816
## fastqc_clean.R2_clean_GC_mean fastqc_clean.R2_clean_GC_std
## <character> <character>
## BD1NYRACXX-5-1 50.3968946725711 12.1131258203262
## AD10W1ACXX-4-1 52.0100726108017 11.811466538215
## BD1NYRACXX-5-2 51.1828528749685 12.0540137218588
## AD10W1ACXX-4-2 52.5987985271214 11.9084620443752
## BD1NYRACXX-5-3 51.1013302404673 12.082642631825
## ... ... ...
## AD1NFNACXX-1-1 50.7719921815163 12.2156418850006
## AC1JV9ACXX-5-10 55.7497819654327 11.0483150261391
## AD1NFNACXX-1-20 52.1394176386261 11.9282296293526
## BC1KAVACXX-1-14 52.0255556861058 11.9945544323497
## BC1KAVACXX-8-16 52.5096132946266 11.7511126114672
## fastqc_clean.R1_clean_GC_std
## <character>
## BD1NYRACXX-5-1 12.2507376336796
## AD10W1ACXX-4-1 11.7524863028
## BD1NYRACXX-5-2 12.1621853153794
## AD10W1ACXX-4-2 11.8501461812698
## BD1NYRACXX-5-3 12.2171170446126
## ... ...
## AD1NFNACXX-1-1 12.1562080492483
## AC1JV9ACXX-5-10 10.9872330877667
## AD1NFNACXX-1-20 11.8857304444009
## BC1KAVACXX-1-14 11.9727270527311
## BC1KAVACXX-8-16 11.7504095784964
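All alignment and QC columns printed above are stored as `<character>`, so they need converting before any numerical summary. A minimal base-R sketch on a toy data frame follows. The values are copied from the output above; `qc` and `to_numeric_cols` are hypothetical names used for illustration and are not part of the BIOS tooling.

```r
# Toy stand-in for a few of the sample-level columns; values copied from the output above.
qc <- data.frame(
  star.pct_mapped_multiple = c("3.92", "4.25", "4.47"),
  star.mapping_speed       = c("544.25", "595.97", "554.23"),
  star.end_time            = c("Oct 10 21:41:26", "Oct 07 17:42:47", "Oct 11 00:58:48"),
  row.names = c("BD1NYRACXX-5-1", "AD10W1ACXX-4-1", "BD1NYRACXX-5-2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert a column only when every non-missing value
# parses as a number; other columns (e.g. timestamps) are left untouched.
to_numeric_cols <- function(df) {
  for (nm in names(df)) {
    parsed <- suppressWarnings(as.numeric(df[[nm]]))
    if (all(is.na(parsed) == is.na(df[[nm]]))) df[[nm]] <- parsed
  }
  df
}

qc_num <- to_numeric_cols(qc)
sapply(qc_num, class)
```

Assuming the table above was printed from the sample annotation of `rnaSeqData`, the same helper could be applied to that table after coercing it to a plain data frame.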
rowRanges(rnaSeqData)
## GRanges object with 303544 ranges and 5 metadata columns:
## seqnames
## <Rle>
## ENSE00001544499,ENSE00001544501 chrM
## ENSE00001544498,ENSE00001544497 chrM
## ENSE00002006242 chrM
## ENSE00001435714 chrM
## ENSE00001544494,ENSE00001993597 chrM
## ... ...
## ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 chr22
## ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 chr22
## ENSE00002513195 chr22
## ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 chr22
## ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 chr22
## ranges
## <IRanges>
## ENSE00001544499,ENSE00001544501 [ 577, 1601]
## ENSE00001544498,ENSE00001544497 [1602, 3229]
## ENSE00002006242 [3230, 3304]
## ENSE00001435714 [3307, 4262]
## ENSE00001544494,ENSE00001993597 [4263, 4400]
## ... ...
## ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 [51221929, 51222091]
## ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 [51222185, 51222500]
## ENSE00002513195 [51223601, 51223721]
## ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 [51227178, 51227781]
## ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 [51237083, 51239737]
## strand
## <Rle>
## ENSE00001544499,ENSE00001544501 +
## ENSE00001544498,ENSE00001544497 +
## ENSE00002006242 +
## ENSE00001435714 +
## ENSE00001544494,ENSE00001993597 +
## ... ...
## ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 -
## ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 +
## ENSE00002513195 +
## ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 +
## ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 +
## |
## |
## ENSE00001544499,ENSE00001544501 |
## ENSE00001544498,ENSE00001544497 |
## ENSE00002006242 |
## ENSE00001435714 |
## ENSE00001544494,ENSE00001993597 |
## ... ...
## ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 |
## ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 |
## ENSE00002513195 |
## ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 |
## ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 |
## ENSEMBL
## <character>
## ENSE00001544499,ENSE00001544501 MT-TF,MT-RNR1
## ENSE00001544498,ENSE00001544497 MT-TV,MT-RNR2
## ENSE00002006242 MT-TL1
## ENSE00001435714 MT-ND1
## ENSE00001544494,ENSE00001993597 MT-TQ,MT-TI
## ... ...
## ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 RABL2B
## ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 RPL23AP82
## ENSE00002513195 RPL23AP82
## ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 RPL23AP82
## ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 RPL23AP82
## metaexondid
## <character>
## ENSE00001544499,ENSE00001544501 MT_577_1601
## ENSE00001544498,ENSE00001544497 MT_1602_3229
## ENSE00002006242 MT_3230_3304
## ENSE00001435714 MT_3307_4262
## ENSE00001544494,ENSE00001993597 MT_4263_4400
## ... ...
## ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 22_51221929_51222091
## ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 22_51222185_51222500
## ENSE00002513195 22_51223601_51223721
## ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 22_51227178_51227781
## ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 22_51237083_51239737
## exon_id
## <character>
## ENSE00001544499,ENSE00001544501 ENSE00001544499,ENSE00001544501
## ENSE00001544498,ENSE00001544497 ENSE00001544498,ENSE00001544497
## ENSE00002006242 ENSE00002006242
## ENSE00001435714 ENSE00001435714
## ENSE00001544494,ENSE00001993597 ENSE00001544494,ENSE00001993597
## ... ...
## ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197
## ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865
## ENSE00002513195 ENSE00002513195
## ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642
## ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668
## gc
## <numeric>
## ENSE00001544499,ENSE00001544501 0.4536585
## ENSE00001544498,ENSE00001544497 0.4299754
## ENSE00002006242 0.3866667
## ENSE00001435714 0.4780335
## ENSE00001544494,ENSE00001993597 0.3913043
## ... ...
## ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 0.7484663
## ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 0.6012658
## ENSE00002513195 0.5206612
## ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 0.3460265
## ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 0.3954802
## length
## <numeric>
## ENSE00001544499,ENSE00001544501 1025
## ENSE00001544498,ENSE00001544497 1628
## ENSE00002006242 75
## ENSE00001435714 956
## ENSE00001544494,ENSE00001993597 138
## ... ...
## ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 163
## ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 316
## ENSE00002513195 121
## ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 604
## ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 2655
## -------
## seqinfo: 25 sequences from an unspecified genome; no seqlengths
assays(rnaSeqData)$counts[1:5, 1:5]
## BD1NYRACXX-5-1 AD10W1ACXX-4-1
## ENSE00001544499,ENSE00001544501 29110 41062
## ENSE00001544498,ENSE00001544497 0 0
## ENSE00002006242 14504 25271
## ENSE00001435714 72504 96329
## ENSE00001544494,ENSE00001993597 0 0
## BD1NYRACXX-5-2 AD10W1ACXX-4-2
## ENSE00001544499,ENSE00001544501 47738 34911
## ENSE00001544498,ENSE00001544497 0 0
## ENSE00002006242 25864 15013
## ENSE00001435714 86555 65928
## ENSE00001544494,ENSE00001993597 0 0
## BD1NYRACXX-5-3
## ENSE00001544499,ENSE00001544501 27028
## ENSE00001544498,ENSE00001544497 0
## ENSE00002006242 12725
## ENSE00001435714 82553
## ENSE00001544494,ENSE00001993597 0
Metabolomics data
Extract RP4 data from molgenis database
molgenis.connect(username, password)
## Loading required package: bitops
## Login success
##
## Run 'ls()' to see the available functions to interact with the molgenis database!
ls()
## [1] "BIOBANKS" "DATASETS"
## [3] "LLSMalesAbove70" "MDB"
## [5] "methData" "molgenis.add"
## [7] "molgenis.addAll" "molgenis.addList"
## [9] "molgenis.delete" "molgenis.env"
## [11] "molgenis.get" "molgenis.getAttributeMetaData"
## [13] "molgenis.getEntityMetaData" "molgenis.login"
## [15] "molgenis.logout" "molgenis.update"
## [17] "password" "phenotypes"
## [19] "PROXY" "RDB"
## [21] "rnaSeqData" "RP3DATADIR"
## [23] "SRMBASE" "username"
## [25] "USRPWD" "VIEWS"
subjects <- molgenis.get.all("subjects")
## Extracted 2000 rows...
## Extracted 3000 rows...
## Extracted 4000 rows...
## Extracted 5000 rows...
## Extracted 6000 rows...
## Extracted 7000 rows...
## Extracted 8000 rows...
## Extracted 9000 rows...
## Extracted 10000 rows...
## Extracted 11000 rows...
## Extracted 12000 rows...
## Extracted 13000 rows...
## Extracted 14000 rows...
## Extracted 15000 rows...
## Extracted 16000 rows...
## Extracted 17000 rows...
## Extracted 18000 rows...
## Extracted 19000 rows...
## Extracted 20000 rows...
## Extracted 21000 rows...
## Extracted 22000 rows...
## Extracted 23000 rows...
## Extracted 23728 rows...
dim(subjects)
## [1] 23728 40
head(subjects)
## biobank id bios_id
## 1 BIOMARCS BIOMARCS-{38788478-F7D7-4518-B57F-09F5EA6190EF} <NA>
## 2 BIOMARCS BIOMARCS-{38B676D4-E73F-48B4-B874-4B3805D42DB5} <NA>
## 3 BIOMARCS BIOMARCS-{395012FA-A34C-4478-A899-64B3AB3CA9B7} <NA>
## 4 BIOMARCS BIOMARCS-{396322D1-DEF6-4AC9-AC40-3B70B9A9EFD1} <NA>
## 5 BIOMARCS BIOMARCS-{3963929C-AFA9-4A9D-B095-C30999122838} <NA>
## 6 BIOMARCS BIOMARCS-{39DD6E7B-1A01-4D79-8757-4A84256E3C27} <NA>
## date_of_birth age_bloodcollection gender pedigree_information
## 1 1933-11-29T00:00:00+0019 NA true false
## 2 1940-04-10T00:00:00+0020 NA true false
## 3 1939-06-18T00:00:00+0120 NA true false
## 4 1960-09-18T00:00:00+0100 NA true false
## 5 1937-02-26T00:00:00+0019 NA true false
## 6 1952-08-28T00:00:00+0100 NA true false
## gwas_platform_used gwas_available_date dna_amount dna_source
## 1 true EDTA buffy coat
## 2 true EDTA buffy coat
## 3 true EDTA buffy coat
## 4 true EDTA buffy coat
## 5 true EDTA buffy coat
## 6 true EDTA buffy coat
## rna_amount rna_source date_of_inclusion smoking
## 1 false 2011-03-28T00:00:00+0200 false
## 2 false 2011-11-01T00:00:00+0100 false
## 3 false 2010-06-03T00:00:00+0200 false
## 4 false 2011-10-20T00:00:00+0200 true
## 5 false 2009-09-12T00:00:00+0200 false
## 6 false 2010-09-06T00:00:00+0200 true
## alcohol_consumption height weight waist_circumference hip_circumference
## 1 167 79 NA NA
## 2 167 78 NA NA
## 3 189 85 NA NA
## 4 189 108 NA NA
## 5 181 98 NA NA
## 6 176 70 NA NA
## hs_crp wbc hgb hct plt neut_percentage lymph_percentage mono_percentage
## 1 7.0 NA NA NA NA NA NA NA
## 2 NA NA NA NA NA NA NA NA
## 3 1.4 NA NA NA NA NA NA NA
## 4 NA NA NA NA NA NA NA NA
## 5 23.0 NA NA NA NA NA NA NA
## 6 NA NA NA NA NA NA NA NA
## eos_percentage baso_percentage luc_percentage tot_cholesterol
## 1 NA NA NA NA
## 2 NA NA NA NA
## 3 NA NA NA 4.1
## 4 NA NA NA NA
## 5 NA NA NA 5.1
## 6 NA NA NA 3.3
## hdl_cholesterol triglycerides systolic_blood_pressure
## 1 NA NA 170
## 2 NA NA 174
## 3 1.20 0.8 122
## 4 NA NA 144
## 5 0.88 0.6 150
## 6 NA NA 68
## diastolic_blood_pressure lipid_lowering_med blood_pressure_lowering_med
## 1 NA 1 true
## 2 NA 1 true
## 3 78 1 true
## 4 NA 0 true
## 5 NA 1 true
## 6 NA 1 true
## metabolic_syndrome diabetes
## 1 false
## 2 false
## 3 false
## 4 true
## 5 false
## 6 false
biobanks <- molgenis.get.all("biobanks")
dim(biobanks)
## [1] 24 93
head(biobanks)
## age_at_death age_at_inclusion antidepressant_use_blood_coll anxiety
## 1 true true true false
## 2 true true true false
## 3 false true false false
## 4 false true false false
## 5 false true true false
## 6 true true true false
## ascertainment_criterion caffeine_consumption
## 1 patients with type 2 diabetes false
## 2 pop-based family study true
## 3 healthy controls false
## 4 case control false
## 5 population-based, stratified for ethnic background false
## 6 pop-based true
## chronic_migraine comorbidities contact_person1_email
## 1 false false n.van_leeuwen@lumc.nl
## 2 false true a.demirkan@erasmusmc.nl
## 3 false false mihai.netea@radboudumc.nl
## 4 false false y.f.m.ramos@lumc.nl
## 5 false true m.b.snijder@amc.uva.nl
## 6 true true Lude@ludesign.nl
## contact_person1_name contact_person2_email contact_person2_name
## 1 N. van Leeuwen jm.dekker@vumc.nl J.M. Dekker
## 2 Ayse Demirkan j.vergeer-drop@erasmusmc.nl Jeannette Vergeer
## 3 Mihai Netea leo.joosten@radboudumc.nl Leo Joosten
## 4 Yolande Ramos
## 5 Marieke Snijder k.stronks@amc.uva.nl Karien Stronks
## 6 Lude Franke sasha.zhernakova@gmail.com Sasha Zhernakova
## contact_person3_email contact_person3_name creatinine csf
## 1 true false
## 2 true false
## 3 false false
## 4 false false
## 5 true false
## 6 Llscience@umcg.nl Salome Scholtens true false
## cur_use_platelet_inhibitors curr_use_ace_inhibitors
## 1 true true
## 2 true true
## 3 false false
## 4 false false
## 5 true true
## 6 true true
## curr_use_beta_blockers curr_use_ra_receptor_blockers curr_use_statins
## 1 true true true
## 2 true true true
## 3 false false false
## 4 false false false
## 5 true true true
## 6 true true true
## curr_use_vit_k_antagonist current_depression_diag
## 1 true false
## 2 true true
## 3 false false
## 4 false false
## 5 true false
## 6 true false
## depression_diag_instrument depression_ids_sr depression_scale_yasr
## 1 false false false
## 2 true false false
## 3 false false false
## 4 false false false
## 5 false false false
## 6 true false false
## depressive_symptom_score depressive_symptom_score_used diabetes_type_2
## 1 false false true
## 2 true true true
## 3 false false true
## 4 false false false
## 5 true true true
## 6 true true true
## diabetic_complications diagnosis_by_ogtt diagnosis_oa education_level
## 1 true false false true
## 2 false false true true
## 3 false false false true
## 4 false false true false
## 5 false false false true
## 6 true false false true
## family_history_cv_disease ft3 ft4 hba1c hba1c_longitudinal
## 1 false false false true true
## 2 true false false false false
## 3 true false false false false
## 4 false false false false false
## 5 true false false true false
## 6 true true true true true
## head_circumference headache_days heart_rate history_arrhythmia
## 1 false false true true
## 2 false false true true
## 3 false false false false
## 4 false false false false
## 5 false false true false
## 6 true true true true
## history_cabg history_coron_artery_disease history_device_implantation
## 1 true true true
## 2 true true true
## 3 false false false
## 4 false false false
## 5 false true false
## 6 true true true
## history_myocardial_infarction history_pci abbreviation intelligence
## 1 true true DZS_WF false
## 2 true true ERF_ERGO true
## 3 false false FUNCTGENOMICS false
## 4 false false GARP false
## 5 true false HELIUS false
## 6 true true LIFELINES false
## interview_data_dementia lifetime_depression lifetime_depression_diag
## 1 false false false
## 2 true true true
## 3 false false false
## 4 false false false
## 5 false false false
## 6 true false true
## lvef lvef_method migraine_frequency migraine_with_aura
## 1 false false false false
## 2 true false false true
## 3 false false false false
## 4 false false false false
## 5 false false false false
## 6 true true false true
## migraine_without_aura mmse_and_other_csf_tests mri_ct_ecg_brain
## 1 false false false
## 2 true true true
## 3 false false false
## 4 false false false
## 5 false false false
## 6 true true false
## name ntprobnp_bnp_anp_level
## 1 Diabetes Zorgsysteem west-Friesland false
## 2 ERF_ERGO false
## 3 Radboud false
## 4 Genetica ARtrose en Progressie false
## 5 HEalthy LIfe in an Urban Setting false
## 6 false
## osteoarthritis osteoarthritis_longitudinal painscores_joints personality
## 1 false false false false
## 2 true false true true
## 3 false false false false
## 4 true false false false
## 5 true false false false
## 6 true true true true
## pi_email pi_name
## 1 l.m.t_hart@lumc.nl L.M. 't Hart (LUMC) / J.M. Dekker VUmc)
## 2 c.vanduijn@erasmusmc.nl Cornelia van Duijn
## 3 mihai.netea@radboudumc.nl Mihai Netea
## 4 i.meulenbelt@lumc.nl Ingrid Meulenbelt
## 5 a.h.zwinderman@amc.uva.nl Koos Zwinderman
## 6 cisca.wijmenga@gmail.com Cisca Wijmenga
## pr_interval prev_diag_aortic_aneurysm prev_diag_diast_heart_failure
## 1 true true true
## 2 true true true
## 3 false false false
## 4 false false false
## 5 true false false
## 6 true true false
## prev_diag_heart_failure prev_diag_hemorrhagic_stroke
## 1 true true
## 2 true true
## 3 false false
## 4 false false
## 5 false false
## 6 true false
## prev_diag_per_vascular_disease prev_diag_stroke
## 1 true true
## 2 true true
## 3 false false
## 4 false false
## 5 false true
## 6 true true
## prev_diag_syst_heart_failure prev_diag_thromb_stroke
## 1 true true
## 2 true true
## 3 false false
## 4 false false
## 5 false false
## 6 false false
## psychological_wellbeing pulse_rate qrs_duration qrs_voltage qt_time
## 1 false false true false false
## 2 false true true true true
## 3 false false false false false
## 4 false false false false false
## 5 false true true true true
## 6 false false true false true
## qtc_time self_esteem self_rated_health tsh use_of_angiotensin_ii
## 1 true false false false true
## 2 true false false false true
## 3 false false true false false
## 4 false false false false false
## 5 true false true false true
## 6 false false true true true
## use_of_anti_depressive_drugs use_of_anti_epilectic_drugs
## 1 true true
## 2 true true
## 3 false false
## 4 false false
## 5 true true
## 6 true true
## use_of_anticonception use_of_beta_blockers use_of_triptans womac_scores
## 1 true true true false
## 2 false true true false
## 3 true false false false
## 4 false false false false
## 5 true true true false
## 6 true true true true
## x_ray_photographs
## 1 false
## 2 true
## 3 false
## 4 false
## 5 false
## 6 false
samples <- molgenis.get.all("samples")
## Extracted 2000 rows...
## Extracted 3000 rows...
## Extracted 4000 rows...
## Extracted 5000 rows...
## Extracted 6000 rows...
## Extracted 7000 rows...
## Extracted 8000 rows...
## Extracted 9000 rows...
## Extracted 10000 rows...
## Extracted 11000 rows...
## Extracted 12000 rows...
## Extracted 13000 rows...
## Extracted 14000 rows...
## Extracted 15000 rows...
## Extracted 16000 rows...
## Extracted 17000 rows...
## Extracted 18000 rows...
## Extracted 19000 rows...
## Extracted 20000 rows...
## Extracted 21000 rows...
## Extracted 22000 rows...
## Extracted 23000 rows...
## Extracted 24000 rows...
## Extracted 24112 rows...
dim(samples)
## [1] 24112 10
head(samples)
## biobank subject_id id date_collection
## 1 ALPHAOMEGA ALPHAOMEGA-9265 ALPHAOMEGA-2722 2005-02-15T00:00:00+0100
## 2 ALPHAOMEGA ALPHAOMEGA-4450 ALPHAOMEGA-2736 2005-02-15T00:00:00+0100
## 3 ALPHAOMEGA ALPHAOMEGA-3401 ALPHAOMEGA-2743 2005-02-16T00:00:00+0100
## 4 ALPHAOMEGA ALPHAOMEGA-1280 ALPHAOMEGA-2752 2005-02-17T00:00:00+0100
## 5 ALPHAOMEGA ALPHAOMEGA-6748 ALPHAOMEGA-276 2002-09-26T00:00:00+0200
## 6 ALPHAOMEGA ALPHAOMEGA-5782 ALPHAOMEGA-2764 2005-02-21T00:00:00+0100
## date_inclusion sample_matrix fasting time_handling temp_storage
## 1 <NA> EDTA plasma false NA -80
## 2 <NA> EDTA plasma true NA -80
## 3 <NA> EDTA plasma false NA -80
## 4 <NA> EDTA plasma false NA -80
## 5 <NA> EDTA plasma true NA -80
## 6 <NA> EDTA plasma false NA -80
## time_storage
## 1 110
## 2 110
## 3 110
## 4 110
## 5 139
## 6 110
tail(samples)
## biobank subject_id id date_collection
## 24107 VUNTR VUNTR-A918C VUNTR-9222_66256-01 2008-10-01T00:00:00+0200
## 24108 VUNTR VUNTR-A623C VUNTR-9226_62616-01 2008-10-01T00:00:00+0200
## 24109 VUNTR VUNTR-A1148D VUNTR-9227_62057-02 2008-10-01T00:00:00+0200
## 24110 VUNTR VUNTR-A1148C VUNTR-9228_62057-01 2008-10-01T00:00:00+0200
## 24111 VUNTR VUNTR-A1552C VUNTR-9229_60436-01 2008-10-01T00:00:00+0200
## 24112 VUNTR VUNTR-A2462C VUNTR-9230_60384-01 2008-10-01T00:00:00+0200
## date_inclusion sample_matrix fasting time_handling temp_storage
## 24107 <NA> EDTA plasma true 6 -30
## 24108 <NA> EDTA plasma true 6 -30
## 24109 <NA> EDTA plasma true 6 -30
## 24110 <NA> EDTA plasma true 6 -30
## 24111 <NA> EDTA plasma true 6 -30
## 24112 <NA> EDTA plasma true 6 -30
## time_storage
## 24107 NA
## 24108 NA
## 24109 NA
## 24110 NA
## 24111 NA
## 24112 NA
measurements <- molgenis.get.all("measurements")
## Extracted 2000 rows...
## Extracted 3000 rows...
## Extracted 4000 rows...
## Extracted 5000 rows...
## Extracted 6000 rows...
## Extracted 7000 rows...
## Extracted 8000 rows...
## Extracted 9000 rows...
## Extracted 10000 rows...
## Extracted 11000 rows...
## Extracted 12000 rows...
## Extracted 13000 rows...
## Extracted 14000 rows...
## Extracted 15000 rows...
## Extracted 16000 rows...
## Extracted 17000 rows...
## Extracted 18000 rows...
## Extracted 19000 rows...
## Extracted 20000 rows...
## Extracted 21000 rows...
## Extracted 22000 rows...
## Extracted 23000 rows...
## Extracted 24000 rows...
## Extracted 24072 rows...
dim(measurements)
## [1] 24072 249
measurements[1:5, 1:5]
## id sample_id acace ace ala
## 1 BBMRI-PROSPER.31528441 PROSPER-31528441 0.02900 0.01190 0.4562
## 2 BBMRI-PROSPER.31608168 PROSPER-31608168 0.24870 0.01826 0.1326
## 3 BBMRI-PROSPER.31618227 PROSPER-31618227 0.02947 0.02237 0.4932
## 4 BBMRI-PROSPER.31709409 PROSPER-31709409 0.07189 0.01456 0.4230
## 5 BBMRI-PROSPER.31749591 PROSPER-31749591 0.02733 0.02391 0.3537
tbl <- table(subjects$biobank)
op <- par(mar = c(5, 10, 4, 2))
barplot(tbl[order(tbl)], horiz = TRUE, las = 2)
par(op)
library(lubridate)
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:IRanges':
##
## %within%
## The following object is masked from 'package:base':
##
## date
library(ggplot2)
subjects$date_of_birth <- as.character(subjects$date_of_birth)
# Treat empty birth dates as missing; which() skips entries that are already NA
subjects$date_of_birth[which(subjects$date_of_birth == "")] <- NA
subjects$age <- interval(gsub("T.*$", "", subjects$date_of_birth), Sys.Date())/duration(num = 1,
units = "years")
levels(subjects$gender) <- c("", "female", "male")  # original levels: c("", "false", "true")
biobank_ordered <- with(subjects, reorder(biobank, age, median, na.rm = TRUE))
gp <- ggplot(subjects, aes(biobank_ordered, age))
gp <- gp + geom_boxplot(aes(fill = factor(gender)))
gp <- gp + theme(axis.text.x = element_text(angle = 90, hjust = 0, size = 7))
gp
## Warning: Removed 8516 rows containing non-finite values (stat_boxplot).
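The age calculation above strips the time part from each ISO timestamp and divides the interval up to today by one year. The same idea can be sketched in base R on a few toy values. Two things here are assumptions of the sketch, not of the original code: the reference date is fixed so the result is reproducible, and a year is taken as 365.25 days.

```r
# Timestamps in the format returned for date_of_birth; "" marks a missing value.
dob_raw <- c("1933-11-29T00:00:00+0019", "1960-09-18T00:00:00+0100", "", NA)

# Treat empty strings as missing, then drop everything from the "T" onwards.
dob_raw[which(dob_raw == "")] <- NA
dob <- as.Date(sub("T.*$", "", dob_raw))

# Age in years at a fixed reference date (assumption: 365.25 days per year).
reference <- as.Date("2016-01-01")
age <- as.numeric(difftime(reference, dob, units = "days")) / 365.25
round(age, 1)
```

Note that this gives the age at the reference date, not the age at blood collection. For the latter, the collection date from the samples table would have to replace `reference`.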
measurements$biobank <- gsub("-.*$", "", measurements$sample_id)
biobank_ordered <- with(measurements, reorder(biobank, lac, median, na.rm = TRUE))
gp <- ggplot(measurements, aes(biobank_ordered, lac))
gp <- gp + geom_boxplot()
gp <- gp + theme(axis.text.x = element_text(angle = 90, hjust = 0, size = 7))
gp
## Warning: Removed 18 rows containing non-finite values (stat_boxplot).
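Both boxplots rely on `reorder()` to sort the biobanks by the median of the plotted variable, ignoring missing values. A small sketch with made-up values (`toy` and `biobank_by_median` are illustrative names only) shows the effect on the factor levels:

```r
# Made-up lactate values for three hypothetical biobanks; one value is missing.
toy <- data.frame(
  biobank = factor(c("A", "A", "B", "B", "C", "C")),
  lac     = c(3.0, 3.4, 1.0, NA, 2.0, 2.2)
)

# Levels are re-sorted by increasing median of lac; na.rm is passed on to median().
biobank_by_median <- with(toy, reorder(biobank, lac, median, na.rm = TRUE))
levels(biobank_by_median)
```

The medians are 3.2 (A), 1.0 (B) and 2.1 (C), so the levels come out as B, C, A, which is the order in which ggplot2 draws the boxes.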
metadata <- molgenis.getEntityMetaData("measurements")
description <- lapply(metadata$attributes, function(x) x$description)
head(description)
## $id
## [1] "Identifier"
##
## $sample_id
## [1] "Sample identifier"
##
## $acace
## [1] "Acetoacetate"
##
## $ace
## [1] "Acetate"
##
## $ala
## [1] "Alanine"
##
## $alb
## [1] "Albumin"
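The list of descriptions is easier to search once it is flattened into a lookup table. The sketch below uses a toy stand-in for `metadata$attributes`. It assumes, as the output above suggests, that each attribute is a named list element with a single `description` string; `attributes_toy` and `lookup` are hypothetical names.

```r
# Toy stand-in for metadata$attributes (entries copied from the output above).
attributes_toy <- list(
  id        = list(description = "Identifier"),
  sample_id = list(description = "Sample identifier"),
  acace     = list(description = "Acetoacetate"),
  ace       = list(description = "Acetate")
)

# vapply() returns a named character vector and raises an error if an
# attribute lacks a description or holds more than one string.
description <- vapply(attributes_toy, function(x) x$description, character(1))

# One row per variable, so abbreviations can be looked up by name.
lookup <- data.frame(variable    = names(description),
                     description = unname(description),
                     stringsAsFactors = FALSE)
lookup[lookup$variable %in% c("acace", "ace"), ]
```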
Linking Tables
The subject, sample and measurement tables are linked through common identifiers. Each table has a column named 'id' that identifies its own rows: the 'id' column of the subjects table is the 'subject_id' referenced in the samples table, the 'id' column of the samples table is the 'sample_id' referenced in the measurements table, and so on.
You can use the function 'merge' to combine the different tables. For example, for all LLS data:
head(subjects[, c("id", "bios_id")])
## id bios_id
## 1 BIOMARCS-{38788478-F7D7-4518-B57F-09F5EA6190EF} <NA>
## 2 BIOMARCS-{38B676D4-E73F-48B4-B874-4B3805D42DB5} <NA>
## 3 BIOMARCS-{395012FA-A34C-4478-A899-64B3AB3CA9B7} <NA>
## 4 BIOMARCS-{396322D1-DEF6-4AC9-AC40-3B70B9A9EFD1} <NA>
## 5 BIOMARCS-{3963929C-AFA9-4A9D-B095-C30999122838} <NA>
## 6 BIOMARCS-{39DD6E7B-1A01-4D79-8757-4A84256E3C27} <NA>
measurements[1:5, c("id", "sample_id")]
## id sample_id
## 1 BBMRI-PROSPER.31528441 PROSPER-31528441
## 2 BBMRI-PROSPER.31608168 PROSPER-31608168
## 3 BBMRI-PROSPER.31618227 PROSPER-31618227
## 4 BBMRI-PROSPER.31709409 PROSPER-31709409
## 5 BBMRI-PROSPER.31749591 PROSPER-31749591
head(samples[, c("biobank", "id", "subject_id")])
## biobank id subject_id
## 1 ALPHAOMEGA ALPHAOMEGA-2722 ALPHAOMEGA-9265
## 2 ALPHAOMEGA ALPHAOMEGA-2736 ALPHAOMEGA-4450
## 3 ALPHAOMEGA ALPHAOMEGA-2743 ALPHAOMEGA-3401
## 4 ALPHAOMEGA ALPHAOMEGA-2752 ALPHAOMEGA-1280
## 5 ALPHAOMEGA ALPHAOMEGA-276 ALPHAOMEGA-6748
## 6 ALPHAOMEGA ALPHAOMEGA-2764 ALPHAOMEGA-5782
LLS <- subset(samples, grepl("LLS", biobank))
dim(LLS)
## [1] 3311 10
head(LLS)
## biobank subject_id id
## 3996 LLS_PARTOFFS LLS_PARTOFFS-323011 LLS_PARTOFFS-169
## 3997 LLS_PARTOFFS LLS_PARTOFFS-2443021 LLS_PARTOFFS-1690
## 3998 LLS_PARTOFFS LLS_PARTOFFS-2443020 LLS_PARTOFFS-1691
## 3999 LLS_PARTOFFS LLS_PARTOFFS-2143020 LLS_PARTOFFS-1692
## 4000 LLS_PARTOFFS LLS_PARTOFFS-2143021 LLS_PARTOFFS-1693
## 4001 LLS_PARTOFFS LLS_PARTOFFS-2283111 LLS_PARTOFFS-1694
## date_collection date_inclusion sample_matrix fasting
## 3996 2003-01-07T00:00:00+0100 <NA> EDTA plasma false
## 3997 2004-06-08T00:00:00+0200 <NA> EDTA plasma false
## 3998 2004-06-08T00:00:00+0200 <NA> EDTA plasma false
## 3999 2004-06-09T00:00:00+0200 <NA> EDTA plasma false
## 4000 2004-06-09T00:00:00+0200 <NA> EDTA plasma false
## 4001 2004-06-09T00:00:00+0200 <NA> EDTA plasma false
## time_handling temp_storage time_storage
## 3996 4 -80 135
## 3997 1 -80 118
## 3998 1 -80 118
## 3999 3 -80 118
## 4000 3 -80 118
## 4001 3 -80 118
mLLS <- merge(LLS, subjects, by.x = "subject_id", by.y = "id", all.x = TRUE,
suffixes = c("_samples", "_subjects"))
dim(mLLS)
## [1] 3311 50
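The same pattern extends to the measurements table. The sketch below chains both merges on small toy tables. All values are made up and the `*_toy` names are illustrative. Renaming the measurement 'id' column before the second merge is a choice made for this sketch, so that it cannot collide with the sample 'id' used as the key.

```r
# Toy tables mirroring the key columns of subjects, samples and measurements.
subjects_toy <- data.frame(id = c("LLS-1", "LLS-2"), biobank = "LLS",
                           gender = c("female", "male"),
                           stringsAsFactors = FALSE)
samples_toy <- data.frame(id = c("LLS-s10", "LLS-s11", "LLS-s12"),
                          subject_id = c("LLS-1", "LLS-2", "LLS-9"),  # LLS-9 has no subject record
                          biobank = "LLS", stringsAsFactors = FALSE)
measurements_toy <- data.frame(id = c("m1", "m2"),
                               sample_id = c("LLS-s10", "LLS-s11"),
                               lac = c(1.2, 0.9), stringsAsFactors = FALSE)

# Step 1: samples -> subjects; all.x = TRUE keeps samples without a subject record.
ss <- merge(samples_toy, subjects_toy, by.x = "subject_id", by.y = "id",
            all.x = TRUE, suffixes = c("_samples", "_subjects"))

# Step 2: rename the measurement identifier, then join on the sample id.
names(measurements_toy)[names(measurements_toy) == "id"] <- "measurement_id"
full <- merge(ss, measurements_toy, by.x = "id", by.y = "sample_id", all.x = TRUE)
full
```

With `all.x = TRUE` in both steps every sample is retained. Samples without a matching subject or measurement simply carry `NA` in the joined columns, which makes gaps in the linkage easy to count.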
head(mLLS)
## subject_id biobank_samples id
## 1 LLS_PARTOFFS-1013010 LLS_PARTOFFS LLS_PARTOFFS-582
## 2 LLS_PARTOFFS-1013020 LLS_PARTOFFS LLS_PARTOFFS-881
## 3 LLS_PARTOFFS-1023010 LLS_PARTOFFS LLS_PARTOFFS-1397
## 4 LLS_PARTOFFS-1023011 LLS_PARTOFFS LLS_PARTOFFS-1398
## 5 LLS_PARTOFFS-1023040 LLS_PARTOFFS LLS_PARTOFFS-1126
## 6 LLS_PARTOFFS-103010 LLS_PARTOFFS LLS_PARTOFFS-40
## date_collection date_inclusion sample_matrix fasting
## 1 2003-08-19T00:00:00+0200 <NA> EDTA plasma false
## 2 2003-11-10T00:00:00+0100 <NA> EDTA plasma false
## 3 2004-03-15T00:00:00+0100 <NA> EDTA plasma false
## 4 2004-03-15T00:00:00+0100 <NA> EDTA plasma false
## 5 2003-12-23T00:00:00+0100 <NA> EDTA plasma false
## 6 2002-11-01T00:00:00+0100 <NA> EDTA plasma false
## time_handling temp_storage time_storage biobank_subjects bios_id
## 1 4 -80 128 LLS_PARTOFFS
## 2 3 -80 125 LLS_PARTOFFS
## 3 3 -80 121 LLS_PARTOFFS 1397
## 4 3 -80 121 LLS_PARTOFFS 1398
## 5 4 -80 124 LLS_PARTOFFS
## 6 2 -80 137 LLS_PARTOFFS 40
## date_of_birth age_bloodcollection gender pedigree_information
## 1 1935-09-19T00:00:00+0119 NA female true
## 2 1933-01-22T00:00:00+0019 NA female true
## 3 1950-12-23T00:00:00+0100 NA female true
## 4 1951-06-25T00:00:00+0100 NA male true
## 5 1958-03-02T00:00:00+0100 NA female true
## 6 1944-04-27T00:00:00+0200 NA female true
## gwas_platform_used gwas_available_date dna_amount dna_source
## 1 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## 2 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## 3 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## 4 Illumina Omni-express 2012-01-01 00:00:00 true EDTA buffy coat
## 5 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## 6 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## rna_amount rna_source date_of_inclusion smoking
## 1 true Whole blood (PAXgene) 2003-08-19T00:00:00+0200
## 2 true Whole blood (PAXgene) 2003-11-10T00:00:00+0100 false
## 3 true Whole blood (PAXgene) 2004-03-15T00:00:00+0100 false
## 4 true Whole blood (PAXgene) 2004-03-15T00:00:00+0100 false
## 5 true Whole blood (PAXgene) 2003-12-23T00:00:00+0100
## 6 true Whole blood (PAXgene) 2002-11-01T00:00:00+0100 false
## alcohol_consumption height weight waist_circumference hip_circumference
## 1 NA NA NA NA
## 2 true 155 51 NA NA
## 3 true 164 65 NA NA
## 4 true NA NA NA NA
## 5 NA NA NA NA
## 6 true 159 70 NA NA
## hs_crp wbc hgb hct plt neut_percentage lymph_percentage
## 1 0.55 4.29 8.3 0.394 233 52.4 32.9
## 2 1.16 4.73 9.6 0.427 159 36.1 48.9
## 3 1.23 6.41 8.3 0.389 316 59.6 28.5
## 4 0.16 5.02 8.4 0.391 218 66.8 23.5
## 5 0.77 5.98 9.2 0.451 306 69.9 21.2
## 6 3.25 5.28 7.9 0.398 292 51.4 35.8
## mono_percentage eos_percentage baso_percentage luc_percentage
## 1 8.2 2.1 0.7 3.8
## 2 9.3 1.2 1.0 3.5
## 3 5.1 4.1 0.7 2.0
## 4 4.9 2.7 0.7 1.4
## 5 5.2 1.3 0.7 1.8
## 6 7.3 2.0 0.7 2.8
## tot_cholesterol hdl_cholesterol triglycerides systolic_blood_pressure
## 1 4.95 1.64 1.25 NA
## 2 5.84 2.00 1.14 NA
## 3 6.39 1.70 3.67 NA
## 4 4.66 0.98 4.19 NA
## 5 5.11 1.63 0.80 NA
## 6 5.77 1.60 2.06 NA
## diastolic_blood_pressure lipid_lowering_med blood_pressure_lowering_med
## 1 NA 1 false
## 2 NA 0 false
## 3 NA 0 false
## 4 NA 1 true
## 5 NA 0 false
## 6 NA 0 true
## metabolic_syndrome diabetes age
## 1 <NA> false 80.71781
## 2 <NA> false 83.37534
## 3 <NA> false 65.44658
## 4 <NA> true 64.94247
## 5 <NA> 58.25205
## 6 <NA> false 72.10685
mmLLS <- merge(mLLS, measurements, by.x = "id", by.y = "sample_id", all.x = TRUE,
suffixes = c("_merged", "_measurements"))
## Warning in merge.data.frame(mLLS, measurements, by.x = "id", by.y =
## "sample_id", : column name 'id' is duplicated in the result
dim(mmLLS)
## [1] 3311 299
head(mmLLS)
## id subject_id biobank_samples
## 1 LLS_PARTOFFS-10 LLS_PARTOFFS-53021 LLS_PARTOFFS
## 2 LLS_PARTOFFS-100 LLS_PARTOFFS-233030 LLS_PARTOFFS
## 3 LLS_PARTOFFS-1000 LLS_PARTOFFS-1173060 LLS_PARTOFFS
## 4 LLS_PARTOFFS-1001 LLS_PARTOFFS-1173120 LLS_PARTOFFS
## 5 LLS_PARTOFFS-1002 LLS_PARTOFFS-1173100 LLS_PARTOFFS
## 6 LLS_PARTOFFS-1003 LLS_PARTOFFS-1173080 LLS_PARTOFFS
## date_collection date_inclusion sample_matrix fasting
## 1 2002-09-27T00:00:00+0200 <NA> EDTA plasma false
## 2 2002-12-04T00:00:00+0100 <NA> EDTA plasma false
## 3 2003-11-28T00:00:00+0100 <NA> EDTA plasma false
## 4 2003-11-28T00:00:00+0100 <NA> EDTA plasma false
## 5 2003-11-28T00:00:00+0100 <NA> EDTA plasma false
## 6 2003-11-28T00:00:00+0100 <NA> EDTA plasma false
## time_handling temp_storage time_storage biobank_subjects bios_id
## 1 5 -80 139 LLS_PARTOFFS <NA>
## 2 4 -80 136 LLS_PARTOFFS
## 3 6 -80 125 LLS_PARTOFFS
## 4 4 -80 125 LLS_PARTOFFS
## 5 5 -80 125 LLS_PARTOFFS 1002
## 6 5 -80 125 LLS_PARTOFFS
## date_of_birth age_bloodcollection gender pedigree_information
## 1 1946-03-28T00:00:00+0100 NA female true
## 2 1947-12-18T00:00:00+0100 NA male true
## 3 1938-10-21T00:00:00+0020 NA male true
## 4 1949-03-07T00:00:00+0100 NA female true
## 5 1945-07-08T00:00:00+0200 NA female true
## 6 1940-12-24T00:00:00+0200 NA male true
## gwas_platform_used gwas_available_date dna_amount dna_source
## 1 Illumina Omni-express 2012-01-01 00:00:00 true EDTA buffy coat
## 2 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## 3 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## 4 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## 5 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## 6 Illumina 660 2012-01-01 00:00:00 true EDTA buffy coat
## rna_amount rna_source date_of_inclusion smoking
## 1 true Whole blood (PAXgene) 2002-09-27T00:00:00+0200 true
## 2 true Whole blood (PAXgene) 2002-12-04T00:00:00+0100
## 3 true Whole blood (PAXgene) 2003-11-28T00:00:00+0100 false
## 4 true Whole blood (PAXgene) 2003-11-28T00:00:00+0100 false
## 5 true Whole blood (PAXgene) 2003-11-28T00:00:00+0100 false
## 6 true Whole blood (PAXgene) 2003-11-28T00:00:00+0100 false
## alcohol_consumption height weight waist_circumference hip_circumference
## 1 true 174 75 NA NA
## 2 NA NA NA NA
## 3 true 175 78 NA NA
## 4 true 167 64 NA NA
## 5 true 160 74 NA NA
## 6 true 167 83 NA NA
## hs_crp wbc hgb hct plt neut_percentage lymph_percentage
## 1 1.73 7.08 9.6 0.463 372 55.0 34.6
## 2 1.34 4.31 9.4 0.437 246 59.7 25.5
## 3 7.02 7.60 9.1 0.443 241 74.6 16.4
## 4 0.71 5.97 8.6 0.433 274 69.6 21.4
## 5 1.09 6.92 8.5 0.405 254 60.5 30.2
## 6 4.01 5.66 10.3 0.507 194 53.7 34.2
## mono_percentage eos_percentage baso_percentage luc_percentage
## 1 5.4 3.0 0.6 1.4
## 2 6.6 4.2 2.9 1.2
## 3 5.6 1.4 0.4 1.6
## 4 4.6 2.3 0.5 1.7
## 5 5.1 2.1 0.5 1.6
## 6 6.6 2.0 0.7 2.8
## tot_cholesterol hdl_cholesterol triglycerides systolic_blood_pressure
## 1 7.69 1.01 1.50 NA
## 2 5.80 1.01 4.54 NA
## 3 4.04 1.01 1.66 NA
## 4 5.66 1.54 2.32 NA
## 5 5.97 1.42 1.06 NA
## 6 6.11 1.74 2.18 NA
## diastolic_blood_pressure lipid_lowering_med blood_pressure_lowering_med
## 1 NA 0 false
## 2 NA 0 true
## 3 NA 0 false
## 4 NA 0 false
## 5 NA 0 false
## 6 NA 0 false
## metabolic_syndrome diabetes age id acace
## 1 false 70.18904 BBMRI-LLS_PARTOFFS.10 0.003705
## 2 <NA> false 68.46301 BBMRI-LLS_PARTOFFS.100 0.007047
## 3 <NA> false 77.62740 BBMRI-LLS_PARTOFFS.1000 0.057550
## 4 <NA> false 67.24384 BBMRI-LLS_PARTOFFS.1001 0.039930
## 5 <NA> false 70.90959 BBMRI-LLS_PARTOFFS.1002 0.025510
## 6 <NA> false 75.44932 BBMRI-LLS_PARTOFFS.1003 0.045990
## ace ala alb apoa1 apob apob_apoa1 bohbut cit cla
## 1 0.02638 0.3544 0.09067 1.650 1.4050 0.8518 0.03281 0.04117 0.06249
## 2 0.02098 0.3569 0.08845 1.629 1.1800 0.7245 0.04961 0.07549 0.01911
## 3 0.03499 0.2318 0.08536 1.326 0.8481 0.6397 0.09579 0.04290 0.01517
## 4 0.03966 0.2608 0.08722 1.697 1.0380 0.6115 0.08170 0.09927 0.03234
## 5 0.03837 0.3309 0.09331 1.607 1.0040 0.6249 0.06824 0.05616 0.03013
## 6 0.02615 0.3283 0.09608 1.930 1.1000 0.5702 0.18570 0.10930 0.02280
## cla_fa crea dag dag_tg dha dha_fa estc falen faw3
## 1 0.4267 0.05741 0.005182 0.003562 0.2108 1.4390 4.786 17.02 0.6226
## 2 0.1210 0.06608 0.046690 0.016250 0.1262 0.7990 2.967 17.67 0.4184
## 3 0.1562 0.05893 0.009926 0.007891 0.1245 1.2820 2.313 17.65 0.3243
## 4 0.2470 0.04218 0.035250 0.021610 0.1210 0.9248 3.275 17.38 0.3542
## 5 0.2801 0.05559 0.013520 0.018010 0.1380 1.2840 3.459 17.03 0.3864
## 6 0.1682 0.06723 0.019080 0.011930 0.1460 1.0770 3.663 18.27 0.4711
## faw3_fa faw6 faw6_fa freec glc gln gp hdl_c hdl_d hdl_tg
## 1 4.251 4.689 32.01 2.001 3.862 0.4322 1.382 1.6190 10.020 0.1778
## 2 2.649 4.593 29.08 1.229 5.831 0.3969 1.649 1.1300 9.778 0.2212
## 3 3.341 3.352 34.53 1.015 4.540 0.4414 1.548 0.9519 9.555 0.1227
## 4 2.706 4.262 32.56 1.328 4.246 0.5418 1.359 1.4230 9.947 0.1457
## 5 3.594 3.698 34.39 1.428 4.901 0.4521 1.267 1.4300 9.850 0.1016
## 6 3.476 4.233 31.24 1.580 6.019 0.4272 1.357 1.7870 10.070 0.1868
## hdl2_c hdl3_c his idl_c idl_c_percentage idl_ce idl_ce_percentage
## 1 0.7916 0.8271 0.04683 1.1440 63.48 0.7836 43.49
## 2 0.7628 0.3668 0.01640 0.5548 61.06 0.4241 46.68
## 3 0.5535 0.3985 0.05765 0.4923 62.47 0.3516 44.61
## 4 0.9848 0.4378 0.06723 0.6946 65.45 0.4883 46.01
## 5 0.9213 0.5084 0.05599 0.7886 65.46 0.5450 45.24
## 6 1.3100 0.4764 0.06068 0.7583 62.03 0.5429 44.41
## idl_fc idl_fc_percentage idl_l idl_p idl_pl idl_pl_percentage
## 1 0.3603 20.00 1.8020 0.00000017640 0.4668 25.90
## 2 0.1307 14.39 0.9086 0.00000009239 0.2165 23.83
## 3 0.1407 17.85 0.7881 0.00000007802 0.2100 26.64
## 4 0.2063 19.44 1.0610 0.00000010340 0.2712 25.55
## 5 0.2436 20.22 1.2050 0.00000011670 0.3205 26.60
## 6 0.2154 17.62 1.2220 0.00000012160 0.3172 25.95
## idl_tg idl_tg_percentage ile l_hdl_c l_hdl_c_percentage l_hdl_ce
## 1 0.19120 10.610 0.04400 0.3633 53.01 0.2705
## 2 0.13730 15.110 0.07118 0.1973 43.03 0.1582
## 3 0.08584 10.890 0.05212 0.1327 42.17 0.1094
## 4 0.09552 9.001 0.08965 0.3381 48.92 0.2564
## 5 0.09553 7.931 0.03739 0.2927 46.25 0.2189
## 6 0.14700 12.030 0.05390 0.4758 45.91 0.3596
## l_hdl_ce_percentage l_hdl_fc l_hdl_fc_percentage l_hdl_l l_hdl_p
## 1 39.47 0.09278 13.540 0.6854 0.0000010799999
## 2 34.51 0.03908 8.524 0.4585 0.0000007434000
## 3 34.76 0.02335 7.418 0.3147 0.0000005113000
## 4 37.10 0.08168 11.820 0.6911 0.0000010890000
## 5 34.59 0.07380 11.660 0.6328 0.0000009991001
## 6 34.70 0.11620 11.210 1.0360 0.0000016460000
## l_hdl_pl l_hdl_pl_percentage l_hdl_tg l_hdl_tg_percentage l_ldl_c
## 1 0.2640 38.52 0.05810 8.477 1.4570
## 2 0.2304 50.25 0.03080 6.719 0.6435
## 3 0.1656 52.61 0.01641 5.215 0.5803
## 4 0.3296 47.68 0.02350 3.400 0.7997
## 5 0.3230 51.05 0.01709 2.700 0.9788
## 6 0.5163 49.82 0.04422 4.267 0.8960
## l_ldl_c_percentage l_ldl_ce l_ldl_ce_percentage l_ldl_fc
## 1 69.07 1.0620 50.37 0.3944
## 2 62.75 0.4727 46.09 0.1708
## 3 65.72 0.4018 45.50 0.1785
## 4 67.41 0.5670 47.79 0.2327
## 5 69.26 0.6902 48.84 0.2886
## 6 65.36 0.6428 46.89 0.2532
## l_ldl_fc_percentage l_ldl_l l_ldl_p l_ldl_pl l_ldl_pl_percentage
## 1 18.70 2.1090 0.0000002959 0.4813 22.82
## 2 16.66 1.0260 0.0000001474 0.2709 26.42
## 3 20.21 0.8831 0.0000001232 0.2424 27.45
## 4 19.61 1.1860 0.0000001657 0.3046 25.68
## 5 20.42 1.4130 0.0000001957 0.3513 24.86
## 6 18.47 1.3710 0.0000001943 0.3405 24.84
## l_ldl_tg l_ldl_tg_percentage l_vldl_c l_vldl_c_percentage l_vldl_ce
## 1 0.17090 8.103 0.09075 31.04 0.05192
## 2 0.11110 10.830 0.24130 22.63 0.11850
## 3 0.06031 6.830 0.07841 21.73 0.04339
## 4 0.08206 6.917 0.12130 23.15 0.05801
## 5 0.08320 5.887 0.03990 24.73 0.02391
## 6 0.13450 9.807 0.11140 24.17 0.05330
## l_vldl_ce_percentage l_vldl_fc l_vldl_fc_percentage l_vldl_l
## 1 17.76 0.03883 13.280 0.2923
## 2 11.12 0.12280 11.520 1.0660
## 3 12.02 0.03502 9.703 0.3609
## 4 11.07 0.06332 12.080 0.5241
## 5 14.82 0.01599 9.907 0.1614
## 6 11.57 0.05807 12.600 0.4608
## l_vldl_p l_vldl_pl l_vldl_pl_percentage l_vldl_tg
## 1 0.000000004834 0.06405 21.91 0.1375
## 2 0.000000018390 0.19020 17.84 0.6347
## 3 0.000000006301 0.05993 16.60 0.2226
## 4 0.000000009043 0.08432 16.09 0.3185
## 5 0.000000002791 0.02615 16.21 0.0953
## 6 0.000000007895 0.07842 17.02 0.2711
## l_vldl_tg_percentage la la_fa lac ldl_c ldl_d ldl_tg leu
## 1 47.05 3.678 25.11 1.1470 2.858 23.61 0.2977 0.05075
## 2 59.53 4.004 25.35 1.0570 1.227 23.46 0.2143 0.05892
## 3 61.67 2.801 28.84 1.0200 1.109 23.60 0.1017 0.06372
## 4 60.76 3.644 27.84 0.9523 1.514 23.55 0.1449 0.09815
## 5 59.06 2.954 27.47 1.3060 1.890 23.58 0.1400 0.05319
## 6 58.82 3.527 26.02 1.8790 1.708 23.58 0.2397 0.07647
## m_hdl_c m_hdl_c_percentage m_hdl_ce m_hdl_ce_percentage m_hdl_fc
## 1 0.5278 59.66 0.4139 46.78 0.11400
## 2 0.4069 45.63 0.3280 36.79 0.07887
## 3 0.3476 48.05 0.2819 38.97 0.06566
## 4 0.4606 48.48 0.3724 39.20 0.08820
## 5 0.4736 51.49 0.3697 40.20 0.10390
## 6 0.5756 47.85 0.4413 36.68 0.13430
## m_hdl_fc_percentage m_hdl_l m_hdl_p m_hdl_pl m_hdl_pl_percentage
## 1 12.880 0.8847 0.000002028 0.3108 35.13
## 2 8.845 0.8917 0.000002132 0.4095 45.92
## 3 9.077 0.7234 0.000001714 0.3292 45.51
## 4 9.283 0.9502 0.000002243 0.4372 46.02
## 5 11.300 0.9197 0.000002141 0.4043 43.96
## 6 11.160 1.2030 0.000002820 0.5630 46.80
## m_hdl_tg m_hdl_tg_percentage m_ldl_c m_ldl_c_percentage m_ldl_ce
## 1 0.04601 5.201 0.8815 71.13 0.6572
## 2 0.07533 8.448 0.3658 58.95 0.2392
## 3 0.04662 6.445 0.3303 64.42 0.2146
## 4 0.05230 5.504 0.4392 65.18 0.2986
## 5 0.04179 4.544 0.5671 69.63 0.4095
## 6 0.06441 5.354 0.5028 63.81 0.3481
## m_ldl_ce_percentage m_ldl_fc m_ldl_fc_percentage m_ldl_l m_ldl_p
## 1 53.04 0.2243 18.10 1.2390 0.00000024270
## 2 38.55 0.1266 20.40 0.6205 0.00000012290
## 3 41.86 0.1157 22.56 0.5127 0.00000009882
## 4 44.31 0.1406 20.87 0.6738 0.00000013090
## 5 50.27 0.1576 19.35 0.8145 0.00000015810
## 6 44.18 0.1547 19.64 0.7880 0.00000015530
## m_ldl_pl m_ldl_pl_percentage m_ldl_tg m_ldl_tg_percentage m_vldl_c
## 1 0.2769 22.34 0.08086 6.525 0.2449
## 2 0.2003 32.28 0.05442 8.770 0.4056
## 3 0.1593 31.08 0.02308 4.502 0.1934
## 4 0.2000 29.68 0.03458 5.132 0.2384
## 5 0.2115 25.97 0.03584 4.400 0.1516
## 6 0.2221 28.18 0.06308 8.006 0.2344
## m_vldl_c_percentage m_vldl_ce m_vldl_ce_percentage m_vldl_fc
## 1 43.77 0.16310 29.16 0.08175
## 2 25.85 0.21400 13.64 0.19160
## 3 25.67 0.10760 14.28 0.08585
## 4 28.06 0.14100 16.60 0.09740
## 5 33.32 0.09792 21.53 0.05366
## 6 30.38 0.13090 16.97 0.10350
## m_vldl_fc_percentage m_vldl_l m_vldl_p m_vldl_pl
## 1 14.61 0.5594 0.00000001570 0.10910
## 2 12.21 1.5690 0.00000004720 0.29990
## 3 11.40 0.7534 0.00000002276 0.14160
## 4 11.46 0.8496 0.00000002548 0.15790
## 5 11.80 0.4549 0.00000001336 0.08907
## 6 13.42 0.7716 0.00000002276 0.14970
## m_vldl_pl_percentage m_vldl_tg m_vldl_tg_percentage mufa mufa_fa pc
## 1 19.50 0.2054 36.72 3.454 23.58 2.368
## 2 19.12 0.8634 55.03 4.373 27.69 1.962
## 3 18.80 0.4183 55.53 2.250 23.18 1.485
## 4 18.58 0.4534 53.36 3.216 24.57 1.999
## 5 19.58 0.2143 47.10 2.704 25.14 1.851
## 6 19.40 0.3875 50.22 3.895 28.74 2.505
## phe pufa pufa_fa pyr remnant_c s_hdl_c s_hdl_c_percentage
## 1 0.04166 5.311 36.26 0.1114 2.310 0.5023 49.98
## 2 0.04510 5.012 31.73 0.1030 1.839 0.3623 33.23
## 3 0.04048 3.677 37.87 0.1131 1.267 0.4052 38.49
## 4 0.05951 4.617 35.27 0.1080 1.667 0.3930 35.79
## 5 0.04440 4.085 37.99 0.1160 1.568 0.5056 45.34
## 6 0.04973 4.704 34.71 0.1418 1.748 0.4794 37.87
## s_hdl_ce s_hdl_ce_percentage s_hdl_fc s_hdl_fc_percentage s_hdl_l
## 1 0.3848 38.30 0.1174 11.69 1.005
## 2 0.2341 21.47 0.1282 11.76 1.090
## 3 0.2868 27.25 0.1184 11.24 1.053
## 4 0.2606 23.73 0.1324 12.06 1.098
## 5 0.3934 35.28 0.1122 10.06 1.115
## 6 0.3110 24.57 0.1684 13.30 1.266
## s_hdl_p s_hdl_pl s_hdl_pl_percentage s_hdl_tg s_hdl_tg_percentage
## 1 0.000004470 0.4430 44.09 0.05957 5.928
## 2 0.000004970 0.6441 59.08 0.08376 7.683
## 3 0.000004746 0.5960 56.62 0.05145 4.888
## 4 0.000004948 0.6542 59.57 0.05092 4.637
## 5 0.000004990 0.5730 51.39 0.03648 3.272
## 6 0.000005659 0.7301 57.67 0.05655 4.466
## s_ldl_c s_ldl_c_percentage s_ldl_ce s_ldl_ce_percentage s_ldl_fc
## 1 0.5199 69.84 0.3936 52.87 0.12640
## 2 0.2174 52.43 0.1385 33.40 0.07889
## 3 0.1979 60.18 0.1276 38.80 0.07033
## 4 0.2748 61.15 0.1869 41.60 0.08787
## 5 0.3436 67.43 0.2489 48.84 0.09473
## 6 0.3093 60.38 0.2150 41.98 0.09426
## s_ldl_fc_percentage s_ldl_l s_ldl_p s_ldl_pl s_ldl_pl_percentage
## 1 16.98 0.7444 0.0000002635 0.1785 23.98
## 2 19.02 0.4147 0.0000001510 0.1485 35.82
## 3 21.38 0.3289 0.0000001155 0.1126 34.25
## 4 19.55 0.4493 0.0000001591 0.1463 32.57
## 5 18.59 0.5096 0.0000001788 0.1450 28.45
## 6 18.40 0.5122 0.0000001834 0.1607 31.38
## s_ldl_tg s_ldl_tg_percentage s_vldl_c s_vldl_c_percentage s_vldl_ce
## 1 0.04599 6.177 0.3741 41.97 0.2456
## 2 0.04876 11.760 0.3444 31.12 0.1875
## 3 0.01833 5.573 0.2552 35.08 0.1502
## 4 0.02822 6.280 0.3024 39.95 0.1842
## 5 0.02099 4.120 0.2801 44.68 0.1773
## 6 0.04221 8.241 0.3160 40.44 0.1891
## s_vldl_ce_percentage s_vldl_fc s_vldl_fc_percentage s_vldl_l
## 1 27.55 0.1285 14.41 0.8915
## 2 16.94 0.1569 14.18 1.1060
## 3 20.65 0.1049 14.43 0.7274
## 4 24.34 0.1182 15.61 0.7568
## 5 28.28 0.1028 16.41 0.6268
## 6 24.20 0.1269 16.24 0.7813
## s_vldl_p s_vldl_pl s_vldl_pl_percentage s_vldl_tg
## 1 0.00000004408 0.2436 27.33 0.2737
## 2 0.00000005748 0.2453 22.17 0.5168
## 3 0.00000003721 0.1651 22.70 0.3072
## 4 0.00000003791 0.1716 22.67 0.2829
## 5 0.00000003070 0.1505 24.00 0.1963
## 6 0.00000003892 0.1826 23.37 0.2828
## s_vldl_tg_percentage serum_c serum_tg sfa sfa_fa sm tg_pg totcho
## 1 30.71 6.787 1.5180 5.882 40.16 0.6856 0.6091 2.974
## 2 46.71 4.196 3.0670 6.409 40.58 0.3414 1.4670 2.235
## 3 42.22 3.327 1.4430 3.782 38.95 0.4188 0.8645 1.799
## 4 37.38 4.603 1.7000 5.257 40.16 0.5256 0.8147 2.392
## 5 31.31 4.887 0.9563 3.965 36.87 0.4774 0.4211 2.207
## 6 36.19 5.243 1.7760 4.953 36.55 0.4250 0.6577 2.801
## totfa totpg tyr unsatdeg val vldl_c vldl_d vldl_tg xl_hdl_c
## 1 14.650 2.389 0.04587 1.171 0.1545 1.1660 35.66 0.8507 0.22530
## 2 15.790 1.959 0.06946 1.085 0.1113 1.2850 39.96 2.4940 0.16310
## 3 9.709 1.455 0.05004 1.181 0.1827 0.7745 37.84 1.1320 0.06647
## 4 13.090 2.002 0.11030 1.121 0.2065 0.9724 38.25 1.3140 0.23100
## 5 10.750 1.783 0.06058 1.229 0.1339 0.7794 35.81 0.6191 0.15790
## 6 13.550 2.432 0.09906 1.218 0.1766 0.9896 37.48 1.2020 0.25610
## xl_hdl_c_percentage xl_hdl_ce xl_hdl_ce_percentage xl_hdl_fc
## 1 52.17 0.19530 45.22 0.03002
## 2 58.40 0.12150 43.50 0.04160
## 3 67.92 0.05457 55.77 0.01189
## 4 57.73 0.17100 42.74 0.06000
## 5 54.80 0.11580 40.19 0.04207
## 6 50.70 0.18810 37.24 0.06796
## xl_hdl_fc_percentage xl_hdl_l xl_hdl_p xl_hdl_pl
## 1 6.951 0.43190 0.00000043410 0.19250
## 2 14.890 0.27940 0.00000027500 0.08489
## 3 12.160 0.09785 0.00000009559 0.02313
## 4 15.000 0.40000 0.00000038760 0.15010
## 5 14.600 0.28810 0.00000027880 0.12400
## 6 13.460 0.50510 0.00000049630 0.22740
## xl_hdl_pl_percentage xl_hdl_tg xl_hdl_tg_percentage xl_vldl_c
## 1 44.56 0.014130 3.271 0.011350
## 2 30.39 0.031340 11.220 0.064820
## 3 23.64 0.008258 8.439 0.016840
## 4 37.51 0.019010 4.751 0.031350
## 5 43.04 0.006249 2.169 0.004626
## 6 45.01 0.021650 4.286 0.026650
## xl_vldl_c_percentage xl_vldl_ce xl_vldl_ce_percentage xl_vldl_fc
## 1 17.64 0.003116 4.845 0.008232
## 2 19.95 0.034680 10.680 0.030140
## 3 21.23 0.009798 12.350 0.007041
## 4 20.26 0.016340 10.560 0.015010
## 5 29.54 0.002569 16.410 0.002057
## 6 20.40 0.013730 10.510 0.012930
## xl_vldl_fc_percentage xl_vldl_l xl_vldl_p xl_vldl_pl
## 1 12.800 0.06432 0.0000000006560 0.010790
## 2 9.278 0.32490 0.0000000033340 0.052500
## 3 8.875 0.07933 0.0000000008151 0.011370
## 4 9.702 0.15470 0.0000000015850 0.024540
## 5 13.140 0.01566 0.0000000001534 0.002668
## 6 9.894 0.13070 0.0000000013330 0.022550
## xl_vldl_pl_percentage xl_vldl_tg xl_vldl_tg_percentage xs_vldl_c
## 1 16.77 0.042190 65.59 0.4368
## 2 16.16 0.207500 63.88 0.2064
## 3 14.33 0.051130 64.45 0.2253
## 4 15.86 0.098840 63.88 0.2687
## 5 17.04 0.008364 53.42 0.3013
## 6 17.26 0.081450 62.34 0.2920
## xs_vldl_c_percentage xs_vldl_ce xs_vldl_ce_percentage xs_vldl_fc
## 1 49.42 0.2878 32.57 0.14890
## 2 38.75 0.1352 25.38 0.07120
## 3 49.01 0.1495 32.50 0.07590
## 4 49.96 0.1813 33.72 0.08738
## 5 51.67 0.1922 32.95 0.10910
## 6 46.90 0.1871 30.05 0.10490
## xs_vldl_fc_percentage xs_vldl_l xs_vldl_p xs_vldl_pl
## 1 16.85 0.8838 0.00000006895 0.2776
## 2 13.36 0.5328 0.00000004427 0.1414
## 3 16.50 0.4598 0.00000003644 0.1202
## 4 16.25 0.5378 0.00000004230 0.1523
## 5 18.71 0.5832 0.00000004480 0.1819
## 6 16.85 0.6227 0.00000004922 0.1857
## xs_vldl_pl_percentage xs_vldl_tg xs_vldl_tg_percentage xxl_vldl_c
## 1 31.41 0.16940 19.17 0.008250
## 2 26.54 0.18500 34.72 0.022160
## 3 26.13 0.11430 24.87 0.005346
## 4 28.32 0.11680 21.72 0.010200
## 5 31.19 0.09996 17.14 0.001834
## 6 29.82 0.14490 23.28 0.009104
## xxl_vldl_c_percentage xxl_vldl_ce xxl_vldl_ce_percentage xxl_vldl_fc
## 1 26.65 0.006399 20.670 0.0018510
## 2 17.89 0.011980 9.672 0.0101800
## 3 19.71 0.003108 11.460 0.0022380
## 4 16.53 0.004910 7.962 0.0052870
## 5 22.99 0.001438 18.040 0.0003954
## 6 18.26 0.004670 9.365 0.0044330
## xxl_vldl_fc_percentage xxl_vldl_l xxl_vldl_p xxl_vldl_pl
## 1 5.979 0.030960 0.00000000014470 0.0003335
## 2 8.218 0.123800 0.00000000057710 0.0149700
## 3 8.250 0.027130 0.00000000012590 0.0030720
## 4 8.573 0.061680 0.00000000028780 0.0078650
## 5 4.958 0.007975 0.00000000003677 0.0011910
## 6 8.890 0.049870 0.00000000023120 0.0063730
## xxl_vldl_pl_percentage xxl_vldl_tg xxl_vldl_tg_percentage
## 1 1.077 0.02238 72.28
## 2 12.090 0.08672 70.02
## 3 11.330 0.01871 68.97
## 4 12.750 0.04361 70.71
## 5 14.940 0.00495 62.07
## 6 12.780 0.03439 68.96
## abnormal_macromolecule_a low_glucose low_glutamine_high_glutamate
## 1 false false false
## 2 false false false
## 3 false false false
## 4 false false false
## 5 false false false
## 6 false false false
## low_protein_content high_citrate high_ethanol high_lactate high_pyruvate
## 1 false false false false false
## 2 false false false false false
## 3 false false false false false
## 4 false false false false false
## 5 false false false false false
## 6 false false false false false
## serum_sample unidentified_small_molecule_a unidentified_small_molecule_b
## 1 false false false
## 2 false false false
## 3 false false false
## 4 false false false
## 5 false false false
## 6 false false false
## unknown_acetylated_compound isopropyl_alcohol polysaccharides
## 1 false false false
## 2 false false false
## 3 false false false
## 4 false false false
## 5 false false false
## 6 false false false
## aminocaproic_acid fast biobank
## 1 false false LLS_PARTOFFS
## 2 false false LLS_PARTOFFS
## 3 false false LLS_PARTOFFS
## 4 false false LLS_PARTOFFS
## 5 false false LLS_PARTOFFS
## 6 false false LLS_PARTOFFS
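The warning reported for the second merge arises because merge() never applies suffixes to the key columns: the measurements table carries its own id column next to sample_id, so the result ends up with two columns called id. The following is a minimal sketch of one way to avoid this, using toy data frames; all table contents and the name measurement_id are hypothetical and only serve the illustration.

```r
# Toy stand-ins for the sample and measurement tables (hypothetical values).
samples <- data.frame(id = c("S-1", "S-2", "S-3"),
                      biobank = "LLS",
                      stringsAsFactors = FALSE)
measurements <- data.frame(sample_id = c("S-1", "S-2"),
                           id = c("BBMRI-S.1", "BBMRI-S.2"),
                           biobank = "LLS",
                           ace = c(0.026, 0.021),
                           stringsAsFactors = FALSE)

# Rename the non-key 'id' column so it cannot collide with the key column 'id'.
names(measurements)[names(measurements) == "id"] <- "measurement_id"

# Left join: keep every sample, also those without measurements.
merged <- merge(samples, measurements, by.x = "id", by.y = "sample_id",
                all.x = TRUE, suffixes = c("_samples", "_measurements"))
names(merged)
```

After the rename, every column name in the result is unique, and the shared non-key column biobank receives the two suffixes.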
Using metabolomic SummarizedExperiments
data(metabolomics_RP3RP4_overlap)
metabolomicData
## class: SummarizedExperiment0
## dim: 247 3882
## metadata(0):
## assays(1): measurements
## rownames(247): acace ace ... aminocaproic_acid fast
## metadata column names(0):
## colnames: NULL
## colData names(51): biobank subject_id ... temp_storage
## time_storage
colData(metabolomicData)
## DataFrame with 3882 rows and 51 columns
## biobank subject_id bios_id date_of_birth age_bloodcollection
## <factor> <factor> <character> <factor> <numeric>
## 1 VUNTR VUNTR-A20A A20A 56.7
## 2 VUNTR VUNTR-A20B A20B 56.0
## 3 VUNTR VUNTR-A20C A20C 31.0
## 4 VUNTR VUNTR-A21A A21A 65.0
## 5 VUNTR VUNTR-A21B A21B 59.7
## ... ... ... ... ... ...
## 3878 VUNTR VUNTR-A56C A56C 32.6
## 3879 VUNTR VUNTR-A573C A573C 23.6
## 3880 VUNTR VUNTR-A573D A573D 23.6
## 3881 VUNTR VUNTR-A574C A574C 22.4
## 3882 VUNTR VUNTR-A575C A575C 22.8
## gender pedigree_information
## <factor> <factor>
## 1 true true
## 2 false true
## 3 true true
## 4 true true
## 5 false true
## ... ... ...
## 3878 false true
## 3879 false true
## 3880 false true
## 3881 true true
## 3882 true true
## gwas_platform_used gwas_available_date
## <factor> <factor>
## 1 Illumina Omni 1M 06-2015
## 2 Illumina Omni 1M 06-2015
## 3 Illumina Omni 1M 06-2015
## 4 Illumina Omni 1M 06-2015
## 5 Illumina Omni 1M 06-2015
## ... ... ...
## 3878 Illumina Omni 1M; Affymetrix 6.0 907K 06-2015
## 3879 Affymetrix Genome-Wide Human SNP Array 6.0 06-2015
## 3880 Affymetrix Genome-Wide Human SNP Array 6.0 06-2015
## 3881 Affymetrix 6.0 907K 06-2015
## 3882
## dna_amount dna_source rna_amount rna_source date_of_inclusion
## <factor> <factor> <factor> <factor> <factor>
## 1 true blood true blood 2008-10-01T00:00:00+0200
## 2 true blood true blood 2008-10-01T00:00:00+0200
## 3 true blood true blood 2008-11-01T00:00:00+0100
## 4 true blood true blood 2008-03-01T00:00:00+0100
## 5 true blood true blood 2008-03-01T00:00:00+0100
## ... ... ... ... ... ...
## 3878 true blood true blood 2008-07-01T00:00:00+0200
## 3879 true blood true blood 2012-09-01T00:00:00+0200
## 3880 true blood true blood 2012-09-01T00:00:00+0200
## 3881 true blood true blood 2012-04-01T00:00:00+0200
## 3882 true blood true blood 2012-07-01T00:00:00+0200
## smoking alcohol_consumption height weight waist_circumference
## <factor> <factor> <numeric> <numeric> <numeric>
## 1 true 183 89.5 96
## 2 false 178 106.2 105
## 3 true 189 70.5 73
## 4 false 176 83.3 94
## 5 false 169 64.6 69
## ... ... ... ... ... ...
## 3878 false 159 54.0 76
## 3879 false 174 61.9 72
## 3880 false 174 60.5 69
## 3881 false 183 94.7 89
## 3882 false 188 68.6 74
## hip_circumference hs_crp wbc hgb hct plt
## <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
## 1 98 2.40 11.7 9.7 0.455 221
## 2 123 3.74 7.0 9.1 0.403 273
## 3 94 0.67 5.0 9.3 0.435 162
## 4 105 7.50 7.6 9.7 0.486 317
## 5 95 6.66 6.5 8.1 0.408 97
## ... ... ... ... ... ... ...
## 3878 97 7.300 7.80 7.5 0.356 223
## 3879 96 2.600 5.44 8.7 0.416 196
## 3880 92 2.320 6.56 8.4 0.403 186
## 3881 109 0.344 4.52 9.8 0.465 278
## 3882 89 1.280 6.68 9.4 0.437 262
## neut_percentage lymph_percentage mono_percentage eos_percentage
## <numeric> <numeric> <numeric> <numeric>
## 1 63.8 28.4 5.2 2.4
## 2 55.9 31.7 8.1 2.5
## 3 48.4 37.5 8.3 5.6
## 4 69.4 21.0 7.4 2.2
## 5 65.3 23.8 7.1 2.9
## ... ... ... ... ...
## 3878 72.3 23.1 3.7 0.7
## 3879 43.9 44.3 9.9 1.7
## 3880 41.0 47.9 9.5 1.5
## 3881 47.8 37.8 13.1 1.1
## 3882 57.5 28.0 10.9 3.4
## baso_percentage luc_percentage tot_cholesterol hdl_cholesterol
## <numeric> <numeric> <numeric> <numeric>
## 1 0.2 NA 6.85 0.93
## 2 1.8 NA 5.06 1.44
## 3 0.2 NA 3.64 1.25
## 4 0.0 NA 5.20 1.50
## 5 0.9 NA 5.78 2.30
## ... ... ... ... ...
## 3878 0.2 NA 5.67 2.22
## 3879 0.2 NA 4.80 1.30
## 3880 0.2 NA 4.50 1.20
## 3881 0.2 NA 6.30 0.80
## 3882 0.1 NA 4.20 1.40
## triglycerides systolic_blood_pressure diastolic_blood_pressure
## <numeric> <numeric> <numeric>
## 1 1.81 NA NA
## 2 0.96 NA NA
## 3 0.42 NA NA
## 4 0.85 NA NA
## 5 0.79 NA NA
## ... ... ... ...
## 3878 1.31 NA NA
## 3879 1.33 128.0 87.5
## 3880 1.16 133.5 81.5
## 3881 3.18 143.0 77.5
## 3882 1.30 132.0 81.5
## lipid_lowering_med blood_pressure_lowering_med metabolic_syndrome
## <integer> <factor> <factor>
## 1 0 false
## 2 0 false
## 3 0 false
## 4 0 false
## 5 0 false
## ... ... ... ...
## 3878 0 false
## 3879 0 false
## 3880 0 false
## 3881 0 false
## 3882 0 false
## diabetes real_bios_id biobank.1 subject_id.1
## <factor> <character> <factor> <factor>
## 1 NTR-A20A-NTR16223-9207 VUNTR VUNTR-A20A
## 2 NTR-A20B-NTR16225-9208 VUNTR VUNTR-A20B
## 3 NTR-A20C-NTR16565-9388 VUNTR VUNTR-A20C
## 4 NTR-A21A-NTR13461-7740 VUNTR VUNTR-A21A
## 5 NTR-A21B-7741 VUNTR VUNTR-A21B
## ... ... ... ... ...
## 3878 NTR-A56C-NTR15095-8604 VUNTR VUNTR-A56C
## 3879 NTR-A573C-NT0027615-10346 VUNTR VUNTR-A573C
## 3880 NTR-A573D-NT0027614-10345 VUNTR VUNTR-A573D
## 3881 NTR-A574C-NT0027387-10113 VUNTR VUNTR-A574C
## 3882 NTR-A575C-10252 VUNTR VUNTR-A575C
## sample_id date_collection date_inclusion
## <factor> <factor> <character>
## 1 VUNTR-9207_64109-31 2008-10-01T00:00:00+0200 NA
## 2 VUNTR-9208_64109-41 2008-10-01T00:00:00+0200 NA
## 3 VUNTR-9388_64109-02 2008-11-01T00:00:00+0100 NA
## 4 VUNTR-7740_65698-31 2008-03-01T00:00:00+0100 NA
## 5 VUNTR-7741_65698-41 2008-03-01T00:00:00+0100 NA
## ... ... ... ...
## 3878 VUNTR-8604_68402-01 2008-07-01T00:00:00+0200 NA
## 3879 VUNTR-10346_1062613 2012-09-01T00:00:00+0200 NA
## 3880 VUNTR-10345_1062614 2012-09-01T00:00:00+0200 NA
## 3881 VUNTR-10113_1019101 2012-04-01T00:00:00+0200 NA
## 3882 VUNTR-10252_1049462 2012-07-01T00:00:00+0200 NA
## sample_matrix fasting time_handling temp_storage time_storage
## <factor> <factor> <numeric> <numeric> <numeric>
## 1 EDTA plasma false 6 -30 NA
## 2 EDTA plasma false 6 -30 NA
## 3 EDTA plasma true 6 -30 NA
## 4 EDTA plasma true 6 -30 NA
## 5 EDTA plasma true 6 -30 NA
## ... ... ... ... ... ...
## 3878 EDTA plasma true 6 -30 NA
## 3879 EDTA plasma true 6 -30 NA
## 3880 EDTA plasma true 6 -30 NA
## 3881 EDTA plasma true 6 -30 NA
## 3882 EDTA plasma true 6 -30 NA
meas <- assays(metabolomicData)$measurements
remove <- apply(meas, 1, function(x) sum(is.na(x)) == ncol(meas))
metabolomicData <- metabolomicData[!remove, ]
metabolomicData
## class: SummarizedExperiment0
## dim: 231 3882
## metadata(0):
## assays(1): measurements
## rownames(231): acace ace ... xxl_vldl_tg xxl_vldl_tg_percentage
## metadata column names(0):
## colnames: NULL
## colData names(51): biobank subject_id ... temp_storage
## time_storage
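The filter above drops the features (rows) for which every sample is missing, which reduces the object from 247 to 231 rows. As a small sketch on a toy matrix (the values and row names are made up), the same test can also be written with rowSums(), which avoids the explicit apply() loop:

```r
# Toy measurement matrix: features in rows, samples in columns (hypothetical values).
meas <- matrix(c(0.1, 0.2, 0.3,
                 NA,  NA,  NA,
                 0.4, NA,  0.6),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("ace", "fast", "ala"), NULL))

# apply()-based test, as used in the text.
remove_apply <- apply(meas, 1, function(x) sum(is.na(x)) == ncol(meas))

# Vectorised equivalent: count the missing values per row in one call.
remove_rowsums <- rowSums(is.na(meas)) == ncol(meas)

# Keep only the rows that have at least one observed value.
kept <- meas[!remove_rowsums, , drop = FALSE]
rownames(kept)
```

Rows with only some missing values, such as the third toy row, are kept by both variants.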
Genotype data
The Impute2 genotype files have been converted to Tabix files. These Tabix files contain dosages and are filtered on MAF 0.05 and INFO 0.04. In addition, the rs-number, chromosome name and position have been added to these files, as well as the sample identifiers (gwas_id).
Reading Impute2 tabix files
TabixFile creates a reference to a Tabix file (and its index). Internally the object tbx contains a pointer to the file. This mechanism allows us to read the data in chunks, e.g. when the whole file does not fit in memory, and perform some operation on each chunk.
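The chunk-wise pattern itself can be illustrated without Tabix. The sketch below is not the TabixFile mechanism; it only assumes a base R connection to a temporary gzipped file, where the open connection remembers its position between reads. The file content, the column names and the chunk size are hypothetical.

```r
# Write a small gzipped, tab-separated toy file (hypothetical content).
path <- tempfile(fileext = ".gz")
toy <- data.frame(rsid = paste0("rs", 1:10), dosage = (0:9) / 5)
gz <- gzfile(path, "w")
write.table(toy, gz, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
close(gz)

# An open connection keeps its position, so each read continues
# where the previous one stopped.
con <- gzfile(path, "r")
chunk_size <- 4
total <- 0
n_chunks <- 0
repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break          # end of file reached
  chunk <- read.table(text = lines, sep = "\t",
                      col.names = c("rsid", "dosage"),
                      stringsAsFactors = FALSE)
  total <- total + sum(chunk$dosage)     # the per-chunk operation
  n_chunks <- n_chunks + 1
}
close(con)
c(chunks = n_chunks, dosage_sum = total)
```

Only one chunk is held in memory at a time; the running total is the only state carried from chunk to chunk.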
The next code chunk shows how you can do this using plain R.
gzipped <- dir(file.path(RP3DATADIR, "GWAS_ImputationGoNLv5/dosages", BIOBANKS[1]),
pattern = "gz$", full.names = TRUE)
chunk <- read.dosages(gzipped[1], yieldSize = 5000)
## Reading chunk...
chunk[1:5, 1:10]
## snp_id rs_id position exp_freq_a1 info certainty type chr rsid
## 1 --- 1-77560 77560 0.001 0.450 0.999 0 1 1-77560
## 2 --- 1-83516 83516 0.001 0.426 0.998 0 1 1-83516
## 3 --- 1-87885 87885 0.001 0.504 0.999 0 1 1-87885
## 4 --- 1-249389 249389 0.002 0.402 0.997 0 1 1-249389
## 5 --- 1-362911 362911 0.002 0.514 0.998 0 1 1-362911
## pos
## 1 77560
## 2 83516
## 3 87885
## 4 249389
## 5 362911
chunk <- read.dosages(gzipped[1], yieldSize = 5000, type = "GRanges")
## Reading chunk...
chunk
## GRanges object with 5000 ranges and 3 metadata columns:
## seqnames ranges strand | rsid ref
## <Rle> <IRanges> <Rle> | <character> <character>
## [1] chr1 [ 77560, 77560] * | 1-77560 T
## [2] chr1 [ 83516, 83516] * | 1-83516 C
## [3] chr1 [ 87885, 87885] * | 1-87885 A
## [4] chr1 [249389, 249389] * | 1-249389 A
## [5] chr1 [362911, 362911] * | 1-362911 G
## ... ... ... ... ... ... ...
## [4996] chr1 [1957299, 1957299] * | rs3820007 C
## [4997] chr1 [1957414, 1957414] * | 1-1957414 T
## [4998] chr1 [1958532, 1958532] * | 1-1958532 G
## [4999] chr1 [1959238, 1959238] * | rs114869768 G
## [5000] chr1 [1959261, 1959261] * | rs28574670 A
## alt
## <character>
## [1] C
## [2] T
## [3] C
## [4] T
## [5] T
## ... ...
## [4996] T
## [4997] C
## [4998] A
## [4999] A
## [5000] G
## -------
## seqinfo: 1 sequence from an unspecified genome; no seqlengths
chunk <- read.dosages(gzipped[1], yieldSize = 5000, type = "SummarizedExperiment")
## Reading chunk...
chunk
## class: RangedSummarizedExperiment
## dim: 5000 768
## metadata(0):
## assays(1): dosage
## rownames: NULL
## rowRanges metadata column names(3): rsid ref alt
## colnames: NULL
## colData names(1): gwas_id
colData(chunk)
## DataFrame with 768 rows and 1 column
## gwas_id
## <character>
## 1 7208
## 2 6434
## 3 8640
## 4 4267
## 5 8725
## ... ...
## 764 2477002
## 765 5399001
## 766 1457001
## 767 543002
## 768 8127001
rowRanges(chunk)
## GRanges object with 5000 ranges and 3 metadata columns:
## seqnames ranges strand | rsid ref
## <Rle> <IRanges> <Rle> | <character> <character>
## [1] chr1 [ 77560, 77560] * | 1-77560 T
## [2] chr1 [ 83516, 83516] * | 1-83516 C
## [3] chr1 [ 87885, 87885] * | 1-87885 A
## [4] chr1 [249389, 249389] * | 1-249389 A
## [5] chr1 [362911, 362911] * | 1-362911 G
## ... ... ... ... ... ... ...
## [4996] chr1 [1957299, 1957299] * | rs3820007 C
## [4997] chr1 [1957414, 1957414] * | 1-1957414 T
## [4998] chr1 [1958532, 1958532] * | 1-1958532 G
## [4999] chr1 [1959238, 1959238] * | rs114869768 G
## [5000] chr1 [1959261, 1959261] * | rs28574670 A
## alt
## <character>
## [1] C
## [2] T
## [3] C
## [4] T
## [5] T
## ... ...
## [4996] T
## [4997] C
## [4998] A
## [4999] A
## [5000] G
## -------
## seqinfo: 1 sequence from an unspecified genome; no seqlengths
assay(chunk)[1:5, 1:5]
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0 0 0.000 0 0
## [2,] 0 0 0.001 0 0
## [3,] 0 0 0.001 0 0
## [4,] 0 0 0.001 0 0
## [5,] 0 0 0.000 0 0
read.dosages with the default type = "data.frame" returns a data.frame containing the specified chunk of the dosages file. The other type options return only the genomic locations of the chunk as a GRanges object, or the same information as the data.frame but as a SummarizedExperiment.
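The chunk-wise reading controlled by yieldSize can be illustrated with a minimal base R sketch. This is not the BIOSRutils implementation of read.dosages: the helper read_in_chunks, the toy file and its column layout are assumptions made purely for illustration.

```r
# Toy gzipped, tab-delimited file standing in for a dosage file
# (hypothetical layout: rsid, position, then one column per sample).
path <- tempfile(fileext = ".gz")
con <- gzfile(path, "w")
writeLines(c("rsid\tpos\ts1\ts2",
             sprintf("snp%d\t%d\t%.3f\t%.3f", 1:12, 1000L + 1:12, 0, 0.001)),
           con)
close(con)

# Hypothetical helper: read the file yieldSize lines at a time and
# return a list with one data.frame per chunk.
read_in_chunks <- function(path, yieldSize = 5) {
  con <- gzfile(path, "r")
  on.exit(close(con))
  # The first line holds the column names.
  header <- strsplit(readLines(con, n = 1), "\t")[[1]]
  chunks <- list()
  repeat {
    lines <- readLines(con, n = yieldSize)
    if (length(lines) == 0) break  # end of file reached
    chunks[[length(chunks) + 1]] <- read.table(text = lines, sep = "\t",
                                               col.names = header,
                                               stringsAsFactors = FALSE)
  }
  chunks
}

chunks <- read_in_chunks(path, yieldSize = 5)
sapply(chunks, nrow)  # number of rows in each chunk
```

Keeping the connection open between reads is what lets each call continue where the previous chunk ended, so only yieldSize rows need to be held in memory at a time.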
The GenomicFiles package, which builds on BiocParallel, provides functionality to read the chunks in parallel. The following code chunk shows how to use these functions.
Use cases
Session info
sessionInfo()
## R version 3.2.0 (2015-04-16)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## Running under: Ubuntu precise (12.04.5 LTS)
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 parallel methods stats graphics grDevices utils
## [8] datasets base
##
## other attached packages:
## [1] ggplot2_2.1.0 lubridate_1.5.6
## [3] rjson_0.2.15 RCurl_1.95-4.8
## [5] bitops_1.0-6 BIOSRutils_0.0.7
## [7] SummarizedExperiment_1.0.2 Biobase_2.30.0
## [9] GenomicRanges_1.22.4 GenomeInfoDb_1.6.3
## [11] IRanges_2.4.8 S4Vectors_0.8.11
## [13] BiocGenerics_0.16.1 knitr_1.13
## [15] BiocStyle_1.8.0 BiocInstaller_1.20.3
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.5 formatR_1.4 futile.logger_1.4.1
## [4] plyr_1.8.3 XVector_0.10.0 futile.options_1.0.0
## [7] tools_3.2.0 zlibbioc_1.16.0 digest_0.6.9
## [10] jsonlite_0.9.20 evaluate_0.9 gtable_0.2.0
## [13] yaml_2.1.13 stringr_1.0.0 Biostrings_2.38.4
## [16] grid_3.2.0 BiocParallel_1.4.3 rmarkdown_0.9.6.9
## [19] lambda.r_1.1.7 magrittr_1.5 Rsamtools_1.22.0
## [22] scales_0.4.0 htmltools_0.3.5 colorspace_1.2-6
## [25] labeling_0.3 stringi_1.0-1 munsell_0.4.3
References