This function derives the case and control allele frequencies (AFs) from GWAS
summary statistics when the user has access to the whole-sample AF, the sample
sizes, and the OR (or the beta, from which the OR can be computed as exp(beta)).
If the user has the SE instead of the sample AF, use CaseControl_SE().
CaseControl_AF(
  data,
  N_case = 0,
  N_control = 0,
  OR_colname = "OR",
  AF_total_colname = "AF"
)
data: dataframe with each row being a variant and columns for AF_total and OR
N_case: the number of cases in the sample
N_control: the number of controls in the sample
OR_colname: a string containing the exact column name in 'data' with the OR
AF_total_colname: a string containing the exact column name in 'data' with the whole-sample AF
returns the input dataframe with two columns appended (AF_case, AF_control), with one row per variant
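The relationship between the inputs and the AF_case and AF_control columns can be sketched with a little algebra. The sketch below is not taken from the package source. It assumes that the OR is an allelic odds ratio and that the whole-sample AF is the sample-size-weighted average of the case and control AFs. Under those assumptions AF_control is a root of a quadratic. The helper name derive_case_control_af and its argument names are hypothetical and are not part of CCAFE.

```r
# Hypothetical helper (not part of CCAFE): a sketch of one way to recover
# case and control AFs, ASSUMING
#   AF_total = (N_case * AF_case + N_control * AF_control) / (N_case + N_control)
#   OR       = [AF_case / (1 - AF_case)] / [AF_control / (1 - AF_control)]
derive_case_control_af <- function(af_total, or, n_case, n_control) {
  n_total <- n_case + n_control
  k <- n_total * af_total / n_case  # so that AF_case = k - r * AF_control
  r <- n_control / n_case

  # Substituting AF_case into the OR definition gives
  # qa * x^2 + qb * x + qc = 0 with x = AF_control
  qa <- r * (or - 1)
  qb <- or + r - k * (or - 1)
  qc <- -k

  # Root written as 2*qc / (-qb - sqrt(disc)); this form stays defined
  # when OR == 1 (qa == 0), where it reduces to AF_control = AF_total
  disc <- qb^2 - 4 * qa * qc
  af_control <- 2 * qc / (-qb - sqrt(disc))
  af_case <- k - r * af_control

  data.frame(AF_case = af_case, AF_control = af_control)
}

# Usage with the OR and whole-sample AF printed for variant 5 in the example
derive_case_control_af(af_total = 3.670262e-05, or = 0.2853611,
                       n_case = 16550, n_control = 403923)
```

The result can be compared against the AF_case and AF_control values that CaseControl_AF() prints for the same variant. Any mismatch would indicate that the assumptions above differ from what the package does.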
See https://github.com/wolffha/CCAFE for further documentation.
library(CCAFE)
data("sampleDat")
sampleDat <- as.data.frame(sampleDat)
nCase_sample <- 16550
nControl_sample <- 403923
# get the estimated case and control AFs
af_method_results <- CaseControl_AF(data = sampleDat,
                                    N_case = nCase_sample,
                                    N_control = nControl_sample,
                                    OR_colname = "OR",
                                    AF_total_colname = "true_maf_pop")
head(af_method_results)
#> CHR POS REF ALT true_maf_case true_maf_control beta SE
#> 1 chr1 226824710 C T 2.048e-01 2.041e-01 0.0003441 0.01471
#> 2 chr1 117812346 G A 4.480e-01 4.471e-01 -0.0013920 0.01196
#> 3 chr1 230838863 C T 1.647e-01 1.642e-01 0.0042250 0.01620
#> 4 chr1 93121792 ATT A 0.000e+00 0.000e+00 -0.1058000 3.53000
#> 5 chr1 240236388 G A 1.114e-05 3.775e-05 -1.2540000 1.25300
#> 6 chr1 12385196 G A 1.390e-02 1.402e-02 -0.0152700 0.05046
#> gnomad_maf OR true_maf_pop AF_case AF_control
#> 1 2.04050e-01 1.0003442 2.041276e-01 2.041813e-01 2.041254e-01
#> 2 4.45508e-01 0.9986090 4.471354e-01 4.468049e-01 4.471490e-01
#> 3 1.82612e-01 1.0042339 1.642197e-01 1.647775e-01 1.641968e-01
#> 4 1.07925e-04 0.8996046 0.000000e+00 0.000000e+00 0.000000e+00
#> 5 8.82145e-05 0.2853611 3.670262e-05 1.077692e-05 3.776488e-05
#> 6 1.15995e-02 0.9848460 1.401528e-02 1.381395e-02 1.402353e-02
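One quick sanity check on output like the above is to recombine AF_case and AF_control and compare the result with the whole-sample AF that was supplied. The sketch below assumes the whole-sample AF is the sample-size-weighted average of the case and control AFs. It uses values copied from rows 1, 5 and 6 of the printed output rather than calling the package. The column names AF_recombined and rel_diff are introduced here for illustration only.

```r
# Values copied from rows 1, 5 and 6 of the printed example output
checked <- data.frame(
  true_maf_pop = c(2.041276e-01, 3.670262e-05, 1.401528e-02),
  AF_case      = c(2.041813e-01, 1.077692e-05, 1.381395e-02),
  AF_control   = c(2.041254e-01, 3.776488e-05, 1.402353e-02)
)
n_case <- 16550
n_control <- 403923

# ASSUMPTION: whole-sample AF is the sample-size-weighted mean of the two groups
checked$AF_recombined <-
  (n_case * checked$AF_case + n_control * checked$AF_control) /
  (n_case + n_control)

# Relative difference between the recombined AF and the supplied AF
checked$rel_diff <-
  abs(checked$AF_recombined - checked$true_maf_pop) / checked$true_maf_pop

print(checked)
```

With a full result, the same two lines can be applied to af_method_results directly, skipping variants whose whole-sample AF is 0 (such as row 4) to avoid dividing by zero.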