Package 'powerEQTL'

Title: Power and Sample Size Calculation for eQTL Analysis
Description: Power and sample size calculation for eQTL analysis based on ANOVA or simple linear regression. It can also calculate power/sample size for testing the association of a SNP to a continuous type phenotype.
Authors: Xianjun Dong [aut, ctb], Tzuu-Wang Chang [aut, ctb], Scott T. Weiss [aut, ctb], Weiliang Qiu [aut, cre]
Maintainer: Weiliang Qiu <[email protected]>
License: GPL (>=2)
Version: 0.1.3
Built: 2024-09-10 02:37:18 UTC
Source: https://github.com/sterding/powereqtl


Calculation of Minimum Detectable Effect Size for EQTL Analysis Based on Un-Balanced One-Way ANOVA

Description

Calculation of minimum detectable effect size ($\delta/\sigma$) for eQTL analysis that tests if a SNP is associated with a gene probe by using un-balanced one-way ANOVA.

Usage

minEffectEQTL.ANOVA(MAF,
                typeI = 0.05,
                nTests = 2e+05,
                myntotal = 200,
                mypower = 0.8,
                verbose = TRUE)

Arguments

MAF

Minor allele frequency.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

myntotal

integer. Number of subjects.

mypower

Desired power for the eQTL analysis.

verbose

logical. Indicates whether intermediate results should be output.

Details

The ANOVA approach assumes that the association of a SNP with a gene probe is tested by using un-balanced one-way ANOVA (e.g. Lonsdale et al. 2013). According to the SAS online documentation https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_power_a0000000982.htm, the power calculation formula is

$$power = Pr\left(\left. F \geq F_{1-\alpha}\left(k-1, N-k\right) \right| F \sim F_{k-1, N-k, \lambda}\right),$$

where $k=3$ is the number of groups of subjects, $N$ is the total number of subjects, $F_{1-\alpha}\left(k-1, N-k\right)$ is the $100(1-\alpha)$-th percentile of the central F distribution with degrees of freedom $k-1$ and $N-k$, and $F_{k-1, N-k, \lambda}$ is the non-central F distribution with degrees of freedom $k-1$ and $N-k$ and non-centrality parameter (ncp) $\lambda$. The ncp $\lambda$ is equal to

$$\lambda = \frac{N}{\sigma^2}\sum_{i=1}^{k} w_i \left(\mu_i-\mu\right)^2,$$

where $\mu_i$ is the mean gene expression level for the $i$-th group of subjects, $w_i$ is the weight for the $i$-th group of subjects, $\sigma^2$ is the variance of the random errors in ANOVA (assuming each group has equal variance), and $\mu$ is the weighted mean gene expression level

$$\mu = \sum_{i=1}^{k} w_i \mu_i.$$

The weights $w_i$ are the sample proportions for the 3 groups of subjects. Hence, $\sum_{i=1}^{3} w_i = 1$.

We assume that $\mu_2-\mu_1=\mu_3-\mu_2=\delta$, where $\mu_1$, $\mu_2$, and $\mu_3$ are the mean gene expression levels for mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively.

Denote $p$ as the minor allele frequency (MAF) of a SNP. Under Hardy-Weinberg equilibrium, we have genotype frequencies $p_2=p^2$, $p_1=2pq$, and $p_0=q^2$, where $p_2$, $p_1$, and $p_0$ are the genotype frequencies of mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively, and $q=1-p$. Then the ncp can be simplified as

$$ncp = 2pqN\left(\frac{\delta}{\sigma}\right)^2.$$
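The simplified ncp makes the minimum detectable effect size a one-dimensional root-finding problem. The following is a minimal sketch in base R, not the package's implementation: the helper name `anova_power`, the search interval, and the use of a Bonferroni-adjusted level `typeI / nTests` are assumptions made for illustration.

```r
# Power of the one-way ANOVA F-test (k = 3 genotype groups) under the
# simplified non-centrality parameter ncp = 2 p q N (delta/sigma)^2.
anova_power <- function(effsize, maf, n, alpha, k = 3) {
  ncp  <- 2 * maf * (1 - maf) * n * effsize^2
  crit <- qf(alpha, df1 = k - 1, df2 = n - k, lower.tail = FALSE)
  pf(crit, df1 = k - 1, df2 = n - k, ncp = ncp, lower.tail = FALSE)
}

alpha <- 0.05 / 2e5  # per-test level, assuming Bonferroni: typeI / nTests

# Effect size at which the power equals 0.8 for MAF = 0.1 and N = 234.
# The interval c(0.01, 3) is an assumed bracket for the root.
min_effect <- uniroot(
  function(es) anova_power(es, maf = 0.1, n = 234, alpha = alpha) - 0.8,
  interval = c(0.01, 3), tol = 1e-8
)$root
min_effect
```

Because the power increases with the effect size, the root is unique within the bracket; the value may differ slightly from what `minEffectEQTL.ANOVA` reports if the package uses a different search strategy.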

Value

Minimum detectable effect size $\delta/\sigma$.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Lonsdale J and Thomas J, et al. The Genotype-Tissue Expression (GTEx) project. Nature Genetics, 45:580-585, 2013.

See Also

powerEQTL.ANOVA, powerEQTL.ANOVA2, ssEQTL.ANOVA, ssEQTL.ANOVA2

Examples

minEffectEQTL.ANOVA(
          MAF = 0.1,
          typeI = 0.05,
          nTests = 200000,
          myntotal = 234,
          mypower = 0.8,
          verbose = TRUE)

Calculation of Minimum Detectable Minor Allele Frequency for EQTL Analysis Based on Un-Balanced One-Way ANOVA

Description

Calculation of minimum detectable minor allele frequency (MAF) for eQTL analysis that tests if a SNP is associated to a gene probe by using un-balanced one-way ANOVA.

Usage

minMAFeQTL.ANOVA(effsize, 
                 typeI = 0.05,
                 nTests = 200000,
                 myntotal = 200,
                 mypower = 0.8,
                 verbose = TRUE)

Arguments

effsize

Effect size $\delta/\sigma$.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

myntotal

integer. Number of subjects.

mypower

Desired power for the eQTL analysis.

verbose

logical. Indicates whether intermediate results should be output.

Details

The ANOVA approach assumes that the association of a SNP with a gene probe is tested by using un-balanced one-way ANOVA (e.g. Lonsdale et al. 2013). According to the SAS online documentation https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_power_a0000000982.htm, the power calculation formula is

$$power = Pr\left(\left. F \geq F_{1-\alpha}\left(k-1, N-k\right) \right| F \sim F_{k-1, N-k, \lambda}\right),$$

where $k=3$ is the number of groups of subjects, $N$ is the total number of subjects, $F_{1-\alpha}\left(k-1, N-k\right)$ is the $100(1-\alpha)$-th percentile of the central F distribution with degrees of freedom $k-1$ and $N-k$, and $F_{k-1, N-k, \lambda}$ is the non-central F distribution with degrees of freedom $k-1$ and $N-k$ and non-centrality parameter (ncp) $\lambda$. The ncp $\lambda$ is equal to

$$\lambda = \frac{N}{\sigma^2}\sum_{i=1}^{k} w_i \left(\mu_i-\mu\right)^2,$$

where $\mu_i$ is the mean gene expression level for the $i$-th group of subjects, $w_i$ is the weight for the $i$-th group of subjects, $\sigma^2$ is the variance of the random errors in ANOVA (assuming each group has equal variance), and $\mu$ is the weighted mean gene expression level

$$\mu = \sum_{i=1}^{k} w_i \mu_i.$$

The weights $w_i$ are the sample proportions for the 3 groups of subjects. Hence, $\sum_{i=1}^{3} w_i = 1$.

We assume that $\mu_2-\mu_1=\mu_3-\mu_2=\delta$, where $\mu_1$, $\mu_2$, and $\mu_3$ are the mean gene expression levels for mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively.

Denote $p$ as the minor allele frequency (MAF) of a SNP. Under Hardy-Weinberg equilibrium, we have genotype frequencies $p_2=p^2$, $p_1=2pq$, and $p_0=q^2$, where $p_2$, $p_1$, and $p_0$ are the genotype frequencies of mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively, and $q=1-p$. Then the ncp can be simplified as

$$ncp = 2pqN\left(\frac{\delta}{\sigma}\right)^2.$$
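Since the ncp depends on the MAF only through $2pq$, one way to obtain a minimum detectable MAF is to find the ncp that gives the desired power and then solve $2p(1-p) = c$ for $p$. The sketch below follows that route in base R; it is an illustration under assumed names (`power_at_ncp`, `ncp_needed`) and an assumed Bonferroni level `typeI / nTests`, not the package's code.

```r
n       <- 234          # total number of subjects
effsize <- 1            # delta / sigma
alpha   <- 0.05 / 2e5   # per-test level, assuming Bonferroni: typeI / nTests
target  <- 0.8          # desired power

# Power of the F-test with 2 and n - 3 degrees of freedom at a given ncp.
power_at_ncp <- function(ncp) {
  crit <- qf(alpha, df1 = 2, df2 = n - 3, lower.tail = FALSE)
  pf(crit, df1 = 2, df2 = n - 3, ncp = ncp, lower.tail = FALSE)
}

# Step 1: ncp at which the power equals the target (assumed bracket).
ncp_needed <- uniroot(function(l) power_at_ncp(l) - target,
                      interval = c(1e-6, 500), tol = 1e-8)$root

# Step 2: required value of 2 p (1 - p); it cannot exceed 0.5.
c2pq <- ncp_needed / (n * effsize^2)
stopifnot(c2pq <= 0.5)

# Step 3: smaller root of p^2 - p + c2pq / 2 = 0, i.e. the MAF in (0, 0.5].
min_maf <- (1 - sqrt(1 - 2 * c2pq)) / 2
min_maf
```

If `c2pq` exceeded 0.5, no MAF could reach the target power at that sample size and effect size, which is why the sketch stops in that case.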

Value

minimum detectable MAF.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Lonsdale J and Thomas J, et al. The Genotype-Tissue Expression (GTEx) project. Nature Genetics, 45:580-585, 2013.

See Also

powerEQTL.ANOVA, powerEQTL.ANOVA2, ssEQTL.ANOVA, ssEQTL.ANOVA2

Examples

minMAFeQTL.ANOVA(effsize = 1, 
                 typeI = 0.05,
                 nTests = 200000,
                 myntotal = 234,
                 mypower = 0.8,
                 verbose = TRUE)

Minimum Detectable Minor Allele Frequency Calculation for EQTL Analysis Based on Simple Linear Regression

Description

Minimum detectable minor allele frequency (MAF) calculation for eQTL analysis that tests if a SNP is associated to a gene probe by using simple linear regression.

Usage

minMAFeQTL.SLR(slope,
               typeI = 0.05,
               nTests = 200000,
               myntotal = 200,
               mypower = 0.8,
               mystddev = 0.13,
               verbose = TRUE)

Arguments

slope

Slope of the simple linear regression.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

myntotal

integer. Number of subjects.

mypower

Desired power for the eQTL analysis.

mystddev

Standard deviation of the random error term $\epsilon$ in simple linear regression.

verbose

logical. Indicates whether intermediate results should be output.

Details

To test if a SNP is associated with a gene probe, we use the simple linear regression

$$y_i = \beta_0+\beta_1 x_i + \epsilon_i,$$

where $y_i$ is the gene expression level of the $i$-th subject, $x_i$ is the genotype of the $i$-th subject, and $\epsilon_i$ is the random error term. Additive coding for genotype is used. To test if the SNP is associated with the gene probe, we test the null hypothesis $H_0: \beta_1=0$.

Denote $p$ as the minor allele frequency (MAF) of the SNP. Under Hardy-Weinberg equilibrium, the variance of the genotype of the SNP is $\sigma^2_x=2 p (1-p)$, where $\sigma^2_x$ is the variance of the predictor (i.e. the SNP) $x_i$.

We can then use Dupont and Plummer's (1998) power/sample size calculation formula to calculate the minimum detectable MAF, adjusting for multiple testing.
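The genotype variance used above can be checked directly. The sketch below assumes that additive coding means counting copies of the minor allele (0, 1, or 2), so that under Hardy-Weinberg equilibrium the genotype follows a Binomial(2, p) distribution; the variable names are illustrative.

```r
maf   <- 0.1
geno  <- 0:2                                 # copies of the minor allele (assumed coding)
probs <- dbinom(geno, size = 2, prob = maf)  # q^2, 2pq, p^2 under HWE

mean_x <- sum(geno * probs)                  # expected genotype, equals 2p
var_x  <- sum((geno - mean_x)^2 * probs)     # genotype variance

# TRUE when the direct computation agrees with 2 p (1 - p)
isTRUE(all.equal(var_x, 2 * maf * (1 - maf)))
```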

Value

The estimated minimum detectable MAF.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Dupont, W.D. and Plummer, W.D.. Power and Sample Size Calculations for Studies Involving Linear Regression. Controlled Clinical Trials. 1998;19:589-601.

See Also

powerEQTL.SLR, ssEQTL.SLR

Examples

minMAFeQTL.SLR(slope = 0.1299513,
               typeI = 0.05,
               nTests = 200000,
               myntotal = 176,
               mypower = 0.8,
               mystddev = 0.13,
               verbose = TRUE)

Minimum Detectable Slope Calculation for EQTL Analysis Based on Simple Linear Regression

Description

Minimum detectable slope calculation for eQTL analysis that tests if a SNP is associated to a gene probe by using simple linear regression.

Usage

minSlopeEQTL.SLR(
  MAF,
  typeI = 0.05,
  nTests = 2e+05,
  myntotal = 200,
  mypower = 0.8,
  mystddev = 0.13,
  verbose = TRUE)

Arguments

MAF

Minor allele frequency.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

myntotal

integer. Number of subjects.

mypower

Desired power for the eQTL analysis.

mystddev

Standard deviation of the random error term $\epsilon$ in simple linear regression.

verbose

logical. Indicates whether intermediate results should be output.

Details

To test if a SNP is associated with a gene probe, we use the simple linear regression

$$y_i = \beta_0+\beta_1 x_i + \epsilon_i,$$

where $y_i$ is the gene expression level of the $i$-th subject, $x_i$ is the genotype of the $i$-th subject, and $\epsilon_i$ is the random error term. Additive coding for genotype is used. To test if the SNP is associated with the gene probe, we test the null hypothesis $H_0: \beta_1=0$.

Denote $p$ as the minor allele frequency (MAF) of the SNP. Under Hardy-Weinberg equilibrium, the variance of the genotype of the SNP is $\sigma^2_x=2 p (1-p)$, where $\sigma^2_x$ is the variance of the predictor (i.e. the SNP) $x_i$.

We can then use Dupont and Plummer's (1998) power/sample size calculation formula to calculate the minimum detectable slope, adjusting for multiple testing.

Value

The estimated minimum detectable slope.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Dupont, W.D. and Plummer, W.D.. Power and Sample Size Calculations for Studies Involving Linear Regression. Controlled Clinical Trials. 1998;19:589-601.

See Also

powerEQTL.SLR, ssEQTL.SLR

Examples

minSlopeEQTL.SLR(
  MAF = 0.1,
  typeI = 0.05,
  nTests = 2e+05,
  myntotal = 176,
  mypower = 0.8,
  mystddev = 0.13,
  verbose = TRUE)

Power Calculation for EQTL Analysis Based on Un-Balanced One-Way ANOVA

Description

Power calculation for eQTL analysis that tests if a SNP is associated to a gene probe by using un-balanced one-way ANOVA.

Usage

powerEQTL.ANOVA(MAF,
                typeI = 0.05,
                nTests = 2e+05,
                myntotal = 200,
                mystddev = 0.13,
                deltaVec = c(0.13, 0.13),
                verbose = TRUE)

Arguments

MAF

Minor allele frequency.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

myntotal

integer. Number of subjects.

mystddev

Standard deviation of gene expression levels in one group of subjects. Assume all 3 groups of subjects (mutation homozygote, heterozygote, wild-type homozygote) have the same standard deviation of gene expression levels.

deltaVec

A vector with 2 elements. The first element is equal to $\mu_2-\mu_1$ and the second element is equal to $\mu_3-\mu_2$, where $\mu_1$ is the mean gene expression level for the mutation homozygotes, $\mu_2$ is the mean gene expression level for the heterozygotes, and $\mu_3$ is the mean gene expression level for the wild-type homozygotes.

verbose

logical. Indicates whether intermediate results should be output.

Details

The ANOVA approach assumes that the association of a SNP with a gene probe is tested by using un-balanced one-way ANOVA (e.g. Lonsdale et al. 2013). According to the SAS online documentation https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_power_a0000000982.htm, the power calculation formula is

$$power = Pr\left(\left. F \geq F_{1-\alpha}\left(k-1, N-k\right) \right| F \sim F_{k-1, N-k, \lambda}\right),$$

where $k=3$ is the number of groups of subjects, $N$ is the total number of subjects, $F_{1-\alpha}\left(k-1, N-k\right)$ is the $100(1-\alpha)$-th percentile of the central F distribution with degrees of freedom $k-1$ and $N-k$, and $F_{k-1, N-k, \lambda}$ is the non-central F distribution with degrees of freedom $k-1$ and $N-k$ and non-centrality parameter (ncp) $\lambda$. The ncp $\lambda$ is equal to

$$\lambda = \frac{N}{\sigma^2}\sum_{i=1}^{k} w_i \left(\mu_i-\mu\right)^2,$$

where $\mu_i$ is the mean gene expression level for the $i$-th group of subjects, $w_i$ is the weight for the $i$-th group of subjects, $\sigma^2$ is the variance of the random errors in ANOVA (assuming each group has equal variance), and $\mu$ is the weighted mean gene expression level

$$\mu = \sum_{i=1}^{k} w_i \mu_i.$$

The weights $w_i$ are the sample proportions for the 3 groups of subjects. Hence, $\sum_{i=1}^{3} w_i = 1$.
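The general ncp formula can be evaluated for arbitrary mean differences. The sketch below is an illustration in base R, not the package's code: the helper `ncp_general` is hypothetical, and the weights are taken to be the Hardy-Weinberg genotype frequencies computed from the MAF, which is an assumption about how the sample proportions are obtained.

```r
# Non-centrality parameter lambda = N / sigma^2 * sum_i w_i (mu_i - mu)^2
# for three genotype groups with means mu1, mu2 = mu1 + d1, mu3 = mu2 + d2.
ncp_general <- function(maf, n, sigma, delta_vec) {
  q  <- 1 - maf
  # Assumed weights: HWE frequencies of mutation homozygotes,
  # heterozygotes, and wild-type homozygotes.
  w  <- c(maf^2, 2 * maf * q, q^2)
  # mu1 is set to 0; lambda depends only on differences between the means.
  mu <- c(0, delta_vec[1], delta_vec[1] + delta_vec[2])
  mu_bar <- sum(w * mu)
  n / sigma^2 * sum(w * (mu - mu_bar)^2)
}

lambda <- ncp_general(maf = 0.1, n = 234, sigma = 0.13,
                      delta_vec = c(0.13, 0.13))

alpha <- 0.05 / 2e5  # per-test level after Bonferroni correction
crit  <- qf(alpha, df1 = 2, df2 = 234 - 3, lower.tail = FALSE)
power <- pf(crit, df1 = 2, df2 = 234 - 3, ncp = lambda, lower.tail = FALSE)
power
```

With equal mean differences, `lambda` reduces to $2pqN(\delta/\sigma)^2$, so unequal entries in `delta_vec` are where this general form is needed.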

Value

power of the test after Bonferroni correction for multiple testing.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Lonsdale J and Thomas J, et al. The Genotype-Tissue Expression (GTEx) project. Nature Genetics, 45:580-585, 2013.

See Also

minEffectEQTL.ANOVA, powerEQTL.ANOVA2, ssEQTL.ANOVA, ssEQTL.ANOVA2

Examples

powerEQTL.ANOVA(
          MAF = 0.1,
          typeI = 0.05,
          nTests = 200000,
          myntotal = 234,
          mystddev = 0.13,
          deltaVec = c(0.13, 0.13))

Power Calculation for EQTL Analysis Based on Un-Balanced One-Way ANOVA

Description

Power calculation for eQTL analysis that tests if a SNP is associated to a gene probe by using un-balanced one-way ANOVA (assuming Hardy-Weinberg equilibrium).

Usage

powerEQTL.ANOVA2(effsize,
                MAF,
                typeI = 0.05,
                nTests = 2e+05,
                myntotal = 200,
                verbose = TRUE)

Arguments

effsize

Effect size $\delta/\sigma$, where $\delta=\mu_2-\mu_1=\mu_3-\mu_2$; $\mu_1$, $\mu_2$, and $\mu_3$ are the mean gene expression levels of mutation homozygotes, heterozygotes, and wild-type homozygotes; and $\sigma$ is the standard deviation of gene expression levels (assuming each genotype group has the same variance).

MAF

Minor allele frequency.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

myntotal

integer. Number of subjects.

verbose

logical. Indicates whether intermediate results should be output.

Details

The ANOVA approach assumes that the association of a SNP with a gene probe is tested by using un-balanced one-way ANOVA (e.g. Lonsdale et al. 2013). According to the SAS online documentation https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_power_a0000000982.htm, the power calculation formula is

$$power = Pr\left(\left. F \geq F_{1-\alpha}\left(k-1, N-k\right) \right| F \sim F_{k-1, N-k, \lambda}\right),$$

where $k=3$ is the number of groups of subjects, $N$ is the total number of subjects, $F_{1-\alpha}\left(k-1, N-k\right)$ is the $100(1-\alpha)$-th percentile of the central F distribution with degrees of freedom $k-1$ and $N-k$, and $F_{k-1, N-k, \lambda}$ is the non-central F distribution with degrees of freedom $k-1$ and $N-k$ and non-centrality parameter (ncp) $\lambda$. The ncp $\lambda$ is equal to

$$\lambda = \frac{N}{\sigma^2}\sum_{i=1}^{k} w_i \left(\mu_i-\mu\right)^2,$$

where $\mu_i$ is the mean gene expression level for the $i$-th group of subjects, $w_i$ is the weight for the $i$-th group of subjects, $\sigma^2$ is the variance of the random errors in ANOVA (assuming each group has equal variance), and $\mu$ is the weighted mean gene expression level

$$\mu = \sum_{i=1}^{k} w_i \mu_i.$$

The weights $w_i$ are the sample proportions for the 3 groups of subjects. Hence, $\sum_{i=1}^{3} w_i = 1$.

We assume that $\mu_2-\mu_1=\mu_3-\mu_2=\delta$, where $\mu_1$, $\mu_2$, and $\mu_3$ are the mean gene expression levels for mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively.

Denote $p$ as the minor allele frequency (MAF) of a SNP. Under Hardy-Weinberg equilibrium, we have genotype frequencies $p_2=p^2$, $p_1=2pq$, and $p_0=q^2$, where $p_2$, $p_1$, and $p_0$ are the genotype frequencies of mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively, and $q=1-p$. Then the ncp can be simplified as

$$ncp = 2pqN\left(\frac{\delta}{\sigma}\right)^2.$$
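Because the simplified ncp grows with $2pq$, the power rises with the MAF for a fixed effect size and sample size. The sketch below tabulates this relationship in base R; the helper `power_hwe` and the MAF grid are illustrative assumptions, and the values are not claimed to match the output of `powerEQTL.ANOVA2` exactly.

```r
# Power under the simplified ncp = 2 p q N (delta/sigma)^2, vectorised over maf.
power_hwe <- function(effsize, maf, n, alpha) {
  ncp  <- 2 * maf * (1 - maf) * n * effsize^2
  crit <- qf(alpha, df1 = 2, df2 = n - 3, lower.tail = FALSE)
  pf(crit, df1 = 2, df2 = n - 3, ncp = ncp, lower.tail = FALSE)
}

mafs   <- seq(0.05, 0.5, by = 0.05)   # assumed grid of MAF values
powers <- power_hwe(effsize = 1, maf = mafs, n = 234,
                    alpha = 0.05 / 2e5)  # Bonferroni-adjusted level

data.frame(MAF = mafs, power = round(powers, 3))
```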

Value

power of the test after Bonferroni correction for multiple testing.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Lonsdale J and Thomas J, et al. The Genotype-Tissue Expression (GTEx) project. Nature Genetics, 45:580-585, 2013.

See Also

minEffectEQTL.ANOVA, powerEQTL.ANOVA, ssEQTL.ANOVA, ssEQTL.ANOVA2

Examples

powerEQTL.ANOVA2(effsize = 1,
                MAF = 0.1,
                typeI = 0.05,
                nTests = 2e+05,
                myntotal = 234,
                verbose = TRUE)

Power Calculation for EQTL Analysis Based on Simple Linear Regression

Description

Power calculation for eQTL analysis that tests if a SNP is associated to a gene probe by using simple linear regression.

Usage

powerEQTL.SLR(
  MAF,
  typeI = 0.05,
  nTests = 2e+05,
  slope = 0.13,
  myntotal = 200,
  mystddev = 0.13,
  verbose = TRUE)

Arguments

MAF

Minor allele frequency.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

slope

Slope $\beta_1$ of the simple linear regression

$$y_i = \beta_0+\beta_1 x_i + \epsilon_i,$$

where $y_i$ is the gene expression level of the $i$-th subject, $x_i$ is the genotype of the $i$-th subject, and $\epsilon_i$ is the random error term. Additive coding for genotype is used.

myntotal

integer. Number of subjects.

mystddev

Standard deviation of the random error term $\epsilon$ in simple linear regression.

verbose

logical. Indicates whether intermediate results should be output.

Details

To test if a SNP is associated with a gene probe, we use the simple linear regression

$$y_i = \beta_0+\beta_1 x_i + \epsilon_i,$$

where $y_i$ is the gene expression level of the $i$-th subject, $x_i$ is the genotype of the $i$-th subject, and $\epsilon_i$ is the random error term. Additive coding for genotype is used. To test if the SNP is associated with the gene probe, we test the null hypothesis $H_0: \beta_1=0$.

Denote $p$ as the minor allele frequency (MAF) of the SNP. Under Hardy-Weinberg equilibrium, the variance of the genotype of the SNP is $\sigma^2_x=2 p (1-p)$, where $\sigma^2_x$ is the variance of the predictor (i.e. the SNP) $x_i$.

We can then use Dupont and Plummer's (1998) power/sample size calculation formula to calculate the power, adjusting for multiple testing.
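To give a feel for how the slope, the genotype variance, and the error standard deviation enter the calculation, the sketch below uses a generic non-central F approximation for the two-sided test of $H_0: \beta_1=0$, with ncp $N\sigma^2_x\beta_1^2/\sigma^2$. This is a stand-in chosen for illustration and is not the Dupont and Plummer (1998) expression, so its result may differ from `powerEQTL.SLR`; the helper name `slr_power` is hypothetical.

```r
# Approximate power of the slope test in simple linear regression,
# using F with 1 and n - 2 degrees of freedom (assumed approximation).
slr_power <- function(maf, slope, n, sigma, alpha) {
  var_x <- 2 * maf * (1 - maf)            # genotype variance under HWE
  ncp   <- n * var_x * slope^2 / sigma^2  # non-centrality parameter
  crit  <- qf(alpha, df1 = 1, df2 = n - 2, lower.tail = FALSE)
  pf(crit, df1 = 1, df2 = n - 2, ncp = ncp, lower.tail = FALSE)
}

alpha <- 0.05 / 2e5  # Bonferroni-adjusted level: typeI / nTests
pw <- slr_power(maf = 0.1, slope = 0.13, n = 176, sigma = 0.13, alpha = alpha)
pw
```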

Value

power of the test after Bonferroni correction for multiple testing.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Dupont, W.D. and Plummer, W.D.. Power and Sample Size Calculations for Studies Involving Linear Regression. Controlled Clinical Trials. 1998;19:589-601.

See Also

ssEQTL.SLR, minSlopeEQTL.SLR

Examples

powerEQTL.SLR(
  MAF = 0.1,
  typeI = 0.05,
  nTests = 2e+05,
  slope = 0.13,
  myntotal = 176,
  mystddev = 0.13,
  verbose = TRUE)

Sample Size Calculation for EQTL Analysis Based on Un-Balanced One-Way ANOVA

Description

Sample size calculation for eQTL analysis that tests if a SNP is associated to a gene probe by using un-balanced one-way ANOVA.

Usage

ssEQTL.ANOVA(
  MAF,
  typeI = 0.05,
  nTests = 2e+05,
  mypower = 0.8,
  mystddev = 0.13,
  deltaVec = c(0.13, 0.13))

Arguments

MAF

Minor allele frequency.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

mypower

Desired power for the eQTL analysis.

mystddev

Standard deviation of gene expression levels in one group of subjects. Assume all 3 groups of subjects (mutation homozygote, heterozygote, wild-type homozygote) have the same standard deviation of gene expression levels.

deltaVec

A vector with 2 elements. The first element is equal to $\mu_2-\mu_1$ and the second element is equal to $\mu_3-\mu_2$, where $\mu_1$ is the mean gene expression level for the mutation homozygotes, $\mu_2$ is the mean gene expression level for the heterozygotes, and $\mu_3$ is the mean gene expression level for the wild-type homozygotes.

Details

The ANOVA approach assumes that the association of a SNP with a gene probe is tested by using un-balanced one-way ANOVA (e.g. Lonsdale et al. 2013). According to the SAS online documentation https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_power_a0000000982.htm, the power calculation formula is

$$power = Pr\left(\left. F \geq F_{1-\alpha}\left(k-1, N-k\right) \right| F \sim F_{k-1, N-k, \lambda}\right),$$

where $k=3$ is the number of groups of subjects, $N$ is the total number of subjects, $F_{1-\alpha}\left(k-1, N-k\right)$ is the $100(1-\alpha)$-th percentile of the central F distribution with degrees of freedom $k-1$ and $N-k$, and $F_{k-1, N-k, \lambda}$ is the non-central F distribution with degrees of freedom $k-1$ and $N-k$ and non-centrality parameter (ncp) $\lambda$. The ncp $\lambda$ is equal to

$$\lambda = \frac{N}{\sigma^2}\sum_{i=1}^{k} w_i \left(\mu_i-\mu\right)^2,$$

where $\mu_i$ is the mean gene expression level for the $i$-th group of subjects, $w_i$ is the weight for the $i$-th group of subjects, $\sigma^2$ is the variance of the random errors in ANOVA (assuming each group has equal variance), and $\mu$ is the weighted mean gene expression level

$$\mu = \sum_{i=1}^{k} w_i \mu_i.$$

The weights $w_i$ are the sample proportions for the 3 groups of subjects. Hence, $\sum_{i=1}^{3} w_i = 1$.
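Given the power formula, a required sample size can be found by increasing $N$ until the power reaches the target. The sketch below does this with a plain integer search in base R. It is an illustration, not the package's algorithm: the helper `anova_power_n`, the starting value of 10, and the use of Hardy-Weinberg frequencies as weights are assumptions.

```r
# Power of the one-way ANOVA F-test at total sample size n.
anova_power_n <- function(n, maf, sigma, delta_vec, alpha) {
  q   <- 1 - maf
  w   <- c(maf^2, 2 * maf * q, q^2)   # assumed weights: HWE frequencies
  mu  <- cumsum(c(0, delta_vec))      # mu1 = 0, mu2 = d1, mu3 = d1 + d2
  ncp <- n / sigma^2 * sum(w * (mu - sum(w * mu))^2)
  crit <- qf(alpha, df1 = 2, df2 = n - 3, lower.tail = FALSE)
  pf(crit, df1 = 2, df2 = n - 3, ncp = ncp, lower.tail = FALSE)
}

alpha <- 0.05 / 2e5  # Bonferroni-adjusted level: typeI / nTests

# Smallest integer n (from the assumed start of 10) with power >= 0.8.
n <- 10
while (anova_power_n(n, maf = 0.1, sigma = 0.13,
                     delta_vec = c(0.13, 0.13), alpha = alpha) < 0.8) {
  n <- n + 1
}
n
```

A linear search is slow for very small effects; a bisection or `uniroot` on a continuous `n` followed by rounding up would scale better, at the cost of slightly more code.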

Value

sample size required for the eQTL analysis to achieve the desired power.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Lonsdale J and Thomas J, et al. The Genotype-Tissue Expression (GTEx) project. Nature Genetics, 45:580-585, 2013.

See Also

minEffectEQTL.ANOVA, powerEQTL.ANOVA, powerEQTL.ANOVA2, ssEQTL.ANOVA2

Examples

ssEQTL.ANOVA(MAF = 0.1,
       typeI = 0.05,
       nTests = 200000,
       mypower = 0.8,
       mystddev = 0.13,
       deltaVec = c(0.13, 0.13))

Sample Size Calculation for EQTL Analysis Based on Un-Balanced One-Way ANOVA

Description

Sample size calculation for eQTL analysis that tests if a SNP is associated to a gene probe by using un-balanced one-way ANOVA.

Usage

ssEQTL.ANOVA2(
  effsize,
  MAF,
  typeI = 0.05,
  nTests = 2e+05,
  mypower = 0.8
)

Arguments

effsize

Effect size $\delta/\sigma$, where $\delta=\mu_2-\mu_1=\mu_3-\mu_2$; $\mu_1$, $\mu_2$, and $\mu_3$ are the mean gene expression levels of mutation homozygotes, heterozygotes, and wild-type homozygotes; and $\sigma$ is the standard deviation of gene expression levels (assuming each genotype group has the same variance).

MAF

Minor allele frequency.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

mypower

Desired power for the eQTL analysis.

Details

The ANOVA approach assumes that the association of a SNP with a gene probe is tested by using un-balanced one-way ANOVA (e.g. Lonsdale et al. 2013). According to the SAS online documentation https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_power_a0000000982.htm, the power calculation formula is

$$power = Pr\left(\left. F \geq F_{1-\alpha}\left(k-1, N-k\right) \right| F \sim F_{k-1, N-k, \lambda}\right),$$

where $k=3$ is the number of groups of subjects, $N$ is the total number of subjects, $F_{1-\alpha}\left(k-1, N-k\right)$ is the $100(1-\alpha)$-th percentile of the central F distribution with degrees of freedom $k-1$ and $N-k$, and $F_{k-1, N-k, \lambda}$ is the non-central F distribution with degrees of freedom $k-1$ and $N-k$ and non-centrality parameter (ncp) $\lambda$. The ncp $\lambda$ is equal to

$$\lambda = \frac{N}{\sigma^2}\sum_{i=1}^{k} w_i \left(\mu_i-\mu\right)^2,$$

where $\mu_i$ is the mean gene expression level for the $i$-th group of subjects, $w_i$ is the weight for the $i$-th group of subjects, $\sigma^2$ is the variance of the random errors in ANOVA (assuming each group has equal variance), and $\mu$ is the weighted mean gene expression level

$$\mu = \sum_{i=1}^{k} w_i \mu_i.$$

The weights $w_i$ are the sample proportions for the 3 groups of subjects. Hence, $\sum_{i=1}^{3} w_i = 1$.

We assume that $\mu_2-\mu_1=\mu_3-\mu_2=\delta$, where $\mu_1$, $\mu_2$, and $\mu_3$ are the mean gene expression levels for mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively.

Denote $p$ as the minor allele frequency (MAF) of a SNP. Under Hardy-Weinberg equilibrium, we have genotype frequencies $p_2=p^2$, $p_1=2pq$, and $p_0=q^2$, where $p_2$, $p_1$, and $p_0$ are the genotype frequencies of mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively, and $q=1-p$. Then the ncp can be simplified as

$$ncp = 2pqN\left(\frac{\delta}{\sigma}\right)^2.$$
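The simplification of the ncp is stated without its intermediate steps. A short derivation, assuming the weights are the Hardy-Weinberg genotype frequencies ($w_1=p^2$ for mutation homozygotes, $w_2=2pq$ for heterozygotes, $w_3=q^2$ for wild-type homozygotes) and using $p+q=1$, runs as follows:

```latex
\begin{align*}
\mu_2 &= \mu_1 + \delta, \qquad \mu_3 = \mu_1 + 2\delta, \\
\mu &= p^2\mu_1 + 2pq(\mu_1+\delta) + q^2(\mu_1+2\delta)
     = \mu_1 + 2q(p+q)\delta = \mu_1 + 2q\delta, \\
\mu_1-\mu &= -2q\delta, \qquad
\mu_2-\mu = (p-q)\delta, \qquad
\mu_3-\mu = 2p\delta, \\
\sum_{i=1}^{3} w_i(\mu_i-\mu)^2
  &= \delta^2\left[4p^2q^2 + 2pq(p-q)^2 + 4p^2q^2\right] \\
  &= 2pq\,\delta^2\left[4pq + (p-q)^2\right]
   = 2pq\,\delta^2 (p+q)^2 = 2pq\,\delta^2, \\
\lambda &= \frac{N}{\sigma^2}\cdot 2pq\,\delta^2
         = 2pqN\left(\frac{\delta}{\sigma}\right)^2 .
\end{align*}
```

The factor $2pq$ is the variance of the minor-allele count under Hardy-Weinberg equilibrium, which is why the same quantity appears in the regression-based functions of this package.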

Value

sample size required for the eQTL analysis to achieve the desired power.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Lonsdale J and Thomas J, et al. The Genotype-Tissue Expression (GTEx) project. Nature Genetics, 45:580-585, 2013.

See Also

minEffectEQTL.ANOVA, powerEQTL.ANOVA, powerEQTL.ANOVA2, ssEQTL.ANOVA

Examples

ssEQTL.ANOVA2(
  effsize = 1,
  MAF = 0.1,
  typeI = 0.05,
  nTests = 2e+05,
  mypower = 0.8
)

Sample Size Calculation for EQTL Analysis Based on Simple Linear Regression

Description

Sample size calculation for eQTL analysis that tests if a SNP is associated to a gene probe by using simple linear regression.

Usage

ssEQTL.SLR(
  MAF,
  typeI = 0.05,
  nTests = 2e+05,
  slope = 0.13,
  mypower = 0.8,
  mystddev = 0.13,
  n.lower = 2.01,
  n.upper = 1e+30,
  verbose = TRUE)

Arguments

MAF

Minor allele frequency.

typeI

Type I error rate for testing if a SNP is associated to a gene probe.

nTests

integer. Number of tests in eQTL analysis.

slope

Slope $\beta_1$ of the simple linear regression

$$y_i = \beta_0+\beta_1 x_i + \epsilon_i,$$

where $y_i$ is the gene expression level of the $i$-th subject, $x_i$ is the genotype of the $i$-th subject, and $\epsilon_i$ is the random error term. Additive coding for genotype is used.

mypower

Desired power for the eQTL analysis.

mystddev

Standard deviation of the random error term $\epsilon$.

n.lower

integer. Lower bound of the total number of subjects.

n.upper

integer. Upper bound of the total number of subjects.

verbose

logical. Indicates whether intermediate results should be output.

Details

To test if a SNP is associated with a gene probe, we use the simple linear regression

$$y_i = \beta_0+\beta_1 x_i + \epsilon_i,$$

where $y_i$ is the gene expression level of the $i$-th subject, $x_i$ is the genotype of the $i$-th subject, and $\epsilon_i$ is the random error term. Additive coding for genotype is used. To test if the SNP is associated with the gene probe, we test the null hypothesis $H_0: \beta_1=0$.

Denote $p$ as the minor allele frequency (MAF) of the SNP. Under Hardy-Weinberg equilibrium, the variance of the genotype of the SNP is $\sigma^2_x=2 p (1-p)$, where $\sigma^2_x$ is the variance of the predictor (i.e. the SNP) $x_i$.

We can then use Dupont and Plummer's (1998) power/sample size calculation formula to calculate the required sample size, adjusting for multiple testing.
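The role of a lower and an upper bound on the number of subjects can be illustrated with a root search over a continuous sample size. The sketch below substitutes a generic non-central F approximation for the slope test (ncp $N\sigma^2_x\beta_1^2/\sigma^2$) for the Dupont and Plummer (1998) formula, so its answer may differ from `ssEQTL.SLR`; the helper `slr_power_n` and the bracket `c(3, 2000)` are assumptions chosen so that the degrees of freedom stay positive and the ncp stays moderate.

```r
# Approximate power of the slope test at (possibly non-integer) sample size n.
slr_power_n <- function(n, maf, slope, sigma, alpha) {
  ncp  <- n * 2 * maf * (1 - maf) * slope^2 / sigma^2
  crit <- qf(alpha, df1 = 1, df2 = n - 2, lower.tail = FALSE)
  pf(crit, df1 = 1, df2 = n - 2, ncp = ncp, lower.tail = FALSE)
}

alpha <- 0.05 / 2e5  # Bonferroni-adjusted level: typeI / nTests

# Continuous n at which the approximate power equals 0.8.
n_cont <- uniroot(
  function(n) slr_power_n(n, maf = 0.1, slope = 0.13,
                          sigma = 0.13, alpha = alpha) - 0.8,
  interval = c(3, 2000), tol = 1e-8
)$root

n_required <- ceiling(n_cont)  # round up to a whole number of subjects
n_required
```

`uniroot` requires the power to fall below the target at the lower bound and above it at the upper bound, which is the practical reason a function of this kind needs both bounds.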

Value

sample size required for the eQTL analysis to achieve the desired power.

Author(s)

Xianjun Dong <[email protected]>, Tzuu-Wang Chang <[email protected]>, Scott T. Weiss <[email protected]>, Weiliang Qiu <[email protected]>

References

Dupont, W.D. and Plummer, W.D.. Power and Sample Size Calculations for Studies Involving Linear Regression. Controlled Clinical Trials. 1998;19:589-601.

See Also

powerEQTL.SLR, minSlopeEQTL.SLR

Examples

ssEQTL.SLR(
  MAF = 0.1,
  typeI = 0.05,
  nTests = 2e+05,
  slope = 0.13,
  mypower = 0.8,
  mystddev = 0.13,
  n.lower = 2.01,
  n.upper = 1e+30,
  verbose = TRUE)