• Using PAML - the codeml control file - [Research]

    2008-01-29

    Tag:evolution

    seqfile = seqfile.phy * sequence data filename
    treefile = tree.tre * tree structure file name
    outfile = result.mlc * main result file name
    * Replace these three lines with your own files: seqfile is the sequence file, treefile is the tree file, and outfile is the result file.

    noisy = 3 * 0,1,2,3,9: how much rubbish on the screen
    verbose = 1 * 0: concise; 1: detailed, 2: too much
    runmode = 0 * 0: user tree; 1: semi-automatic; 2: automatic
    * 3: StepwiseAddition; (4,5):PerturbationNNI; -2: pairwise
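    * These three lines control how the program processes the data and usually need no change. For pairwise comparison of sequences only, set runmode to -2; no tree file is needed in that case.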

    seqtype = 1 * 1:codons; 2:AAs; 3:codons-->AAs
    CodonFreq = 2 * 0:1/61 each, 1:F1X4, 2:F3X4, 3:codon table
    *F3x4: codon frequencies are calculated from average nucleotide frequencies at the three codon positions.
    *ndata = 10
    clock = 0 * 0: no clock, 1:clock; 2:local clock; 3:CombinedAnalysis
    aaDist = 0 * 0:equal, +:geometric; -:linear, 1-6:G1974,Miyata,c,p,v,a
    aaRatefile = dat/mtArt.dat * only used for aa seqs with model=empirical(_F)
    * dayhoff.dat, jones.dat, wag.dat, mtmam.dat, or your own
    * seqtype is 1 for DNA (codon) sequences and 2 for protein sequences; this example uses DNA. The other lines usually need no change.

    model = 1
    * models for codons:
    * 0:one, 1:b, 2:2 or more dN/dS ratios for branches
    * models for AAs or codon-translated AAs:
    * 0:poisson, 1:proportional, 2:Empirical, 3:Empirical+F
    * 6:FromCodon, 7:AAClasses, 8:REVaa_0, 9:REVaa(nr=189)
    * Change this value when using branch models or branch-site models.
    * 0:one omega ratio for all branches; 1:separate omega for each branch; 2:user specified dN/dS ratios for branches

    * For details on setting the following parameters, see the codon substitution models section of the program manual.
    NSsites = 0 * 0:one w;1:neutral;2:selection; 3:discrete;4:freqs;
    * 5:gamma;6:2gamma;7:beta;8:beta&w;9:beta&gamma;
    * 10:beta&gamma+1; 11:beta&normal>1; 12:0&2normal>1;
    * 13:3normal>0
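    * This parameter selects the type of site model. Note that the number and meaning of the parameters
    * reported in the results (p, ω, etc.) differ between models and between NSsites settings. To decide which set of estimates to trust, compare the models with a likelihood ratio test.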

    icode = 0 * 0:universal code; 1:mammalian mt; 2-10:see below
    Mgene = 0
    * codon: 0:rates, 1:separate; 2:diff pi, 3:diff kappa, 4:all diff
    * AA: 0:rates, 1:separate
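    * These parameters also normally need no change, but note that icode must match the type of DNA sequence, because the genetic code affects how the program does its calculations.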

    fix_kappa = 0 * 1: kappa fixed, 0: kappa to be estimated
    kappa = 2 * initial or fixed kappa
    fix_omega = 0 * 1: omega or omega_1 fixed, 0: estimate
    omega = .4 * initial or fixed omega, for codons or codon-based AAs
    * PAML requires the user to set the initial values of kappa and omega.
    * kappa: ts/tv, the transition/transversion rate ratio; omega: dN/dS

    fix_alpha = 1 * 0: estimate gamma shape parameter; 1: fix it at alpha
    alpha = 0. * initial or fixed alpha, 0:infinity (constant rate)
    Malpha = 0 * different alphas for genes
    ncatG = 8 * # of categories in dG of NSsites models
    getSE = 0 * 0: don't want them, 1: want S.E.s of estimates
    * Set getSE to 1 if you need the standard errors of the estimates.
    RateAncestor = 1 * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)

    Small_Diff = .5e-6
    cleandata = 1 * remove sites with ambiguity data (1:yes, 0:no)?
    * Set this to 1 if the sequences contain gaps or other characters the program cannot read.
    *fix_blength = -1 * 0: ignore, -1: random, 1: initial, 2: fixed
    method = 0 * Optimization method 0: simultaneous; 1: one branch a time

    * Genetic codes: 0:universal, 1:mammalian mt., 2:yeast mt., 3:mold mt.,
    * 4: invertebrate mt., 5: ciliate nuclear, 6: echinoderm mt.,
    * 7: euplotid mt., 8: alternative yeast nu. 9: ascidian mt., 10: blepharisma nu.
    * These codes correspond to transl_table 1 to 11 of GenBank.
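Since the site models differ only in NSsites (with model = 0), one way to prepare several runs is to generate one control file per model from a shared template. The following is a minimal sketch, not part of PAML: the trimmed template, the helper `make_ctl`, and the per-model output names are assumptions for illustration, and in practice the full annotated file above would serve as the template.

```python
import re

# Trimmed template with a subset of the settings discussed above.
# The file names are the placeholders from the annotated file.
TEMPLATE = """\
seqfile = seqfile.phy
treefile = tree.tre
outfile = result.mlc
runmode = 0
seqtype = 1
CodonFreq = 2
model = 0
NSsites = 0
icode = 0
fix_kappa = 0
kappa = 2
fix_omega = 0
omega = .4
cleandata = 1
"""

# Site models commonly compared by likelihood ratio test (model = 0 throughout).
SITE_MODELS = {"M1a": 1, "M2a": 2, "M7": 7, "M8": 8}


def make_ctl(template, name, nssites):
    """Return control-file text with NSsites and outfile set for one model."""
    text = re.sub(r"(?m)^NSsites\s*=.*$", f"NSsites = {nssites}", template)
    text = re.sub(r"(?m)^outfile\s*=.*$", f"outfile = {name}.mlc", text)
    return text


# One control-file text per model; hypothetical names such as M1a.ctl
# could be used when writing these to disk.
ctl_files = {name: make_ctl(TEMPLATE, name, n) for name, n in SITE_MODELS.items()}
```

Each text can then be saved to its own file and passed to codeml as the control file, so every model writes to a separate result file.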

    ==================

    :) Here are some simple likelihood ratio tests (LRTs) for variable omega among codons. Both tests use site models (model = 0), which let omega vary among codon sites and apply to all branches of the tree.

    1. M1a vs. M2a:   M1a (model = 0, NSsites = 1), M2a (model = 0, NSsites = 2), df = 2, LRT = 2Δl = abs(2 × (l1 - l0)), where l0 and l1 are the log-likelihoods of the two models.

    2. M7 vs. M8:   M7 (model = 0, NSsites = 7), M8 (model = 0, NSsites = 8), df = 2, LRT = 2Δl = abs(2 × (l1 - l0)).
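The log-likelihoods l0 and l1 have to be read from the main result files. As a rough sketch, assuming the result file reports them on a line of the form `lnL(ntime: 11  np: 13):  -1234.567890  +0.000000` (the sample line and its numbers are made up for illustration, and `read_lnl` is a hypothetical helper), they could be extracted like this:

```python
import re

# Assumed layout of the log-likelihood line in the codeml result file.
LNL_PATTERN = re.compile(r"lnL\(ntime:\s*(\d+)\s+np:\s*(\d+)\):\s*(-?\d+\.\d+)")


def read_lnl(text):
    """Return (number of parameters, lnL) from the first lnL line in the text."""
    match = LNL_PATTERN.search(text)
    if match is None:
        raise ValueError("no lnL line found")
    return int(match.group(2)), float(match.group(3))


# Made-up output line, for illustration only.
sample = "lnL(ntime: 11  np: 13):  -1234.567890      +0.000000"
n_params, lnl = read_lnl(sample)
```

The difference in the number of parameters (np) between the two models of a pair should equal the df used in the test, which is 2 for both comparisons.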

    The p-value for the chi-square statistic can be calculated with the chi2 program (chi2.exe) in PAML. If the p-value is < 0.05, we can conclude that some sites are under positive selection.
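As a cross-check on the chi2 program, the test with df = 2 can also be worked out by hand, because the upper tail probability of a chi-square distribution with 2 degrees of freedom is exactly exp(-x/2). The sketch below uses made-up log-likelihood values; `lrt_df2` is a hypothetical helper, not part of PAML.

```python
import math


def lrt_df2(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by 2 parameters.

    Returns (statistic, p_value). For df = 2 the chi-square upper tail
    probability has the closed form exp(-x / 2).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    # The alternative model should never fit worse than the null; a negative
    # value usually points to a convergence problem, so clamp it at zero
    # here instead of taking the absolute value.
    stat = max(stat, 0.0)
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Made-up values: l0 for the null model (e.g. M1a), l1 for the alternative (e.g. M2a).
l0, l1 = -1240.0, -1235.0
stat, p_value = lrt_df2(l0, l1)
print(f"2dl = {stat:.3f}, p = {p_value:.6f}")
```

With these illustrative numbers 2Δl is 10 and the p-value is exp(-5), about 0.0067, so the null model would be rejected at the 0.05 level.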





    Comments

  • Perfect guideline, much better than Yang's manual, which is too sophisticated for new users. Can you give some more details about how to perform the likelihood ratio test with the NSsites models? Looking forward to your update.
