Split BAM files into a (variable) pre-defined number of smaller BAM files depending on the sample

Submitted by ﹥>﹥吖頭↗ on 2019-12-11 19:26:40

Question


I want to split multiple BAM files into a predetermined number of smaller BAM files. I do not know how to specify the output, because the number of smaller BAM files varies depending on which sample I am splitting.

I have read https://bitbucket.org/snakemake/snakemake/issues/865/pre-determined-dynamic-output

I do not see how a checkpoint helps in my case.

SAMPLE_cluster = {"SampleA": ["1", "2", "3"], "SampleB": ["1"], "SampleC": ["1", "2"]}

rule split_bam:
    input: "{sample}.bam"
    output: expand("split_bam/{{sample}}_{cluster_id}.bam", cluster_id = ?)
    shell:
        """
        split_bam {input} {output}
        """

rule index_split_bam:
    input: "split_bam/{sample}_{cluster_id}.bam"
    output: "split_bam/{sample}_{cluster_id}.bam.bai"
    shell:
        """
        samtools index {input}
        """

A for loop works for me, as in the link above, but the anonymous rule annoys me.
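For reference, the per-sample output lists that such a for loop would hand to each generated rule can be sketched in plain Python. This is only an illustration built on the SAMPLE_cluster mapping above; the helper names split_bam_outputs and all_index_targets are hypothetical and are not part of Snakemake.

```python
# Mapping from the question: sample name -> cluster IDs for that sample.
SAMPLE_cluster = {"SampleA": ["1", "2", "3"], "SampleB": ["1"], "SampleC": ["1", "2"]}


def split_bam_outputs(sample):
    """Hypothetical helper: the split BAM paths expected for one sample."""
    return [
        f"split_bam/{sample}_{cluster_id}.bam"
        for cluster_id in SAMPLE_cluster[sample]
    ]


def all_index_targets():
    """Hypothetical helper: every .bai file expected across all samples."""
    return [
        path + ".bai"
        for sample in SAMPLE_cluster
        for path in split_bam_outputs(sample)
    ]


# Show the variable-length output list for each sample.
for sample in SAMPLE_cluster:
    print(sample, split_bam_outputs(sample))
```

The number of paths differs per sample (three, one, and two here), which is exactly what a single expand() call in one wildcard rule cannot express.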

How do I specify the output for the split_bam rule? I have read "Snakemake: unknown output/input files after splitting by chromosome". That approach works because the number of chromosomes is fixed for a single sample. If there were multiple samples with a different number of chromosomes per sample, the situation would be similar to my problem.

Source: https://stackoverflow.com/questions/56637350/split-bam-files-to-variable-pre-defined-number-of-small-bam-files-depending-on
