Snakemake - Override LSF (bsub) cluster config in a rule-specific manner

Posted by 我的梦境 on 2019-12-07 15:46:30

You can add resources directly to each of your rules:

rule all:
    input:
        'a_out.txt', 'b_out.txt'

rule a:
    input:
        'a.txt'
    output:
        'a_out.txt'
    resources:
        mem_mb=40000
    shell:
        'touch {output}'

rule b:
    input:
        'b.txt'
    output:
        'b_out.txt'
    resources:
        mem_mb=20000
    shell:
        'touch {output}'

Then remove the resource settings from your .json, so that the cluster command line no longer overrides the values set in the Snakefile:

new.cluster.json:

{
    "__default__":
    {
        "output": "logs/cluster/{rule}.{wildcards}.out",
        "error": "logs/cluster/{rule}.{wildcards}.err"
    }
}
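
For the per-rule mem_mb values to reach LSF, the submission command has to reference them. The sketch below assembles one possible invocation. It assumes Snakemake 7 or earlier (where --cluster and --cluster-config are available); the particular bsub flags and the --jobs value are illustrative choices, not taken from the original question, and the unit that -M expects depends on how your LSF site is configured.

```python
import shlex

# Assumption: a bsub template for Snakemake's --cluster option.
# This is a plain string (not an f-string), so the {placeholders}
# are left intact for Snakemake to fill in for each job.
bsub_template = (
    "bsub -M {resources.mem_mb} "
    "-o {cluster.output} -e {cluster.error}"
)

# Assumption: 10 concurrent jobs; adjust to your cluster.
snakemake_cmd = [
    "snakemake",
    "--jobs", "10",
    "--cluster-config", "new.cluster.json",
    "--cluster", bsub_template,
]

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(snakemake_cmd))
```

Here {resources.mem_mb} comes from the resources: section of each rule, while {cluster.output} and {cluster.error} still come from new.cluster.json.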

In new.cluster.json you can define resources for specific rules. In your case, you would do the following:

{
    "__default__":
    {
        "memory": 20000,
        "resources": "\"rusage[mem=8000] span[hosts=1]\"",
        "output": "logs/cluster/{rule}.{wildcards}.out",
        "error": "logs/cluster/{rule}.{wildcards}.err"
    },
    "b":
    {
        "memory": 40000,
        "resources": "\"rusage[mem=15000] span[hosts=1]\"",
        "output": "logs/cluster/{rule}.{wildcards}.out",
        "error": "logs/cluster/{rule}.{wildcards}.err"
    }
}

Then, in the Snakefile, you can use these values by loading new.cluster.json and referring to it in your rule:

import json

with open('new.cluster.json') as fh:
    cluster_config = json.load(fh)

rule all:
    input:
        'a_out.txt', 'b_out.txt'

rule a:
    input:
        'a.txt'
    output:
        'a_out.txt'
    shell:
        'touch {output}'

rule b:
    input:
        'b.txt'
    output:
        'b_out.txt'
    resources:
        mem_mb=cluster_config["b"]["memory"]
    shell:
        'touch {output}'
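
Rule a has no entry of its own in new.cluster.json, so a direct lookup such as cluster_config["a"]["memory"] would raise a KeyError. One way to handle that is a small helper that overlays the rule's entry on top of __default__. This is only a sketch: the name cluster_value is hypothetical, and the JSON is inlined (memory keys only) so the example runs on its own.

```python
import json

# Inline, trimmed copy of new.cluster.json so the sketch is self-contained;
# in a Snakefile you would load the real file with json.load instead.
cluster_config = json.loads("""
{
    "__default__": {"memory": 20000},
    "b": {"memory": 40000}
}
""")

def cluster_value(rule_name, key):
    """Return `key` for `rule_name`, falling back to the __default__ entry."""
    # Rule-specific settings override the defaults key by key.
    settings = {
        **cluster_config["__default__"],
        **cluster_config.get(rule_name, {}),
    }
    return settings[key]

print(cluster_value("a", "memory"))  # 20000 (from __default__)
print(cluster_value("b", "memory"))  # 40000 (rule-specific)
```

In the Snakefile, rule a could then declare mem_mb=cluster_value("a", "memory") and rule b mem_mb=cluster_value("b", "memory"), so every rule gets a value even without its own entry.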

If you take a look through this repository, you can see how I use these cluster configs in the wild.
