Question
I'm trying to implement a producer-consumer pattern in my block-level driver (on Linux kernel version 2.6.39.1). My block driver's make_request_fn receives a stream of struct bio from a user-level application. On receiving these BIOs, they are queued up. Next, I create a new struct bio which holds all the information present in the queued BIOs. This new "merged_bio" is submitted only when a struct request slot becomes available in the lower-level driver's request_queue. In the meantime, my block driver continues to receive BIOs from the user-level application and queues them up (a dense-load situation). Now, to reduce latency, I want to ensure that as BIOs are queued up, a batch of the oldest BIOs is dequeued, minimizing idle time in the queue. I am confused about how to achieve this pipelined situation (with low per-request latency) in which my driver's make_request_fn is the producer of BIOs and a function calling submit_bio() is the consumer. Possible approaches include:
1. Tasklets -- Have a tasklet continuously consume BIOs from the queue once a struct request slot becomes free in the request_queue. This approach won't work: tasklets run in atomic context, but the handler would call submit_bio(), which in turn has a call to schedule().
2. Work queues -- They can have higher latency than tasklets, since the handler function for a work queue can sleep. This could affect my driver's performance, because the BIOs may be submitted to the lower-level driver much later than when they were actually enqueued by the make_request_fn. I'm unsure how large the impact on performance would be (my driver implements a fast logging device). Another issue with work queues is how and when to schedule the work. The merged_bio has to be submitted once a request slot becomes available, so there has to be some kind of "signaling" mechanism that schedules the work as soon as a struct request becomes available. I don't see how the request_queue can be monitored continuously for a free slot without a signaling or poll()ing mechanism and then explicitly scheduling the work queue. And yes, we cannot call submit_bio() from the callback function of the previously completed BIO.
3. Kernel threads -- I was considering this as a third option. I don't have much experience with kernel threads, but here is how I was planning to go about it: my driver's make_request_fn would continue to enqueue BIOs. A new kernel thread would continuously consume BIOs from the queue, but only when a struct request slot is available (and not otherwise). So, each time this kernel thread is scheduled, it would check for an empty request slot, consume a batch of BIOs from the queue, and call submit_bio(). A rough sketch of what I have in mind is shown after this list.
4. Something smarter?
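To make option 3 concrete, here is the rough shape of what I have in mind (a sketch only: slot_available() and dequeue_and_merge() are hypothetical helpers I haven't written yet; kthread_run(), wait_event_interruptible(), and wake_up_interruptible() are the real kernel APIs I'd lean on):

#include <linux/kthread.h>
#include <linux/wait.h>
#include <linux/bio.h>

static DECLARE_WAIT_QUEUE_HEAD(consumer_wq);
static struct task_struct *consumer_task;

/* Consumer: sleeps until woken, then drains a batch of queued BIOs. */
static int bio_consumer_fn(void *data)
{
        struct bio *merged_bio;

        while (!kthread_should_stop()) {
                /* Sleep until the producer or the completion path
                 * signals that a request slot may be free. */
                wait_event_interruptible(consumer_wq,
                                kthread_should_stop() || slot_available());
                if (kthread_should_stop())
                        break;

                /* Hypothetical helper: pop the oldest queued BIOs and
                 * merge them into one bio; NULL if the queue is empty. */
                merged_bio = dequeue_and_merge();
                if (merged_bio)
                        submit_bio(WRITE, merged_bio); /* may sleep -- fine here */
        }
        return 0;
}

/* The producer (make_request_fn) and the bi_end_io callback would both
 * call wake_up_interruptible(&consumer_wq) to kick the consumer.
 * At init: consumer_task = kthread_run(bio_consumer_fn, NULL, "biocons"); */

This way the thread blocks instead of polling, and the per-request latency is bounded by a wakeup rather than by some fixed scheduling period.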
Can members of stackoverflow help me choose a smart and efficient way to implement this scenario? Thank you!
UPDATE: I tried the work queue method, but it just resulted in my kernel crashing (/var/log/messages contained garbled text, so I don't have any logs to share). Here is how I implemented the work queue:
The data structure which will be used by the work queue:
typedef struct pitdev_struct pitdev_t; /* forward typedef so it can be used below */

struct my_work_struct {
        struct work_struct wk;
        pitdev_t *pd;       /* Pointer to my block device */
        int subdev_index;   /* Indexes the disk currently in the picture -- pd->subdev[subdev_index] */
};

struct pitdev_struct {
        /* Driver-related data */
        struct my_work_struct *work;
} *pd;
Initialize my work item:
/* Allocate memory for both pd and pd->work */
INIT_WORK(&pd->work->wk, my_work_fn);
pd->work->pd = pd;
pd->work->subdev_index = 0;
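(The allocation elided above is roughly this; the kzalloc flags and error handling are just my sketch:)

pd = kzalloc(sizeof(*pd), GFP_KERNEL);            /* zeroed device struct */
if (!pd)
        return -ENOMEM;
pd->work = kzalloc(sizeof(*pd->work), GFP_KERNEL);
if (!pd->work) {
        kfree(pd);
        return -ENOMEM;
}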
My work function definition:
void my_work_fn(struct work_struct *work)
{
        struct my_work_struct *temp = container_of(work, struct my_work_struct, wk);
        pitdev_t *pd = temp->pd;
        int sub_index = temp->subdev_index;
        struct bio *merged_bio; /* built from the queued BIOs -- see sketch below */

        /* Create the merged BIO for pd->subdev[sub_index] and submit it */
        submit_bio(WRITE, merged_bio);
}
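The elided bio construction is roughly like the following (simplified; pages_needed, lower_bdev, start_sector, and my_end_io are placeholders, not my exact code):

merged_bio = bio_alloc(GFP_NOIO, pages_needed); /* one iovec per queued page */
if (!merged_bio)
        return;

merged_bio->bi_bdev    = lower_bdev;    /* lower-level block device */
merged_bio->bi_sector  = start_sector;  /* first sector of the merged range */
merged_bio->bi_end_io  = my_end_io;     /* completion callback, shown below */
merged_bio->bi_private = pd;

/* Attach the pages of the queued BIOs (elided):
 *      bio_add_page(merged_bio, page, len, offset); */

submit_bio(WRITE, merged_bio);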
In merged_bio->bi_end_io, I schedule my work item:
schedule_work(&pd->work->wk);
This is done so that the next BIO to be submitted is scheduled soon after the previous BIO has completed successfully. The first call to submit_bio() is made without using the work queue, and it takes place without any problems; but when I call submit_bio() from a work item, the system crashes.
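For context, the completion callback that does this is shaped roughly like so (simplified; my_end_io and the bio_put() placement are illustrative, not my exact code). As far as I understand, bi_end_io runs in atomic context, and schedule_work() is safe to call from there, which is why I expected this to work:

/* bi_end_io runs in interrupt/softirq context: it must not sleep,
 * so submit_bio() is off-limits here, but schedule_work() is safe. */
static void my_end_io(struct bio *bio, int err)
{
        pitdev_t *pd = bio->bi_private;

        bio_put(bio);                 /* drop the reference to the merged bio */
        schedule_work(&pd->work->wk); /* defer the next submit_bio() */
}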
Any ideas?
Source: https://stackoverflow.com/questions/6742410/producer-consumer-implementation-in-a-block-device-driver