Question
I'm trying to implement a producer-consumer pattern in my block-level driver (on Linux kernel version 2.6.39.1). My block driver's make_request_fn receives a stream of struct bio from a user-level application. On receiving these BIOs, they are queued up. Next, I create a new struct bio which holds all the information present in the queued BIOs. This new "merged_bio" is submitted only when a struct request slot becomes available in the lower-level driver's request_queue. In the meantime, my block driver continues to receive BIOs from the user-level application and queues them up (a dense-load situation). Now, to reduce latency, I want to ensure that as BIOs are queued up, a batch of the oldest BIOs is dequeued, minimizing idle time in the queue. I am confused about how to achieve this pipelined situation (with low per-request latency) in which my driver's make_request_fn is the producer of BIOs and a function calling submit_bio() is the consumer. Possible approaches include:
1. Tasklets -- Have a tasklet continuously consume BIOs from the queue once a struct request slot becomes free in the request_queue. This approach won't work: tasklets run in atomic context, but the handler would call submit_bio(), which in turn has a call to schedule().
2. Work queues -- They can have higher latency than tasklets, since the handler function for a work queue can sleep. This could affect my driver's performance, because the BIOs may be submitted to the lower-level driver much later than when they were actually enqueued by the make_request_fn. I'm unsure how large the impact on performance would be (my driver implements a fast logging device). Another issue with work queues is how and when to schedule the work. The merged_bio has to be submitted once a request slot becomes available, so there has to be some kind of "signaling" mechanism that schedules the work as soon as a struct request becomes available. I don't see how the request_queue can be monitored continuously for a free slot without a signaling or poll()ing mechanism and then explicitly scheduling the work queue. And yes, we cannot call submit_bio() from the callback function of the previously completed BIO.
3. Kernel threads -- I was considering this as a third option. I don't have much experience with kernel threads, but here is how I was planning to go about it: my driver's make_request_fn would continue to enqueue BIOs. A new kernel thread would continuously consume BIOs from the queue, but only when a struct request slot is available (and not otherwise). So, each time this kernel thread is scheduled, it would check for an empty request slot, consume a batch of BIOs from the queue, and call submit_bio(). A rough sketch of what I have in mind is shown after this list.
4. Something smarter?
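To make option 3 concrete, here is the rough shape of what I have in mind (a sketch only: slot_available() and dequeue_and_merge() are hypothetical helpers I haven't written yet; kthread_run(), wait_event_interruptible(), and wake_up_interruptible() are the real kernel APIs I'd lean on):

#include <linux/kthread.h>
#include <linux/wait.h>
#include <linux/bio.h>

static DECLARE_WAIT_QUEUE_HEAD(consumer_wq);
static struct task_struct *consumer_task;

/* Consumer: sleeps until woken, then drains a batch of queued BIOs. */
static int bio_consumer_fn(void *data)
{
        struct bio *merged_bio;

        while (!kthread_should_stop()) {
                /* Sleep until the producer or the completion path
                 * signals that a request slot may be free. */
                wait_event_interruptible(consumer_wq,
                                kthread_should_stop() || slot_available());
                if (kthread_should_stop())
                        break;

                /* Hypothetical helper: pop the oldest queued BIOs and
                 * merge them into one bio; NULL if the queue is empty. */
                merged_bio = dequeue_and_merge();
                if (merged_bio)
                        submit_bio(WRITE, merged_bio); /* may sleep -- fine here */
        }
        return 0;
}

/* The producer (make_request_fn) and the bi_end_io callback would both
 * call wake_up_interruptible(&consumer_wq) to kick the consumer.
 * At init: consumer_task = kthread_run(bio_consumer_fn, NULL, "biocons"); */

This way the thread blocks instead of polling, and the per-request latency is bounded by a wakeup rather than by some fixed scheduling period.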
Can members of stackoverflow help me choose a smart and efficient way to implement this scenario? Thank you!
UPDATE: I tried the work queue method, but it just resulted in my kernel crashing (/var/log/messages contained garbled text, so I don't have any logs to share). Here is how I implemented the work queue:
The data structure which will be used by the work queue:
typedef struct pitdev_struct pitdev_t; /* forward typedef so it can be used below */

struct my_work_struct {
        struct work_struct wk;
        pitdev_t *pd;       /* Pointer to my block device */
        int subdev_index;   /* Indexes the disk currently in the picture -- pd->subdev[subdev_index] */
};

struct pitdev_struct {
        /* Driver-related data */
        struct my_work_struct *work;
} *pd;
Initialize my work item:
/* Allocate memory for both pd and pd->work */
INIT_WORK(&pd->work->wk, my_work_fn);
pd->work->pd = pd;
pd->work->subdev_index = 0;
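(The allocation elided above is roughly this; the kzalloc flags and error handling are just my sketch:)

pd = kzalloc(sizeof(*pd), GFP_KERNEL);            /* zeroed device struct */
if (!pd)
        return -ENOMEM;
pd->work = kzalloc(sizeof(*pd->work), GFP_KERNEL);
if (!pd->work) {
        kfree(pd);
        return -ENOMEM;
}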
My work function definition:
void my_work_fn(struct work_struct *work)
{
        struct my_work_struct *temp = container_of(work, struct my_work_struct, wk);
        pitdev_t *pd = temp->pd;
        int sub_index = temp->subdev_index;
        struct bio *merged_bio; /* built from the queued BIOs -- see sketch below */

        /* Create the merged BIO for pd->subdev[sub_index] and submit it */
        submit_bio(WRITE, merged_bio);
}
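The elided bio construction is roughly like the following (simplified; pages_needed, lower_bdev, start_sector, and my_end_io are placeholders, not my exact code):

merged_bio = bio_alloc(GFP_NOIO, pages_needed); /* one iovec per queued page */
if (!merged_bio)
        return;

merged_bio->bi_bdev    = lower_bdev;    /* lower-level block device */
merged_bio->bi_sector  = start_sector;  /* first sector of the merged range */
merged_bio->bi_end_io  = my_end_io;     /* completion callback, shown below */
merged_bio->bi_private = pd;

/* Attach the pages of the queued BIOs (elided):
 *      bio_add_page(merged_bio, page, len, offset); */

submit_bio(WRITE, merged_bio);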
In merged_bio->bi_end_io, I schedule my work item:
schedule_work(&pd->work->wk);
This is done so that the next BIO to be submitted is scheduled soon after the previous BIO has completed successfully. The first call to submit_bio() is made without using the work queue, and it takes place without any problems; but when I call submit_bio() from a work item, the system crashes.
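For context, the completion callback that does this is shaped roughly like so (simplified; my_end_io and the bio_put() placement are illustrative, not my exact code). As far as I understand, bi_end_io runs in atomic context, and schedule_work() is safe to call from there, which is why I expected this to work:

/* bi_end_io runs in interrupt/softirq context: it must not sleep,
 * so submit_bio() is off-limits here, but schedule_work() is safe. */
static void my_end_io(struct bio *bio, int err)
{
        pitdev_t *pd = bio->bi_private;

        bio_put(bio);                 /* drop the reference to the merged bio */
        schedule_work(&pd->work->wk); /* defer the next submit_bio() */
}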
Any ideas?
Source: https://stackoverflow.com/questions/6742410/producer-consumer-implementation-in-a-block-device-driver