Re: [PATCH 08/17] bio: reimplement bio_copy_user_iov()

Previous thread: [PATCH 03/17] blk-map: improve alignment checking for blk_rq_map_user_iov() by Tejun Heo on Wednesday, April 1, 2009 - 6:44 am. (1 message)

Next thread: [PATCH 00/24] Series short description by Alan Cox on Wednesday, April 1, 2009 - 7:04 am. (25 messages)
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Hello, all.

These patches are available in the following git tree but it's on top
of the previous blk-map-related-fixes patchset which needs some
updating, so this posting is just for review and comments.  This
patchset needs to spend quite some time in RCs even if it gets acked
eventually, so definitely no .30 material.  Please also note that I
haven't updated the comment about bio chaining violating queue limit,
so please ignore that part.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git blk-map
 http://git.kernel.org/?p=linux/kernel/git/tj/misc.git;a=shortlog;h=blk-map

The goals of this patchset are

* Clean up and refactor blk/bio PC map API interface and
  implementation.

* Add blk_rq_map_kern_sg() so that in-kernel user can issue
  multi-segment PC request.  scsi_lib.c currently does this directly
  by building rq itself.

* This is also in the general direction of making block layer
  interface more strictk.  For PC requests, users only get to deal
  with requests and buffers.

This patchset uses sg list to pass multi-segment memory area into
blk_rq_map_kern_sg() and also uses sgl to simplify internal
implementation.

Boaz Harrosh has some objections against both of them.  For the
former, I can't really think of using anything other than sgl for the
API.  For the second point, the use of sgl inside bio for map
implementation does add overhead for blk_rq_map_user*() calls because
it first maps to sgl and then gets interpreted to bvec.  The reason
that sgl is used for implementation is because it has richer API,
mainly sg_mapping_iter.

So, it basically comes down to whether reducing the overhead of
alloc/copying and freeing sgl one more time for each SG_IO call is
worth the added complexity.  I don't really think it will show up
anywhere but if this is a problem, switching over to bvec for internal
implementation wouldn't be too difficult although then copying back
and forth would require some hairy code.  Also, please note that the
new code is ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: cleanup

bio is about to go through major update.  Take the chance and cleanup
bio.h such that

* forward declaration of structs are in one place.

* collect bio_copy/map*() prototypes in one place.

* function prototypes don't have unncessary extern in front of them
  and have their parameters named.  (dropping extern makes it much
  easier to have named parameters)

* dummy integrity APIs are inline functions instead of macros so that
  type check still occurs and unused variable warnings aren't
  triggered.

* fix return values of dummy bio_integrity_set/get_tag(),
  bio_integrity_prep() and bio_integrity_clone().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/bio.h |  178 +++++++++++++++++++++++++++++++--------------------
 1 files changed, 108 insertions(+), 70 deletions(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 8647dd9..4bf7442 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -354,56 +354,64 @@ struct bio_pair {
 	atomic_t			cnt;
 	int				error;
 };
-extern struct bio_pair *bio_split(struct bio *bi, int first_sectors);
-extern void bio_pair_release(struct bio_pair *dbio);
 
-extern struct bio_set *bioset_create(unsigned int, unsigned int);
-extern void bioset_free(struct bio_set *);
-
-extern struct bio *bio_alloc(gfp_t, int);
-extern struct bio *bio_kmalloc(gfp_t, int);
-extern struct bio *bio_alloc_bioset(gfp_t, int, struct bio_set *);
-extern void bio_put(struct bio *);
-extern void bio_free(struct bio *, struct bio_set *);
-
-extern void bio_endio(struct bio *, int);
 struct request_queue;
-extern int bio_phys_segments(struct request_queue *, struct bio *);
-
-extern void __bio_clone(struct bio *, struct bio *);
-extern struct bio *bio_clone(struct bio *, gfp_t);
-
-extern void bio_init(struct bio *);
-
-extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
-extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
-			   unsigned int, ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: cleanup

blk-map and bio use sg_iovec for addr-len segments although there
isn't anything sg-specific about the API.  This is mostly due to
historical reasons.  sg_iovec is by definition identical to iovec.
Use iovec instead.  This removes bogus dependency on scsi sg and will
allow use of iovec helpers.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-map.c        |    5 ++---
 block/scsi_ioctl.c     |    8 +++-----
 fs/bio.c               |   23 +++++++++++------------
 include/linux/bio.h    |    6 +++---
 include/linux/blkdev.h |    8 ++++----
 5 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 29aa60d..4f0221a 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -5,7 +5,6 @@
 #include <linux/module.h>
 #include <linux/bio.h>
 #include <linux/blkdev.h>
-#include <scsi/sg.h>		/* for struct sg_iovec */
 
 #include "blk.h"
 
@@ -64,7 +63,7 @@ static int __blk_rq_unmap_user(struct bio *bio)
  *    unmapping.
  */
 int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
-			struct rq_map_data *map_data, struct sg_iovec *iov,
+			struct rq_map_data *map_data, struct iovec *iov,
 			int iov_count, unsigned int len, gfp_t gfp_mask)
 {
 	struct bio *bio = ERR_PTR(-EINVAL);
@@ -130,7 +129,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
 		    struct rq_map_data *map_data, void __user *ubuf,
 		    unsigned long len, gfp_t gfp_mask)
 {
-	struct sg_iovec iov;
+	struct iovec iov;
 
 	iov.iov_base = ubuf;
 	iov.iov_len = len;
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index c8e8868..73cfd91 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -289,7 +289,7 @@ static int sg_io(struct request_queue *q, struct gendisk *bd_disk,
 	if (hdr->iovec_count) {
 		const int size = sizeof(struct sg_iovec) * hdr->iovec_count;
 		size_t iov_data_len;
-		struct sg_iovec *iov;
+		struct iovec *iov;
 
 		iov = kmalloc(size, GFP_KERNEL);
 		if (!iov) {
@@ ...
From: Boaz Harrosh
Date: Wednesday, April 1, 2009 - 7:50 am

OK, The actual one user in sg.c passes a void*, so no casts are
needed. (I couldn't find where are the type-casts of old users)

Should we make this a part of a bigger cleanup that removes
sg_iovec, from Kernel altogether and only makes a #define for
user-mode?

BTW: 
    user-mode scsi/sg.h does not come from the Kernels exported
    headers. It comes with the gcc distribution.
    If we remove it alltogether it will not affect anybody.

If you want I can help with this little chore?


--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 8:32 am

Hello,


Sure, that would be a nice cleanup.  If dropping sg_iovec doesn't
affect userland, I think it would better to just drop it.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: cleanup

bio confusingly uses @write_to_vm and @reading for data directions,
both of which mean the opposite of the usual block/bio convention of
using READ and WRITE w.r.t. IO devices.  The only place where the
inversion is necessary is when caling get_user_pages_fast() in
bio_copy_user_iov() as the gup uses the VM convention of read/write
w.r.t. VM.

This patch converts all bio functions to use READ/WRITE rw parameter
and let the one place where inversion is necessary to rw == READ.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-map.c     |   10 +++++-----
 fs/bio.c            |   50 +++++++++++++++++++++++++-------------------------
 include/linux/bio.h |   18 +++++++++---------
 3 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index b0b65ef..29aa60d 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -68,15 +68,15 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
 			int iov_count, unsigned int len, gfp_t gfp_mask)
 {
 	struct bio *bio = ERR_PTR(-EINVAL);
-	int read = rq_data_dir(rq) == READ;
+	int rw = rq_data_dir(rq);
 
 	if (!iov || iov_count <= 0)
 		return -EINVAL;
 
 	if (!map_data)
-		bio = bio_map_user_iov(q, NULL, iov, iov_count, read, gfp_mask);
+		bio = bio_map_user_iov(q, NULL, iov, iov_count, rw, gfp_mask);
 	if (bio == ERR_PTR(-EINVAL))
-		bio = bio_copy_user_iov(q, map_data, iov, iov_count, read,
+		bio = bio_copy_user_iov(q, map_data, iov, iov_count, rw,
 					gfp_mask);
 	if (IS_ERR(bio))
 		return PTR_ERR(bio);
@@ -177,7 +177,7 @@ EXPORT_SYMBOL(blk_rq_unmap_user);
 int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 		    unsigned int len, gfp_t gfp_mask)
 {
-	int reading = rq_data_dir(rq) == READ;
+	int rw = rq_data_dir(rq);
 	int do_copy = 0;
 	struct bio *bio;
 
@@ -188,7 +188,7 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 
 	do_copy = !blk_rq_aligned(q, kbuf, len) || ...
From: Boaz Harrosh
Date: Thursday, April 2, 2009 - 1:36 am

Hi one more nit picking just if you are at it. If you want
I can do the work and send it to you so you can squash it into this patch.

can we pleas have an inline that does that? Like bio_set_dir()?
and change all users. You will be surprised how many there are.

It gives me an hart attack every time I have to write yet another


Boaz
--

From: Tejun Heo
Date: Thursday, April 2, 2009 - 2:02 am

Things like this are actually what I'm trying to clean up.  If you
look at the whole series, at the end, there remains only one place
which sets the flag in the blk/bio map paths and all users of
blk_rq_map_*() interface couldn't care less about the flag because
they provide only the necessary information through strictly defined
API.

Thanks.

-- 
tejun
--

From: Boaz Harrosh
Date: Thursday, April 2, 2009 - 2:07 am

Right, I know. and its great.

I'm saying, all these other block-layer places and all
these places above block like stacking drivers and filesystems
That use generic_make_request and friends. If at it these can be

I'll do it, no problems. Would you push it with your patchset?

Thanks
Boaz
--

From: Tejun Heo
Date: Thursday, April 2, 2009 - 2:13 am

Sure thing.

-- 
tejun
--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: better atomic mapping handling

sg-miter supported atomic mapping using single flag - SG_MITER_ATOMIC.
It implicly used KM_BIO_SRC_IRQ and required irq to be disabled for
each iteration to protect the kernel mapping.  This was too limiting
and didn't allow multiple iterators to be used concurrently (e.g. for
sgl -> sgl copying).

This patch adds a new interface sg_miter_start_atomic() which takes
km_type argument and drops @flags from sg_miter_start() so that
km_type can be specified by the caller for atomic iterators.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Pierre Ossman <drzeus-mmc@drzeus.cx>
---
 drivers/mmc/host/sdhci.c    |    4 +-
 include/linux/scatterlist.h |   11 +++++++--
 lib/scatterlist.c           |   49 ++++++++++++++++++++++++++++++++----------
 3 files changed, 47 insertions(+), 17 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index accb592..559aca7 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -704,8 +704,8 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_data *data)
 	}
 
 	if (!(host->flags & SDHCI_REQ_USE_DMA)) {
-		sg_miter_start(&host->sg_miter,
-			data->sg, data->sg_len, SG_MITER_ATOMIC);
+		sg_miter_start_atomic(&host->sg_miter,
+				      data->sg, data->sg_len, KM_BIO_SRC_IRQ);
 		host->blocks = data->blocks;
 	}
 
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index e599698..2249caa 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -6,6 +6,7 @@
 #include <linux/mm.h>
 #include <linux/string.h>
 #include <asm/io.h>
+#include <asm/kmap_types.h>
 
 struct sg_table {
 	struct scatterlist *sgl;	/* the list */
@@ -254,11 +255,15 @@ struct sg_mapping_iter {
 	struct scatterlist	*__sg;		/* current entry */
 	unsigned int		__nents;	/* nr of remaining entries */
 	unsigned int		__offset;	/* offset within sg */
-	unsigned int		__flags;
+	unsigned int		__flags;	/* internal flags */
+	enum ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: more modular implementation

Break down bio_copy_user_iov() into the following steps.

1. bci and page allocation
2. copying data if WRITE
3. create bio accordingly

bci is now responsible for managing any copy related resources.  Given
source iov, bci_create() allocates bci and fills it with enough pages
to cover the source iov.  The allocated pages are described with a
sgl.

Note that new allocator always rounds up rq_map_data->offset to page
boundary to simplify implementation and guarantee enough DMA padding
area at the end.  As the only user, scsi sg, always passes in zero
offset, this doesn't cause any actual behavior difference.  Also,
nth_page() is used to walk to the next page rather than directly
adding to struct page *.

Copying back and forth is done using bio_memcpy_sgl_uiov() which is
implemented using sg mapping iterator and iov iterator.

The last step is done using bio_create_from_sgl().

This patch by itself adds one more level of indirection via sgl and
more code but components factored out here will be used for future
code refactoring.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/bio.c |  465 +++++++++++++++++++++++++++++++++++++++-----------------------
 1 files changed, 293 insertions(+), 172 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 1cd97e3..f13aef0 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -26,6 +26,7 @@
 #include <linux/mempool.h>
 #include <linux/workqueue.h>
 #include <linux/blktrace_api.h>
+#include <linux/scatterlist.h>
 #include <linux/pfn.h>
 #include <trace/block.h>
 
@@ -653,102 +654,303 @@ int bio_add_page(struct bio *bio, struct page *page, unsigned int len,
 	return __bio_add_page(q, bio, page, len, offset, q->max_sectors);
 }
 
-struct bio_copy_info {
-	struct bio_vec		*iovecs;
-	struct iovec		*src_iov;
-	int			src_count;
-	int			is_our_pages;
-};
+/**
+ *	bio_sgl_free_pages - free pages from sgl
+ *	@sgl: target sgl
+ *	@nents: nents for @sgl
+ *
+ *	Free all pages pointed to by @sgl.  Note ...
From: Boaz Harrosh
Date: Wednesday, April 1, 2009 - 8:50 am

Hi dear Tejun

I've looked hard and deep into your patchset, and I would like to
suggest an improvement.

[Option 1]
What your code is actually using from sgl-code base is:
 for_each_sg
 sg_mapping_iter and it's
	sg_miter_start, sg_miter_next
 ... (what else)

I would like if you can define above for bvec(s) just the way you like
them. Then code works directly on the destination bvect inside the final
bio. One less copy no intermediate allocation, and no kmalloc of
bigger-then-page buffers.

These are all small inlines, duplicating those will not affect
Kernel size at all. You are not using the chaining ability of sgl(s)
so it can be simplified. You will see that not having the intermediate
copy simplifies the code even more.

Since no out-side user currently needs sgl(s) no functionality is lost.

[Option 2]
Keep pointer to sgl and not bvec at bio, again code works on final destination.
Later users of block layer that call blk_rq_fill_sgl (blk_rq_map_sg) will just
get a copy of the pointer and another allocation and copy is gained.
This option will spill outside of the current patches scope. Into bvec hacking
code.


I do like your long term vision of separating the DMA part from the virtual part
of scatterlists. Note how they are actually two disjoint lists altogether. After
the dma_map does its thing the dma physical list might be shorter then virtual
and sizes might not correspond at all. The dma mapping code regards the dma part
as an empty list that gets appended while processing, any segments match is
accidental. (That is: inside the scatterlist the virtual address most probably
does not match the dma address)

So [option 1] matches more closely to that vision.

Historically code was doing
  Many-sources => scatterlist => biovec => scatterlist => dma-scatterlist

Only at 2.6.30 we can say that we shorten a step to do:
  Many-sources => biovec => scatterlist => dma-scatterlist

Now you want to return the extra step, I hate it.
[Option 2] can make that ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 4:57 pm

Hello, Boaz.


Yeah, we can switch to dealing with bvecs instead of sgls.  For kernel
mappings, it wouldn't make any difference.  For user single mappings,
it'll remove an intermediate step.  For user sg mappings, if we make
gup use sgl, it won't make much difference.

Frankly, for copy paths, I don't think all this matters at all.  If
each request is small, other overheads will dominate.  For large
requests, copying to and from sgl isn't something worth worrying
about.

For mapping paths, if we change gup to use sgl (we really can't make
it swallow bvec), I don't think it will make whole lot of difference
one way or the other.  It might end up using slightly more cachelines,
but that will be about it.

I do agree using bvec as internal page carrier would be nicer (sans
having to implement stuff manually) but I'm not convinced whether

It's not my favorite part either but I'm just not convinced it matters

Yeah, this is the root cause.  We don't have a common API to carry
segments so we end up doing impedence matching when crossing
boundaries.  Drivers expect sgl at the interface.  Filesystems
obviously expect bio with bvecs.  gup expects array of pages (this one

I'm not hugely against using bvec inside.  I just didn't see much
difference and went for something easier, so yeah, it definitely is an
option.



Then, that's a bug.  Yeah, high order allocations fail more easily as
time passes.  But low order (1 maybe 2) allocations aren't too bad.
If it matters we can make bio_kmalloc() use vmalloc() for bvecs if it
goes over PAGE_SIZE but again given that nobody reported the spurious
GFP_DMA in bounce_gfp which triggers frenzy OOM killing spree for
large SG_IOs, I don't think this matters all that much.

Thanks.

-- 
tejun
--

From: Boaz Harrosh
Date: Thursday, April 2, 2009 - 1:24 am

"gup use sgl" was just a stupid suggestion to try eliminate one more
allocation. Realy Realy stupid.

for gup there is a much better, safer, way. Just allocate on the stack
a constant size say 64 pointers and add an inner-loop on the 64 pages.
The extra code loop is much cheaper then an allocation and is one less
point of failure.

So now that we got gup out of the way, will you go directly to bvecs?


It is. Again, I tested that. Each new allocation in the order of
the number of pages in a request adds 3-5% on a RAM I/O


gup to use sgl, is out of the way, that was stupid idea to try eliminate

We can reach a point where there is no intermediate dynamic alloctions
and no point of failures, but for the allocation of the final bio+biovec.

And that "implement stuff manually" will also enable internal bio
code cleanups since now biovec is that much smarter.


Much much higher, not just 100% higher. And it will kill the

Fixing the bug will not help, the allocation will fail and the IO will

No, we are not talking about 1 or 2. Today since Jens put the scatterlist
chaining we can have lots and lots of them chained together. At the time
Jens said that bio had the chaining option for a long time, only scatterlists

No that is a regression, vmaloc is an order of a magnitude slower then
just plain BIO chaining


This BUG was only for HW, ie ata. But for stuff like iscsi FB sas
they did fine.

The fact of the matter is that people, me included, run in systems with
very large request for a while now, large like 4096 pages which is

I love your cleanups, and your courage, which I don't have.
I want to help in any way I can. If you need just tell me  what,
and I'll be glad to try or test anything. This is fun I like to work
and think on these things

Thanks
Boaz

--

From: Tejun Heo
Date: Thursday, April 2, 2009 - 1:59 am

Hello, Boaz.



As said before, I don't think exposing bio at the driver interface is
a good idea copy or not.  For internal implementation, if there's

Sure, if you do things against ram, anything will show up.  Do you

That is not the only measure.  Things are always traded off using
multiple evaluation metrics.  The same goes with the _NO_ allocation


There is no need to obsess about no point of failure.  Obsessing about

Didn't know nommu would kill high order allocation or were you talking

Under enough memory pressure, Non-fs IOs are gonna fail somewhere
along the line.  If you don't like that, use mempool backed allocation
or preallocated data structures.  Also, I believe it is generally

Alright, let's than chain bios but let's please do it inside bio


The bug was for all controllers which couldn't do 64bit DMA on 64bit

The patchset was written the way it's written because I couldn't see
any pressing performance requirement for the current existing users.
Optimizing for SG_IO against memory is simply not a good idea.

If you have in-kernel users which often require large transfers, sure,
chaining bio would be better.  Still, let's do it _inside_ bio.  Would
your use case be happy with something like blk_rq_map_kern_bvec(rq,
bvec) which can append to rq?  If we convert internal implementation
over to bvec, blk_rq_map_kern_sg() can be sg->bvec converting wrapper

Thanks a lot but it's not like the code and Jens are scary monsters I

Great, your reviews are comments are very helpful, so please keep them
coming.  :-)

Jens, Tomo, what do you guys think?

Thanks.

-- 
tejun
--

From: Boaz Harrosh
Date: Thursday, April 2, 2009 - 2:54 am

I want to draw your attention, to something you might be forgetting
sorry not to make it clearer before.

There is no users for blk_rq_map_kern_sgl(), and no need for any such 
driver interface.

This is because there is a pending patchest done by TOMO, which is
in it's 4th incarnation by now, which removes the very last users
of scsi_execute_async() and it's use of sgl. together with all that
mess at scsi_lib.c. Removing scsi_execute_async() will remove the only
potential user of blk_rq_map_kern_sgl(). So it can be killed at birth.

The pachset was tested and debugged by all scsi ULD maintainers
and is in very stable state. I hope it can make it into 2.6.30
but for sure it will make it into 2.6.31, which is the time frame
we are talking about.

James any chance this patchset can make 2.6.30?

Please note that the important historical users of scsi_execute_async()
sg.c and sr.c don't use it already in 2.6.28-29, and the last bugs of that
change, where wrinkled out.


Reads from some SSDs/FLASH is just as fast as RAM. Battery backup RAM
is very common. Pete Wyckoff reached speeds over RDMA which come close

I agree completely. "no mess" - first priority, "no point of failure"
- far behind second. "Lots of work" - not considered



I agree, exactly my point, eliminate any extra allocation is the

Yes, I completely agree. This can/should be internal bio business.
Doing it this way will also add robustness to other users of bio
like filesystems raid-engines and staking drivers.


When they come they come in groups. I'm talking about steady stream
of these that come for minutes. think any streaming application
like backup or plain cp of very large files, video prepossessing, ...

See above I answered that. No need for blk_rq_map_kern_bvec() nor
blk_rq_map_kern_sg(), currently. OSD absolutely needs the direct
use of bios via the fs route, and the other users are going out
fast.

And yes bio chaining at the bio level is a dream come true
for me, and it drops 3 ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:38 pm

Hello, Boaz.


One quick question.  Don't the above paths map kernel pages to a
request?

Thanks.

-- 
tejun
--

From: Boaz Harrosh
Date: Thursday, April 2, 2009 - 12:34 am

From: Tejun Heo
Date: Thursday, April 2, 2009 - 12:51 am

In that case it would go through blk_rq_map_kern_sg(), so unless you
wanna change the blk_rq_map_kern_sg() interface such that it uses
bvec, the sgl overhead doesn't matter anyway.  Of course, if it has
been buidling bios directly, that is a different story.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: add multi segment support to kernel bio mapping

Implement bio_{map|copy}_kern_sgl() and implement
bio_{map|copy}_kern() in terms of them.  As all map/copy helpers
support sgl, it's quite simple.  The sgl versions will be used to
extend blk_rq_map_kern*() interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/bio.c            |  254 +++++++++++++++++++++++++++++++-------------------
 include/linux/bio.h |    5 +
 2 files changed, 162 insertions(+), 97 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 04bc5c2..9c921f9 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -26,7 +26,6 @@
 #include <linux/mempool.h>
 #include <linux/workqueue.h>
 #include <linux/blktrace_api.h>
-#include <linux/scatterlist.h>
 #include <linux/pfn.h>
 #include <trace/block.h>
 
@@ -899,13 +898,55 @@ static int bio_memcpy_sgl_uiov(struct scatterlist *sgl, int nents,
 		}
 		WARN_ON_ONCE(slen);		/* iov too short */
 	}
-	WARN_ON_ONCE(iov_iter_count(&ii));	/* bio too short */
+	WARN_ON_ONCE(iov_iter_count(&ii));	/* sgl too short */
 done:
 	sg_miter_stop(&si);
 	return ret;
 }
 
 /**
+ *	bio_memcpy_sgl_sgl - copy data betweel two sgls
+ *	@dsgl: destination sgl
+ *	@dnents: number of entries in @dsgl
+ *	@ssgl: source sgl
+ *	@snents: number of entries in @ssgl
+ *	@d_km_type; km_type to use for mapping @dsgl
+ *	@s_km_type; km_type to use for mapping @ssgl
+ *
+ *	Copy data from @ssgl to @dsgl.  The areas should be of the
+ *	same size.
+ */
+static void bio_memcpy_sgl_sgl(struct scatterlist *dsgl, int dnents,
+			       struct scatterlist *ssgl, int snents,
+			       enum km_type d_km_type, enum km_type s_km_type)
+{
+	struct sg_mapping_iter si, di;
+
+	sg_miter_start_atomic(&di, dsgl, dnents, d_km_type);
+	sg_miter_start_atomic(&si, ssgl, snents, s_km_type);
+
+	while (sg_miter_next(&di)) {
+		void *daddr = di.addr;
+		size_t dlen = di.length;
+
+		while (dlen && sg_miter_next(&si)) {
+			size_t copy = min(dlen, si.length);
+
+			memcpy(daddr, si.addr, ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: cleanup

Collapse double underbar prefixed internal functions which have only
single user into their respective call sites.  This prepares for
reimplementation.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/bio.c |  129 +++++++++++++++++++++++--------------------------------------
 1 files changed, 49 insertions(+), 80 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index f13aef0..4540afc 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1044,10 +1044,20 @@ struct bio *bio_copy_user(struct request_queue *q, struct rq_map_data *md,
 	return bio_copy_user_iov(q, md, &iov, 1, rw, gfp);
 }
 
-static struct bio *__bio_map_user_iov(struct request_queue *q,
-				      struct block_device *bdev,
-				      struct iovec *iov, int count, int rw,
-				      gfp_t gfp)
+/**
+ *	bio_map_user_iov - map user iovec table into bio
+ *	@q: the struct request_queue for the bio
+ *	@bdev: destination block device
+ *	@iov:	the iovec.
+ *	@count: number of elements in the iovec
+ *	@rw: READ or WRITE
+ *	@gfp: memory allocation flags
+ *
+ *	Map the user space address into a bio suitable for io to a block
+ *	device. Returns an error pointer in case of error.
+ */
+struct bio *bio_map_user_iov(struct request_queue *q, struct block_device *bdev,
+			     struct iovec *iov, int count, int rw, gfp_t gfp)
 {
 	int i, j;
 	size_t tot_len = 0;
@@ -1141,6 +1151,15 @@ static struct bio *__bio_map_user_iov(struct request_queue *q,
 
 	bio->bi_bdev = bdev;
 	bio->bi_flags |= (1 << BIO_USER_MAPPED);
+
+	/*
+	 * subtle -- if __bio_map_user() ended up bouncing a bio,
+	 * it would normally disappear when its bi_end_io is run.
+	 * however, we need it for the unmap, so grab an extra
+	 * reference to it
+	 */
+	bio_get(bio);
+
 	return bio;
 
  out_unmap:
@@ -1180,38 +1199,15 @@ struct bio *bio_map_user(struct request_queue *q, struct block_device *bdev,
 }
 
 /**
- *	bio_map_user_iov - map user iovec table into bio
- *	@q: the struct request_queue for the bio
- *	@bdev: ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: add new features to internal functions to prepare for future changes

Expand bci such that it supports sgl as source and implement
bio_memcpy_sgl_sgl() which can copy data between two sgls.  These will
be used to implement blk_rq_copy/map_kern_iov().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/bio.c |   82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 74 insertions(+), 8 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 1ca8b16..04bc5c2 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -758,8 +758,10 @@ err:
 struct bio_copy_info {
 	struct scatterlist	*copy_sgl;	/* one sg per page */
 	struct iovec		*src_iov;	/* source iovec from userland */
+	struct scatterlist	*src_sgl;	/* source sgl */
 	int			copy_nents;	/* #entries in copy_sgl */
 	int			src_count;	/* #entries in src_iov */
+	int			src_nents;	/* #entries in src_sgl */
 	size_t			len;		/* total length in bytes */
 	int			is_our_pages;	/* do we own copied pages? */
 };
@@ -776,6 +778,7 @@ static void bci_destroy(struct bio_copy_info *bci)
 		bio_sgl_free_pages(bci->copy_sgl, bci->copy_nents);
 	kfree(bci->copy_sgl);
 	kfree(bci->src_iov);
+	kfree(bci->src_sgl);
 	kfree(bci);
 }
 
@@ -783,6 +786,8 @@ static void bci_destroy(struct bio_copy_info *bci)
  *	bci_create - create a bci
  *	@src_iov: source iovec
  *	@src_count: number of entries in @src_iov
+ *	@src_sgl: source sgl
+ *	@src_nents: number of entries in @src_sgl
  *	@gfp: gfp for data structure allocations
  *	@page_gfp: gfp for page allocations
  *	@md: optional preallocated page pool
@@ -795,28 +800,47 @@ static void bci_destroy(struct bio_copy_info *bci)
  *	Pointer to the new bci on success, NULL on failure.
  */
 static struct bio_copy_info *bci_create(struct iovec *src_iov, int src_count,
-					gfp_t gfp, gfp_t page_gfp,
-					struct rq_map_data *md)
+				struct scatterlist *src_sgl, int src_nents,
+				gfp_t gfp, gfp_t page_gfp,
+				struct rq_map_data *md)
 {
 	struct bio_copy_info *bci;
+	struct ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: hack removal

SCSI needs to map sgl into rq for kernel PC requests; however, block
API didn't have such feature so it used its own rq mapping function
which hooked into block/bio internals and is generally considered an
ugly hack.  The private function may also produce requests which are
bigger than queue per-rq limits.

Block blk_rq_map_kern_sgl().  Kill the private implementation and use
it.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 drivers/scsi/scsi_lib.c |  108 +----------------------------------------------
 1 files changed, 1 insertions(+), 107 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 3196c83..3fa5589 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -296,112 +296,6 @@ static void scsi_end_async(struct request *req, int uptodate)
 	__blk_put_request(req->q, req);
 }
 
-static int scsi_merge_bio(struct request *rq, struct bio *bio)
-{
-	struct request_queue *q = rq->q;
-
-	bio->bi_flags &= ~(1 << BIO_SEG_VALID);
-	if (rq_data_dir(rq) == WRITE)
-		bio->bi_rw |= (1 << BIO_RW);
-	blk_queue_bounce(q, &bio);
-
-	return blk_rq_append_bio(q, rq, bio);
-}
-
-static void scsi_bi_endio(struct bio *bio, int error)
-{
-	bio_put(bio);
-}
-
-/**
- * scsi_req_map_sg - map a scatterlist into a request
- * @rq:		request to fill
- * @sgl:	scatterlist
- * @nsegs:	number of elements
- * @bufflen:	len of buffer
- * @gfp:	memory allocation flags
- *
- * scsi_req_map_sg maps a scatterlist into a request so that the
- * request can be sent to the block layer. We do not trust the scatterlist
- * sent to use, as some ULDs use that struct to only organize the pages.
- */
-static int scsi_req_map_sg(struct request *rq, struct scatterlist *sgl,
-			   int nsegs, unsigned bufflen, gfp_t gfp)
-{
-	struct request_queue *q = rq->q;
-	int nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
-	unsigned int data_len = bufflen, len, bytes, off;
-	struct scatterlist *sg;
-	struct page ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: cleanup

blk-map and bio are about to go through major update.  Make the
following renames to ease future changes.

* more concise/wieldy names: s/gfp_mask/gfp/, s/map_data/md/,
  s/iov_count/count/.  Note that count is and will continue to be used
  only for number of entries in iovec.  Similarly, nents will be used
  for number of entries in sgl.

* less confusing names: bio_map_data doesn't have much to do with
  mapping per-se but is aux info for copying.  Also, it's very
  confusing with rq_map_data which BTW is reserved pool of pages for
  copying.  Rename bio_map_data to bio_copy_info and everything
  related to bci_*().  This part of API is gonna receive major
  overhaul.  The names and semantics will get clearer with future
  changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-map.c        |   37 ++++----
 fs/bio.c               |  231 +++++++++++++++++++++++------------------------
 include/linux/bio.h    |   37 ++++----
 include/linux/blkdev.h |    4 +-
 4 files changed, 150 insertions(+), 159 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 4f0221a..eb206df 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -43,11 +43,11 @@ static int __blk_rq_unmap_user(struct bio *bio)
  * blk_rq_map_user_iov - map user data to a request, for REQ_TYPE_BLOCK_PC usage
  * @q:		request queue where request should be inserted
  * @rq:		request to map data to
- * @map_data:   pointer to the rq_map_data holding pages (if necessary)
+ * @md:		pointer to the rq_map_data holding pages (if necessary)
  * @iov:	pointer to the iovec
- * @iov_count:	number of elements in the iovec
+ * @count:	number of elements in the iovec
  * @len:	I/O byte count
- * @gfp_mask:	memory allocation flags
+ * @gfp:	memory allocation flags
  *
  * Description:
  *    Data will be mapped directly for zero copy I/O, if possible. Otherwise
@@ -63,20 +63,19 @@ static int __blk_rq_unmap_user(struct bio *bio)
  *    unmapping.
  */
 int ...
From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: API cleanup

blk_rq_map_user_iov() took @len parameter which contains duplicate
information as the total length is available as the sum of all iov
segments.  This doesn't save anything either as the mapping function
should walk the whole iov on entry to check for alignment anyway.
Remove the superflous parameter.

Removing the superflous parameter removes the pathological corner case
where the caller passes in shorter @len than @iov but @iov mappings
ends up capped due to queue limits and bio->bi_size ends up matching
@len thus resulting in successful map.  With the superflous parameter
gone, blk-map/bio can now simply fail partial mappings.

Move partial mapping detection to bio_create_from_sgl() which is
shared by all map/copy paths and remove partial mapping handling from
all other places.

This change removes one of the two users of __blk_rq_unmap_user() and
it gets collapsed into blk_rq_unmap_user().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-map.c        |   47 +++++++++++++++--------------------------------
 block/scsi_ioctl.c     |   11 +++--------
 drivers/scsi/sg.c      |    2 +-
 fs/bio.c               |   43 +++++++++++++++++--------------------------
 include/linux/blkdev.h |    2 +-
 5 files changed, 37 insertions(+), 68 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index dc4097c..f60f439 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -8,20 +8,6 @@
 
 #include "blk.h"
 
-static int __blk_rq_unmap_user(struct bio *bio)
-{
-	int ret = 0;
-
-	if (bio) {
-		if (bio_flagged(bio, BIO_USER_MAPPED))
-			bio_unmap_user(bio);
-		else
-			ret = bio_uncopy_user(bio);
-	}
-
-	return ret;
-}
-
 /**
  * blk_rq_map_user_iov - map user data to a request, for REQ_TYPE_BLOCK_PC usage
  * @q:		request queue where request should be inserted
@@ -29,7 +15,6 @@ static int __blk_rq_unmap_user(struct bio *bio)
  * @md:		pointer to the rq_map_data holding pages (if necessary)
  * @iov:	pointer to the iovec
  * ...
From: Boaz Harrosh
Date: Wednesday, April 1, 2009 - 10:12 am

I did not go to the bottom of this but how does that do not change user-space
API. It would now fail in code that would work before.

What if I want to try variable size commands, where I try the shorter
first, then a longer one (or vis versa). I would setup memory mapping to the biggest
command but issue a length that correspond to the encoded command inside the CDB.

The bi->bi_size is not only mapping size it is also the command size I want to
actually read/write.


--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 3:17 pm

Hello,



blk_rq_map_user_iov() never had the shorter data len functionality.
It fails requests with -EINVAL if iov_length(iov) != dxfer_len, so a
patch in the previous patchset adds the above iov_shorten() call to
trim iov if dxfer_len is shorter.  In this patch, it gets a bit
simpler as the length parameter isn't necessary.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: cleanup, removal of unused / undesirable API

With recent changes, the following functions aren't used anymore.

* bio_{map|copy}_{user|kern}()
* blk_rq_append_bio()

The following functions aren't used outside of block layer.

* bio_add_pc_page()
* bio_un{map|copy}_user()

Kill the first group and unexport the second.  As bio_add_pc_page() is
used only inside fs/bio.c, it's made static.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-map.c        |   17 --------
 fs/bio.c               |  102 +----------------------------------------------
 include/linux/bio.h    |   12 ------
 include/linux/blkdev.h |    6 ---
 4 files changed, 3 insertions(+), 134 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 0474c09..dc4097c 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -8,23 +8,6 @@
 
 #include "blk.h"
 
-int blk_rq_append_bio(struct request_queue *q, struct request *rq,
-		      struct bio *bio)
-{
-	if (!rq->bio)
-		blk_rq_bio_prep(q, rq, bio);
-	else if (!ll_back_merge_fn(q, rq, bio))
-		return -EINVAL;
-	else {
-		rq->biotail->bi_next = bio;
-		rq->biotail = bio;
-
-		rq->data_len += bio->bi_size;
-	}
-	return 0;
-}
-EXPORT_SYMBOL(blk_rq_append_bio);
-
 static int __blk_rq_unmap_user(struct bio *bio)
 {
 	int ret = 0;
diff --git a/fs/bio.c b/fs/bio.c
index 9c921f9..fe796dc 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -627,8 +627,9 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
  *      smaller than PAGE_SIZE, so it is always possible to add a single
  *      page to an empty bio. This should only be used by REQ_PC bios.
  */
-int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page *page,
-		    unsigned int len, unsigned int offset)
+static int bio_add_pc_page(struct request_queue *q, struct bio *bio,
+			   struct page *page, unsigned int len,
+			   unsigned int offset)
 {
 	return __bio_add_page(q, bio, page, len, offset, q->max_hw_sectors);
 }
@@ ...
From: Boaz Harrosh
Date: Wednesday, April 1, 2009 - 6:54 am

This function is used by drivers/scsi/osd/osd_initiator.c
Currently in mainline. Please use allmodconfig to compile


I've posted propositions on how to fix osd_initiator.c which
involves making blk_map_kern() append to the request, and a new
blk_make_request(bio).

Any suggestions are welcome

Thanks
Boaz
--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 7:06 am

Hello,


The patchse is against Jens's tree which hasn't received the changes

Yeap, I'll take a look at that.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: new API

Implement blk_rq_map_kern_sgl() using bio_copy_{map|kern}_sgl() and
reimplement blk_rq_map_kern() in terms of it.  As the bio helpers
already have all the necessary checks, all blk_rq_map_kern_sgl() has
to do is wrap them and initialize rq accordingly.  The implementation
closely resembles blk_rq_msp_user_iov().

This is an exported API and will be used to replace hack in scsi ioctl
implementation.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-map.c        |   54 ++++++++++++++++++++++++++++++-----------------
 include/linux/blkdev.h |    2 +
 2 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index eb206df..0474c09 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -161,47 +161,61 @@ int blk_rq_unmap_user(struct bio *bio)
 EXPORT_SYMBOL(blk_rq_unmap_user);
 
 /**
- * blk_rq_map_kern - map kernel data to a request, for REQ_TYPE_BLOCK_PC usage
+ * blk_rq_map_kern_sg - map kernel data to a request, for REQ_TYPE_BLOCK_PC
  * @q:		request queue where request should be inserted
  * @rq:		request to fill
- * @kbuf:	the kernel buffer
- * @len:	length of user data
+ * @sgl:	area to map
+ * @nents:	number of elements in @sgl
  * @gfp:	memory allocation flags
  *
  * Description:
  *    Data will be mapped directly if possible. Otherwise a bounce
  *    buffer is used.
  */
-int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
-		    unsigned int len, gfp_t gfp)
+int blk_rq_map_kern_sg(struct request_queue *q, struct request *rq,
+		       struct scatterlist *sgl, int nents, gfp_t gfp)
 {
 	int rw = rq_data_dir(rq);
-	int do_copy = 0;
 	struct bio *bio;
 
-	if (len > (q->max_hw_sectors << 9))
-		return -EINVAL;
-	if (!len || !kbuf)
+	if (!sgl || nents <= 0)
 		return -EINVAL;
 
-	do_copy = !blk_rq_aligned(q, kbuf, len) || object_is_on_stack(kbuf);
-	if (do_copy)
-		bio = bio_copy_kern(q, kbuf, len, gfp, rw);
-	else
-		bio = bio_map_kern(q, kbuf, len, ...
From: Boaz Harrosh
Date: Wednesday, April 1, 2009 - 9:50 am

You might want to call bio_copy_kern_sg from within bio_map_kern_sg
and remove yet another export from bio layer


It could be nice to call blk_rq_append_bio() here
and support multiple calls to this member.
This will solve half of my problem with osd_initiator

Hmm .. but you wanted to drop multiple bio chaining


--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 3:25 pm

Hello,



I don't want to drop multiple bio chaining at all in itself.  I just
didn't see the current uses as, well, useful.  If building a sgl for a
request at once is difficult for your purpose, making blk_rq_map_*()
functions accumulate bios sounds like a good idea.  The primary goal
was to remove direct bio visibility/manipulation from low level

Yeap, will make it inline.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: cleanup

As bio_map_user_iov() is unique in the way it acquires pages to map
from all other three operations (user-copy, kern-map, kern-copy), the
function still needs to get the pages itself but the bio creation can
use bio_create_from_sgl().  Create sgl of mapped pages and use it to
create bio from it.  The code will be further simplified with future
changes to bio_create_from_sgl().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/bio.c |   79 ++++++++++++++++++++++++++------------------------------------
 1 files changed, 33 insertions(+), 46 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 4540afc..1ca8b16 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1059,13 +1059,13 @@ struct bio *bio_copy_user(struct request_queue *q, struct rq_map_data *md,
 struct bio *bio_map_user_iov(struct request_queue *q, struct block_device *bdev,
 			     struct iovec *iov, int count, int rw, gfp_t gfp)
 {
-	int i, j;
 	size_t tot_len = 0;
 	int nr_pages = 0;
+	int nents = 0;
+	struct scatterlist *sgl;
 	struct page **pages;
 	struct bio *bio;
-	int cur_page = 0;
-	int ret, offset;
+	int i, ret;
 
 	for (i = 0; i < count; i++) {
 		unsigned long uaddr = (unsigned long)iov[i].iov_base;
@@ -1088,70 +1088,57 @@ struct bio *bio_map_user_iov(struct request_queue *q, struct block_device *bdev,
 	if (!nr_pages || tot_len & q->dma_pad_mask)
 		return ERR_PTR(-EINVAL);
 
-	bio = bio_kmalloc(gfp, nr_pages);
-	if (!bio)
+	sgl = kmalloc(nr_pages * sizeof(sgl[0]), gfp);
+	if (!sgl)
 		return ERR_PTR(-ENOMEM);
+	sg_init_table(sgl, nr_pages);
 
 	ret = -ENOMEM;
-	pages = kcalloc(nr_pages, sizeof(struct page *), gfp);
+	pages = kcalloc(nr_pages, sizeof(pages[0]), gfp);
 	if (!pages)
-		goto out;
+		goto err_sgl;
 
 	for (i = 0; i < count; i++) {
 		unsigned long uaddr = (unsigned long)iov[i].iov_base;
 		unsigned long len = iov[i].iov_len;
-		unsigned long end = (uaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
-		unsigned long start = uaddr >> PAGE_SHIFT;
-		const int ...
From: Boaz Harrosh
Date: Wednesday, April 1, 2009 - 9:33 am

Fast look at all users of get_user_pages_fast().

This can be converted to take an sglist instead of pages* array.
Outside of this code all users either have a fixed-size allocated array
or hard coded one page. This here is the main user. (Out of 5)


--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 3:20 pm

Yeah, that would be nice, although it's understandable that gup taking
page array as parameter.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Wednesday, April 1, 2009 - 6:44 am

Impact: API cleanup

All initialized requests must and are guaranteed to have rq->q set.
Taking both @q and @rq parameters doesn't improve anything.  It only
adds pathological corner case where @q != !@rq->q.  Kill the
superflous @q parameter from blk_rq_map_{user|kern}*().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-map.c                            |   27 +++++++++++----------------
 block/bsg.c                                |    5 ++---
 block/scsi_ioctl.c                         |    6 +++---
 drivers/block/pktcdvd.c                    |    2 +-
 drivers/cdrom/cdrom.c                      |    2 +-
 drivers/scsi/device_handler/scsi_dh_alua.c |    2 +-
 drivers/scsi/device_handler/scsi_dh_emc.c  |    2 +-
 drivers/scsi/device_handler/scsi_dh_rdac.c |    2 +-
 drivers/scsi/scsi_lib.c                    |    7 +++----
 drivers/scsi/scsi_tgt_lib.c                |    3 +--
 drivers/scsi/sg.c                          |    6 +++---
 drivers/scsi/st.c                          |    3 +--
 include/linux/blkdev.h                     |   15 +++++++--------
 13 files changed, 36 insertions(+), 46 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index f60f439..885d359 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -10,7 +10,6 @@
 
 /**
  * blk_rq_map_user_iov - map user data to a request, for REQ_TYPE_BLOCK_PC usage
- * @q:		request queue where request should be inserted
  * @rq:		request to map data to
  * @md:		pointer to the rq_map_data holding pages (if necessary)
  * @iov:	pointer to the iovec
@@ -30,10 +29,10 @@
  *    original bio must be passed back in to blk_rq_unmap_user() for proper
  *    unmapping.
  */
-int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
-			struct rq_map_data *md, struct iovec *iov, int count,
-			gfp_t gfp)
+int blk_rq_map_user_iov(struct request *rq, struct rq_map_data *md,
+			struct iovec *iov, int count, gfp_t gfp)
 {
+	struct request_queue *q = rq->q;
 	struct bio *bio = ...
Previous thread: [PATCH 03/17] blk-map: improve alignment checking for blk_rq_map_user_iov() by Tejun Heo on Wednesday, April 1, 2009 - 6:44 am. (1 message)

Next thread: [PATCH 00/24] Series short description by Alan Cox on Wednesday, April 1, 2009 - 7:04 am. (25 messages)