Re: [PATCH/RFC] core: add a function to safely try to get device driver owner

Previous thread: RE: data corruption with stex (Promise HW-Raid) driver and device-mapper by Ed Lin - PTU on Monday, November 29, 2010 - 12:35 pm. (1 message)

Next thread: [PATCH] regulator: Add and use rdev_<level> macros by Joe Perches on Monday, November 29, 2010 - 12:54 pm. (7 messages)
From: Guennadi Liakhovetski
Date: Monday, November 29, 2010 - 12:43 pm

When two drivers interoperate without an explicit dependency, it is often
required to prevent one of them from being unloaded safely by dereferencing
dev-&gt;driver-&gt;owner. This patch provides a generic function to do this in a
race-free way.

Signed-off-by: Guennadi Liakhovetski &lt;g.liakhovetski@gmx.de&gt;
---

Not run-time tested in this form, but this is just a generalisation of the 
code in drivers/media/video/sh_mobile_ceu_camera.c::sh_mobile_ceu_probe(). 
If the idea is accepted in principle, I will replace that specific 
implementation with a call to this function, test... But I am not sure, if 
I'd be able to test it for races. If such testing is required on SMP, I'd 
have to write some test-case for it. Thoughts?

 drivers/base/dd.c      |   63 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/device.h |    1 +
 2 files changed, 64 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index da57ee9..44c6672 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -17,10 +17,12 @@
  * This file is released under the GPLv2
  */
 
+#include &lt;linux/completion.h&gt;
 #include &lt;linux/device.h&gt;
 #include &lt;linux/delay.h&gt;
 #include &lt;linux/module.h&gt;
 #include &lt;linux/kthread.h&gt;
+#include &lt;linux/notifier.h&gt;
 #include &lt;linux/wait.h&gt;
 #include &lt;linux/async.h&gt;
 #include &lt;linux/pm_runtime.h&gt;
@@ -422,3 +424,64 @@ void dev_set_drvdata(struct device *dev, void *data)
 	dev-&gt;p-&gt;driver_data = data;
 }
 EXPORT_SYMBOL(dev_set_drvdata);
+
+struct bus_wait {
+	struct notifier_block	notifier;
+	struct completion	completion;
+	struct device		*dev;
+};
+
+static int bus_notify(struct notifier_block *nb,
+		      unsigned long action, void *data)
+{
+	struct device *dev = data;
+	struct bus_wait *wait = container_of(nb, struct bus_wait, notifier);
+
+	if (wait-&gt;dev != dev)
+		return NOTIFY_DONE;
+
+	switch (action) {
+	case BUS_NOTIFY_UNBOUND_DRIVER:
+		/* Protect from module unloading ...
From: Greg KH
Date: Monday, November 29, 2010 - 12:56 pm

Please create some kerneldoc information for this new function so that
people know how to use it and what it is for.

Also, do you want to provide a device_put_driver() function as well to
decrement the owner count once the person who grabed the driver is done

EXPORT_SYMBOL_GPL() please.

thanks,

greg k-h
--

From: Guennadi Liakhovetski
Date: Monday, November 29, 2010 - 1:11 pm

Well, once you've got the driver, you put it by just calling 
module_put(device-&gt;driver-&gt;owner), but you're right, I should add one for 
symmetry, and there I won't guard against the race, because if someone 
would be calling the put() without a successful get(), they deserve any 

Ok

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
--

From: Jonathan Corbet
Date: Monday, November 29, 2010 - 1:17 pm

On Mon, 29 Nov 2010 20:43:28 +0100 (CET)

I must ask: why not, instead, make the dependency explicit?  In
particular, this looks like an application for the proposed media
controller code, which is meant to model the connections between otherwise
independent devices.  The fact that your example comes from V4L2 (which is
the current domain of the media controller) also argues that way.

jon
--

From: Guennadi Liakhovetski
Date: Monday, November 29, 2010 - 1:54 pm

Hi Jon


Sorry, don't see a good way to do this. This function is for a general 
dependency, where you don't have that driver, we are checking for register 
with us, so, the only way to get to it is via dev-&gt;driver-&gt;owner. And I 
also don't want to move registering the device into the dependant driver 
and then wait (with a timeout) for a driver to probe with it... I just 
want to verify, whether a driver has attached to that device and whether I 
can lock it down.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
--

From: Greg KH
Date: Monday, November 29, 2010 - 2:41 pm

Wait, what?  The device is already bound to a driver, right, so why
would you care about &quot;locking&quot; the module into memory?  What could this

Who cares if a driver is attached to any device?  And again, why would
you want to &quot;lock it down&quot;?

confused,

greg k-h
--

From: Guennadi Liakhovetski
Date: Monday, November 29, 2010 - 3:10 pm

In my case I have two platform devices: CEU and CSI2. In some cases (with 
parallel sensors) CEU operates on its own. With serial (CSI-2) camera 
sensors we need the CSI2 driver. So, I want to 
try_module_get(csi2_dev-&gt;driver-&gt;owner) the CSI2 driver from my CEU 
driver. This call can Oops if not done safely. Am I missing something? Is 
there an easier way to achieve the same?

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
--

From: Jonathan Corbet
Date: Monday, November 29, 2010 - 3:28 pm

On Mon, 29 Nov 2010 23:10:50 +0100 (CET)

This looks exactly like what the V4L2 subdev mechanism was meant to do.
Is there a reason you can't use that interface?

Thanks,

jon
--

From: Guennadi Liakhovetski
Date: Tuesday, November 30, 2010 - 12:18 am

Subdev doesn't solve this completely. Media controller, probably, will, 
not sure to what extent, but it is not yet in the mainline.

These my devices do use the subdev API, the CEU driver registers a 
v4l2_device, and the CSI2 driver and the attached sensor register 
v4l2_subdev instances. The problem is - when you register v4l2_subdev, you 
have to specify the respective v4l2_device. In case of sensor (and other 
I2C) drivers this is solved with I2C probing. Generally speaking, the 
v4l2_device registers a device for the subdev driver, and expects a driver 
to attach to it. Then, when the subdev driver probes with the newly 
registered device, it passes its v4l2_subdev instance back with its 
i2c_client object, and then the caller can match them. While doing so they 
also allow a race like

	if (!try_module_get(client-&gt;driver-&gt;driver.owner))

which is exactly what my patch is attempting to fix.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
--

From: Greg KH
Date: Monday, November 29, 2010 - 3:32 pm

But again, why would some other driver ever care about what some random

Yes, from userspace load the module and then don't worry about it.

Don't ever think that poking around in a dev-&gt;driver field is safe at
all, it isn't.  I should just go hide the thing from the rest of the
kernel to keep this from happening, now that you mention it...

thanks,

greg k-h
--

From: Guennadi Liakhovetski
Date: Monday, November 29, 2010 - 4:11 pm

Exactly, that's why I'm proposing it for dd.c;)

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
--

From: Greg KH
Date: Tuesday, November 30, 2010 - 9:56 am

Ok, but again go back to Jon's original proposal to just call the
functions in that driver from yours, causing the implicit module
ordering issue to be automatically resolved.

thanks,

greg k-h
--

From: Guennadi Liakhovetski
Date: Tuesday, November 30, 2010 - 10:09 am

Greg, in this specific case - yes, I could do this. But (1) there is no 
need for that - both drivers implement and use the v4l2-subdev API and 
thus stay generic. In the host driver this adds the convenience, that it 
doesn't have to call to the CSI2 driver explicitly at all - it just calls 
the v4l2-subdev function like &quot;call .s_mbus_fmt for all subdev drivers&quot; 
and the function is called for the sensor and the CSI2 driver. (2) what 
about the other location I pointed out earlier in the v4l2 core? There 
drivers are absolutely generic. I also suspect these are not the only 
cases, where this helper would come in handy. I added the media list to CC 
for any more opinions on this matter.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
--

From: Greg KH
Date: Tuesday, November 30, 2010 - 10:15 am

I agree, it probably would not solve all of the different issues that
people might have for this type of thing, and this isn't the first time
I've heard it be requested either.

But, this patch is just trying to increment a module owner of a device
that is bound to a driver, which is the wrong level to be thinking of
it.

If you request a module to be loaded, what would possibly cause it to be
unbound that you need to have this &quot;safely&quot; in place?  Why would the
module be unloaded?  And if it was unloaded, doesn't that imply that
someone else wanted it unloaded so keeping that from happening would be
a bit rude, right?

Look at network modules, we always allow them to be unloaded, even if
the device is &quot;in use&quot; and that doesn't cause problems.  So why would
you need to do this when we are trying (over the past 10 years or so) to
move away from the &quot;lock the module in place because we know better than
the user&quot; model?

thanks,

greg k-h
--

From: Laurent Pinchart
Date: Tuesday, November 30, 2010 - 10:55 am

Hi Greg,



It depends on your definition of rude. I would consider the kernel even more 
rude if it accepted my unload request and then crashed.

I've recently run into a problem similar to Guennadi's with the OMAP3 ISP 
driver. The driver instantiates several V4L2 I2C sub-devices for the camera 
sensors and the lens and flash controllers. The sub-device drivers get 
platform data when they're probed, and receive callbacks to the board code to 
turn power on/off and configure clocks (it's a bit more complex than just 
that, but you get the idea). The board code callbacks then call to the OMAP3 
ISP driver to configure clocks, because the sensor clock is provided by the 
OMAP3 ISP.

Now, when the user opens the sensor's subdev device node (/dev/v4l-subdev*), 
the subdev open function will turn the sensor clock on. To do that it will 
call the OMAP3 ISP driver through board code. If the OMAP3 ISP driver is 
unloaded at that point things will go pretty bad.

The way we deal with this is to try_module_get() on the OMAP3 ISP driver in 

We need to lock the module in place if its code can be called from another 
driver. Coming up with a lock-free way to handle this would be similar to 
removing the try_module_get() call from cdev_get(). Maybe it could be done, 
but I'm not sure it should.

-- 
Regards,

Laurent Pinchart
--

From: Greg KH
Date: Tuesday, November 30, 2010 - 11:32 am

I totally agree, and that is a bug that should be fixed, but shouldn't
have anything to do with this proposed interface (i.e. locking the

Then the interface to call that driver should be properly reference
counted, right?  That has nothing to do with the driver core locking

Do it like the rest of the kernel does it, lock the module in place with
the module pointer it passed to you before calling open in that module.
Nothing new here at all.

thanks,

greg k-h
--

From: Laurent Pinchart
Date: Tuesday, November 30, 2010 - 1:43 pm

Hi Greg,


That doesn't work in this case, because we have two modules. Module A is the 
master and instantiates an I2C device handled by module B. Module B creates a 
character device and sets itself as the owner. When the corresponding device 
node is opened, module B's refcount is incremented, but module A refcount 
isn't, even though module B can call to module A through board code using 
function pointers provided in the platform data.

-- 
Regards,

Laurent Pinchart
--

From: Greg KH
Date: Tuesday, November 30, 2010 - 1:55 pm

Again, this is something we have been doing for years just fine.  Look
at the usb-serial core.  It &quot;owns&quot; the device node yet the child drivers
are the ones actually handling the data.

Just never call the function pointer unless the module is loaded, it's
that simple.  If you need to add proper module ownership to the platform
data pointers, so be it.

thanks,

greg k-h
--

From: Hans Verkuil
Date: Tuesday, November 30, 2010 - 3:19 pm

All this is pretty easy to add to the way subdevs are handled in v4l. The reason
nothing along those lines has been implemented yet is simply that it isn't
necessary at the moment. But once subdevs can have device nodes, then we need to
add it.

This definitely does not belong to the kernel core, it's a v4l core framework
thing.

Regards,

	Hans

-- 
Hans Verkuil - video4linux developer - sponsored by Cisco
--