When two drivers interoperate without an explicit dependency, it is often
required to prevent one of them from being unloaded safely by dereferencing
dev->driver->owner. This patch provides a generic function to do this in a
race-free way.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
---
Not run-time tested in this form, but this is just a generalisation of the
code in drivers/media/video/sh_mobile_ceu_camera.c::sh_mobile_ceu_probe().
If the idea is accepted in principle, I will replace that specific
implementation with a call to this function, test... But I am not sure, if
I'd be able to test it for races. If such testing is required on SMP, I'd
have to write some test-case for it. Thoughts?
drivers/base/dd.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/device.h | 1 +
2 files changed, 64 insertions(+), 0 deletions(-)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index da57ee9..44c6672 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -17,10 +17,12 @@
* This file is released under the GPLv2
*/
+#include <linux/completion.h>
#include <linux/device.h>
#include <linux/delay.h>
#include <linux/module.h>
#include <linux/kthread.h>
+#include <linux/notifier.h>
#include <linux/wait.h>
#include <linux/async.h>
#include <linux/pm_runtime.h>
@@ -422,3 +424,64 @@ void dev_set_drvdata(struct device *dev, void *data)
dev->p->driver_data = data;
}
EXPORT_SYMBOL(dev_set_drvdata);
+
+struct bus_wait {
+ struct notifier_block notifier;
+ struct completion completion;
+ struct device *dev;
+};
+
+static int bus_notify(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct device *dev = data;
+ struct bus_wait *wait = container_of(nb, struct bus_wait, notifier);
+
+ if (wait->dev != dev)
+ return NOTIFY_DONE;
+
+ switch (action) {
+ case BUS_NOTIFY_UNBOUND_DRIVER:
+ /* Protect from module unloading ...Please create some kerneldoc information for this new function so that people know how to use it and what it is for. Also, do you want to provide a device_put_driver() function as well to decrement the owner count once the person who grabed the driver is done EXPORT_SYMBOL_GPL() please. thanks, greg k-h --
Well, once you've got the driver, you put it by just calling module_put(device->driver->owner), but you're right, I should add one for symmetry, and there I won't guard against the race, because if someone would be calling the put() without a successful get(), they deserve any Ok Thanks Guennadi --- Guennadi Liakhovetski, Ph.D. Freelance Open-Source Software Developer http://www.open-technology.de/ --
On Mon, 29 Nov 2010 20:43:28 +0100 (CET) I must ask: why not, instead, make the dependency explicit? In particular, this looks like an application for the proposed media controller code, which is meant to model the connections between otherwise independent devices. The fact that your example comes from V4L2 (which is the current domain of the media controller) also argues that way. jon --
Hi Jon Sorry, don't see a good way to do this. This function is for a general dependency, where you don't have that driver, we are checking for register with us, so, the only way to get to it is via dev->driver->owner. And I also don't want to move registering the device into the dependant driver and then wait (with a timeout) for a driver to probe with it... I just want to verify, whether a driver has attached to that device and whether I can lock it down. Thanks Guennadi --- Guennadi Liakhovetski, Ph.D. Freelance Open-Source Software Developer http://www.open-technology.de/ --
Wait, what? The device is already bound to a driver, right, so why would you care about "locking" the module into memory? What could this Who cares if a driver is attached to any device? And again, why would you want to "lock it down"? confused, greg k-h --
In my case I have two platform devices: CEU and CSI2. In some cases (with parallel sensors) CEU operates on its own. With serial (CSI-2) camera sensors we need the CSI2 driver. So, I want to try_module_get(csi2_dev->driver->owner) the CSI2 driver from my CEU driver. This call can Oops if not done safely. Am I missing something? Is there an easier way to achieve the same? Thanks Guennadi --- Guennadi Liakhovetski, Ph.D. Freelance Open-Source Software Developer http://www.open-technology.de/ --
On Mon, 29 Nov 2010 23:10:50 +0100 (CET) This looks exactly like what the V4L2 subdev mechanism was meant to do. Is there a reason you can't use that interface? Thanks, jon --
Subdev doesn't solve this completely. Media controller, probably, will, not sure to what extent, but it is not yet in the mainline. These my devices do use the subdev API, the CEU driver registers a v4l2_device, and the CSI2 driver and the attached sensor register v4l2_subdev instances. The problem is - when you register v4l2_subdev, you have to specify the respective v4l2_device. In case of sensor (and other I2C) drivers this is solved with I2C probing. Generally speaking, the v4l2_device registers a device for the subdev driver, and expects a driver to attach to it. Then, when the subdev driver probes with the newly registered device, it passes its v4l2_subdev instance back with its i2c_client object, and then the caller can match them. While doing so they also allow a race like if (!try_module_get(client->driver->driver.owner)) which is exactly what my patch is attempting to fix. Thanks Guennadi --- Guennadi Liakhovetski, Ph.D. Freelance Open-Source Software Developer http://www.open-technology.de/ --
But again, why would some other driver ever care about what some random Yes, from userspace load the module and then don't worry about it. Don't ever think that poking around in a dev->driver field is safe at all, it isn't. I should just go hide the thing from the rest of the kernel to keep this from happening, now that you mention it... thanks, greg k-h --
Exactly, that's why I'm proposing it for dd.c;) Thanks Guennadi --- Guennadi Liakhovetski, Ph.D. Freelance Open-Source Software Developer http://www.open-technology.de/ --
Ok, but again go back to Jon's original proposal to just call the functions in that driver from yours, causing the implicit module ordering issue to be automatically resolved. thanks, greg k-h --
Greg, in this specific case - yes, I could do this. But (1) there is no need for that - both drivers implement and use the v4l2-subdev API and thus stay generic. In the host driver this adds the convenience, that it doesn't have to call to the CSI2 driver explicitly at all - it just calls the v4l2-subdev function like "call .s_mbus_fmt for all subdev drivers" and the function is called for the sensor and the CSI2 driver. (2) what about the other location I pointed out earlier in the v4l2 core? There drivers are absolutely generic. I also suspect these are not the only cases, where this helper would come in handy. I added the media list to CC for any more opinions on this matter. Thanks Guennadi --- Guennadi Liakhovetski, Ph.D. Freelance Open-Source Software Developer http://www.open-technology.de/ --
I agree, it probably would not solve all of the different issues that people might have for this type of thing, and this isn't the first time I've heard it be requested either. But, this patch is just trying to increment a module owner of a device that is bound to a driver, which is the wrong level to be thinking of it. If you request a module to be loaded, what would possibly cause it to be unbound that you need to have this "safely" in place? Why would the module be unloaded? And if it was unloaded, doesn't that imply that someone else wanted it unloaded so keeping that from happening would be a bit rude, right? Look at network modules, we always allow them to be unloaded, even if the device is "in use" and that doesn't cause problems. So why would you need to do this when we are trying (over the past 10 years or so) to move away from the "lock the module in place because we know better than the user" model? thanks, greg k-h --
Hi Greg, It depends on your definition of rude. I would consider the kernel even more rude if it accepted my unload request and then crashed. I've recently run into a problem similar to Guennadi's with the OMAP3 ISP driver. The driver instantiates several V4L2 I2C sub-devices for the camera sensors and the lens and flash controllers. The sub-device drivers get platform data when they're probed, and receive callbacks to the board code to turn power on/off and configure clocks (it's a bit more complex than just that, but you get the idea). The board code callbacks then call to the OMAP3 ISP driver to configure clocks, because the sensor clock is provided by the OMAP3 ISP. Now, when the user opens the sensor's subdev device node (/dev/v4l-subdev*), the subdev open function will turn the sensor clock on. To do that it will call the OMAP3 ISP driver through board code. If the OMAP3 ISP driver is unloaded at that point things will go pretty bad. The way we deal with this is to try_module_get() on the OMAP3 ISP driver in We need to lock the module in place if its code can be called from another driver. Coming up with a lock-free way to handle this would be similar to removing the try_module_get() call from cdev_get(). Maybe it could be done, but I'm not sure it should. -- Regards, Laurent Pinchart --
I totally agree, and that is a bug that should be fixed, but shouldn't have anything to do with this proposed interface (i.e. locking the Then the interface to call that driver should be properly reference counted, right? That has nothing to do with the driver core locking Do it like the rest of the kernel does it, lock the module in place with the module pointer it passed to you before calling open in that module. Nothing new here at all. thanks, greg k-h --
Hi Greg, That doesn't work in this case, because we have two modules. Module A is the master and instantiates an I2C device handled by module B. Module B creates a character device and sets itself as the owner. When the corresponding device node is opened, module B's refcount is incremented, but module A refcount isn't, even though module B can call to module A through board code using function pointers provided in the platform data. -- Regards, Laurent Pinchart --
Again, this is something we have been doing for years just fine. Look at the usb-serial core. It "owns" the device node yet the child drivers are the ones actually handling the data. Just never call the function pointer unless the module is loaded, it's that simple. If you need to add proper module ownership to the platform data pointers, so be it. thanks, greg k-h --
All this is pretty easy to add to the way subdevs are handled in v4l. The reason nothing along those lines has been implemented yet is simply that it isn't necessary at the moment. But once subdevs can have device nodes, then we need to add it. This definitely does not belong to the kernel core, it's a v4l core framework thing. Regards, Hans -- Hans Verkuil - video4linux developer - sponsored by Cisco --
