select() and procfs

Earlier this month Lucas Nussbaum pointed out that the following code hangs on Linux 2.6 kernels:


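  require 'thread'

  # Read /proc/uptime in a loop from inside a thread; with a thread
  # active, the read never returns.
  Thread.new do
    while true do
      puts "before read UPTIME"
      str = IO.read("/proc/uptime")
      puts "after read UPTIME"
    end
  end.join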

This is the same problem I’ve run into before, where threaded access to certain files in /proc hangs the thread, but non-threaded access (and access to other files in /proc) works fine.

So what’s going on here? Thanks to Alban Crequy’s experimentation on the issue, I think I finally understand why this is happening.

When Ruby reads a file in single-threaded mode, it just uses read(). If any threads are active, however, it first polls the file descriptor with a select(). That select() is what’s hanging:
  select(4, [3], [], [], NULL <unfinished ...>
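
One way to poke at this from Ruby without hanging the whole process is to give select() a timeout instead of NULL. What follows is only a minimal sketch: readable_within? is a hypothetical helper name, not something from Ruby or from Lucas’s report, and whether the call times out or returns immediately depends on the kernel and on which /proc file is used.

```ruby
# Hypothetical helper (not from the original report): ask select() whether
# +io+ is readable, giving up after +timeout+ seconds instead of blocking
# forever the way a NULL timeout does.
def readable_within?(io, timeout)
  # IO.select returns nil on timeout, otherwise [readable, writable, errored].
  ready = IO.select([io], nil, nil, timeout)
  !ready.nil?
end

path = "/proc/uptime"
if File.exist?(path)
  File.open(path) do |f|
    if readable_within?(f, 1.0)
      puts "select() reports #{path} readable"
    else
      puts "select() timed out on #{path}"
    end
  end
end
```

On a kernel that shows the problem, I would expect the second branch to be taken after one second; elsewhere the file should simply be reported as readable.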

So it looks like select() is doing something weird on these files in /proc. To the best of my understanding, aided by Alban’s kernel hacking, it works like this in the kernel: for each file in /proc that a device driver creates, the driver registers handlers with the kernel via a struct file_operations. One of those handlers is poll, and if the driver doesn’t specify it, the file is essentially useless under select(): the call will always hang until the timeout has been reached, or forever when no timeout is given, as in the NULL-timeout call above. (See fs/select.c:do_select() in the kernel.)

I’m sure I don’t understand all the issues involved here, but that seems wrong to me. The whole point of /proc is to have a nice filesystem interface to device driver and kernel information, and here the kernel is breaking that facade by giving select() weird behavior. And in fact, Alban found a thread that implies implementing poll in procfs is actually frowned upon.

So, from the Ruby POV, I’m not sure what the solution is. The kernel folks have chosen to have a funky /proc behavior, and I don’t think it’s clear whether Ruby should change its behavior to accommodate what is, essentially, access to “special” files. Or even whether it can change its behavior to avoid using select() (and instead use read() on an O_NONBLOCK fd) without a tremendous amount of work.
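
For what it’s worth, the read()-on-an-O_NONBLOCK-fd idea can at least be tried from user code. This is only a sketch under assumptions: slurp_nonblock is a hypothetical helper name, it relies on IO#read_nonblock issuing read(2) without polling the descriptor first, and it says nothing about how hard the equivalent change would be inside the interpreter itself.

```ruby
# Hypothetical helper: read a whole file using non-blocking read(2) calls,
# never asking select() whether the descriptor is ready.
def slurp_nonblock(path, chunk_size = 4096)
  data = String.new
  File.open(path, File::RDONLY | File::NONBLOCK) do |f|
    loop do
      begin
        data << f.read_nonblock(chunk_size)
      rescue IO::WaitReadable
        # EAGAIN: nothing available right now. A real implementation would
        # have to decide how to wait; this sketch stops with what it has.
        break
      rescue EOFError
        # read(2) returned 0: end of file.
        break
      end
    end
  end
  data
end

puts slurp_nonblock("/proc/uptime") if File.exist?("/proc/uptime")
```

The awkward part is the IO::WaitReadable branch: the usual way to wait there is select(), which is exactly the call that misbehaves on these files.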
