Florian Weimer
Registered User
Linux and broken O_DIRECT
2004-10-29 03:56:49 PM
dist/configure.ac in BDB 4.2 contains the following comment:

# Linux has a broken O_DIRECT flag, but we allow people to override it from
# the command line.

In what way is Linux broken? Are there problems with metadata updates, like on other systems? Is the brokenness specific to certain kernel versions?
bostic
Registered User
2004-10-31 02:36:00 AM
Re: Linux and broken O_DIRECT
Florian Weimer <fw@deneb.enyo.de> wrote in message news:<871xfiot32.fsf@deneb.enyo.de>...
Quote:
# Linux has a broken O_DIRECT flag, but we allow people to override it from [...]

[...]
+ [Systems where] the open system call will fail if O_DIRECT is specified,
+ Systems where the open call will succeed when the O_DIRECT flag is specified, but any subsequent read or write using the file descriptor returned by the open call will fail,
+ Systems where O_DIRECT worked with some filesystems but not with others,
+ Systems where buffers require specific alignment if they are to be written to a file descriptor for which O_DIRECT was specified; if a buffer isn't properly aligned, the read/write call will fail (rather than the system falling back to a slower read/write).

Quote:
Are there problems with metadata updates, like on other systems?

In the Berkeley DB 4.3 release, we default to not using the O_DIRECT flag. You can always override the default and configure O_DIRECT explicitly, however, using:

    env db_cv_open_o_direct=yes ../dist/configure [args]

Regards,
--keith

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Keith Bostic              bostic@sleepycat.com
Sleepycat Software Inc.   keithbosticim (ymsgid)
118 Tower Rd.             +1-781-259-3139
Lincoln, MA 01773         www.sleepycat.com
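The first two failure modes in the list above (open fails outright, or open succeeds but the first read fails) can be detected at run time with a small probe. What follows is only a rough sketch of that idea, not what Berkeley DB itself does: `open_maybe_direct` is a hypothetical helper name, and the sketch assumes a Linux-like system where Python exposes `os.O_DIRECT` and `os.preadv`.

```python
import mmap
import os

# os.O_DIRECT only exists where the platform defines the flag (e.g. Linux);
# treat a missing flag as "direct I/O not available".
O_DIRECT = getattr(os, "O_DIRECT", 0)


def open_maybe_direct(path):
    """Return (fd, direct). Hypothetical helper, not a Berkeley DB API.

    Tries O_DIRECT first and falls back to an ordinary buffered open if
    either the open or a probe read is rejected.
    """
    if O_DIRECT:
        try:
            fd = os.open(path, os.O_RDONLY | O_DIRECT)
        except OSError:
            fd = -1  # open() itself refused the flag
        if fd >= 0:
            # An anonymous mmap is page-aligned, so a failure of the probe
            # read is not caused by a misaligned buffer.
            probe = mmap.mmap(-1, mmap.PAGESIZE)
            try:
                os.preadv(fd, [probe], 0)
                return fd, True
            except OSError:
                os.close(fd)  # open() worked, but reading did not
            finally:
                probe.close()
    # Buffered fallback; genuine errors (e.g. a missing file) surface here.
    return os.open(path, os.O_RDONLY), False
```

Because the result depends on the filesystem holding `path`, a probe like this has to be repeated per file or per mount point rather than decided once at build time.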
Florian Weimer
Registered User
2004-11-01 05:15:00 PM
Re: Linux and broken O_DIRECT
* Keith Bostic:
Quote:
In what way is Linux broken?
[...]
+ Systems where buffers require specific alignment if they are to be written to a file descriptor for which O_DIRECT was specified; if a buffer isn't properly aligned the read/write call will fail (rather than the system falling back to a slower read/write).

Linux 2.6 requires that block boundaries in the user-space buffer do not cross page boundaries. In practice, this means that the buffer has to be page-aligned.

I think it's possible to work around this problem in the read/write routines at the expense of an extra copy. Since the kernel caching algorithms are bypassed, it might still be a win.
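To make the extra-copy idea concrete, here is a minimal sketch of a read routine that goes through a page-aligned bounce buffer. It is an illustration under assumptions, not code from Berkeley DB: `read_via_bounce_buffer` is a hypothetical name, the block size is assumed to be the page size, and short reads other than end-of-file are not retried.

```python
import mmap
import os


def read_via_bounce_buffer(fd, offset, length, block=mmap.PAGESIZE):
    """Read `length` bytes at `offset` through a page-aligned buffer.

    Hypothetical helper. The request is widened to whole blocks, read
    into aligned memory, and the wanted slice is copied out; that slice
    is the extra copy mentioned above.
    """
    if length <= 0:
        return b""
    start = offset - offset % block               # round down to a block
    end = -(-(offset + length) // block) * block  # round up to a block
    bounce = mmap.mmap(-1, end - start)           # anonymous, page-aligned
    try:
        got = os.preadv(fd, [bounce], start)      # may stop early at EOF
        skip = offset - start
        return bounce[skip:min(got, skip + length)]
    finally:
        bounce.close()
```

The caller can pass any offset and length; only the internal transfer is aligned. The same function also works on an ordinary buffered descriptor, which makes it easy to compare the two paths when measuring whether bypassing the kernel cache is actually a win.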
